Taxonomy tree!

Yesterday I wasted, er, invested several hours in making this figure.  I used the amazing “Common Taxonomy Tree” tool through GenBank, typing in the genus of each predator.  Then I opened my new phylogeny in Mesquite (I have *never* used Mesquite before! or at least not since 3rd year) and popped in some extra polytomies for morphospecies of predators which are not identified to species (several are actually new to science!).  I felt pretty self-satisfied.

Then I got to spend several hours trying to get the thing into R!  Turns out the trouble was branch lengths for my newly-added branches.  I eventually succeeded with this “arbitrarily ultrametric” tree.  It contains very little information — this is good, because we have very little information.  In this ‘tree’, the distances are only supposed to rank species by their relatedness.  So we see, for example, that Monopelopia and Bezzia are more closely related to each other than either is to Culex, which is satisfying.  It also shows the leeches (Hirudinidae) as being very far from everything else — though not so far as they actually are, since the division between these groups goes VERY far back.
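
For anyone wanting to try something similar, here is a minimal sketch of one way to get an “arbitrarily ultrametric” tree with the ape package.  It is not necessarily how the figure was made: the four-tip topology is a simplified stand-in using names from this post, and using Grafen's method in compute.brlen() is an assumption on my part.

```r
library(ape)

# Toy topology only (an assumption for illustration, not the full predator tree)
tree <- read.tree(text = "(Hirudinidae,(Culex,(Bezzia,Monopelopia)));")

# Grafen's method assigns arbitrary branch lengths from the number of
# descendant tips, which makes the tree ultrametric
tree <- compute.brlen(tree, method = "Grafen")
print(is.ultrametric(tree))

# Pairwise tip-to-tip distances; only their rank order is meaningful here
d <- cophenetic(tree)
print(d["Monopelopia", "Bezzia"] < d["Monopelopia", "Culex"])
print(d["Culex", "Hirudinidae"] > d["Culex", "Bezzia"])
```

The distances themselves mean nothing in absolute terms; the checks at the end only confirm that the ranking matches the taxonomy.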

This might not be quite the right way to include taxonomic information in an analysis.  But it IS approximately the one I was picturing in my head for weeks.

[Figure: predator.phylogeny.cardoso2008]

5 thoughts on “Taxonomy tree!”

  1. Seems reasonable as long as you’re up front about it, and about how the ordering at the next levels was done (families within orders?). Make sure that you do have a polytomy for every classification bit that you don’t know (three genera in one family, say). One thing you might do is include only one species per genus, to make it clear that it’s just for imagining relationships among them. Also, in R, if you do phy$edge.length <- NULL to delete branch lengths, the tree will always be plotted ultrametric (with arbitrary branch lengths). However, if they’re all on GenBank, you might just grab whatever the standard animal mitochondrial marker is and throw it at RAxML.

  2. “How do you add polytomies?” you ask on Twitter (https://twitter.com/polesasunder/status/309058965067665410). I had some code floating around that did this, but it was fiddly and I can’t find it. Perhaps slightly less fiddly is to edit the Newick string directly. Your tree is (ignoring the different species):

    library(ape)
    plot(read.tree(text="(H,(L,((T,(E,D)),(C,(B,M)))));"))

    You can just remove parentheses to collapse clades into polytomies:

    plot(read.tree(text="(H,(L,((T,E,D),(C,(B,M)))));"))

  3. Hey Andrew!
    It has been a while since we last chatted! Where are you now? Glad I found your new blog.
    Just wanted to add my two cents. If you were interested in adding a bit of information about taxonomic distance into the tree, you could try adding branch lengths based on the TimeTree of Life (http://www.timetree.org). Type in any two taxa and the site will spit out all of the available estimates for time of divergence from the literature. Here is an example: http://www.timetree.org/index.php?taxon_a=tabanidae&taxon_b=bezzia&submit=Search. You would still have to decide which estimate to use (or how to combine multiple estimates). I suppose it depends on what you are using the tree for.

    1. Thanks for the resource, Russell! Looks very useful. I’ll have to make a more detailed post about what I mean to use it for …
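
The suggestions in comments 1 and 2 can be sketched together roughly as follows.  This is only an illustration using the single-letter tree from comment 2; the object names (resolved, collapsed, lengthy) are made up for the example, and giving every branch a length of 1 is just a way to have some branch lengths to delete.

```r
library(ape)

# Fully resolved tree and the version with (E,D) collapsed into a polytomy
resolved  <- read.tree(text = "(H,(L,((T,(E,D)),(C,(B,M)))));")
collapsed <- read.tree(text = "(H,(L,((T,E,D),(C,(B,M)))));")

# Removing one pair of parentheses removes one internal node,
# so the collapsed tree is no longer strictly bifurcating
print(is.binary(resolved))
print(is.binary(collapsed))
print(Nnode(resolved) - Nnode(collapsed))

# Give every branch an arbitrary length of 1, then delete the lengths again
lengthy <- compute.brlen(collapsed, 1)
lengthy$edge.length <- NULL

# With no branch lengths, plot() lines all the tips up on the right
plot(lengthy)
```

Checking is.binary() is a quick way to confirm that the polytomy actually made it into the tree object before it goes into any downstream analysis.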
