Friday, November 9, 2012

Phylogenetic Tree Generation for 200/2133 sequences


Abstract

Traditional phylogenetic tree views include the following layouts:
rectangular phylogram, rectangular cladogram, circular phylogram, circular cladogram, slanted phylogram, and slanted cladogram.
All of these layouts precisely show the distance from each actual sequence (leaf node) to its parent (internal node). However, they offer no way to observe the correlations between the leaf nodes.
We propose a new way of presenting phylogenetic trees that shows not only the evolution of the species but also the relations between different taxa directly (not through their ancestors). The spherical phylogram and cuboid cladogram (previously called side tree) have been added to our dimension reduction results for thousands of sequences. We use a neighbor joining algorithm to build the tree structure from those results, and we also use interpolation for other phylogenetic tree generation methods, such as Raxml, Ninja and Fast Tree.
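
As a rough illustration of the neighbor joining step mentioned above, here is a minimal sketch of the textbook algorithm in Python. It is not the code used for the results in this post; the function name, the example labels and the example distance matrix are all made up for illustration, and the output is a Newick string with a three-way split at the top.

```python
def neighbor_joining(names, dist):
    """Textbook neighbor joining on a symmetric distance matrix.

    names: list of leaf labels (at least 3).
    dist:  list of lists, dist[i][j] is the distance between names[i] and names[j].
    Returns an unrooted tree as a Newick string.
    """
    if len(names) < 3:
        raise ValueError("need at least 3 leaves")
    nodes = list(names)
    # Distances are kept in a dict of dicts keyed by node label
    # (a joined node is labeled by its own Newick subtree).
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}

    while len(nodes) > 3:
        n = len(nodes)
        # Row sums over the nodes that are still active
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair that minimizes Q(a, b) = (n - 2) * d(a, b) - r(a) - r(b)
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to the new internal node
        la = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        lb = d[a][b] - la
        new = f"({a}:{la:.4f},{b}:{lb:.4f})"
        # Distances from the new internal node to every remaining node
        d[new] = {new: 0.0}
        for c in nodes:
            if c in (a, b):
                continue
            dc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
            d[new][c] = dc
            d[c][new] = dc
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    # Three nodes left: connect them to one central node
    a, b, c = nodes
    la = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    lb = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    lc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return f"({a}:{la:.4f},{b}:{lb:.4f},{c}:{lc:.4f});"


# Made-up example: four leaves whose distances fit a tree exactly
example_names = ["A", "B", "C", "D"]
example_dist = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
example_tree = neighbor_joining(example_names, example_dist)
```

In our setting the input distances would come from the dimension reduction result (for example the 3D plot distance) rather than from a hand-written matrix like the one above.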

Description

Environment: Single machine for 200 sequences, PolarGrid for 2133 sequences
Aligner: SmithWaterman
ScoringMatrix: EDNAFULL
GapOpen: -16
GapExt: -4
DistanceType: Percentage Identity
Method: all varied
WithReverse: dynamically determined by the algorithm
Multiple Sequence Alignment: Clustal Omega, Clustal W2 and Muscle
Tree Generation Method: Raxml, Fast Tree, Ninja and Neighbor Joining.
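
To make the "Percentage Identity" distance type more concrete, here is a small Python sketch that computes it from one pairwise alignment. The settings above do not say which denominator is used, so this sketch assumes the number of alignment columns (skipping columns where both sequences have a gap), and it assumes the distance is one minus the identity. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Fraction of alignment columns where both sequences have the same residue.

    Assumption: the denominator is the number of columns in which at least
    one sequence has a residue. Other definitions of percent identity exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if x == y:
            matches += 1
    return matches / columns if columns else 0.0


def pid_distance(aligned_a, aligned_b):
    """Distance derived from percent identity (assumed to be 1 - identity)."""
    return 1.0 - percent_identity(aligned_a, aligned_b)
```

For example, `pid_distance("ACGT", "ACGA")` gives 0.25, since three of the four columns match.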

Dataset

1. 200 sequences:
1) 126 new centers from fungi454 (id: 0 ~ 125)
2) 74 sequences from GenBank (id: 126 ~ 199)
2. 2133 sequences:
1) 126 new centers from fungi454 (id: 0 ~ 125)
2) 74 sequences from GenBank (id: 126 ~ 199)
3) 988 unique sequences from Wittiya (id: 200 ~ 1187, originally 1013 sequences)
4) 945 unique sequences from Kruger (id: 1188 ~ 2132, originally 1130 sequences)

Tree Configuration

Level configuration:
a larger point size means a higher level in the tree
Color scheme:
1) Spherical Phylogram
fungi454: Green
GenBank: Orange
Wittiya: Yellow
Kruger: Blue
2) Cuboid Cladogram
Colored by branch; the 200 sequence set uses 20 points per branch, and the 2133 sequence set uses 300 points per branch
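
The id ranges from the Dataset section and the spherical phylogram colors can be combined into a small lookup. The Python sketch below is only an illustration of that mapping; the names `SOURCES` and `source_and_color` are made up and are not part of PlotViz3 or of our pipeline.

```python
# (first id, last id, source, color in the spherical phylogram)
SOURCES = [
    (0, 125, "fungi454", "Green"),
    (126, 199, "GenBank", "Orange"),
    (200, 1187, "Wittiya", "Yellow"),
    (1188, 2132, "Kruger", "Blue"),
]


def source_and_color(seq_id):
    """Return (source, color) for a sequence id in the 2133 sequence set."""
    for first, last, source, color in SOURCES:
        if first <= seq_id <= last:
            return source, color
    raise ValueError(f"sequence id {seq_id} is outside 0 ~ 2132")
```

The 200 sequence set uses only the first two ranges (ids 0 ~ 199).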

Final Result

The Spherical Phylogram and Cuboid Cladogram are in pviz format and must be opened with PlotViz3. The Rectangle Cladogram is in pdf format.
1. 200 sequence set result:
1) Clustal Omega with Fast Tree: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
2) Clustal Omega with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
3) Clustal W2 with Fast Tree: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
4) Clustal W2 with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
5) Muscle with Fast Tree: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
6) Muscle with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
7) Ninja with Original SWG Pid Distance: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
8) Ninja with 10D Pid Distance: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
9) Neighbor Joining with 3D plot Distance: Spherical Phylogram

2. 2133 sequence set result:
1) Clustal Omega 1 iteration with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
2) Clustal Omega 1 iteration (Anna's edition) with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
3) Clustal Omega 10 iterations with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
4) Clustal Omega 20 iterations with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
5) Muscle 2 iterations with Raxml: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
6) Ninja with Original SWG Pid Distance: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
7) Ninja with 3D plot Distance: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
8) Ninja with 10D Pid Distance: Spherical Phylogram/Cuboid Cladogram/Rectangle Cladogram
9) Neighbor Joining with 3D plot Distance: Spherical Phylogram

Screen Shot

With PlotViz, we can show sequences in a spherical phylogram with or without their names. Here is a screenshot of the 200 sequence set, built with the neighbor joining algorithm from 3D plot distance:

The cuboid plot from Ninja using 3D plot distance is shown below:
Part of the cuboid tree can be hidden so that the more interesting branches are shown clearly:








4 comments:

  1. Hi,
    Thanks for posting your work online!
    I am a master's student currently working on my thesis. It involves generating phylogenetic trees, for which I need some datasets. Would it be possible for you to share the 200 and 2133 sets? They would be immensely useful for me too...

    Thank you

    Replies
    1. Aniket, we are sorry, but these data sets are not ours, so we don't have the right to hand them over to you. If you are looking for publicly available sequences, you may consider looking in NCBI (http://www.ncbi.nlm.nih.gov/)

  2. Hi,

    do you provide any of your phylogenetic calculations as newick/phylip/... formatted files, too? Thanks.

    Replies
    1. Hi pyr0, currently we only provide the visualization results, not the actual data files or files in other formats. If you could tell us why you need them, we might be able to provide them privately.
