1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
|
#Please insert up references in the next lines (line starts with keyword UP)
UP arb.hlp
UP glossary.hlp
#Please insert subtopic references (line starts with keyword SUB)
#SUB subtopic.hlp
# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}
#************* Title of helpfile !! and start of real helpfile ********
TITLE Branch analysis
OCCURRENCE ARB_NT/Tree/Branch analysis
DESCRIPTION Branch analysis functions
Functions provided here may be useful to
- detect wrong placed species or groups (or poor data, which might also lead to wrong placement)
- detect anomalies caused by (wrong) tree reconstruction
- gather information about tree topologies.
Each function reports several values gathered during execution.
SECTION Analyse distances in tree
For the whole tree
- the in-tree-distance (ITD = sum of all branchlengths) and
- the per-species-distance (PSD = ITD / number of species) are displayed.
For all leafs in the tree the following values are calculated:
- mean distance to all other leafs
- minimum distance to any other leaf
- maximum distance to any other leaf
It reports the mean and the range of each of these 3 values separately
for all and for marked species.
SECTION Using the PSD to compare trees
The PSD is useful when comparing tree topologies (based on similar sets
of species) that were reconstructed using different methods. Imagine
you have two trees:
- tree_raxml (reconstructed with RAxML)
- tree_arbpars (reconstructed with ARB parsimony)
The PSDs of both trees will be quite different (maybe by factor 50 or 60).
Calculating the ratio of both PSDs, give you a good value for scaling
the branchlengths of (a copy of) one of the trees. For example the PSDs
might be
- PSDr = PSD(tree_raxml) = 0.002219
- PSDp = PSD(tree_arbpars) = 0.118483
Now you may scale the branchlengths of tree_arbpars by factor 0.01873 (=PSDr/PSDp) or those
of tree_raxml by factor 53.39 (==PSDp/PSDr) to ease comparison of the two trees.
SECTION Mark long branches
For each furcation in the tree, the relative difference between the distances of its
subtrees is calculated.
'Distance' here is the sum of all branches between the furcation and
the least distant leaf of the left resp. right subtree.
Relative difference Meaning
The nearer subtree has at least
10% 90%
50% 50%
75% 25%
90% 10%
of the farther subtrees distance.
Starting from the tree-tips, this function marks the more distant subtree of any furcation
where the relative and absolute difference are above the specified minimas.
When a subtree has been marked, all further furcations between that subtree and the root
of the whole tree will be ignored.
Poorly aligned sequences often result in long branches in the tree. Being able to identify
those branches quickly helps to find those sequences.
The indented workflow is
* search long branches
* check alignment and data and fix any problems
* recalculate tree parts (see LINK{pa_add.hlp})
* search again. Now you may find other branches, nearer to the tree root.
SECTION Mark deep leafs
Marks alls leafs in tree that have
- depth above min.depth and
- root-distance above min.root-distance
'depth' is the number of branches between root and leaf. Multifurcations
are respected properly.
The 'root-distance' is the sum of the lengths of all branches between the root and
a leaf.
SECTION Mark degenerated branches
Branches are considered degenerated when two subtrees of an inner node
differ in size (=number of members) by a reasonable factor.
This function allows you to specify that degeneration factor.
For each degenerated inner node, the smaller subtree will be marked as whole.
The not-marked subtree will be examined for further degenerated nodes.
Common reasons for degenerated trees:
- subsequently adding species using the 'quick add marked'-feature of
ARB parsimony without ever optimizing the whole tree.
- some "phylogenetic areas" are explored more thoroughly than others, resulting in
unbalanced representation of the evolution as it took place.
This is especially relevant if your database contains many clone-variants and you try
to calculate a tree.
Solutions:
- Optimize your tree. For big trees you might try to
- mark questionable species using this function and then
- perform local/global optimization of marked species in ARB parsimony.
- Replace over-represented areas by one or few representatives (see also LINK{di_clusters.hlp}).
Calculate a new or optimize an existing tree with that subset of species.
Then quick-add previously removed species into that tree.
NOTES To compare the information of two or more trees,
open new ARB_NT-window using 'File/New window' and popup their
'Branch analysis'-windows.
EXAMPLES None
WARNINGS None
BUGS No bugs known
|