File: branch_analysis.hlp

#Please insert up references in the next lines (line starts with keyword UP)
UP	arb.hlp
UP	glossary.hlp

#Please insert subtopic references  (line starts with keyword SUB)
#SUB	subtopic.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile ********
TITLE		Branch analysis

OCCURRENCE	ARB_NT/Tree/Branch analysis

DESCRIPTION     Branch analysis functions

                Functions provided here may be useful to
                - detect wrongly placed species or groups (or poor data, which might also lead to wrong placement)
                - detect anomalies caused by (wrong) tree reconstruction
                - gather information about tree topologies.

                Each function reports several values gathered during execution.

SECTION         Analyse distances in tree

                For the whole tree
                 - the in-tree-distance (ITD = sum of all branchlengths) and
                 - the per-species-distance (PSD = ITD / number of species) are displayed.

                For all leaves in the tree the following values are calculated:
                 - mean distance to all other leaves
                 - minimum distance to any other leaf
                 - maximum distance to any other leaf

                The function reports the mean and the range of each of these three values,
                separately for all species and for marked species.
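
                The values above can be illustrated with a minimal Python sketch. It is not
                ARB's implementation: the toy tree, the names PARENT and LEAVES and the
                helper functions are assumptions made up for this example.

```python
# Hypothetical toy tree: child -> (parent, branch length). "root" has no entry.
PARENT = {
    "n1": ("root", 0.1),
    "A": ("n1", 0.2),
    "B": ("n1", 0.3),
    "C": ("root", 0.4),
}
LEAVES = ["A", "B", "C"]


def ancestors(node):
    """Map node and each of its ancestors to the distance from node."""
    result, dist = {node: 0.0}, 0.0
    while node in PARENT:
        parent, length = PARENT[node]
        dist += length
        result[parent] = dist
        node = parent
    return result


def leaf_distance(a, b):
    """Path length between two leaves via their lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


# In-tree-distance: sum of all branch lengths.
itd = sum(length for _, length in PARENT.values())
# Per-species-distance: ITD divided by the number of species (leaves).
psd = itd / len(LEAVES)

# Per leaf: (mean, minimum, maximum) distance to the other leaves.
stats = {}
for leaf in LEAVES:
    d = [leaf_distance(leaf, other) for other in LEAVES if other != leaf]
    stats[leaf] = (sum(d) / len(d), min(d), max(d))

print(f"ITD={itd:.4f} PSD={psd:.4f}")
for leaf, (mean, lo, hi) in stats.items():
    print(f"{leaf}: mean={mean:.4f} min={lo:.4f} max={hi:.4f}")
```

                The pairwise approach is quadratic in the number of leaves, which is fine for
                an illustration but would be slow for large trees.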

SECTION         Using the PSD to compare trees

                The PSD is useful when comparing tree topologies (based on similar sets
                of species) that were reconstructed using different methods. Imagine
                you have two trees:

                 - tree_raxml (reconstructed with RAxML)
                 - tree_arbpars (reconstructed with ARB parsimony)

                The PSDs of the two trees will differ considerably (possibly by a factor of 50 or 60).
                The ratio of the two PSDs gives you a good value for scaling
                the branchlengths of (a copy of) one of the trees. For example, the PSDs
                might be

                 - PSDr = PSD(tree_raxml)    =  0.002219
                 - PSDp = PSD(tree_arbpars)  =  0.118483

                Now you may scale the branchlengths of tree_arbpars by factor 0.01873 (=PSDr/PSDp) or those
                of tree_raxml by factor 53.39 (=PSDp/PSDr) to ease comparison of the two trees.
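
                The arithmetic of this example can be checked with a few lines of Python.
                The variable names are chosen for this sketch only; the two PSD values are
                the ones given above.

```python
# PSD values taken from the example above.
psd_raxml = 0.002219    # PSDr
psd_arbpars = 0.118483  # PSDp

# Factor for scaling tree_arbpars down to the scale of tree_raxml.
scale_arbpars = psd_raxml / psd_arbpars
# Factor for scaling tree_raxml up to the scale of tree_arbpars.
scale_raxml = psd_arbpars / psd_raxml

print(f"scale tree_arbpars by {scale_arbpars:.5f}")
print(f"scale tree_raxml   by {scale_raxml:.2f}")
```

                The two factors are reciprocals of each other, so only one of the two trees
                needs to be scaled.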

SECTION         Mark long branches

                For each furcation in the tree, the relative difference between the distances of its
                subtrees is calculated.

                'Distance' here is the sum of all branches between the furcation and
                the nearest leaf of the left or right subtree, respectively.

                Relative difference     Distance of the nearer subtree
                                        (in percent of the farther subtree's distance)
                10%                     90%
                50%                     50%
                75%                     25%
                90%                     10%
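
                The table is consistent with the formula (far - near) / far. The following
                Python sketch assumes that formula; it is inferred from the table and not
                taken from the ARB sources, and the function name is made up.

```python
def relative_difference(dist_left, dist_right):
    """Relative difference of two subtree distances, assumed to be
    (far - near) / far as suggested by the table above."""
    near, far = sorted((dist_left, dist_right))
    if far == 0:
        return 0.0  # both subtrees have zero distance: no difference
    return (far - near) / far


# Reproduce the rows of the table: nearer subtree as a share of the farther one.
for near_share in (0.90, 0.50, 0.25, 0.10):
    rel = relative_difference(near_share, 1.0)
    print(f"nearer = {near_share:.0%} of farther -> relative difference {rel:.0%}")
```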

                Starting from the tree tips, this function marks the more distant subtree of any furcation
                where both the relative and the absolute difference are above the specified minima.

                When a subtree has been marked, all further furcations between that subtree and the root
                of the whole tree will be ignored.
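
                One possible reading of this procedure is sketched below in Python. It is
                an illustration only and not ARB's code: the tuple-based tree layout, the
                function names and the thresholds are assumptions, and the relative
                difference is assumed to be (far - near) / far.

```python
# Hypothetical binary tree: a leaf is a string, an inner node is a tuple
# (left subtree, left branch length, right subtree, right branch length).
TREE = (("A", 0.1, "B", 0.1), 0.05, ("C", 0.1, "D", 1.0), 2.0)


def leaves(node):
    """Return the names of all leaves below node."""
    if isinstance(node, str):
        return [node]
    left, _, right, _ = node
    return leaves(left) + leaves(right)


def mark_long_branches(node, min_rel, min_abs, marked):
    """Visit furcations from the tips towards the root.

    Returns (distance to the nearest leaf, True if the subtree contains a mark).
    """
    if isinstance(node, str):
        return 0.0, False
    left, len_l, right, len_r = node
    near_l, hit_l = mark_long_branches(left, min_rel, min_abs, marked)
    near_r, hit_r = mark_long_branches(right, min_rel, min_abs, marked)
    dist_l, dist_r = len_l + near_l, len_r + near_r
    hit = hit_l or hit_r
    if not hit:  # furcations above a marked subtree are ignored
        far = max(dist_l, dist_r)
        diff = abs(dist_l - dist_r)
        if far > 0 and diff / far > min_rel and diff > min_abs:
            marked.update(leaves(left if dist_l > dist_r else right))
            hit = True
    return min(dist_l, dist_r), hit


marked = set()
mark_long_branches(TREE, min_rel=0.5, min_abs=0.2, marked=marked)
print(sorted(marked))
```

                In this toy tree only leaf D is marked. The root furcation also has a large
                difference, but it lies between the marked subtree and the root and is
                therefore skipped.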

                Poorly aligned sequences often result in long branches in the tree. Being able to identify
                those branches quickly helps to find those sequences.

                The intended workflow is
                * search long branches
                * check alignment and data and fix any problems
                * recalculate tree parts (see LINK{pa_add.hlp})
                * search again. Now you may find other branches, nearer to the tree root.

SECTION         Mark deep leafs

                Marks all leaves in the tree that have
                - depth above min.depth and
                - root-distance above min.root-distance

                'depth' is the number of branches between root and leaf. Multifurcations
                are respected properly.

                The 'root-distance' is the sum of the lengths of all branches between the root and
                a leaf.
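
                The two criteria can be sketched in Python as follows. This is a simplified
                illustration, not ARB's code: the tree layout and all names are assumptions,
                every branch counts towards the depth, and any special treatment of
                multifurcations is not modelled.

```python
# Hypothetical tree: a leaf is a string, an inner node is a list of
# (child, branch length) pairs.
TREE = [
    ("A", 0.1),
    ([("B", 0.2), ([("C", 0.3), ("D", 0.05)], 0.1)], 0.1),
]


def mark_deep_leaves(node, min_depth, min_root_dist, depth=0, dist=0.0):
    """Return the leaves whose depth and root-distance both exceed the minima.

    depth: number of branches between the root and the node
    dist:  sum of the branch lengths between the root and the node
    """
    if isinstance(node, str):
        if depth > min_depth and dist > min_root_dist:
            return {node}
        return set()
    marked = set()
    for child, length in node:
        marked |= mark_deep_leaves(child, min_depth, min_root_dist,
                                   depth + 1, dist + length)
    return marked


deep = mark_deep_leaves(TREE, min_depth=2, min_root_dist=0.3)
print(sorted(deep))
```

                Here C and D both have depth 3, but only C also has a root-distance above
                0.3, so only C is marked.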

SECTION         Mark degenerated branches

                Branches are considered degenerated when the two subtrees of an inner node
                differ in size (= number of members) by at least a certain factor.

                This function allows you to specify that degeneration factor.

                For each degenerated inner node, the smaller subtree will be marked as a whole.
                The unmarked subtree will be examined for further degenerated nodes.
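
                A minimal Python sketch of this idea follows. It is not ARB's
                implementation: the tree layout and names are made up, a node is assumed to
                be degenerated when the larger subtree has at least 'factor' times as many
                members as the smaller one, and both subtrees of a non-degenerated node are
                assumed to be examined further.

```python
# Hypothetical binary tree: a leaf is a string, an inner node is a pair of subtrees.
TREE = ("A", ("B", ("C", ("D", ("E", "F")))))


def leaves(node):
    """Return the names of all leaves below node."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def mark_degenerated(node, factor, marked):
    """Mark the smaller subtree of every degenerated inner node."""
    if isinstance(node, str):
        return
    small, large = sorted(node, key=lambda n: len(leaves(n)))
    if len(leaves(large)) >= factor * len(leaves(small)):
        marked.update(leaves(small))             # smaller subtree marked as a whole
        mark_degenerated(large, factor, marked)  # examine the unmarked subtree
    else:
        mark_degenerated(small, factor, marked)
        mark_degenerated(large, factor, marked)


marked = set()
mark_degenerated(TREE, factor=3, marked=marked)
print(sorted(marked))
```

                For this comb-shaped tree and a factor of 3 the single leaves A, B and C are
                marked; the remaining subtree of D, E and F is balanced enough to stay
                unmarked.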

                Common reasons for degenerated trees:
                - repeatedly adding species using the 'quick add marked'-feature of
                  ARB parsimony without ever optimizing the whole tree.
                - some "phylogenetic areas" are explored more thoroughly than others, resulting in
                  an unbalanced representation of the evolution as it took place.
                  This is especially relevant if your database contains many clone variants and you try
                  to calculate a tree.

                Solutions:
                - Optimize your tree. For big trees you might try to
                  - mark questionable species using this function and then
                  - perform local/global optimization of marked species in ARB parsimony.
                - Replace over-represented areas by one or a few representatives (see also LINK{di_clusters.hlp}).
                  Calculate a new or optimize an existing tree with that subset of species.
                  Then quick-add previously removed species into that tree.

NOTES		To compare the information of two or more trees,
                open a new ARB_NT window using 'File/New window' and pop up the
                'Branch analysis' window for each tree.

EXAMPLES	None

WARNINGS	None

BUGS		No bugs known