File: REVshared.bf

package info (click to toggle)
hyphy 2.5.69%2Bdfsg-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 26,728 kB
  • sloc: cpp: 81,964; xml: 467; lisp: 341; python: 166; javascript: 117; sh: 106; makefile: 87; ansic: 86
file content (140 lines) | stat: -rw-r--r-- 2,406 bytes parent folder | download | duplicates (7)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
/* This is an example HY-PHY Batch File.



   It reads in a '#' nucleotide dataset data/hiv.nuc and estimates

   maximum ln-likelihood based on the tree contained in the data file,

   using the General Reversible model with the parameters

   shared by all branches.

   

   Output is printed out as a Newick Style tree with branch lengths

   representing the number of expected substitutions per branch (which

   is the default setting for nucleotide models w/o rate variation).

   

   

   Sergei L. Kosakovsky Pond and Spencer V. Muse 

   December 1999. 

*/



/* 1. Read in the data and store the result in a DataSet variable.*/



DataSet 		nucleotideSequences = ReadDataFile ("data/hiv.nuc");

   

/* 2. Filter the data, specifying that all of the data is to be used

	  and that it is to be treated as nucleotides.*/

	  

DataSetFilter	filteredData = CreateFilter (nucleotideSequences,1);



/* 3. Collect observed nucleotide frequencies from the filtered data. observedFreqs will

	  store the vector of frequencies. */



HarvestFrequencies (observedFreqs, filteredData, 1, 1, 1);



/* 4. Define the KHY substitution matrix. '*' is defined to be -(sum of off-diag row elements).

	  AG,AT,CG,CT,GT are the shared parameters, representing the ratio
	  of corresponding substitution rates to the AC rate.
	  t is the "branch length"*/



global AG;

global AT;

global CG;

global CT;

global GT;



REVRateMatrix = 

		{{*,t,AG*t,AT*t}

		 {t,*,CG*t,CT*t}

		 {AG*t,CG*t,*,GT*t}

		 {AT*t,CT*t,GT*t,*}};

		 

/*5.  Define the REV model, by combining the substitution matrix with the vector of observed (equilibrium)

	  frequencies. */

	  

Model REV	 = (REVRateMatrix, observedFreqs);



/*6.  Now we can define the tree variable, using the tree string read from the data file,

	  and, by default, assigning the last defined model (REV) to all tree branches. */

	  

Tree	givenTree = DATAFILE_TREE;



/*7.  Since all the likelihood function ingredients (data, tree, equilibrium frequencies)

	  have been defined we are ready to construct the likelihood function. */

	  

LikelihoodFunction  theLnLik = (filteredData, givenTree);



/*8.  Maximize the likelihood function, storing parameter values in the matrix paramValues */



Optimize (paramValues, theLnLik);



/*9.  Print the tree with optimal branch lengths to the console. */



fprintf  (stdout, theLnLik);