1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130
|
/* This is an example HY-PHY Batch File.
It reads in a '#' nucleotide dataset data/hiv.nuc and estimates
maximum ln-likelihood based on the tree contained in the data file,
using Hasegawa et all 85 (HKY 85) model with transition/transversion ratio
shared by all branches.
Output is printed out as a Newick Style tree with branch lengths
representing the number of expected substitutions per branch (which
is the default setting for nucleotide models w/o rate variation).
Sergei L. Kosakovsky Pond and Spencer V. Muse
December 1999.
*/
/* 1. Read in the data and store the result in a DataSet variable.*/
DataSet nucleotideSequences = ReadDataFile ("data/hiv.nuc");
/* 2. Filter the data, specifying that all of the data is to be used
and that it is to be treated as nucleotides.*/
DataSetFilter filteredData = CreateFilter (nucleotideSequences,1);
/* 3. Collect observed nucleotide frequencies from the filtered data. observedFreqs will
store the vector of frequencies. */
HarvestFrequencies (observedFreqs, filteredData, 1, 1, 1);
/* 4. Define the KHY substitution matrix. '*' is defined to be -(sum of off-diag row elements).
The variable R is the global transition/transversion ratio. */
global R;
HKY85RateMatrix =
{{*,trvs,R*trvs,trvs}
{trvs,*,trvs,R*trvs}
{R*trvs,trvs,*,trvs}
{trvs,R*trvs,trvs,*}};
/*5. Define the HKY85 model, by combining the substitution matrix with the vector of observed (equilibrium)
frequencies. */
Model HKY85 = (HKY85RateMatrix, observedFreqs);
/*6. Now we can define the tree variable, using the tree string read from the data file,
and, by default, assigning the last defined model (HKY85) to all tree branches. */
Tree givenTree = DATAFILE_TREE;
/*7. Since all the likelihood function ingredients (data, tree, equilibrium frequencies)
have been defined we are ready to construct the likelihood function. */
LikelihoodFunction theLnLik = (filteredData, givenTree);
/*8. Maximize the likelihood function, storing parameter values in the matrix paramValues */
Optimize (paramValues, theLnLik);
/*9. Print the tree with optimal branch lengths to the console. */
fprintf (stdout, theLnLik);
|