File: mcf_alignment_path_adder.hh

package info (click to toggle)
last-align 1651-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 14,688 kB
  • sloc: cpp: 44,419; python: 5,217; ansic: 1,938; sh: 710; makefile: 457
file content (66 lines) | stat: -rw-r--r-- 2,124 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
// Author: Martin C. Frith 2021
// SPDX-License-Identifier: GPL-3.0-or-later

// This calculates a kind of maximum local similarity score between
// two sequences.

// These inputs are parameters of an alignment probability model,
// Model A in [Fri20]:
// substitutionProbs: S' in [Fri20] section 2.1
// delInitProb: a'_D in [Fri20] section 5.1
// delNextProb: b'_D in [Fri20] section 5.1
// insInitProb: a'_I in [Fri20] section 5.1
// insNextProb: b'_I in [Fri20] section 5.1

// Any path through the model corresponds to a local alignment.  The
// model defines a probability for each path.  A path's "probability
// ratio" is its probability divided by the probability of a "null"
// path [Fri20].

// This code calculates the maximum, over coordinates i in sequence 1
// and j in sequence 2, sum of probability ratios of all paths passing
// through (i,j).  This is just like [Yao21] section 2.5, but for
// model A in [Fri20].

// The first "border" letters in both sequences are treated as a
// border, like Figure 2 in [Alt01].  With no borders, an edge effect
// reduces the score.  With 4 borders (but without "anchoring" like in
// [Alt01]), an edge effect increases the score.  With 2 borders,
// these edge effects somewhat cancel each other.

// [Fri20]: "How sequence alignment scores correspond to probability
// models", MC Frith 2020, Bioinformatics 36(2):408-415

// [Yao21]: "Improved DNA-versus-protein homology search for protein
// fossils", Y Yao & MC Frith

// [Alt01]: "The estimation of statistical parameters for local
// alignment score distributions", SF Altschul et al. NAR 29:351-361

#ifndef MCF_ALIGNMENT_PATH_ADDER_HH
#define MCF_ALIGNMENT_PATH_ADDER_HH

#include <vector>

namespace mcf {

typedef const double *const_dbl_ptr;

class AlignmentPathAdder {
public:
  typedef unsigned char uchar;

  double maxSum(const uchar *seq1, int len1,
		const uchar *seq2, int len2,
		const const_dbl_ptr *substitutionProbs,
		double delInitProb, double delNextProb,
		double insInitProb, double insNextProb,
		int border = 0);

private:
  std::vector<double> dynamicProgrammingValues;
};

}

#endif