The "background probabilities" (by default, dinucleotide
probabilities) that Sigma uses are, if no other options are given,
extracted from the input sequence itself. This is generally not
advisable: it is better to estimate them from larger quantities of
similar sequence (e.g., all intergenic sequence in the organism of
interest). Sigma offers two options for this: with the -b option,
one can supply an auxiliary file (in fasta format) containing such
sequences, or with the -B option, one can supply a file containing
just the single-nucleotide and di-nucleotide frequencies. In the
latter file, each line either begins with # (and
is ignored), or contains a single nucleotide or di-nucleotide,
whitespace, and the frequency. All possible single- and
di-nucleotides should be present; tri-nucleotides and higher will be
ignored.
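
The frequency-file format described above can be read with a few lines
of Python. The sketch below is illustrative only and is not part of
Sigma: the function names parse_background and missing_entries are
hypothetical, and skipping blank lines and keeping names case-sensitive
are assumptions rather than documented Sigma behaviour.

```python
from itertools import product


def parse_background(text):
    """Parse frequency-file text into a dict mapping k-mer -> frequency.

    Lines beginning with '#' are ignored; entries longer than two
    nucleotides are skipped, mirroring the description above.
    """
    freqs = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines are tolerated and skipped.
        if not line or line.startswith("#"):
            continue
        fields = line.split()  # name, whitespace, frequency
        if len(fields) < 2:
            continue
        kmer = fields[0]
        if len(kmer) > 2:
            continue  # tri-nucleotides and higher are ignored
        freqs[kmer] = float(fields[1])
    return freqs


def missing_entries(freqs, alphabet="ACGT"):
    """Return the single- and di-nucleotides absent from freqs."""
    expected = list(alphabet)
    expected += ["".join(p) for p in product(alphabet, repeat=2)]
    return [k for k in expected if k not in freqs]
```

Running missing_entries on the parsed result is a quick way to confirm
that all 4 single nucleotides and all 16 di-nucleotides are present
before passing a file to the -B option.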
An example file is yeast_bg in this directory and corresponds to
intergenic sequence in S. cerevisiae.
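
A file of this kind could also be produced from one's own sequences.
The sketch below counts single nucleotides and overlapping
di-nucleotides in fasta-formatted input and writes them in the layout
described above. It is not part of Sigma, and all function names are
hypothetical. It assumes that the frequencies are plain normalized
counts taken on the given strand only and that non-ACGT characters
should be skipped; none of this is stated here, so compare the output
against yeast_bg before relying on it.

```python
from collections import Counter
from itertools import product


def read_fasta_sequences(lines):
    """Yield one upper-cased sequence per fasta record."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def background_frequencies(sequences, alphabet="ACGT"):
    """Return normalized single- and di-nucleotide frequencies."""
    singles, pairs = Counter(), Counter()
    valid = set(alphabet)
    for seq in sequences:
        for i, base in enumerate(seq):
            if base not in valid:
                continue  # assumption: ambiguous bases such as N are skipped
            singles[base] += 1
            if i + 1 < len(seq) and seq[i + 1] in valid:
                pairs[base + seq[i + 1]] += 1
    n1, n2 = sum(singles.values()), sum(pairs.values())
    freqs = {b: (singles[b] / n1 if n1 else 0.0) for b in alphabet}
    for p in product(alphabet, repeat=2):
        k = "".join(p)
        freqs[k] = pairs[k] / n2 if n2 else 0.0
    return freqs


def format_background(freqs):
    """Render freqs as text: one 'name<TAB>frequency' entry per line."""
    lines = ["# nucleotide or di-nucleotide, then frequency"]
    lines += [f"{k}\t{v:.6f}" for k, v in freqs.items()]
    return "\n".join(lines) + "\n"
```

For example, calling format_background(background_frequencies(
read_fasta_sequences(open(path)))) on a fasta file at a path of your
choosing gives text that can be saved and inspected by hand.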