File: todo.cpp

/*
Bugs :
* Print report : wrap long lines in the vector description
* Primers at the ends of linear fragments do not match
* Features are shifted in the alignment printout
* Restriction digest with overlapping restriction sites

WONTFIX:
* Crash when the inline plot is changed : Ncoils bug

Feature Requests :
* Dot-Plot (DNA)
* Optimization of overhang primers
* Edit primer dialog : click column => sort
* Primer secondary structure prediction
* Homology Plot
* Export plot data
* Helical wheel
* That ligation thingy 

* Alignment colors as in ABI
* Save all
* Feature names movable / keep position with changes
* Change start number for alignments

SEE: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

----
http://www.atgc.org/GenomePixelizer/Examples/GenoPixExamples.html

----

EMBOSS:
* codon usage statistic
* antigenic prediction
* dotplot
* PEST motif prediction
* transmembrane region prediction


----
________________________________________________________________________________
DONE:
ALGORITHM for finding siRNA duplexes in mRNA:
for each input sequence:

    find the start position of the CDS in the feature table
    if there is no such CDS, take the -sbegin position as the CDS start

    for each 23 base window along the sequence:

        set the score for this window = 0
        if base 2 of the window is not 'a': ignore this window
        if the window is within 50 bases of the CDS start: ignore this window
        if the window is within 100 bases of the CDS start: score = -2
        measure the %GC of the 20 bases from position 2 to 21 of the window
        for the following %GC values change the score:
                %GC <= 25% (<= 5 bases): ignore this window
                %GC 30% (6 bases): score + 0
                %GC 35% (7 bases): score + 2
                %GC 40% (8 bases): score + 4
                %GC 45% (9 bases): score + 5
                %GC 50% (10 bases): score + 6
                %GC 55% (11 bases): score + 5
                %GC 60% (12 bases): score + 4
                %GC 65% (13 bases): score + 2
                %GC 70% (14 bases): score + 0
                %GC >= 75% (>= 15 bases): ignore this window

        if the window starts with 'AA': score + 3
        if the window does not start with 'AA' and it is required: ignore this window
        if the window ends with 'TT': score + 1
        if the window does not end with 'TT' and it is required: ignore this window
        if 4 G's in a row are found: ignore this window
        if any base occurs 4 times in a row and such runs are not allowed: ignore this window
        if PolIII probes are required and the window is not NARN(17)YNN: ignore this window
        if the score is > 0: store this window for output
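
The per-window rules above can be sketched in C++ roughly as follows. This is
only an illustration under stated assumptions, not taken from any existing
implementation: SirnaOptions, its field names, hasRun, scoreWindow and the
distFromCdsStart parameter are hypothetical, the window is assumed to be
23 upper-case DNA bases, and the run checks are assumed to cover the whole
window.

```cpp
#include <cstddef>
#include <string>

// Hypothetical switches for the "required" / "not required" rules in the notes.
struct SirnaOptions {
    bool requireAA = false;      // ignore windows that do not start with "AA"
    bool requireTT = false;      // ignore windows that do not end with "TT"
    bool allowPolybase = false;  // tolerate runs of 4 identical bases (never GGGG)
    bool requirePolIII = false;  // ignore windows that are not NARN(17)YNN
};

// True if s contains at least len identical bases in a row. If only is set,
// just runs of that base count.
static bool hasRun(const std::string& s, std::size_t len, char only = '\0') {
    std::size_t run = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        run = (i > 0 && s[i] == s[i - 1]) ? run + 1 : 1;
        if (run >= len && (only == '\0' || s[i] == only)) return true;
    }
    return false;
}

// Scores one 23-base window. Returns false if the window is ignored or its
// final score is not positive; otherwise writes the score and returns true.
// distFromCdsStart is the assumed distance in bases from the CDS start.
bool scoreWindow(const std::string& w, long distFromCdsStart,
                 const SirnaOptions& opt, int& score) {
    if (w.size() != 23 || w[1] != 'A') return false;
    if (distFromCdsStart < 50) return false;
    int s = (distFromCdsStart < 100) ? -2 : 0;

    // Count G/C in bases 2..21 (1-based), i.e. indices 1..20.
    int gc = 0;
    for (std::size_t i = 1; i <= 20; ++i)
        if (w[i] == 'G' || w[i] == 'C') ++gc;
    if (gc <= 5 || gc >= 15) return false;
    static const int gcBonus[] = {0, 2, 4, 5, 6, 5, 4, 2, 0};  // 6..14 GC bases
    s += gcBonus[gc - 6];

    const bool startsAA = w.compare(0, 2, "AA") == 0;
    const bool endsTT = w.compare(21, 2, "TT") == 0;
    if (startsAA) s += 3; else if (opt.requireAA) return false;
    if (endsTT) s += 1; else if (opt.requireTT) return false;

    if (hasRun(w, 4, 'G')) return false;
    if (!opt.allowPolybase && hasRun(w, 4)) return false;

    if (opt.requirePolIII) {
        const bool r = w[2] == 'A' || w[2] == 'G';    // R = purine at base 3
        const bool y = w[20] == 'C' || w[20] == 'T';  // Y = pyrimidine at base 21
        if (!r || !y) return false;
    }
    if (s <= 0) return false;
    score = s;
    return true;
}
```

Under these assumptions, the window AAGCATGCATGCATGCATGCATT placed 200 bases
from the CDS start has 10 GC bases in positions 2 to 21 and scores
6 + 3 + 1 = 10.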
	
    sort the windows found by their score
    output the 23-base windows to the sequence file
    if the 'context' qualifier is specified, output window bases 1 and 2 in brackets to the report file
    take window bases 3 to 21, append 'dTdT', and output to the report file
    take window bases 3 to 21, reverse complement them, append 'dTdT', and output to the report file
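
The last two output steps can be sketched in C++ as below. Again this is a
minimal illustration under assumptions: Duplex, makeDuplex, complement and
reverseComplement are hypothetical names, the window is assumed to be
23 upper-case DNA bases, and the bases are left as DNA letters because the
notes do not mention any conversion.

```cpp
#include <algorithm>
#include <string>

// Watson-Crick complement of one upper-case DNA base; anything else maps to 'N'.
static char complement(char b) {
    switch (b) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'G': return 'C';
        case 'C': return 'G';
        default:  return 'N';
    }
}

// Reverses the sequence, then complements every base.
std::string reverseComplement(std::string s) {
    std::reverse(s.begin(), s.end());
    std::transform(s.begin(), s.end(), s.begin(), complement);
    return s;
}

// The two strands written to the report for one stored window.
struct Duplex {
    std::string sense;
    std::string antisense;
};

// Builds both strands from a 23-base window: bases 3..21 (1-based) plus 'dTdT'.
Duplex makeDuplex(const std::string& window) {
    const std::string core = window.substr(2, 19);  // indices 2..20
    return Duplex{core + "dTdT", reverseComplement(core) + "dTdT"};
}
```

For the window AAGCATGCATGCATGCATGCATT this yields GCATGCATGCATGCATGCAdTdT
and TGCATGCATGCATGCATGCdTdT.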

*/