Description of scripts that accompany LAST
==========================================

last-dotplot.py
---------------

This script makes a dotplot, a.k.a. Oxford Grid, of alignments in LAST
tabular format.  It requires the Python Imaging Library to be
installed.  To get a usage message::

  last-dotplot.py --help

To make a png-format dotplot of alignments in a file called "al"::

  last-dotplot.py al al.png

To get a nicer font, try something like::

  last-dotplot.py -f /usr/share/fonts/truetype/freefont/FreeSans.ttf al al.png

If the fonts are located somewhere different on your computer, change
this as appropriate.  To turn off the text and margins completely::

  last-dotplot.py -s0 al al.png

To limit the plot to 500x500 pixels::

  last-dotplot.py -x500 -y500 al al.png

If there are too many chromosomes, the dotplot will be very cluttered,
or the script might give up with an error message.  So you may want to
remove alignments involving fragmentary chromosomes first.  For
example, you could use "grep -v" to remove alignments involving
chromosomes with names like "chr1_random"::

  grep -v 'random' al > plotme
  last-dotplot.py plotme plotme.png


maf-convert.py
--------------

This script can convert MAF-format alignments to tabular format.  This
is needed to feed MAF alignments to last-dotplot.py.  Usage::

  maf-convert.py tab my-alignments.maf > my-alignments.tab

It can also convert MAF to AXT format, should you wish to do that::

  maf-convert.py axt my-alignments.maf > my-alignments.axt


last-reduce-alignments.sh
-------------------------

This script removes "uninteresting" alignments from LAST genome
comparisons in MAF format.  Roughly speaking, it removes paralog
alignments and keeps ortholog alignments.  More precisely, if region A
in genome 1 aligns with region B in genome 2, but A also aligns more
strongly with a different region X and B aligns more strongly with a
different region Y, then the alignment of A with B is removed.
This procedure is conservative: it is unlikely to remove one-to-one
orthologs, but it may keep some paralogs, e.g. if the ortholog is
(wholly or partially) deleted in one genome.  The usage is simple::

  last-reduce-alignments.sh my-alignments.maf > reduced-alignments.maf

There is also an option to remove alignments more aggressively: if A
aligns more strongly with X *or* B aligns more strongly with Y, then
the alignment of A with B is removed.  This is still unlikely to
remove one-to-one orthologs, but it may cause some regions that are
alignable to something to be aligned to nothing.  This option is
selected with "-d"::

  last-reduce-alignments.sh -d my-alignments.maf > reduced-alignments.maf
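
The difference between the default mode and "-d" comes down to "and"
versus "or".  The following is a minimal sketch of that decision only,
not the script's actual code.  The function name and its arguments are
hypothetical, and it assumes you already know whether each side aligns
more strongly elsewhere:

```python
def keep_alignment(a_dominated, b_dominated, aggressive=False):
    """Decide whether the alignment of A with B survives (illustrative only).

    a_dominated: True if A aligns more strongly with a different region X.
    b_dominated: True if B aligns more strongly with a different region Y.
    aggressive:  True mimics the "-d" option described above.
    """
    if aggressive:
        # "-d": remove if either side has a stronger alignment elsewhere.
        remove = a_dominated or b_dominated
    else:
        # Default: remove only if both sides have a stronger alignment elsewhere.
        remove = a_dominated and b_dominated
    return not remove


if __name__ == "__main__":
    # A aligns more strongly with X, but B has no stronger partner.
    print(keep_alignment(True, False))                   # default mode
    print(keep_alignment(True, False, aggressive=True))  # "-d" mode
```

In this sketch, an alignment where only one side has a stronger partner
is kept by default and removed in the aggressive mode.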


maf-join.py
-----------

This script joins two or more sets of pairwise (or multiple)
alignments into multiple alignments::

  maf-join.py aln1.maf aln2.maf aln3.maf > joined.maf

The top genome in each input file should be the same, and the script
simply joins alignment columns that are at the same position in the
top genome.  IMPORTANT LIMITATION: alignment columns with gaps in the
top sequence get joined arbitrarily, and probably wrongly.  Please
disregard such columns in downstream analyses.  Each input file must
have been sorted using maf-sort.sh (but the output of
last-reduce-alignments.sh is already in the right order).  For an
example of using LAST and maf-join.py, see multiMito.sh in the
examples directory.
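
To illustrate what "joining columns at the same position in the top
genome" means, here is a rough sketch.  It is not how maf-join.py is
implemented.  The function name and the data layout (a dict from
top-genome position to a string of column letters, top letter first)
are assumptions made for the example, and so is the choice to keep only
positions present in every input.  Columns with gaps in the top
sequence are not modelled at all:

```python
def join_columns(column_maps):
    """Join alignment columns that share a top-genome position.

    column_maps: list of dicts, one per input alignment.  Each dict maps a
    position in the top genome to a string of letters for that column,
    with the top genome's letter first (a hypothetical layout).
    """
    # Assumption: only positions present in every input are kept.
    shared = set(column_maps[0])
    for columns in column_maps[1:]:
        shared &= set(columns)

    joined = {}
    for pos in sorted(shared):
        top_letter = column_maps[0][pos][0]
        # Append the non-top letters from each input, in input order.
        others = "".join(columns[pos][1:] for columns in column_maps)
        joined[pos] = top_letter + others
    return joined


if __name__ == "__main__":
    aln1 = {0: "AA", 1: "CC", 2: "G-"}   # top genome vs. genome 2
    aln2 = {1: "CT", 2: "GG", 3: "TT"}   # top genome vs. genome 3
    for pos, column in join_columns([aln1, aln2]).items():
        print(pos, column)
```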


maf-swap.py
-----------

This script changes the order of the sequences in MAF-format
alignments.  You can use option "-n" to move the "n"th sequence to the
top (it defaults to 2)::

  maf-swap.py -n3 my-alignments.maf > my-swapped.maf
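
The reordering can be pictured with a small sketch.  This is not the
script's code: the function name is hypothetical, it works on a plain
list of sequence lines rather than a real MAF file, and it assumes the
other sequences keep their relative order:

```python
def move_to_top(seq_lines, n=2):
    """Return a copy of seq_lines with the n-th entry (1-based) moved first."""
    index = n - 1
    if not 0 <= index < len(seq_lines):
        raise IndexError("alignment has no sequence number %d" % n)
    # Assumption: the remaining sequences keep their original order.
    return [seq_lines[index]] + seq_lines[:index] + seq_lines[index + 1:]


if __name__ == "__main__":
    lines = ["s human ...", "s mouse ...", "s dog ..."]
    print(move_to_top(lines, n=3))
```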


maf-sort.sh
-----------

This script sorts MAF-format alignments by the sequence name, then
start position, then end position, of the top sequence.
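
The sort order can be expressed as a three-part key.  The sketch below
is an illustration in Python, not the shell script itself.  The function
names are hypothetical, each alignment is assumed to be a list of its
lines, and the end position is taken as start plus size from the first
"s" line.  Names are compared as plain strings here, which may not match
the collation the real script uses:

```python
def top_sequence_key(block):
    """Return (name, start, end) for the first 's' line of a MAF block."""
    for line in block:
        fields = line.split()
        if fields and fields[0] == "s":
            # MAF 's' line: s name start size strand seqSize alignment
            name, start, size = fields[1], int(fields[2]), int(fields[3])
            return (name, start, start + size)
    raise ValueError("block has no 's' line")


def sort_blocks(blocks):
    """Sort blocks by top-sequence name, then start, then end."""
    return sorted(blocks, key=top_sequence_key)


if __name__ == "__main__":
    blocks = [
        ["a score=10", "s chr2 5 4 + 100 ACGT", "s x 0 4 + 50 ACGT"],
        ["a score=20", "s chr1 9 4 + 100 ACGT", "s y 0 4 + 50 ACGT"],
        ["a score=30", "s chr1 9 2 + 100 AC", "s z 0 2 + 50 AC"],
    ]
    for block in sort_blocks(blocks):
        print(block[0], top_sequence_key(block))
```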


last-remove-dominated.py
------------------------

This script is used by last-reduce-alignments.sh.  It reads sorted
alignments, and removes alignments of A in genome 1 with B in genome 2
if A also aligns more strongly to a different region X.
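
As a rough illustration of the idea, the sketch below drops an
alignment when another alignment on the same sequence covers its whole
range with a higher score.  That definition of "aligns more strongly to
a different region" is an assumption made for the example, as are the
function name and the tuple layout.  The real script reads sorted input
and can therefore work differently from this simple all-pairs
comparison:

```python
def remove_dominated(alignments):
    """Drop alignments covered by a higher-scoring one on the same sequence.

    alignments: list of (seq_name, start, end, score) tuples describing the
    genome 1 side of each alignment (a hypothetical layout).
    """
    kept = []
    for i, (name, start, end, score) in enumerate(alignments):
        dominated = any(
            j != i
            and other_name == name
            and other_start <= start
            and end <= other_end
            and other_score > score
            for j, (other_name, other_start, other_end, other_score)
            in enumerate(alignments)
        )
        if not dominated:
            kept.append((name, start, end, score))
    return kept


if __name__ == "__main__":
    alns = [
        ("chr1", 0, 100, 500),
        ("chr1", 10, 50, 200),   # inside the first, with a lower score
        ("chr1", 90, 150, 100),  # only partly overlaps the first
        ("chr2", 10, 50, 50),    # different sequence
    ]
    for aln in remove_dominated(alns):
        print(aln)
```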


maf2html.py
-----------

This script converts MAF-format alignments to a human-friendly HTML
format::

  maf2html.py multiMito.maf > multiMito.html

You can change the number of letters per line using the "-l" option::

  maf2html.py -l50 multiMito.maf > multiMito.html

Each alignment column gets coloured according to its probability,
given by MAF lines starting with 'p'.  (To get MAF lines starting with
'p', run lastal with option -j4 or -j5.)  If an alignment has multiple
'p' lines (e.g. after using maf-join.py), then the column-wise
products of their probabilities are used.
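
The column-wise product can be sketched as follows.  This example
assumes the probabilities have already been decoded into numbers, one
list per 'p' line.  It does not attempt to parse the 'p' line encoding,
and the function name is hypothetical:

```python
def combine_column_probs(prob_rows):
    """Multiply per-column probabilities across several 'p' lines.

    prob_rows: list of equal-length lists of floats, one list per 'p' line
    (already decoded; decoding is outside the scope of this sketch).
    """
    combined = [1.0] * len(prob_rows[0])
    for row in prob_rows:
        if len(row) != len(combined):
            raise ValueError("all 'p' lines must cover the same columns")
        combined = [c * p for c, p in zip(combined, row)]
    return combined


if __name__ == "__main__":
    print(combine_column_probs([[0.5, 1.0, 0.25], [0.5, 0.5, 1.0]]))
```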


last-map-probs.py
-----------------

This script calculates "mapping probabilities" for alignments in LAST
tabular format.  If one query sequence participates in more than one
alignment, the script ASSUMES that ONE of these alignments is the true
mapping.  The script estimates a probability of each alignment being
the true mapping (based on their scores).  It writes the alignments
plus an extra column with the probabilities::

  last-map-probs.py my-alignments.tab > my-alignments-with-probs

This script does not indicate whether alignments are significantly
similar: for that, you need lastex.  Furthermore, if you feed this
script weak alignments that might exist just by chance, then its
assumption is unlikely to be valid.
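
One plausible way to turn scores into such probabilities is to weight
each alignment by an exponential of its score and normalise over all
alignments of the same query.  The sketch below shows that idea only.
It is an assumption, not a description of what last-map-probs.py
computes, and both the function name and the "temperature" scale
parameter are hypothetical:

```python
import math


def mapping_probs(scores, temperature=1.0):
    """Estimate, for one query, the probability that each alignment is the
    true mapping, assuming exactly one of them is (illustrative only).

    scores:      alignment scores for a single query sequence.
    temperature: hypothetical scale factor relating scores to log-odds.
    """
    # Subtract the maximum score before exponentiating to avoid overflow.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for score, prob in zip([60, 55, 30], mapping_probs([60, 55, 30], 5.0)):
        print(score, round(prob, 4))
```

Under this scheme the probabilities for one query always sum to 1,
which reflects the assumption that one of the alignments is the true
mapping.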


Limitations
-----------

1) last-reduce-alignments.sh and last-remove-dominated.py do not work
   with centroid alignments.

2) The scripts that read MAF format work with the simple subset of MAF
   produced by lastal, but they don't necessarily work with more
   complex MAF data from elsewhere.