Description of scripts that accompany LAST
==========================================
last-dotplot.py
---------------
This script makes a dotplot, a.k.a. Oxford Grid, of alignments in LAST
tabular format. It requires the Python Imaging Library to be
installed. To get a usage message::

  last-dotplot.py --help

To make a png-format dotplot of alignments in a file called "al"::

  last-dotplot.py al al.png

To get a nicer font, try something like::

  last-dotplot.py -f /usr/share/fonts/truetype/freefont/FreeSans.ttf al al.png

If the fonts are located somewhere different on your computer, change
this as appropriate. To turn off the text and margins completely::

  last-dotplot.py -s0 al al.png

To limit the plot to 500x500 pixels::

  last-dotplot.py -x500 -y500 al al.png

If there are too many chromosomes, the dotplot will be very cluttered,
or the script might give up with an error message. So you may want to
remove alignments involving fragmentary chromosomes first. For
example, you could use "grep -v" to remove alignments involving
chromosomes with names like "chr1_random"::

  grep -v 'random' al > plotme
  last-dotplot.py plotme plotme.png

maf-convert.py
--------------
This script can convert MAF-format alignments to tabular format. This
is needed to feed MAF alignments to last-dotplot.py. Usage::

  maf-convert.py tab my-alignments.maf > my-alignments.tab

It can also convert MAF to AXT format, should you wish to do that::

  maf-convert.py axt my-alignments.maf > my-alignments.axt

last-reduce-alignments.sh
-------------------------
This script removes "uninteresting" alignments from LAST genome
comparisons in MAF format. Roughly speaking, it removes paralog
alignments and keeps ortholog alignments. More precisely, if region A
in genome 1 aligns with region B in genome 2, but if A also aligns
more strongly with a different region X and B aligns more strongly
with a different region Y, then the alignment of A with B is removed.
This procedure is conservative: it is unlikely to remove one-to-one
orthologs, but it may keep some paralogs, e.g. if the ortholog is
(wholly or partially) deleted in one genome. The usage is simple::

  last-reduce-alignments.sh my-alignments.maf > reduced-alignments.maf

There is also an option to remove alignments more aggressively: if A
aligns more strongly with X *or* B aligns more strongly with Y, then
the alignment of A with B is removed. This is still unlikely to
remove one-to-one orthologs, but it may cause some regions that are
alignable to something to be aligned to nothing. This option is
selected with "-d"::

  last-reduce-alignments.sh -d my-alignments.maf > reduced-alignments.maf

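The removal rule can be illustrated with a small Python sketch. This is
not the script's actual implementation: the ``Aln`` record, the function
names, and the definition of "aligns more strongly" (a higher-scoring
alignment that overlaps the same region) are all assumptions made for
illustration only.

```python
from typing import List, NamedTuple


class Aln(NamedTuple):
    """Toy alignment record (hypothetical layout, not the MAF format)."""
    score: int
    seq1: str
    beg1: int
    end1: int
    seq2: str
    beg2: int
    end2: int


def region(aln: Aln, side: int):
    """Return (name, begin, end) of the alignment in genome 1 or genome 2."""
    if side == 1:
        return aln.seq1, aln.beg1, aln.end1
    return aln.seq2, aln.beg2, aln.end2


def is_dominated(aln: Aln, alns: List[Aln], side: int) -> bool:
    """True if a higher-scoring alignment overlaps aln's region on this side.

    Assumption: "aligns more strongly" means "overlaps and scores higher".
    """
    name, beg, end = region(aln, side)
    for other in alns:
        if other.score <= aln.score:
            continue  # not stronger (this also skips aln itself)
        o_name, o_beg, o_end = region(other, side)
        if o_name == name and o_beg < end and beg < o_end:
            return True
    return False


def reduce_alignments(alns: List[Aln], aggressive: bool = False) -> List[Aln]:
    """Default: remove if dominated on both sides; aggressive: on either side."""
    kept = []
    for aln in alns:
        d1 = is_dominated(aln, alns, 1)
        d2 = is_dominated(aln, alns, 2)
        remove = (d1 or d2) if aggressive else (d1 and d2)
        if not remove:
            kept.append(aln)
    return kept
```

In this sketch, ``aggressive=True`` plays the role of the "-d" option:
an alignment dominated in only one genome survives the default mode but
not the aggressive one.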
maf-join.py
-----------
This script joins two or more sets of pairwise (or multiple)
alignments into multiple alignments::

  maf-join.py aln1.maf aln2.maf aln3.maf > joined.maf

The top genome in each input file should be the same, and the script
simply joins alignment columns that are at the same position in the
top genome. IMPORTANT LIMITATION: alignment columns with gaps in the
top sequence get joined arbitrarily, and probably wrongly. Please
disregard such columns in downstream analyses. Each input file must
have been sorted using maf-sort.sh (but the output of
last-reduce-alignments.sh is already in the right order). For an
example of using LAST and maf-join.py, see multiMito.sh in the
examples directory.

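The idea of joining columns by top-genome position can be sketched in
Python as follows. This is a simplified illustration, not the logic of
maf-join.py itself: the function names are hypothetical, the input is
reduced to a start coordinate plus two aligned rows, columns with a gap
in the top sequence are simply skipped, and only positions covered by
every input are kept (an assumption).

```python
def columns_by_top_position(top_start, top_row, other_row):
    """Map each top-genome coordinate to its (top, other) letter pair."""
    cols = {}
    pos = top_start
    for top_letter, other_letter in zip(top_row, other_row):
        if top_letter == "-":
            continue  # gap in the top sequence: no top coordinate, skipped
        cols[pos] = (top_letter, other_letter)
        pos += 1
    return cols


def join_columns(pairwise):
    """Join pairwise column maps on shared top-genome coordinates.

    Returns a list of (position, top_letter, other_1, other_2, ...).
    """
    shared = set(pairwise[0])
    for cols in pairwise[1:]:
        shared &= set(cols)
    joined = []
    for pos in sorted(shared):
        top_letter = pairwise[0][pos][0]
        others = tuple(cols[pos][1] for cols in pairwise)
        joined.append((pos, top_letter) + others)
    return joined
```

For example, joining one alignment that starts at top position 10 with
another that starts at position 11 yields only the columns at positions
both alignments cover.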
maf-swap.py
-----------
This script changes the order of the sequences in MAF-format
alignments. You can use option "-n" to move the "n"th sequence to the
top (it defaults to 2)::

  maf-swap.py -n3 my-alignments.maf > my-swapped.maf

maf-sort.sh
-----------
This script sorts MAF-format alignments by the sequence name, then
start position, then end position, of the top sequence.

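The sort order can be expressed as a Python sort key. This is only a
sketch: maf-sort.sh is a shell script and works differently, the
function names are hypothetical, and the sketch assumes each alignment
block is a string whose first "s" line has the usual MAF fields (name,
start, size, strand, sequence length, text). Strand is ignored here.

```python
def top_sequence_key(maf_block):
    """Return (name, start, end) of the first 's' line in a MAF block."""
    for line in maf_block.splitlines():
        fields = line.split()
        if fields and fields[0] == "s":
            name = fields[1]
            start = int(fields[2])
            size = int(fields[3])
            return (name, start, start + size)
    raise ValueError("no 's' line in MAF block")


def sort_maf_blocks(blocks):
    """Sort blocks by top-sequence name, then start, then end."""
    return sorted(blocks, key=top_sequence_key)
```

Names are compared as plain strings in this sketch, so "chr10" sorts
before "chr2".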
last-remove-dominated.py
------------------------
This script is used by last-reduce-alignments.sh. It reads sorted
alignments, and removes alignments of A in genome 1 with B in genome 2
if A also aligns more strongly to a different region X.
maf2html.py
-----------
This script converts MAF-format alignments to a human-friendly HTML
format::

  maf2html.py multiMito.maf > multiMito.html

You can change the number of letters per line using the "-l" option::

  maf2html.py -l50 multiMito.maf > multiMito.html

Each alignment column gets coloured according to its probability,
given by MAF lines starting with 'p'. (To get MAF lines starting with
'p', run lastal with option -j4 or -j5.) If an alignment has multiple
'p' lines (e.g. after using maf-join.py), then the column-wise
products are used.

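As a small illustration of what "column-wise products" means, the
following Python sketch multiplies several rows of per-column
probabilities. It assumes the 'p' lines have already been decoded into
lists of numbers (the decoding is not shown), and the function name is
hypothetical.

```python
import math


def columnwise_product(prob_rows):
    """Multiply probabilities column by column across all rows.

    prob_rows: equal-length lists of floats, one list per 'p' line
    (assumed to be already decoded into numeric probabilities).
    """
    return [math.prod(column) for column in zip(*prob_rows)]
```

A column is therefore only as reliable as the combination of all the
pairwise alignments that contribute to it.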
last-map-probs.py
-----------------
This script calculates "mapping probabilities" for alignments in LAST
tabular format. If one query sequence participates in more than one
alignment, the script ASSUMES that ONE of these alignments is the true
mapping. The script estimates a probability of each alignment being
the true mapping (based on their scores). It writes the alignments
plus an extra column with the probabilities::

  last-map-probs.py my-alignments.tab > my-alignments-with-probs

This script does not indicate whether alignments are significantly
similar: for that, you need lastex. Furthermore, if you feed this
script weak alignments that might exist just by chance, then its
assumption is unlikely to be valid.

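One way to turn scores into probabilities that sum to one per query is
sketched below in Python. The weighting used here, exp(score /
temperature), is an assumption chosen for illustration; this document
only says that the probabilities are based on the scores. The function
name and the ``temperature`` parameter are hypothetical.

```python
import math
from collections import defaultdict


def mapping_probs(alignments, temperature=1.0):
    """Estimate a probability for each (query_name, score) pair.

    Assumption: each alignment is weighted by exp(score / temperature),
    and weights are normalised over all alignments of the same query.
    """
    scores_by_query = defaultdict(list)
    for query, score in alignments:
        scores_by_query[query].append(score)

    best = {}
    total = {}
    for query, scores in scores_by_query.items():
        # Subtract the best score before exponentiating to avoid overflow.
        best[query] = max(scores)
        total[query] = sum(
            math.exp((s - best[query]) / temperature) for s in scores
        )

    return [
        math.exp((score - best[query]) / temperature) / total[query]
        for query, score in alignments
    ]
```

A query with a single alignment always gets probability 1 under this
scheme, which shows why weak, possibly spurious alignments should be
filtered out beforehand.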
Limitations
-----------
1) last-reduce-alignments.sh and last-remove-dominated.py do not work
   with centroid alignments.

2) The scripts that read MAF format work with the simple subset of MAF
   produced by lastal, but they don't necessarily work with more
   complex MAF data from elsewhere.