Description of scripts that accompany LAST
==========================================

last-dotplot.py
---------------

This script makes a dotplot, a.k.a. Oxford Grid, of alignments in LAST
tabular format.  It requires the Python Imaging Library to be
installed.  To get a usage message::

  last-dotplot.py --help

To make a png-format dotplot of alignments in a file called "al"::

  last-dotplot.py al al.png

To get a nicer font, try something like::

  last-dotplot.py -f /usr/share/fonts/truetype/freefont/FreeSans.ttf al al.png

If the fonts are located somewhere different on your computer, change
this as appropriate.  To turn off the text and margins completely::

  last-dotplot.py -s0 al al.png

To limit the plot to 500x500 pixels::

  last-dotplot.py -x500 -y500 al al.png

If there are too many chromosomes, the dotplot will be very cluttered,
or the script might give up with an error message.  So you may want to
remove alignments involving fragmentary chromosomes first.  For
example, you could use "grep -v" to remove alignments involving
chromosomes with names like "chr1_random"::

  grep -v 'random' al > plotme
  last-dotplot.py plotme plotme.png
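
The grep command drops any line that contains "random" anywhere.  A
more targeted filter could look only at the two sequence names.  The
following is a minimal sketch, not part of LAST: it assumes that in
LAST tabular format the sequence names are fields 2 and 7 (counting
from 1), and the function names ``keep_alignment`` and
``filter_lines`` are made up for illustration.

```python
def keep_alignment(line, unwanted="random"):
    """Return True unless either sequence name contains `unwanted`.

    Assumption: name1 is field 2 and name2 is field 7 (1-based),
    with fields separated by whitespace.
    """
    if line.startswith("#") or not line.strip():
        return True  # keep comment and blank lines untouched
    fields = line.split()
    if len(fields) < 7:
        return True  # not a line this sketch recognises: leave it alone
    return unwanted not in fields[1] and unwanted not in fields[6]


def filter_lines(lines, unwanted="random"):
    """Return only the lines that pass keep_alignment."""
    return [line for line in lines if keep_alignment(line, unwanted)]

# Example use on a file called "al":
#   import sys
#   with open("al") as f:
#       sys.stdout.writelines(filter_lines(f))
```

The filtered lines can then be written to a file such as "plotme" and
passed to last-dotplot.py as above.
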

last-reduce-alignments.sh
-------------------------

This script removes "uninteresting" alignments from LAST genome
comparisons in MAF format.  Roughly speaking, it removes paralog
alignments and keeps ortholog alignments.  More precisely, if region A
in genome 1 aligns with region B in genome 2, but A also aligns more
strongly with a different region X and B also aligns more strongly
with a different region Y, then the alignment of A with B is removed.
This procedure is conservative: it is unlikely to remove one-to-one
orthologs, but it may keep some paralogs, e.g. if the ortholog is
(wholly or partially) deleted in one genome.  The usage is simple::

  last-reduce-alignments.sh my-alignments.maf > reduced-alignments.maf

There is also an option to remove alignments more aggressively: if A
aligns more strongly with X *or* B aligns more strongly with Y, then
the alignment of A with B is removed.  This is still unlikely to
remove one-to-one orthologs, but it may leave some regions unaligned
even though they are alignable to something.  This option is selected
with "-d"::

  last-reduce-alignments.sh -d my-alignments.maf > reduced-alignments.maf
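
The difference between the default rule ("and") and the "-d" rule
("or") can be illustrated with a toy model.  This is only a sketch
under stated assumptions, not the scripts' actual implementation:
here, "aligns more strongly with a different region" is taken to mean
that another alignment with a higher score covers the whole interval.
The names ``Aln``, ``is_dominated`` and ``reduce_alignments`` are
hypothetical.

```python
from collections import namedtuple

# A toy alignment: a score plus one half-open interval in each genome.
Aln = namedtuple("Aln", "score name1 beg1 end1 name2 beg2 end2")


def is_dominated(a, others, genome):
    """True if a higher-scoring alignment covers all of `a`'s interval
    in the given genome (1 or 2).  "Covers" is this sketch's stand-in
    for "aligns more strongly"; the real scripts may differ."""
    def span(x):
        if genome == 1:
            return x.name1, x.beg1, x.end1
        return x.name2, x.beg2, x.end2

    name, beg, end = span(a)
    for b in others:
        if b is a or b.score <= a.score:
            continue
        bname, bbeg, bend = span(b)
        if bname == name and bbeg <= beg and end <= bend:
            return True
    return False


def reduce_alignments(alns, aggressive=False):
    """Default: drop an alignment only if dominated in BOTH genomes.
    aggressive=True (like "-d"): drop it if dominated in EITHER."""
    kept = []
    for a in alns:
        d1 = is_dominated(a, alns, 1)
        d2 = is_dominated(a, alns, 2)
        remove = (d1 or d2) if aggressive else (d1 and d2)
        if not remove:
            kept.append(a)
    return kept
```

In this model, an alignment of A with B that is outscored only on the
genome 1 side survives the default rule but is dropped when
``aggressive=True``.
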

maf-join.py
-----------

This script joins two or more sets of pairwise (or multiple)
alignments into multiple alignments::

  maf-join.py aln1.maf aln2.maf aln3.maf > joined.maf

The top genome in each input file should be the same, and the script
simply joins alignment columns that are at the same position in the
top genome.  IMPORTANT LIMITATION: alignment columns with gaps in the
top sequence get joined arbitrarily, and probably wrongly.  Please
disregard such columns in downstream analyses.  Each input file must
have been sorted using maf-sort.sh (but the output of
last-reduce-alignments.sh is already in the right order).  For an
example of using LAST and maf-join.py, see multiMito.sh in the
examples directory.
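
The idea of joining columns by top-genome position can be sketched
with a toy data model.  This is not how maf-join.py is written: each
pairwise alignment is reduced here to a dictionary from top-genome
position to the aligned letter in the other genome, and the sketch
keeps only positions present in every input, which is an assumption
made for simplicity.  The function name ``join_columns`` is
hypothetical.

```python
def join_columns(*pairwise):
    """Join toy pairwise alignments on top-genome position.

    Each argument maps a top-genome position to the letter aligned to
    it in one other genome.  Columns with a gap in the top genome have
    no position to key on, so this model cannot represent them at all,
    which hints at why such columns cannot be joined reliably.
    """
    if not pairwise:
        return {}
    shared = set(pairwise[0])
    for aln in pairwise[1:]:
        shared &= set(aln)
    # One joined column per shared position: a tuple with one letter
    # per input alignment, in input order.
    return {pos: tuple(aln[pos] for aln in pairwise)
            for pos in sorted(shared)}
```

For instance, if one input covers positions 10 to 12 and another
covers 11 to 13, the joined result in this model has columns for
positions 11 and 12 only.
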

maf-swap.py
-----------

This script changes the order of the sequences in MAF-format
alignments.  You can use option "-n" to move the "n"th sequence to the
top (it defaults to 2)::

  maf-swap.py -n3 my-alignments.maf > my-swapped.maf
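
The reordering itself can be pictured as follows.  This is a
simplified sketch, assuming a MAF block is given as a list of lines
whose sequence lines start with "s "; it only reorders lines, whereas
the real script may do more (for example, adjusting strands).  The
function name ``swap_block`` is hypothetical.

```python
def swap_block(block_lines, n=2):
    """Move the n-th "s" line (1-based) of one MAF block to the top.

    Only "s" lines are reordered; other line types keep their places.
    """
    s_idx = [i for i, line in enumerate(block_lines)
             if line.startswith("s ")]
    if n < 1 or n > len(s_idx):
        raise ValueError("block has fewer than n sequences")
    s_lines = [block_lines[i] for i in s_idx]
    s_lines.insert(0, s_lines.pop(n - 1))  # n-th sequence goes first
    out = list(block_lines)
    for i, line in zip(s_idx, s_lines):
        out[i] = line
    return out
```

With the default ``n=2``, the first two sequences of each block simply
trade places.
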

maf-sort.sh
-----------

This sorts MAF-format alignments by sequence name, then strand, then
start position, then end position, of the top sequence.  You can use
option "-n" to sort by the "n"th sequence instead of the top sequence.
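
The sort order described above can be written as a sort key.  This is
a sketch only, not the script's implementation: it assumes the usual
MAF "s" line layout (s, name, start, alnSize, strand, seqSize,
alignment), takes the end position to be start plus alnSize, and
compares names and strands as plain strings, which may not match the
script's ordering exactly.  The names ``sort_key`` and ``sort_blocks``
are hypothetical.

```python
def sort_key(block_lines, n=1):
    """Key for one MAF block: (name, strand, start, end) of the n-th
    "s" line (1-based)."""
    s_lines = [line for line in block_lines if line.startswith("s ")]
    fields = s_lines[n - 1].split()
    name, strand = fields[1], fields[4]
    start, size = int(fields[2]), int(fields[3])
    return (name, strand, start, start + size)


def sort_blocks(blocks, n=1):
    """Return the blocks sorted by the n-th sequence's key."""
    return sorted(blocks, key=lambda b: sort_key(b, n))
```

Passing ``n=2`` sorts by the second sequence in each block instead of
the top one.
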

last-remove-dominated.py
------------------------

This script is used by last-reduce-alignments.sh.  It reads sorted
alignments, and removes alignments of A in genome 1 with B in genome 2
if A also aligns more strongly to a different region X.

Limitations
-----------

1) last-reduce-alignments.sh and last-remove-dominated.py do not work
   with centroid alignments.

2) The scripts that read MAF format work with the simple subset of MAF
   produced by lastal, but they don't necessarily work with more
   complex MAF data from elsewhere.

3) These scripts do not work for DNA-versus-protein alignments:
   last-dotplot.py, maf-join.py, last-reduce-alignments.sh,
   last-remove-dominated.py.