pepwindowall
Function
Displays protein hydropathy of a set of sequences
Description
pepwindowall produces a set of superimposed Kyte & Doolittle
hydropathy plots from an aligned set of protein sequences.
The result is the same as running pepwindow on each protein of the
alignment, keeping the alignment gaps, and superimposing the plots.
It is useful for visualising the average hydropathy and its
variability along the alignment.
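The windowed averaging behind each curve can be sketched in Python. This is only an illustration, not the pepwindowall implementation: the function names are hypothetical, the hydropathy values are the Kyte-Doolittle values from the Enakai.dat entry, and the treatment of gaps (scored as 0.0 so that every curve stays in alignment coordinates) is an assumption.

```python
# Kyte-Doolittle hydropathy values (the 'I' block of the Enakai.dat entry).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(aligned_seq, length=7, gap_chars="~.-"):
    """Mean hydropathy of every full window along one aligned sequence.

    Assumption (not taken from the pepwindowall source): gap characters and
    unknown residues score 0.0, so positions stay in alignment coordinates.
    """
    scores = [
        0.0 if c in gap_chars else KD.get(c, 0.0)
        for c in aligned_seq.upper()
    ]
    return [
        sum(scores[i:i + length]) / length
        for i in range(len(scores) - length + 1)
    ]


def superimposed_profiles(alignment, length=7):
    """One profile per sequence; all share the same x axis (alignment column)."""
    return {
        name: windowed_hydropathy(seq, length)
        for name, seq in alignment.items()
    }
```

Plotting every profile returned by `superimposed_profiles` on one set of axes gives the kind of overlay described above. The default window of 7 matches the default of the -length qualifier.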
Usage
Here is a sample session with pepwindowall:
% pepwindowall globins.msf -gxtitle="Base Number" -gytitle="hydropathy"
Displays protein hydropathy of a set of sequences
Graph type [x11]: cps
Created pepwindowall.ps
Command line arguments
Standard (Mandatory) qualifiers:
  [-sequences]   seqset    File containing a sequence alignment
  [-graph]       xygraph   [$EMBOSS_GRAPHICS value, or x11] Graph type
                           (ps, hpgl, hp7470, hp7580, meta, cps, x11,
                           tekt, tek, none, data, xterm, png, gif)

Additional (Optional) qualifiers:
  -datafile      datafile  [Enakai.dat] AAINDEX entry data file
  -length        integer   [7] Window size (Integer from 1 to 200)

Advanced (Unprompted) qualifiers: (none)

Associated qualifiers:

  "-sequences" associated qualifiers
  -sbegin1       integer   Start of each sequence to be used
  -send1         integer   End of each sequence to be used
  -sreverse1     boolean   Reverse (if DNA)
  -sask1         boolean   Ask for begin/end/reverse
  -snucleotide1  boolean   Sequence is nucleotide
  -sprotein1     boolean   Sequence is protein
  -slower1       boolean   Make lower case
  -supper1       boolean   Make upper case
  -sformat1      string    Input sequence format
  -sdbname1      string    Database name
  -sid1          string    Entryname
  -ufo1          string    UFO features
  -fformat1      string    Features format
  -fopenfile1    string    Features file name

  "-graph" associated qualifiers
  -gprompt2      boolean   Graph prompting
  -gdesc2        string    Graph description
  -gtitle2       string    Graph title
  -gsubtitle2    string    Graph subtitle
  -gxtitle2      string    Graph x axis title
  -gytitle2      string    Graph y axis title
  -goutfile2     string    Output file for non interactive displays
  -gdirectory2   string    Output directory

General qualifiers:
  -auto          boolean   Turn off prompts
  -stdout        boolean   Write standard output
  -filter        boolean   Read standard input, write standard output
  -options       boolean   Prompt for standard and additional values
  -debug         boolean   Write debug output to program.dbg
  -verbose       boolean   Report some/full command line options
  -help          boolean   Report command line options. More information
                           on associated and general qualifiers can be
                           found with -help -verbose
  -warning       boolean   Report warnings
  -error         boolean   Report errors
  -fatal         boolean   Report fatal errors
  -die           boolean   Report dying program messages
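
For scripted use, the qualifiers listed above can be assembled into an argument list. The following is a minimal Python sketch; the helper name `pepwindowall_command` is hypothetical and only the -sequences, -graph, -length and -auto qualifiers documented here are used. It builds the command but does not run it, so whether a given EMBOSS installation accepts it has not been checked here.

```python
import shlex


def pepwindowall_command(alignment, graph="cps", length=7, auto=True):
    """Build an argument list for a non-interactive pepwindowall run.

    Hypothetical helper: it only arranges qualifiers listed in this
    document and enforces the documented window size range (1 to 200).
    """
    if not 1 <= length <= 200:
        raise ValueError("window size must be an integer from 1 to 200")
    cmd = [
        "pepwindowall",
        "-sequences", alignment,
        "-graph", graph,
        "-length", str(length),
    ]
    if auto:
        cmd.append("-auto")  # turn off prompts
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted form; pass the list to subprocess.run to execute.
    print(shlex.join(pepwindowall_command("globins.msf")))
```

Keeping the command as a list, rather than a single string, avoids quoting problems when a file name contains spaces.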
Input file format
pepwindowall reads one or more aligned protein sequences, specified by
any valid USA (Uniform Sequence Address).
Input files for usage example
File: globins.msf
!!AA_MULTIPLE_ALIGNMENT 1.0
../data/globins.msf MSF: 164 Type: P 25/06/01 CompCheck: 4278 ..
Name: HBB_HUMAN Len: 164 Check: 6914 Weight: 0.61
Name: HBB_HORSE Len: 164 Check: 6007 Weight: 0.65
Name: HBA_HUMAN Len: 164 Check: 3921 Weight: 0.65
Name: HBA_HORSE Len: 164 Check: 4770 Weight: 0.83
Name: MYG_PHYCA Len: 164 Check: 7930 Weight: 1.00
Name: GLB5_PETMA Len: 164 Check: 1857 Weight: 0.91
Name: LGB2_LUPLU Len: 164 Check: 2879 Weight: 0.43
//
1 50
HBB_HUMAN ~~~~~~~~VHLTPEEKSAVTALWGKVN.VDEVGGEALGR.LLVVYPWTQR
HBB_HORSE ~~~~~~~~VQLSGEEKAAVLALWDKVN.EEEVGGEALGR.LLVVYPWTQR
HBA_HUMAN ~~~~~~~~~~~~~~VLSPADKTNVKAA.WGKVGAHAGEYGAEALERMFLS
HBA_HORSE ~~~~~~~~~~~~~~VLSAADKTNVKAA.WSKVGGHAGEYGAEALERMFLG
MYG_PHYCA ~~~~~~~VLSEGEWQLVLHVWAKVEAD.VAGHGQDILIR.LFKSHPETLE
GLB5_PETMA PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQE
LGB2_LUPLU ~~~~~~~~GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKD
51 100
HBB_HUMAN FFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSE
HBB_HORSE FFDSFGDLSNPGAVMGNPKVKAHGKKVLHSFGEGVHHLDNLKGTFAALSE
HBA_HUMAN FPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSD
HBA_HORSE FPTTKTYFPHFDLSHGSAQVKAHGKKVGDALTLAVGHLDDLPGALSNLSD
MYG_PHYCA KFDRFKHLKTEAEMKASEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQ
GLB5_PETMA FFPKFKGLTTADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRD
LGB2_LUPLU LFSFLKGTSEVPQNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKN
101 150
HBB_HUMAN LHCDKLH..VDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVA
HBB_HORSE LHCDKLH..VDPENFRLLGNVLVVVLARHFGKDFTPELQASYQKVVAGVA
HBA_HUMAN LHAHKLR..VDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVS
HBA_HORSE LHAHKLR..VDPVNFKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVS
MYG_PHYCA SHATKHK..IPIKYLEFISEAIIHVLHSRHPGDFGADAQGAMNKALELFR
GLB5_PETMA LSGKHAK..SFQVDPQYFKVLAAVIADTVAAGDAGFEKLMSMICILLRSA
LGB2_LUPLU LGSVHVSKGVADAHFPVVKEAILKTIKEVVGAKWSEELNSAWTIAYDELA
151 164
HBB_HUMAN NALAHKYH~~~~~~
HBB_HORSE NALAHKYH~~~~~~
HBA_HUMAN TVLTSKYR~~~~~~
HBA_HORSE TVLTSKYR~~~~~~
MYG_PHYCA KDIAAKYKELGYQG
GLB5_PETMA Y~~~~~~~~~~~~~
LGB2_LUPLU IVIKKEMNDAA~~~
Output file format
An image is displayed on the specified graphics device.
Output files for usage example
Graphics File: pepwindowall.ps
[pepwindowall results]
Data files
pepwindowall reads the Kyte-Doolittle hydropathy data from the file
'Enakai.dat'
EMBOSS data files are distributed with the application and stored in
the standard EMBOSS data directory, which is defined by the EMBOSS
environment variable EMBOSS_DATA.
To see the available EMBOSS data files, run:
% embossdata -showall
To fetch one of the data files (for example 'Exxx.dat') into your
current directory for you to inspect or modify, run:
% embossdata -fetch -file Exxx.dat
Users can provide their own data files in their own directories.
Project specific files can be put in the current directory, or for
tidier directory listings in a subdirectory called ".embossdata".
Files for all EMBOSS runs can be put in the user's home directory, or
again in a subdirectory called ".embossdata".
The directories are searched in the following order:
* . (your current directory)
* .embossdata (under your current directory)
* ~/ (your home directory)
* ~/.embossdata
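
The search order listed above can be sketched as a first-match lookup. This Python illustration uses a hypothetical function name and models only the four user directories; the fallback to the standard EMBOSS data directory is deliberately left out.

```python
from pathlib import Path


def find_data_file(name, cwd=None, home=None):
    """Return the first existing copy of `name` in the listed order.

    Sketch only: covers the four user directories and returns None if
    none of them holds the file (EMBOSS would then use its standard
    data directory, which is not modelled here).
    """
    cwd = Path(cwd) if cwd else Path.cwd()
    home = Path(home) if home else Path.home()
    search_dirs = (
        cwd,                   # .
        cwd / ".embossdata",   # .embossdata
        home,                  # ~/
        home / ".embossdata",  # ~/.embossdata
    )
    for directory in search_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Because the first match wins, a project-specific copy of a data file in the current directory takes precedence over a copy in the home directory.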
The EMBOSS data file 'Enakai.dat' contains:
D Hydropathy index (Kyte-Doolittle, 1982)
R 0807099
A Kyte, J. and Doolittle, R.F.
T A simple method for displaying the hydropathic character of a protein
J J. Mol. Biol. 157, 105-132 (1982)
C CHOC760103 0.964 JANJ780102 0.922 DESM900102 0.898
EISD860103 0.897 CHOC760104 0.889 WOLR810101 0.885
RADA880101 0.884 MANP780101 0.881 EISD840101 0.878
PONP800103 0.870 NAKH920108 0.868 JANJ790101 0.867
JANJ790102 0.866 PONP800102 0.861 MEIH800103 0.856
PONP800101 0.851 PONP800108 0.850 WARP780101 0.845
RADA880108 0.842 ROSG850102 0.841 DESM900101 0.837
BIOV880101 0.829 RADA880107 0.828 LIFS790102 0.824
KANM800104 0.824 CIDH920104 0.824 MIYS850101 0.821
RADA880104 0.819 NAKH900111 0.817 NISK800101 0.812
FAUJ830101 0.811 ARGP820103 0.806 NAKH920105 0.803
ARGP820102 0.803 KRIW790101 -0.805 CHOC760102 -0.838
GUYH850101 -0.843 RACS770102 -0.844 JANJ780103 -0.845
ROSM880101 -0.845 PRAM900101 -0.850 JANJ780101 -0.852
GRAR740102 -0.859 MEIH800102 -0.871 ROSM880102 -0.878
OOBM770101 -0.899
I A/L R/K N/M D/F C/P Q/S E/T G/W H/Y I/V
1.8 -4.5 -3.5 -3.5 2.5 -3.5 -3.5 -0.4 -3.2 4.5
3.8 -3.9 1.9 2.8 -1.6 -0.8 -0.7 -0.9 -1.3 4.2
//
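
The 'I' block of the entry pairs the residues column by column: in each header pair such as A/L, the first residue takes its value from the first row of numbers and the second residue from the second row. A minimal Python sketch of reading that block follows; the function name is hypothetical and the parser handles only this three-line block, not the full AAINDEX format.

```python
ENTRY_TAIL = """\
I A/L R/K N/M D/F C/P Q/S E/T G/W H/Y I/V
1.8 -4.5 -3.5 -3.5 2.5 -3.5 -3.5 -0.4 -3.2 4.5
3.8 -3.9 1.9 2.8 -1.6 -0.8 -0.7 -0.9 -1.3 4.2
//
"""


def parse_index_block(text):
    """Map each residue letter to its value from the 'I' block."""
    lines = text.splitlines()
    # The block starts at the line whose first field is the tag 'I'.
    start = next(i for i, line in enumerate(lines) if line.startswith("I "))
    pairs = lines[start].split()[1:]            # e.g. 'A/L', 'R/K', ...
    first_row = [float(x) for x in lines[start + 1].split()]
    second_row = [float(x) for x in lines[start + 2].split()]
    table = {}
    for pair, top_value, bottom_value in zip(pairs, first_row, second_row):
        top, bottom = pair.split("/")
        table[top] = top_value        # first residue of the pair
        table[bottom] = bottom_value  # second residue of the pair
    return table


if __name__ == "__main__":
    values = parse_index_block(ENTRY_TAIL)
    print(values["I"], values["R"])
```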
Notes
None.
References
1. Kyte, J. and Doolittle, R.F. A simple method for displaying the
hydropathic character of a protein J. Mol. Biol. 157, 105-132
(1982)
Warnings
None.
Diagnostic Error Messages
None.
Exit status
0 upon successful completion.
Known bugs
None.
See also
Program name    Description
backtranambig   Back translate a protein sequence to ambiguous codons
backtranseq     Back translate a protein sequence
charge          Protein charge plot
checktrans      Reports STOP codons and ORF statistics of a protein
compseq         Count composition of dimer/trimer/etc words in a sequence
emowse          Protein identification by mass spectrometry
freak           Residue/base frequency table or plot
iep             Calculates the isoelectric point of a protein
mwcontam        Shows molwts that match across a set of files
mwfilter        Filter noisy molwts from mass spec output
octanol         Displays protein hydropathy
pepinfo         Plots simple amino acid properties in parallel
pepstats        Protein statistics
pepwindow       Displays protein hydropathy
Author(s)
Ian Longden (il sanger.ac.uk)
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge,
CB10 1SA, UK.
History
Completed 4th June 1999.
Target users
This program is intended to be used by everyone and everything, from
naive users to embedded scripts.
Comments
None