File: Papers_NLS.dat

VNEAFETLKRC                             MyoD            Vandromme M, PNAS 1995,(10):4646-50
experimental motif.
TLKRC			16	100	0	myf5_bovin,myod_brare,myf5_chick,myod_chick,myod_cotja,myf5_human,myod_human,myf5_mouse,myod_mouse,myf5_notvi,myo1_oncmy,myod_pig,myod_rat,cut1_schpo,myod_sheep,myf5_xenla	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
this motif occurs in more than one family
VNEAFE	18	100	0	myod_caebr,myod_caeel,myod_chick,myod_cotja,myog_chick,myog_cotja,myod_drome,myod_human,myog_human,sum1_lytva,myod_mouse,myog_mouse,myod_pig,myog_pig,myod_rat,myog_rat,myod_sheep,myod_xenla	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
is only found in the myod family. Seems to be a well conserved part of the seq
within the family. Probably not part of the NLS.
[DE][ST][PL]KR[STC]	16	93.75	6.25	myf5_bovin,myod_brare,myf5_chick,myod_chick,myod_cotja,dif_drome,myf5_human,myod_human,myf5_mouse,myod_mouse,myf5_notvi,myo1_oncmy,myod_pig,myod_rat,myod_sheep,myf5_xenla	nuc,nuc,nuc,nuc,nuc,cyt,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
# /data/swissprot/current/d/dif_drome ANNOTATION IN SWISSPROT
#SUBCELLULAR LOCATION: CYTOPLASMIC; ACCUMULATES IN THE NUCLEUS UPON BACTERIAL INFECTION OR INJURY.
#KW   Nuclear protein; DNA-binding; Transcription regulation; Activator;
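The scan rows in these notes appear to be tab-separated as: pattern, number of hits, % nuclear, % non-nuclear, protein list, location list. A minimal sketch, assuming that layout, which recomputes the two percentages from the location list of the [DE][ST][PL]KR[STC] row above. The function name summarize_row is hypothetical and not part of predictnls.

```python
def summarize_row(row):
    """Recompute hit count and nuclear / non-nuclear percentages for one scan row.

    Assumed layout (tab-separated): pattern, hits, %nuc, %non-nuc,
    comma-separated protein ids, comma-separated locations.
    """
    fields = row.rstrip("\n").split("\t")
    pattern = fields[0]
    proteins = fields[4].split(",")
    locations = fields[5].split(",")
    if len(proteins) != len(locations):
        raise ValueError("protein and location lists differ in length")
    n_hits = len(locations)
    n_nuc = sum(1 for loc in locations if loc == "nuc")
    pct_nuc = 100.0 * n_nuc / n_hits
    return pattern, n_hits, pct_nuc, 100.0 - pct_nuc


# Row copied from the notes; dif_drome is the single cytoplasmic hit.
ROW = (
    "[DE][ST][PL]KR[STC]\t16\t93.75\t6.25\t"
    "myf5_bovin,myod_brare,myf5_chick,myod_chick,myod_cotja,dif_drome,"
    "myf5_human,myod_human,myf5_mouse,myod_mouse,myf5_notvi,myo1_oncmy,"
    "myod_pig,myod_rat,myod_sheep,myf5_xenla\t"
    "nuc,nuc,nuc,nuc,nuc,cyt,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc"
)

if __name__ == "__main__":
    print(summarize_row(ROW))
```

Recomputing from the location list is a cheap consistency check on the stored percentages; it does not say whether a location annotation such as the one for dif_drome is itself right.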
look at mpi3_pig and mpi3_mouse seq id. Same family but different NLS's.
RRKx{3,5}R[DE]R{3,}?[PLV]	24	100	0	myf5_bovin,myod_brare,myf5_chick,myf6_chick,myod_caebr,myod_caeel,myod_chick,myod_cotja,myod_drome,myf5_human,myf6_human,myod_human,myf5_mouse,myf6_mouse,myod_mouse,myf5_notvi,myo1_oncmy,myo2_oncmy,myod_pig,myf6_rat,myod_rat,myod_sheep,myf5_xenla,myod_xenla	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc

# CONSERVATION OF NLS AMONG families. Look at myod_chick and
#  myod_xenla. 77% seq IDE.  

[DE][ST][PL]KR[STCYFW]	21	76.1904761904762	23.8095238095238	if2_aquae,if2_bacst,myf5_bovin,myod_brare,myf5_chick,myod_chick,myod_cotja,dif_drome,aact_human,myf5_human,myod_human,myf5_mouse,myod_mouse,myf5_notvi,myo1_oncmy,myod_pig,myod_rat,myod_sheep,myf5_xenla,myod_xenla,sywc_yeast	cyt,cyt,nuc,nuc,nuc,nuc,nuc,cyt,ext,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,cyt
# the addition of only Y also
[DE][ST][PL]KR[STCY]	18	88.8888888888889	11.1111111111111	if2_bacst,myf5_bovin,myod_brare,myf5_chick,myod_chick,myod_cotja,dif_drome,myf5_human,myod_human,myf5_mouse,myod_mouse,myf5_notvi,myo1_oncmy,myod_pig,myod_rat,myod_sheep,myf5_xenla,myod_xenla	cyt,nuc,nuc,nuc,nuc,nuc,cyt,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
#-----------------------------------------------------------------------------------------------
REKKEKEQKEKCA                           prot.Hsc9       Nederlof PM, PNAS 1995,(26):12060-4     
TEST: R[DE][KR][KR][DE][KR][DE][QM][KR][DE]K	2	0	100	prc9_human,prc9_rat	cyt,cyt
TEKK[QG]KSILYDCA                        prot.Hsc3       Nederlof PM, PNAS 1995,(26):12060-4
SDKKVRSRLIECA                           Ta alpha        Nederlof PM, PNAS 1995,(26):12060-4
None of these look like an NLS.
TEST: [DE]KK[QMGA]K[ST]	9	44.4444444444444	55.5555555555556	prc3_carau,t2fa_drome,prc3_human,prc3_mouse,prc3_rat,h1_tigca,b4_xenla,prc3_xenla,pr16_yeast	cyt,nuc,cyt,cyt,cyt,nuc,nuc,cyt,nuc

#------------------------------------------------------------------------------------------------
/data/swissprot/current/h/tf2d_human	has a 30 residue long stretch
of Q.
NLS Motifs hard to find.
2 putative: RIREPRT & RLXXRKXXRV
[QM][RK][VI][RK][DE][PL][RK][ST]	29	100	0	tf21_arath,tf22_arath,tf2d_acaca,tf2d_acecl,tf2d_artsf,tf2d_caeel,tf2d_canal,tf2d_chick,tf2d_dicdi,tf2d_drome,tf2d_emeni,tf2d_human,tf21_maize,tf22_maize,tf2d_mesau,tf2d_mescr,tf2d_mouse,tf2d_schpo,tf2d_soltu,tf2d_soybn,tf2d_spofr,tf2d_strpu,tf2d_tobac,tf2d_trifl,tf2d_triga,tf21_wheat,tf22_wheat,tf2d_xenla,tf2d_yeast	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
#but all from same family
R[PL]xx[KR]{2,}?xx[KR]V	25	100	0	hsp2_alose,prt1_clupa,prt2_clupa,tf2d_chick,hsp2_horse,hsp3_horse,tf2d_human,tf2d_mesau,tf2d_mouse,prt1_oncke,prt2_oncmy,prt5_oncmy,prt6_oncmy,prt7_oncmy,prt8_oncmy,prt9_oncmy,prta_oncmy,prt1_salir,prt2_salir,prt3_salir,tf2d_strpu,tf2d_trifl,tf2d_triga,tf2d_xenla,est1_yeast	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
# has different families.

#---------------------------------------------------------------------------------------------------

TKRSxxxM                                influenzaNP     Wang P, J Virol 1997,(71):1850-6
found in only cha4_yeast; but this protein has a bipartite motif: [RK]{3,}?x{8,16}[RK]{4,}?
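A minimal sketch of how the bipartite pattern above behaves as a regular expression, assuming that x stands for any residue (written as . in Python). The test sequences are synthetic and for illustration only; they are not taken from cha4_yeast.

```python
import re

# Bipartite pattern from the notes, with the wildcard x written as '.'.
BIPARTITE = re.compile(r"[RK]{3,}?.{8,16}[RK]{4,}?")


def find_bipartite(seq):
    """Return 0-based (start, end) spans of non-overlapping bipartite matches."""
    return [m.span() for m in BIPARTITE.finditer(seq.upper())]


# Synthetic examples (hypothetical, not real proteins):
# basic cluster KRK, 10-residue spacer, basic cluster KKRR.
WITH_SPACER = "AAKRKAAAAAAAAAAKKRRAA"
# Same clusters, but the spacer is only 3 residues long.
SHORT_SPACER = "KRKAAAKKRR"

if __name__ == "__main__":
    print(find_bipartite(WITH_SPACER))
    print(find_bipartite(SHORT_SPACER))
```

Note that the trailing {4,}? is lazy, so at the end of the pattern it consumes exactly four basic residues; the reported span can stop short of a longer basic stretch.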

#---------------------------------------------------------------------------------------------------

LKRKLQR                                 Pax-QNR         Carriere C, Cell Gr Diff 1995,(6):1531-40

# results of db scan
[PL][KR]{3,}?[PL][QM]R	8	100	0	pax6_brare,pax6_chick,pax6_cotja,pax6_human,pax6_mouse,pax6_oryla,pax6_rat,pax6_xenla	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc

# not found in any other families. 
# Other putative motifs
#/data/swissprot/current/h/pax6_human AN INTERESTING EXAMPLE OF USE
# FOR OUR TOOL.
#pax6_human:	 NRRAKWRRE
R[RK]x[KR]x[RK]{2,}?[DE]	132	100	0	hxa5_ambme,hxb1_ambme,scr_apime,h114_brare,hox3_brafl,hx5l_brare,hxa1_brare,hxb4_brare,hxb5_brare,hxb6_brare,hxc5_brare,hxc6_brare,hxd4_brare,pax6_brare,ctcf_chick,hxa4_chick,hxa7_cotja,hxb1_chick,hxb1_cypca,hxb3_chick,hxb4_chick,hxb5_chick,hxb6_chick,hxd3_chick,hxd4_chick,hxd8_chick,pax6_chick,pax6_cotja,un30_caeel,brm_drome,croc_drome,hmdf_drome,hmft_drohy,hmft_drome,hmux_drome,hmz1_drome,scr_drome,sus_drome,t2d2_drome,hxb4_fugru,ak95_human,cg2f_human,chd3_human,ctcf_human,fre3_human,hxa1_human,hxa3_human,hxa4_human,hxa5_human,hxa6_human,hxa7_human,hxb1_human,hxb3_human,hxb4_human,hxb5_human,hxb6_human,hxb7_human,hxb8_human,hxc4_human,hxc5_human,hxc6_human,hxc8_human,hxd3_human,hxd4_human,hxd8_human,ipf1_human,pax6_human,cx10_mouse,fre3_mouse,gsh2_mouse,gshi_mouse,hxa1_mouse,hxa3_mouse,hxa4_mouse,hxa5_mouse,hxa6_mouse,hxa7_mouse,hxb1_mouse,hxb3_mouse,hxb4_mouse,hxb5_mouse,hxb6_mouse,hxb7_mouse,hxb8_mouse,hxc4_mouse,hxc5_mouse,hxc6_mouse,hxc8_mouse,hxd1_mouse,hxd3_mouse,hxd4_mouse,hxd8_mouse,ipf1_mesau,ipf1_mouse,pax6_mouse,hxc5_notvi,hxc6_notvi,pax6_oryla,dpod_plafk,hxb8_pig,ak95_rat,hxa4_rat,hxa5_rat,hxa7_rat,hxb7_rat,hxb8_rat,hxc4_rat,hxc8_rat,hxd3_rat,ipf1_rat,pax6_rat,h2b1_strpu,h2b2_strpu,hxa4_sheep,hxa5_salsa,hxa5_sheep,hxa7_sheep,hxc6_sheep,hb7a_xenla,hb7b_xenla,hm8_xenla,hxa1_xenla,hxa7_xenla,hxb3_xenla,hxb4_xenla,hxb5_xenla,hxb6_xenla,hxc5_xenla,hxc6_xenla,hxd1_xenla,pax6_xenla,snf2_yeast	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
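A minimal sketch of turning the motif notation used in these notes into a Python regular expression and scanning a sequence with it. It assumes that a lowercase x is the only non-regex symbol (any residue) and that residues are written in uppercase; motif_to_regex and scan are hypothetical helper names. The two inputs are the pax6_human fragment and the Pax-QNR motif quoted above.

```python
import re


def motif_to_regex(motif):
    """Compile a motif from the notes; lowercase 'x' is treated as 'any residue'."""
    return re.compile(motif.replace("x", "."))


def scan(motif, seq):
    """Return (1-based start, matched text) for each non-overlapping match."""
    regex = motif_to_regex(motif)
    return [(m.start() + 1, m.group()) for m in regex.finditer(seq.upper())]


if __name__ == "__main__":
    # Fragment quoted for pax6_human in the notes.
    print(scan("R[RK]x[KR]x[RK]{2,}?[DE]", "NRRAKWRRE"))
    # Experimental Pax-QNR motif against its generalised pattern.
    print(scan("[PL][KR]{3,}?[PL][QM]R", "LKRKLQR"))
```

finditer reports non-overlapping matches only, which is enough to decide whether a protein is hit at all; counting overlapping occurrences would need a lookahead-based scan.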


#-------------------------------------------------------------------------------------------------

KRAAEDDEDDDVDTKKQK                      hProTalpha      Rubstov YT, FEBS Let 1997,(413):135-41

#results of db scan
KRx{12}KKQK	4	100	0	thya_bovin,thya_human,thya_mouse,thya_rat	nuc,nuc,nuc,nuc
#only in 1 family. probably not NLS.
[PL]K[DE]KK[DE]	5	100	0	thya_bovin,thya_human,scp1_mesau,thya_mouse,thya_rat	nuc,nuc,nuc,nuc,nuc

[GA]E{8}	13	100	0	thya_bovin,cenb_crigr,cenb_human,irf5_human,thya_human,cenb_mouse,irf5_mouse,ku70_mouse,thya_mouse,thya_rat,cenb_sheep,nupl_xenla,leur_yeast	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc
# CHECK IF DNA BINDING PROTEINS ARE RICH IN E
#--------------------------------------------------------------------------------------------
Look at /data/swissprot/current/h/vbp1_human location annotation.
Cytoplasmic but translocates to the nucleus when bound to C terminal
of vhl.
#---------------------------------------------------------------------------------------
3446	ebn2_ebv       	nuc            	euka   	 NUCLEAR. ASSOCIATED WITH THE NUCLEAR MATRIX.

#-------------------------------------------------------------------------------------------------
All 3 cyt proteins have extremely long stretches of R.
fxr1_human	cyt	621	RRRxRRRR,RRRRxRRR	502,501
fxr2_human	cyt	673	RRRxRRRR,RRRRxRRR	544,543
fxr1_xenla	cyt	648	RRRxRRRR,RRRRxRRR	531,530
x is S.

#------------------------------------------------------------------------------------------------
Non interchangeability of R and K.

KKKKKK	24	70.8333333333333	29.1666666666667	hs9a_chick,hs9a_crigr,sr72_canfa,ssrp_chick,no60_drome,t2d2_drome,chd4_human,gcf_human,hs9a_human,pwp1_human,rms1_human,sn24_human,sr72_human,t2fa_human,hs9a_mouse,phi1_myted,pls1_mouse,dpoa_oxyno,hs9a_pig,rpc1_plafa,top2_plafk,dkc1_rat,t2fa_xenla,top2_yeast	cyt,cyt,cyt,nuc,nuc,nuc,nuc,nuc,cyt,nuc,nuc,nuc,cyt,nuc,cyt,nuc,nuc,nuc,cyt,nuc,nuc,nuc,nuc,nuc

RRRRRR	81	100	0	hsp1_alose,hsp1_antla,hsp1_antst,hsp1_antsw,prt_antgr,prta_acist,prtb_acigu,gatb_bommo,hsp1_bovin,hsp2_bovin,prt1_bufja,prt2_bufja,cdp_canfa,hsp1_caefu,hsp1_cavpo,hsp2_calja,hsp_chick,hsp_cotja,hsp1_dasro,hsp1_dasvi,hsp1_didma,hsp1_droau,ebn6_ebv,hsp1_gorgo,fre4_human,hsp1_horse,hsp1_human,hsp1_hylla,hsp2_horse,hsp3_horse,rev_hv2be,rev_hv2d1,ve2_hpv37,h2b2_lytpi,hsp1_macag,hsp1_maceu,hsp1_macgi,hsp1_macrg,hsp1_macru,hsp1_mouse,hsp1_murlo,hsp2_macmu,hsp2_macne,hsp1_notty,prt1_oncke,prt5_oncmy,prt6_oncmy,prt7_oncmy,prt8_oncmy,prt9_oncmy,h2b1_paran,h2b2_paran,h2b3_paran,hsp1_parbi,hsp1_phaci,hsp1_pig,hsp1_plagi,hsp1_plain,hsp1_plams,hsp1_plate,hsp1_psecu,hsp2_pig,hsp1_rabit,hsp1_rat,h2b1_strpu,h2b2_strpu,hsp1_sagim,hsp1_sarha,hsp1_sheep,prt1_salir,prt1_sepof,prt2_salir,prt2_scyca,prt2_sepof,prt3_salir,prt3_scyca,rev_sivs4,rev_sivsp,hsp1_tacac,hsp1_trivu,mcm2_yeast	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc

#---------------------------------------------------------------------------------------------------
Expt suggested NLS's for AMIDA
RGRRRRQR			Amida		Irie Y, J.Biol.Chem.2000(275):2647-53
RKRRR				Amida		Irie Y, J.Biol.Chem.2000(275):2647-53
Seq of Amida

TITLE:Molecular cloning and characterization of amida, a novel protein
                which interacts with a neuron-specific immediate early gene product
                arc, contains novel nuclear localization signals, and causes cell
                death in cultured cells
SEQ:		1 meleqregtm aavgfeefsa ppgselalpp lfgghilese letevefvsg glggsglrer
		61 deeeeaargr rrrqrelnrr kyqalgrrcr eieqvnervl nrlhqvqrit rrlqqerrfl
		121 mrvldsygdd yrasqftivl edegsqgtda ptpgnaenep peketlsppr rtpappepgs
		181 papgegpsgr krrrvprdgr ragnaltpel apvqikveed fgfeadeald sswvsrgpdk
		241 llpyptlasp asd
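A minimal sketch for locating the two experimentally suggested NLS's in the Amida sequence above. It strips the position numbers and spaces from the printed block and reports 1-based start positions; clean_sequence and find_all are hypothetical helper names.

```python
import re

# Sequence block as printed in the notes (position numbers and spaces included).
AMIDA_RAW = """
1 meleqregtm aavgfeefsa ppgselalpp lfgghilese letevefvsg glggsglrer
61 deeeeaargr rrrqrelnrr kyqalgrrcr eieqvnervl nrlhqvqrit rrlqqerrfl
121 mrvldsygdd yrasqftivl edegsqgtda ptpgnaenep peketlsppr rtpappepgs
181 papgegpsgr krrrvprdgr ragnaltpel apvqikveed fgfeadeald sswvsrgpdk
241 llpyptlasp asd
"""


def clean_sequence(raw):
    """Drop everything that is not a letter and return the sequence in uppercase."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


def find_all(motif, seq):
    """Return 1-based start positions of every exact occurrence, overlaps included."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


amida = clean_sequence(AMIDA_RAW)

if __name__ == "__main__":
    for motif in ("RGRRRRQR", "RKRRR"):
        print(motif, find_all(motif, amida))
```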
results of db scan:
RKRRR	20	85	15	ht31_arath,mb11_copci,sdc3_caeel,chd3_human,sn22_human,ve2_hpv04,ve2_hpv07,ve2_hpv40,atf3_mouse,rms5_neucr,h2b_patgr,rpb1_plafd,fre6_rat,spm1_rat,leu3_salty,prt2_scyca,tat_sivmk,tat_sivml,leu3_theaq,yox1_yeast	nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,nuc,mit,nuc,nuc,nuc,nuc,cyt,nuc,nuc,nuc,cyt,nuc
[RK][GA][RK]{4}[QMN][RK]	3	66.6666666666667	33.3333333333333	ctcf_chick,ctcf_human,apt_myctu	nuc,nuc,cyt

#----------------------------------------------------------------------------------------------------
PPVKRERTS			RanBP3		Welch K, Mol Cell Biol.1999(19):8400-11


#-----------------------------------------------------------------------------------------------------
# CONS OF NLS WITHIN FAMILY.
consider tala_povba and tala_bfdv. Sequence Identity is 37%, weighted
similarity 50%.

-------- tala_povba has NLS: PKKKRKV
aligned segment of tala_bfdv: ENVSVPD 
Not a well conserved part of sequence. 
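A minimal sketch that quantifies how poorly the NLS segment is conserved, by computing percent identity over the two aligned segments quoted above. It assumes the segments are already aligned column by column and skips gap characters; it does not reproduce the 37% whole-sequence figure, which needs the full alignment. percent_identity is a hypothetical helper name.

```python
def percent_identity(seg_a, seg_b, gap="-"):
    """Percent of aligned, ungapped columns where both segments have the same residue."""
    if len(seg_a) != len(seg_b):
        raise ValueError("aligned segments must have equal length")
    columns = [(a, b) for a, b in zip(seg_a, seg_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # NLS of tala_povba against the aligned segment of tala_bfdv.
    print(percent_identity("PKKKRKV", "ENVSVPD"))
```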

tala_bfdv has NLS: [KR][DE][KR][DE]xx[KR][KR][KR][KR]