Optimal Homology probes
The normal function of prion protein is not currently understood. By comparing the prion protein sequence against existing databases of known sequences, homologous proteins of known function might be found in other species. In organisms such as yeast, the genome is entirely sequenced and there are very rapid methods for identifying protein function. These homologous proteins might have a slightly different biological role in their host yet still have binding sites for the same or similar small ligands. Knowing such a ligand is all that is needed to begin constructing therapeutic substrate analogues that could covalently and irreversibly inactivate the rogue form of prion protein.
Homology searches with the entire prion amino acid sequence return no matches other than prion proteins themselves. This is partly because the prion gene is single copy, with no 'close' members of its superfamily. In addition, homology search engines such as BLASTp are poor at inserting gaps, and prion alignments already need numerous small gaps within the mammals alone.
The octapeptide repeat has a variable expansion length. This region also amplifies slightly different palindromes in older lines. Its high glycine content further misleads BLASTp into returning, at high priority, spurious matches to unrelated proteins that contain poly-glycine stretches. The repeat-generating region is a better candidate for searches than the repeats themselves.
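One way to see where glycine-rich stretches could attract spurious hits is to scan a candidate probe with a sliding window and report the windows with a high glycine fraction. The sketch below is only illustrative: the function name `glycine_rich_windows`, the 8-residue window, and the 0.5 threshold are assumptions made for this example, not values used in the searches described here.

```python
def glycine_rich_windows(seq, window=8, threshold=0.5):
    """Return (start, glycine_fraction) for every window at or above threshold.

    The window size and threshold are illustrative assumptions, not tuned values.
    Start positions are 0-based.
    """
    seq = seq.lower()
    hits = []
    for start in range(len(seq) - window + 1):
        fraction = seq[start:start + window].count("g") / window
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


# Pre-repeat probe P1, taken from the list of recommended probes.
p1 = "rpkgggwntggsnrypgqpgspggnryph"
for start, fraction in glycine_rich_windows(p1):
    print(start, p1[start:start + 8], fraction)
```

Windows flagged this way could be masked or trimmed before a search; whether that helps for a given probe would need to be checked against real search results.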
It also makes no sense to include the N-terminal signal peptide: signal peptides are found in many unrelated proteins, vary widely in length and sequence, and require no canonical sequence. (Note, however, that the basic residues C-terminal to the cleavage site are strongly conserved.)
For the same reasons, the GPI sequence should not be used in deep homology probes: it weakens the signal used by the homology search engine. There are separate compilations of all known GPI proteins that can be accessed by a Medline search and examined individually.
Since we are looking for distant homology (to nematode, fruit fly, zebrafish, yeast, or bacteria) or weak paralogous homology within mammals, the search should avoid regions that are already hyper-variable within the placental mammals, such as the post-helix H3 stretch. Special emphasis should be given to invariant (or quasi-invariant) residues, especially those conserved in marsupial and chicken. Where the usual placental mammal residue disagrees with chicken and marsupial but a common placental substitution does agree, that variant should rule in the probe. Residues that are not quite invariant may exhibit only conservative substitutions (e.g., valine to leucine) -- information that still yields partial agreement in comparisons.
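The residue-selection rule in the preceding paragraph can be sketched column by column. This is one literal reading of the rule, and an assumption rather than the procedure actually used to build the probes: the placental residue is kept unless it matches neither outgroup (marsupial, chicken) while a common placental variant matches at least one of them. The names `choose_residue` and `build_probe` are hypothetical, and hand-curated probes may differ from what this rule produces.

```python
GAP = "-"


def choose_residue(mammal, marsupial, chicken, mammal_variant=GAP):
    """Choose the probe residue for one alignment column.

    Assumed reading of the rule: keep the placental residue unless it matches
    neither outgroup while a common placental variant matches at least one.
    """
    outgroups = {marsupial, chicken} - {GAP}
    if mammal in outgroups:
        return mammal
    if mammal_variant != GAP and mammal_variant in outgroups:
        return mammal_variant
    return mammal


def build_probe(mammal, marsupial, chicken, variants):
    """Apply choose_residue to each column of four equal-length aligned rows."""
    rows = (mammal, marsupial, chicken, variants)
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    return "".join(choose_residue(*column) for column in zip(*rows))


# Toy three-column example (hypothetical data, not a real alignment).
print(build_probe("sdh", "nep", "spa", "ney"))  # seh
```

In the toy example only the middle column changes: the placental `d` matches neither outgroup, and the variant `e` matches the marsupial residue.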
Proteins are commonly composed of domains. In some cases, domains are seemingly assembled from disparate sources by recombination and transposition; here, however, the prion protein is encoded within a single exon. The search should still reflect domains identified in the NMR structure as well as apparent exposed hinges inferred from hydrolytic cleavage by proteases. Anchors such as the cysteines and substituted asparagines are also of value.
In short, a series of probes that each span single or adjacent well-conserved structural features offers the best chance of finding homologies. Candidates can then be scrutinized further by more subtle testing. On this basis, the best probes for prion protein are:
Recommended Probe Sequence                            Probe Name
rpkgggwntggsnrypgqpgspggnryph                         pre-repeat P1
kpskpktnlkhvagaaaagavvgglggymlgsams                   core P2a
-----pktnlkhvagaaaagavvgglggymlg----                  strong core P2b
msrpiihfgneyedryyrenmyrypnqvyyrp                      Helix H1 P3
yssqnnfvhdcvnitvkqhtvttttkgenftetdikimervveqmcitqy    Helix H2 P4
Post-signal, pre-repeat:
rpkgggwntggs-r---ypgq-gspggnryph: probe aligned
rpkgggwntggs-r---ypgq-gspggnrypp: mammal
rp-gggwnsggsnr---ypgqpgspggnryph: marsupial
kpsgggw-gagshrqpsyprqpg------yph: chicken

Mid-core past beta B1:
kpsk-pktnlkhvagaaaagavvgglggymlgsams: probe aligned
kpwkppktnfkhvagaaaagavvgglg-yamgrvms: chicken
kpdk-pktnlkhvagaaaagavvgglggymlgsams: marsupial
kpsk-pktnmkhmagaaaagavvgglggymlgsams: mammal
---------l--v-----------------------: relevant mammalian variants

Helix H1 through beta B2:
msrpiihfgneyedryyrenmyrypnqvyyrd: probe aligned
msgmnyhfdspdeyrwwsensarypnrvyyrd: chicken
msrpvihfgneyedryyrenqyrypnqvmyrp: marsupial
msrpiihfgsdyedryyrenmhrypnqvyyrp: mammal
---------ne----------y----------: mammal variants

Mid-helix H2 through helix H3:
yss---qnnfvhdcvnitv---------kqhtvttttkgenftetdikime-rvv-----eqmcitqy: probe aligned
ysspvpqdvfvadcfnitvteysigpaakkntsea-vaaanqteve---menkvvtkvire-mcvqqy: chicken
yss---qnnfvhdcvnitv---------kqhttttttkgenftetdikime-rvv-----eqmcitqy: marsupial
ysn---qnnfvhdcvniti---------kqhtvttttkgenftetdvkmme-rvv-----eqmcitqy: mammal
--s---------------v---------------------------i-i---------------v---: rel. mammal variants
rpkgggwntggsnrypgqpgspggnryph
kpskpktnlkhvagaaaagavvgglggymlg
samsmsrpiihfgneyedryyrenmyrypnqvyyrpyssqnnfvhdcvnitvkqhtvttttkgenftetdikimervveqmcitqy
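To make "invariant residues" concrete, the following sketch marks the alignment columns where chicken, marsupial, and mammal all carry the same residue, and reports the longest unbroken run of such columns. It uses the three species rows of the mid-core alignment given above; the helper names `invariant_mask` and `longest_invariant_run` are assumptions made for illustration.

```python
def invariant_mask(rows):
    """Mark with '*' each column where every row has the same non-gap residue."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    return "".join(
        "*" if len(set(column)) == 1 and column[0] != "-" else " "
        for column in zip(*rows)
    )


def longest_invariant_run(rows):
    """Return the residues of the longest stretch of consecutive invariant columns."""
    mask = invariant_mask(rows)
    best_start, best_len, run_start = 0, 0, None
    for i, mark in enumerate(mask + " "):  # trailing space closes a final run
        if mark == "*":
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return rows[0][best_start:best_start + best_len]


# Species rows of the mid-core alignment.
mid_core = [
    "kpwkppktnfkhvagaaaagavvgglg-yamgrvms",  # chicken
    "kpdk-pktnlkhvagaaaagavvgglggymlgsams",  # marsupial
    "kpsk-pktnmkhmagaaaagavvgglggymlgsams",  # mammal
]
print(invariant_mask(mid_core))
print(longest_invariant_run(mid_core))  # agaaaagavvgglg
```

For these three rows, 27 of 36 columns are invariant, and the longest run lies inside the strong core probe P2b.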
A DNA probe for exon 1:
tcccccgcgttgtcggatcagcagaccgattctgggcgctgcgtcgcatcggtggcag
A DNA probe for exon 2:
gactcctgagtatatttcagaactgaaccatttcaaccgagctgaagcattctgccttcctagtggtaccagtccaatttaggagagccaagcagactg
A DNA probe for exon 3, UTR 5' leader:
ttttgcag agaagtcatc ATG [end of intron, beginning of exon 3, beginning of ORF]
A DNA probe for exon 3, UTR 3' trailer:
gggaggccttcctgcttgttccttcgcatttctcgtggtctaggctgggggaggggttatcc
11 Feb 1997 Homology searches:
Exon 1 Blastn results:
Hamster (Syrian golden)
Query:   1 TCCCCCGCGTTGTCGGATCAGCAGACCGA 29
           |||||||||  ||| || |||||||||||
Sbjct: 345 TCCCCCGCGGCGTCCGAGCAGCAGACCGA 373
tccccc gcggcgtccg agcagcagac cgagaaggca catcgagtcc actcgtcgcg tcggtggcag M.auratus match
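The identity counts that BLAST reports for an ungapped alignment can be recomputed directly from the Query and Sbjct strings, which is a useful check when the match line has been damaged in copying. The sketch below does this for the hamster alignment above; `count_identities` is a hypothetical helper, and it assumes the two strings are already aligned and of equal length.

```python
def count_identities(query, subject):
    """Return (identical_positions, alignment_length) for two aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    pairs = zip(query.upper(), subject.upper())
    matches = sum(q == s and q != "-" for q, s in pairs)
    return matches, len(query)


# Exon 1 probe (Query) against the Syrian hamster sequence (Sbjct).
query = "TCCCCCGCGTTGTCGGATCAGCAGACCGA"
sbjct = "TCCCCCGCGGCGTCCGAGCAGCAGACCGA"
matches, length = count_identities(query, sbjct)
print(f"Identities = {matches}/{length} ({round(100 * matches / length)}%)")
# Identities = 25/29 (86%)
```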
Exon 2 Blastn results:
sequence essentially unique to prion promoters.
Exon 3 Blastn results:
sequence a little short for good probabilities.
Exon 3 3' trailer Blastn results:
some human matches but none significant.
C. elegans to GenBank similarities
C18H7 - Chromosome 4 - Finisher Aye Mon Tin 960924
Production: Bill Fronick
Bases: 40978
Genes: 13
cDNAs: 11
111.00 SW:PRIO_CHICK P27177 MAJOR PRION PROTEIN HOMOLOG PRECURSOR (ACETYLCHOLINE RECEPTOR-INDUCING ACTIVITY)
740.00 TR:G559703 MRNA (KIAA0068) FOR ORF, PARTIAL CDS
110.00 SW:FES_FSVST P00543 TYROSINE-PROTEIN KINASE TRANSFORMING PROTEIN FES
515.00 TR:G395145 CUTICULAR COLLAGEN
f11g11/950504
F11G11 - finishers Phil Latreille Rebecca Deadman 950504
Bases: 34348
Genes: 13
tRNAs: 0
cDNAs: 5
93.00 KCRB_CHICK P05122 CREATINE KINASE, B CHAIN
233.00 CC13_CAEEL P20631 CUTICLE COLLAGEN 13 PRECURSOR
94.00 THTR_CHICK P25324 THIOSULFATE SULFURTRANSFERASE
26.00 PRIO_CHICK P27177 MAJOR PRION PROTEIN HOMOLOG PRECURSOR
C. elegans BLAST search 16 Feb 1997
F25E2
Length = 29,976
Plus Strand HSPs:
Score = 101 (27.9 bits), Expect = 5.4, P = 1.0
Identities = 33/49 (67%), Positives = 33/49 (67%), Strand = Plus / Plus
Query:     1 ATGGTGAAAATCCACATAGGCAGCTGGATCCTGGTTCTCTTTGTGGCCA 49
             |||||||||||||  ||| | || ||||  |  || |   ||  | |||
Sbjct: 19395 ATGGTGAAAATCCGGATAAGGAGTTGGAAACCCGTACAAGTTCGGACCA 19443
>F47C12
Length = 34,425
Minus Strand HSPs:
Score = 108 (29.8 bits), Expect = 3.2, P = 0.96
Identities = 28/36 (77%), Positives = 28/36 (77%), Strand = Minus / Plus
Query:    47 CTTCAGCTCGGTTGAAATGGTTCAGTTCTGAAATAT 12
             |||||| | || || ||||||||||| | | ||| |
Sbjct: 30343 CTTCAGATAGGGTGGAATGGTTCAGTGCCGCAATGT 30378
>B0205
Length = 40,951
Minus Strand HSPs:
Score = 115 (31.8 bits), Expect = 0.021, P = 0.020
Identities = 31/41 (75%), Positives = 31/41 (75%), Strand = Minus / Plus
Query:    62 GGATAACCCCTCCCCCAGCCTAGACCACGAGAAATGCGAAG 22
             ||| | || | | | | ||||  ||||||||||||||| ||
Sbjct: 26547 GGACAGCCTCACTCACGGCCTCCACCACGAGAAATGCGCAG 26587
>T27D1
Length = 22,559
Minus Strand HSPs:
Score = 96 (26.5 bits), Expect = 8.2, P = 1.0
Identities = 24/30 (80%), Positives = 24/30 (80%), Strand = Minus / Plus
Query:    51 CCCCCAGCCTAGACCACGAGAAATGCGAAG 22
             | | | ||||  ||||||||||||||| ||
Sbjct: 18677 CACACGGCCTCCACCACGAGAAATGCGCAG 18706
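For the minus-strand HSPs above, the Query coordinates run from high to low because BLAST is showing the reverse complement of the probe. A minimal sketch of that mapping follows; `reverse_complement` and `minus_strand_query` are hypothetical helper names, and the coordinates are assumed to be 1-based and inclusive, as in the BLAST report.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (a, c, g, t only)."""
    return dna.translate(COMPLEMENT)[::-1]


def minus_strand_query(probe, start, end):
    """Return the minus-strand query text for 1-based coordinates start >= end."""
    if not 1 <= end <= start <= len(probe):
        raise ValueError("expected 1 <= end <= start <= len(probe)")
    return reverse_complement(probe[end - 1:start])


# Exon 2 probe, as listed earlier in this document.
exon2 = (
    "gactcctgagtatatttcagaactgaaccatttcaaccgagctgaagcattctgccttcc"
    "tagtggtaccagtccaatttaggagagccaagcagactg"
)
# Query 47..12 of the F47C12 hit.
print(minus_strand_query(exon2, 47, 12).upper())
# CTTCAGCTCGGTTGAAATGGTTCAGTTCTGAAATAT
```

The printed string matches the Query line of the F47C12 alignment, which confirms that this hit came from the exon 2 probe.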
------------
Feb 18 1996 yeast search with chicken core probe
Score = 46 (21.1 bits), Expect = 0.35, P = 0.29
Identities = 11/27 (40%), Positives = 16/27 (59%)
Query:    1 PKTNLKHVAGAAAAGAVVGGLGGYMLG 27
            P+T L+ +AG   +G  +GGL  Y  G
Sbjct: 1407 PQTPLRSLAGLIDSGIPLGGLTLYGSG 1433
YDR420W 1306301 1311709 HKR1 Hansenula mrakii killer toxin-resistance protein
Annotation : ann-05369
Gene_info HKR1
Summary References available in SGD for HKR1
Reference Yabe, T., et al. (1996) HKR1 encodes a cell surface protein that regulates both cell wall beta-glucan synthesis and budding pattern in the yeast Saccharomyces cerevisiae. J Bacteriol 178:477-483
Kasahara, S., et al. (1994) Cloning of the Saccharomyces cerevisiae gene whose overexpression overcomes the effects of HM-1 killer toxin, which inhibits beta-glucan synthesis. J Bacteriol 176:1488-1499
results for search with entry wp:k04h4.1>k04h4.1 ce00246 emb-9: collagen (cambridge) sw:p17139
msrlsllgltaavvllssfcqdrihvdaaaackgcappcvcpgtkgergnpgfggepghpgapgqdgpegapgapgmfgaegdfgdmgskgargdrglpgspghpglqgldglpglkgeegipgcngtdvsdlsksdicniihlsdvvsvlrvslecpdlldlqgnldktetlddqdspdhqekevsihkdakelkenledqefqvfqgnsgypglkgakgdpgpyglpgfpgvsglkgrmgvrtsgvkgekglpgppgppgqpgsypwaskpiemevlqglsdqlvgvkgekgrdgpvgppgmlgldgppgypglkgqkgdlgdagqrgkrgkdgvpgnygekgsqgeqglggtpgypgtkggagepgypgrpgfegdcgpegplgegtgapgqpgidgmpgytekgdrgedgypgfagepglpgepgdcgypgedglpgydiqgppgldgqsgrdgfpgipgdigdpgysgekgfpgtgvnkvgppgmtglpgepgmpgrigvdgypgppgnngergedcgycpdgvpgnagdpgfpgmngypgppgpngdhgdcgmpgapgkprsagsdglsgspglpgipgypgmkgeageivgpmenpagipglkgdhglpglpgrpgsdglpgypggpgqngfpglqgepglagidgkrgrqgslgipglqgppgdsfpgqpgtpgykgergadglpglpgaqgprgipaplrivnqvagqpgvdgmpglpgdrgadglpglpgpvgpdgypgtpgergmdglpgfpglhgepgmrgqqgevgfngidgdcgepgldgypgatrapgapgetgfgfpgqvgypgpngdagaaglpgpdgypgrdglpgtpgypgeagmngqdgapgqpgsrgesglvgidgkkgrdgtpgtrgqdggpgysgeagapgqngmdgypgapgdqgypgspgqdgypgpsgipgedglvgfpglrgehgdnglpglegecgeegsrgldgvpgypgehgtdglpglpgadgqpgfvgeagepgtpgyrgqpgepgnlaypgqpgdvgypgpdgppglpgqdglpglngergdngdsypgnpglsgqpgdagydgldgvpgppgypgitgmpglkgesglpglpgrqgndgipgqpglegecgedgfpgspgqpgypgqqgregekgypgipgenglpglrgqdgqpglkgengldgqpgypgsagqlgtpgdvgypgapgengdngnqgrdgqpglrgesgqpgqpglpgrdgqpgpvgppgddgypgapgqdiygptgqagqdgypgldglpgapglngepgspgqygmpglpggpgesglpgypgerglpgldgkrghdglpgapgvpgvegvpglegdcgedgypgapgapgsngypgerglpgvpgqqgrsgdngypgapgqpgikgprgddgfpgrdgldglpgrpgreglpgpmamavrnppgqpgengypgekgypglpgdnglsgppgkagypgapgtdgypgppglsgmpghggdqgfqgaagrtgnpglpgtpgypgspggwapsrgftfakhsqttavpqcppgasqlwegysllyvqgngrasgqdlgqpgsclskfntmpfmfcnmnsvchvssrndysfwlstdepmtpmmnpvtgtairpyisrcavcevptqiiavhsqdtsvpqcpqgwsgmwtgysfvmhtaagaegtgqslqspgscleefravpfiechgrgtcnyyatnhgfwlsivdqdkqfrkpmsqtlkagglkdrvsrcqvclknr