TCEs: Transmissible Chickeniform Encephalopathies
Six medline abstracts on chicken prion
What are the prospects for TCE in chickens? We certainly expect it, from the extended Gibbs Principle, which states that every vertebrate will exhibit an ineradicable, endemic level of TSE disease from natural germline and somatic mutation. So far, chickens are the only non-mammal for which the primary structure of prion is known. While the chicken sequence aligns well in spots (ominously, almost perfectly to toxic fragment 92-122), is the best match in a chicken genomic DNA library, and occurs as a single copy in a single ORF, it has not actually been shown to be orthologous to mammalian prion [myoglobin and hemoglobin, for example, both bind oxygen yet are paralogues, not orthologues], and overall sequence agreement is so-so. Homology is always complicated by the potential fusion of remote individual domains.
Supposing that the chicken gene is only paralogous, there could be a second, undiscovered orthologue of it in mammals, which would then be a promising candidate for a new (though related) class of prion diseases, under the paralogous Gibbs Principle. So far a mammalian probe has been used in chicken, but a degenerate chicken probe has not been used in mammals. Thus mammals could, in fact, have an undetected additional prion-like gene with a corresponding second form of prionic disease.
One special problem with chicken prion is its hexapeptide repeat, which occurs in 9 rather faithful copies. Although the repeats occupy a region comparable to the 4-5 octapeptide repeats of placental mammals and the 4 nonapeptide repeats of marsupials, it is hard to infer an ancestral vertebrate sequence that could give rise to 6x9=54, 8x5=40, and 9x4=36 residues (but see below). Further, polymorphisms involving extra repeats and repeat deletions have been reported in cow and in familial CJD (where they cause roughly half the adjusted incidence). This is evidently a very labile region.
Chickens are a highly inbred and long-domesticated species; the prion sequence may be unrepresentative of the avian lineage. Without more sequences from the lower vertebrates, we cannot rule out the worry that birds normally have a 4-5 fold repeat, and that the 9 repeats in domestic chicken correspond to the comparable pathogenic inserts seen in humans. That is, every chicken could be inadvertently incubating a TSE. In support of this, repeats R3, R4, and R5 are identical as DNA in White Leghorns (as noted in PNAS 89: 9097 1993), i.e., the duplication is so recent that not even silent mutations have had time to accumulate. In human CJD, additional octapeptide repeats usually carry 2-3 wobble nucleotide changes in the new repeat; thus R7-R9, with their modest changes, could also be very recent repeats in chickens. Silent third codon positions are not equilibrated, supporting recent amplification (or the need for a stable hairpin in the nucleic acid).
In short, we have a pressing need for more prion sequences from (so-called) lower vertebrates, for this reason and also to augment currently failing BLASTp homology searches. That is, we have prion back to sharks, but can't find it in C. elegans, Drosophila, or yeast. A phylogenetically motivated list of the "10 most wanted" prion sequences might include other birds, lizards, turtles, frogs, bony fish, notochordates, tunicates, echinoderms, amphioxus, and so on. (The yeast psi system is not homologous; while a welcome addition, it may fade in relevance as researchers get deeper into the particulars.)
Indeed, the main paradox in prion evolution is the absence of paralogues in the face of extraordinary sequence conservation. That is, if it is truly not in yeast or C. elegans, where did it come from and where is the rest of its superfamily? Otherwise, it has to have undergone burst evolution from something now unrecognizable into something that has hardly changed in 135,000,000 years [i.e., changed function].
By comparing chicken, marsupial, and mammal repeat regions, it emerges that nearby but different stretches of DNA were amplified in a series of events. Thus the repeat regions are not structurally homologous to each other, though their ultimate function may be similar. Beginning with a short palindrome, the forestem was repeated in mammals, the backstem in chickens. The repeat region may be analogous to that of the trinucleotide repeat neurological genes, serving to modulate translation and in situ translocation. Repeat-CJD may thus be a gene dosage effect or mis-compartmentalization, and not arise from the altered protein structure per se.
Because the chicken sequence is quite divergent, it is best to clamp the sequences by domain and to specific anchors, such as the known leader signal peptide, cysteine cross-bridge, glycosylated asparagines, alpha helices, beta sheets, and GPI tail, using conventional alignment between the clamped residues. (Homology software has difficulties with divergent sequences involving insertions and deletions and makes no use of 3D information in optimizing alignments.) Having aligned the sequences, it is then instructive to compare variation in beta strands, helices, and so on across species. The core region (post-repeat, pre-beta 1) is the region best conserved across all species (as expected of a ligand binding site or hydrophobic core); unfortunately, it was lost in preparing the sample for 3D NMR. It is predominantly basic with prolines at the N-terminal end but very hydrophobic, except for one lysine, in the C-terminal 16 residues, which score high in theoretical alpha-helix-forming potential because of the many alanines.
Three glycosylation sites in chickens create some potential for alignment confusion with mammals, which have only two sites. The region in question lies between the start of helix 2 and the end of helix 3, at positions 194, 209, and 218. According to a recent review of asparagine glycosylation criteria, all three potential sites are consistent with the consensus sequence. The best fit is obtained by taking sites A1 and A2 as homologous to mammals. However, A3 may also be glycosylated in vivo.
...cfnitvteysigpaakkntseavaaanqtevemenkvvtkviremcvqq...
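The three candidate sites can be located mechanically in the excerpt above. The following is only a minimal sketch: it scans for the Asn-Xaa-Ser/Thr consensus with proline excluded at Xaa, and says nothing about whether a site is actually glycosylated in vivo. The function name `find_sequons` and the constant `FIRST_SITE_POSITION` are illustrative; the latter assumes that the first hit corresponds to the position 194 quoted in the text.

```python
import re

# Chicken excerpt quoted above (start of helix 2 through end of helix 3).
EXCERPT = "cfnitvteysigpaakkntseavaaanqtevemenkvvtkviremcvqq"

# Consensus tripeptide Asn-Xaa-Ser/Thr, with proline excluded at Xaa.
# The lookahead lets overlapping candidates be reported as well.
SEQUON = re.compile(r"(?=(n[^p][st]))")


def find_sequons(seq):
    """Return (0-based offset, tripeptide) for each candidate site."""
    return [(m.start(), m.group(1)) for m in SEQUON.finditer(seq.lower())]


sites = find_sequons(EXCERPT)

# Assumption: the first candidate is the site the text places at 194.
FIRST_SITE_POSITION = 194
positions = [FIRST_SITE_POSITION + off - sites[0][0] for off, _ in sites]

for (off, tri), pos in zip(sites, positions):
    print(pos, tri.upper())
```

Under that assumption the spacing of the hits reproduces the three positions given in the text; the consensus match alone does not decide which of them are homologous to the two mammalian sites.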
Internal Comparison of Chicken Repeats
Nucleotide Sequence                            Protein       Repeat
aaa ccc agt ggt ggg ggt tgg ggc gcc ggg agc    kpsgggwgags   pre-repeat
CAT CGC CAG CCC AGC TAC CCC                    HRQPSYP       R1
--- CGC CAG CCG GGC TAC CCT                    -RQPGYP       R2
CAT --- AAC CCA GGG TAC CCC                    H-NPGYP       R3
CAT --- AAC CCA GGG TAC CCC                    H-NPGYP       R4
CAT --- AAC CCA GGG TAC CCC                    H-NPGYP       R5
CAC --- AAC CCT GGC TAT CCC                    H-NPGYP       R6
CAT --- AAC CCC GGC TAC CCC                    H-NPGYP       R7
CAG --- AAC CCT GGC TAC CCC                    Q-NPGYP       R8
CAT --- AAC CCA GGT TAC CCA                    H-NPGYP       R9
ggc tgg ggt caa ggc* tac aac cca               gwgqgynp      post-repeat

*These four codons are conserved across all species; the absence of silent mutations suggests that nucleotide sequence is more important than amino acid sequence.
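The claim that R3, R4, and R5 are identical as DNA, while the other repeats differ by a few codons, can be checked directly against the codons listed above. This is a small sketch, not part of the original analysis; the names `REPEATS` and `codon_differences` are illustrative, and the comparison is a plain codon-by-codon count with the alignment gap ("---") treated as just another symbol.

```python
from itertools import combinations

# Codons exactly as listed for R1-R9; "---" marks an alignment gap.
REPEATS = {
    "R1": "CAT CGC CAG CCC AGC TAC CCC",
    "R2": "--- CGC CAG CCG GGC TAC CCT",
    "R3": "CAT --- AAC CCA GGG TAC CCC",
    "R4": "CAT --- AAC CCA GGG TAC CCC",
    "R5": "CAT --- AAC CCA GGG TAC CCC",
    "R6": "CAC --- AAC CCT GGC TAT CCC",
    "R7": "CAT --- AAC CCC GGC TAC CCC",
    "R8": "CAG --- AAC CCT GGC TAC CCC",
    "R9": "CAT --- AAC CCA GGT TAC CCA",
}


def codon_differences(a, b):
    """Count aligned codon positions at which repeats a and b differ."""
    return sum(x != y for x, y in zip(REPEATS[a].split(), REPEATS[b].split()))


# Pairs of repeats with no nucleotide difference at all.
identical = [pair for pair in combinations(REPEATS, 2)
             if codon_differences(*pair) == 0]
print("identical as DNA:", identical)

# Divergence of the later repeats from R3, in codons.
for name in ("R6", "R7", "R8", "R9"):
    print(name, "vs R3:", codon_differences("R3", name))
```

Only the three pairs drawn from R3, R4, and R5 come out identical; R6 through R9 each differ from R3 at two to four codon positions.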
Alignment of Chicken, Marsupial, Human

species     #   Leader Signal Peptide      Signal Basic   #
chicken     1   marllttccllalllaactdvals   kkg-kgk        30
marsupial   1   mgkiqlgywilvlfivtwsdlglc   kk-pkpr        30
human       1   manl--gcwmlvlfvatwsdlglc   kkrpk--        27

species     Stem-Loop-Backstem    Ancient Repeats       Individualized Species Repeats                       Loop
chicken     psgggw.-gags.hrqpsy   prqpgy.phnpgy         phnpgy.phnpgy.phnpgy.phnpgy.pqnpgy.phnpgy.pgwgqg-y   npssggsyhnq
marsupial   p-gggw.nsggs.nr---y   pgqpgs.pggnry.pgwgh   pqgggtnwgq.phpggsnwgq.phpggsswgq.phggsn.....wgqggy   ---nkw
human       p--ggw.ntggs.-r---y   pgq-gs.pggnry.p       pqggggwgq.phggg-wgq.phgggwgq.phgggwgq.phgg.gwgqgg-   gthsqwn

species     core region                    beta 1   loop           helix 1       loop     beta 2   loop
chicken     kpwkppktnfkhvagaaaagavvgglgg   .yamg.   rvmsgmnyhfds   pdeyrwwsens   arypnr   .vyyr.   --d-ysspvpqdvfvad
marsupial   kpdk-pktnlkhvagaaaagavvgglgg   .ymlg.   samsrpvihfgn   eyedryyrenq   yrypnq   .vmyr.   pidqyss---qnnfvhd
human       kpsk-pktnmkhmagaaaagavvgglgg   .ymlg.   samsrpiihfgs   dyedryyrenm   hrypnq   .vyyr.   pmdeysn---qnnfvhd

species     cys 1 ... asn 1 ... helix 2   asn 2     helix3 ... cys2            pre-GPI         GPI
chicken     cfnitvteysigpaakkntsea-       vaaanqt   eve---menkvvtkvire-mcvqq   y-rey-----rla   sgiqlhpadtwlavlllllttl-famh-
marsupial   cvnitv---------kqhttttt       tkgenft   etdikime-rvv-----eqmcitq   yqaeyeaaaqr-a   ynmaffsapp---vtllflsfliflivs
human       cvniti---------kqhtvttt       tkgenft   etdvkmme-rvv-----eqmcitq   yeresqayyqrg-   ssmvlfsspp---villi-sfliflivg
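
Once the sequences are clamped by domain, variation can be compared region by region. As a rough sketch, the snippet below counts identical residues between the chicken and human rows for three of the aligned blocks above. The names `REGIONS`, `NON_RESIDUE`, and `identity` are illustrative, and the scoring rule (skip any column containing a gap "-" or a separator ".") is an assumption made here, not a method stated in the text.

```python
# Aligned chicken/human blocks copied from the alignment above.
REGIONS = {
    "core region": ("kpwkppktnfkhvagaaaagavvgglgg",
                    "kpsk-pktnmkhmagaaaagavvgglgg"),
    "helix 1": ("pdeyrwwsens", "dyedryyrenm"),
    "beta 2": (".vyyr.", ".vyyr."),
}

# Characters that mark gaps or block separators rather than residues.
NON_RESIDUE = "-."


def identity(a, b):
    """Return (identical residues, compared columns) for two aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    pairs = [(x, y) for x, y in zip(a, b)
             if x not in NON_RESIDUE and y not in NON_RESIDUE]
    matches = sum(x == y for x, y in pairs)
    return matches, len(pairs)


for name, (chicken, human) in REGIONS.items():
    matches, compared = identity(chicken, human)
    print(f"{name}: {matches}/{compared} identical")
```

On these blocks the core region and beta 2 are far better conserved between chicken and human than helix 1, in line with the observation that the core is the best-conserved region.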
Origin of the Repeat Region
--------- early vertebrate lineage
B O1 H1
B O1 H1 H2 H3
--------- chicken lineage
B O1 H1 H2 H3 H4 H5 H6 H7
B O1 H1 H2 H3 H3 H3 H4 H5 H6 H7
--------- marsupial lineage
B O1 H1 H2 H3 O2 O3 O3 O3
--------- mammalian lineage
B O1 H1 H2 H3 H3 H3
Helix C in humans begins at the boundary of the first repeat:
_ac cgc tac cca cct cag ggc ggt ggt
(N)  R   Y   P   P   Q   G   G   G
Variation in Key Central Regions
Variations found in 34 species are listed alphabetically with frequency for each of 7 key features.

pre-core        invariant core     #    beta sheet   #    helix 1       #
KPNKPKTSMKHM    AGAAAAGAVVGGLGG    1    YMLG VYYR   27    DYEDRYYRENM  26
KPSKPKSNMKHM    AGAAAAGAVVGGLGG    1    YMLG VYYK    2    DWEDRYYRENM   4
KPSKPKTNLKHV    AGAAAAGAVVGGLGG    2    YMLG VMYR    1    EYEDRYYRENQ   1
KPSKPKTNLKHVW   AGAAAAGAVVGGLGG    1    YAMG VYYR    1    PDEYRWWSENS   1
KPSKPKTNMKHM    AGAAAAGAVVGGLGG    8
KPSKPKTNMKHV    AGAAAAGAVVGGLGG   11
KPSKPKTSMKHM    AGAAAAGAVVGGLGG    3
KPSKPKTSMKHV    AGAAAAGAVVGGLGG    2
KPWKPPKTNFKHV   AGAAAAGAVVGGLGG    1
Variation in Key C-Terminal Regions

helix 2                  #    asn 2     #    helix 3                #
CFNITVTEYSIGPAAKKNTSEA   1    KGENFT    1    ETDIKIMERVVEQMCITQ     3
CVNITIKQHTVTTT          19    TKGENFT  29    ETDIKIMERVVEQMCTTQ     1
CVNITVKEHTVTTT           2    TKGENLT   1    ETDIKMMERVVEQMCITQ     5
CVNITVKQHTTTTTT          1    VAAANQT   1    ETDMKIMERVVEQMCVTQ     2
CVNITVKQHTVTTT           8                   ETDVKIMERVVEQMCITQ     1
CVNVTIKQHTVTTT           1                   ETDVKMIERVVEQMCITQ     1
                                             ETDVKMMERVVEQMCITQ    15
                                             ETDVKMMERVVEQMCVTQ     3
                                             EVEMENKVVTKVIREMCVQQ   1
SwissProt Review: "It has been known for a long time that potential N-glycosylation sites are specific to the consensus sequence Asn-Xaa-Ser/Thr. It must be noted that the presence of the consensus tripeptide is not sufficient to conclude that an asparagine residue is glycosylated, due to the fact that the folding of the protein plays an important role in the regulation of N-glycosylation. It has been shown that the presence of proline between Asn and Ser/Thr will inhibit N-glycosylation; this has been confirmed by a recent statistical analysis of glycosylation sites, which also shows that about 50% of the sites that have a proline C-terminal to Ser/Thr are not glycosylated.
It must also be noted that there are a few reported cases of glycosylation sites with the pattern Asn-Xaa-Cys; an experimentally demonstrated occurrence of such a non-standard site is found in the plasma protein C."