Sequence determination and analysis of the 3' region of chicken pro-alpha 1(I) and pro-alpha 2(I) collagen messenger ribonucleic acids including the carboxy-terminal propeptide sequences.

Fuller F., Boedtker H.

Three pro-alpha 1 collagen cDNA clones, pCg1, pCg26, and pCg54, and two pro-alpha 2 collagen cDNA clones, pCg13 and pCg45, were subjected to extensive DNA sequence determination. The combined sequences specified the amino acid sequences for chicken pro-alpha 1 and pro-alpha 2 type I collagens starting at residue 814 in the collagen triple-helical region and continuing to the procollagen C-termini as determined by the first in-phase termination codon. Thus, the sequences of 272 pro-alpha 1 C-terminal, 260 pro-alpha 2 C-terminal, 201 pro-alpha 1 helical, and 201 pro-alpha 2 helical amino acids were established. In addition, the sequences of several hundred nucleotides corresponding to noncoding regions of both procollagen mRNAs were determined. In total, 1589 pro-alpha 1 base pairs and 1691 pro-alpha 2 base pairs were sequenced, corresponding to approximately one-third of the total length of each mRNA. Both procollagen mRNA sequences have a high G+C content. The pro-alpha 1 mRNA is 75% G+C in the helical coding region sequenced and 61% G+C in the C-terminal coding region, while the pro-alpha 2 mRNA is 60% and 48% G+C, respectively, in these regions. The dinucleotide sequence pCG occurs at a higher frequency in both sequences than is normally found in vertebrate DNAs and is approximately 5 times more frequent in the pro-alpha 1 sequence than in the pro-alpha 2 sequence. Nucleotide homology in the helical coding regions is very limited given that these sequences code for the repeating Gly-X-Y tripeptide in a region where X and Y residues are 50% conserved. These differences are clearly reflected in the preferred codon usages of the two mRNAs.
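
The G+C percentages and the pCG dinucleotide frequency reported above are simple composition statistics. The following is a minimal Python sketch of how such values could be computed for any nucleotide sequence. It is not the authors' procedure: the function names `gc_fraction` and `cpg_obs_exp` are hypothetical, the observed/expected CG ratio is one common normalization chosen here for illustration, and the example string is a made-up toy sequence, not data from the paper.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the A/C/G/T/U bases in seq (0.0 if none)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGTU")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected ratio for the CG dinucleotide.

    Expected count is taken as (number of C * number of G) / length,
    an assumed normalization; returns 0.0 when C or G is absent.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    observed = seq.count("CG")
    return observed * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    # Hypothetical toy sequence for illustration only.
    toy = "GGTCCCCGCGGCCCTCCCGGT"
    print(f"G+C fraction: {gc_fraction(toy):.2f}")
    print(f"CG observed/expected: {cpg_obs_exp(toy):.2f}")
```

Applied separately to the helical and C-terminal coding regions of each mRNA, functions like these would yield values comparable to the percentages quoted in the abstract, provided the same region boundaries are used.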

Biochemistry 20:996-1006(1981)