The glyceraldehyde-3-phosphate dehydrogenase gene family: structure of a human cDNA and of an X chromosome-linked pseudogene; amazing complexity of the gene family in mouse.
In an experiment designed to find sequences common to a skeletal muscle cDNA library and an X chromosome-specific library, we have isolated cDNA clones corresponding to glyceraldehyde-3-phosphate dehydrogenase (GAPD), whose gene is assigned to chromosome 12, and a DNA fragment from the short arm of the X chromosome that contains an intron-less GAPD pseudogene. A 1210-bp cDNA sequence has been established which covers all of the protein-coding region, most of the 5' non-coding region and part of the 3' non-coding region. It corresponds to the major (and possibly unique) GAPD mRNA present in skeletal muscle. Unexpectedly, the amino acid sequence derived from the cDNA clones differs at 10% of the residues from that established for the human protein purified from skeletal muscle. The X-linked pseudogene has been localised in the p22-p11 region of the human X chromosome. It has the structure of a complete retrotranscript of a processed mRNA, including the poly(A) tail, and is 96% homologous to the cDNA sequence. The pseudogene is flanked by a 15-bp direct repeat, and an Alu-like sequence is found in the 3'-flanking region. About 25 GAPD sequences are found in the human genome, 12 of which have high homology to the cDNA probe. A similar complexity is found in hamster. In contrast, the mouse genome contains an amazing number of GAPD-related fragments (at least 200). The hybridization pattern suggests that this multiplicity has been generated by two successive mechanisms: first the generation of approximately 40 different sequences, then their amplification (probably by tandem duplication).