Prediction of the coding sequences of unidentified human genes. IV. The coding sequences of 40 new genes (KIAA0121-KIAA0160) deduced by analysis of cDNA clones from human cell line KG-1.
As part of our ongoing project to accumulate sequence information on unidentified human genes, we determined the sequences of 40 full-length cDNA clones from the human cell line KG-1 and predicted the coding sequences of the corresponding genes, designated KIAA0121 to KIAA0160. A computer search of public databases indicated that the sequences of 13 genes were unrelated to any reported genes, while the remaining 27 genes carried sequences showing some similarity to known genes. Several distinctive sequence features were noted. A stretch of triplet repeats was found in each of three genes: GAG (Glu) in KIAA0122 and KIAA0147, and TCC (Ser) in KIAA0150. A stretch of 10 amino acid residues was repeated 21 times in KIAA0139, and a homologous sequence of 76-78 nucleotides was repeated 6 times in the untranslated region of KIAA0125. Northern hybridization analysis demonstrated that 13 genes were expressed in a cell- or tissue-specific manner. Although a vast number of expressed sequence tags (ESTs) have been registered for the comprehensive analysis of cDNA clones, our sequence data indicated that their distribution across genes is highly uneven: for example, 7 genes were hit by no EST, whereas 85 ESTs fell within a single gene.
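The triplet-repeat stretches mentioned above (for example GAG encoding Glu, or TCC encoding Ser) can be located with a simple in-frame scan. The following is a minimal sketch only, not the procedure used in this work: the function name `codon_runs`, the `min_repeats` threshold of 5, and the toy sequence in the usage note are all assumptions introduced for illustration, and no KIAA sequence data are reproduced here.

```python
from itertools import groupby


def codon_runs(cds, min_repeats=5):
    """Find runs of one codon repeated in tandem within a coding sequence.

    Hypothetical helper for illustration. `cds` is read in frame from its
    first nucleotide; trailing bases that do not fill a codon are ignored.
    Returns a list of (codon_index, codon, repeat_count) tuples.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    runs = []
    index = 0  # position of the current run, counted in codons
    for codon, group in groupby(codons):
        count = len(list(group))
        if count >= min_repeats:  # threshold is an assumed cut-off
            runs.append((index, codon, count))
        index += count
    return runs


# Toy sequence (not real data): start codon, six GAG, five TCC, stop codon.
toy_cds = "ATG" + "GAG" * 6 + "TCC" * 5 + "TAA"
print(codon_runs(toy_cds))
```

Scanning codon by codon keeps the reported unit in the reading frame, so a run is labeled by the amino acid it encodes; a frame-independent nucleotide search would instead report whichever rotation of the unit it meets first.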