Prediction of the coding sequences of unidentified human genes. II. The coding sequences of 40 new genes (KIAA0041-KIAA0080) deduced by analysis of cDNA clones from human cell line KG-1.
Using our previously established protocol, we isolated and sequenced full-length cDNA clones longer than 2 kb from a cDNA library of the human immature myeloid cell line KG-1 and predicted the coding sequences of 40 new genes. A computer search indicated that 29 of the genes contained sequences similar to genes reported in the GenBank/EMBL databases. Significant transmembrane domains were identified in 9 genes, 5 of which harbored multiple hydrophobic regions. Protein motifs matching those in the PROSITE motif database were identified in 13 genes. Based on sequence similarities and protein motifs, 5 genes were related to transcription factors. Repetitive sequences were found in the 3'-untranslated regions of 8 genes. Northern hybridization demonstrated that the expression of 9 genes was tissue-specific, while the remaining 31 genes were expressed ubiquitously. In addition, 17 genes yielded bands of different sizes, possibly due to alternative splicing or alternative initiation. The chromosomal locations of these genes have been determined.