Structural analysis of Arabidopsis thaliana chromosome 5. VI. Sequence features of the regions of 1,367,185 bp covered by 19 physically assigned P1 and TAC clones.
Nineteen P1 and TAC clones that had been mapped on the fine physical map of Arabidopsis thaliana chromosome 5 were sequenced using a shotgun-based strategy, and their structural features were analysed. The regions sequenced in this study total 1,367,185 bp. Combined with the regions covered by the 90 P1 and TAC clones reported previously, the total length of chromosome 5 sequenced to date is 8,058,855 bp. On the basis of similarity searches against protein and EST databases and gene modeling with computer programs, a total of 330 potential protein-coding regions were identified, giving an average gene density of approximately one gene per 4.1 kb. Introns were identified in 81.0% of the potential protein-coding genes for which the entire gene structure was predicted, with an average of 4.2 introns per gene and an average intron length of 180 bp. The RNA-coding genes identified were 9 tRNA genes corresponding to 8 amino acid species and 2 genes for U2 nuclear RNA. These sequence features are essentially identical to those of the previously reported sequences. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
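
The summary figures above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis; the variable and function names are hypothetical, and only the three input numbers are taken from the abstract. It recomputes the average gene density from the sequenced length and the gene count, and derives the length that the 90 previously reported clones would have covered (a value implied by, but not stated in, the abstract).

```python
# Figures quoted in the abstract (the only inputs taken from the source).
REGION_BP = 1_367_185      # length sequenced in this study
CUMULATIVE_BP = 8_058_855  # chromosome 5 sequenced to date
GENE_COUNT = 330           # potential protein-coding regions identified


def gene_density_kb(total_bp: int, genes: int) -> float:
    """Return the average number of kilobases per gene."""
    return total_bp / genes / 1000


# Average spacing of the predicted genes in this study's regions.
density_kb = gene_density_kb(REGION_BP, GENE_COUNT)

# Derived value: length attributable to the previously reported clones.
previous_bp = CUMULATIVE_BP - REGION_BP

print(f"Average density: one gene per {density_kb:.1f} kb")
print(f"Previously reported regions: {previous_bp:,} bp")
```

Under these inputs the density rounds to the 4.1 kb per gene reported above, which suggests the quoted figure is the sequenced length divided by the number of predicted protein-coding regions.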