Abstract
The Eukaryotic Intron Database (EID) was built from GenBank sequence data. Statistics over EID show that it contains 103,848 genes, 478,484 introns and 582,332 exons, an average of 4.61 introns and 5.61 exons per gene; introns 40-120 nt long are the most abundant. Analysis of data from nine model organisms (human, rat, mouse, chicken, fruit fly, nematode, Arabidopsis, maize and fission yeast) shows that, among eukaryotes, more advanced organisms do not necessarily have more introns or exons per gene. In addition, the relationship between genome size and intron proportion or intron density, intron phase, and intron splice-site features were studied statistically for each model organism.
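The per-gene averages quoted in the abstract can be rechecked from the three reported totals. The following is a minimal sketch, not the authors' code; the variable names are illustrative, and the consistency check relies on the fact that a gene with n introns has n + 1 exons.

```python
# Totals reported in the abstract for EID.
genes = 103_848
introns = 478_484
exons = 582_332

# Average number of introns and exons per gene.
introns_per_gene = introns / genes
exons_per_gene = exons / genes

print(f"introns per gene: {introns_per_gene:.2f}")
print(f"exons per gene:   {exons_per_gene:.2f}")

# A gene with n introns has n + 1 exons, so the exon total
# should exceed the intron total by exactly the gene count.
totals_consistent = (exons - introns == genes)
print(f"totals consistent: {totals_consistent}")
```

With the reported totals, both averages round to the values given in the abstract and the three counts are mutually consistent.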
Source
Journal of Sun Yat-sen University (Natural Science Edition)
CAS
CSCD
PKU Core (Peking University Core Journals list)
2005, Issue 6, pp. 79-82 (4 pages)
Acta Scientiarum Naturalium Universitatis Sunyatseni
Funding
National Natural Science Foundation of China (30270752)
Natural Science Foundation of Guangdong Province (031616)
Keywords
eukaryote
intron
database
characteristic statistics