Abstract
The definitions of the single-base information redundancy D1(l), where l is the site, and of the adjacent-base information redundancy D2(l) are revised. For a large number of known genes of Escherichia coli, yeast, and Drosophila melanogaster, D1(l) and D2(l) are calculated at every site l (l = -30, -29, …, +32, +33) of the sequences spanning 30 bases upstream and 30 bases downstream of the start codon and of the stop codon. The results show that adjacent base pairs carry more information than single bases. In yeast and Drosophila genes, D1(-3) and D2(-3) show a distinct peak at site -3 upstream of the start codon. In E. coli genes, D1(l) and D2(l) show distinct peaks in the SD region upstream of the start codon, in agreement with results reported by others. In yeast genes, D2(l) for the adjacent bases at sites +4 and +5 downstream of the start codon also shows a peak; the associated dinucleotide pattern is TC, with a joint probability of 0.211. These findings indicate that the revised information redundancies are a feasible means of identifying conserved sites in DNA sequences.
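The abstract does not state the revised formulas for D1(l) and D2(l), so the following is only a minimal sketch of the general idea, assuming the standard Shannon form: redundancy equals the maximum entropy (2 bits for one base, 4 bits for a base pair) minus the observed entropy at that site. The function names, the 0-based column index `pos`, and the toy sequences are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def _entropy(counts):
    """Shannon entropy in bits of an empirical distribution given as counts."""
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def d1(seqs, pos):
    """Assumed single-base redundancy at column pos: log2(4) - H(base)."""
    return 2.0 - _entropy(Counter(s[pos] for s in seqs))


def d2(seqs, pos):
    """Assumed adjacent-base redundancy for columns pos and pos+1: log2(16) - H(pair)."""
    return 4.0 - _entropy(Counter(s[pos:pos + 2] for s in seqs))


def top_pair(seqs, pos):
    """Most frequent dinucleotide at columns pos, pos+1 and its joint probability."""
    pair, count = Counter(s[pos:pos + 2] for s in seqs).most_common(1)[0]
    return pair, count / len(seqs)


# Toy aligned sequences (hypothetical data): start codon ATG followed by two bases.
toy = ["ATGTC", "ATGTC", "ATGAC", "ATGGA"]

for col in range(len(toy[0]) - 1):
    print(col, round(d1(toy, col), 3), round(d2(toy, col), 3), top_pair(toy, col))
```

On real data, the aligned windows around each start or stop codon would replace `toy`, and peaks in the two profiles would mark candidate conserved sites. A finite-sample correction, which a revised definition may well include, is deliberately left out here.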
Source
《生物物理学报》
CAS
CSCD
PKU Core Journal (北大核心)
2002, Issue 1, pp. 71-75 (5 pages)
Acta Biophysica Sinica
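Funding
Supported by the National Natural Science Foundation of China (10147204)
Natural Science Foundation of Inner Mongolia Autonomous Region (20001301)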