Abstract
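The informational correlation and 16 kinds of base correlation of nucleotide sequences are defined, and their fluctuation bounds are derived. The nucleotide correlation properties of coding sequences, and of whole sequences containing both coding and noncoding regions, are studied in terms of correlation length and correlation strength, and the periodicity of the correlations is studied by fast Fourier analysis. The results show that the main maximum of the informational correlation exceeds the fluctuation bound by a factor of 7 to 10 for coding sequences (higher organisms) and 15 to 19 for whole sequences. The strength of the main maximum shows some correlation with evolution, being larger in prokaryotes than in eukaryotes. For more than 80% of the sequences the main maximum lies in the nearest-neighbor or next-nearest-neighbor range. When the α-cut is taken to be 1.5 (3), coding sequences with correlation length ≤2 make up about 25% to 50% (80%), and whole sequences about 10% (60%). About 10% to 20% of the sequences show a clear period of 3 in the informational correlation spectrum, and some long periods greater than 100 are also found. The paper also gives the positions and strengths of the main maxima of the 16 base correlations and the 3-periodicity in their correlation spectra; for example, the CG mode has the largest correlation strength and occurs at neighboring sites.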
The informational correlation $\{D_k\}$ and the base correlation $\{F^{(k)}_{ij}\}$, with $F^{(k)}_{ij} = (p^{(k)}_{ij} - p_i p_j)^2$, are defined, and the fluctuation bounds for these correlations are deduced. The correlation properties of nucleotides in coding sequences and in the corresponding whole sequences are studied in terms of both correlation length and correlation strength. The results show that the main maximum of the informational correlation exceeds the fluctuation bound by a factor of 7 to 9 for coding sequences and 15 for whole sequences. These main maxima show an evolutionary correlation across species (for example, the main maxima in eukaryotes are larger than in prokaryotes). For 80% or more of the sequences the main maxima are located at neighboring sites. When the α cut is taken to be 1.5 or 3, the proportion of coding sequences with correlation length ≤2 is about 25% to 50% or 80%, respectively. For whole sequences the corresponding proportions are about 10% and 60%. [...] in both k≤5 and k>5. The CG mode generally has the maximal correlation strength, and it occurs at neighboring sites (k=1). The periodicity of the informational correlation and the base correlation is investigated by the fast Fourier transform method. The results show that a periodicity of 3 exists for about 10% to 20% of the sequences. Long periodicities (for example, larger than 100) have also been discovered.
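As an illustration of the quantities named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It computes $F^{(k)}_{ij} = (p^{(k)}_{ij} - p_i p_j)^2$ exactly as written above. The abstract does not state the formula for $D_k$, so the sketch assumes a mutual-information form, $D_k = \sum_{ij} p^{(k)}_{ij} \log_2 \left( p^{(k)}_{ij} / (p_i p_j) \right)$. The function names `base_correlation` and `dominant_period` are hypothetical, and the fluctuation bounds are not reproduced because the abstract does not give them.

```python
import numpy as np

BASES = "ACGT"  # index order for i and j; other symbols raise ValueError


def base_correlation(seq, k):
    """Return (F, D) at distance k >= 1.

    F is the 4x4 array (p_ij^(k) - p_i * p_j)**2 from the abstract.
    D is a mutual-information-style sum in bits; this form of D_k is an
    ASSUMPTION, since the abstract does not give its definition.
    """
    idx = np.array([BASES.index(b) for b in seq])
    n = len(idx)
    p = np.bincount(idx, minlength=4) / n          # single-base frequencies p_i
    joint = np.zeros((4, 4))
    np.add.at(joint, (idx[:-k], idx[k:]), 1)       # count pairs (i at t, j at t+k)
    joint /= (n - k)                               # joint frequencies p_ij^(k)
    indep = np.outer(p, p)                         # p_i * p_j
    F = (joint - indep) ** 2
    mask = joint > 0                               # skip empty cells (0 * log 0 = 0)
    D = float(np.sum(joint[mask] * np.log2(joint[mask] / indep[mask])))
    return F, D


def dominant_period(values):
    """Period of the strongest nonzero-frequency FFT component of a series."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()                               # remove the zero-frequency term
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))                # cycles per unit of k
    peak = 1 + int(np.argmax(power[1:]))           # ignore index 0
    return 1.0 / freqs[peak]
```

For a toy sequence such as `"ACG" * 200`, the series of `F[0, 0]` (the A-A correlation) over `k = 1, ..., 60` repeats every three sites, so `dominant_period` applied to that series recovers a period of 3. Real sequences would need the fluctuation bound from the paper to decide which peaks are significant.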
Source
《内蒙古大学学报(自然科学版)》
CAS
CSCD
1997, No. 2, pp. 169-176 (8 pages)
Journal of Inner Mongolia University:Natural Science Edition
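Funding
National Natural Science Foundation of China
Keywords
nucleotide correlation
coding sequence
DNA
base correlation
informational correlation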
nucleotide correlation; coding sequence; noncoding sequence; correlation length; correlation strength