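Abstract
A total of 62 592 castor bean EST sequences from the NCBI public database were assembled, yielding 11 708 non-redundant unigenes with a total length of 921 kb. Among the non-redundant sequences, 2 471 ESTs contained SSRs, with 3 271 loci in total. The SSR frequency was 27.94%, and the average distance between SSRs was 2.81 kb. Among repeat motifs of 1 to 6 bp, mononucleotide repeats were the most frequent (37.51%), followed by trinucleotide repeats (34.63%) and dinucleotide repeats (25.61%). The most common motif was A/T (36.32%), followed by AG/CT (18.28%). Castor bean EST-SSRs occur at a relatively high frequency, are diverse in type and have relatively high polymorphism potential, and are therefore of considerable practical value. Functional annotation of the castor bean EST-SSRs (COG, SwissProt, KEGG) assigned 9 sequences to genes related to fatty acid synthesis and metabolism, which provides an important basis for the targeted development of EST-SSR markers and lays a foundation for the further development of castor bean EST-SSR markers.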
A total of 62 592 castor bean ESTs were downloaded from the NCBI database and analyzed. After preprocessing, 11 708 non-redundant ESTs with a total length of about 921 kb were obtained. A total of 3 271 SSRs were detected in 2 471 ESTs, which account for 21.10% of the non-redundant ESTs. The average distance between EST-SSRs was about 2.81 kb. Dinucleotide and trinucleotide repeats were among the main repeat types, accounting for 25.61% and 34.63% of all SSRs, respectively. A/T (36.32%) and AG/CT (18.28%) were the most frequent motifs. Of the SSR-containing ESTs, 575 (23.3%), 1 166 (47.2%) and 1 817 (73.5%) matched genes of known function in COG, SwissProt and KEGG, respectively. These EST-SSRs will help to develop highly polymorphic SSR markers for castor bean.
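The SSR search summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the tool or the minimum repeat counts used, so the thresholds in `MIN_REPEATS` and the function name `find_ssrs` are hypothetical and chosen only for illustration. The sketch scans a sequence for perfect tandem repeats of 1 to 6 bp motifs.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp).
# The abstract does not state the thresholds actually used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each locus is reported once.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


example = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC"
print(find_ssrs(example))  # [(3, 'A', 12), (18, 'AG', 7)]
```

Counting such hits over all unigenes and dividing by the number of unigenes gives an SSR frequency of the kind reported above (3 271 loci over 11 708 unigenes, about 27.94%).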
Source
《热带作物学报》
CSCD
Peking University Core Journal
2012, Issue 12, pp. 2138-2143 (6 pages)
Chinese Journal of Tropical Crops
Funding
Collection of and Joint Research on Tropical Crop Germplasm from Central and South America (No. 2011DFB31690)