Abstract
The sequence of the rice genome holds fundamental information for its biology, including physiology, genetics, development, and evolution, as well as information on many beneficial phenotypes of economic significance. Using a 'whole genome shotgun' approach, we have produced a draft rice genome sequence of Oryza sativa ssp. indica, the major crop rice subspecies in China and many other regions of Asia. The draft genome sequence is constructed from over 4.3 million successful sequencing traces with a cumulative total length of 2214.9 Mb. The initial assembly of the non-redundant sequences reached 409.76 Mb in length, based on 3.30 million successful sequencing traces with a total length of 1797.4 Mb from an indica variant cultivar 93-11, giving an estimated coverage of 95.29% of the rice genome with an average base accuracy higher than 99%. The coverage of the draft sequence, the randomness of the sequence distribution, and the consistency of BIG-ASSEMBLER, a custom-designed software package used for the initial assembly, were verified rigorously by comparisons against finished BAC clone sequences from both indica and japonica strains, available from the public databases. Overall, 96.3% of full-length cDNAs, 96.4% of STS, STR, and RFLP markers, 94.0% of ESTs, and 94.9% of unigene clusters were identified from the draft sequence. Our preliminary analysis of the data set shows that our rice draft sequence is consistent with the common standard accepted by the genome sequencing community. The unconditional release of the draft to the public also provides a fundamental resource for the international scientific communities to facilitate genomic and genetic studies on rice biology.
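The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' method: it assumes that the stated 95.29% coverage is simply the assembly length divided by the estimated genome size, and the function names are hypothetical helpers introduced only for illustration.

```python
# Back-of-the-envelope checks on the abstract's numbers.
# Assumption (not stated in the abstract): coverage = assembly length / genome size.

def implied_genome_size_mb(assembly_mb: float, coverage_fraction: float) -> float:
    """Genome size implied by an assembly length and a coverage fraction."""
    return assembly_mb / coverage_fraction

def mean_trace_length_bp(total_trace_mb: float, n_traces: float) -> float:
    """Average length of one sequencing trace, in base pairs."""
    return total_trace_mb * 1e6 / n_traces

def raw_sequencing_depth(total_trace_mb: float, genome_mb: float) -> float:
    """Fold redundancy: total trace length divided by genome size."""
    return total_trace_mb / genome_mb

if __name__ == "__main__":
    # Values taken from the abstract (cultivar 93-11 assembly).
    genome_mb = implied_genome_size_mb(409.76, 0.9529)
    print(f"implied genome size: {genome_mb:.1f} Mb")
    print(f"mean trace length:   {mean_trace_length_bp(1797.4, 3.30e6):.0f} bp")
    print(f"raw depth:           {raw_sequencing_depth(1797.4, genome_mb):.2f}x")
```

Under that assumption, the numbers imply a genome size of roughly 430 Mb, an average trace length of about 545 bp, and a little over fourfold raw redundancy.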
Authors
YU Jun, HU Songnian, WANG Jun, LI Songgang, WONG Ka-Shu Gane, LIU Bin, DENG Yajun, DAI Li, ZHOU Yan, ZHANG Xiuqing, CAO Mengliang, LIU Jing, SUN Jiandong, TANG Jiabin, CHEN Yanjiong, HUANG Xiaobing, LIN Wei, YE Chen, TONG Wei, CONG Lijuan, GENG Jianing, HAN Yujun, LI Lin, LI Wei, HU Guangqiang, HUANG Xiangang, LI Wenjie, LI Jian, LIU Zhanwei, LI Long, LIU Jianping, QI Qiuhui, LIU Jinsong, LI Li, WANG Xuegang, LU Hong, WU Tingling, ZHU Miao, NI Peixiang, HAN Hua, DONG Wei, REN Xiaoyu, FENG Xiaoli, GUI Peng, LI Xianran, WANG Hao, XU Xin, ZHAI Wenxue, XU Zhao, ZHANG Jinsong, HE Sijie, ZHANG Jianguo, XU Jichen, ZHANG Kunlin, ZHENG Xianwu, DONG Jianhai, ZENG Wanyong, TAO Lin, CHEN Xuewei, HE Jun, LIU Daofeng, TIAN Wei, TIAN Chaoguang, XIA Hongai, LI Gang, GAO Hui, LI Ping, CHEN Wei, WANG Xudong, ZHANG Yong, HU Jianfei, WANG Jing, LIU Song, YANG Jian, ZHANG Guangyu, XIONG Yuqing, LI Zhijie, MAO Long, ZHOU Chengshu, ZHU Zhen, CHEN Runsheng, HAO Bailin, ZHENG Weimou, CHEN Shouyi, GUO Wei, LI Guojie, LIU Siqi, HUANG Guyang, TAO Ming, WANG Jian, ZHU Lihuang, YUAN Longping & YANG Huanming
Beijing Genomics Institute/Center of Genomics & Bioinformatics, Chinese Academy of Sciences, Beijing 101300, China
Hangzhou Genomics Institute/Institute of Bioinformatics of Zhejiang University/Key Laboratory of Bioinformatics of Zhejiang Province, Hangzhou 310007, China
Institute of Genetics, Chinese Academy of Sciences, Beijing 100101, China
National Hybrid Rice R & D Center, Changsha 410125, China
Laboratory of Bioinformatics, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
College of Life Sciences, Peking University, Beijing 100871, China
Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100080, China
Digital China Ltd., Beijing 100080, China
Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
Medical College, Xi’an Jiaotong University, Xi’an 710061, China
These authors contributed equally to this work. Corresponding author. Corresponden
Funding
This work was sponsored by the Chinese Academy of Sciences, the Commission for Economy Planning, the Ministry of Science and Technology, the National Natural Science Foundation of China, Beijing Municipal Government, Zhejiang Provincial Government, and H