Funding: This work was supported by grants to Min Qiu and Ming Wang from the National Natural Science Foundation of China (32100160 and 32100044), a grant to Ming Wang from the Jiangsu "Innovative and Entrepreneurial Talent" Program, China (JSSCRC2021510), and a grant to Yuanchao Wang from the Chinese Modern Agricultural Industry Technology System (CARS-004-PS14).
Abstract: The accuracy of genomic annotation is crucial for subsequent functional investigations; however, computational protocols used in high-throughput annotation of open reading frames (ORFs) can introduce inconsistencies. These inconsistencies, which lead to non-uniform extension or truncation of sequence ends, pose challenges for downstream analyses. Existing strategies to rectify them are time-consuming and labor-intensive, and dedicated approaches are lacking. To address this gap, we developed toGC, a tool that integrates genomic annotation with RNA-seq datasets to rectify annotation inconsistencies. Using toGC, we achieved nearly 100% accuracy in correcting inconsistencies in published Phytophthora sojae ORFs. We applied this pipeline to the GPCR-bigrams gene family, which was predicted to have 42 members in the P. sojae genome but lacked experimental validation. With toGC, we identified 32 GPCR-bigram ORFs that differed between previous annotations and toGC-corrected sequences. Notably, five of these genes (GPCR-TKL9, GPCR-TKL15, GPCR-PDE3, GPCR-AC3, and GPCR-AC4) showed substantial inconsistencies. Experimental gene annotation confirmed the effectiveness of toGC, as sequences obtained through cloning matched those annotated by toGC. Importantly, we discovered two novel GPCRs (GPCR-AC3 and GPCR-AC4) that were previously mispredicted as a single gene. CRISPR/Cas9-mediated knockout experiments revealed the involvement of GPCR-AC4, but not GPCR-AC3, in oospore production, further confirming their status as two separate genes. Beyond P. sojae, toGC also performed reliably in Phytophthora capsici and Pythium ultimum, underscoring the robustness of the pipeline. Our findings highlight the utility of toGC for reliable gene model correction, facilitating investigations into biological functions and offering potential applications in analyses of diverse species.