Funding: This project was supported by the National High-Tech Research and Development Plan (863-804-3).
Abstract: We propose a data processing and analysis system for inertial confinement fusion (ICF) experiments built on an object-oriented framework. The design and a preliminary implementation of the framework, based on the ROOT system, have been completed. Software for unfolding soft X-ray spectra has been developed to test the functions of the framework.
Funding: Supported in part by the National Key Research and Development Program of China (2022YFD2300700), the Fundamental Research Funds for the Central Universities (YDZX2025021, KYT2024005, QTPY2025006), the Jiangsu Province Key Research and Development Program (BE2023369), the Natural Science Foundation of Jiangsu Province (BK20231469), the Hainan Yazhou Bay Seed Laboratory (B21H J1005), the National Natural Science Foundation of China (32201656), the Sichuan Provincial Finance Department Project of China (1+3 ZYGG001), the JBGS Project of Seed Industry Revitalization in Jiangsu Province (JBGS[2021]007), the Young Elite Scientists Sponsorship Program by CAST (YESS), the Science and Technology Innovation 2030-Major Project (2023ZD04034, 2023ZD0405605), the Zhongshan Biological Breeding Laboratory (ZSBBL-KY2023-03), and the Jiangsu Provincial Special Fund for Basic Research (Major Innovation Platform Plan) (BM2024005).
Abstract: Genomic selection (GS) and phenotypic selection (PS) are widely used to accelerate plant breeding. However, the accuracy, robustness, and transferability of these two selection methods are underexplored, especially for complex traits. In this study, we introduce a novel data fusion framework, GPS (genomic and phenotypic selection), designed to enhance predictive performance by integrating genomic and phenotypic data through three distinct fusion strategies: data fusion, feature fusion, and result fusion. The GPS framework was tested with an extensive suite of models, including statistical approaches (GBLUP and BayesB), machine learning models (Lasso, RF, SVM, XGBoost, and LightGBM), a deep learning method (DNNGP), and a recent phenotype-assisted prediction model (MAK). These models were applied to large datasets from four crop species (maize, soybean, rice, and wheat), demonstrating the versatility and robustness of the framework. Our results indicated that: (1) Data fusion achieved the highest accuracy among the three fusion strategies. The top-performing data fusion model (Lasso_D) improved selection accuracy by 53.4% compared to the best GS model (LightGBM) and by 18.7% compared to the best PS model (Lasso). (2) Lasso_D was robust, achieving high predictive accuracy even with a sample size as small as 200 and showing resilience to variations in single-nucleotide polymorphism (SNP) density. Moreover, the model's accuracy improved with the number of auxiliary traits and with the strength of their correlation with the target traits, further highlighting its adaptability to complex trait prediction. (3) Lasso_D demonstrated broad transferability, with substantial improvements in predictive accuracy when incorporating multi-environmental data. With this enhancement, accuracy was only 0.3% lower than that of predictions generated using data from the same environment, affirming the model's reliability in cross-environmental scenarios. This study provides groundbreaking insights, pushing the boundaries of predictive accuracy, robustness, and transferability in trait prediction. These findings represent a significant contribution to plant science, plant breeding, and the broader interdisciplinary fields of statistics and artificial intelligence.
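The abstract names the fusion strategies without defining them. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes "data fusion" means concatenating SNP markers and auxiliary phenotypic traits into a single input matrix, and "result fusion" means averaging the predictions of separately trained genomic and phenotypic models. Feature fusion is omitted because the abstract gives no detail on it. The data are synthetic, the Lasso solver is a small hand-written coordinate descent with no intercept or standardization, and all names (`fit_lasso`, `fit_and_predict`, `G`, `A`, `F`) are hypothetical.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the Lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(X, y, alpha=0.01, n_iter=100):
    """Cyclic coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def predict(X, w):
    return [sum(x * c for x, c in zip(row, w)) for row in X]


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic, illustrative data only: genotypes coded -1/0/1 and two auxiliary traits.
rng = random.Random(0)
n_samples, n_snps, n_aux, split = 120, 10, 2, 80
G = [[rng.choice((-1.0, 0.0, 1.0)) for _ in range(n_snps)] for _ in range(n_samples)]
A = [[rng.gauss(0.0, 1.0) for _ in range(n_aux)] for _ in range(n_samples)]
# The target depends on two markers and one auxiliary trait, plus small noise.
y = [0.8 * g[0] - 0.5 * g[3] + 1.0 * a[0] + rng.gauss(0.0, 0.1) for g, a in zip(G, A)]

# Assumed "data fusion": concatenate both feature blocks for each sample.
F = [g + a for g, a in zip(G, A)]


def fit_and_predict(X):
    """Fit on the first `split` samples and predict the held-out remainder."""
    w = fit_lasso(X[:split], y[:split])
    return predict(X[split:], w)


y_test = y[split:]
pred_genomic = fit_and_predict(G)
pred_phenotypic = fit_and_predict(A)
pred_fused = fit_and_predict(F)
# Assumed "result fusion": average the two single-source predictions.
pred_result_fusion = [(a + b) / 2 for a, b in zip(pred_genomic, pred_phenotypic)]

mse_genomic = mse(y_test, pred_genomic)
mse_phenotypic = mse(y_test, pred_phenotypic)
mse_fused = mse(y_test, pred_fused)
mse_result_fusion = mse(y_test, pred_result_fusion)

print("genomic only:   ", round(mse_genomic, 3))
print("phenotypic only:", round(mse_phenotypic, 3))
print("data fusion:    ", round(mse_fused, 3))
print("result fusion:  ", round(mse_result_fusion, 3))
```

In this toy setup the fused model sees every variable that drives the target, so its held-out error falls below that of either single-source model. This only illustrates the mechanics of concatenation versus prediction averaging and says nothing about the magnitudes reported in the abstract.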