Populus species, important economic species combining rapid growth with broad ecological adaptability, play a critical role in sustainable forestry and bioenergy production. In this study, we performed whole-genome resequencing of 707 individuals from a full-sib family to develop comprehensive single nucleotide polymorphism (SNP) markers and constructed a high-density genetic linkage map of 19 linkage groups. The total genetic length of the map reached 3623.65 cM with an average marker interval of 0.34 cM. By integrating multidimensional phenotypic data, 89 quantitative trait loci (QTL) associated with growth, wood physical and chemical properties, disease resistance, and leaf morphology traits were identified, with logarithm of odds (LOD) scores ranging from 3.13 to 21.72 and phenotypic variance explained between 1.7% and 11.6%. Notably, pleiotropic analysis revealed significant colocalization hotspots on chromosomes LG1, LG5, LG6, LG8, and LG14, with epistatic interaction network analysis confirming the genetic basis of coordinated regulation across multiple traits. Functional annotation of 207 candidate genes showed that R2R3-MYB and bHLH transcription factors and pyruvate kinase-encoding genes were significantly enriched, suggesting crucial roles in lignin biosynthesis and carbon metabolic pathways. Allelic effect analysis indicated that the frequency of favorable alleles associated with target traits ranged from 0.20 to 0.55. Incorporation of QTL-derived favorable alleles as random effects into Bayesian-based genomic selection models led to an increase in prediction accuracy ranging from 1% to 21%, with Bayesian ridge regression as the best predictive model. This study provides valuable genomic resources and genetic insights for deciphering complex trait architecture and advancing molecular breeding in poplar.
The principle of genomic selection (GS) entails estimating breeding values (BVs) by summing all the SNP polygenic effects. The visible/near-infrared spectroscopy (VIS/NIRS) wavelength and abundance values can directly reflect the concentrations of chemical substances, and the measurement of meat traits by VIS/NIRS is similar to the processing of genomic selection data in that it sums all "polygenic effects" associated with spectral feature peaks. Therefore, it is meaningful to investigate the incorporation of VIS/NIRS information into GS models to establish an efficient and low-cost breeding model. In this study, we measured 6 meat quality traits in 359 Duroc×Landrace×Yorkshire pigs from Guangxi Zhuang Autonomous Region, China, and genotyped them with high-density SNP chips. According to the completeness of the information for the target population, we proposed 4 breeding strategies applied to different scenarios: Ⅰ, only spectral and genotypic data exist for the target population; Ⅱ, only spectral data exist for the target population; Ⅲ, only spectral and genotypic data exist for the target population, but with a different prediction process; and Ⅳ, only spectral and phenotypic data exist for the target population. The 4 scenarios were used to evaluate how adding VIS/NIR spectral information affects the accuracy of the genomic estimated breeding value (GEBV). In the results of the 5-fold cross-validation, the genetic algorithm showed remarkable potential for preselection of feature wavelengths. The breeding efficiency of Strategies Ⅱ, Ⅲ, and Ⅳ was superior to that of traditional GS for most traits, and the GEBV prediction accuracy was improved on average by 32.2%, 40.8%, and 15.5%, respectively. Among them, the prediction accuracy of Strategy Ⅱ for fat (%) improved by as much as 50.7% compared to traditional GS. The GEBV prediction accuracy of Strategy Ⅰ was nearly identical to that of traditional GS, and the fluctuation range was less than 7%. Moreover, the breeding cost of the 4 strategies was lower than that of traditional GS methods, with Strategy Ⅳ being the lowest as it did not require genotyping. Our findings demonstrate that GS methods based on VIS/NIRS data have significant predictive potential and are worthy of further research to provide a valuable reference for the development of effective and affordable breeding strategies.
With marker and phenotype information from observed populations, genomic selection (GS) can be used to establish associations between markers and phenotypes. It aims to use genome-wide markers to estimate the effects of all loci and thereby predict the genetic values of untested populations, so as to achieve more comprehensive and reliable selection and to accelerate genetic progress in crop breeding. GS models usually face the problem that the number of markers is much higher than the number of phenotypic observations. To overcome this issue and improve prediction accuracy, many models and algorithms, including GBLUP, Bayes, and machine learning, have been employed for GS. As hot issues in GS research, the estimation of non-additive genetic effects and the combined analysis of multiple traits or multiple environments are also important for improving the accuracy of prediction. In recent years, crop breeding has taken advantage of the development of GS. The principles and characteristics of current popular GS methods and research progress in these methods for crop improvement are reviewed in this paper.
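The "more markers than observations" problem mentioned in this abstract can be illustrated with a small ridge-regression sketch. This is only a toy illustration, not the method of any reviewed paper: the marker matrix `Z`, the phenotypes `y`, the penalty `lam`, and all function names are made up for the example. It relies on the standard identity that ridge marker effects can be obtained from an n × n system (individuals) instead of a p × p system (markers).

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ridge_marker_effects(z, y, lam):
    """Ridge estimates of marker effects: beta = Z' (Z Z' + lam I)^-1 y.

    Only an n x n system is solved, which stays small when the number of
    markers p is much larger than the number of individuals n.
    """
    n, p = len(z), len(z[0])
    k = [[sum(z[i][s] * z[j][s] for s in range(p)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(k, y)
    return [sum(z[i][s] * alpha[i] for i in range(n)) for s in range(p)]


def predict(z_new, beta):
    """Predicted genetic value = sum of marker codes times marker effects."""
    return [sum(g * b for g, b in zip(row, beta)) for row in z_new]


# Hypothetical toy data: 3 individuals, 5 markers coded -1/0/1.
Z = [[1, 0, -1, 1, 0],
     [0, 1, 1, -1, 0],
     [-1, -1, 0, 0, 1]]
y = [1.2, -0.4, -0.8]  # phenotypes, already centred on zero
beta = ridge_marker_effects(Z, y, lam=1.0)
gebv = predict(Z, beta)
```

The penalty `lam` is what makes the problem solvable with fewer individuals than markers; in practice it would be chosen from variance components or by cross-validation, which this sketch does not attempt.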
Rice (Oryza sativa) provides a staple food source for more than half the world population. However, the current pace of yield improvement in rice breeding is insufficient to meet the food demand of the ever-increasing global population. Genomic selection (GS) holds great potential to accelerate breeding progress and is cost-effective via early selection before phenotypes are measured. Previous simulation and experimental studies have demonstrated the usefulness of GS in rice breeding. However, several affecting factors and limitations require careful consideration when performing GS. In this review, we summarize the major genetic and statistical factors affecting predictive performance as well as current progress in the application of GS to rice breeding. We also highlight effective strategies to increase the predictive ability of various models, including GS models incorporating functional markers, genotype by environment interactions, multiple traits, selection index, and multiple omic data. Finally, we envision that integrating GS with other advanced breeding technologies such as unmanned aerial vehicles and open-source breeding platforms will further improve the efficiency and reduce the cost of breeding.
In wheat breeding, it is a difficult task to select the most suitable parents for making crosses aimed at the improvement of both grain yield and grain quality. By quantitative genetics theory, the best cross should have high progeny mean and large genetic variance, and ideally yield and quality should be less negatively or positively correlated. Usefulness is built on population mean and genetic variance, which can be used to select the best crosses or populations to achieve the breeding objective. In this study, we first compared five models (RR-BLUP, Bayes A, Bayes B, Bayes ridge regression, and Bayes LASSO) for genomic selection (GS) with respect to prediction of usefulness of a biparental cross and two criteria for parental selection, using simulation. The two parental selection criteria were usefulness and midparent genomic estimated breeding value (GEBV). Marginal differences were observed among GS models. Parental selection with usefulness resulted in higher genetic gain than midparent GEBV. In a population of 57 wheat fixed lines genotyped with 7588 selected markers, usefulness of each biparental cross was calculated to evaluate the cross performance, a key target of breeding programs aimed at developing pure lines. It was observed that progeny mean was a major determinant of usefulness, but the usefulness ratings of quality traits were more influenced by their genetic variances in the progeny population. Near-zero or positive correlations between yield and major quality traits were found in some crosses, although they were negatively correlated in the population of parents. A selection index incorporating yield, extensibility, and maximum resistance was formed as a new trait and its usefulness for selecting the crosses with the best potential to improve yield and quality simultaneously was calculated. It was shown that applying the selection index improved both yield and quality while retaining more genetic variance in the selected progenies than the individual trait selection. It was concluded that combining genomic selection with simulation allows the prediction of cross performance in simulated progenies and thereby identifies candidate parents before crosses are made in the field for pure-line breeding programs.
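The statement that usefulness is built on progeny mean and genetic variance can be made concrete with the commonly used definition U = μ + i·h·σ_G (progeny mean plus expected selection response). The abstract does not give its exact formula, so this form, the function name, and all numbers below are assumptions for illustration only.

```python
import math


def usefulness(progeny_mean, genetic_variance, heritability, intensity):
    """Usefulness of a cross: mean plus expected response to selection.

    U = mu + i * h * sigma_G, where h is the square root of heritability
    and sigma_G is the genetic standard deviation of the progeny.
    """
    h = math.sqrt(heritability)
    sigma_g = math.sqrt(genetic_variance)
    return progeny_mean + intensity * h * sigma_g


# Hypothetical crosses: A has the higher mean, B the larger variance.
# intensity = 1.755 is the approximate standardized selection intensity
# when the best 10% of progeny are selected.
u_a = usefulness(progeny_mean=10.0, genetic_variance=1.0,
                 heritability=1.0, intensity=1.755)
u_b = usefulness(progeny_mean=9.5, genetic_variance=4.0,
                 heritability=1.0, intensity=1.755)
best_cross = "B" if u_b > u_a else "A"
```

In this made-up case cross B ranks first despite its lower mean, which is the situation in which usefulness and midparent value can disagree as selection criteria.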
Single nucleotide polymorphism (SNP) arrays are a powerful genotyping tool used in genetic research and genomic breeding programs. Japanese flounder (Paralichthys olivaceus) is an economically important aquaculture flatfish in many countries. However, the lack of highly efficient genotyping tools has impeded the genomic breeding programs for Japanese flounder. We developed a 50K Japanese flounder SNP array, "Yuxin No.1," and report its utility in genomic selection (GS) for disease resistance to bacterial pathogens. We screened more than 42.2 million SNPs from the whole-genome resequencing data of 1099 individuals and selected 48697 SNPs that were evenly distributed across the genome to anchor the array with Affymetrix Axiom genotyping technology. Evaluation of the array performance with 168 fish showed that 74.7% of the loci were successfully genotyped with high call rates (>98%) and that the polymorphic SNPs had good cluster separations. More than 85% of the SNPs were concordant with SNPs obtained from the whole-genome resequencing data. To validate "Yuxin No.1" for GS, the arrayed genotyping data of 27 individuals from a candidate population and 931 individuals from a reference population were used to calculate the genomic estimated breeding values (GEBVs) for disease resistance to Edwardsiella tarda. There was a 21.2% relative increase in the accuracy of GEBV using the weighted genomic best linear unbiased prediction (wGBLUP), compared to traditional pedigree-based best linear unbiased prediction (ABLUP), suggesting good performance of the "Yuxin No.1" SNP array for GS. In summary, we developed the "Yuxin No.1" 50K SNP array, which provides a useful platform for high-quality genotyping that may be beneficial to the genomic selective breeding of Japanese flounder.
Genomic selection (GS) can be used to accelerate genetic improvement by shortening the selection interval. The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value (GEBV). This study is a first attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits. The performance of GS models in L. vannamei was evaluated in a population consisting of 205 individuals, which were genotyped for 6359 single nucleotide polymorphism (SNP) markers by specific length amplified fragment sequencing (SLAF-seq) and phenotyped for body length and body weight. Three GS models (RR-BLUP, Bayes A, and Bayesian LASSO) were used to obtain the GEBV, and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes. The mean reliability of the GEBVs for body length and body weight predicted by the different models was 0.296 and 0.411, respectively. For each trait, the performances of the three models were very similar to each other with respect to predictability. The regression coefficients estimated by the three models were close to one, suggesting near-zero bias for the predictions. Therefore, when GS was applied in a L. vannamei population for the studied scenarios, all three models appeared practicable. Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.
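The two evaluation ideas in this abstract, agreement between predictions and observations and a regression coefficient close to one as a sign of low bias, can be sketched as follows. The abstract does not spell out its exact formulas, so this sketch assumes the widely used convention of correlating GEBVs with phenotypes and regressing phenotypes on GEBVs; the data and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def regression_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical validation set: predicted GEBVs and observed phenotypes.
gebv = [0.0, 1.0, 2.0, 3.0]
phenotype = [0.5, 2.5, 4.5, 6.5]

predictive_ability = pearson(gebv, phenotype)
# A slope near one suggests unbiased scaling; here the slope is above one,
# meaning the made-up GEBVs are compressed relative to the phenotypes.
bias_slope = regression_slope(gebv, phenotype)
```

The example is deliberately extreme: the ranking is perfect, yet the slope shows that the scale of the predictions is off, which is why both quantities are usually reported together.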
Genomic selection (GS) has been widely used in livestock and has greatly accelerated the genetic progress of complex traits. Population size is one of the significant factors affecting prediction accuracy, but it is limited in purebred populations. Compared to directly combining two uncorrelated purebred populations to extend the reference population size, it might be more meaningful to incorporate correlated crossbreds into the reference population for genomic prediction. In this study, we simulated purebred offspring (PAS and PBS) and crossbred offspring (CAB) based on real genotype data of two base purebred populations (PA and PB) to evaluate the performance of genomic selection on purebreds while incorporating crossbred information. The results showed that selecting key crossbred individuals via maximizing the expected genetic relationship (REL) was better than the other methods (individuals closest or farthest to the purebred population, CP/FP) in terms of prediction accuracy. Furthermore, the prediction accuracy of reference populations combining PA and CAB was significantly better than that based on PA only, and was similar to that of combining PA and PAS. Moreover, the rank correlation between the multiple of the increased relationship (MIR) and reliability improvement was 0.60-0.70. But for individuals with low correlation (Cor(Pi, PA or B)), the reliability improvement was significantly lower than for other individuals. Our findings suggested that incorporating crossbreds into the purebred population could improve the performance of genetic prediction compared with using the purebred population only. The genetic relationship between the purebred and crossbred populations is a key factor determining the increase in reliability when incorporating a crossbred population in the genomic prediction of purebred individuals.
A biparental soybean population of 364 recombinant inbred lines (RILs) derived from Zhongdou 41×ZYD02.878 was used to identify quantitative trait loci (QTL) associated with hundred-seed weight (100-SW), pod length (PL), and pod width (PW). 100-SW, PL, and PW showed moderate correlations among one another, and 100-SW was correlated most strongly with PW (0.64–0.74). Respectively, 74, 70, 75, and 19 QTL accounting for 38.7%–78.8% of total phenotypic variance were identified by inclusive composite interval mapping, restricted two-stage multi-locus genome-wide association analysis, 3 variance-component multi-locus random-SNP-effect mixed linear model analysis, and conditional genome-wide association analysis. Of these QTL, 189 were novel, and 24 were detected by multiple methods. Six loci were associated with 100-SW, PL, and PW and may be pleiotropic loci. A total of 284 candidate genes were identified in colocalizing QTL regions, including the verified gene Seed thickness 1 (ST1). Eleven genes with functions involved in pectin biosynthesis, phytohormone, ubiquitin-protein, and photosynthesis pathways were prioritized by examining single nucleotide polymorphism (SNP) variation, calculating the genetic differentiation index, and examining gene expression. The prediction accuracies of genomic selection (GS) for 100-SW, PL, and PW based on single trait-associated markers reached 0.82, 0.76, and 0.86, respectively, but a selection index (SI)-assisted GS strategy did not increase GS efficiency, and inclusion of trait-associated markers as fixed effects reduced prediction accuracy. These results shed light on the genetic basis of 100-SW, PL, and PW and provide GS models for these traits with potential application in breeding programs.
Soybean frogeye leaf spot (FLS) is a global disease affecting soybean yield, especially in the soybean growing area of Heilongjiang Province. In order to realize genomic selection breeding for FLS resistance in soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established for 40002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. The model established in this study could be used directly to evaluate the FLS resistance of soybean and to select excellent offspring. This research method could also provide ideas and methods for disease-resistance breeding in other plants.
In the face of climate change and the growing global population, there is an urgent need to accelerate the development of high-yielding crop varieties. To this end, vast amounts of genotype-to-phenotype data have been collected, and many machine learning (ML) models have been developed to predict phenotype from a given genotype. However, the requirement for high densities of single-nucleotide polymorphisms (SNPs) and the labor-intensive collection of phenotypic data are hampering the use of these models to advance breeding. Furthermore, recently developed genomic selection (GS) models, such as deep learning (DL), are complicated and inconvenient for breeders to navigate and optimize within their breeding programs. Here, we present the development of an intelligent breeding platform named AutoGP (http://autogp.hzau.edu.cn), which integrates genotype extraction, phenotypic extraction, and GS models of genotype-to-phenotype data within a user-friendly web interface. AutoGP has three main advantages over previously developed platforms: 1) an efficient sequencing chip to identify high-quality, high-confidence SNPs throughout gene-regulatory networks; 2) a complete workflow for extraction of plant phenotypes (such as plant height and leaf count) from smartphone-captured video; and 3) a broad model pool, enabling users to select from five ML models (support vector machine, extreme gradient boosting, gradient-boosted decision tree, multilayer perceptron, and random forest) and four commonly used DL models (deep learning genomic selection, deep learning genomic-wide association study, deep neural network for genomic prediction, and SoyDNGP). For the convenience of breeders, we use datasets from the maize (Zea mays) complete-diallel design plus unbalanced breeding-like inter-cross population as a case study to demonstrate the usefulness of AutoGP. We show that our genotype chips can effectively extract high-quality SNPs associated with days to tasseling and plant height. The models show reliable predictive accuracy on different populations and can provide effective guidance for breeders. Overall, AutoGP offers a practical solution to streamline the breeding process, enabling breeders to achieve more efficient and accurate genomic selection.
Plant breeding stands as a cornerstone for agricultural productivity and the safeguarding of food security. The advent of Genomic Selection heralds a new epoch in breeding, characterized by its capacity to harness whole-genome variation for genomic prediction. This approach transcends the need for prior knowledge of genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data juxtaposed with the relatively limited number of phenotypic samples often leads to the "curse of dimensionality", where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. This approach provides a robust solution, offering a formidable predictive framework that has been rigorously validated across public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection and underscoring its unique strengths and utility. Furthermore, by integrating VMGP with model interpretability, we can effectively triage relevant single nucleotide polymorphisms, thereby enhancing prediction performance and proposing potential cost-effective genotyping solutions. The VMGP framework, with its simplicity, stable predictive prowess, and open-source code, is exceptionally well-suited for broad dissemination within plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction yet may not possess extensive knowledge in deep learning or proficiency in parameter tuning.
The advantages of genomic selection (GS) in animal and plant breeding are self-evident. Traditional parametric models struggle to fit increasingly large sequencing datasets and to capture complex effects accurately. Machine learning models have demonstrated remarkable potential in addressing these challenges. In this study, we introduced the concept of mixed kernel functions to explore the performance of support vector machine regression (SVR) in GS. Six single kernel functions (SVR_L, SVR_C, SVR_G, SVR_P, SVR_S, SVR_L) and four mixed kernel functions (SVR_GS, SVR_GP, SVR_LS, SVR_LP) were used to predict genomic breeding values. The prediction accuracy, mean squared error (MSE), and mean absolute error (MAE) were used as evaluation indicators for comparison with two traditional parametric models (GBLUP, BayesB) and two popular machine learning models (RF, KcRR). The results indicate that in most cases, the mixed kernel function models significantly outperform GBLUP, BayesB, and the single kernel function models. For instance, for T1 in the pig dataset, the predictive accuracy of SVR_GS is improved by 10% compared to GBLUP, and by approximately 4.4% and 18.6% compared to SVR_G and SVR_S, respectively. For E1 in the wheat dataset, SVR_GS achieves 13.3% higher prediction accuracy than GBLUP. Among single kernel functions, the Laplacian and Gaussian kernel functions yield similar results, with the Gaussian kernel function performing better. The mixed kernel function notably reduces the MSE and MAE when compared to all single kernel functions. Furthermore, regarding runtime, the SVR_GS and SVR_GP mixed kernel functions run approximately three times faster than GBLUP in the pig dataset, with only a slight increase in runtime compared to the single kernel function model. In summary, the mixed kernel function model of SVR is competitive in both speed and accuracy, and models such as SVR_GS have important application potential for GS.
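The idea of a mixed kernel can be sketched as a weighted sum of two single kernels. The abstract does not state how the kernels are combined or which parameters are used, so the convex combination below, the choice of a Gaussian plus polynomial pair (in the spirit of SVR_GP), and every parameter value are assumptions made purely for illustration.

```python
import math


def gaussian_kernel(x, z, gamma):
    """k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def polynomial_kernel(x, z, degree, coef0):
    """k(x, z) = (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def mixed_kernel(x, z, weight, gamma, degree, coef0):
    """Convex combination of the two kernels, with 0 <= weight <= 1."""
    return (weight * gaussian_kernel(x, z, gamma)
            + (1.0 - weight) * polynomial_kernel(x, z, degree, coef0))


def gram_matrix(genotypes, weight, gamma, degree, coef0):
    """Kernel value for every pair of individuals."""
    return [[mixed_kernel(a, b, weight, gamma, degree, coef0)
             for b in genotypes] for a in genotypes]


# Hypothetical genotypes: 3 individuals, 2 markers coded 0/1/2.
genotypes = [[1, 0], [0, 2], [1, 1]]
K = gram_matrix(genotypes, weight=0.5, gamma=0.1, degree=2, coef0=1.0)
```

A matrix like `K` could then be handed to any SVR implementation that accepts a precomputed kernel. Because both component kernels here are positive semidefinite, their convex combination is too; a mixture involving a sigmoid kernel would not carry that guarantee.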
Genomic selection (GS) and high-throughput phenotyping have recently been captivating the interest of the crop breeding community from both the public and private sectors worldwide. Both approaches promise to revolutionize the prediction of complex traits, including growth, yield and adaptation to stress. Whereas high-throughput phenotyping may help to improve understanding of crop physiology, the most powerful techniques for high-throughput field phenotyping are empirical rather than analytical, and in that respect comparable to genomic selection. Despite the fact that the two methodological approaches represent the extremes of what is understood as the breeding process (phenotype versus genome), they both consider the targeted traits (e.g. grain yield, growth, phenology, plant adaptation to stress) as a black box instead of dissecting them as a set of secondary traits (i.e. physiological) putatively related to the target trait. Both GS and high-throughput phenotyping have in common their empirical approach, enabling breeders to use genome profile or phenotype without understanding the underlying biology. This short review discusses the main aspects of both approaches and focuses on the case of genomic selection of maize flowering traits and near-infrared spectroscopy (NIRS) and plant spectral reflectance as high-throughput field phenotyping methods for complex traits such as crop growth and yield.
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size. This was done while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when using Pearson's correlation as a metric) is an active research area, and accuracy is as yet far from optimal for various traits. We hypothesize that with ultra-high sizes of genotypic and phenotypic datasets, effective training population optimization methods and support from other omics approaches (transcriptomics, metabolomics and proteomics) coupled with deep-learning algorithms could overcome the boundaries of current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
Identifying mechanisms and pathways involved in gene–environment interplay and phenotypic plasticity is a long-standing challenge. It is highly desirable to establish an integrated framework with an environmental dimension for complex trait dissection and prediction. A critical step is to identify an environmental index that is both biologically relevant and estimable for new environments. With extensive field-observed complex traits, environmental profiles, and genome-wide single nucleotide polymorphisms for three major crops (maize, wheat, and oat), we demonstrated that identifying such an environmental index (i.e., a combination of environmental parameter and growth window) enables genome-wide association studies and genomic selection of complex traits to be conducted with an explicit environmental dimension. Interestingly, genes identified for two reaction-norm parameters (i.e., intercept and slope) derived from flowering time values along the environmental index were less colocalized for a diverse maize panel than for wheat and oat breeding panels, agreeing with the different diversity levels and genetic constitutions of the panels. In addition, we showcased the usefulness of this framework for systematically forecasting the performance of diverse germplasm panels in new environments. This general framework and the companion CERIS-JGRA analytical package should facilitate biologically informed dissection of complex traits, enhanced performance prediction in breeding for future climates, and coordinated efforts to enrich our understanding of mechanisms underlying phenotypic variation.
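The two reaction-norm parameters named in this abstract (intercept and slope) can be illustrated by regressing one genotype's flowering time on the environmental index across environments. This is a generic least-squares sketch, not the CERIS-JGRA implementation; the index values, flowering times, and function names are invented for the example.

```python
def fit_reaction_norm(env_index, trait_values):
    """Least-squares intercept and slope of trait on environmental index."""
    n = len(env_index)
    mean_x = sum(env_index) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in env_index)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(env_index, trait_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(intercept, slope, new_env_index):
    """Predicted trait value in an environment not yet observed."""
    return intercept + slope * new_env_index


# Hypothetical genotype observed in three environments.
env_index = [10.0, 15.0, 20.0]       # e.g. a temperature-based index
flowering_time = [70.0, 65.0, 60.0]  # days to flowering

intercept, slope = fit_reaction_norm(env_index, flowering_time)
predicted = forecast(intercept, slope, 25.0)
```

Once each genotype has its own intercept and slope, these two numbers can be treated as traits in their own right for association mapping or genomic prediction, and a new environment only needs its index value to produce a forecast.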
Alfalfa (Medicago sativa L.) is an important forage crop worldwide. However, little is known about the effects of breeding status and different geographical populations on alfalfa improvement. Here, we sequenced 220 alfalfa core germplasms and determined that Chinese alfalfa cultivars form an independent group, as evidenced by comparisons of FST values between different subgroups, suggesting that geographical origin plays an important role in group differentiation. By tracing the influence of geographical regions on the genetic diversity of alfalfa varieties in China, we identified 350 common candidate genetic regions and 548 genes under selection. We also defined 165 loci associated with 24 important traits from genome-wide association studies. Of those, 17 genomic regions closely associated with a given phenotype were under selection, with the underlying haplotypes showing significant differences between subgroups of distinct geographical origins. Based on results from expression analysis and association mapping, we propose that 6-phosphogluconolactonase (MsPGL) and a gene encoding a protein with NHL domains (MsNHL) are critical candidate genes for root growth. In conclusion, our results provide valuable information for alfalfa improvement via molecular breeding.
Recent advances in molecular genetics techniques have made dense marker maps available, and the prediction of breeding value at the genome level has been employed in genetics research. However, an increasingly large number of markers raises both statistical and computational issues in genomic selection (GS), and many methods have been developed for genomic prediction to address these problems, including ridge regression-best linear unbiased prediction (RR-BLUP), genomic best linear unbiased prediction, BayesA, BayesB, BayesCπ, and Bayesian LASSO. In this paper, these methods were compared regarding inference under different conditions, using real data from a wheat data set and simulated scenarios with a small number of quantitative trait loci (QTL) (20), a moderate number of QTL (60, 180) and an extreme number of QTL (540). This study showed that the genetic architecture of a trait should be fully considered when a GS method is chosen. If a small number of loci had a large effect on a trait, great differences were found between the predictive ability of various methods and BayesCπ was recommended. Although there was almost no significant difference between the predictive ability of BayesCπ and BayesB, BayesCπ is more feasible than BayesB for real data analysis. If a trait was controlled by a moderate number of genes, the absolute differences between the various methods were small, but BayesA was also found to be the most accurate method. Furthermore, BayesA was widely adaptable and could perform well with different numbers of QTL. If a trait was controlled by an extreme number of minor genes, almost no significant differences were detected between the predictive ability of various methods, but RR-BLUP slightly outperformed the others in both simulated scenarios and real data analysis, thus demonstrating its robustness and indicating that it was quite effective in this case.
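As a rough illustration of the simplest method compared in the abstract above, RR-BLUP can be viewed as ridge regression of phenotypes on all markers at once. The sketch below is not the authors' code: the function names `rr_blup` and `predict_gebv`, the 0/1/2 marker coding, the simulated data, and the fixed shrinkage parameter `lam` (in practice usually derived from estimated variance components) are all assumptions made for illustration.

```python
import numpy as np


def rr_blup(X, y, lam):
    """Estimate marker effects by ridge regression (illustrative sketch).

    X   : (n, p) marker matrix, assumed coded 0/1/2
    y   : (n,) phenotypes
    lam : shrinkage parameter, assumed known here (often taken as the
          ratio of residual variance to per-marker effect variance)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = X.mean(axis=0)
    mu = y.mean()
    Xc = X - col_means  # centre each marker column
    p = Xc.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mu) for the marker effects
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - mu))
    return mu, col_means, beta


def predict_gebv(X_new, mu, col_means, beta):
    """Predicted value = overall mean + sum of estimated marker effects."""
    return mu + (np.asarray(X_new, dtype=float) - col_means) @ beta


if __name__ == "__main__":
    # Simulated data with many small-effect loci (hypothetical values)
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(120, 300))
    true_beta = rng.normal(0.0, 0.1, size=300)
    y = X @ true_beta + rng.normal(0.0, 1.0, size=120)
    mu, means, beta = rr_blup(X[:100], y[:100], lam=50.0)
    pred = predict_gebv(X[100:], mu, means, beta)
    print("predictive correlation:", np.corrcoef(pred, y[100:])[0, 1])
```

Because every marker receives the same amount of shrinkage, this kind of model tends to suit traits controlled by many minor genes, which is consistent with the scenario in which the abstract reports RR-BLUP performing slightly better.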
Although long-term genetic gain has been achieved through increasing use of modern breeding methods and technologies, the rate of genetic gain needs to be accelerated to meet humanity’s demand for agricultural products. In this regard, genomic selection (GS) has been considered most promising for genetic improvement of the complex traits controlled by many genes each with minor effects. Livestock scientists pioneered GS application largely due to livestock’s significantly higher individual values and the greater reduction in generation interval that can be achieved in GS. Large-scale application of GS in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and developing optimum GS models with the consideration of genotype-by-environment interaction and non-additive effects, along with significant cost reduction. Moreover, it would be more effective to integrate GS with other breeding tools and platforms for accelerating the breeding process and thereby further enhancing genetic gain. In addition, establishing an open-source breeding network and developing transdisciplinary approaches would be essential in enhancing breeding efficiency for small- and medium-sized enterprises and agricultural research systems in developing countries. New strategies centered on GS for enhancing genetic gain need to be developed.
Funding: Supported by the National Key Research and Development Plan of China (2021YFD2200202) and the Key Research and Development Project of Jiangsu Province, China (BE2021366).
Abstract: Populus species, important economic species combining rapid growth with broad ecological adaptability, play a critical role in sustainable forestry and bioenergy production. In this study, we performed whole-genome resequencing of 707 individuals from a full-sib family to develop comprehensive single nucleotide polymorphism (SNP) markers and constructed a high-density genetic linkage map of 19 linkage groups. The total genetic length of the map reached 3623.65 cM with an average marker interval of 0.34 cM. By integrating multidimensional phenotypic data, 89 quantitative trait loci (QTL) associated with growth, wood physical and chemical properties, disease resistance, and leaf morphology traits were identified, with logarithm of odds (LOD) scores ranging from 3.13 to 21.72 and phenotypic variance explained between 1.7% and 11.6%. Notably, pleiotropic analysis revealed significant colocalization hotspots on chromosomes LG1, LG5, LG6, LG8, and LG14, with epistatic interaction network analysis confirming the genetic basis of coordinated regulation across multiple traits. Functional annotation of 207 candidate genes showed that R2R3-MYB and bHLH transcription factors and pyruvate kinase-encoding genes were significantly enriched, suggesting crucial roles in lignin biosynthesis and carbon metabolic pathways. Allelic effect analysis indicated that the frequency of favorable alleles associated with target traits ranged from 0.20 to 0.55. Incorporation of QTL-derived favorable alleles as random effects into Bayesian-based genomic selection models led to an increase in prediction accuracy ranging from 1% to 21%, with Bayesian ridge regression as the best predictive model. This study provides valuable genomic resources and genetic insights for deciphering complex trait architecture and advancing molecular breeding in poplar.
Funding: Supported by the National Natural Science Foundation of China (32160782 and 32060737).
Abstract: The principle of genomic selection (GS) entails estimating breeding values (BVs) by summing all the SNP polygenic effects. The visible/near-infrared spectroscopy (VIS/NIRS) wavelength and abundance values can directly reflect the concentrations of chemical substances, and the measurement of meat traits by VIS/NIRS is similar to the processing of genomic selection data by summing all "polygenic effects" associated with spectral feature peaks. Therefore, it is meaningful to investigate the incorporation of VIS/NIRS information into GS models to establish an efficient and low-cost breeding model. In this study, we measured 6 meat quality traits in 359 Duroc×Landrace×Yorkshire pigs from Guangxi Zhuang Autonomous Region, China, and genotyped them with high-density SNP chips. According to the completeness of the information for the target population, we proposed 4 breeding strategies applied to different scenarios: Ⅰ, only spectral and genotypic data exist for the target population; Ⅱ, only spectral data exist for the target population; Ⅲ, only spectral and genotypic data but with different prediction processes exist for the target population; and Ⅳ, only spectral and phenotypic data exist for the target population. The 4 scenarios were used to evaluate the genomic estimated breeding value (GEBV) accuracy by increasing the VIS/NIR spectral information. In the results of the 5-fold cross-validation, the genetic algorithm showed remarkable potential for preselection of feature wavelengths. The breeding efficiency of Strategies Ⅱ, Ⅲ, and Ⅳ was superior to that of traditional GS for most traits, and the GEBV prediction accuracy was improved by 32.2, 40.8 and 15.5% on average, respectively. Among them, the prediction accuracy of Strategy Ⅱ for fat (%) even improved by 50.7% compared to traditional GS. The GEBV prediction accuracy of Strategy Ⅰ was nearly identical to that of traditional GS, and the fluctuation range was less than 7%. Moreover, the breeding cost of the 4 strategies was lower than that of traditional GS methods, with Strategy Ⅳ being the lowest as it did not require genotyping. Our findings demonstrate that GS methods based on VIS/NIRS data have significant predictive potential and are worthy of further research to provide a valuable reference for the development of effective and affordable breeding strategies.
Funding: Supported by grants from the National High Technology Research and Development Program of China (2014AA10A601-5), the National Key Research and Development Program of China (2016YFD0100303), the National Natural Science Foundation of China (91535103), the Natural Science Foundations of Jiangsu Province (BK20150010), the Natural Science Foundation of the Jiangsu Higher Education Institutions (14KJA210005), the Open Research Fund of State Key Laboratory of Hybrid Rice (Wuhan University) (KF201701), the Science and Technology Innovation Fund Project in Yangzhou University (2016CXJ021), the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Innovative Research Team of Universities in Jiangsu Province.
Abstract: With marker and phenotype information from observed populations, genomic selection (GS) can be used to establish associations between markers and phenotypes. It aims to use genome-wide markers to estimate the effects of all loci and thereby predict the genetic values of untested populations, so as to achieve more comprehensive and reliable selection and to accelerate genetic progress in crop breeding. GS models usually face the problem that the number of markers is much higher than the number of phenotypic observations. To overcome this issue and improve prediction accuracy, many models and algorithms, including GBLUP, Bayes, and machine learning, have been employed for GS. As hot issues in GS research, the estimation of non-additive genetic effects and the combined analysis of multiple traits or multiple environments are also important for improving the accuracy of prediction. In recent years, crop breeding has taken advantage of the development of GS. The principles and characteristics of current popular GS methods and research progress in these methods for crop improvement are reviewed in this paper.
Funding: Supported by the National Natural Science Foundation of China (31801028, 32061143030, and 41801013), the National Key Technology Research and Development Program of China (2016YFD0100303), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the Innovative Research Team of Ministry of Agriculture, and the Qing-Lan Project of Yangzhou University.
Abstract: Rice (Oryza sativa) provides a staple food source for more than half the world population. However, the current pace of rice breeding in yield growth is insufficient to meet the food demand of the ever-increasing global population. Genomic selection (GS) holds great potential to accelerate breeding progress and is cost-effective via early selection before phenotypes are measured. Previous simulation and experimental studies have demonstrated the usefulness of GS in rice breeding. However, several affecting factors and limitations require careful consideration when performing GS. In this review, we summarize the major genetic and statistical factors affecting predictive performance as well as current progress in the application of GS to rice breeding. We also highlight effective strategies to increase the predictive ability of various models, including GS models incorporating functional markers, genotype by environment interactions, multiple traits, selection index, and multiple omic data. Finally, we envision that integrating GS with other advanced breeding technologies such as unmanned aerial vehicles and open-source breeding platforms will further improve the efficiency and reduce the cost of breeding.
Funding: Supported by the National Key Basic Research Program of China (2014CB138105) and the National Natural Science Foundation of China (31371623).
Abstract: In wheat breeding, it is a difficult task to select the most suitable parents for making crosses aimed at the improvement of both grain yield and grain quality. By quantitative genetics theory, the best cross should have high progeny mean and large genetic variance, and ideally yield and quality should be less negatively or positively correlated. Usefulness is built on population mean and genetic variance, which can be used to select the best crosses or populations to achieve the breeding objective. In this study, we first compared five models (RR-BLUP, Bayes A, Bayes B, Bayes ridge regression, and Bayes LASSO) for genomic selection (GS) with respect to prediction of usefulness of a biparental cross and two criteria for parental selection, using simulation. The two parental selection criteria were usefulness and midparent genomic estimated breeding value (GEBV). Marginal differences were observed among GS models. Parental selection with usefulness resulted in higher genetic gain than midparent GEBV. In a population of 57 wheat fixed lines genotyped with 7588 selected markers, usefulness of each biparental cross was calculated to evaluate the cross performance, a key target of breeding programs aimed at developing pure lines. It was observed that progeny mean was a major determinant of usefulness, but the usefulness ratings of quality traits were more influenced by their genetic variances in the progeny population. Near-zero or positive correlations between yield and major quality traits were found in some crosses, although they were negatively correlated in the population of parents. A selection index incorporating yield, extensibility, and maximum resistance was formed as a new trait and its usefulness for selecting the crosses with the best potential to improve yield and quality simultaneously was calculated. It was shown that applying the selection index improved both yield and quality while retaining more genetic variance in the selected progenies than the individual trait selection. It was concluded that combining genomic selection with simulation allows the prediction of cross performance in simulated progenies and thereby identifies candidate parents before crosses are made in the field for pure-line breeding programs.
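The abstract above does not spell out the usefulness formula. A common textbook form of the usefulness criterion is U = μ + i·h·σ_G, where μ is the progeny mean, σ_G the genetic standard deviation among progeny, h the square root of heritability, and i the standardized selection intensity. The sketch below assumes that form and truncation selection on a normally distributed trait; the function names and the example values are hypothetical, and it should not be read as the authors' implementation.

```python
from statistics import NormalDist


def selection_intensity(selected_proportion):
    """Standardized selection intensity i under truncation selection
    on a normally distributed trait (assumed model)."""
    if not 0.0 < selected_proportion < 1.0:
        raise ValueError("selected_proportion must be in (0, 1)")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - selected_proportion)  # truncation point
    return std_normal.pdf(z) / selected_proportion


def usefulness(progeny_mean, genetic_sd, heritability, selected_proportion):
    """Assumed usefulness criterion: U = mean + i * sqrt(h2) * genetic SD."""
    i = selection_intensity(selected_proportion)
    return progeny_mean + i * heritability ** 0.5 * genetic_sd


if __name__ == "__main__":
    # Two hypothetical crosses: A has the higher mean, B the larger variance
    cross_a = usefulness(progeny_mean=6.0, genetic_sd=0.3,
                         heritability=0.5, selected_proportion=0.10)
    cross_b = usefulness(progeny_mean=5.8, genetic_sd=0.8,
                         heritability=0.5, selected_proportion=0.10)
    print("cross A:", cross_a, "cross B:", cross_b)
```

Under this form, a cross with a slightly lower mean can still rank higher when its progeny variance is large, which is why usefulness and midparent GEBV can select different parents.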
Abstract: Single nucleotide polymorphism (SNP) arrays are a powerful genotyping tool used in genetic research and genomic breeding programs. Japanese flounder (Paralichthys olivaceus) is an economically important aquaculture flatfish in many countries. However, the lack of highly efficient genotyping tools has impeded the genomic breeding programs for Japanese flounder. We developed a 50K Japanese flounder SNP array, "Yuxin No.1," and report its utility in genomic selection (GS) for disease resistance to bacterial pathogens. We screened more than 42.2 million SNPs from the whole-genome resequencing data of 1099 individuals and selected 48697 SNPs that were evenly distributed across the genome to anchor the array with Affymetrix Axiom genotyping technology. Evaluation of the array performance with 168 fish showed that 74.7% of the loci were successfully genotyped with high call rates (>98%) and that the polymorphic SNPs had good cluster separations. More than 85% of the SNPs were concordant with SNPs obtained from the whole-genome resequencing data. To validate "Yuxin No.1" for GS, the arrayed genotyping data of 27 individuals from a candidate population and 931 individuals from a reference population were used to calculate the genomic estimated breeding values (GEBVs) for disease resistance to Edwardsiella tarda. There was a 21.2% relative increase in the accuracy of GEBV using the weighted genomic best linear unbiased prediction (wGBLUP), compared to traditional pedigree-based best linear unbiased prediction (ABLUP), suggesting good performance of the "Yuxin No.1" SNP array for GS. In summary, we developed the "Yuxin No.1" 50K SNP array, which provides a useful platform for high-quality genotyping that may be beneficial to the genomic selective breeding of Japanese flounder.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2012AA10A404) and the National Natural Science Foundation of China (No. 31502161); financially supported by the Qingdao National Laboratory for Marine Science and Technology (No. 2015ASKJ02).
Abstract: Genomic selection (GS) can be used to accelerate genetic improvement by shortening the selection interval. The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value (GEBV). This study is a first attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits. The performance of GS models in L. vannamei was evaluated in a population consisting of 205 individuals, which were genotyped for 6359 single nucleotide polymorphism (SNP) markers by specific length amplified fragment sequencing (SLAF-seq) and phenotyped for body length and body weight. Three GS models (RR-BLUP, Bayes A, and Bayesian LASSO) were used to obtain the GEBV, and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes. The mean reliability of the GEBVs for body length and body weight predicted by the different models was 0.296 and 0.411, respectively. For each trait, the performances of the three models were very similar to each other with respect to predictability. The regression coefficients estimated by the three models were close to one, suggesting near-zero bias for the predictions. Therefore, when GS was applied in an L. vannamei population for the studied scenarios, all three models appeared practicable. Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.
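One common way to assess predictions of this kind is to correlate GEBVs with observed phenotypes in validation individuals and to regress the observed values on the predicted ones, where a slope near one suggests little inflation or deflation. The abstract above does not define its reliability and bias measures in detail, so the sketch below assumes this convention; the function name `accuracy_and_bias` and the toy numbers are hypothetical.

```python
import numpy as np


def accuracy_and_bias(gebv, observed):
    """Return (Pearson correlation, regression slope of observed on GEBV).

    A slope close to 1 is usually read as predictions being on the same
    scale as the observations (assumed convention, not from the source).
    """
    gebv = np.asarray(gebv, dtype=float)
    observed = np.asarray(observed, dtype=float)
    r = np.corrcoef(gebv, observed)[0, 1]
    slope, _intercept = np.polyfit(gebv, observed, 1)  # degree-1 fit
    return r, slope


if __name__ == "__main__":
    # Toy validation set (made-up numbers for illustration only)
    gebv = [10.2, 11.5, 9.8, 12.1, 10.9]
    observed = [10.0, 11.9, 9.5, 12.4, 10.6]
    r, slope = accuracy_and_bias(gebv, observed)
    print("correlation:", r, "slope:", slope)
```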
Funding: Supported by the earmarked fund for China Agriculture Research System (CARS-35) and the National Natural Science Foundation of China (32022078); supported by the National Supercomputer Centre in Guangzhou.
Abstract: Genomic selection (GS) has been widely used in livestock, where it has greatly accelerated the genetic progress of complex traits. Population size is one of the significant factors affecting prediction accuracy, but it is limited in purebred populations. Compared to directly combining two uncorrelated purebred populations to extend the reference population size, it might be more meaningful to incorporate the correlated crossbreds into the reference population for genomic prediction. In this study, we simulated purebred offspring (PAS and PBS) and crossbred offspring (CAB) based on real genotype data of two base purebred populations (PA and PB), to evaluate the performance of genomic selection on purebreds while incorporating crossbred information. The results showed that selecting key crossbred individuals via maximizing the expected genetic relationship (REL) was better than the other methods (individuals closest or farthest to the purebred population, CP/FP) in terms of prediction accuracy. Furthermore, the prediction accuracy of reference populations combining PA and CAB was significantly better than that based only on PA, and was similar to that of combining PA and PAS. Moreover, the rank correlation between the multiple of the increased relationship (MIR) and reliability improvement was 0.60-0.70. But for individuals with low correlation (Cor(Pi, PA or B)), the reliability improvement was significantly lower than for other individuals. Our findings suggested that incorporating crossbreds into the purebred population could improve the performance of genetic prediction compared with using the purebred population only. The genetic relationship between purebred and crossbred populations is a key factor determining the increased reliability while incorporating the crossbred population in the genomic prediction of purebred individuals.
Funding: Supported by the Key Science and Technology Project of Yunnan (202202AE090014), the National Natural Science Foundation of China (32072016), the Agricultural Science and Technology Innovation Program (ASTIP) of Chinese Academy of Agricultural Sciences, and the Open Fund of Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education, China (201910).
Abstract: A biparental soybean population of 364 recombinant inbred lines (RILs) derived from Zhongdou 41×ZYD02.878 was used to identify quantitative trait loci (QTL) associated with hundred-seed weight (100-SW), pod length (PL), and pod width (PW). 100-SW, PL, and PW showed moderate correlations among one another, and 100-SW was correlated most strongly with PW (0.64–0.74). In total, 74, 70, 75, and 19 QTL accounting for 38.7%–78.8% of total phenotypic variance were identified by inclusive composite interval mapping, restricted two-stage multi-locus genome-wide association analysis, 3 variance-component multi-locus random-SNP-effect mixed linear model analysis, and conditional genome-wide association analysis, respectively. Of these QTL, 189 were novel, and 24 were detected by multiple methods. Six loci were associated with 100-SW, PL, and PW and may be pleiotropic loci. A total of 284 candidate genes were identified in colocalizing QTL regions, including the verified gene Seed thickness 1 (ST1). Eleven genes with functions involved in pectin biosynthesis, phytohormone, ubiquitin-protein, and photosynthesis pathways were prioritized by examining single nucleotide polymorphism (SNP) variation, calculating the genetic differentiation index, and examining gene expression. The prediction accuracies of genomic selection (GS) for 100-SW, PL, and PW based on single trait-associated markers reached 0.82, 0.76, and 0.86, respectively, but a selection index (SI)-assisted GS strategy did not increase GS efficiency, and inclusion of trait-associated markers as fixed effects reduced prediction accuracy. These results shed light on the genetic basis of 100-SW, PL, and PW and provide GS models for these traits with potential application in breeding programs.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 30800776), the State High-Tech Development Plan of China (Grant No. 2008AA101002), and the Recommend International Advanced Agricultural Science and Technology Plan of China (Grant No. 2011-G2A).
Funding: Supported by the National Key Research and Development Program of China (2021YFD1201103-01-05).
Abstract: Soybean frogeye leaf spot (FLS) is a global disease affecting soybean yield, especially in the soybean growing area of Heilongjiang Province. To enable genomic selection breeding for FLS resistance in soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established for 40002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. The model established in this study could be used directly to evaluate soybean resistance to FLS and to select superior offspring. This research method could also provide ideas and methods for disease resistance breeding in other plants.
Funding: Supported by Biological Breeding-National Science and Technology Major Project (2023ZD04076), the National Key Research and Development Program of China (2023YFF1000100), the National Natural Science Foundation of China (32321005 and 32261143463), the Fundamental Research Funds for the Central Universities of China (2662024XXPY001), and the Outstanding Youth Team Cultivation Project of Center Universities (2662023PY007).
Abstract: In the face of climate change and the growing global population, there is an urgent need to accelerate the development of high-yielding crop varieties. To this end, vast amounts of genotype-to-phenotype data have been collected, and many machine learning (ML) models have been developed to predict phenotype from a given genotype. However, the requirement for high densities of single-nucleotide polymorphisms (SNPs) and the labor-intensive collection of phenotypic data are hampering the use of these models to advance breeding. Furthermore, recently developed genomic selection (GS) models, such as deep learning (DL), are complicated and inconvenient for breeders to navigate and optimize within their breeding programs. Here, we present the development of an intelligent breeding platform named AutoGP (http://autogp.hzau.edu.cn), which integrates genotype extraction, phenotypic extraction, and GS models of genotype-to-phenotype data within a user-friendly web interface. AutoGP has three main advantages over previously developed platforms: 1) an efficient sequencing chip to identify high-quality, high-confidence SNPs throughout gene-regulatory networks; 2) a complete workflow for extraction of plant phenotypes (such as plant height and leaf count) from smartphone-captured video; and 3) a broad model pool, enabling users to select from five ML models (support vector machine, extreme gradient boosting, gradient-boosted decision tree, multilayer perceptron, and random forest) and four commonly used DL models (deep learning genomic selection, deep learning genomic-wide association study, deep neural network for genomic prediction, and SoyDNGP). For the convenience of breeders, we use datasets from the maize (Zea mays) complete-diallel design plus unbalanced breeding-like inter-cross population as a case study to demonstrate the usefulness of AutoGP. We show that our genotype chips can effectively extract high-quality SNPs associated with days to tasseling and plant height. The models show reliable predictive accuracy on different populations and can provide effective guidance for breeders. Overall, AutoGP offers a practical solution to streamline the breeding process, enabling breeders to achieve more efficient and accurate genomic selection.
Funding: Supported by the National Key Research and Development Program of China (No. 2024YFD1201500), the Key Research and Development Program of Jiangsu Province, China (No. BE2022337, BE2023302, and BE2023315), and the National Innovation Center for Digital Seed Industry, Beijing, China, 100097.
Abstract: Plant breeding stands as a cornerstone for agricultural productivity and the safeguarding of food security. The advent of Genomic Selection heralds a new epoch in breeding, characterized by its capacity to harness whole-genome variation for genomic prediction. This approach transcends the need for prior knowledge of genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data juxtaposed with the relatively limited number of phenotypic samples often leads to the “curse of dimensionality”, where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. This approach provides a robust solution, offering a formidable predictive framework that has been rigorously validated across public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection and underscoring its unique strengths and utility. Furthermore, by integrating VMGP with model interpretability, we can effectively triage relevant single nucleotide polymorphisms, thereby enhancing prediction performance and proposing potential cost-effective genotyping solutions. The VMGP framework, with its simplicity, stable predictive prowess, and open-source code, is exceptionally well-suited for broad dissemination within plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction yet may not possess extensive knowledge in deep learning or proficiency in parameter tuning.
Funding: Supported by the China Agriculture Research System of MOF and MARA, the National Natural Science Foundation of China (31872337 and 31501919), and the Agricultural Science and Technology Innovation Project, China (ASTIP-IAS02).
Abstract: The advantages of genomic selection (GS) in animal and plant breeding are self-evident. Traditional parametric models are at a disadvantage in fitting increasingly large sequencing datasets and in capturing complex effects accurately. Machine learning models have demonstrated remarkable potential in addressing these challenges. In this study, we introduced the concept of mixed kernel functions to explore the performance of support vector machine regression (SVR) in GS. Six single kernel functions (SVR_L, SVR_C, SVR_G, SVR_P, SVR_S, SVR_L) and four mixed kernel functions (SVR_GS, SVR_GP, SVR_LS, SVR_LP) were used to predict genomic breeding values. The prediction accuracy, mean squared error (MSE) and mean absolute error (MAE) were used as evaluation indicators for comparison with two traditional parametric models (GBLUP, BayesB) and two popular machine learning models (RF, KcRR). The results indicate that in most cases, the mixed kernel function models significantly outperform GBLUP, BayesB and the single kernel function models. For instance, for T1 in the pig dataset, the predictive accuracy of SVR_GS is improved by 10% compared to GBLUP, and by approximately 4.4 and 18.6% compared to SVR_G and SVR_S, respectively. For E1 in the wheat dataset, SVR_GS achieves 13.3% higher prediction accuracy than GBLUP. Among single kernel functions, the Laplacian and Gaussian kernel functions yield similar results, with the Gaussian kernel function performing better. The mixed kernel function notably reduces the MSE and MAE when compared to all single kernel functions. Furthermore, regarding runtime, the SVR_GS and SVR_GP mixed kernel functions run approximately three times faster than GBLUP in the pig dataset, with only a slight increase in runtime compared to the single kernel function model. In summary, the mixed kernel function model of SVR is competitive in both speed and accuracy, and models such as SVR_GS have important application potential for GS.
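The abstract above does not give the form of its mixed kernels. One common construction is a weighted combination of two base kernels, for example a Gaussian and a sigmoid kernel, and the sketch below builds such a Gram matrix with NumPy. The weight `w`, the hyperparameters `gamma`, `a`, and `c`, and all function names are assumptions chosen for illustration rather than the authors' settings. A matrix of this kind could then be handed to an SVR implementation that accepts precomputed kernels.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2) for all row pairs."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))  # clip rounding noise


def sigmoid_kernel(A, B, a, c):
    """Sigmoid kernel: tanh(a * <a_i, b_j> + c)."""
    return np.tanh(a * (A @ B.T) + c)


def mixed_kernel(A, B, w=0.5, gamma=0.01, a=0.001, c=0.0):
    """Assumed mixed kernel: w * Gaussian + (1 - w) * sigmoid, 0 <= w <= 1."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must be between 0 and 1")
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return w * gaussian_kernel(A, B, gamma) + (1.0 - w) * sigmoid_kernel(A, B, a, c)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    markers = rng.integers(0, 3, size=(5, 20))  # hypothetical 0/1/2 genotypes
    K = mixed_kernel(markers, markers, w=0.7)
    print(K.shape)
```

Mixing a local kernel (Gaussian) with a global one (sigmoid or polynomial) is often motivated by wanting both close-relative similarity and broad trends to contribute; whether that is the rationale used in the study is not stated in the abstract.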
Funding: Participation of José Luis Araus and María Dolors Serret was supported by the Spanish Project AGL2010-20180 (subprogram AGR) and the FP7 European Project OPTICHINA (266045).
Abstract: Genomic selection (GS) and high-throughput phenotyping have recently been captivating the interest of the crop breeding community from both the public and private sectors world-wide. Both approaches promise to revolutionize the prediction of complex traits, including growth, yield and adaptation to stress. Whereas high-throughput phenotyping may help to improve understanding of crop physiology, most powerful techniques for high-throughput field phenotyping are empirical rather than analytical and comparable to genomic selection. Despite the fact that the two methodological approaches represent the extremes of what is understood as the breeding process (phenotype versus genome), they both consider the targeted traits (e.g. grain yield, growth, phenology, plant adaptation to stress) as a black box instead of dissecting them as a set of secondary traits (i.e. physiological) putatively related to the target trait. Both GS and high-throughput phenotyping have in common their empirical approach enabling breeders to use genome profile or phenotype without understanding the underlying biology. This short review discusses the main aspects of both approaches and focuses on the case of genomic selection of maize flowering traits and near-infrared spectroscopy (NIRS) and plant spectral reflectance as high-throughput field phenotyping methods for complex traits such as crop growth and yield.
Funding: Supported by SLU Grogrund (#SLU-LTV.2020.1.1.1-654) and an Einar and Inga Nilsson Foundation grant. J.I.y.S. was supported by grant PID2021-123718OB-I00 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe,” and by CEX2020-000999-S. R.R.V. was supported by Novo Nordisk Fonden (0074727) and SLU’s Centre for Biological Control. In addition, J.I.y.S. and J.F.-G. were supported by the Beatriz Galindo Program BEAGAL 18/00115.
Abstract: Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size. This was done while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when using Pearson’s correlation as a metric) is an active research area that is as yet far from optimal for various traits. We hypothesize that with ultra-high sizes of genotypic and phenotypic datasets, effective training population optimization methods and support from other omics approaches (transcriptomics, metabolomics and proteomics) coupled with deep-learning algorithms could overcome the boundaries of current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
Funding: Supported by the Agriculture and Food Research Initiative competitive grant (2021-67013-33833) from the USDA National Institute of Food and Agriculture, the Advanced Research Projects Agency-Energy program (DEAR0000826) of the Department of Energy, the National Science Foundation (IOS-1546657), the Iowa State University Raymond F. Baker Center for Plant Breeding, and the Iowa State University Plant Sciences Institute.
Abstract: Identifying mechanisms and pathways involved in gene–environment interplay and phenotypic plasticity is a long-standing challenge. It is highly desirable to establish an integrated framework with an environmental dimension for complex trait dissection and prediction. A critical step is to identify an environmental index that is both biologically relevant and estimable for new environments. With extensive field-observed complex traits, environmental profiles, and genome-wide single nucleotide polymorphisms for three major crops (maize, wheat, and oat), we demonstrated that identifying such an environmental index (i.e., a combination of environmental parameter and growth window) enables genome-wide association studies and genomic selection of complex traits to be conducted with an explicit environmental dimension. Interestingly, genes identified for two reaction-norm parameters (i.e., intercept and slope) derived from flowering time values along the environmental index were less colocalized for a diverse maize panel than for wheat and oat breeding panels, in agreement with the different diversity levels and genetic constitutions of the panels. In addition, we showcased the usefulness of this framework for systematically forecasting the performance of diverse germplasm panels in new environments. This general framework and the companion CERIS-JGRA analytical package should facilitate biologically informed dissection of complex traits, enhanced performance prediction in breeding for future climates, and coordinated efforts to enrich our understanding of mechanisms underlying phenotypic variation.
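The two reaction-norm parameters named in the abstract above (intercept and slope) can be illustrated with an ordinary per-genotype linear regression of a trait on an environmental index. The sketch below is only an assumption about the general idea, not the CERIS-JGRA implementation; the function name `reaction_norm` and the toy numbers are hypothetical.

```python
import numpy as np

def reaction_norm(env_index, phenotypes):
    """Fit phenotype = intercept + slope * env_index separately for each genotype.

    env_index  : shape (n_env,), one index value per environment
    phenotypes : shape (n_genotypes, n_env), e.g. flowering time
    Returns (intercepts, slopes), each of shape (n_genotypes,).
    """
    x = np.asarray(env_index, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    # np.polyfit fits each column of a 2-D y independently, so environments
    # must be rows; the highest-degree coefficient (slope) comes first.
    slopes, intercepts = np.polyfit(x, y.T, deg=1)
    return intercepts, slopes

# Toy data, made up for illustration: two genotypes in four environments.
env = [10.0, 15.0, 20.0, 25.0]
flowering = [[70.0, 65.0, 60.0, 55.0],   # genotype A
             [70.0, 60.0, 50.0, 40.0]]   # genotype B, more plastic
intercepts, slopes = reaction_norm(env, flowering)
print(np.round(intercepts, 3), np.round(slopes, 3))
```

In a framework of this kind, the fitted intercepts and slopes would then be treated as traits in their own right for association mapping or genomic prediction.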
Funding: This work was supported by the Collaborative Research Key Project between China and EU (2017YFE0111000), the National Natural Science Foundation of China (31971758, 31772656), and the Innovation Program of CAAS (ASTIP-IAS14).
Abstract: Alfalfa (Medicago sativa L.) is an important forage crop worldwide. However, little is known about the effects of breeding status and different geographical populations on alfalfa improvement. Here, we sequenced 220 alfalfa core germplasms and determined that Chinese alfalfa cultivars form an independent group, as evidenced by comparisons of FST values between different subgroups, suggesting that geographical origin plays an important role in group differentiation. By tracing the influence of geographical regions on the genetic diversity of alfalfa varieties in China, we identified 350 common candidate genetic regions and 548 genes under selection. We also defined 165 loci associated with 24 important traits from genome-wide association studies. Of those, 17 genomic regions closely associated with a given phenotype were under selection, with the underlying haplotypes showing significant differences between subgroups of distinct geographical origins. Based on results from expression analysis and association mapping, we propose that 6-phosphogluconolactonase (MsPGL) and a gene encoding a protein with NHL domains (MsNHL) are critical candidate genes for root growth. In conclusion, our results provide valuable information for alfalfa improvement via molecular breeding.
Funding: Supported by the National Basic Research Program of China (2011CB100100), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the National Natural Science Foundations (31391632, 31200943, and 31171187), the National High-tech R&D Program (863 Program) (2014AA10A601-5), the Natural Science Foundations of Jiangsu Province (BK2012261), the Natural Science Foundation of the Jiangsu Higher Education Institutions (14KJA210005), and the Innovative Research Team of Universities in Jiangsu Province.
Abstract: Recent advances in molecular genetics techniques have made dense marker maps available, and the prediction of breeding value at the genome level has been employed in genetics research. However, an increasingly large number of markers raises both statistical and computational issues in genomic selection (GS), and many methods have been developed for genomic prediction to address these problems, including ridge regression-best linear unbiased prediction (RR-BLUP), genomic best linear unbiased prediction, BayesA, BayesB, BayesCπ, and Bayesian LASSO. In this paper, these methods were compared regarding inference under different conditions, using real data from a wheat data set and simulated scenarios with a small number of quantitative trait loci (QTL) (20), a moderate number of QTL (60, 180), and an extreme number of QTL (540). This study showed that the genetic architecture of a trait should be fully considered when a GS method is chosen. If a small number of loci had a large effect on a trait, great differences were found between the predictive ability of the various methods, and BayesCπ was recommended. Although there was almost no significant difference between the predictive ability of BayesCπ and BayesB, BayesCπ is more feasible than BayesB for real data analysis. If a trait was controlled by a moderate number of genes, the absolute differences between the various methods were small, but BayesA was found to be the most accurate method. Furthermore, BayesA was widely adaptable and could perform well with different numbers of QTL. If a trait was controlled by an extreme number of minor genes, almost no significant differences were detected between the predictive ability of the various methods, but RR-BLUP slightly outperformed the others in both simulated scenarios and real data analysis, thus demonstrating its robustness and indicating that it was quite effective in this case.
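To make the RR-BLUP baseline in the abstract above concrete, the following is a minimal sketch of ridge regression on marker genotypes, with predictive ability measured as the Pearson correlation between observed and predicted phenotypes. It is not the study's implementation: the function names are hypothetical, the simulated data are made up, and the shrinkage parameter `lam` (conceptually the ratio of residual to marker-effect variance) is assumed to be given rather than estimated by REML as a full RR-BLUP analysis would do.

```python
import numpy as np

def rr_blup(Z, y, lam):
    """Ridge estimates for y = mu + Z u + e with a common penalty lam on all markers.

    Z : (n_individuals, n_markers) genotype matrix, e.g. coded 0/1/2
    Returns (mu, col_means, u_hat).
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                      # centre markers so mu is the phenotype mean
    mu = y.mean()
    # Dual form: an n x n solve, cheaper than p x p when markers outnumber individuals.
    K = Zc @ Zc.T + lam * np.eye(Z.shape[0])
    u_hat = Zc.T @ np.linalg.solve(K, y - mu)
    return mu, col_means, u_hat

def predict(Z_new, mu, col_means, u_hat):
    """Predict phenotypes for new genotypes using the training-set centring."""
    return mu + (np.asarray(Z_new, dtype=float) - col_means) @ u_hat

def predictive_ability(y_obs, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])

# Simulated example (assumed settings): 150 individuals, 400 markers, 20 QTL.
rng = np.random.default_rng(0)
Z = rng.integers(0, 3, size=(150, 400)).astype(float)
u = np.zeros(400)
u[:20] = rng.normal(size=20)
g = Z @ u
y = g + rng.normal(scale=g.std(), size=150)   # noise variance equal to genetic variance

train, valid = slice(0, 100), slice(100, 150)
mu, means, u_hat = rr_blup(Z[train], y[train], lam=100.0)  # lam chosen arbitrarily
print(predictive_ability(y[valid], predict(Z[valid], mu, means, u_hat)))
```

Because a single penalty shrinks every marker effect equally, this model suits traits controlled by many small-effect loci, which is consistent with the abstract's finding for the 540-QTL scenario; the Bayesian alternatives differ mainly in allowing marker-specific shrinkage.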
Funding: The research in this report was supported by the National Key Research and Development Program of China (2016YFD0101803), the National Key Basic Research Program of China (2014CB138206), the Agricultural Science and Technology Innovation Program of CAAS, and the Fundamental Research Funds for Central Non-Profit of Institute of Crop Sciences, CAAS (1610092016124). Research activities of CIMMYT staff have been supported by the Bill and Melinda Gates Foundation and the CGIAR Research Program MAIZE.
Abstract: Although long-term genetic gain has been achieved through increasing use of modern breeding methods and technologies, the rate of genetic gain needs to be accelerated to meet humanity’s demand for agricultural products. In this regard, genomic selection (GS) has been considered the most promising approach for genetic improvement of complex traits controlled by many genes, each with minor effects. Livestock scientists pioneered GS application, largely because of the significantly higher value of individual animals and the greater reduction in generation interval that GS can achieve. Large-scale application of GS in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and by developing optimum GS models that consider genotype-by-environment interaction and non-additive effects, along with significant cost reduction. Moreover, it would be more effective to integrate GS with other breeding tools and platforms to accelerate the breeding process and thereby further enhance genetic gain. In addition, establishing an open-source breeding network and developing transdisciplinary approaches would be essential for enhancing breeding efficiency in small- and medium-sized enterprises and agricultural research systems in developing countries. New strategies centered on GS for enhancing genetic gain need to be developed.