Abstract
To meet the needs of buckwheat quality evaluation and breeding, this study used the competitive adaptive reweighted sampling (CARS) algorithm to extract characteristic spectral variables and combined it with quantitative partial least squares (PLS) regression to rapidly determine the total flavonoid and protein contents of buckwheat leaves. First, the Kennard-Stone (KS) algorithm was used to split the samples into a training set and a test set. The mean, maximum, and minimum total flavonoid contents were 55.8, 92.5, and 28.1 mg·g^(-1) in the training set and 71.0, 99.8, and 31.5 mg·g^(-1) in the test set. The mean, maximum, and minimum protein contents were 169.6, 331.0, and 121.2 mg·g^(-1) in the training set and 158.2, 183.0, and 129.1 mg·g^(-1) in the test set. The spectra in the 4000 to 12000 cm^(-1) wavenumber range were then preprocessed with six schemes: normalization alone, normalization + multiplicative scatter correction (MSC), normalization + standard normal variate transform (SNV), normalization + first derivative, normalization + second derivative, and normalization + Savitzky-Golay (SG) smoothing. CARS was applied to the preprocessed spectra to extract the characteristic wavenumbers, and PLS was used to build the prediction models. The best models for predicting total flavonoids and protein in buckwheat were selected through a comprehensive analysis of the training-set coefficient of determination (R_(c)), the test-set coefficient of determination (R_(p)), the root mean square error of cross-validation (RMSECV), the root mean square error of prediction on the test set (RMSEP), and the residual predictive deviation (RPD). Three of the total flavonoid prediction models were usable. The best of them used 46 characteristic wavenumber points out of 1102, with normalization + first derivative preprocessing; its R_(c), R_(p), RMSECV, RMSEP, and RPD were 0.997, 0.933, 0.170, 0.829, and 2.893, respectively. Four of the protein prediction models were usable. The best of them used 42 characteristic wavenumber points, with normalization + second derivative preprocessing; its R_(c), R_(p), RMSECV, RMSEP, and RPD were 0.998, 0.965, 0.202, 0.353, and 3.849, respectively. The results show that applying the KS and CARS algorithms when building near-infrared spectroscopy models makes it possible to build reliable prediction models from relatively few samples, enables rapid determination of total flavonoids and protein in buckwheat leaves, and provides a powerful tool for breeding leaf-use buckwheat.
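The abstract names the steps but does not give formulas, so the following is only a minimal NumPy sketch of three of them: a Kennard-Stone split, the SNV transform, and the RMSEP/RPD metrics. The function names are illustrative and are not taken from the paper. Euclidean distance for KS and RPD computed as the sample standard deviation of the reference values divided by RMSEP are common choices assumed here, not definitions confirmed by the authors. CARS and PLS themselves are not covered.

```python
import numpy as np


def kennard_stone(X, n_train):
    """Split sample indices into (train, test) with the Kennard-Stone rule.

    Assumes Euclidean distance between spectra (rows of X).
    """
    X = np.asarray(X, dtype=float)
    if not 2 <= n_train <= len(X):
        raise ValueError("n_train must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all spectra.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Seed the training set with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_train:
        # Distance from each candidate to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the candidate that is farthest from the selected set.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining


def snv(X):
    """Standard normal variate: center and scale each spectrum (row)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


def rmse(y_true, y_pred):
    """Root mean square error; RMSEP when evaluated on the test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rpd(y_true, y_pred):
    """Residual predictive deviation, assumed here to be SD(reference) / RMSEP."""
    return float(np.std(np.asarray(y_true, dtype=float), ddof=1) / rmse(y_true, y_pred))
```

With a spectra matrix `X` (samples by wavenumber points), `train_idx, test_idx = kennard_stone(snv(X), n_train)` would give a split on the preprocessed spectra. Whether the authors split before or after preprocessing is not stated in the abstract.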
Authors
ZHU Li-wei (朱丽伟)
DU Qian-xi (杜千禧)
TANG Guo-hong (唐国红)
LI Hong-you (李洪有)
ZHANG Xiao-na (张晓娜)
CHEN Qing-fu (陈庆富)
SHI Tao-xiong (石桃雄)
Research Center of Buckwheat Industry Technology, College of Life Science, Guizhou Normal University, Guiyang 550001, China
Source
Spectroscopy and Spectral Analysis (《光谱学与光谱分析》)
Peking University Core Journal (北大核心)
2025, Issue 9, pp. 2585-2589 (5 pages)
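Funding
Earmarked Fund for the China Agriculture Research System (CARS-07-A5)
Guizhou Provincial Key Laboratory of Biological Breeding for Characteristic Minor Cereals (黔科合平台[2025]026)
Guizhou Normal University Academic New Seedling Project (黔科合平台人才11904/0520070)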
Keywords
Near infrared spectroscopy
Buckwheat
KS algorithm
CARS algorithm
Total flavonoids
Protein