Abstract: A novel two-level group contribution method (GC-K) is presented for estimating the octanol-water partition coefficient (Kow) of chlorinated hydrocarbons. The equation requires only the normal boiling point and molecular weight of each compound. Group contribution parameters for 12 first-level groups and 7 second-level groups were obtained by correlating experimental Kow data for 57 compounds of three types. Comparing the first-level estimates with the two-level estimates showed that the latter are more accurate, owing to the added correction for proximity effects. Compared with Marrero's three-level group contribution approach and the atom-fragment contribution (AFC) method, GC-K is preferred: its average relative error using first-level groups is 7.20%.
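The two-level idea described above can be illustrated with a minimal Python sketch. It assumes a generic additive form (first-level group sum plus a second-level proximity correction) and does not reproduce the GC-K equation itself, whose boiling-point and molecular-weight terms are not given in the abstract. The group names, the contribution values, and the function names are hypothetical placeholders, not the fitted parameters.

```python
from typing import Dict, Sequence

# Hypothetical placeholder contributions (NOT the fitted GC-K parameters).
FIRST_LEVEL: Dict[str, float] = {"CH3": 0.5, "CH": 0.25, "Cl": 0.25}
# Hypothetical second-level group correcting for two nearby chlorine atoms.
SECOND_LEVEL: Dict[str, float] = {"Cl-C-Cl": -0.125}


def group_sum(counts: Dict[str, int], table: Dict[str, float]) -> float:
    """Sum of (number of occurrences x contribution) over the groups present."""
    return sum(n * table[group] for group, n in counts.items())


def two_level_estimate(first: Dict[str, int], second: Dict[str, int]) -> float:
    """First-level additive term plus the second-level proximity correction."""
    return group_sum(first, FIRST_LEVEL) + group_sum(second, SECOND_LEVEL)


def average_relative_error(predicted: Sequence[float],
                           observed: Sequence[float]) -> float:
    """Mean of |predicted - observed| / |observed|, in percent."""
    pairs = list(zip(predicted, observed))
    return 100.0 * sum(abs(p - o) / abs(o) for p, o in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical group counts for one molecule.
    first_counts = {"CH3": 1, "CH": 1, "Cl": 2}
    second_counts = {"Cl-C-Cl": 1}
    print(two_level_estimate(first_counts, {}))             # first level only
    print(two_level_estimate(first_counts, second_counts))  # with correction
```

Comparing `average_relative_error` for the first-level-only and two-level estimates over a data set is one way to quantify the improvement attributed to the proximity correction.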
Funding: Supported by the National Natural Science Foundation of China (22308037, 22378030), the National Natural Science Foundation for Excellent Young Scientists of China (22122802), the China Postdoctoral Science Foundation (2024T171135, 2024M754114), the Natural Science Foundation of Chongqing, China (Grant No. CSTB2022NSCQ-MSX0655), and the Chongqing Special Support Fund for Post Doctor (Grant No. 2022CQBSHTB3047).
Abstract: This study aims to improve existing quantitative structure-property relationship (QSPR) models for predicting the octanol-water partition coefficient (KOW), since accurate KOW predictions are crucial for assessing the environmental behavior and bioaccumulation potential of chemicals. Previous models have reported coefficients of determination (R²) between 0.9451 and 0.9681, and this work seeks to exceed those benchmarks. Three machine learning (ML) models are explored: feed-forward neural networks (FNN), extreme gradient boosting (XGBoost), and random forest (RF). Using a dataset of 14,610 solvents (14,580 after data cleaning) and 21 molecular descriptors derived from SMILES representations, the models are evaluated by R², mean absolute error (MAE), root mean squared error (RMSE), and mean relative error (MRE). The best model, the XGBoost-based QSPR, achieved an R² of 0.9772, surpassing the benchmarks set by prior models. Shapley additive explanation (SHAP) analysis is also employed for model interpretation; it reveals that the five most influential input features are SMR_VSA8, SMR_VSA3, Kappa2, HeavyAtomCount, and fr_furan. The study thus improves KOW prediction accuracy and enhances the interpretability of QSPR models.
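The four evaluation metrics named above can be sketched in plain Python as follows. The definitions are the standard textbook ones. The abstract does not state the exact MRE convention (fraction or percent), so the fractional form used here is an assumption, and the function name `regression_metrics` is illustrative.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R^2, MAE, RMSE, and MRE for paired observations and predictions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        # Assumed convention: relative error as a fraction of the true value.
        "MRE": sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
    }


if __name__ == "__main__":
    # Toy numbers for illustration only; not data from the study.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

Note that MRE is undefined when a true value is zero, which matters for log-scale targets that can cross zero.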
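Abstract: Using the quantitative structure-property relationship (QSPR) approach, QSPR models for predicting GC-RRT, Kow, and Sw were built from SEDs (Steric and Electronic Descriptors). The models were cross-validated with both the leave-one-out and leave-more-out methods, and were then used to predict these properties for PCBs that lack property data.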
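Leave-one-out cross-validation, as mentioned above, can be sketched in a few lines of Python. The abstract does not specify the regression form or list the SEDs, so this sketch assumes a single descriptor and an ordinary least-squares line as a stand-in. All function names are illustrative.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs: List[float], ys: List[float]) -> List[float]:
    """Predict each point from a model fitted on all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds


def q_squared(ys: Sequence[float], preds: Sequence[float]) -> float:
    """Cross-validated R^2: 1 - PRESS / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Toy descriptor and property values for illustration only.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(q_squared(ys, loo_predictions(xs, ys)))
```

A leave-more-out variant would hold out several points per fold instead of one; the loop structure stays the same.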