Abstract: The authors regret that the original publication of this paper did not include Jawad Fayaz as a co-author. After further discussions and a thorough review of the research contributions, it was agreed that his significant contributions to the foundational aspects of the research warranted recognition, and he has now been added as a co-author.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (LDT23E06012E06), the National Key R&D Program of China (2023YFC3710800), the National Energy-Saving and Low-Carbon Materials Production and Application Demonstration Platform Program (TC220H06N), the Pioneer R&D Program of Zhejiang Province, China (2024SSYS0066, 2023C03016), the National Natural Science Foundation of China (42341208), and the Zhejiang Energy Group Research Fund (ZNKJ-2023-100).
Abstract: Converting CO2 with green hydrogen to methanol as a carbon-neutral liquid fuel is a promising route for the long-term storage and distribution of intermittent renewable energy. Nevertheless, identifying highly efficient methanol synthesis catalysts in the vast composition space remains a significant challenge. Here we present a machine learning framework for accelerating the development of high space-time yield (STY) methanol synthesis catalysts. A database of methanol synthesis catalysts has been compiled, consisting of catalyst composition, preparation parameters, structural characteristics, reaction conditions and the corresponding catalytic performance. A methodology for constructing catalyst features based on the intrinsic physicochemical properties of the catalyst components has been developed, which significantly reduced the data dimensionality and enhanced the efficiency of machine learning operations. Two high-precision machine learning models were trained to predict the activity and product selectivity of catalysts. Using this machine learning framework, an efficient search was achieved within the catalyst composition space, leading to the successful identification of high-STY multielement oxide methanol synthesis catalysts. Notably, the CuZnAlTi catalyst achieved high STYs of 0.49 and 0.65 g_MeOH/(g_catalyst·h) for CO2 and CO hydrogenation to methanol at 250 °C, respectively, and the STY was further increased to 2.63 g_MeOH/(g_catalyst·h) in CO and CO2 co-hydrogenation.
Funding: Supported by the Fine Particle Research Initiative in East Asia Considering National Differences Project through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (No. NRF-2023M3G1A1090660), and by a grant from the National Institute of Environmental Research (NIER), funded by the Ministry of Environment of the Republic of Korea (No. NIER-2023-04-02-056).
Abstract: PM1.0, particulate matter with an aerodynamic diameter smaller than 1.0 μm, can adversely affect human health. However, in South Korea fewer stations are capable of measuring PM1.0 concentrations in real time than PM2.5 and PM10 concentrations (only 9 locations for PM1.0 vs. 623 locations for PM2.5 or PM10), making it impossible to conduct a nationwide health risk analysis of PM1.0. Thus, this study aimed to develop a PM1.0 prediction model using a random forest algorithm based on PM1.0 data from the nine measurement stations and various environmental input factors. Cross validation, in which the model was trained on eight stations and tested on the remaining station, achieved an average R^2 of 0.913. The high R^2 value achieved under mutually exclusive training and test locations in the cross validation can be ascribed to the fact that all the locations had similar relationships between PM1.0 and the input factors, which were captured by the model. Moreover, feature importance analysis showed that PM2.5 and PM10 concentrations were the two most important input features in predicting PM1.0 concentration. Finally, the model was used to estimate the PM1.0 concentrations at the 623 locations where input factors such as PM2.5 and PM10 can be obtained. Based on the augmented profile, Seoul and Ansan were identified as PM1.0 concentration hotspots. These regions are large cities or centers of anthropogenic and industrial activities. The proposed model and the augmented PM1.0 profiles can be used in large epidemiological studies to understand the health impacts of PM1.0.
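The station-wise cross validation described in this abstract (train on all stations but one, test on the held-out station, average the per-station R^2) can be sketched as follows. This is only an illustration of the splitting scheme: the one-variable least-squares fit is a stand-in for the random forest used in the study, and the station names and readings are made up.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (assumed stand-in, not the random forest)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def leave_one_station_out(data):
    """data maps station name -> list of (pm25, pm1) pairs; returns R^2 per held-out station."""
    scores = {}
    for held_out in data:
        # Training set: every reading from every station except the held-out one.
        train = [pair for name, pairs in data.items() if name != held_out for pair in pairs]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        test = data[held_out]
        preds = [a * x + b for x, _ in test]
        scores[held_out] = r_squared([y for _, y in test], preds)
    return scores


# Synthetic readings for three hypothetical stations (not data from the study).
toy = {
    "station_a": [(10.0, 6.1), (20.0, 12.2), (30.0, 17.9)],
    "station_b": [(15.0, 9.0), (25.0, 15.1), (35.0, 21.0)],
    "station_c": [(12.0, 7.3), (22.0, 13.0), (32.0, 19.4)],
}
scores = leave_one_station_out(toy)
average_r2 = mean(scores.values())
print(scores, average_r2)
```

Because no reading from the held-out station is seen during fitting, a high average score under this scheme suggests that the relationship between the inputs and PM1.0 transfers across locations, which is the argument the abstract makes.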
Funding: Supported by the National Key Research and Development Program of China, 2023 Key Special Project (No. 2023YFC2907400), the National Natural Science Foundation of China (Grant No. 52104109), and the Natural Science Foundation of Hunan Province, China (No. 2022JJ40602).
Abstract: The accurate prediction of peak particle velocity (PPV) is essential for effectively managing blast-induced vibrations in mining operations. This study presents a novel PPV prediction method based on the social network search and LightGBM (SNS-LightGBM) deep gradient cooperative learning framework. The SNS algorithm enhances LightGBM’s learning process by optimizing hyperparameters through global search capabilities and balancing model complexity to improve generalization. To assess its performance, five baseline machine learning models and a hybrid model combining SNS-LightGBM were developed for comparison. The predictive performance of these models was evaluated using the coefficient of determination (R^2), mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared error (MSE), and root mean squared error (RMSE). The results indicate that the SNS-LightGBM model substantially improves both the accuracy and stability of PPV predictions. The SNS-LightGBM model outperformed all other models, achieving an R^2 of 0.975, MAE of 0.086, MAPE of 0.071, MSE of 0.019, and RMSE of 0.138. Additionally, a feature importance analysis revealed that distance and charge weight are the most significant factors influencing PPV, far surpassing other parameters. These findings offer valuable insights for improving the precision of blast vibration prediction and optimizing blasting designs.
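For reference, the five evaluation metrics named in this abstract can be computed with the standard definitions below. This is a minimal sketch on made-up values, not the authors' evaluation code; MAPE is returned as a fraction here, which is an assumption about how the reported 0.071 is expressed.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MAPE (as a fraction), MSE and RMSE for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the observed value, so every observation must be nonzero.
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    y_bar = sum(y_true) / n
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"r2": r2, "mae": mae, "mape": mape, "mse": mse, "rmse": rmse}


# Made-up observed and predicted values, for illustration only.
observed = [1.0, 2.0, 4.0]
predicted = [1.1, 1.9, 4.2]
metrics = regression_metrics(observed, predicted)
print(metrics)

# RMSE is the square root of MSE, so the two reported figures can be cross-checked.
reported_rmse_check = round(math.sqrt(0.019), 3)
print(reported_rmse_check)
```

The last two lines apply the identity RMSE = sqrt(MSE) to the reported MSE of 0.019, which rounds to the reported RMSE of 0.138.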
Funding: Supported by the National Natural Science Foundation of China (No. 22108052) and the High-end Chemicals and Cutting-edge New Materials Technology Innovation Center of Hefei (HCHC202309).
Abstract: The solid oxide electrolysis cell (SOEC) holds great promise for efficiently converting renewable energy into hydrogen. However, traditional modeling methods are limited to a specific or reported SOEC system. Therefore, four machine learning models are developed to predict the performance of SOEC processes of various types, operating parameters, and feed conditions. The impact of these features on the SOEC's outputs is explained by Shapley additive explanations and partial dependence plot analyses. The preferred model is integrated with a genetic algorithm to determine the optimal values of each input feature. Results show that the improved extreme gradient enhanced regression (XGBoost) algorithm is the core of the machine learning model of the process, since it has the highest R^2 (>0.95) for the three outputs. The electrolytic cell descriptors have the greatest impact on the system performance, contributing up to 54.5%. The effective area, voltage, and temperature are the three most influential factors in the SOEC system, contributing 21.6%, 16.6%, and 13.0% to its performance, respectively. High temperature, high pressure, and low effective area are the most favorable conditions for the H2 production rate. After multi-objective optimization, the optimal current intensity and hydrogen production rate were determined to be 1.61 A/cm^2 and 1.174 L/(h·cm^2), respectively.
Funding: Partially supported by the National Natural Science Foundation of China (grant number 62272044); the Ministry of Science and Technology of the People’s Republic of China with the STI2030-Major Projects (grant number 2021ZD0201900); the Teli Young Fellow Program from the Beijing Institute of Technology, China; the Grants-in-Aid for Scientific Research (grant number 20H00569) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan; the JSPS KAKENHI (grant number 20H00569), Japan; the JST Mirai Program (grant number 21473074), Japan; the JST MOONSHOT Program (grant number JPMJMS229B), Japan; and the BIT Research and Innovation Promoting Project (grant number 2023YCXZ014).
Abstract: Cardiovascular diseases are a prominent cause of mortality, emphasizing the need for early prevention and diagnosis. Utilizing artificial intelligence (AI) models, heart sound analysis emerges as a noninvasive and universally applicable approach for assessing cardiovascular health conditions. However, real-world medical data are dispersed across medical institutions, forming “data islands” due to data sharing limitations for security reasons. To this end, federated learning (FL), which can effectively model across multiple institutions, has been extensively employed in the medical field. Additionally, conventional supervised classification methods require fully labeled data classes, e.g., binary classification requires labeling of positive and negative samples. Nevertheless, the process of labeling healthcare data is time-consuming and labor-intensive, leading to the possibility of mislabeling negative samples. In this study, we validate an FL framework with a naive positive-unlabeled (PU) learning strategy. The semi-supervised FL model can directly learn from a limited set of positive samples and an extensive pool of unlabeled samples. Our emphasis is on vertical FL to enhance collaboration across institutions with different medical record feature spaces. Additionally, our contribution extends to feature importance analysis, where we explore 6 methods and provide practical recommendations for detecting abnormal heart sounds. The study demonstrated an accuracy of 84%, comparable to outcomes in supervised learning, thereby advancing the application of FL in abnormal heart sound detection.
Abstract: Landslides pose a significant threat to the lives and livelihoods of marginalised communities residing in rural areas and to the delicate ecological balance of the environment. Implementing advanced technologies is crucial for improving hazard risk assessment and enhancing preparedness measures in regions characterised by diverse topography and complex geological formations. Geospatial applications and modelling techniques have emerged as indispensable in mitigating landslide risks, particularly in environmentally sensitive areas. This study presents a comprehensive approach to landslide susceptibility mapping in the Nilgiri District of Tamil Nadu, India, leveraging Artificial Neural Networks (ANNs) and integrating multi-dimensional geospatial datasets. Integrating ANN-based modelling and geospatial techniques offers significant advantages in terms of statistical robustness, reproducibility, and the ability to quantitatively analyze the complex interplay of factors influencing landslide hazards. The methodology involves rigorous pre-processing and integration of spatial data, including landslide event occurrences as the dependent variable and ten independent parameters influencing landslide susceptibility. These parameters encompass elevation, slope aspect, slope degree, distance to roads, land use patterns, geomorphology, lithology, drainage density, lineament density, and rainfall distribution. Feature extraction and selection techniques are employed to effectively model the complex interactions between these factors and landslide occurrences. This process identifies the most relevant variables influencing landslide susceptibility, enhancing the model's predictive capabilities. The ANNs are trained using historical landslide occurrence data and the selected influencing factors, enabling the development of a robust and accurate landslide susceptibility model. The performance of the developed model is evaluated using a suite of metrics, including accuracy, precision, and the area under the Receiver Operating Characteristic (ROC) curve. Preliminary results indicate that the ANN-based landslide susceptibility model outperforms traditional zonation methods, demonstrating higher accuracy and reliability in predicting landslide-prone areas. The resulting Landslide Susceptibility Map (LSM) categorises the study area into five distinct hazard zones: very high (664.1 km^2), high (598.9 km^2), moderate (639.7 km^2), low (478.9 km^2), and very low (170.9 km^2). Notably, the eastern and central regions of the district emerge as particularly vulnerable to landslide occurrences. The study's findings have far-reaching implications for disaster risk reduction efforts, land-use planning, and sustainable development strategies in the ecologically sensitive Nilgiri District and beyond.
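The zone areas reported in this abstract can be turned into shares of the mapped area with a few lines of arithmetic. The sketch below uses only the five figures quoted above; the variable names are illustrative.

```python
# Hazard-zone areas in km^2, as quoted in the abstract.
zone_areas_km2 = {
    "very high": 664.1,
    "high": 598.9,
    "moderate": 639.7,
    "low": 478.9,
    "very low": 170.9,
}

# Total mapped area and each zone's share of it, in percent.
total_km2 = sum(zone_areas_km2.values())
shares = {zone: 100.0 * area / total_km2 for zone, area in zone_areas_km2.items()}
high_or_above = shares["very high"] + shares["high"]

for zone, share in shares.items():
    print(f"{zone}: {share:.1f}%")
print(f"total: {total_km2:.1f} km^2, high or very high: {high_or_above:.1f}%")
```

The five zones sum to 2552.5 km^2, and the high and very high zones together account for roughly half of that area.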