Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions.However,accurately predicting their undrained bearing capacity in layered so...Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions.However,accurately predicting their undrained bearing capacity in layered soils remains a complex challenge.This study presents a novel application of five ensemble machine(ML)algorithms-random forest(RF),gradient boosting machine(GBM),extreme gradient boosting(XGBoost),adaptive boosting(AdaBoost),and categorical boosting(CatBoost)-to predict the undrained bearing capacity factor(Nc)of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis(FELA).The input dataset consists of 1188 numerical simulations using the Tresca failure criterion,varying in geometrical and soil parameters.The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies.The ML models were trained on 70% of the dataset and tested on the remaining 30%.Their performance was evaluated using six statistical metrics:coefficient of determination(R²),mean absolute error(MAE),root mean squared error(RMSE),index of scatter(IOS),RMSE-to-standard deviation ratio(RSR),and variance explained factor(VAF).The results indicate that all the models achieved high accuracy,with R²values exceeding 97.6%and RMSE values below 0.02.Among them,AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets,demonstrating superior generalizability and robustness.The proposed ML framework offers an efficient,accurate,and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils.This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.展开更多
Non-technical losses(NTL)of electric power are a serious problem for electric distribution companies.The solution determines the cost,stability,reliability,and quality of the supplied electricity.The widespread use of...Non-technical losses(NTL)of electric power are a serious problem for electric distribution companies.The solution determines the cost,stability,reliability,and quality of the supplied electricity.The widespread use of advanced metering infrastructure(AMI)and Smart Grid allows all participants in the distribution grid to store and track electricity consumption.During the research,a machine learning model is developed that allows analyzing and predicting the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings.This model is an ensemble meta-algorithm(stacking)that generalizes the algorithms of random forest,LightGBM,and a homogeneous ensemble of artificial neural networks.The best accuracy of the proposed meta-algorithm in comparison to basic classifiers is experimentally confirmed on the test sample.Such a model,due to good accuracy indicators(ROC-AUC-0.88),can be used as a methodological basis for a decision support system,the purpose of which is to form a sample of suspected NTL sources.The use of such a sample will allow the top management of electric distribution companies to increase the efficiency of raids by performers,making them targeted and accurate,which should contribute to the fight against NTL and the sustainable development of the electric power industry.展开更多
Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decisio...Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decision making.It features parallel interconnected neural networks,high fault tolerance,robustness,autonomous learning capability,and ultralow energy dissipation.The algorithms of artificial neural network(ANN)have also been widely used because of their facile self-organization and self-learning capabilities,which mimic those of the human brain.To some extent,ANN reflects several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations.This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms.First,the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are particularly discussed.Second,the fabrication and research progress of neuromorphic devices are presented regarding to materials and structures.Furthermore,the fabrication of neuromorphic devices,including stand-alone neuromorphic devices,neuromorphic device arrays,and integrated neuromorphic systems,is discussed and demonstrated with reference to some respective studies.The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated.Finally,perspectives,suggestions,and potential solutions to the current challenges of neuromorphic devices are provided.展开更多
Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status ...Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status of land covers in Hung Yen province of Vietnam using Landsat 8 OLI satellite images,a free data source with reasonable spatial and temporal resolution.The results of the study show that all three algorithms presented good classification for five basic types of land cover including Rice land,Water bodies,Perennial vegetation,Annual vegetation,Built-up areas as their overall accuracy and Kappa coefficient were greater than 80%and 0.8,respectively.Among the three algorithms,SVM achieved the highest accuracy as its overall accuracy was 86%and the Kappa coefficient was 0.88.Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha,accounting for more than 33.8%of the total natural area,followed by Rice land and Perennial vegetation which cover an area of over 30,767 ha(33%)and 15,637 ha(16.8%),respectively.Water bodies and Annual vegetation cover the smallest areas with 8,820(9.5%)ha and 6,302 ha(6.8%),respectively.The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.展开更多
The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, how to address the problems posed by multiple variables, nonlinearities, and u...The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, how to address the problems posed by multiple variables, nonlinearities, and uncertainties during optimization remains a formidable challenge. In this study, a strategy combining interpretable machine learning with metaheuristic optimization algorithms is employed to optimize the reaction process. First, experimental data from a biodiesel production process are collected to establish a database. These data are then used to construct a predictive model based on artificial neural network (ANN) models. Subsequently, interpretable machine learning techniques are applied for quantitative analysis and verification of the model. Finally, four metaheuristic optimization algorithms are coupled with the ANN model to achieve the desired optimization. The research results show that the methanol: palm fatty acid distillate (PFAD) molar ratio contributes the most to the reaction outcome, accounting for 41%. The ANN-simulated annealing (SA) hybrid method is more suitable for this optimization, and the optimal process parameters are a catalyst concentration of 3.00% (mass), a methanol: PFAD molar ratio of 8.67, and a reaction time of 30 min. This study provides deeper insights into reaction process optimization, which will facilitate future applications in various reaction optimization processes.展开更多
While algorithms have been created for land usage in urban settings,there have been few investigations into the extraction of urban footprint(UF).To address this research gap,the study employs several widely used imag...While algorithms have been created for land usage in urban settings,there have been few investigations into the extraction of urban footprint(UF).To address this research gap,the study employs several widely used image classification method classified into three categories to evaluate their segmentation capabilities for extracting UF across eight cities.The results indicate that pixel-based methods only excel in clear urban environments,and their overall accuracy is not consistently high.RF and SVM perform well but lack stability in object-based UF extraction,influenced by feature selection and classifier performance.Deep learning enhances feature extraction but requires powerful computing and faces challenges with complex urban layouts.SAM excels in medium-sized urban areas but falters in intricate layouts.Integrating traditional and deep learning methods optimizes UF extraction,balancing accuracy and processing efficiency.Future research should focus on adapting algorithms for diverse urban landscapes to enhance UF extraction accuracy and applicability.展开更多
Sentiment Analysis,a significant domain within Natural Language Processing(NLP),focuses on extracting and interpreting subjective information-such as emotions,opinions,and attitudes-from textual data.With the increasi...Sentiment Analysis,a significant domain within Natural Language Processing(NLP),focuses on extracting and interpreting subjective information-such as emotions,opinions,and attitudes-from textual data.With the increasing volume of user-generated content on social media and digital platforms,sentiment analysis has become essential for deriving actionable insights across various sectors.This study presents a systematic literature review of sentiment analysis methodologies,encompassing traditional machine learning algorithms,lexicon-based approaches,and recent advancements in deep learning techniques.The review follows a structured protocol comprising three phases:planning,execution,and analysis/reporting.During the execution phase,67 peer-reviewed articles were initially retrieved,with 25 meeting predefined inclusion and exclusion criteria.The analysis phase involved a detailed examination of each study’s methodology,experimental setup,and key contributions.Among the deep learning models evaluated,Long Short-Term Memory(LSTM)networks were identified as the most frequently adopted architecture for sentiment classification tasks.This review highlights current trends,technical challenges,and emerging opportunities in the field,providing valuable guidance for future research and development in applications such as market analysis,public health monitoring,financial forecasting,and crisis management.展开更多
Due to the rapid advancement of information technology,data has emerged as the core resource driving decision-making and innovation across all industries.As the foundation of artificial intelligence,machine learning(M...Due to the rapid advancement of information technology,data has emerged as the core resource driving decision-making and innovation across all industries.As the foundation of artificial intelligence,machine learning(ML)has expanded its applications into intelligent recommendation systems,autonomous driving,medical diagnosis,and financial risk assessment.However,it relies on massive datasets,which contain sensitive personal information.Consequently,Privacy-Preserving Machine Learning(PPML)has become a critical research direction.To address the challenges of efficiency and accuracy in encrypted data computation within PPML,Homomorphic Encryption(HE)technology is a crucial solution,owing to its capability to facilitate computations on encrypted data.However,the integration of machine learning and homomorphic encryption technologies faces multiple challenges.Against this backdrop,this paper reviews homomorphic encryption technologies,with a focus on the advantages of the Cheon-Kim-Kim-Song(CKKS)algorithm in supporting approximate floating-point computations.This paper reviews the development of three machine learning techniques:K-nearest neighbors(KNN),K-means clustering,and face recognition-in integration with homomorphic encryption.It proposes feasible schemes for typical scenarios,summarizes limitations and future optimization directions.Additionally,it presents a systematic exploration of the integration of homomorphic encryption and machine learning from the essence of the technology,application implementation,performance trade-offs,technological convergence and future pathways to advance technological development.展开更多
High-entropy alloys(HEAs)have attracted considerable attention because of their excellent properties and broad compositional design space.However,traditional trial-and-error methods for screening HEAs are costly and i...High-entropy alloys(HEAs)have attracted considerable attention because of their excellent properties and broad compositional design space.However,traditional trial-and-error methods for screening HEAs are costly and inefficient,thereby limiting the development of new materials.Although density functional theory(DFT),molecular dynamics(MD),and thermodynamic modeling have improved the design efficiency,their indirect connection to properties has led to limitations in calculation and prediction.With the awarding of the Nobel Prize in Physics and Chemistry to artificial intelligence(AI)related researchers,there has been a renewed enthusiasm for the application of machine learning(ML)in the field of alloy materials.In this study,common and advanced ML models and strategies in HEA design were introduced,and the mechanism by which ML can play a role in composition optimization and performance prediction was investigated through case studies.The general workflow of ML application in material design was also introduced from the programmer’s point of view,including data preprocessing,feature engineering,model training,evaluation,optimization,and interpretability.Furthermore,data scarcity,multi-model coupling,and other challenges and opportunities at the current stage were analyzed,and an outlook on future research directions was provided.展开更多
This study aims to eliminate the subjectivity and inconsistency inherent in the traditional International Association of Drilling Contractors(IADC)bit wear rating process,which heavily depends on the experience of dri...This study aims to eliminate the subjectivity and inconsistency inherent in the traditional International Association of Drilling Contractors(IADC)bit wear rating process,which heavily depends on the experience of drilling engineers and often leads to unreliable results.Leveraging advancements in computer vision and deep learning algorithms,this research proposes an automated detection and classification method for polycrystalline diamond compact(PDC)bit damage.YOLOv10 was employed to locate the PDC bit cutters,followed by two SqueezeNet models to perform wear rating and wear type classifications.A comprehensive dataset was created based on the IADC dull bit evaluation standards.Additionally,this study discusses the necessity of data augmentation and finds that certain methods,such as cropping,splicing,and mixing,may reduce the accuracy of cutter detection.The experimental results demonstrate that the proposed method significantly enhances the accuracy of bit damage detection and classification while also providing substantial improvements in processing speed and computational efficiency,offering a valuable tool for optimizing drilling operations and reducing costs.展开更多
Deep Learning(DL)offers promising solutions for analyzing wearable signals and gaining valuable insights into cognitive disorders.While previous review studies have explored various aspects of DL in cognitive healthca...Deep Learning(DL)offers promising solutions for analyzing wearable signals and gaining valuable insights into cognitive disorders.While previous review studies have explored various aspects of DL in cognitive healthcare,there remains a lack of comprehensive analysis that integrates wearable signals,data processing techniques,and the broader applications,benefits,and challenges of DL methods.Addressing this limitation,our study provides an extensive review of DL’s role in cognitive healthcare,with a particular emphasis on wearables,data processing,and the inherent challenges in this field.This review also highlights the considerable promise of DL approaches in addressing a broad spectrum of cognitive issues.By enhancing the understanding and analysis of wearable signal modalities,DL models can achieve remarkable accuracy in cognitive healthcare.Convolutional Neural Network(CNN),Recurrent Neural Network(RNN),and Long Short-term Memory(LSTM)networks have demonstrated improved performance and effectiveness in the early diagnosis and progression monitoring of neurological disorders.Beyond cognitive impairment detection,DL has been applied to emotion recognition,sleep analysis,stress monitoring,and neurofeedback.These applications lead to advanced diagnosis,personalized treatment,early intervention,assistive technologies,remote monitoring,and reduced healthcare costs.Nevertheless,the integration of DL and wearable technologies presents several challenges,such as data quality,privacy,interpretability,model generalizability,ethical concerns,and clinical adoption.These challenges emphasize the importance of conducting future research in areas such as multimodal signal analysis and explainable AI.The findings of this review aim to benefit clinicians,healthcare professionals,and society by facilitating better patient outcomes in cognitive healthcare.展开更多
This paper explores the possibility of using machine learning algorithms to predict type 2 diabetes.We selected two commonly used classification models:random forest and logistic regression,modeled patients’clinical ...This paper explores the possibility of using machine learning algorithms to predict type 2 diabetes.We selected two commonly used classification models:random forest and logistic regression,modeled patients’clinical and lifestyle data,and compared their prediction performance.We found that the random forest model achieved the highest accuracy,demonstrated excellent classification results on the test set,and better distinguished between diabetic and non-diabetic patients by the confusion matrix and other evaluation metrics.The support vector machine and logistic regression perform slightly less well but achieve a high level of accuracy.The experimental results validate the effectiveness of the three machine learning algorithms,especially random forest,in the diabetes prediction task and provide useful practical experience for the intelligent prevention and control of chronic diseases.This study promotes the innovation of the diabetes prediction and management model,which is expected to alleviate the pressure on medical resources,reduce the burden of social health care,and improve the prognosis and quality of life of patients.In the future,we can consider expanding the data scale,exploring other machine learning algorithms,and integrating multimodal data to further realize the potential of artificial intelligence(AI)in the field of diabetes.展开更多
BACKGROUND Lotus plumule and its active components have demonstrated inhibitory effects on gastric cancer(GC).However,the molecular mechanism of lotus plumule against GC remains unclear and requires further investigat...BACKGROUND Lotus plumule and its active components have demonstrated inhibitory effects on gastric cancer(GC).However,the molecular mechanism of lotus plumule against GC remains unclear and requires further investigation.AIM To identify the key hub genes associated with the anti-GC effects of lotus plumule.METHODS This study investigated the potential targets of traditional Chinese medicine for inhibiting GC using weighted gene co-expression network analysis and bio-informatics.Initially,the active components and targets of the lotus plumule and the differentially expressed genes associated with GC were identified.Sub-sequently,a protein-protein interaction network was constructed to elucidate the interactions between drug targets and disease-related genes,facilitating the identification of hub genes within the network.The clinical significance of these hub genes was evaluated,and their upstream transcription factors and down-stream targets were identified.The binding ability of a hub gene with its down-stream targets was verified using molecular docking technology.Finally,molecular docking was performed to evaluate the binding affinity between the active ingredients of lotus plumule and the hub gene.RESULTS This study identified 26 genes closely associated with GC.Machine learning analysis and external validation narrowed the list to four genes:Aldo-keto reductase family 1 member B10,fructose-bisphosphatase 1,protein arginine methyltransferase 1,and carbonic anhydrase 9.These genes indicated a strong correlation with anti-GC activity.CONCLUSION Lotus plumule exhibits anti-GC effects.This study identified four hub genes with potential as novel targets for diagnosing and treating GC,providing innovative perspectives for its clinical management.展开更多
Bifunctional oxide-zeolite-based composites(OXZEO)have emerged as promising materials for the direct conversion of syngas to olefins.However,experimental screening and optimization of reaction parameters remain resour...Bifunctional oxide-zeolite-based composites(OXZEO)have emerged as promising materials for the direct conversion of syngas to olefins.However,experimental screening and optimization of reaction parameters remain resource-intensive.To address this challenge,we implemented a three-stage framework integrating machine learning,Bayesian optimization,and experimental validation,utilizing a carefully curated dataset from the literature.Our ensemble-tree model(R^(2)>0.87)identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems,with their light olefin space-time yields confirmed by physically mixing with HSAPO-34 through experimental validation.Density functional theory calculations further elucidated the activity trends between Zn-Zr and Cu-Mg mixed oxides.Among 16 catalyst and reaction condition descriptors,the oxide/zeolite ratio,reaction temperature,and pressure emerged as the most significant factors.This interpretable,data-driven framework offers a versatile approach that can be applied to other catalytic processes,providing a powerful tool for experiment design and optimization in catalysis.展开更多
Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan containing 93,925 samples per feature(seven well logs and one facies log) were used to classify four facies. Data preprocessing ...Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan containing 93,925 samples per feature(seven well logs and one facies log) were used to classify four facies. Data preprocessing and preparation involve two processes: data cleaning and feature scaling. Several machine learning algorithms, including Linear Regression(LR), Decision Tree(DT), Support Vector Machine(SVM),Random Forest(RF), and Gradient Boosting(GB) for classification, were tested using different iterations and various combinations of features and parameters. The support vector radial kernel training model achieved an accuracy of 72.49% without grid search and 64.02% with grid search, while the blind-well test scores were 71.01% and 69.67%, respectively. The Decision Tree(DT) Hyperparameter Optimization model showed an accuracy of 64.15% for training and 67.45% for testing. In comparison, the Decision Tree coupled with grid search yielded better results, with a training score of 69.91% and a testing score of67.89%. The model's validation was carried out using the blind well validation approach, which achieved an accuracy of 69.81%. Three algorithms were used to generate the gradient-boosting model. During training, the Gradient Boosting classifier achieved an accuracy score of 71.57%, and during testing, it achieved 69.89%. The Grid Search model achieved a higher accuracy score of 72.14% during testing. The Extreme Gradient Boosting model had the lowest accuracy score, with only 66.13% for training and66.12% for testing. For validation, the Gradient Boosting(GB) classifier model achieved an accuracy score of 75.41% on the blind well test, while the Gradient Boosting with Grid Search achieved an accuracy score of 71.36%. The Enhanced Random Forest and Random Forest with Bagging algorithms were the most effective, with validation accuracies of 78.30% and 79.18%, respectively. However, the Random Forest and Random Forest with Grid Search models displayed significant variance between their training and testing scores, indicating the potential for overfitting. Random Forest(RF) and Gradient Boosting(GB) are highly effective for facies classification because they handle complex relationships and provide high predictive accuracy. The choice between the two depends on specific project requirements, including interpretability, computational resources, and data nature.展开更多
Low Earth orbit(LEO)satellite networks exhibit distinct characteristics,e.g.,limited resources of individual satellite nodes and dynamic network topology,which have brought many challenges for routing algorithms.To sa...Low Earth orbit(LEO)satellite networks exhibit distinct characteristics,e.g.,limited resources of individual satellite nodes and dynamic network topology,which have brought many challenges for routing algorithms.To satisfy quality of service(QoS)requirements of various users,it is critical to research efficient routing strategies to fully utilize satellite resources.This paper proposes a multi-QoS information optimized routing algorithm based on reinforcement learning for LEO satellite networks,which guarantees high level assurance demand services to be prioritized under limited satellite resources while considering the load balancing performance of the satellite networks for low level assurance demand services to ensure the full and effective utilization of satellite resources.An auxiliary path search algorithm is proposed to accelerate the convergence of satellite routing algorithm.Simulation results show that the generated routing strategy can timely process and fully meet the QoS demands of high assurance services while effectively improving the load balancing performance of the link.展开更多
The distributed permutation flow shop scheduling problem(DPFSP)has received increasing attention in recent years.The iterated greedy algorithm(IGA)serves as a powerful optimizer for addressing such a problem because o...The distributed permutation flow shop scheduling problem(DPFSP)has received increasing attention in recent years.The iterated greedy algorithm(IGA)serves as a powerful optimizer for addressing such a problem because of its straightforward,single-solution evolution framework.However,a potential draw-back of IGA is the lack of utilization of historical information,which could lead to an imbalance between exploration and exploitation,especially in large-scale DPFSPs.As a consequence,this paper develops an IGA with memory and learning mechanisms(MLIGA)to efficiently solve the DPFSP targeted at the mini-malmakespan.InMLIGA,we incorporate a memory mechanism to make a more informed selection of the initial solution at each stage of the search,by extending,reconstructing,and reinforcing the information from previous solutions.In addition,we design a twolayer cooperative reinforcement learning approach to intelligently determine the key parameters of IGA and the operations of the memory mechanism.Meanwhile,to ensure that the experience generated by each perturbation operator is fully learned and to reduce the prior parameters of MLIGA,a probability curve-based acceptance criterion is proposed by combining a cube root function with custom rules.At last,a discrete adaptive learning rate is employed to enhance the stability of the memory and learningmechanisms.Complete ablation experiments are utilized to verify the effectiveness of the memory mechanism,and the results show that this mechanism is capable of improving the performance of IGA to a large extent.Furthermore,through comparative experiments involving MLIGA and five state-of-the-art algorithms on 720 benchmarks,we have discovered that MLI-GA demonstrates significant potential for solving large-scale DPFSPs.This indicates that MLIGA is well-suited for real-world distributed flow shop scheduling.展开更多
Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide.Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research probl...Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide.Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem.Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables.In this study,we propose a machine learning algorithm for carbon emissions,a Bayesian optimized XGboost regression model,using multi-year energy carbon emission data and nighttime lights(NTL)remote sensing data from Shaanxi Province,China.Our results demonstrate that the XGboost algorithm outperforms linear regression and four other machine learning models,with an R^(2)of 0.906 and RMSE of 5.687.We observe an annual increase in carbon emissions,with high-emission counties primarily concentrated in northern and central Shaanxi Province,displaying a shift from discrete,sporadic points to contiguous,extended spatial distribution.Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns,with economically developed counties showing high-emission clustering and economically relatively backward counties displaying low-emission clustering.Our findings show that the use of NTL data and the XGboost algorithm can estimate and predict carbon emissionsmore accurately and provide a complementary reference for satellite remote sensing image data to serve carbon emission monitoring and assessment.This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.展开更多
Deep learning algorithm is an effective data mining method and has been used in many fields to solve practical problems.However,the deep learning algorithms often contain some hyper-parameters which may be continuous,...Deep learning algorithm is an effective data mining method and has been used in many fields to solve practical problems.However,the deep learning algorithms often contain some hyper-parameters which may be continuous,integer,or mixed,and are often given based on experience but largely affect the effectiveness of activity recognition.In order to adapt to different hyper-parameter optimization problems,our improved Cuckoo Search(CS)algorithm is proposed to optimize the mixed hyper-parameters in deep learning algorithm.The algorithm optimizes the hyper-parameters in the deep learning model robustly,and intelligently selects the combination of integer type and continuous hyper-parameters that make the model optimal.Then,the mixed hyper-parameter in Convolutional Neural Network(CNN),Long-Short-Term Memory(LSTM)and CNN-LSTM are optimized based on the methodology on the smart home activity recognition datasets.Results show that the methodology can improve the performance of the deep learning model and whether we are experienced or not,we can get a better deep learning model using our method.展开更多
DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we presen...DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we present Fs-LSA (F-score based Learning Search Algorithm), a novel gene selection algorithm designed to enhance the precision and efficiency of target gene identification from microarray data for cancer classification. This algorithm is divided into two phases: the first leverages F-score values to prioritize and select feature genes with the most significant differential expression;the second phase introduces our Learning Search Algorithm (LSA), which harnesses swarm intelligence to identify the optimal subset among the remaining genes. Inspired by human social learning, LSA integrates historical data and collective intelligence for a thorough search, with a dynamic control mechanism that balances exploration and refinement, thereby enhancing the gene selection process. We conducted a rigorous validation of Fs-LSA’s performance using eight publicly available cancer microarray expression datasets. Fs-LSA achieved accuracy, precision, sensitivity, and F1-score values of 0.9932, 0.9923, 0.9962, and 0.994, respectively. Comparative analyses with state-of-the-art algorithms revealed Fs-LSA’s superior performance in terms of simplicity and efficiency. Additionally, we validated the algorithm’s efficacy independently using glioblastoma data from GEO and TCGA databases. It was significantly superior to those of the comparison algorithms. Importantly, the driver genes identified by Fs-LSA were instrumental in developing a predictive model as an independent prognostic indicator for glioblastoma, underscoring Fs-LSA’s transformative potential in genomics and personalized medicine.展开更多
文摘Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions.However,accurately predicting their undrained bearing capacity in layered soils remains a complex challenge.This study presents a novel application of five ensemble machine(ML)algorithms-random forest(RF),gradient boosting machine(GBM),extreme gradient boosting(XGBoost),adaptive boosting(AdaBoost),and categorical boosting(CatBoost)-to predict the undrained bearing capacity factor(Nc)of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis(FELA).The input dataset consists of 1188 numerical simulations using the Tresca failure criterion,varying in geometrical and soil parameters.The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies.The ML models were trained on 70% of the dataset and tested on the remaining 30%.Their performance was evaluated using six statistical metrics:coefficient of determination(R²),mean absolute error(MAE),root mean squared error(RMSE),index of scatter(IOS),RMSE-to-standard deviation ratio(RSR),and variance explained factor(VAF).The results indicate that all the models achieved high accuracy,with R²values exceeding 97.6%and RMSE values below 0.02.Among them,AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets,demonstrating superior generalizability and robustness.The proposed ML framework offers an efficient,accurate,and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils.This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
文摘Non-technical losses(NTL)of electric power are a serious problem for electric distribution companies.The solution determines the cost,stability,reliability,and quality of the supplied electricity.The widespread use of advanced metering infrastructure(AMI)and Smart Grid allows all participants in the distribution grid to store and track electricity consumption.During the research,a machine learning model is developed that allows analyzing and predicting the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings.This model is an ensemble meta-algorithm(stacking)that generalizes the algorithms of random forest,LightGBM,and a homogeneous ensemble of artificial neural networks.The best accuracy of the proposed meta-algorithm in comparison to basic classifiers is experimentally confirmed on the test sample.Such a model,due to good accuracy indicators(ROC-AUC-0.88),can be used as a methodological basis for a decision support system,the purpose of which is to form a sample of suspected NTL sources.The use of such a sample will allow the top management of electric distribution companies to increase the efficiency of raids by performers,making them targeted and accurate,which should contribute to the fight against NTL and the sustainable development of the electric power industry.
基金financially supported by the National Natural Science Foundation of China(No.52073031)the National Key Research and Development Program of China(Nos.2023YFB3208102,2021YFB3200304)+4 种基金the China National Postdoctoral Program for Innovative Talents(No.BX2021302)the Beijing Nova Program(Nos.Z191100001119047,Z211100002121148)the Fundamental Research Funds for the Central Universities(No.E0EG6801X2)the‘Hundred Talents Program’of the Chinese Academy of Sciencesthe BrainLink program funded by the MSIT through the NRF of Korea(No.RS-2023-00237308).
文摘Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decision making.It features parallel interconnected neural networks,high fault tolerance,robustness,autonomous learning capability,and ultralow energy dissipation.The algorithms of artificial neural network(ANN)have also been widely used because of their facile self-organization and self-learning capabilities,which mimic those of the human brain.To some extent,ANN reflects several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations.This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms.First,the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are particularly discussed.Second,the fabrication and research progress of neuromorphic devices are presented regarding to materials and structures.Furthermore,the fabrication of neuromorphic devices,including stand-alone neuromorphic devices,neuromorphic device arrays,and integrated neuromorphic systems,is discussed and demonstrated with reference to some respective studies.The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated.Finally,perspectives,suggestions,and potential solutions to the current challenges of neuromorphic devices are provided.
文摘Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status of land covers in Hung Yen province of Vietnam using Landsat 8 OLI satellite images,a free data source with reasonable spatial and temporal resolution.The results of the study show that all three algorithms presented good classification for five basic types of land cover including Rice land,Water bodies,Perennial vegetation,Annual vegetation,Built-up areas as their overall accuracy and Kappa coefficient were greater than 80%and 0.8,respectively.Among the three algorithms,SVM achieved the highest accuracy as its overall accuracy was 86%and the Kappa coefficient was 0.88.Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha,accounting for more than 33.8%of the total natural area,followed by Rice land and Perennial vegetation which cover an area of over 30,767 ha(33%)and 15,637 ha(16.8%),respectively.Water bodies and Annual vegetation cover the smallest areas with 8,820(9.5%)ha and 6,302 ha(6.8%),respectively.The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
基金supported by the National Natural Science Foundation of China(22408227,22238005)the Postdoctoral Research Foundation of China(GZC20231576).
文摘The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, how to address the problems posed by multiple variables, nonlinearities, and uncertainties during optimization remains a formidable challenge. In this study, a strategy combining interpretable machine learning with metaheuristic optimization algorithms is employed to optimize the reaction process. First, experimental data from a biodiesel production process are collected to establish a database. These data are then used to construct a predictive model based on artificial neural network (ANN) models. Subsequently, interpretable machine learning techniques are applied for quantitative analysis and verification of the model. Finally, four metaheuristic optimization algorithms are coupled with the ANN model to achieve the desired optimization. The research results show that the methanol: palm fatty acid distillate (PFAD) molar ratio contributes the most to the reaction outcome, accounting for 41%. The ANN-simulated annealing (SA) hybrid method is more suitable for this optimization, and the optimal process parameters are a catalyst concentration of 3.00% (mass), a methanol: PFAD molar ratio of 8.67, and a reaction time of 30 min. This study provides deeper insights into reaction process optimization, which will facilitate future applications in various reaction optimization processes.
文摘While algorithms have been created for land usage in urban settings,there have been few investigations into the extraction of urban footprint(UF).To address this research gap,the study employs several widely used image classification method classified into three categories to evaluate their segmentation capabilities for extracting UF across eight cities.The results indicate that pixel-based methods only excel in clear urban environments,and their overall accuracy is not consistently high.RF and SVM perform well but lack stability in object-based UF extraction,influenced by feature selection and classifier performance.Deep learning enhances feature extraction but requires powerful computing and faces challenges with complex urban layouts.SAM excels in medium-sized urban areas but falters in intricate layouts.Integrating traditional and deep learning methods optimizes UF extraction,balancing accuracy and processing efficiency.Future research should focus on adapting algorithms for diverse urban landscapes to enhance UF extraction accuracy and applicability.
基金supported by the“Technology Commercialization Collaboration Platform Construction”project of the Innopolis Foundation(Project Number:2710033536)the Competitive Research Fund of The University of Aizu,Japan.
文摘Sentiment Analysis,a significant domain within Natural Language Processing(NLP),focuses on extracting and interpreting subjective information-such as emotions,opinions,and attitudes-from textual data.With the increasing volume of user-generated content on social media and digital platforms,sentiment analysis has become essential for deriving actionable insights across various sectors.This study presents a systematic literature review of sentiment analysis methodologies,encompassing traditional machine learning algorithms,lexicon-based approaches,and recent advancements in deep learning techniques.The review follows a structured protocol comprising three phases:planning,execution,and analysis/reporting.During the execution phase,67 peer-reviewed articles were initially retrieved,with 25 meeting predefined inclusion and exclusion criteria.The analysis phase involved a detailed examination of each study’s methodology,experimental setup,and key contributions.Among the deep learning models evaluated,Long Short-Term Memory(LSTM)networks were identified as the most frequently adopted architecture for sentiment classification tasks.This review highlights current trends,technical challenges,and emerging opportunities in the field,providing valuable guidance for future research and development in applications such as market analysis,public health monitoring,financial forecasting,and crisis management.
基金supported by the fllowing projects:Natural Science Foundation of China under Grant 62172436Self-Initiated Scientific Research Project of the Chinese People's Armed Police Force under Grant ZZKY20243129Basic Frontier Innovation Project of the Engineering University of the Chinese People's Armed Police Force under Grant WJY202421.
文摘Due to the rapid advancement of information technology,data has emerged as the core resource driving decision-making and innovation across all industries.As the foundation of artificial intelligence,machine learning(ML)has expanded its applications into intelligent recommendation systems,autonomous driving,medical diagnosis,and financial risk assessment.However,it relies on massive datasets,which contain sensitive personal information.Consequently,Privacy-Preserving Machine Learning(PPML)has become a critical research direction.To address the challenges of efficiency and accuracy in encrypted data computation within PPML,Homomorphic Encryption(HE)technology is a crucial solution,owing to its capability to facilitate computations on encrypted data.However,the integration of machine learning and homomorphic encryption technologies faces multiple challenges.Against this backdrop,this paper reviews homomorphic encryption technologies,with a focus on the advantages of the Cheon-Kim-Kim-Song(CKKS)algorithm in supporting approximate floating-point computations.This paper reviews the development of three machine learning techniques:K-nearest neighbors(KNN),K-means clustering,and face recognition-in integration with homomorphic encryption.It proposes feasible schemes for typical scenarios,summarizes limitations and future optimization directions.Additionally,it presents a systematic exploration of the integration of homomorphic encryption and machine learning from the essence of the technology,application implementation,performance trade-offs,technological convergence and future pathways to advance technological development.
基金the National Natural Science Foundation of China(52161011)the Central Guiding Local Science and Technology Development Fund Project(Guike ZY23055005,Guike ZY24212036 and GuikeAB25069457)+5 种基金the Guangxi Science and Technology Project(2023GXNSFDA026046 and Guike AB24010247)the Scientifc Research and Technology Development Program of Guilin(20220110-3 and 20230110-3)the Scientifc Research and Technology Development Program of Nanning Jiangnan district(20230715-02)the Guangxi Key Laboratory of Superhard Material(2022-K-001)the Guangxi Key Laboratory of Information Materials(231003-Z,231033-K and 231013-Z)the Innovation Project of GUET Graduate Education(2025YCXS177)for the fnancial support given to this work.
文摘High-entropy alloys(HEAs)have attracted considerable attention because of their excellent properties and broad compositional design space.However,traditional trial-and-error methods for screening HEAs are costly and inefficient,thereby limiting the development of new materials.Although density functional theory(DFT),molecular dynamics(MD),and thermodynamic modeling have improved the design efficiency,their indirect connection to properties has led to limitations in calculation and prediction.With the awarding of the Nobel Prize in Physics and Chemistry to artificial intelligence(AI)related researchers,there has been a renewed enthusiasm for the application of machine learning(ML)in the field of alloy materials.In this study,common and advanced ML models and strategies in HEA design were introduced,and the mechanism by which ML can play a role in composition optimization and performance prediction was investigated through case studies.The general workflow of ML application in material design was also introduced from the programmer’s point of view,including data preprocessing,feature engineering,model training,evaluation,optimization,and interpretability.Furthermore,data scarcity,multi-model coupling,and other challenges and opportunities at the current stage were analyzed,and an outlook on future research directions was provided.
基金support of the CNPC International Collaborative Research Project(No.2022DQ0410)。
文摘This study aims to eliminate the subjectivity and inconsistency inherent in the traditional International Association of Drilling Contractors(IADC)bit wear rating process,which heavily depends on the experience of drilling engineers and often leads to unreliable results.Leveraging advancements in computer vision and deep learning algorithms,this research proposes an automated detection and classification method for polycrystalline diamond compact(PDC)bit damage.YOLOv10 was employed to locate the PDC bit cutters,followed by two SqueezeNet models to perform wear rating and wear type classifications.A comprehensive dataset was created based on the IADC dull bit evaluation standards.Additionally,this study discusses the necessity of data augmentation and finds that certain methods,such as cropping,splicing,and mixing,may reduce the accuracy of cutter detection.The experimental results demonstrate that the proposed method significantly enhances the accuracy of bit damage detection and classification while also providing substantial improvements in processing speed and computational efficiency,offering a valuable tool for optimizing drilling operations and reducing costs.
基金the Asian Institute of Technology,Khlong Nueng,Thailand for their support in carrying out this study。
文摘Deep Learning(DL)offers promising solutions for analyzing wearable signals and gaining valuable insights into cognitive disorders.While previous review studies have explored various aspects of DL in cognitive healthcare,there remains a lack of comprehensive analysis that integrates wearable signals,data processing techniques,and the broader applications,benefits,and challenges of DL methods.Addressing this limitation,our study provides an extensive review of DL’s role in cognitive healthcare,with a particular emphasis on wearables,data processing,and the inherent challenges in this field.This review also highlights the considerable promise of DL approaches in addressing a broad spectrum of cognitive issues.By enhancing the understanding and analysis of wearable signal modalities,DL models can achieve remarkable accuracy in cognitive healthcare.Convolutional Neural Network(CNN),Recurrent Neural Network(RNN),and Long Short-term Memory(LSTM)networks have demonstrated improved performance and effectiveness in the early diagnosis and progression monitoring of neurological disorders.Beyond cognitive impairment detection,DL has been applied to emotion recognition,sleep analysis,stress monitoring,and neurofeedback.These applications lead to advanced diagnosis,personalized treatment,early intervention,assistive technologies,remote monitoring,and reduced healthcare costs.Nevertheless,the integration of DL and wearable technologies presents several challenges,such as data quality,privacy,interpretability,model generalizability,ethical concerns,and clinical adoption.These challenges emphasize the importance of conducting future research in areas such as multimodal signal analysis and explainable AI.The findings of this review aim to benefit clinicians,healthcare professionals,and society by facilitating better patient outcomes in cognitive healthcare.
文摘This paper explores the possibility of using machine learning algorithms to predict type 2 diabetes.We selected two commonly used classification models:random forest and logistic regression,modeled patients’clinical and lifestyle data,and compared their prediction performance.We found that the random forest model achieved the highest accuracy,demonstrated excellent classification results on the test set,and better distinguished between diabetic and non-diabetic patients by the confusion matrix and other evaluation metrics.The support vector machine and logistic regression perform slightly less well but achieve a high level of accuracy.The experimental results validate the effectiveness of the three machine learning algorithms,especially random forest,in the diabetes prediction task and provide useful practical experience for the intelligent prevention and control of chronic diseases.This study promotes the innovation of the diabetes prediction and management model,which is expected to alleviate the pressure on medical resources,reduce the burden of social health care,and improve the prognosis and quality of life of patients.In the future,we can consider expanding the data scale,exploring other machine learning algorithms,and integrating multimodal data to further realize the potential of artificial intelligence(AI)in the field of diabetes.
基金Supported by Ningxia Key Research and Development Program,No.2023BEG02015Talent Development Projects of Young Qihuang of National Administration of Traditional Chinese Medicine(2020).
文摘BACKGROUND Lotus plumule and its active components have demonstrated inhibitory effects on gastric cancer(GC).However,the molecular mechanism of lotus plumule against GC remains unclear and requires further investigation.AIM To identify the key hub genes associated with the anti-GC effects of lotus plumule.METHODS This study investigated the potential targets of traditional Chinese medicine for inhibiting GC using weighted gene co-expression network analysis and bio-informatics.Initially,the active components and targets of the lotus plumule and the differentially expressed genes associated with GC were identified.Sub-sequently,a protein-protein interaction network was constructed to elucidate the interactions between drug targets and disease-related genes,facilitating the identification of hub genes within the network.The clinical significance of these hub genes was evaluated,and their upstream transcription factors and down-stream targets were identified.The binding ability of a hub gene with its down-stream targets was verified using molecular docking technology.Finally,molecular docking was performed to evaluate the binding affinity between the active ingredients of lotus plumule and the hub gene.RESULTS This study identified 26 genes closely associated with GC.Machine learning analysis and external validation narrowed the list to four genes:Aldo-keto reductase family 1 member B10,fructose-bisphosphatase 1,protein arginine methyltransferase 1,and carbonic anhydrase 9.These genes indicated a strong correlation with anti-GC activity.CONCLUSION Lotus plumule exhibits anti-GC effects.This study identified four hub genes with potential as novel targets for diagnosing and treating GC,providing innovative perspectives for its clinical management.
基金funded by the KRICT Project (KK2512-10) of the Korea Research Institute of Chemical Technology and the Ministry of Trade, Industry and Energy (MOTIE)the Korea Institute for Advancement of Technology (KIAT) through the Virtual Engineering Platform Program (P0022334)+1 种基金supported by the Carbon Neutral Industrial Strategic Technology Development Program (RS-202300261088) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea)Further support was provided by research fund of Chungnam National University。
文摘Bifunctional oxide-zeolite-based composites(OXZEO)have emerged as promising materials for the direct conversion of syngas to olefins.However,experimental screening and optimization of reaction parameters remain resource-intensive.To address this challenge,we implemented a three-stage framework integrating machine learning,Bayesian optimization,and experimental validation,utilizing a carefully curated dataset from the literature.Our ensemble-tree model(R^(2)>0.87)identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems,with their light olefin space-time yields confirmed by physically mixing with HSAPO-34 through experimental validation.Density functional theory calculations further elucidated the activity trends between Zn-Zr and Cu-Mg mixed oxides.Among 16 catalyst and reaction condition descriptors,the oxide/zeolite ratio,reaction temperature,and pressure emerged as the most significant factors.This interpretable,data-driven framework offers a versatile approach that can be applied to other catalytic processes,providing a powerful tool for experiment design and optimization in catalysis.
文摘Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan containing 93,925 samples per feature(seven well logs and one facies log) were used to classify four facies. Data preprocessing and preparation involve two processes: data cleaning and feature scaling. Several machine learning algorithms, including Linear Regression(LR), Decision Tree(DT), Support Vector Machine(SVM),Random Forest(RF), and Gradient Boosting(GB) for classification, were tested using different iterations and various combinations of features and parameters. The support vector radial kernel training model achieved an accuracy of 72.49% without grid search and 64.02% with grid search, while the blind-well test scores were 71.01% and 69.67%, respectively. The Decision Tree(DT) Hyperparameter Optimization model showed an accuracy of 64.15% for training and 67.45% for testing. In comparison, the Decision Tree coupled with grid search yielded better results, with a training score of 69.91% and a testing score of67.89%. The model's validation was carried out using the blind well validation approach, which achieved an accuracy of 69.81%. Three algorithms were used to generate the gradient-boosting model. During training, the Gradient Boosting classifier achieved an accuracy score of 71.57%, and during testing, it achieved 69.89%. The Grid Search model achieved a higher accuracy score of 72.14% during testing. The Extreme Gradient Boosting model had the lowest accuracy score, with only 66.13% for training and66.12% for testing. For validation, the Gradient Boosting(GB) classifier model achieved an accuracy score of 75.41% on the blind well test, while the Gradient Boosting with Grid Search achieved an accuracy score of 71.36%. The Enhanced Random Forest and Random Forest with Bagging algorithms were the most effective, with validation accuracies of 78.30% and 79.18%, respectively. However, the Random Forest and Random Forest with Grid Search models displayed significant variance between their training and testing scores, indicating the potential for overfitting. Random Forest(RF) and Gradient Boosting(GB) are highly effective for facies classification because they handle complex relationships and provide high predictive accuracy. The choice between the two depends on specific project requirements, including interpretability, computational resources, and data nature.
基金National Key Research and Development Program(2021YFB2900604)。
文摘Low Earth orbit(LEO)satellite networks exhibit distinct characteristics,e.g.,limited resources of individual satellite nodes and dynamic network topology,which have brought many challenges for routing algorithms.To satisfy quality of service(QoS)requirements of various users,it is critical to research efficient routing strategies to fully utilize satellite resources.This paper proposes a multi-QoS information optimized routing algorithm based on reinforcement learning for LEO satellite networks,which guarantees high level assurance demand services to be prioritized under limited satellite resources while considering the load balancing performance of the satellite networks for low level assurance demand services to ensure the full and effective utilization of satellite resources.An auxiliary path search algorithm is proposed to accelerate the convergence of satellite routing algorithm.Simulation results show that the generated routing strategy can timely process and fully meet the QoS demands of high assurance services while effectively improving the load balancing performance of the link.
基金supported in part by the National Key Research and Development Program of China under Grant No.2021YFF0901300in part by the National Natural Science Foundation of China under Grant Nos.62173076 and 72271048.
文摘The distributed permutation flow shop scheduling problem(DPFSP)has received increasing attention in recent years.The iterated greedy algorithm(IGA)serves as a powerful optimizer for addressing such a problem because of its straightforward,single-solution evolution framework.However,a potential draw-back of IGA is the lack of utilization of historical information,which could lead to an imbalance between exploration and exploitation,especially in large-scale DPFSPs.As a consequence,this paper develops an IGA with memory and learning mechanisms(MLIGA)to efficiently solve the DPFSP targeted at the mini-malmakespan.InMLIGA,we incorporate a memory mechanism to make a more informed selection of the initial solution at each stage of the search,by extending,reconstructing,and reinforcing the information from previous solutions.In addition,we design a twolayer cooperative reinforcement learning approach to intelligently determine the key parameters of IGA and the operations of the memory mechanism.Meanwhile,to ensure that the experience generated by each perturbation operator is fully learned and to reduce the prior parameters of MLIGA,a probability curve-based acceptance criterion is proposed by combining a cube root function with custom rules.At last,a discrete adaptive learning rate is employed to enhance the stability of the memory and learningmechanisms.Complete ablation experiments are utilized to verify the effectiveness of the memory mechanism,and the results show that this mechanism is capable of improving the performance of IGA to a large extent.Furthermore,through comparative experiments involving MLIGA and five state-of-the-art algorithms on 720 benchmarks,we have discovered that MLI-GA demonstrates significant potential for solving large-scale DPFSPs.This indicates that MLIGA is well-suited for real-world distributed flow shop scheduling.
基金supported by the Key Research and Development Program in Shaanxi Province,China(No.2022ZDLSF07-05)the Fundamental Research Funds for the Central Universities,CHD(No.300102352901)。
文摘Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide.Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem.Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables.In this study,we propose a machine learning algorithm for carbon emissions,a Bayesian optimized XGboost regression model,using multi-year energy carbon emission data and nighttime lights(NTL)remote sensing data from Shaanxi Province,China.Our results demonstrate that the XGboost algorithm outperforms linear regression and four other machine learning models,with an R^(2)of 0.906 and RMSE of 5.687.We observe an annual increase in carbon emissions,with high-emission counties primarily concentrated in northern and central Shaanxi Province,displaying a shift from discrete,sporadic points to contiguous,extended spatial distribution.Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns,with economically developed counties showing high-emission clustering and economically relatively backward counties displaying low-emission clustering.Our findings show that the use of NTL data and the XGboost algorithm can estimate and predict carbon emissionsmore accurately and provide a complementary reference for satellite remote sensing image data to serve carbon emission monitoring and assessment.This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
基金Supported by the Anhui Province Sports Health Information Monitoring Technology Engineering Research Center Open Project (KF2023012)。
文摘Deep learning algorithm is an effective data mining method and has been used in many fields to solve practical problems.However,the deep learning algorithms often contain some hyper-parameters which may be continuous,integer,or mixed,and are often given based on experience but largely affect the effectiveness of activity recognition.In order to adapt to different hyper-parameter optimization problems,our improved Cuckoo Search(CS)algorithm is proposed to optimize the mixed hyper-parameters in deep learning algorithm.The algorithm optimizes the hyper-parameters in the deep learning model robustly,and intelligently selects the combination of integer type and continuous hyper-parameters that make the model optimal.Then,the mixed hyper-parameter in Convolutional Neural Network(CNN),Long-Short-Term Memory(LSTM)and CNN-LSTM are optimized based on the methodology on the smart home activity recognition datasets.Results show that the methodology can improve the performance of the deep learning model and whether we are experienced or not,we can get a better deep learning model using our method.
基金supported by the National Natural Science Foundation of China(Grant Number 62341210)Natural Science Foundation of Guangxi Province(Grant Number:2025GXNSFHA069267)Science and Technology Development Plan for Baise City(Grant Number 20233654).
文摘DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we present Fs-LSA (F-score based Learning Search Algorithm), a novel gene selection algorithm designed to enhance the precision and efficiency of target gene identification from microarray data for cancer classification. This algorithm is divided into two phases: the first leverages F-score values to prioritize and select feature genes with the most significant differential expression;the second phase introduces our Learning Search Algorithm (LSA), which harnesses swarm intelligence to identify the optimal subset among the remaining genes. Inspired by human social learning, LSA integrates historical data and collective intelligence for a thorough search, with a dynamic control mechanism that balances exploration and refinement, thereby enhancing the gene selection process. We conducted a rigorous validation of Fs-LSA’s performance using eight publicly available cancer microarray expression datasets. Fs-LSA achieved accuracy, precision, sensitivity, and F1-score values of 0.9932, 0.9923, 0.9962, and 0.994, respectively. Comparative analyses with state-of-the-art algorithms revealed Fs-LSA’s superior performance in terms of simplicity and efficiency. Additionally, we validated the algorithm’s efficacy independently using glioblastoma data from GEO and TCGA databases. It was significantly superior to those of the comparison algorithms. Importantly, the driver genes identified by Fs-LSA were instrumental in developing a predictive model as an independent prognostic indicator for glioblastoma, underscoring Fs-LSA’s transformative potential in genomics and personalized medicine.