Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
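For readers unfamiliar with the less common metrics listed above, the minimal Python sketch below shows one way to compute R², MAE, RMSE, RSR, and VAF for a generic regression model. The synthetic Nc values, the variable names, and the commonly used definitions of RSR and VAF are assumptions for illustration, not taken from the cited study; the index of scatter is omitted because its exact definition is not given in the abstract.

```python
# Illustrative only: computing several of the evaluation metrics named above
# for a generic regression model. The data are synthetic placeholders.
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)
y_true = rng.uniform(5.0, 9.0, size=200)            # hypothetical Nc values
y_pred = y_true + rng.normal(0.0, 0.02, size=200)   # hypothetical predictions

r2 = r2_score(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
rsr = rmse / np.std(y_true)                          # RMSE-to-standard-deviation ratio
vaf = 100.0 * (1.0 - np.var(y_true - y_pred) / np.var(y_true))  # variance explained factor, %

print(f"R2={r2:.4f} MAE={mae:.4f} RMSE={rmse:.4f} RSR={rsr:.4f} VAF={vaf:.2f}%")
```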
Non-technical losses (NTL) of electric power are a serious problem for electric distribution companies. The solution determines the cost, stability, reliability, and quality of the supplied electricity. The widespread use of advanced metering infrastructure (AMI) and Smart Grid technology allows all participants in the distribution grid to store and track electricity consumption. During the research, a machine learning model is developed that allows analyzing and predicting the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings. This model is an ensemble meta-algorithm (stacking) that generalizes the algorithms of random forest, LightGBM, and a homogeneous ensemble of artificial neural networks. The superior accuracy of the proposed meta-algorithm over the base classifiers is experimentally confirmed on the test sample. Owing to its good accuracy (ROC-AUC of 0.88), such a model can be used as a methodological basis for a decision support system whose purpose is to form a sample of suspected NTL sources. The use of such a sample will allow the top management of electric distribution companies to increase the efficiency of inspection raids by field staff, making them targeted and accurate, which should contribute to the fight against NTL and the sustainable development of the electric power industry.
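As an illustration of the stacking idea described above (not the authors' implementation), the sketch below combines random forest, LightGBM, and a small bagged ensemble of MLPs under a logistic-regression meta-learner and scores the result with ROC-AUC. The synthetic imbalanced dataset, the hyperparameters, and the use of BaggingClassifier as a stand-in for a homogeneous ANN ensemble are all assumptions.

```python
# Minimal stacking sketch: RF + LightGBM + bagged MLPs, logistic-regression meta-learner.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# synthetic, imbalanced stand-in for consumer consumption features and NTL labels
X, y = make_classification(n_samples=2000, n_features=30, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base_learners = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("lgbm", LGBMClassifier(n_estimators=200, random_state=0)),
    # a homogeneous ensemble of small neural networks, approximated here by bagging MLPs
    ("ann", BaggingClassifier(MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
                              n_estimators=5, random_state=0)),
]
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_tr, y_tr)
print("ROC-AUC:", roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]))
```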
Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. Artificial neural network (ANN) algorithms have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANNs reflect several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed in particular. Second, the fabrication and research progress of neuromorphic devices are presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to respective studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results of the study show that all three algorithms produced good classifications for five basic types of land cover, namely Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas, as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy, with an overall accuracy of 86% and a Kappa coefficient of 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
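For reference, the two accuracy measures quoted above (overall accuracy and the Kappa coefficient) can be computed from a labelled validation sample as in the minimal sketch below; the label arrays are hypothetical placeholders rather than the study's validation data.

```python
# Illustrative computation of overall accuracy and Cohen's kappa from
# reference vs. classified labels for a handful of hypothetical validation points.
from sklearn.metrics import accuracy_score, cohen_kappa_score

reference = ["rice", "water", "built-up", "perennial", "annual", "rice", "built-up", "water"]
classified = ["rice", "water", "built-up", "perennial", "rice", "rice", "built-up", "annual"]

print("Overall accuracy:", accuracy_score(reference, classified))
print("Kappa coefficient:", cohen_kappa_score(reference, classified))
```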
The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, how to address the problems posed by multiple variables, nonlinearities, and uncertainties during optimization remains a formidable challenge. In this study, a strategy combining interpretable machine learning with metaheuristic optimization algorithms is employed to optimize the reaction process. First, experimental data from a biodiesel production process are collected to establish a database. These data are then used to construct a predictive model based on artificial neural network (ANN) models. Subsequently, interpretable machine learning techniques are applied for quantitative analysis and verification of the model. Finally, four metaheuristic optimization algorithms are coupled with the ANN model to achieve the desired optimization. The research results show that the methanol: palm fatty acid distillate (PFAD) molar ratio contributes the most to the reaction outcome, accounting for 41%. The ANN-simulated annealing (SA) hybrid method is more suitable for this optimization, and the optimal process parameters are a catalyst concentration of 3.00% (mass), a methanol: PFAD molar ratio of 8.67, and a reaction time of 30 min. This study provides deeper insights into reaction process optimization, which will facilitate future applications in various reaction optimization processes.
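A minimal sketch of the ANN-plus-simulated-annealing coupling described above is given below: an MLP surrogate is fitted to synthetic (catalyst concentration, methanol:PFAD ratio, reaction time) data, and scipy's dual_annealing then searches the surrogate for the settings that maximise the predicted yield. The toy data, bounds, objective shape, and network size are assumptions, not the study's model.

```python
# Sketch of surrogate-assisted simulated annealing: fit an ANN to process data,
# then minimise the negative predicted yield over the variable bounds.
import numpy as np
from scipy.optimize import dual_annealing
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# columns: catalyst % (mass), methanol:PFAD molar ratio, reaction time (min)
X = rng.uniform([1.0, 3.0, 10.0], [5.0, 12.0, 90.0], size=(300, 3))
y = 90 - (X[:, 0] - 3) ** 2 - 0.5 * (X[:, 1] - 9) ** 2 - 0.01 * (X[:, 2] - 40) ** 2  # toy yield

surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0).fit(X, y)

result = dual_annealing(lambda v: -surrogate.predict(v.reshape(1, -1))[0],
                        bounds=[(1.0, 5.0), (3.0, 12.0), (10.0, 90.0)], seed=0)
print("suggested settings:", result.x, "predicted yield:", -result.fun)
```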
While algorithms have been created for land usage in urban settings, there have been few investigations into the extraction of urban footprint (UF). To address this research gap, this study employs several widely used image classification methods, grouped into three categories, to evaluate their segmentation capabilities for extracting UF across eight cities. The results indicate that pixel-based methods excel only in clear urban environments, and their overall accuracy is not consistently high. RF and SVM perform well but lack stability in object-based UF extraction, influenced by feature selection and classifier performance. Deep learning enhances feature extraction but requires powerful computing and faces challenges with complex urban layouts. SAM excels in medium-sized urban areas but falters in intricate layouts. Integrating traditional and deep learning methods optimizes UF extraction, balancing accuracy and processing efficiency. Future research should focus on adapting algorithms for diverse urban landscapes to enhance UF extraction accuracy and applicability.
Sentiment Analysis, a significant domain within Natural Language Processing (NLP), focuses on extracting and interpreting subjective information, such as emotions, opinions, and attitudes, from textual data. With the increasing volume of user-generated content on social media and digital platforms, sentiment analysis has become essential for deriving actionable insights across various sectors. This study presents a systematic literature review of sentiment analysis methodologies, encompassing traditional machine learning algorithms, lexicon-based approaches, and recent advancements in deep learning techniques. The review follows a structured protocol comprising three phases: planning, execution, and analysis/reporting. During the execution phase, 67 peer-reviewed articles were initially retrieved, with 25 meeting the predefined inclusion and exclusion criteria. The analysis phase involved a detailed examination of each study's methodology, experimental setup, and key contributions. Among the deep learning models evaluated, Long Short-Term Memory (LSTM) networks were identified as the most frequently adopted architecture for sentiment classification tasks. This review highlights current trends, technical challenges, and emerging opportunities in the field, providing valuable guidance for future research and development in applications such as market analysis, public health monitoring, financial forecasting, and crisis management.
Due to the rapid advancement of information technology, data has emerged as the core resource driving decision-making and innovation across all industries. As the foundation of artificial intelligence, machine learning (ML) has expanded its applications into intelligent recommendation systems, autonomous driving, medical diagnosis, and financial risk assessment. However, it relies on massive datasets, which contain sensitive personal information. Consequently, Privacy-Preserving Machine Learning (PPML) has become a critical research direction. To address the challenges of efficiency and accuracy in encrypted data computation within PPML, Homomorphic Encryption (HE) is a crucial solution, owing to its capability to facilitate computations on encrypted data. However, the integration of machine learning and homomorphic encryption technologies faces multiple challenges. Against this backdrop, this paper reviews homomorphic encryption technologies, with a focus on the advantages of the Cheon-Kim-Kim-Song (CKKS) algorithm in supporting approximate floating-point computations. The paper reviews the development of three machine learning techniques, K-nearest neighbors (KNN), K-means clustering, and face recognition, in integration with homomorphic encryption; proposes feasible schemes for typical scenarios; and summarizes limitations and future optimization directions. Additionally, it presents a systematic exploration of the integration of homomorphic encryption and machine learning, spanning the essence of the technology, application implementation, performance trade-offs, technological convergence, and future pathways, to advance technological development.
High-entropy alloys (HEAs) have attracted considerable attention because of their excellent properties and broad compositional design space. However, traditional trial-and-error methods for screening HEAs are costly and inefficient, thereby limiting the development of new materials. Although density functional theory (DFT), molecular dynamics (MD), and thermodynamic modeling have improved design efficiency, their indirect connection to properties has led to limitations in calculation and prediction. With the awarding of the Nobel Prizes in Physics and Chemistry to artificial intelligence (AI)-related researchers, there has been renewed enthusiasm for the application of machine learning (ML) in the field of alloy materials. In this study, common and advanced ML models and strategies in HEA design were introduced, and the mechanism by which ML can play a role in composition optimization and performance prediction was investigated through case studies. The general workflow of ML application in material design was also introduced from the programmer's point of view, including data preprocessing, feature engineering, model training, evaluation, optimization, and interpretability. Furthermore, data scarcity, multi-model coupling, and other challenges and opportunities at the current stage were analyzed, and an outlook on future research directions was provided.
Permeability is one of the main oil reservoir characteristics. It affects potential oil production, well-completion technologies, the choice of enhanced oil recovery methods, and more. The methods used to determine and predict reservoir permeability have serious shortcomings. This article aims to refine and adapt machine learning techniques using historical data from hydrocarbon field development to evaluate and predict parameters such as the skin factor and the permeability of the remote reservoir zone. The article analyzes data from 4045 well tests in oil fields in Perm Krai (Russia). An evaluation of the performance of different machine learning (ML) algorithms in predicting well permeability is performed. Three different real datasets are used to train more than 20 machine learning regressors, whose hyperparameters are optimized using Bayesian Optimization (BO). The resulting models demonstrate significantly better predictive performance compared with traditional methods, and the best ML model found is one that had never before been applied to this problem. The permeability prediction model is characterized by a high adjusted R² value of 0.799. A promising approach is the integration of machine learning methods and the use of pressure recovery curves to estimate permeability in real time. The work is unique for its approach to predicting pressure recovery curves during well operation without stopping wells, providing primary data for interpretation. These innovations are exclusive and can improve the accuracy of permeability forecasts. The approach also reduces well downtime associated with traditional well-testing procedures. The proposed methods pave the way for more efficient and cost-effective reservoir development, ultimately supporting better decision-making and resource optimization in oil production.
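As a generic illustration of the Bayesian hyperparameter optimization step mentioned above (the study's own regressors, datasets, and BO implementation are not reproduced here), the sketch below tunes a gradient-boosting permeability regressor with Optuna and reports the cross-validated R². The search space, model choice, and synthetic data are assumptions.

```python
# Minimal Bayesian-optimization sketch: Optuna tuning a gradient-boosting regressor.
import optuna
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# synthetic stand-in for well-test features and permeability targets
X, y = make_regression(n_samples=500, n_features=12, noise=5.0, random_state=0)

def objective(trial):
    model = GradientBoostingRegressor(
        n_estimators=trial.suggest_int("n_estimators", 100, 600),
        learning_rate=trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        max_depth=trial.suggest_int("max_depth", 2, 6),
        random_state=0,
    )
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print("best CV R2:", study.best_value, "best params:", study.best_params)
```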
The nonlinearity of hedonic datasets demands flexible automated valuation models to appraise housing prices accurately, and artificial intelligence models have been employed in mass appraisal to this end. However, they have been referred to as "black-box" models owing to difficulties associated with interpretation. In this study, we compared the results of traditional hedonic pricing models with those of machine learning algorithms, e.g., random forest and deep neural network models. Commonly implemented measures, e.g., Gini importance and permutation importance, provide only the magnitude of each explanatory variable's importance, which results in ambiguous interpretability. To address this issue, we employed the SHapley Additive exPlanation (SHAP) method and explored its effectiveness through comparisons with traditionally explainable measures in hedonic pricing models. The results demonstrated that (1) the random forest model with the SHAP method could be a reliable instrument for appraising housing prices with high accuracy and sufficient interpretability, (2) the interpretable results retrieved from the SHAP method can be consolidated by the support of statistical evidence, and (3) housing characteristics and local amenities are primary contributors in property valuation, which is consistent with the findings of previous studies. Thus, our novel methodological framework and robust findings provide informative insights into the use of machine learning methods in property valuation based on the comparative analysis.
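The sketch below illustrates how SHAP values can be obtained for a random-forest hedonic price model of the kind discussed above; the feature names, synthetic data, and model settings are assumptions, and shap.TreeExplainer is shown only as one common way to compute per-feature contributions for tree ensembles, not as the authors' exact setup.

```python
# Illustrative SHAP analysis of a random-forest price model on synthetic hedonic data.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "floor_area": rng.uniform(40, 200, 500),
    "age_years": rng.uniform(0, 40, 500),
    "dist_to_subway_km": rng.uniform(0.1, 5.0, 500),
})
y = (3000 * X["floor_area"] - 500 * X["age_years"]
     - 2000 * X["dist_to_subway_km"] + rng.normal(0, 1e4, 500))

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)          # per-sample, per-feature contributions
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0))
```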
This study aims to eliminate the subjectivity and inconsistency inherent in the traditional International Association of Drilling Contractors (IADC) bit wear rating process, which heavily depends on the experience of drilling engineers and often leads to unreliable results. Leveraging advancements in computer vision and deep learning algorithms, this research proposes an automated detection and classification method for polycrystalline diamond compact (PDC) bit damage. YOLOv10 was employed to locate the PDC bit cutters, followed by two SqueezeNet models to perform wear rating and wear type classifications. A comprehensive dataset was created based on the IADC dull bit evaluation standards. Additionally, this study discusses the necessity of data augmentation and finds that certain methods, such as cropping, splicing, and mixing, may reduce the accuracy of cutter detection. The experimental results demonstrate that the proposed method significantly enhances the accuracy of bit damage detection and classification while also providing substantial improvements in processing speed and computational efficiency, offering a valuable tool for optimizing drilling operations and reducing costs.
Deep Learning (DL) offers promising solutions for analyzing wearable signals and gaining valuable insights into cognitive disorders. While previous review studies have explored various aspects of DL in cognitive healthcare, there remains a lack of comprehensive analysis that integrates wearable signals, data processing techniques, and the broader applications, benefits, and challenges of DL methods. Addressing this limitation, our study provides an extensive review of DL's role in cognitive healthcare, with a particular emphasis on wearables, data processing, and the inherent challenges in this field. This review also highlights the considerable promise of DL approaches in addressing a broad spectrum of cognitive issues. By enhancing the understanding and analysis of wearable signal modalities, DL models can achieve remarkable accuracy in cognitive healthcare. Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) networks have demonstrated improved performance and effectiveness in the early diagnosis and progression monitoring of neurological disorders. Beyond cognitive impairment detection, DL has been applied to emotion recognition, sleep analysis, stress monitoring, and neurofeedback. These applications lead to advanced diagnosis, personalized treatment, early intervention, assistive technologies, remote monitoring, and reduced healthcare costs. Nevertheless, the integration of DL and wearable technologies presents several challenges, such as data quality, privacy, interpretability, model generalizability, ethical concerns, and clinical adoption. These challenges emphasize the importance of conducting future research in areas such as multimodal signal analysis and explainable AI. The findings of this review aim to benefit clinicians, healthcare professionals, and society by facilitating better patient outcomes in cognitive healthcare.
This paper explores the possibility of using machine learning algorithms to predict type 2 diabetes. We selected three commonly used classification models, random forest, support vector machine, and logistic regression, modeled patients' clinical and lifestyle data, and compared their prediction performance. We found that the random forest model achieved the highest accuracy, demonstrated excellent classification results on the test set, and better distinguished between diabetic and non-diabetic patients, as shown by the confusion matrix and other evaluation metrics. The support vector machine and logistic regression performed slightly less well but still achieved a high level of accuracy. The experimental results validate the effectiveness of the three machine learning algorithms, especially random forest, in the diabetes prediction task and provide useful practical experience for the intelligent prevention and control of chronic diseases. This study promotes the innovation of diabetes prediction and management models, which is expected to alleviate the pressure on medical resources, reduce the burden of social health care, and improve the prognosis and quality of life of patients. In the future, we can consider expanding the data scale, exploring other machine learning algorithms, and integrating multimodal data to further realize the potential of artificial intelligence (AI) in the field of diabetes.
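A minimal sketch of the kind of comparison described above is shown below: random forest and logistic regression are trained on the same tabular data and evaluated with accuracy and a confusion matrix. The synthetic data stands in for the clinical and lifestyle records used in the study, and the settings are illustrative only.

```python
# Side-by-side comparison of two classifiers on the same synthetic tabular dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for name, clf in [("random forest", RandomForestClassifier(random_state=0)),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    # confusion matrix flattened as (tn, fp, fn, tp) for a binary task
    print(name, accuracy_score(y_te, pred), confusion_matrix(y_te, pred).ravel())
```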
BACKGROUND Lotus plumule and its active components have demonstrated inhibitory effects on gastric cancer (GC). However, the molecular mechanism of lotus plumule against GC remains unclear and requires further investigation. AIM To identify the key hub genes associated with the anti-GC effects of lotus plumule. METHODS This study investigated the potential targets of traditional Chinese medicine for inhibiting GC using weighted gene co-expression network analysis and bioinformatics. Initially, the active components and targets of lotus plumule and the differentially expressed genes associated with GC were identified. Subsequently, a protein-protein interaction network was constructed to elucidate the interactions between drug targets and disease-related genes, facilitating the identification of hub genes within the network. The clinical significance of these hub genes was evaluated, and their upstream transcription factors and downstream targets were identified. The binding ability of a hub gene with its downstream targets was verified using molecular docking technology. Finally, molecular docking was performed to evaluate the binding affinity between the active ingredients of lotus plumule and the hub gene. RESULTS This study identified 26 genes closely associated with GC. Machine learning analysis and external validation narrowed the list to four genes: Aldo-keto reductase family 1 member B10, fructose-bisphosphatase 1, protein arginine methyltransferase 1, and carbonic anhydrase 9. These genes showed a strong correlation with anti-GC activity. CONCLUSION Lotus plumule exhibits anti-GC effects. This study identified four hub genes with potential as novel targets for diagnosing and treating GC, providing innovative perspectives for its clinical management.
BACKGROUND Delayed wound healing is a common clinical complication following radical surgery for gastric cancer, adversely affecting patient prognosis. With advances in artificial intelligence, machine learning offers a promising approach for developing predictive models that can identify high-risk patients and support early clinical intervention. AIM To construct machine learning-based risk prediction models for delayed wound healing after gastric cancer surgery to support clinical decision-making. METHODS We reviewed a total of 514 patients who underwent radical gastric cancer surgery under general anesthesia from January 1, 2014 to December 30, 2023. Seventy percent of the dataset was selected as the training set and 30% as the validation set. Decision trees, support vector machines, and logistic regression were used to construct risk prediction models. The performance of the models was evaluated using accuracy, recall, precision, F1 index, area under the receiver operating characteristic curve, and decision curves. RESULTS This study included five variables: sex, elderly status, duration of abdominal drainage, preoperative white blood cell (WBC) count, and absolute neutrophil value. These variables were selected based on their clinical relevance and statistical significance in predicting delayed wound healing. The results showed that the decision tree model outperformed the logistic regression and support vector machine models in both the training and validation sets. Specifically, the decision tree model achieved higher accuracy, F1 index, recall, and area under the curve (AUC) values. The support vector machine model also demonstrated better performance than logistic regression, with higher accuracy, recall, and F1 index, but a slightly lower AUC. The key variables of sex, elderly status, duration of abdominal drainage, preoperative WBC count, and absolute neutrophil value were found to be strong predictors of delayed wound healing. Patients with a longer duration of abdominal drainage had a significantly higher risk of delayed wound healing, with a risk ratio of 1.579 compared with those with a shorter duration of abdominal drainage. Similarly, preoperative WBC count, sex, elderly status, and absolute neutrophil value were associated with a higher risk of delayed wound healing, highlighting the importance of these variables in the model. CONCLUSION The model, which identifies high-risk patients based on sex, elderly status, duration of abdominal drainage, preoperative WBC count, and absolute neutrophil value, can provide valuable insights for clinical decision-making.
Bifunctional oxide-zeolite-based composites (OXZEO) have emerged as promising materials for the direct conversion of syngas to olefins. However, experimental screening and optimization of reaction parameters remain resource-intensive. To address this challenge, we implemented a three-stage framework integrating machine learning, Bayesian optimization, and experimental validation, utilizing a carefully curated dataset from the literature. Our ensemble-tree model (R² > 0.87) identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems, with their light olefin space-time yields confirmed through experimental validation by physical mixing with HSAPO-34. Density functional theory calculations further elucidated the activity trends between Zn-Zr and Cu-Mg mixed oxides. Among 16 catalyst and reaction condition descriptors, the oxide/zeolite ratio, reaction temperature, and pressure emerged as the most significant factors. This interpretable, data-driven framework offers a versatile approach that can be applied to other catalytic processes, providing a powerful tool for experiment design and optimization in catalysis.
Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan containing 93,925 samples per feature (seven well logs and one facies log) were used to classify four facies. Data preprocessing and preparation involved two processes: data cleaning and feature scaling. Several machine learning algorithms, including Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GB) for classification, were tested using different iterations and various combinations of features and parameters. The support vector radial kernel training model achieved an accuracy of 72.49% without grid search and 64.02% with grid search, while the blind-well test scores were 71.01% and 69.67%, respectively. The Decision Tree (DT) hyperparameter optimization model showed an accuracy of 64.15% for training and 67.45% for testing. In comparison, the Decision Tree coupled with grid search yielded better results, with a training score of 69.91% and a testing score of 67.89%. The model's validation was carried out using the blind-well validation approach, which achieved an accuracy of 69.81%. Three algorithms were used to generate the gradient-boosting model. During training, the Gradient Boosting classifier achieved an accuracy score of 71.57%, and during testing, it achieved 69.89%. The Grid Search model achieved a higher accuracy score of 72.14% during testing. The Extreme Gradient Boosting model had the lowest accuracy score, with only 66.13% for training and 66.12% for testing. For validation, the Gradient Boosting (GB) classifier model achieved an accuracy score of 75.41% on the blind-well test, while the Gradient Boosting with Grid Search achieved an accuracy score of 71.36%. The Enhanced Random Forest and Random Forest with Bagging algorithms were the most effective, with validation accuracies of 78.30% and 79.18%, respectively. However, the Random Forest and Random Forest with Grid Search models displayed significant variance between their training and testing scores, indicating the potential for overfitting. Random Forest (RF) and Gradient Boosting (GB) are highly effective for facies classification because they handle complex relationships and provide high predictive accuracy. The choice between the two depends on specific project requirements, including interpretability, computational resources, and data nature.
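The sketch below gives a generic example of the grid-search workflow referenced above, tuning a gradient-boosting facies classifier and validating it on a held-out split that plays the role of a blind well; the feature set, parameter grid, and data are placeholders rather than the Rawat oilfield dataset.

```python
# Minimal grid-search sketch for a multi-class facies classifier with a held-out "blind" split.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# stand-in for seven well-log features and four facies classes
X, y = make_classification(n_samples=3000, n_features=7, n_informative=5,
                           n_classes=4, random_state=0)
X_tr, X_blind, y_tr, y_blind = train_test_split(X, y, test_size=0.2,
                                                stratify=y, random_state=0)

grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "learning_rate": [0.05, 0.1], "max_depth": [2, 3]},
    cv=5, n_jobs=-1,
)
grid.fit(X_tr, y_tr)
print("best params:", grid.best_params_)
print("blind-well accuracy:", grid.score(X_blind, y_blind))
```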
BACKGROUND Esophageal squamous cell carcinoma is a major histological subtype of esophageal cancer. Many molecular genetic changes are associated with its occurrence. Raman spectroscopy has become a new method for the early diagnosis of tumors because it can reflect the structures of substances and their changes at the molecular level. AIM To detect alterations in Raman spectral information across different stages of esophageal neoplasia. METHODS Different grades of esophageal lesions were collected, and a total of 360 groups of Raman spectrum data were acquired. A 1D-transformer network model was proposed to handle the task of classifying the spectral data of esophageal squamous cell carcinoma. In addition, a deep learning model was applied to visualize the Raman spectral data and interpret their molecular characteristics. RESULTS A comparison among Raman spectral data with different pathological grades and a visual analysis revealed that the Raman peaks with significant differences were concentrated mainly at 1095 cm^(-1) (DNA, symmetric PO stretching vibration), 1132 cm^(-1) (cytochrome c), 1171 cm^(-1) (acetoacetate), 1216 cm^(-1) (amide III), and 1315 cm^(-1) (glycerol). A comparison among the training results of different models revealed that the 1D-transformer network performed best. A 93.30% accuracy value, a 96.65% specificity value, a 93.30% sensitivity value, and a 93.17% F1 score were achieved. CONCLUSION Raman spectroscopy revealed significantly different waveforms for the different stages of esophageal neoplasia. The combination of Raman spectroscopy and deep learning methods could significantly improve the accuracy of classification.
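For illustration, a minimal 1D transformer classifier for fixed-length spectra is sketched below in PyTorch; the sequence length, embedding size, number of classes, and mean-pooling choice are assumptions and do not reproduce the authors' 1D-transformer architecture.

```python
# Minimal sketch of a transformer encoder applied to 1D spectral intensity vectors.
import torch
import torch.nn as nn

class SpectrumTransformer(nn.Module):
    def __init__(self, seq_len=256, d_model=64, n_classes=3):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                       # each wavenumber bin -> d_model
        self.pos = nn.Parameter(torch.zeros(1, seq_len, d_model))  # learnable positional encoding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                  # x: (batch, seq_len) intensity values
        h = self.embed(x.unsqueeze(-1)) + self.pos
        h = self.encoder(h).mean(dim=1)    # average-pool over spectral positions
        return self.head(h)

model = SpectrumTransformer()
logits = model(torch.randn(8, 256))        # 8 synthetic spectra
print(logits.shape)                        # torch.Size([8, 3])
```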
BACKGROUND The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional histopathology and standard slide-level deep learning often fail to capture the sparse and diagnostically critical features of metastatic potential. AIM To develop and validate a case-level multiple-instance learning (MIL) framework mimicking a pathologist's comprehensive review and improve T3/T4 CRC LNM prediction. METHODS The whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin and eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC). RESULTS The case-level framework demonstrated superior LNM prediction over slide-level training, with the CONCH v1.5 model achieving a mean AUC (±SD) of 0.899±0.033 vs 0.814±0.083, respectively. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904±0.047, in sharp contrast to a clinical-only model (mean AUC 0.584±0.084). Crucially, a pathologist's review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features. CONCLUSION A case-level MIL framework provides a superior approach for predicting LNM in advanced CRC. This method shows promise for risk stratification and therapy decisions, requiring further validation.
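The sketch below shows a minimal attention-based multiple-instance learning (MIL) pooling module of the kind commonly used to aggregate patch-level features into a case-level prediction; the feature dimension and classifier head are assumptions, and the pretrained pathology encoders named above (CONCH v1.5, UNI2-h) are not reproduced here.

```python
# Minimal attention-MIL pooling: softmax-weighted sum of instance features per case.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=512, hidden=128, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, bag):                           # bag: (n_instances, feat_dim) for one case
        weights = torch.softmax(self.attn(bag), dim=0)   # attention weight per instance
        case_embedding = (weights * bag).sum(dim=0)      # weighted sum -> case-level feature
        return self.classifier(case_embedding), weights

model = AttentionMIL()
logits, attn = model(torch.randn(1000, 512))          # 1000 patch embeddings from one patient
print(logits.shape, attn.shape)                       # torch.Size([2]) torch.Size([1000, 1])
```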
文摘Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions.However,accurately predicting their undrained bearing capacity in layered soils remains a complex challenge.This study presents a novel application of five ensemble machine(ML)algorithms-random forest(RF),gradient boosting machine(GBM),extreme gradient boosting(XGBoost),adaptive boosting(AdaBoost),and categorical boosting(CatBoost)-to predict the undrained bearing capacity factor(Nc)of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis(FELA).The input dataset consists of 1188 numerical simulations using the Tresca failure criterion,varying in geometrical and soil parameters.The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies.The ML models were trained on 70% of the dataset and tested on the remaining 30%.Their performance was evaluated using six statistical metrics:coefficient of determination(R²),mean absolute error(MAE),root mean squared error(RMSE),index of scatter(IOS),RMSE-to-standard deviation ratio(RSR),and variance explained factor(VAF).The results indicate that all the models achieved high accuracy,with R²values exceeding 97.6%and RMSE values below 0.02.Among them,AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets,demonstrating superior generalizability and robustness.The proposed ML framework offers an efficient,accurate,and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils.This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
文摘Non-technical losses(NTL)of electric power are a serious problem for electric distribution companies.The solution determines the cost,stability,reliability,and quality of the supplied electricity.The widespread use of advanced metering infrastructure(AMI)and Smart Grid allows all participants in the distribution grid to store and track electricity consumption.During the research,a machine learning model is developed that allows analyzing and predicting the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings.This model is an ensemble meta-algorithm(stacking)that generalizes the algorithms of random forest,LightGBM,and a homogeneous ensemble of artificial neural networks.The best accuracy of the proposed meta-algorithm in comparison to basic classifiers is experimentally confirmed on the test sample.Such a model,due to good accuracy indicators(ROC-AUC-0.88),can be used as a methodological basis for a decision support system,the purpose of which is to form a sample of suspected NTL sources.The use of such a sample will allow the top management of electric distribution companies to increase the efficiency of raids by performers,making them targeted and accurate,which should contribute to the fight against NTL and the sustainable development of the electric power industry.
基金financially supported by the National Natural Science Foundation of China(No.52073031)the National Key Research and Development Program of China(Nos.2023YFB3208102,2021YFB3200304)+4 种基金the China National Postdoctoral Program for Innovative Talents(No.BX2021302)the Beijing Nova Program(Nos.Z191100001119047,Z211100002121148)the Fundamental Research Funds for the Central Universities(No.E0EG6801X2)the‘Hundred Talents Program’of the Chinese Academy of Sciencesthe BrainLink program funded by the MSIT through the NRF of Korea(No.RS-2023-00237308).
文摘Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decision making.It features parallel interconnected neural networks,high fault tolerance,robustness,autonomous learning capability,and ultralow energy dissipation.The algorithms of artificial neural network(ANN)have also been widely used because of their facile self-organization and self-learning capabilities,which mimic those of the human brain.To some extent,ANN reflects several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations.This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms.First,the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are particularly discussed.Second,the fabrication and research progress of neuromorphic devices are presented regarding to materials and structures.Furthermore,the fabrication of neuromorphic devices,including stand-alone neuromorphic devices,neuromorphic device arrays,and integrated neuromorphic systems,is discussed and demonstrated with reference to some respective studies.The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated.Finally,perspectives,suggestions,and potential solutions to the current challenges of neuromorphic devices are provided.
文摘Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status of land covers in Hung Yen province of Vietnam using Landsat 8 OLI satellite images,a free data source with reasonable spatial and temporal resolution.The results of the study show that all three algorithms presented good classification for five basic types of land cover including Rice land,Water bodies,Perennial vegetation,Annual vegetation,Built-up areas as their overall accuracy and Kappa coefficient were greater than 80%and 0.8,respectively.Among the three algorithms,SVM achieved the highest accuracy as its overall accuracy was 86%and the Kappa coefficient was 0.88.Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha,accounting for more than 33.8%of the total natural area,followed by Rice land and Perennial vegetation which cover an area of over 30,767 ha(33%)and 15,637 ha(16.8%),respectively.Water bodies and Annual vegetation cover the smallest areas with 8,820(9.5%)ha and 6,302 ha(6.8%),respectively.The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
基金supported by the National Natural Science Foundation of China(22408227,22238005)the Postdoctoral Research Foundation of China(GZC20231576).
文摘The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, how to address the problems posed by multiple variables, nonlinearities, and uncertainties during optimization remains a formidable challenge. In this study, a strategy combining interpretable machine learning with metaheuristic optimization algorithms is employed to optimize the reaction process. First, experimental data from a biodiesel production process are collected to establish a database. These data are then used to construct a predictive model based on artificial neural network (ANN) models. Subsequently, interpretable machine learning techniques are applied for quantitative analysis and verification of the model. Finally, four metaheuristic optimization algorithms are coupled with the ANN model to achieve the desired optimization. The research results show that the methanol: palm fatty acid distillate (PFAD) molar ratio contributes the most to the reaction outcome, accounting for 41%. The ANN-simulated annealing (SA) hybrid method is more suitable for this optimization, and the optimal process parameters are a catalyst concentration of 3.00% (mass), a methanol: PFAD molar ratio of 8.67, and a reaction time of 30 min. This study provides deeper insights into reaction process optimization, which will facilitate future applications in various reaction optimization processes.
文摘While algorithms have been created for land usage in urban settings,there have been few investigations into the extraction of urban footprint(UF).To address this research gap,the study employs several widely used image classification method classified into three categories to evaluate their segmentation capabilities for extracting UF across eight cities.The results indicate that pixel-based methods only excel in clear urban environments,and their overall accuracy is not consistently high.RF and SVM perform well but lack stability in object-based UF extraction,influenced by feature selection and classifier performance.Deep learning enhances feature extraction but requires powerful computing and faces challenges with complex urban layouts.SAM excels in medium-sized urban areas but falters in intricate layouts.Integrating traditional and deep learning methods optimizes UF extraction,balancing accuracy and processing efficiency.Future research should focus on adapting algorithms for diverse urban landscapes to enhance UF extraction accuracy and applicability.
基金supported by the“Technology Commercialization Collaboration Platform Construction”project of the Innopolis Foundation(Project Number:2710033536)the Competitive Research Fund of The University of Aizu,Japan.
文摘Sentiment Analysis,a significant domain within Natural Language Processing(NLP),focuses on extracting and interpreting subjective information-such as emotions,opinions,and attitudes-from textual data.With the increasing volume of user-generated content on social media and digital platforms,sentiment analysis has become essential for deriving actionable insights across various sectors.This study presents a systematic literature review of sentiment analysis methodologies,encompassing traditional machine learning algorithms,lexicon-based approaches,and recent advancements in deep learning techniques.The review follows a structured protocol comprising three phases:planning,execution,and analysis/reporting.During the execution phase,67 peer-reviewed articles were initially retrieved,with 25 meeting predefined inclusion and exclusion criteria.The analysis phase involved a detailed examination of each study’s methodology,experimental setup,and key contributions.Among the deep learning models evaluated,Long Short-Term Memory(LSTM)networks were identified as the most frequently adopted architecture for sentiment classification tasks.This review highlights current trends,technical challenges,and emerging opportunities in the field,providing valuable guidance for future research and development in applications such as market analysis,public health monitoring,financial forecasting,and crisis management.
基金supported by the fllowing projects:Natural Science Foundation of China under Grant 62172436Self-Initiated Scientific Research Project of the Chinese People's Armed Police Force under Grant ZZKY20243129Basic Frontier Innovation Project of the Engineering University of the Chinese People's Armed Police Force under Grant WJY202421.
文摘Due to the rapid advancement of information technology,data has emerged as the core resource driving decision-making and innovation across all industries.As the foundation of artificial intelligence,machine learning(ML)has expanded its applications into intelligent recommendation systems,autonomous driving,medical diagnosis,and financial risk assessment.However,it relies on massive datasets,which contain sensitive personal information.Consequently,Privacy-Preserving Machine Learning(PPML)has become a critical research direction.To address the challenges of efficiency and accuracy in encrypted data computation within PPML,Homomorphic Encryption(HE)technology is a crucial solution,owing to its capability to facilitate computations on encrypted data.However,the integration of machine learning and homomorphic encryption technologies faces multiple challenges.Against this backdrop,this paper reviews homomorphic encryption technologies,with a focus on the advantages of the Cheon-Kim-Kim-Song(CKKS)algorithm in supporting approximate floating-point computations.This paper reviews the development of three machine learning techniques:K-nearest neighbors(KNN),K-means clustering,and face recognition-in integration with homomorphic encryption.It proposes feasible schemes for typical scenarios,summarizes limitations and future optimization directions.Additionally,it presents a systematic exploration of the integration of homomorphic encryption and machine learning from the essence of the technology,application implementation,performance trade-offs,technological convergence and future pathways to advance technological development.
基金the National Natural Science Foundation of China(52161011)the Central Guiding Local Science and Technology Development Fund Project(Guike ZY23055005,Guike ZY24212036 and GuikeAB25069457)+5 种基金the Guangxi Science and Technology Project(2023GXNSFDA026046 and Guike AB24010247)the Scientifc Research and Technology Development Program of Guilin(20220110-3 and 20230110-3)the Scientifc Research and Technology Development Program of Nanning Jiangnan district(20230715-02)the Guangxi Key Laboratory of Superhard Material(2022-K-001)the Guangxi Key Laboratory of Information Materials(231003-Z,231033-K and 231013-Z)the Innovation Project of GUET Graduate Education(2025YCXS177)for the fnancial support given to this work.
文摘High-entropy alloys(HEAs)have attracted considerable attention because of their excellent properties and broad compositional design space.However,traditional trial-and-error methods for screening HEAs are costly and inefficient,thereby limiting the development of new materials.Although density functional theory(DFT),molecular dynamics(MD),and thermodynamic modeling have improved the design efficiency,their indirect connection to properties has led to limitations in calculation and prediction.With the awarding of the Nobel Prize in Physics and Chemistry to artificial intelligence(AI)related researchers,there has been a renewed enthusiasm for the application of machine learning(ML)in the field of alloy materials.In this study,common and advanced ML models and strategies in HEA design were introduced,and the mechanism by which ML can play a role in composition optimization and performance prediction was investigated through case studies.The general workflow of ML application in material design was also introduced from the programmer’s point of view,including data preprocessing,feature engineering,model training,evaluation,optimization,and interpretability.Furthermore,data scarcity,multi-model coupling,and other challenges and opportunities at the current stage were analyzed,and an outlook on future research directions was provided.
基金funded by the Ministry of Science and Higher Education of the Russian Federation(Project No.FSNM-2024-0005).
文摘Permeability is one of the main oil reservoir characteristics.It affects potential oil production,well-completion technologies,the choice of enhanced oil recovery methods,and more.The methods used to determine and predict reservoir permeability have serious shortcomings.This article aims to refine and adapt machine learning techniques using historical data from hydrocarbon field development to evaluate and predict parameters such as the skin factor and permeability of the remote reservoir zone.The article analyzes data from 4045 wells tests in oil fields in the Perm Krai(Russia).An evaluation of the performance of different Machine Learning(ML)al-gorithms in the prediction of the well permeability is performed.Three different real datasets are used to train more than 20 machine learning regressors,whose hyperparameters are optimized using Bayesian Optimization(BO).The resulting models demonstrate significantly better predictive performance compared to traditional methods and the best ML model found is one that never was applied before to this problem.The permeability prediction model is characterized by a high R^(2) adjusted value of 0.799.A promising approach is the integration of machine learning methods and the use of pressure recovery curves to estimate permeability in real-time.The work is unique for its approach to predicting pressure recovery curves during well operation without stopping wells,providing primary data for interpretation.These innovations are exclusive and can improve the accuracy of permeability forecasts.It also reduces well downtime associated with traditional well-testing procedures.The proposed methods pave the way for more efficient and cost-effective reservoir development,ultimately sup-porting better decision-making and resource optimization in oil production.
基金supported by the National Research Foundation of Korea grant funded by the Korea government(MSIT)(RS-2025-16067531:Kwangwon Ahn)Hankuk University of Foreign Studies Research Fund(0f 2025:Sihyun An).
文摘The nonlinearity of hedonic datasets demands flexible automated valuation models to appraise housing prices accurately,and artificial intelligence models have been employed in mass appraisal to this end.However,they have been referred to as“blackbox”models owing to difficulties associated with interpretation.In this study,we compared the results of traditional hedonic pricing models with those of machine learning algorithms,e.g.,random forest and deep neural network models.Commonly implemented measures,e.g.,Gini importance and permutation importance,provide only the magnitude of each explanatory variable’s importance,which results in ambiguous interpretability.To address this issue,we employed the SHapley Additive exPlanation(SHAP)method and explored its effectiveness through comparisons with traditionally explainable measures in hedonic pricing models.The results demonstrated that(1)the random forest model with the SHAP method could be a reliable instrument for appraising housing prices with high accuracy and sufficient interpretability,(2)the interpretable results retrieved from the SHAP method can be consolidated by the support of statistical evidence,and(3)housing characteristics and local amenities are primary contributors in property valuation,which is consistent with the findings of previous studies.Thus,our novel methodological framework and robust findings provide informative insights into the use of machine learning methods in property valuation based on the comparative analysis.
基金support of the CNPC International Collaborative Research Project(No.2022DQ0410)。
文摘This study aims to eliminate the subjectivity and inconsistency inherent in the traditional International Association of Drilling Contractors(IADC)bit wear rating process,which heavily depends on the experience of drilling engineers and often leads to unreliable results.Leveraging advancements in computer vision and deep learning algorithms,this research proposes an automated detection and classification method for polycrystalline diamond compact(PDC)bit damage.YOLOv10 was employed to locate the PDC bit cutters,followed by two SqueezeNet models to perform wear rating and wear type classifications.A comprehensive dataset was created based on the IADC dull bit evaluation standards.Additionally,this study discusses the necessity of data augmentation and finds that certain methods,such as cropping,splicing,and mixing,may reduce the accuracy of cutter detection.The experimental results demonstrate that the proposed method significantly enhances the accuracy of bit damage detection and classification while also providing substantial improvements in processing speed and computational efficiency,offering a valuable tool for optimizing drilling operations and reducing costs.
基金the Asian Institute of Technology,Khlong Nueng,Thailand for their support in carrying out this study。
文摘Deep Learning(DL)offers promising solutions for analyzing wearable signals and gaining valuable insights into cognitive disorders.While previous review studies have explored various aspects of DL in cognitive healthcare,there remains a lack of comprehensive analysis that integrates wearable signals,data processing techniques,and the broader applications,benefits,and challenges of DL methods.Addressing this limitation,our study provides an extensive review of DL’s role in cognitive healthcare,with a particular emphasis on wearables,data processing,and the inherent challenges in this field.This review also highlights the considerable promise of DL approaches in addressing a broad spectrum of cognitive issues.By enhancing the understanding and analysis of wearable signal modalities,DL models can achieve remarkable accuracy in cognitive healthcare.Convolutional Neural Network(CNN),Recurrent Neural Network(RNN),and Long Short-term Memory(LSTM)networks have demonstrated improved performance and effectiveness in the early diagnosis and progression monitoring of neurological disorders.Beyond cognitive impairment detection,DL has been applied to emotion recognition,sleep analysis,stress monitoring,and neurofeedback.These applications lead to advanced diagnosis,personalized treatment,early intervention,assistive technologies,remote monitoring,and reduced healthcare costs.Nevertheless,the integration of DL and wearable technologies presents several challenges,such as data quality,privacy,interpretability,model generalizability,ethical concerns,and clinical adoption.These challenges emphasize the importance of conducting future research in areas such as multimodal signal analysis and explainable AI.The findings of this review aim to benefit clinicians,healthcare professionals,and society by facilitating better patient outcomes in cognitive healthcare.
Abstract: This paper explores the possibility of using machine learning algorithms to predict type 2 diabetes. We selected three commonly used classification models: random forest, support vector machine, and logistic regression; modeled patients' clinical and lifestyle data; and compared their prediction performance. We found that the random forest model achieved the highest accuracy and delivered excellent classification results on the test set, distinguishing diabetic from non-diabetic patients more effectively according to the confusion matrix and other evaluation metrics. The support vector machine and logistic regression performed slightly less well but still achieved a high level of accuracy. The experimental results validate the effectiveness of the three machine learning algorithms, especially random forest, in the diabetes prediction task and provide useful practical experience for the intelligent prevention and control of chronic diseases. This study advances diabetes prediction and management models, which is expected to relieve pressure on medical resources, reduce the burden on social health care, and improve patients' prognosis and quality of life. Future work may expand the data scale, explore other machine learning algorithms, and integrate multimodal data to further realize the potential of artificial intelligence (AI) in the field of diabetes.
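The comparison described above can be sketched as follows; this is not the study's code, and the synthetic dataset stands in for the clinical/lifestyle data, with the metrics mirroring those named in the abstract.

```python
# Hedged sketch: compare random forest, SVM, and logistic regression on a
# synthetic stand-in for a diabetes dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

X, y = make_classification(n_samples=1500, n_features=10, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "support_vector_machine": SVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(name, "accuracy:", accuracy_score(y_te, pred))
    print(confusion_matrix(y_te, pred))   # diabetic vs. non-diabetic breakdown
```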
Funding: Supported by the Ningxia Key Research and Development Program, No. 2023BEG02015, and the Talent Development Projects of Young Qihuang of the National Administration of Traditional Chinese Medicine (2020).
Abstract: BACKGROUND Lotus plumule and its active components have demonstrated inhibitory effects on gastric cancer (GC). However, the molecular mechanism of lotus plumule against GC remains unclear and requires further investigation. AIM To identify the key hub genes associated with the anti-GC effects of lotus plumule. METHODS This study investigated the potential targets of traditional Chinese medicine for inhibiting GC using weighted gene co-expression network analysis and bioinformatics. Initially, the active components and targets of lotus plumule and the differentially expressed genes associated with GC were identified. Subsequently, a protein-protein interaction network was constructed to elucidate the interactions between drug targets and disease-related genes, facilitating the identification of hub genes within the network. The clinical significance of these hub genes was evaluated, and their upstream transcription factors and downstream targets were identified. The binding ability of a hub gene to its downstream targets was verified using molecular docking technology. Finally, molecular docking was performed to evaluate the binding affinity between the active ingredients of lotus plumule and the hub gene. RESULTS This study identified 26 genes closely associated with GC. Machine learning analysis and external validation narrowed the list to four genes: aldo-keto reductase family 1 member B10, fructose-bisphosphatase 1, protein arginine methyltransferase 1, and carbonic anhydrase 9. These genes showed a strong correlation with anti-GC activity. CONCLUSION Lotus plumule exhibits anti-GC effects. This study identified four hub genes with potential as novel targets for diagnosing and treating GC, providing innovative perspectives for its clinical management.
Funding: Supported by the Shandong Province Traditional Chinese Medicine Technology Project, No. Q-2023147; the Weifang Health Commission Research Project, No. WFWSJK-2023-033; the Weifang City Science and Technology Development Plan (Medical Category), No. 2023YX057; the Weifang Medical University 2022 Campus-Level Education and Teaching Reform and Research Project, No. 2022YB051; the Norman Bethune Public Welfare Foundation, No. ezmr2023-037; and the Special Research Project on Optimized Management of Acute Pain, Wu Jieping Medical Foundation.
Abstract: BACKGROUND Delayed wound healing is a common clinical complication following radical surgery for gastric cancer and adversely affects patient prognosis. With advances in artificial intelligence, machine learning offers a promising approach for developing predictive models that can identify high-risk patients and support early clinical intervention. AIM To construct machine learning-based risk prediction models for delayed wound healing after gastric cancer surgery to support clinical decision-making. METHODS We reviewed a total of 514 patients who underwent radical gastric cancer surgery under general anesthesia from January 1, 2014 to December 30, 2023. Seventy percent of the dataset was used as the training set and 30% as the validation set. Decision trees, support vector machines, and logistic regression were used to construct risk prediction models. Model performance was evaluated using accuracy, recall, precision, the F1 index, the area under the receiver operating characteristic curve, and decision curve analysis. RESULTS This study included five variables: sex, elderly status, duration of abdominal drainage, preoperative white blood cell (WBC) count, and absolute neutrophil count. These variables were selected based on their clinical relevance and statistical significance in predicting delayed wound healing. The decision tree model outperformed the logistic regression and support vector machine models in both the training and validation sets, achieving higher accuracy, F1 index, recall, and area under the curve (AUC) values. The support vector machine model also performed better than logistic regression, with higher accuracy, recall, and F1 index but a slightly lower AUC. Sex, elderly status, duration of abdominal drainage, preoperative WBC count, and absolute neutrophil count were found to be strong predictors of delayed wound healing. Patients with a longer duration of abdominal drainage had a significantly higher risk of delayed wound healing, with a risk ratio of 1.579 compared with those with a shorter duration of drainage. Similarly, preoperative WBC count, sex, elderly status, and absolute neutrophil count were associated with a higher risk of delayed wound healing, highlighting the importance of these variables in the model. CONCLUSION The model can identify high-risk patients based on sex, elderly status, duration of abdominal drainage, preoperative WBC count, and absolute neutrophil count, and can provide valuable insights for clinical decision-making.
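A hedged sketch of the decision-tree arm of such a model is shown below; the synthetic cohort, risk relationship, and thresholds are illustrative assumptions standing in for the 514-patient dataset, while the predictors and metrics follow those named above.

```python
# Illustrative decision-tree risk model on the five named predictors (synthetic data).
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score, f1_score, roc_auc_score

rng = np.random.default_rng(1)
n = 514
X = pd.DataFrame({
    "male":           rng.integers(0, 2, n),
    "elderly":        rng.integers(0, 2, n),
    "drainage_days":  rng.uniform(2, 14, n),
    "preop_wbc":      rng.uniform(3, 15, n),   # 10^9/L
    "neutrophil_abs": rng.uniform(1, 10, n),   # 10^9/L
})
# Synthetic outcome: longer drainage and higher WBC raise the risk (assumption).
risk = 0.3 * X["drainage_days"] + 0.4 * X["preop_wbc"] + rng.normal(0, 2, n)
y = (risk > np.quantile(risk, 0.8)).astype(int)   # ~20% delayed healing

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=1)
tree = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X_tr, y_tr)
pred, prob = tree.predict(X_te), tree.predict_proba(X_te)[:, 1]
print("accuracy", accuracy_score(y_te, pred), "recall", recall_score(y_te, pred))
print("F1", f1_score(y_te, pred), "AUC", roc_auc_score(y_te, prob))
```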
Funding: This work was funded by the KRICT Project (KK2512-10) of the Korea Research Institute of Chemical Technology; the Ministry of Trade, Industry and Energy (MOTIE) and the Korea Institute for Advancement of Technology (KIAT) through the Virtual Engineering Platform Program (P0022334); and the Carbon Neutral Industrial Strategic Technology Development Program (RS-202300261088) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea). Further support was provided by the research fund of Chungnam National University.
Abstract: Bifunctional oxide-zeolite-based composites (OXZEO) have emerged as promising materials for the direct conversion of syngas to olefins. However, experimental screening and optimization of reaction parameters remain resource-intensive. To address this challenge, we implemented a three-stage framework integrating machine learning, Bayesian optimization, and experimental validation, utilizing a carefully curated dataset from the literature. Our ensemble-tree model (R² > 0.87) identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems, and their light-olefin space-time yields were confirmed experimentally by physically mixing the oxides with HSAPO-34. Density functional theory calculations further elucidated the activity trends between the Zn-Zr and Cu-Mg mixed oxides. Among 16 catalyst and reaction-condition descriptors, the oxide/zeolite ratio, reaction temperature, and pressure emerged as the most significant factors. This interpretable, data-driven framework offers a versatile approach that can be applied to other catalytic processes, providing a powerful tool for experiment design and optimization in catalysis.
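To make the coupling of a tree-ensemble surrogate with Bayesian optimization concrete, the sketch below uses scikit-optimize over three of the descriptors named above; the toy training data, bounds, and objective are illustrative assumptions, not the paper's curated dataset.

```python
# Hedged sketch: gradient-boosting surrogate + Bayesian optimization of conditions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from skopt import gp_minimize
from skopt.space import Real

rng = np.random.default_rng(0)
# Toy features: [oxide/zeolite ratio, temperature (deg C), pressure (MPa)]
X = rng.uniform([0.2, 350, 1.0], [3.0, 450, 5.0], size=(200, 3))
y = -((X[:, 0] - 1.5) ** 2) - 0.001 * (X[:, 1] - 400) ** 2 + 0.5 * X[:, 2]
surrogate = GradientBoostingRegressor(random_state=0).fit(X, y)  # stands in for the STY model

def negative_predicted_sty(params):
    # gp_minimize minimises, so return the negated predicted space-time yield
    return -float(surrogate.predict(np.asarray(params).reshape(1, -1))[0])

result = gp_minimize(
    negative_predicted_sty,
    dimensions=[Real(0.2, 3.0), Real(350, 450), Real(1.0, 5.0)],
    n_calls=30, random_state=0,
)
print("suggested conditions:", result.x, "predicted yield:", -result.fun)
```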
Abstract: Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan containing 93,925 samples per feature (seven well logs and one facies log) were used to classify four facies. Data preprocessing and preparation involved two processes: data cleaning and feature scaling. Several machine learning algorithms, including Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GB) for classification, were tested using different iterations and various combinations of features and parameters. The support vector radial kernel training model achieved an accuracy of 72.49% without grid search and 64.02% with grid search, while the blind-well test scores were 71.01% and 69.67%, respectively. The Decision Tree (DT) hyperparameter optimization model showed an accuracy of 64.15% for training and 67.45% for testing. In comparison, the Decision Tree coupled with grid search yielded better results, with a training score of 69.91% and a testing score of 67.89%. The model's validation was carried out using the blind-well validation approach, which achieved an accuracy of 69.81%. Three algorithms were used to generate the gradient-boosting model. During training, the Gradient Boosting classifier achieved an accuracy score of 71.57%, and during testing, it achieved 69.89%. The grid search model achieved a higher accuracy score of 72.14% during testing. The Extreme Gradient Boosting model had the lowest accuracy score, with only 66.13% for training and 66.12% for testing. For validation, the Gradient Boosting (GB) classifier model achieved an accuracy score of 75.41% on the blind-well test, while Gradient Boosting with grid search achieved an accuracy score of 71.36%. The Enhanced Random Forest and Random Forest with Bagging algorithms were the most effective, with validation accuracies of 78.30% and 79.18%, respectively. However, the Random Forest and Random Forest with Grid Search models displayed significant variance between their training and testing scores, indicating potential overfitting. Random Forest (RF) and Gradient Boosting (GB) are highly effective for facies classification because they handle complex relationships and provide high predictive accuracy. The choice between the two depends on specific project requirements, including interpretability, computational resources, and the nature of the data.
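A minimal sketch of one of the workflows described above, random forest facies classification with feature scaling, grid search, and a held-out "blind well", is given below; the synthetic well-log features and facies labels are placeholders for the Rawat dataset.

```python
# Hedged sketch: scaled random forest + grid search with a blind-well hold-out.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=5000, n_features=7, n_informative=5,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_blind, y_tr, y_blind = train_test_split(X, y, test_size=0.2,
                                                random_state=0)  # "blind well"

pipe = make_pipeline(StandardScaler(),                  # feature scaling step
                     RandomForestClassifier(random_state=0))
grid = GridSearchCV(pipe,
                    {"randomforestclassifier__n_estimators": [100, 300],
                     "randomforestclassifier__max_depth": [5, 10, None]},
                    cv=5)
grid.fit(X_tr, y_tr)
print("best params:", grid.best_params_)
print("blind-well accuracy:", grid.score(X_blind, y_blind))
```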
Funding: Supported by the Beijing Hospitals Authority Youth Programme, No. QML20200505.
Abstract: BACKGROUND Esophageal squamous cell carcinoma is a major histological subtype of esophageal cancer, and many molecular genetic changes are associated with its occurrence. Raman spectroscopy has become a new method for the early diagnosis of tumors because it can reflect the structures of substances and their changes at the molecular level. AIM To detect alterations in Raman spectral information across different stages of esophageal neoplasia. METHODS Different grades of esophageal lesions were collected, and a total of 360 groups of Raman spectrum data were acquired. A 1D-transformer network model was proposed to handle the task of classifying the spectral data of esophageal squamous cell carcinoma. In addition, a deep learning model was applied to visualize the Raman spectral data and interpret their molecular characteristics. RESULTS A comparison among Raman spectral data with different pathological grades and a visual analysis revealed that the Raman peaks with significant differences were concentrated mainly at 1095 cm⁻¹ (DNA, symmetric PO stretching vibration), 1132 cm⁻¹ (cytochrome c), 1171 cm⁻¹ (acetoacetate), 1216 cm⁻¹ (amide III), and 1315 cm⁻¹ (glycerol). A comparison among the training results of different models revealed that the 1D-transformer network performed best, achieving 93.30% accuracy, 96.65% specificity, 93.30% sensitivity, and a 93.17% F1 score. CONCLUSION Raman spectroscopy revealed significantly different waveforms for the different stages of esophageal neoplasia. Combining Raman spectroscopy with deep learning methods can significantly improve classification accuracy.
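As a hypothetical illustration of a 1D-transformer spectral classifier of the kind described above, the sketch below splits a Raman spectrum into patches, embeds them, passes them through a Transformer encoder, and mean-pools for classification; the patch size, model dimensions, and class count are illustrative assumptions, not the paper's architecture.

```python
# Minimal 1D-transformer classifier for spectral data (illustrative sizes only).
import torch
import torch.nn as nn

class Spectrum1DTransformer(nn.Module):
    def __init__(self, spectrum_len=1000, patch=20, d_model=64, n_classes=3):
        super().__init__()
        self.patch = patch
        self.embed = nn.Linear(patch, d_model)
        self.pos = nn.Parameter(torch.zeros(1, spectrum_len // patch, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                        # x: (batch, spectrum_len)
        x = x.unfold(1, self.patch, self.patch)  # (batch, n_patches, patch)
        x = self.embed(x) + self.pos
        x = self.encoder(x)
        return self.head(x.mean(dim=1))          # mean-pool over patches

model = Spectrum1DTransformer()
spectra = torch.randn(8, 1000)                   # 8 synthetic Raman spectra
print(model(spectra).shape)                      # (8, 3) class scores
```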
Funding: Supported by the Chongqing Medical Scientific Research Project (Joint Project of Chongqing Health Commission and Science and Technology Bureau), No. 2023MSXM060.
Abstract: BACKGROUND The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional histopathology and standard slide-level deep learning often fail to capture the sparse and diagnostically critical features of metastatic potential. AIM To develop and validate a case-level multiple-instance learning (MIL) framework that mimics a pathologist's comprehensive review and improves T3/T4 CRC LNM prediction. METHODS Whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin and eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC). RESULTS The case-level framework demonstrated superior LNM prediction over slide-level training, with the CONCH v1.5 model achieving a mean AUC (± SD) of 0.899 ± 0.033 vs 0.814 ± 0.083, respectively. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904 ± 0.047, in sharp contrast to a clinical-only model (mean AUC 0.584 ± 0.084). Crucially, a pathologist's review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features. CONCLUSION A case-level MIL framework provides a superior approach for predicting LNM in advanced CRC. This method shows promise for risk stratification and therapy decisions but requires further validation.
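The sketch below is a minimal, hypothetical illustration of case-level MIL with attention pooling: patch features from all of a patient's slides form one bag, an attention module weights the patches, and the pooled case feature predicts LNM. The upstream feature extractor (e.g. CONCH/UNI-style embeddings), feature dimension, and bag size are assumptions, not the study's implementation.

```python
# Minimal attention-based MIL head over one patient's patch embeddings.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, bag):                              # bag: (n_patches, feat_dim)
        weights = torch.softmax(self.attn(bag), dim=0)   # per-patch attention
        case_feature = (weights * bag).sum(dim=0)        # case-level pooling
        return self.classifier(case_feature), weights.squeeze(-1)

model = AttentionMIL()
bag = torch.randn(300, 512)            # 300 patch embeddings from one patient's slides
logit, attention = model(bag)
print(torch.sigmoid(logit).item())     # predicted LNM probability
print(attention.topk(5).indices)       # highest-attention (most suspicious) patches
```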