Aircraft assembly is characterized by stringent precedence constraints, limited resource availability, spatial restrictions, and a high degree of manual intervention. These factors lead to considerable variability in operator workloads and significantly increase the complexity of scheduling. To address this challenge, this study investigates the Aircraft Pulsating Assembly Line Scheduling Problem (APALSP) under skilled operator allocation, with the objective of minimizing assembly completion time. A mathematical model considering skilled operator allocation is developed, and a Q-learning-improved Particle Swarm Optimization algorithm (QLPSO) is proposed. In the algorithm design, a reverse scheduling strategy is adopted to effectively manage large-scale precedence constraints. Moreover, a reverse sequence encoding method is introduced to generate operation sequences, while a time decoding mechanism is employed to determine completion times. The problem is further reformulated as a Markov Decision Process (MDP) with explicitly defined state and action spaces. Within QLPSO, the Q-learning mechanism adaptively adjusts inertia weights and learning factors, thereby achieving a balance between exploration capability and convergence performance. To validate the effectiveness of the proposed approach, extensive computational experiments are conducted on benchmark instances of different scales, including small, medium, large, and ultra-large cases. The results demonstrate that QLPSO consistently delivers stable and high-quality solutions across all scenarios. In ultra-large-scale instances, it improves the best solution by 25.2% compared with the Genetic Algorithm (GA) and enhances the average solution by 16.9% over the Q-learning algorithm, showing clear advantages over the comparative methods. These findings not only confirm the effectiveness of the proposed algorithm but also provide valuable theoretical references and practical guidance for the intelligent scheduling optimization of aircraft pulsating assembly lines.
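The adaptive parameter-control idea described above, where a Q-learning agent selects PSO inertia weights and learning factors based on search progress, can be sketched as follows. The states, actions, and reward used here are illustrative assumptions, not the paper's exact MDP design:

```python
import random

# Sketch of a Q-learning parameter controller for PSO: the agent observes a
# coarse search state and picks an (inertia weight, c1, c2) triple for the
# next PSO iteration. States/actions/rewards are illustrative assumptions.
ACTIONS = [(0.9, 2.0, 2.0), (0.7, 1.5, 1.5), (0.4, 1.0, 2.5)]  # (w, c1, c2)
STATES = ["improving", "stagnating"]

class QParamController:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = {(s, a): 0.0 for s in STATES for a in range(len(ACTIONS))}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def select(self, state):
        # epsilon-greedy choice of the parameter triple
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        return max(range(len(ACTIONS)), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # one-step Q-learning update
        best_next = max(self.q[(next_state, a)] for a in range(len(ACTIONS)))
        td = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td

ctrl = QParamController()
a = ctrl.select("stagnating")
w, c1, c2 = ACTIONS[a]        # parameters handed to the next PSO update
ctrl.update("stagnating", a, reward=1.0, next_state="improving")
```

Over many iterations the controller learns which parameter triples tend to produce improvement from each search state, trading off exploration (high inertia) against convergence (low inertia).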
Lung cancer, the leading cause of cancer deaths worldwide and in China, has a 19.7% five-year survival rate due to terminal-stage diagnosis [1-3]. Although low-dose computed tomography (CT) screening can reduce mortality, high false-positive rates can create economic and psychological burdens.
Tian et al present a timely machine learning (ML) model integrating biochemical and novel traditional Chinese medicine (TCM) indicators (tongue edge redness, greasy coating) to predict hepatic steatosis in patients at high metabolic risk. Their prospective cohort design and dual feature selection (LASSO + RFE), culminating in an interpretable XGBoost model (area under the curve: 0.82), represent a significant methodological advance. The inclusion of TCM diagnostics addresses the multisystem heterogeneity of metabolic dysfunction-associated fatty liver disease (MAFLD), a key strength that bridges holistic medicine with precision analytics and underscores potential cost savings over imaging-dependent screening. However, critical limitations impede clinical translation. First, the model's single-center validation (n = 711) lacks external generalizability testing across diverse populations, risking bias from local demographics. Second, MAFLD subtyping (e.g., lean MAFLD, diabetic MAFLD) was omitted despite acknowledged disease heterogeneity; this overlooks distinct pathophysiologies and may limit utility in stratified care. Third, while TCM features ranked among the top predictors in SHAP analysis, their clinical interpretability remains nebulous without mechanistic links to metabolic dysregulation. To resolve these gaps, we propose: (1) external validation in multiethnic cohorts using the published feature set (e.g., aspartate aminotransferase/alanine aminotransferase, low-density lipoprotein cholesterol, TCM tongue markers) to assess robustness; (2) subtype-specific modeling to capture MAFLD heterogeneity, potentially enhancing accuracy in high-risk subgroups; and (3) probing TCM microbiome/metabolomic correlations to ground tongue phenotypes in biological pathways, elevating model credibility. Despite these shortcomings, this work pioneers a low-cost screening paradigm. Future iterations addressing these issues could revolutionize early MAFLD detection in resource-limited settings.
Efficient edge caching is essential for maximizing utility in video streaming systems, especially under constraints such as limited storage capacity and dynamically fluctuating content popularity. Utility, defined as the benefit obtained per unit of cache bandwidth usage, degrades when static or greedy caching strategies fail to adapt to changing demand patterns. To address this, we propose a deep reinforcement learning (DRL)-based caching framework built upon the proximal policy optimization (PPO) algorithm. Our approach formulates edge caching as a sequential decision-making problem and introduces a reward model that balances cache hit performance and utility by prioritizing high-demand, high-quality content while penalizing degraded-quality delivery. We construct a realistic synthetic dataset that captures both temporal variations and shifting content popularity to validate our model. Experimental results demonstrate that our proposed method improves utility by up to 135.9% and achieves an average improvement of 22.6% compared to traditional greedy algorithms and long short-term memory (LSTM)-based prediction models. Moreover, our method consistently performs well across a variety of utility functions, workload distributions, and storage limitations, underscoring its adaptability and robustness in dynamic video caching environments.
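A reward model of the kind the abstract describes, rewarding hits on high-demand content while penalizing degraded-quality delivery, might look like the following sketch. The weights and the exact penalty form are assumptions, not the paper's formulation:

```python
# Hypothetical caching reward: hits earn a bonus, delivered utility scales
# with demand and quality, and serving below the requested quality is
# penalized. Weight values (w_hit, w_util) are illustrative assumptions.
def cache_reward(hit, demand, quality, served_quality, w_hit=1.0, w_util=0.5):
    r = w_hit * (1.0 if hit else -0.2)       # hit bonus / miss penalty
    r += w_util * demand * served_quality    # utility of what was delivered
    if served_quality < quality:             # degraded-quality delivery
        r -= w_util * demand * (quality - served_quality)
    return r

r_full = cache_reward(True, demand=0.9, quality=1.0, served_quality=1.0)
r_degraded = cache_reward(True, demand=0.9, quality=1.0, served_quality=0.5)
r_miss = cache_reward(False, demand=0.9, quality=1.0, served_quality=1.0)
```

A PPO agent trained against such a signal is pushed to cache popular content at full quality rather than merely maximizing raw hit counts.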
Traditional heat treatment methods require significant time and energy to drive atomic diffusion and promote the spheroidization of carbides in bearing steel, whereas pulsed current can accelerate atomic diffusion to achieve ultra-fast spheroidization of carbides. However, understanding the mechanism by which different pulse current parameters regulate the dissolution behavior of carbides requires a large amount of supporting experimental data, which limits the application of pulsed current technology in the field of heat treatment. On this basis, the pulse current processing data obtained were quantified to create a dataset suitable for machine learning. Through machine learning, the mechanism of mutual influence between carbide regulation and various factors was elucidated, and the optimal spheroidization process parameters were determined. Compared to the 20 h required for traditional heat treatment, the application of pulsed electric current technology achieved ultra-fast spheroidization of GCr15 bearing steel within 90 min.
To capture the nonlinear dynamics and gain evolution in chirped pulse amplification (CPA) systems, the split-step Fourier method and the fourth-order Runge–Kutta method are integrated to iteratively solve the generalized nonlinear Schrödinger equation and the rate equations. However, this approach is burdened by substantial computational demands, resulting in significant time expenditures. In the context of intelligent laser optimization and inverse design, the necessity for numerous simulations further exacerbates this issue, highlighting the need for fast and accurate simulation methodologies. Here, we introduce an end-to-end model augmented with active learning (E2E-AL) with decent generalization through different dedicated embedding methods over various parameters. On an identical computational platform, the artificial intelligence-driven model is 2000 times faster than the conventional simulation method. Benefiting from the active learning strategy, the E2E-AL model achieves decent precision with only two-thirds of the training samples compared with the case without such a strategy. Furthermore, we demonstrate a multi-objective inverse design of CPA systems enabled by the E2E-AL model. The E2E-AL framework manifests the potential of becoming a standard approach for the rapid and accurate modeling of ultrafast lasers and is readily extended to simulate other complex systems.
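The conventional baseline mentioned above, split-step Fourier integration of the nonlinear Schrödinger equation, can be illustrated with a minimal symmetric step. The gain and rate equations of the full CPA model are omitted here, and the fiber parameters and sech input are illustrative:

```python
import numpy as np

# Minimal symmetric split-step Fourier step for the scalar nonlinear
# Schrodinger equation: half a dispersion step in the Fourier domain, a full
# nonlinear phase rotation in time, then the second half dispersion step.
# beta2 and gamma values are illustrative, not those of any real CPA system.
def ssfm_step(A, dz, dt, beta2=-0.02, gamma=1.3):
    omega = 2 * np.pi * np.fft.fftfreq(A.size, d=dt)
    half_disp = np.exp(1j * (beta2 / 2) * omega**2 * (dz / 2))
    A = np.fft.ifft(half_disp * np.fft.fft(A))      # half dispersion step
    A = A * np.exp(1j * gamma * np.abs(A)**2 * dz)  # full nonlinear step
    return np.fft.ifft(half_disp * np.fft.fft(A))   # half dispersion step

t = np.linspace(-10, 10, 256)
A0 = 1.0 / np.cosh(t)                               # sech input pulse
A = A0.copy()
for _ in range(100):
    A = ssfm_step(A, dz=0.01, dt=t[1] - t[0])
```

Because both sub-steps apply unit-modulus phase factors, the scheme conserves pulse energy without gain, which makes it a useful sanity check before the rate equations are coupled in.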
Background: Genomic prediction has revolutionized animal breeding, with GBLUP being the most widely used prediction model. In theory, the accuracy of genomic prediction could be improved by incorporating information from QTL. This strategy could be especially beneficial for machine learning models that are able to distinguish informative from uninformative features. The objective of this study was to assess the benefit of incorporating QTL genotypes in GBLUP and machine learning models. This study simulated a selected livestock population in which QTL and their effects were known. We used four genomic prediction models, GBLUP, (weighted) 2GBLUP, random forest (RF), and support vector regression (SVR), to predict breeding values of young animals, and considered different scenarios that varied in the proportion of genetic variance explained by the included QTL. Results: 2GBLUP resulted in the highest accuracy. Its accuracy increased when the included QTL explained up to 80% of the genetic variance, after which the accuracy dropped. With a weighted 2GBLUP model, the accuracy always increased when more QTL were included. Prediction accuracy of GBLUP was consistently higher than that of SVR, and the accuracy of both models slightly increased with more QTL information included. The RF model resulted in the lowest prediction accuracy and did not improve when QTL information was included. Conclusions: Our results show that incorporating QTL information in GBLUP and SVR can improve prediction accuracy, but the extent of improvement varies across models. RF had a much lower prediction accuracy than the other models and did not show improvements when QTL information was added. Two possible reasons for this result are that the structure of our data does not allow RF to fully realize its potential and that RF is not well suited to this particular prediction problem. Our study highlights the importance of selecting appropriate models for genomic prediction and underscores the potential limitations of machine learning models when applied to genomic prediction in livestock.
The electromagnetic pulse valve, as a key component in baghouse dust removal systems, plays a crucial role in the performance of the system. However, despite the promising results of intelligent fault diagnosis methods based on extensive data in diagnosing electromagnetic valves, real-world diagnostic scenarios still face numerous challenges. Collecting fault data for electromagnetic pulse valves is not only time-consuming but also costly, making it difficult to obtain sufficient fault data in advance, which poses challenges for small-sample fault diagnosis. To address this issue, this paper proposes a fault diagnosis method for electromagnetic pulse valves based on deep transfer learning and simulated data. This method achieves effective transfer from simulated data to real data through four parameter transfer strategies, which combine parameter freezing and fine-tuning operations. Furthermore, this paper identifies a parameter transfer strategy that simultaneously fine-tunes the feature extractor and classifier, and introduces an attention mechanism to integrate fault features, thereby enhancing the correlation and information complementarity among multi-sensor data. The effectiveness of the proposed method is evaluated through two fault diagnosis cases under different operating conditions. In this study, small-sample data accounted for 7.9% and 8.2% of the total dataset, and the experimental results showed transfer accuracies of 93.5% and 94.2%, respectively, validating the reliability and effectiveness of the method under small-sample conditions.
Pulmonary embolism (PE) can range from minor, asymptomatic blood clots to life-threatening emboli capable of obstructing pulmonary arteries, potentially leading to cardiac arrest and fatal outcomes. Due to this significant mortality risk, risk stratification is essential following PE diagnosis to guide appropriate therapeutic intervention. This study proposes a machine learning-based methodology for PE risk stratification, utilizing clinical data from a cohort of 139 patients. The predictive framework integrates an enhanced binary Honey Badger Algorithm (BCCHBA) with the K-Nearest Neighbor (KNN) classifier. To comprehensively evaluate the performance of the core optimization algorithm (CCHBA), a series of benchmark function tests were conducted. Furthermore, diagnostic validation tests were performed using real-world PE patient data collected from medical facilities, demonstrating the clinical significance and practical utility of the BCCHBA-KNN system. Analysis revealed the critical importance of specific indicators, including neutrophil percentage (NEUT%), systolic blood pressure (SBP), oxygen saturation (SaO2%), white blood cell count (WBC), and syncope. The classification results demonstrated exceptional performance, with the prediction model achieving 100% sensitivity and 99.09% accuracy. This approach holds promise as a novel and accurate method for assessing PE severity.
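The KNN classification stage of such a pipeline can be sketched in a few lines. The features and values below are invented for illustration, not the study's patient data:

```python
from collections import Counter

# Toy KNN classifier on hypothetical, pre-normalized PE indicators
# [NEUT%, SBP, SaO2%]; all numbers below are made up for illustration.
def knn_predict(train_X, train_y, x, k=3):
    # rank training points by squared Euclidean distance to the query
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(train_X[i], x)))
    # majority vote among the k nearest neighbours
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

X = [[0.9, 0.2, 0.3], [0.8, 0.3, 0.2],   # illustrative high-risk profiles
     [0.2, 0.8, 0.9], [0.3, 0.9, 0.8]]   # illustrative low-risk profiles
y = ["high-risk", "high-risk", "low-risk", "low-risk"]
pred = knn_predict(X, y, [0.85, 0.25, 0.25])
```

In the study's framework, a metaheuristic such as BCCHBA would sit on top of this classifier, selecting which indicators to feed it.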
The design and optimization of nonlinear fiber laser sources, such as soliton self-frequency shift (SSFS) tunable sources and supercontinuum (SC) sources, have traditionally relied on manual tuning and simulations, posing challenges for real-time applications. Machine learning has shown promise in characterizing fiber nonlinear propagation, but the optimization and design of nonlinear systems remain relatively unexplored, especially under multi-target optimization conditions. In this paper, we propose a method that combines deep reinforcement learning (DRL) and deep neural networks (DNN) to achieve fast synchronization optimization of ultrafast pulse nonlinear propagation in optical fibers under multi-target optimization tasks, with applications demonstrated in complex SSFS and SC generation systems in the mid-infrared band. The results indicate that a set of optimization parameters can be obtained in a few seconds, enabling rapid, automated tuning of pulse parameters in pursuit of diverse optimization objectives. This integration of DRL and DNN models holds transformative potential for the real-time optimization of not only fiber lasers but also a wide variety of complex photonic systems, paving the way for intelligent, adaptive optical system design and operation.
Cyclohexene is an important raw material in the production of nylon, and selective hydrogenation of benzene is a key method for preparing it. However, the Ru catalysts used in current industrial processes still face challenges, including high metal usage, high process costs, and low cyclohexene yield. This study combines existing literature data with machine learning methods to analyze the factors influencing benzene conversion, cyclohexene selectivity, and yield in the benzene-to-cyclohexene hydrogenation reaction, constructing predictive models based on the XGBoost and Random Forest algorithms. The analysis found that reaction time, Ru content, and space velocity are key factors influencing cyclohexene yield, selectivity, and benzene conversion. Shapley Additive Explanations (SHAP) analysis and feature importance analysis further revealed the contribution of each variable to the reaction outcomes. Additionally, we randomly generated one million variable combinations using the Dirichlet distribution to predict high-yield catalyst formulations. This paper provides new insights into the application of machine learning in heterogeneous catalysis and offers a reference for further research.
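Generating candidate variable combinations from a Dirichlet distribution, as described above, can use the standard normalized-Gamma construction. The three-component setup and alpha values here are assumptions for illustration:

```python
import random

# Dirichlet sampling via the standard construction: draw independent
# Gamma(alpha_i, 1) variates and normalize them so the components sum to 1.
# The 3-component composition and uniform alphas are illustrative assumptions.
def dirichlet_sample(rng, alpha=(1.0, 1.0, 1.0)):
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]

rng = random.Random(42)
# each candidate is a vector of non-negative fractions summing to 1
candidates = [dirichlet_sample(rng) for _ in range(1000)]
```

Scaled up to one million draws, each sampled composition can be fed through the trained XGBoost/Random Forest models to screen for predicted high-yield formulations.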
Grasping is one of the most fundamental operations in modern robotics applications. While deep reinforcement learning (DRL) has demonstrated strong potential in robotics, existing work places too much emphasis on maximizing the cumulative reward when executing tasks, and potential safety risks are often ignored. In this paper, an optimization method based on safe reinforcement learning (Safe RL) is proposed to address the robotic grasping problem under safety constraints. Specifically, considering the obstacle avoidance constraints of the system, the grasping problem of the manipulator is modeled as a Constrained Markov Decision Process (CMDP). A Lagrange multiplier and a dynamic weighting mechanism are introduced into the Proximal Policy Optimization (PPO) framework, leading to the dynamic weighted Lagrange PPO (DWL-PPO) algorithm, in which behavior violating safety constraints is punished while the policy is optimized. In addition, the orientation control of the end-effector is included in the reward function, and a compound reward function adapted to changes in pose is designed. Ultimately, the efficacy and advantages of the proposed method are demonstrated by extensive training and testing in the PyBullet simulator. The grasping experiments reveal that the proposed approach provides superior safety and efficiency compared with other advanced RL methods and achieves a good trade-off between model learning and risk aversion.
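The Lagrangian relaxation at the heart of constrained policy optimization, penalizing constraint violations with a multiplier updated by dual ascent, reduces to a short sketch. The learning rate, cost signal, and limit below are illustrative assumptions, not DWL-PPO's actual values:

```python
# Sketch of Lagrangian-style constrained RL: the policy ascends
# reward - lambda * cost, while lambda itself grows by dual ascent whenever
# the measured cost exceeds its allowed limit. All numbers are illustrative.
def lagrangian_objective(reward, cost, lam):
    # objective the policy gradient would ascend
    return reward - lam * cost

def update_multiplier(lam, cost, limit, lr=0.05):
    # dual ascent step, projected back onto lambda >= 0
    return max(0.0, lam + lr * (cost - limit))

lam = 0.0
for _ in range(50):                 # pretend each iteration measures a cost
    episode_cost = 0.4              # e.g., observed collision rate
    lam = update_multiplier(lam, episode_cost, limit=0.1)
```

As long as the policy keeps violating the constraint, lambda keeps rising, so safety violations are penalized ever more heavily until the cost falls back under the limit.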
As urbanization continues to accelerate, the challenges associated with managing transportation in metropolitan areas become increasingly complex. The surge in population density contributes to traffic congestion, degrading travel experiences and posing safety risks. Smart urban transportation management emerges as a strategic solution, conceptualized here as a multidimensional big data problem. The success of this strategy hinges on the effective collection of information from diverse, extensive, and heterogeneous data sources, necessitating the implementation of full-stack Information and Communication Technology (ICT) solutions. The main aim of this work is to investigate current Intelligent Transportation Systems (ITS) technologies and enhance the safety of urban transportation systems. Machine learning models, trained on historical data, can predict traffic congestion, allowing for the implementation of preventive measures. Deep learning architectures, with their ability to handle complex data representations, further refine traffic predictions, contributing to more accurate and dynamic transportation management. The background of this research underscores the challenges posed by traffic congestion in metropolitan areas and emphasizes the need for advanced technological solutions. By integrating GPS and GIS technologies with machine learning algorithms, this work contributes to the development of intelligent transportation systems that not only address current challenges but also pave the way for future advancements in urban transportation management.
BACKGROUND: The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional histopathology and standard slide-level deep learning often fail to capture the sparse and diagnostically critical features of metastatic potential. AIM: To develop and validate a case-level multiple-instance learning (MIL) framework mimicking a pathologist's comprehensive review and improve LNM prediction in T3/T4 CRC. METHODS: Whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin and eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC). RESULTS: The case-level framework demonstrated superior LNM prediction over slide-level training, with the CONCH v1.5 model achieving a mean AUC (± SD) of 0.899 ± 0.033 vs 0.814 ± 0.083, respectively. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904 ± 0.047, in sharp contrast to a clinical-only model (mean AUC 0.584 ± 0.084). Crucially, a pathologist's review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features. CONCLUSION: A case-level MIL framework provides a superior approach for predicting LNM in advanced CRC. This method shows promise for risk stratification and therapy decisions, and requires further validation.
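Attention-based pooling is a common way case-level MIL frameworks aggregate per-tile features into one case embedding, and it also explains the "high-attention regions" mentioned above. In this sketch a fixed scoring vector stands in for the learned attention network, and all numbers are invented:

```python
import math

# Sketch of attention-based MIL pooling: per-tile feature vectors are
# combined into one case-level vector using softmax attention weights.
# The fixed score_w vector is a stand-in for a learned attention module.
def attention_pool(instances, score_w):
    scores = [sum(w * x for w, x in zip(score_w, inst)) for inst in instances]
    m = max(scores)                          # stabilized softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    attn = [e / z for e in exps]
    dim = len(instances[0])
    pooled = [sum(a * inst[d] for a, inst in zip(attn, instances))
              for d in range(dim)]
    return pooled, attn

tiles = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]  # three tiles, 2-D features
pooled, attn = attention_pool(tiles, score_w=[1.0, 1.0])
```

The attention weights double as an interpretability signal: tiles with the highest weights are the regions a pathologist can review against known high-risk features.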
The solar cycle (SC), a phenomenon caused by quasi-periodic regular activity in the Sun, occurs approximately every 11 years. Intense solar activity can disrupt the Earth's ionosphere, affecting communication and navigation systems. Consequently, accurately predicting the intensity of the SC holds great significance, but predicting the SC involves a long-term time series, and many existing time series forecasting methods have fallen short in terms of accuracy and efficiency. The Time-series Dense Encoder model is a deep learning solution tailored for long time series prediction. Based on a multi-layer perceptron structure, it outperforms the best previously existing models in accuracy, while being efficiently trainable on general datasets. We propose a method based on this model for SC forecasting. Using a trained model, we predict the test set from SC 19 to SC 25 with an average mean absolute percentage error of 32.02, root mean square error of 30.3, mean absolute error of 23.32, and R^2 (coefficient of determination) of 0.76, outperforming other deep learning models in accuracy and training efficiency on sunspot number datasets. Subsequently, we use it to predict the peaks of SC 25 and SC 26. The peak of SC 25 has already passed, but a stronger peak of 199.3, within a range of 170.8-221.9, is predicted for SC 26, projected to occur in April 2034.
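The error metrics quoted above (MAPE, RMSE, MAE, R^2) are standard and can be written out explicitly; the toy series below is illustrative, not sunspot data:

```python
import math

# The four forecast-error metrics quoted in the abstract, written out in full.
def mape(y, yhat):  # mean absolute percentage error (requires y != 0)
    return 100.0 / len(y) * sum(abs((a - b) / a) for a, b in zip(y, yhat))

def rmse(y, yhat):  # root mean square error
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def mae(y, yhat):   # mean absolute error
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def r2(y, yhat):    # coefficient of determination
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot

y, yhat = [100.0, 120.0, 150.0], [110.0, 118.0, 140.0]  # toy series
```

Reporting all four together, as the abstract does, guards against a model that looks good on one metric (e.g., low MAE) while fitting the cycle's shape poorly (low R^2).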
Unmanned Aerial Vehicles (UAVs) have become integral components of smart city infrastructures, supporting applications such as emergency response, surveillance, and data collection. However, the high mobility and dynamic topology of Flying Ad Hoc Networks (FANETs) present significant challenges for maintaining reliable, low-latency communication. Conventional geographic routing protocols often struggle in situations where link quality varies and mobility patterns are unpredictable. To overcome these limitations, this paper proposes an improved routing protocol based on reinforcement learning, integrating Q-learning with mechanisms that are both link-aware and mobility-aware. The proposed method optimizes the selection of relay nodes by using an adaptive reward function that takes into account energy consumption, delay, and link quality. Additionally, a Kalman filter is integrated to predict UAV mobility, improving the stability of communication links under dynamic network conditions. Simulation experiments were conducted using realistic scenarios, varying the number of UAVs to assess scalability. Key performance metrics were analyzed, including packet delivery ratio, end-to-end delay, and total energy consumption. The results demonstrate that the proposed approach significantly improves the packet delivery ratio by 12%-15% and reduces delay by up to 25.5% when compared to conventional GEO and QGEO protocols. However, this improvement comes at the cost of higher energy consumption due to additional computations and control overhead. Despite this trade-off, the proposed solution ensures reliable and efficient communication, making it well suited for large-scale UAV networks operating in complex urban environments.
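A scalar Kalman filter of the kind used to smooth and predict UAV positions can be sketched as follows; the random-walk process model and noise values are assumptions, simpler than the constant-velocity models typically used in FANET work:

```python
# Scalar Kalman filter: smooth noisy 1-D position readings under a
# random-walk process model. Process noise q and measurement noise r
# are illustrative assumptions.
def kalman_filter(measurements, q=0.01, r=0.25):
    x, p = measurements[0], 1.0     # initial state estimate and variance
    estimates = []
    for z in measurements:
        p = p + q                   # predict: uncertainty grows
        k = p / (p + r)             # Kalman gain
        x = x + k * (z - x)         # correct with the new measurement
        p = (1.0 - k) * p           # updated uncertainty
        estimates.append(x)
    return estimates

est = kalman_filter([10.0, 10.4, 9.8, 10.2, 10.1, 9.9])
```

In the routing protocol, such filtered estimates let a node judge whether a candidate relay will still be in range when the packet arrives, stabilizing link selection under mobility.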
First-principles calculations based on density functional theory (DFT) have had a significant impact on chemistry, physics, and materials science, enabling in-depth exploration of the structural and electronic properties of a wide variety of materials. Among different implementations of DFT, the plane-wave method is widely used for periodic systems because of its high accuracy. However, this method typically requires a large number of basis functions for large systems, leading to high computational costs. Localized basis sets, such as the muffin-tin orbital (MTO) method, have been introduced to provide a more efficient description of electronic structure with a reduced basis set, albeit at the cost of reduced computational accuracy. In this work, we propose an optimization strategy using machine-learning techniques to automate the choice of MTO basis-set parameters, thereby improving the accuracy and efficiency of MTO-based calculations. Default MTO parameter settings focus primarily on lattice structure and give less consideration to element-specific differences. In contrast, our optimized parameters incorporate both structural and elemental information. Based on these converged parameters, we successfully recovered missing bands for CrTe2. For the other three materials, Si, GaAs, and CrI3, we achieved band improvements of up to 2 eV. Furthermore, the generalization of the machine-learned method is validated by perturbation, strain, and elemental substitution, resulting in improved band structures. Additionally, lattice-constant optimization for GaAs using the converged parameters yields closer agreement with experiment.
The rapid advancement of machine-learning-based tight-binding Hamiltonian (MLTB) methods has opened new avenues for efficient and accurate electronic structure simulations, particularly in large-scale systems and long-time scenarios. This review begins with a concise overview of traditional tight-binding (TB) models, including both (semi-)empirical and first-principles approaches, establishing the foundation for understanding MLTB developments. We then present a systematic classification of existing MLTB methodologies, grouped into two major categories: direct prediction of TB Hamiltonian elements and inference of empirical parameters. A comparative analysis with other ML-based electronic structure models is also provided, highlighting the advancement of MLTB approaches. Finally, we explore the emerging MLTB application ecosystem, highlighting how the integration of MLTB models with a diverse suite of post-processing tools, from linear-scaling solvers to quantum transport frameworks and molecular dynamics interfaces, is essential for tackling complex scientific problems across different domains. The continued advancement of this integrated paradigm promises to accelerate materials discovery and open new frontiers in the predictive simulation of complex quantum phenomena.
Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Omics datasets (transcriptomic, epigenomic, proteomic, etc.) generated by high-throughput sequencing technology have become prominent in biomedical research, revealing molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes it challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), as well as accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR-, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. Through an integrated bioinformatics and ML analysis of the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity of -10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52475543), the Natural Science Foundation of Henan (Grant No. 252300421101), the Henan Province University Science and Technology Innovation Talent Support Plan (Grant No. 24HASTIT048), and the Science and Technology Innovation Team Project of Zhengzhou University of Light Industry (Grant No. 23XNKJTD0101).
Abstract: Aircraft assembly is characterized by stringent precedence constraints, limited resource availability, spatial restrictions, and a high degree of manual intervention. These factors lead to considerable variability in operator workloads and significantly increase the complexity of scheduling. To address this challenge, this study investigates the Aircraft Pulsating Assembly Line Scheduling Problem (APALSP) under skilled operator allocation, with the objective of minimizing assembly completion time. A mathematical model considering skilled operator allocation is developed, and a Q-learning-improved Particle Swarm Optimization algorithm (QLPSO) is proposed. In the algorithm design, a reverse scheduling strategy is adopted to effectively manage large-scale precedence constraints. Moreover, a reverse sequence encoding method is introduced to generate operation sequences, while a time decoding mechanism is employed to determine completion times. The problem is further reformulated as a Markov Decision Process (MDP) with explicitly defined state and action spaces. Within QLPSO, the Q-learning mechanism adaptively adjusts inertia weights and learning factors, thereby achieving a balance between exploration capability and convergence performance. To validate the effectiveness of the proposed approach, extensive computational experiments are conducted on benchmark instances of different scales, including small, medium, large, and ultra-large cases. The results demonstrate that QLPSO consistently delivers stable and high-quality solutions across all scenarios. In ultra-large-scale instances, it improves the best solution by 25.2% compared with the Genetic Algorithm (GA) and enhances the average solution by 16.9% over the Q-learning algorithm, showing clear advantages over the comparative methods. These findings not only confirm the effectiveness of the proposed algorithm but also provide valuable theoretical references and practical guidance for the intelligent scheduling optimization of aircraft pulsating assembly lines.
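The core idea — a Q-learning agent picking the PSO inertia weight and learning factors each iteration — can be sketched as follows. This is an illustrative toy, not the paper's implementation: a sphere objective stands in for the makespan decoder, and the three diversity-based state bins and the candidate (w, c1, c2) triples are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(x):
    """Toy objective standing in for the makespan returned by the time decoder."""
    return float(np.sum(x ** 2))

# Q-table: states = coarse swarm-diversity bins, actions = candidate (w, c1, c2)
# triples. Both the bins and the triples are illustrative choices.
ACTIONS = [(0.9, 2.0, 1.0), (0.6, 1.5, 1.5), (0.4, 1.0, 2.0)]
Q = np.zeros((3, len(ACTIONS)))
alpha, gamma_rl, eps = 0.1, 0.9, 0.2

def state_of(swarm):
    d = float(np.mean(np.std(swarm, axis=0)))      # average spread per dimension
    return 0 if d > 1.0 else (1 if d > 0.1 else 2)

n, dim = 20, 5
X = rng.uniform(-5.0, 5.0, (n, dim))
V = np.zeros((n, dim))
P, pbest = X.copy(), np.array([cost(x) for x in X])
g_idx = int(np.argmin(pbest))
g, gbest = P[g_idx].copy(), float(pbest[g_idx])

for _ in range(150):
    s = state_of(X)
    a = int(rng.integers(len(ACTIONS))) if rng.random() < eps else int(np.argmax(Q[s]))
    w, c1, c2 = ACTIONS[a]
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)   # standard PSO velocity update
    X = X + V
    f = np.array([cost(x) for x in X])
    better = f < pbest
    P[better], pbest[better] = X[better], f[better]
    cur = float(pbest.min())
    reward = 1.0 if cur < gbest else -0.1               # reward global-best improvement
    if cur < gbest:
        gbest, g = cur, P[int(np.argmin(pbest))].copy()
    s2 = state_of(X)
    Q[s, a] += alpha * (reward + gamma_rl * Q[s2].max() - Q[s, a])
```

The agent thus learns, per diversity regime, whether exploratory (large w, c1) or convergent (small w, large c2) settings pay off, which is the exploration/convergence balance the abstract describes.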
Funding: Supported by the National Natural Science Foundation of China (grant numbers 82204127 and 72204172).
Abstract: Lung cancer, the leading cause of cancer deaths worldwide and in China, has a 19.7% five-year survival rate due to terminal-stage diagnosis [1-3]. Although low-dose computed tomography (CT) screening can reduce mortality, high false-positive rates can create economic and psychological burdens.
Abstract: Tian et al present a timely machine learning (ML) model integrating biochemical and novel traditional Chinese medicine (TCM) indicators (tongue-edge redness, greasy coating) to predict hepatic steatosis in patients at high metabolic risk. Their prospective cohort design and dual feature selection (LASSO + RFE), culminating in an interpretable XGBoost model (area under the curve: 0.82), represent a significant methodological advance. The inclusion of TCM diagnostics addresses the multisystem heterogeneity of metabolic dysfunction-associated fatty liver disease (MAFLD), a key strength that bridges holistic medicine with precision analytics and underscores potential cost savings over imaging-dependent screening. However, critical limitations impede clinical translation. First, the model's single-center validation (n = 711) lacks external generalizability testing across diverse populations, risking bias from local demographics. Second, MAFLD subtyping (e.g., lean MAFLD, diabetic MAFLD) was omitted despite acknowledged disease heterogeneity; this overlooks distinct pathophysiologies and may limit utility in stratified care. Third, while TCM features ranked among the top predictors in SHAP analysis, their clinical interpretability remains nebulous without mechanistic links to metabolic dysregulation. To resolve these gaps, we propose: (1) external validation in multiethnic cohorts using the published feature set (e.g., aspartate aminotransferase/alanine aminotransferase, low-density lipoprotein cholesterol, TCM tongue markers) to assess robustness; (2) subtype-specific modeling to capture MAFLD heterogeneity, potentially enhancing accuracy in high-risk subgroups; and (3) probing TCM microbiome/metabolomic correlations to ground tongue phenotypes in biological pathways, elevating model credibility. Despite these shortcomings, this work pioneers a low-cost screening paradigm. Future iterations addressing these issues could revolutionize early MAFLD detection in resource-limited settings.
文摘Efficient edge caching is essential for maximizing utility in video streaming systems,especially under constraints such as limited storage capacity and dynamically fluctuating content popularity.Utility,defined as the benefit obtained per unit of cache bandwidth usage,degrades when static or greedy caching strategies fail to adapt to changing demand patterns.To address this,we propose a deep reinforcement learning(DRL)-based caching framework built upon the proximal policy optimization(PPO)algorithm.Our approach formulates edge caching as a sequential decision-making problem and introduces a reward model that balances cache hit performance and utility by prioritizing high-demand,high-quality content while penalizing degraded quality delivery.We construct a realistic synthetic dataset that captures both temporal variations and shifting content popularity to validate our model.Experimental results demonstrate that our proposed method improves utility by up to 135.9%and achieves an average improvement of 22.6%compared to traditional greedy algorithms and long short-term memory(LSTM)-based prediction models.Moreover,our method consistently performs well across a variety of utility functions,workload distributions,and storage limitations,underscoring its adaptability and robustness in dynamic video caching environments.
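A reward of this shape — benefit per unit bandwidth for popular, full-quality hits, with a penalty for serving degraded quality — can be written in a few lines. The weighting scheme and the penalty coefficient below are illustrative assumptions, not the paper's actual reward model:

```python
def cache_reward(hit, popularity, q_served, q_requested, bw_cost, degrade_penalty=0.5):
    """Utility-per-bandwidth reward for one caching decision (illustrative).

    hit           -- whether the request was served from the edge cache
    popularity    -- demand weight of the content in [0, 1]
    q_served/q_requested -- quality levels actually served vs requested
    bw_cost       -- cache bandwidth consumed by serving this request
    """
    if not hit:
        return 0.0                                   # misses earn nothing
    benefit = popularity * q_served                  # favour popular, high-quality hits
    penalty = degrade_penalty * max(0.0, q_requested - q_served)
    return (benefit - penalty) / bw_cost             # utility = benefit per unit bandwidth

# serving popular content at full quality over 2 bandwidth units
r = cache_reward(True, popularity=0.8, q_served=1.0, q_requested=1.0, bw_cost=2.0)
```

A PPO agent maximizing the discounted sum of such rewards is then pushed toward exactly the trade-off the abstract describes: hit rate weighted by delivered utility rather than raw hit count.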
Funding: Supported by the National Key R&D Program of China (2020YFA0714900, 2023YFB3709903), the National Natural Science Foundation of China (U21B2082, 52474410), the Key R&D Program of Shandong Province, China (2023CXGC010406), the Scientific Research Special Project for First-Class Disciplines in Inner Mongolia Autonomous Region (YLXKZX-NKD-001), the International Science and Technology Cooperation Project of Higher Education Institutions in Inner Mongolia Autonomous Region (GHXM-002), the Natural Science Foundation of Inner Mongolia Autonomous Region of China (2024ZD06), the Technology Support Project for the Construction of Major Innovation Platforms in Inner Mongolia Autonomous Region (XM2024XTGXQ16), the Beijing Municipal Natural Science Foundation (2222065), and the Fundamental Research Funds for the Central Universities (FRF-TP-22-02C2).
Abstract: Traditional heat treatment methods require a significant amount of time and energy to drive atomic diffusion and advance the spheroidization of carbides in bearing steel, whereas pulsed current can accelerate atomic diffusion to achieve ultra-fast carbide spheroidization. However, understanding how different pulse-current parameters regulate carbide dissolution behavior requires a large amount of experimental data, which limits the application of pulsed-current technology in heat treatment. Motivated by this, we quantified the available pulsed-current processing data to build a dataset suitable for machine learning. Through machine learning, the mechanisms by which carbide regulation and the various process factors influence one another were elucidated, and the optimal spheroidization process parameters were determined. Compared to the 20 h required for traditional heat treatment, the application of pulsed electric current technology achieved ultra-fast spheroidization of GCr15 bearing steel within 90 min.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62227821, 62025503, and 62205199).
Abstract: To capture the nonlinear dynamics and gain evolution in chirped pulse amplification (CPA) systems, the split-step Fourier method and the fourth-order Runge–Kutta method are integrated to iteratively solve the generalized nonlinear Schrödinger equation and the rate equations. However, this approach is burdened by substantial computational demands, resulting in significant time expenditures. In the context of intelligent laser optimization and inverse design, the necessity for numerous simulations further exacerbates this issue, highlighting the need for fast and accurate simulation methodologies. Here, we introduce an end-to-end model augmented with active learning (E2E-AL) with decent generalization through different dedicated embedding methods over various parameters. On an identical computational platform, the artificial intelligence-driven model is 2000 times faster than the conventional simulation method. Benefiting from the active learning strategy, the E2E-AL model achieves decent precision with only two-thirds of the training samples compared with the case without such a strategy. Furthermore, we demonstrate a multi-objective inverse design of CPA systems enabled by the E2E-AL model. The E2E-AL framework manifests the potential of becoming a standard approach for the rapid and accurate modeling of ultrafast lasers and is readily extended to simulate other complex systems.
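The conventional baseline the abstract refers to is the split-step Fourier method. A minimal sketch of its core step for the basic NLSE (dispersion plus Kerr nonlinearity only, with the gain/rate-equation coupling the paper handles via Runge–Kutta omitted) looks like this; sign conventions follow ∂A/∂z = -i(β₂/2)∂²A/∂t² + iγ|A|²A:

```python
import numpy as np

def ssfm_step(A, dz, beta2, gamma, omega):
    """One symmetric split-step: half dispersion, full nonlinearity, half dispersion."""
    half_D = np.exp(1j * (beta2 / 2.0) * omega ** 2 * (dz / 2.0))  # linear half-step (frequency domain)
    A = np.fft.ifft(half_D * np.fft.fft(A))
    A = A * np.exp(1j * gamma * np.abs(A) ** 2 * dz)               # full nonlinear phase rotation
    A = np.fft.ifft(half_D * np.fft.fft(A))
    return A

# demo: a fundamental soliton (beta2 = -1, gamma = 1, sech envelope) should
# propagate with its magnitude essentially unchanged
n = 256
t = np.linspace(-20.0, 20.0, n, endpoint=False)
omega = 2 * np.pi * np.fft.fftfreq(n, d=t[1] - t[0])
A0 = (1.0 / np.cosh(t)).astype(complex)
A = A0.copy()
for _ in range(100):
    A = ssfm_step(A, dz=0.01, beta2=-1.0, gamma=1.0, omega=omega)
```

Each time step costs two FFT pairs, and thousands of steps per simulation are typical — which is the per-run cost the E2E-AL surrogate amortizes away.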
Funding: The authors acknowledge financial support from the China Scholarship Council (CSC, File No. 202007720040), which sponsored Jifan Yang's PhD study at Wageningen University & Research.
Abstract: Background: Genomic prediction has revolutionized animal breeding, with GBLUP being the most widely used prediction model. In theory, the accuracy of genomic prediction could be improved by incorporating information from QTL. This strategy could be especially beneficial for machine learning models that are able to distinguish informative from uninformative features. The objective of this study was to assess the benefit of incorporating QTL genotypes in GBLUP and machine learning models. This study simulated a selected livestock population where QTL and their effects were known. We used four genomic prediction models, GBLUP, (weighted) 2GBLUP, random forest (RF), and support vector regression (SVR), to predict breeding values of young animals, and considered different scenarios that varied in the proportion of genetic variance explained by the included QTL. Results: 2GBLUP resulted in the highest accuracy. Its accuracy increased when the included QTL explained up to 80% of the genetic variance, after which the accuracy dropped. With a weighted 2GBLUP model, the accuracy always increased when more QTL were included. The prediction accuracy of GBLUP was consistently higher than that of SVR, and the accuracy of both models slightly increased with more QTL information included. The RF model resulted in the lowest prediction accuracy and did not improve when QTL information was included. Conclusions: Our results show that incorporating QTL information in GBLUP and SVR can improve prediction accuracy, but the extent of improvement varies across models. RF had a much lower prediction accuracy than the other models and did not show improvements when QTL information was added. Two possible reasons for this result are that the structure of our data does not allow RF to fully realize its potential, and that RF is not well suited to this particular prediction problem. Our study highlights the importance of selecting appropriate models for genomic prediction and underscores the potential limitations of machine learning models when applied to genomic prediction in livestock.
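For readers unfamiliar with GBLUP, the vanilla version (not the weighted 2GBLUP variant studied here) reduces to ridge regression on a genomic relationship matrix. The sketch below is a toy with simulated genotypes and effects; the matrix construction is a simplified VanRaden-style form, and all sizes and variance values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 60, 300                                      # animals, SNP markers (toy sizes)
Z = rng.integers(0, 3, size=(n, m)).astype(float)   # allele counts coded 0/1/2
Zc = Z - Z.mean(axis=0)                             # centre per marker
G = Zc @ Zc.T / m                                   # simplified genomic relationship matrix

b = rng.normal(0.0, 1.0 / np.sqrt(m), size=m)       # true (simulated) marker effects
u = Zc @ b                                          # true breeding values
y = u + rng.normal(0.0, 0.3, size=n)                # phenotypes = breeding value + noise

lam = 0.09 / u.var()                                # variance ratio sigma_e^2 / sigma_u^2
u_hat = G @ np.linalg.solve(G + lam * np.eye(n), y) # BLUP of breeding values

acc = float(np.corrcoef(u_hat, u)[0, 1])            # "accuracy": corr(estimated, true)
```

Incorporating QTL information, as in the study, amounts to giving known causal markers their own (or up-weighted) variance components rather than pooling them into G.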
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51675040).
Abstract: The electromagnetic pulse valve, a key component in baghouse dust removal systems, plays a crucial role in system performance. However, despite the promising results of intelligent fault diagnosis methods based on extensive data in diagnosing electromagnetic valves, real-world diagnostic scenarios still face numerous challenges. Collecting fault data for electromagnetic pulse valves is not only time-consuming but also costly, making it difficult to obtain sufficient fault data in advance, which poses challenges for small-sample fault diagnosis. To address this issue, this paper proposes a fault diagnosis method for electromagnetic pulse valves based on deep transfer learning and simulated data. This method achieves effective transfer from simulated data to real data through four parameter-transfer strategies that combine parameter freezing and fine-tuning operations. Furthermore, this paper identifies a parameter-transfer strategy that simultaneously fine-tunes the feature extractor and classifier, and introduces an attention mechanism to integrate fault features, thereby enhancing the correlation and information complementarity among multi-sensor data. The effectiveness of the proposed method is evaluated through two fault diagnosis cases under different operating conditions. In this study, small-sample data accounted for 7.9% and 8.2% of the total dataset, and the experimental results showed transfer accuracies of 93.5% and 94.2%, respectively, validating the reliability and effectiveness of the method under small-sample conditions.
Abstract: Pulmonary embolism (PE) can range from minor, asymptomatic blood clots to life-threatening emboli capable of obstructing pulmonary arteries, potentially leading to cardiac arrest and fatal outcomes. Due to this significant mortality risk, risk stratification is essential following PE diagnosis to guide appropriate therapeutic intervention. This study proposes a machine learning-based methodology for PE risk stratification, utilizing clinical data from a cohort of 139 patients. The predictive framework integrates an enhanced binary Honey Badger Algorithm (BCCHBA) with the K-Nearest Neighbor (KNN) classifier. To comprehensively evaluate the performance of the core optimization algorithm (CCHBA), a series of benchmark function tests were conducted. Furthermore, diagnostic validation tests were performed using real-world PE patient data collected from medical facilities, demonstrating the clinical significance and practical utility of the BCCHBA-KNN system. Analysis revealed the critical importance of specific indicators, including neutrophil percentage (NEUT%), systolic blood pressure (SBP), oxygen saturation (SaO2%), white blood cell count (WBC), and syncope. The classification results demonstrated exceptional performance, with the prediction model achieving 100% sensitivity and 99.09% accuracy. This approach holds promise as a novel and accurate method for assessing PE severity.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62575051), the Aeronautical Science Foundation of China (Grant No. 2023M038080001), the Equipment Pre-research Joint Fund of the Ministry of Education (Grant No. 8091B042228), and the Science and Technology Project of Sichuan Province (Grant Nos. 2023NSFSC1964 and 203NSFSC0033).
Abstract: The design and optimization of nonlinear fiber laser sources, such as soliton self-frequency shift (SSFS) tunable sources and supercontinuum (SC) sources, have traditionally relied on manual tuning and simulations, posing challenges for real-time applications. Machine learning has shown promise in characterizing nonlinear propagation in fibers, but the optimization and design of nonlinear systems remain relatively unexplored, especially under multi-target optimization conditions. In this paper, we propose a method that combines deep reinforcement learning (DRL) and a deep neural network (DNN) to achieve fast, synchronized optimization of ultrafast-pulse nonlinear propagation in optical fibers under multi-target optimization tasks, with applications demonstrated in complex SSFS and SC generation systems in the mid-infrared band. The results indicate that a set of optimization parameters can be obtained in a few seconds, enabling rapid, automated tuning of pulse parameters in pursuit of diverse optimization objectives. This integration of DRL and DNN models holds transformative potential for the real-time optimization of not only fiber lasers but also a wide variety of complex photonic systems, paving the way for intelligent, adaptive optical system design and operation.
Funding: Supported by the CAS Basic and Interdisciplinary Frontier Scientific Research Pilot Project (XDB1190300, XDB1190302), the Youth Innovation Promotion Association CAS (Y2021056), the Joint Fund of Yulin University and the Dalian National Laboratory for Clean Energy (YLU-DNL Fund 2022007), and the Special Fund for Science and Technology Innovation Teams of Shanxi Province (202304051001007).
Abstract: Cyclohexene is an important raw material in the production of nylon, and selective hydrogenation of benzene is a key method for preparing it. However, the Ru catalysts used in current industrial processes still face challenges, including high metal usage, high process costs, and low cyclohexene yield. This study combines existing literature data with machine learning methods to analyze the factors influencing benzene conversion, cyclohexene selectivity, and yield in the hydrogenation of benzene to cyclohexene, constructing predictive models based on the XGBoost and Random Forest algorithms. The analysis found that reaction time, Ru content, and space velocity are the key factors influencing cyclohexene yield, selectivity, and benzene conversion. SHapley Additive exPlanations (SHAP) analysis and feature-importance analysis further revealed the contribution of each variable to the reaction outcomes. Additionally, we randomly generated one million variable combinations using the Dirichlet distribution to predict high-yield catalyst formulations. This paper provides new insights into the application of machine learning in heterogeneous catalysis and offers a reference for further research.
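The Dirichlet distribution is a natural sampler for compositions, because every draw is a vector of non-negative fractions summing to one. A minimal sketch of generating the candidate pool — with hypothetical placeholder component names, not the paper's actual formulation variables — is:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical composition variables whose fractions must sum to 1.
components = ["Ru", "promoter", "support"]
# alpha = 1 for each component gives a uniform distribution over the simplex;
# skewing alpha would bias sampling toward particular components.
candidates = rng.dirichlet(alpha=np.ones(len(components)), size=1_000_000)
```

Each sampled row can then be fed to the trained XGBoost/Random Forest models to screen for predicted high-yield formulations, which is the virtual-screening step the abstract describes.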
Abstract: Grasping is one of the most fundamental operations in modern robotics applications. While deep reinforcement learning (DRL) has demonstrated strong potential in robotics, existing work overemphasizes maximizing the cumulative reward during task execution, and the potential safety risks are often ignored. In this paper, an optimization method based on safe reinforcement learning (Safe RL) is proposed to address the robotic grasping problem under safety constraints. Specifically, considering the obstacle-avoidance constraints of the system, the grasping problem of the manipulator is modeled as a Constrained Markov Decision Process (CMDP). A Lagrange multiplier and a dynamic weighting mechanism are introduced into the Proximal Policy Optimization (PPO) framework, leading to the dynamic weighted Lagrange PPO (DWL-PPO) algorithm. In this method, behavior that violates safety constraints is punished while the policy is optimized. In addition, orientation control of the end-effector is included in the reward function, and a compound reward function adapted to changes in pose is designed. Ultimately, the efficacy and advantages of the proposed method are demonstrated through extensive training and testing in the PyBullet simulator. The grasping experiments reveal that the proposed approach provides superior safety and efficiency compared with other advanced RL methods and achieves a good trade-off between model learning and risk aversion.
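The Lagrange-multiplier mechanism in such CMDP methods typically follows a simple dual update: the multiplier grows while the policy's expected constraint cost exceeds its limit and decays (never below zero) once the policy is safe. A generic sketch — the step size and the toy "policy response" are illustrative, not DWL-PPO's actual update rule:

```python
def dual_update(lmbda, cost_estimate, cost_limit, lr=0.05):
    """Projected gradient ascent on the Lagrange multiplier of a CMDP:
    raise the safety penalty weight under violation, relax it when safe."""
    return max(0.0, lmbda + lr * (cost_estimate - cost_limit))

# toy trace: a policy whose constraint cost shrinks as lambda pressures it
lmbda, cost = 0.0, 2.0
for _ in range(50):
    lmbda = dual_update(lmbda, cost, cost_limit=1.0)
    cost = max(0.5, cost - 0.05 * lmbda)      # stand-in for policy improvement
```

In the full algorithm this penalty weight multiplies the constraint-cost advantage inside the PPO surrogate objective, so "punishing violations while optimizing the policy" is a single combined gradient step.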
Abstract: As urbanization continues to accelerate, the challenges associated with managing transportation in metropolitan areas become increasingly complex. The surge in population density contributes to traffic congestion, degrading travel experiences and posing safety risks. Smart urban transportation management emerges as a strategic solution, conceptualized here as a multidimensional big-data problem. The success of this strategy hinges on the effective collection of information from diverse, extensive, and heterogeneous data sources, necessitating the implementation of full-stack Information and Communication Technology (ICT) solutions. The main idea of this work is to investigate current Intelligent Transportation Systems (ITS) technologies and enhance the safety of urban transportation systems. Machine learning models, trained on historical data, can predict traffic congestion, allowing for the implementation of preventive measures. Deep learning architectures, with their ability to handle complex data representations, further refine traffic predictions, contributing to more accurate and dynamic transportation management. The background of this research underscores the challenges posed by traffic congestion in metropolitan areas and emphasizes the need for advanced technological solutions. By integrating GPS and GIS technologies with machine learning algorithms, this work aims to contribute to the development of intelligent transportation systems that not only address current challenges but also pave the way for future advancements in urban transportation management.
Funding: Supported by the Chongqing Medical Scientific Research Project (Joint Project of the Chongqing Health Commission and Science and Technology Bureau), No. 2023MSXM060.
Abstract: BACKGROUND: The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional histopathology and standard slide-level deep learning often fail to capture the sparse and diagnostically critical features of metastatic potential. AIM: To develop and validate a case-level multiple-instance learning (MIL) framework that mimics a pathologist's comprehensive review and improves T3/T4 CRC LNM prediction. METHODS: Whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin-and-eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC). RESULTS: The case-level framework demonstrated superior LNM prediction over slide-level training, with the CONCH v1.5 model achieving a mean AUC (±SD) of 0.899 ± 0.033 vs 0.814 ± 0.083, respectively. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904 ± 0.047, in sharp contrast to a clinical-only model (mean AUC 0.584 ± 0.084). Crucially, a pathologist's review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features. CONCLUSION: A case-level MIL framework provides a superior approach for predicting LNM in advanced CRC. This method shows promise for risk stratification and therapy decisions, pending further validation.
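The "high-attention regions" mentioned above come from attention-based MIL pooling, where each patch embedding is scored and the case-level representation is the attention-weighted average. A minimal numpy sketch in the style of Ilse et al.'s attention MIL — with fixed random arrays standing in for learned weights, and no claim to match this study's exact architecture:

```python
import numpy as np

def attention_mil_pool(H, V, w):
    """Score each patch embedding, softmax the scores into attention weights,
    and return the weighted case-level embedding plus the weights themselves
    (the latter are what a pathologist can inspect as a heat map)."""
    scores = np.tanh(H @ V) @ w          # one scalar relevance score per patch
    a = np.exp(scores - scores.max())
    a = a / a.sum()                      # softmax attention weights
    return a @ H, a                      # (embed_dim,), (num_patches,)

rng = np.random.default_rng(7)
H = rng.normal(size=(12, 8))             # 12 patch embeddings of dimension 8
V = rng.normal(size=(8, 4))              # "learnable" projections, fixed here
w = rng.normal(size=4)
case_embedding, attn = attention_mil_pool(H, V, w)
```

Pooling over all slides of a patient (rather than per slide) is what turns this into the case-level variant the paper advocates.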
Funding: Supported by the Academic Research Projects of Beijing Union University (ZK20202204), the National Natural Science Foundation of China (12250005, 12073040, 12273059, 11973056, 12003051, 11573037, 12073041, 11427901, 11572005, 11611530679, and 12473052), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB0560000, XDA15052200, XDB09040200, XDA15010700, XDB0560301, and XDA15320102), and the Chinese Meridian Project (CMP).
Abstract: The solar cycle (SC), a phenomenon caused by quasi-periodic regular activity in the Sun, occurs approximately every 11 years. Intense solar activity can disrupt the Earth's ionosphere, affecting communication and navigation systems. Consequently, accurately predicting the intensity of the SC holds great significance, but SC prediction involves long-term time series, and many existing time-series forecasting methods fall short in accuracy and efficiency. The Time-series Dense Encoder model is a deep learning solution tailored for long time-series prediction. Based on a multi-layer perceptron structure, it outperforms the best previously existing models in accuracy while being efficiently trainable on general datasets. We propose a method based on this model for SC forecasting. Using a trained model, we predict the test set from SC 19 to SC 25 with an average mean absolute percentage error of 32.02, a root mean square error of 30.3, a mean absolute error of 23.32, and an R² (coefficient of determination) of 0.76, outperforming other deep learning models in terms of accuracy and training efficiency on sunspot-number datasets. Subsequently, we use it to predict the peaks of SC 25 and SC 26. The peak of SC 25 has already passed, but a stronger peak of 199.3 (within a range of 170.8–221.9) is predicted for SC 26, projected to occur during April 2034.
Funding: Funded by Hung Yen University of Technology and Education under grant number UTEHY.L.2025.62.
Abstract: Unmanned Aerial Vehicles (UAVs) have become integral components of smart city infrastructures, supporting applications such as emergency response, surveillance, and data collection. However, the high mobility and dynamic topology of Flying Ad Hoc Networks (FANETs) present significant challenges for maintaining reliable, low-latency communication. Conventional geographic routing protocols often struggle in situations where link quality varies and mobility patterns are unpredictable. To overcome these limitations, this paper proposes an improved routing protocol based on reinforcement learning. This new approach integrates Q-learning with mechanisms that are both link-aware and mobility-aware. The proposed method optimizes the selection of relay nodes by using an adaptive reward function that takes into account energy consumption, delay, and link quality. Additionally, a Kalman filter is integrated to predict UAV mobility, improving the stability of communication links under dynamic network conditions. Simulation experiments were conducted using realistic scenarios, varying the number of UAVs to assess scalability. Key performance metrics were analysed, including the packet delivery ratio, end-to-end delay, and total energy consumption. The results demonstrate that the proposed approach significantly improves the packet delivery ratio by 12%–15% and reduces delay by up to 25.5% when compared to conventional GEO and QGEO protocols. However, this improvement comes at the cost of higher energy consumption due to additional computations and control overhead. Despite this trade-off, the proposed solution ensures reliable and efficient communication, making it well suited for large-scale UAV networks operating in complex urban environments.
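The mobility-prediction building block — a Kalman filter tracking a neighbour UAV's position so that link stability can be anticipated — can be sketched per axis as a constant-velocity filter. The noise levels and the 1-D setup below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=0.01, r=0.5):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter
    tracking a UAV's position from a noisy position reading z."""
    F = np.array([[1.0, dt], [0.0, 1.0]])        # state transition: (position, velocity)
    H = np.array([[1.0, 0.0]])                   # we observe position only
    Qm, Rm = q * np.eye(2), np.array([[r]])
    x, P = F @ x, F @ P @ F.T + Qm               # predict ahead one step
    y = np.array([z]) - H @ x                    # innovation
    S = H @ P @ H.T + Rm
    K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
    x = x + K @ y                                # corrected state estimate
    P = (np.eye(2) - K @ H) @ P
    return x, P

# UAV moving at unit speed; feed the filter position readings 1, 2, ..., 20
x, P = np.array([0.0, 0.0]), np.eye(2)
for k in range(1, 21):
    x, P = kalman_step(x, P, float(k))
```

The predicted next-step position (F @ x) lets the routing layer estimate whether a candidate relay will still be in range when the packet arrives, which is how mobility awareness feeds the Q-learning reward.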
Funding: Supported by the National Key Research and Development Program of China (Grant Nos. 2023YFA1406600 and 2021YFA1202200).
Abstract: First-principles calculations based on density functional theory (DFT) have had a significant impact on chemistry, physics, and materials science, enabling in-depth exploration of the structural and electronic properties of a wide variety of materials. Among the different implementations of DFT, the plane-wave method is widely used for periodic systems because of its high accuracy. However, this method typically requires a large number of basis functions for large systems, leading to high computational costs. Localized basis sets, such as the muffin-tin orbital (MTO) method, have been introduced to provide a more efficient description of the electronic structure with a reduced basis set, albeit at the cost of reduced computational accuracy. In this work, we propose an optimization strategy using machine-learning techniques to automate the choice of MTO basis-set parameters, thereby improving the accuracy and efficiency of MTO-based calculations. Default MTO parameter settings focus primarily on lattice structure and give less consideration to element-specific differences. In contrast, our optimized parameters incorporate both structural and elemental information. Based on these converged parameters, we successfully recovered missing bands for CrTe₂. For the other three materials, Si, GaAs, and CrI₃, we achieved band improvements of up to 2 eV. Furthermore, the generalization of the machine-learned method is validated by perturbation, strain, and elemental substitution, resulting in improved band structures. Additionally, lattice-constant optimization for GaAs using the converged parameters yields closer agreement with experiment.
Funding: Supported by the Advanced Materials National Science and Technology Major Project (Grant No. 2025ZD0618401), the National Natural Science Foundation of China (Grant No. 12504285), the Natural Science Foundation of Jiangsu Province (Grant No. BK20250472), and an NFSG grant from BITS Pilani, Dubai Campus.
Abstract: The rapid advancement of machine-learning-based tight-binding Hamiltonian (MLTB) methods has opened new avenues for efficient and accurate electronic structure simulations, particularly in large-scale systems and long-time scenarios. This review begins with a concise overview of traditional tight-binding (TB) models, including both (semi-)empirical and first-principles approaches, establishing the foundation for understanding MLTB developments. We then present a systematic classification of existing MLTB methodologies, grouped into two major categories: direct prediction of TB Hamiltonian elements and inference of empirical parameters. A comparative analysis with other ML-based electronic structure models is also provided, highlighting the advancement of MLTB approaches. Finally, we explore the emerging MLTB application ecosystem, highlighting how the integration of MLTB models with a diverse suite of post-processing tools, from linear-scaling solvers to quantum transport frameworks and molecular dynamics interfaces, is essential for tackling complex scientific problems across different domains. The continued advancement of this integrated paradigm promises to accelerate materials discovery and open new frontiers in the predictive simulation of complex quantum phenomena.
Funding: The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University, KSA, for funding this work through the Large Research Project under grant number RGP2/164/46.
Abstract: Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Omics datasets (transcriptomic, epigenomic, proteomic, etc.) generated by high-throughput sequencing technology have become prominent in biomedical research, revealing molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes it challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), as well as accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. Through the integration of bioinformatics and ML approaches on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.