Countries around the world have been making efforts to reduce pollutant emissions. However, the response of global black carbon (BC) aging to emission changes remains unclear. Using the Community Atmosphere Model version 6 with a machine-learning-integrated four-mode version of the Modal Aerosol Module, we quantify global BC aging responses to emission reductions for 2011–2018 and for 2050 and 2100 under carbon neutrality. During 2011–2018, global trends in BC aging degree (the mass ratio of coatings to BC, R_(BC)) exhibited marked regional disparities, with a significant increase in China (5.4% yr^(-1)) that contrasts with minimal changes in the USA, Europe, and India. The divergence is attributed to opposing trends in secondary organic aerosol (SOA) and sulfate coatings, driven by regional changes in the emission ratios of the corresponding coating precursors to BC (volatile organic compounds, VOCs/BC, and SO_(2)/BC). Projections under carbon neutrality reveal that R_(BC) will increase globally by 47% (118%) in 2050 (2100), with strong convergent increases expected across major source regions. The R_(BC) increase, primarily driven by enhanced SOA coatings due to sharper BC reductions relative to VOCs, will enhance the global BC mass absorption cross-section (MAC) by 11% (17%) in 2050 (2100). Consequently, although the global BC burden will decline sharply by 60% (76%), the enhanced MAC partially offsets the decline in the BC direct radiative effect (DRE), moderating the global BC DRE decreases to 88% (92%) of the BC burden reductions in 2050 (2100). This study highlights the globally enhanced BC aging and light absorption capacity under carbon neutrality, which partly offsets the impact of BC emission reductions on future changes in BC radiative effects globally.
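As a rough consistency check of the offset described above, a first-order sketch (assuming the DRE simply scales with burden × MAC, which ignores radiative-transfer nonlinearities) already shows how an enhanced MAC moderates the DRE decline; the resulting ratios land close to, though slightly above, the reported 88% and 92%.

```python
# Back-of-envelope scaling only: DRE taken as proportional to burden x MAC,
# so these numbers approximate rather than reproduce the paper's 88%/92%.
scenarios = {"2050": {"burden_drop": 0.60, "mac_gain": 0.11},
             "2100": {"burden_drop": 0.76, "mac_gain": 0.17}}

for year, s in scenarios.items():
    absorption = (1 - s["burden_drop"]) * (1 + s["mac_gain"])  # relative to today
    dre_drop = 1 - absorption
    print(f"{year}: burden -{s['burden_drop']:.0%}, MAC +{s['mac_gain']:.0%} -> "
          f"first-order DRE change -{dre_drop:.1%} "
          f"({dre_drop / s['burden_drop']:.0%} of the burden reduction)")
```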
Cyclohexene is an important raw material in the production of nylon. Selective hydrogenation of benzene is a key method for preparing cyclohexene. However, the Ru catalysts used in current industrial processes still face challenges, including high metal usage, high process costs, and low cyclohexene yield. This study combines existing literature data with machine learning methods to analyze the factors influencing benzene conversion, cyclohexene selectivity, and yield in the hydrogenation of benzene to cyclohexene, and constructs predictive models based on the XGBoost and Random Forest algorithms. The analysis found that reaction time, Ru content, and space velocity are the key factors influencing cyclohexene yield, selectivity, and benzene conversion. Shapley Additive Explanations (SHAP) analysis and feature importance analysis further revealed the contribution of each variable to the reaction outcomes. Additionally, we randomly generated one million variable combinations using the Dirichlet distribution to search for high-yield catalyst formulations. This paper provides new insights into the application of machine learning in heterogeneous catalysis and offers a reference for further research.
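A minimal sketch of the Dirichlet-based candidate generation described above. The variable names, value ranges, and the stand-in training data below are hypothetical placeholders; in the study, the feature set and the trained XGBoost model come from the curated literature dataset.

```python
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)

# Stand-in for the model trained on curated literature data (hypothetical here).
X_train, y_train = rng.random((200, 5)), rng.random(200)
model = XGBRegressor(n_estimators=50).fit(X_train, y_train)

n = 1_000_000
# Composition-like fractions that must sum to 1 (e.g., Ru/promoter/support loadings):
composition = rng.dirichlet(alpha=np.ones(3), size=n)
# Independent process variables drawn over plausible (illustrative) ranges:
reaction_time = rng.uniform(5, 60, size=n)       # min
space_velocity = rng.uniform(1, 20, size=n)      # h^-1
candidates = np.column_stack([composition, reaction_time, space_velocity])

pred_yield = model.predict(candidates)           # predicted cyclohexene yield
best = candidates[np.argsort(pred_yield)[-10:]]  # ten most promising formulations
```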
The Internet of Vehicles, or IoV, is expected to lessen pollution, ease traffic, and increase road safety. The interconnectedness of IoV entities, however, raises the possibility of cyberattacks, which can have detrimental effects. IoV systems typically send massive volumes of raw data to central servers, which may raise privacy issues. Additionally, model training on IoV devices with limited resources normally leads to slower training times and reduced service quality. We discuss a privacy-preserving Federated Split Learning with Tiny Machine Learning (TinyML) approach, which operates on IoV edge devices without sharing sensitive raw data. Specifically, we focus on integrating split learning (SL) with federated learning (FL) and TinyML models. FL is a decentralised machine learning (ML) technique that enables numerous edge devices to collectively train a shared model while retaining data locally. The article thoroughly discusses the architecture and challenges associated with the increasing prevalence of SL in the IoV domain, coupled with FL and TinyML. It starts with the IoV learning framework, which includes edge computing, FL, SL, and TinyML, and then discusses how these technologies might be integrated. We elucidate the operational principles of federated and split learning by examining and addressing many challenges, and subsequently examine the integration of SL with FL and various applications of TinyML. Finally, we explore the potential integration of FL and SL with TinyML in the IoV domain, which we refer to as FSL-TM. It is a strong method for preserving privacy, as it conducts model training on individual devices or edge nodes, obviating the need for centralised data aggregation, which presents considerable privacy threats. The insights provided aim to help both researchers and practitioners understand the complicated terrain of FL and SL, hence facilitating advancement in this swiftly progressing domain.
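To make the split-learning idea concrete, the toy sketch below (numpy only, with illustrative layer shapes) shows the defining data flow: the client computes activations up to a cut layer and transmits only those activations, so raw IoV sensor data never leaves the device.

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal split-learning forward pass. The client runs its (TinyML-sized)
# layer and sends only the cut-layer activations ("smashed data") upstream.
x_raw = rng.random((32, 64))               # one client batch of raw features

W_client = rng.standard_normal((64, 16))   # client-side layer
W_server = rng.standard_normal((16, 2))    # server-side head

def client_forward(x):
    return np.maximum(x @ W_client, 0.0)   # ReLU; this is all that is shared

def server_forward(h):
    return h @ W_server                    # server completes the forward pass

smashed = client_forward(x_raw)            # transmitted instead of x_raw
logits = server_forward(smashed)
# The backward pass mirrors this split: the server returns gradients w.r.t.
# `smashed`, and each client updates only W_client, as in FL + SL hybrids.
```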
This paper conducts a systematic literature review (SLR) of artificial intelligence (AI) approaches to predicting and diagnosing diabetes mellitus. Reviewing the literature published from 2015–2025, it identifies the most effective AI techniques, the most used datasets, the most widely used data preprocessing techniques, and the most common issues. The analysis found that convolutional neural networks (CNNs) and long short-term memory (LSTM) networks are the deep learning models that have shown the highest accuracy in diabetes prediction. Recursive feature elimination (RFE) and SMOTE are preprocessing techniques that have significantly improved model accuracy, training time, and interpretability. Amidst this technological advancement, several issues persist: data imbalance, the limited applicability of techniques, computational constraints, and a lack of real-time application in healthcare environments. The review also identifies the need for robust, interpretable, and scalable AI systems capable of handling large volumes of data, including real-world data, in the healthcare industry. Furthermore, it identifies the need to integrate these systems with wearable health monitoring and to develop privacy-preserving models to ensure continuous, secure, and proactive diabetes management.
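A minimal sketch of how the two highlighted preprocessing techniques are typically combined in practice, using scikit-learn and imbalanced-learn; the synthetic dataset and the logistic-regression estimator are placeholders rather than any reviewed study's setup. Placing SMOTE inside the pipeline ensures resampling happens only on training folds, avoiding leakage.

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline           # supports resampling steps
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Imbalanced stand-in for a diabetes dataset (85% negative / 15% positive).
X, y = make_classification(n_samples=500, n_features=30,
                           weights=[0.85, 0.15], random_state=0)

pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),            # oversample inside each CV fold only
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])
print(cross_val_score(pipe, X, y, cv=5, scoring="f1").mean())
```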
Ovarian cancer (OC) is one of the leading causes of gynecological cancer-related death, with early diagnosis hindered by the heterogeneous nature of tumor biomarkers. Machine learning (ML) has the potential to process complex datasets and support decision-making in OC diagnosis. Nevertheless, traditional ML models tend to be biased, prone to overfitting, sensitive to noise, and poorly generalized, and their black-box nature reduces interpretability and limits practical clinical applicability. In this study, we introduce an explainable ensemble learning (EL) model, TreeX-Stack, based on a stacking architecture that employs the tree-based learners Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost) as base learners and Logistic Regression (LR) as the meta-learner to enhance OC diagnosis. Local Interpretable Model-Agnostic Explanations (LIME) are used to explain individual predictions, making the model outputs more clinically interpretable and applicable. The model is trained on a dataset that includes demographic information, blood tests, general chemistry, and tumor markers. Extensive preprocessing includes handling missing data using iterative imputation with Bayesian Ridge and addressing multicollinearity by removing features with correlation coefficients above 0.7. Relevant features are then selected using the Boruta feature selection method. To obtain robust and unbiased performance estimates during hyperparameter tuning, nested cross-validation (CV) with grid search is employed, and all experiments are repeated five times to ensure statistical reliability. TreeX-Stack demonstrates excellent diagnostic performance, achieving an accuracy of 0.9027, a precision of 0.8673, a recall of 0.9391, and an F1-score of 0.9012. Feature-importance analyses using LIME and permutation importance highlight Human Epididymis Protein 4 (HE4) as the most significant biomarker for OC. The combination of high predictive performance and interpretability makes TreeX-Stack a reliable tool for clinical decision support in OC diagnosis.
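A skeleton of the described stacking architecture using scikit-learn's StackingClassifier. Hyperparameters here are defaults rather than the tuned values from the paper's nested-CV grid search, and the imputation, correlation filtering, and Boruta steps are omitted for brevity.

```python
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# Four tree-based base learners, one linear meta-learner, as in TreeX-Stack.
estimators = [
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("rf", RandomForestClassifier(random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("xgb", XGBClassifier(eval_metric="logloss", random_state=0)),
]
model = StackingClassifier(estimators=estimators,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)   # meta-features built from out-of-fold predictions
# model.fit(X_train, y_train); model.predict(X_test)
```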
Federated learning is a decentralized model training paradigm with significant potential. However, the quality of client updates in a federated network can vary due to non-IID data distributions, leading to suboptimal global models. To address this issue, we propose a novel client selection strategy called FedPA (Performance-Based Federated Averaging), which selectively aggregates client updates based on a predefined performance threshold. Only clients whose local models achieve an F1 score of 70% or higher after training are included in the aggregation process. Clients below this threshold receive the updated global model but do not contribute their parameters; in this way, low-performance clients keep learning and, after some rounds, become able to contribute. If no client meets the performance threshold in a given round, the system falls back to standard FedAvg aggregation, ensuring the global model continues to improve even when most clients perform poorly. We evaluate FedPA on a subset of the MURA dataset for abnormality detection in radiographs of four bone types. Compared to baseline federated learning algorithms such as Federated Averaging (FedAvg), Federated Proximal (FedProx), Federated Stochastic Gradient Descent (FedSGD), and Federated Batch Normalization (FedBN), FedPA consistently ranks first or second across key performance metrics, particularly accuracy, F1 score, and recall. Moreover, FedPA demonstrates notable efficiency, achieving the lowest average round time (≈2270 s) and minimal memory usage (≈645.58 MB), all without relying on GPU resources. These results highlight FedPA's effectiveness in improving global model quality while reducing computational overhead, positioning it as a promising approach for real-world federated learning applications in resource-constrained environments.
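The aggregation rule is simple enough to state in a few lines. The sketch below (plain numpy, toy weights) reads as the abstract describes: average only clients whose local F1 meets the 70% bar, falling back to FedAvg over all clients when none qualify.

```python
import numpy as np

def fedpa_aggregate(client_updates, f1_scores, threshold=0.70):
    """One FedPA round (sketch): average only clients whose local F1 meets
    the threshold; if none qualify, fall back to FedAvg over all clients."""
    qualified = [w for w, f1 in zip(client_updates, f1_scores) if f1 >= threshold]
    pool = qualified if qualified else client_updates    # FedAvg fallback
    return [np.mean(layer, axis=0) for layer in zip(*pool)]

# Toy usage: three clients, one weight tensor each; only two pass the bar,
# so the third client receives the new global model without contributing.
updates = [[np.ones(4)], [2 * np.ones(4)], [10 * np.ones(4)]]
print(fedpa_aggregate(updates, f1_scores=[0.82, 0.75, 0.41])[0])  # -> [1.5 ...]
```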
Deep graph contrastive clustering has attracted widespread attention due to its self-supervised representation learning paradigm and superior clustering performance. However, two challenges emerge that result in high computational costs. First, most existing contrastive methods adopt a data-augmentation-then-representation-learning strategy, in which representation learning with trainable graph convolution is coupled with complex, fixed data augmentation, inevitably limiting efficiency and flexibility. Second, the similarity metric between positive-negative sample pairs is complex and the contrastive objective is partial, limiting the discriminability of representation learning. To address these challenges, a novel wide graph clustering network (WGCN) adhering to a representation-then-augmentation framework is proposed, consisting mainly of a multiorder filter fusion (MFF) module and a double-level contrastive learning (DCL) module. Specifically, the MFF module integrates multiorder low-pass filters to extract smooth, multi-scale topological features, utilizing self-attention fusion to reduce redundancy and obtain a comprehensive embedding representation. The DCL module then constructs two augmented views using parallel, parameter-unshared Siamese encoders rather than complex augmentations on the graph. To achieve simple yet effective self-supervised learning, a double-level contrastive loss oriented toward representation self-supervision and structural consistency is designed, where representation self-supervision maximizes the agreement between pairwise augmented embedding representations and structural consistency promotes the mutual-information correlation between appended neighborhoods with similar semantics. Extensive experiments on six benchmark datasets demonstrate the superiority of the proposed WGCN, especially highlighting its time-saving characteristic. The code is available at https://github.com/TianxiangZhao0474/WGCN.
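A minimal numpy sketch of the multiorder low-pass filtering that underlies the MFF module: node features are propagated through increasing powers of a symmetrically normalized adjacency matrix (a low-pass graph filter) and the resulting views are fused. A plain average stands in for the learned self-attention fusion, which is an intentional simplification.

```python
import numpy as np

def multiorder_filters(A, X, orders=(1, 2, 3)):
    A_hat = A + np.eye(len(A))                           # add self-loops
    d = A_hat.sum(1)
    S = np.diag(d ** -0.5) @ A_hat @ np.diag(d ** -0.5)  # normalized adjacency (low-pass)
    views, H = [], X
    for k in range(1, max(orders) + 1):
        H = S @ H                                        # k-th-order smoothing
        if k in orders:
            views.append(H)                              # keep each order as a view
    return sum(views) / len(views)                       # stand-in for attention fusion

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)   # toy 3-node path graph
X = np.random.default_rng(0).random((3, 4))
print(multiorder_filters(A, X).shape)                    # (3, 4)
```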
The solar cycle (SC), a phenomenon caused by quasi-periodic regular activities in the Sun, occurs approximately every 11 years. Intense solar activity can disrupt the Earth's ionosphere, affecting communication and navigation systems. Accurately predicting the intensity of the SC therefore holds great significance, but SC prediction involves long-term time series, and many existing time-series forecasting methods fall short in accuracy and efficiency. The Time-series Dense Encoder (TiDE) model is a deep learning solution tailored for long time-series prediction: based on a multi-layer perceptron structure, it outperforms the best previously existing models in accuracy while being efficiently trainable on general datasets. We propose a method based on this model for SC forecasting. Using a trained model, we predict the test set from SC 19 to SC 25 with an average mean absolute percentage error of 32.02, a root mean square error of 30.3, a mean absolute error of 23.32, and an R^(2) (coefficient of determination) of 0.76, outperforming other deep learning models in accuracy and training efficiency on sunspot number datasets. We then use it to predict the peaks of SC 25 and SC 26. The peak of SC 25 has already passed, but a stronger peak of 199.3 (range 170.8–221.9) is predicted for SC 26, projected to occur in April 2034.
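For reference, the four reported scores computed the standard way (assuming MAPE is expressed in percent); the short arrays below are illustrative placeholders for observed and predicted sunspot numbers on the test set.

```python
import numpy as np

def forecast_scores(y_true, y_pred):
    err = y_true - y_pred
    mape = np.mean(np.abs(err / y_true)) * 100     # assumes y_true != 0
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mape, rmse, mae, r2

y_true = np.array([110.0, 150.0, 90.0, 60.0])       # illustrative sunspot numbers
y_pred = np.array([100.0, 160.0, 95.0, 50.0])
print(forecast_scores(y_true, y_pred))
```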
Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment options. Omics datasets (transcriptomic, epigenomic, proteomic, and others) generated by high-throughput sequencing technology have become prominent in biomedical research and reveal molecular aspects of cancer diagnosis and therapy. Despite advances in sequencing technology, the high dimensionality of multi-omics data makes it challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis) and report accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR+ and LR−), and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. Through the integrated bioinformatics and machine learning (ML) analysis of the transcriptomic and epigenomic multi-omics datasets, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity of −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed RankXLAN framework outperforms existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
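The less common diagnostic metrics reported above follow directly from a binary confusion matrix, as the short sketch below shows; the counts are illustrative, not taken from the SC datasets.

```python
# Diagnostic metrics from a binary confusion matrix (illustrative counts).
tp, fn, fp, tn = 90, 10, 8, 92

sensitivity = tp / (tp + fn)                 # recall
specificity = tn / (tn + fp)
lr_pos = sensitivity / (1 - specificity)     # LR+: how much a positive result raises odds
lr_neg = (1 - sensitivity) / specificity     # LR-: how much a negative result lowers odds
youden = sensitivity + specificity - 1       # Youden index J

print(f"LR+={lr_pos:.2f}, LR-={lr_neg:.2f}, J={youden:.2f}")
```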
Replicating the chaotic characteristics inherent in nonlinear dynamical systems via machine learning (ML) is a key challenge in this rapidly advancing interdisciplinary field. In this work, we explore the potential of variational quantum circuits (VQC) for learning the stochastic properties of classical nonlinear dynamical systems. Specifically, we focus on the one- and two-dimensional logistic maps, which, while simple, remain under-explored in the context of learning dynamical characteristics. Our findings reveal that, even for such simple dynamical systems, accurately replicating long-term characteristics is hindered by a pronounced sensitivity to overfitting. While increasing the parameter complexity of the ML model typically enhances short-term prediction accuracy, it also degrades the model's ability to replicate long-term characteristics, primarily because overfitting harms generalization power. By comparing the VQC with two widely recognized classical ML techniques, long short-term memory (LSTM) networks for time-series processing and reservoir computing, we demonstrate that the VQC outperforms these methods in replicating long-term characteristics. Our results suggest that the ML of dynamics calls for more compact and efficient models (such as VQC) rather than more complicated, large-scale ones.
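The training target itself is easy to reproduce. The sketch below generates one-step-ahead pairs from the 1-D logistic map, the kind of series the VQC, LSTM, and reservoir models are asked to continue; the parameter r = 3.9 (chaotic regime) and the burn-in length are illustrative choices.

```python
import numpy as np

def logistic_series(r=3.9, x0=0.5, n=1000, burn_in=100):
    """Iterate x_{n+1} = r * x_n * (1 - x_n), discarding a transient burn-in."""
    x, out = x0, []
    for i in range(n + burn_in):
        x = r * x * (1 - x)
        if i >= burn_in:
            out.append(x)
    return np.array(out)

series = logistic_series()
X, y = series[:-1], series[1:]   # one-step-ahead supervised pairs for training
```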
Unmanned aerial vehicles (UAVs) face the challenge of autonomous obstacle avoidance in complex, multi-obstacle environments. Behavior cloning offers a promising approach to rapidly acquire a learning policy from limited expert demonstrations. However, pure imitation learning inherently suffers from poor exploration and limited generalization, typically necessitating extensive datasets to train competent student policies. We utilize a cross-modal variational autoencoder (CM-VAE) to extract compact features from raw visual inputs and UAV states, which then feed into a policy network. We evaluated our approach in a simulated environment featuring a challenging circular trajectory with eight gate obstacles. The results show that a policy trained with pure behavior cloning consistently failed, whereas our DAgger-augmented behavior cloning method successfully traversed all gates without collision. Our findings confirm that DAgger effectively mitigates the shortcomings of behavior cloning, enabling reliable and sample-efficient navigation policies for UAVs.
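The mechanism that separates DAgger from pure behavior cloning fits in a short loop: the student policy drives, but every visited state is labeled by the expert, so the aggregated dataset covers the student's own mistakes. The toy 1-D environment, expert, and linear policy below are stand-ins for the simulator, the expert controller, and the CM-VAE-backed policy network.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyEnv:                                        # 1-D stand-in for the simulator
    def reset(self): self.s = 0.0; return self.s
    def step(self, a): self.s += a + rng.normal(0, 0.05); return self.s, abs(self.s) > 3

expert_act = lambda s: -0.5 * s                      # expert: steer back to center

class LinearPolicy:                                  # student: a = k * s
    def __init__(self): self.k = 0.0
    def act(self, s): return self.k * s
    def train(self, data):                           # least-squares fit of k
        s = np.array([d[0] for d in data]); a = np.array([d[1] for d in data])
        self.k = float(s @ a / (s @ s + 1e-8))

env, policy, dataset = ToyEnv(), LinearPolicy(), []
for _ in range(5):                                   # DAgger iterations
    s = env.reset()
    for _ in range(50):
        a = policy.act(s)                            # student drives...
        dataset.append((s, expert_act(s)))           # ...expert labels the state
        s, done = env.step(a)
        if done: break
    policy.train(dataset)                            # fit on the aggregated dataset
print(policy.k)                                      # approaches the expert gain -0.5
```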
Cosmic-ray muons exhibit distinct scattering angle distributions when interacting with materials of different atomic numbers (Z values), facilitating the identification of various Z-class materials, particularly radioactive high-Z nuclear elements. Most traditional identification methods are based on complex statistical iterative reconstruction or simple trajectory approximation. Supervised machine learning methods offer some improvement but rely heavily on prior knowledge of the target materials, significantly limiting their practical applicability to detecting concealed materials. To the best of our knowledge, this is the first study to introduce transfer learning into muon tomography. We propose two lightweight neural network models for fine-tuning and adversarial transfer learning, utilizing muon scattering data from bare materials to predict the Z-class of materials coated by typical shieldings (e.g., aluminum or polyethylene), simulating practical scenarios such as cargo inspection and arms control. By introducing a novel inverse-cumulative-distribution-based sampling method, more accurate scattering angle distributions can be obtained from the data, improving prediction accuracy by nearly 4% compared with traditional random-sampling-based training. When applied to coated materials with limited labeled, or even unlabeled, muon tomography data, the proposed method achieved an overall prediction accuracy exceeding 96%, with high-Z materials reaching nearly 99%. The simulation results indicate that transfer learning improves prediction accuracy by approximately 10% compared to direct prediction without transfer. This study demonstrates the effectiveness of transfer learning in overcoming the physical challenges associated with limited labeled/unlabeled data and highlights its promising potential in the field of muon tomography.
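A generic sketch of inverse-cumulative-distribution sampling as one might apply it to scattering angles: build the empirical CDF of measured angles and map uniform draws through its inverse. The half-normal "measured" angles are a stand-in for muon data, and this is a generic rendering rather than the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for measured scattering angles (mrad), half-normal for illustration.
measured = np.abs(rng.normal(0.0, 15.0, size=5000))

angles_sorted = np.sort(measured)
cdf = np.arange(1, angles_sorted.size + 1) / angles_sorted.size  # empirical CDF

u = rng.uniform(0, 1, size=2000)                 # uniform draws
samples = np.interp(u, cdf, angles_sorted)       # inverse-CDF lookup

# Unlike naive random subsampling, every region of the distribution is hit in
# proportion to its probability mass, giving stabler angle histograms.
print(samples.mean(), measured.mean())
```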
With the increasing complexity of malware attack techniques, traditional detection methods face significant challenges, such as privacy preservation, data heterogeneity, and a lack of category information. To address these issues, we propose Federated Dynamic Prototype Learning (FedDPL) for malware classification, integrating federated learning with a specially designed K-means algorithm. Under the federated learning framework, model training occurs locally without data sharing, effectively protecting user data privacy and preventing the leakage of sensitive information. Furthermore, to tackle data heterogeneity and the lack of category information, FedDPL introduces a dynamic prototype learning mechanism that adaptively adjusts the clustering prototypes in both position and number. This significantly reduces the dependency on predefined category numbers in typical K-means and its variants, improving clustering performance and, in principle, enabling more accurate detection of malicious behavior. Experimental results confirm that FedDPL excels at malware classification tasks, demonstrating superior accuracy, robustness, and privacy protection.
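To illustrate what "adjusting prototypes in position and number" can look like, the sketch below runs K-means-style updates but splits any overly spread cluster, so the prototype count grows with the data; the split rule and threshold are invented for illustration and are not FedDPL's actual mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamic_kmeans(X, split_radius=1.0, rounds=8):
    protos = X[rng.choice(len(X), 1)]                 # start with one prototype
    for _ in range(rounds):
        d = np.linalg.norm(X[:, None] - protos[None], axis=2)
        labels = d.argmin(1)                          # assignment pass
        new = []
        for k in range(len(protos)):
            pts = X[labels == k]
            if len(pts) == 0:
                continue
            if len(pts) > 1 and pts.std(0).max() > split_radius:
                # split an overly spread cluster into two seed prototypes
                new += [pts[pts[:, 0].argmin()], pts[pts[:, 0].argmax()]]
            else:
                new.append(pts.mean(0))               # ordinary centroid update
        protos = np.array(new)
    return protos

X = np.vstack([rng.normal(0, .3, (50, 2)), rng.normal(5, .3, (50, 2))])
print(len(dynamic_kmeans(X)))                         # grows beyond the initial 1
```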
With the increasing complexity of vehicular networks and the proliferation of connected vehicles, Federated Learning (FL) has emerged as a critical framework for decentralized model training while preserving data privacy. However, efficient client selection and adaptive weight allocation in heterogeneous, non-IID environments remain challenging. To address these issues, we propose Federated Learning with Client Selection and Adaptive Weighting (FedCW), a novel algorithm that leverages adaptive client selection and dynamic weight allocation to optimize model convergence in real-time vehicular networks. FedCW selects clients based on their Euclidean distance from the global model and dynamically adjusts aggregation weights to balance data diversity and model convergence. Experimental results show that FedCW significantly outperforms existing FL algorithms such as FedAvg, FedProx, and SCAFFOLD, particularly in non-IID settings, achieving faster convergence, higher accuracy, and reduced communication overhead. These findings demonstrate that FedCW provides an effective solution for enhancing FL performance in heterogeneous, edge-based computing environments.
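A sketch of distance-based selection and adaptive weighting in the spirit of FedCW, with each client model flattened to a single parameter vector; the actual selection rule and weight schedule in the paper may differ (here, the closest clients are chosen and weighted inversely to their distance).

```python
import numpy as np

rng = np.random.default_rng(0)

def fedcw_round(global_w, client_ws, n_select=3):
    dists = np.linalg.norm(client_ws - global_w, axis=1)  # Euclidean distances
    chosen = np.argsort(dists)[:n_select]                 # pick the closest clients
    inv = 1.0 / (dists[chosen] + 1e-8)
    weights = inv / inv.sum()                             # nearer -> heavier weight
    return weights @ client_ws[chosen]                    # weighted aggregation

global_w = np.zeros(10)                                   # flattened global model
client_ws = rng.normal(0, 1, (8, 10))                     # eight local updates
print(fedcw_round(global_w, client_ws))
```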
In the context of rural revitalization and the development of smart agriculture, image classification technology based on deep learning has emerged as a crucial tool for digital monitoring and intelligent prevention and control of agricultural diseases. This paper provides a systematic review of the evolutionary development of algorithms within this field. Addressing challenges such as domain drift and limited global awareness in classical convolutional neural networks (CNNs) applied to complex agricultural environments, the paper focuses on the latest advancements in vision transformers (ViT) and their hybrid architectures to enhance cross-domain robustness and fine-grained recognition capabilities. In response to the challenges posed by scarce long-tail data and limited edge computing power in real-world scenarios, the paper explores solutions related to few-shot learning and ultra-lightweight network deployment. Finally, a forward-looking analysis is presented on the application paradigms of multimodal feature fusion, vision-based large models, and explainable artificial intelligence (AI) within smart plant protection. This analysis aims to offer theoretical insights for the development of efficient and transparent intelligent diagnostic systems for agricultural diseases, thereby supporting the advancement of digital agriculture and the construction of a robust agricultural nation.
Inclusive green growth (IGG) is a crucial pathway to high-quality economic development, with standardization serving as a key enabler. Standardization plays a critical role in reducing coordination costs and improving resource allocation efficiency by facilitating rule harmonization, factor integration, and collaborative governance. Examining IGG from a standardization perspective helps clarify the mechanisms through which economic, environmental, and social objectives can be jointly realized, and offers new insights into the institutionalized and sustainable pursuit of multiple development goals. However, how standardization promotes IGG by coordinating economic growth, environmental performance, and social equity remains insufficiently explored in the existing literature. Using panel data from 283 prefecture-level Chinese cities (2012–2021), this study treats the comprehensive standardization reform pilot as a quasi-natural experiment and applies a double machine-learning framework to test whether standardization promotes IGG. The analysis further explores the mediating roles of technological innovation, green finance, and employment quality, and examines heterogeneity across geographic location, resource endowment, industrial base, and city hierarchy. It also evaluates the regional coordination effects of standardization. Results show that standardization significantly advances IGG, though its impact varies by regional and structural characteristics. Standardization enhances IGG by strengthening innovation, expanding green finance, and improving job quality. Moreover, it helps bridge geographic divides, narrow interregional disparities, and enhance coordination. These findings offer empirical evidence for policymakers to design targeted standardization strategies that support sustainable and equitable urban development.
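The double machine-learning step can be illustrated with the standard partialling-out estimator: flexibly regress both the outcome (an IGG score) and the treatment (pilot status) on controls, then relate the cross-fitted residuals. The synthetic data below embeds a known effect of 0.5; the study's controls and specification are of course richer.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Synthetic city panel: 5 covariates, a confounded treatment, true effect 0.5.
n = 2000
controls = rng.normal(size=(n, 5))
treat = (controls[:, 0] + rng.normal(size=n) > 0).astype(float)  # pilot status
igg = 0.5 * treat + controls @ np.array([1, .5, 0, 0, .2]) + rng.normal(size=n)

m = RandomForestRegressor(n_estimators=100, random_state=0)
igg_res = igg - cross_val_predict(m, controls, igg, cv=5)        # outcome residual
treat_res = treat - cross_val_predict(m, controls, treat, cv=5)  # treatment residual
theta = (treat_res @ igg_res) / (treat_res @ treat_res)          # effect estimate
print(round(theta, 2))                                           # approx. 0.5
```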
In today's connected world, the generation of massive streaming data across diverse domains has become commonplace. Concept drift, class imbalance, label scarcity, and new class emergence jointly degrade representation stability, bias learning toward outdated distributions, and reduce the resilience and reliability of detection in dynamic environments. This paper proposes a streaming class-incremental learning (SCIL) framework to address these issues. The SCIL framework integrates an autoencoder (AE) with a multi-layer perceptron for multi-class prediction, employs a dual-loss strategy (classification and reconstruction) for prediction and new-class detection, uses corrected pseudo-labels for online training, manages classes with queues, and applies oversampling to handle imbalance. The rationale behind the method's structure is elucidated through ablation studies, and a comprehensive experimental evaluation is performed on both real-world and synthetic datasets featuring class imbalance, incremental classes, and concept drift. Our results demonstrate that SCIL outperforms strong baselines and state-of-the-art methods. In line with our commitment to Open Science, we make our code and datasets available to the community.
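A minimal PyTorch rendering of the dual-loss idea: one encoder feeds both a decoder (reconstruction loss, whose per-sample error doubles as a new-class signal) and a classifier head (prediction loss). Layer sizes, the 0.5 loss weight, and the random batch are illustrative.

```python
import torch
import torch.nn as nn

# One shared encoder, two heads: decoder for reconstruction, classifier for
# prediction over the classes known so far (4 here, illustrative).
enc = nn.Sequential(nn.Linear(20, 8), nn.ReLU())
dec = nn.Sequential(nn.Linear(8, 20))
clf = nn.Sequential(nn.Linear(8, 4))

opt = torch.optim.Adam([*enc.parameters(), *dec.parameters(), *clf.parameters()])
x = torch.randn(64, 20)                          # one streaming mini-batch
y = torch.randint(0, 4, (64,))                   # (pseudo-)labels

z = enc(x)
loss = nn.functional.cross_entropy(clf(z), y) \
     + 0.5 * nn.functional.mse_loss(dec(z), x)   # classification + reconstruction
opt.zero_grad(); loss.backward(); opt.step()

# At inference, a sample whose reconstruction error exceeds a running
# threshold is flagged as a candidate new class and queued for learning.
recon_err = ((dec(enc(x)) - x) ** 2).mean(dim=1)  # per-sample novelty score
```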
With the recent increase in data volume and diversity, traditional text representation techniques struggle to capture context, particularly in environments with sparse data. To address these challenges, this study proposes a new model, the Masked Joint Representation Model (MJRM). MJRM approximates the original hypothesis by leveraging multiple elements in a limited context and dynamically adapts to changes in the characteristics of the data distribution through three main components. First, masking-based representation learning, termed selective dynamic masking, integrates topic modeling and sentiment clustering to generate and train multiple instances across different data subsets, whose predictions are then aggregated with optimized weights; this design alleviates sparsity, suppresses noise, and preserves contextual structure. Second, regularization-based improvements are applied. Third, techniques for addressing sparse data are used to perform the final inference. As a result, MJRM improves performance by up to 4% compared to existing AI techniques. In our experiments, we analyzed the contribution of each component, demonstrating that masking, dynamic learning, and the aggregation of multiple instances complement each other to improve performance. This shows that a masking-based multi-learning strategy is effective for context-aware sparse text classification and remains useful even in challenging situations such as data shortage or shifts in data distribution. We expect the approach can be extended to diverse fields such as sentiment analysis, spam filtering, and domain-specific document classification.
Modern intelligent systems, such as autonomous vehicles and face recognition, must continuously adapt to new scenarios while preserving their ability to handle previously encountered situations. However, when neural networks learn new classes sequentially, they suffer from catastrophic forgetting: the tendency to lose knowledge of earlier classes. This challenge, which lies at the core of class-incremental learning, severely limits the deployment of continual learning systems in real-world applications with streaming data. Existing approaches, including rehearsal-based methods and knowledge distillation techniques, have attempted to address this issue but often struggle to effectively preserve decision boundaries and discriminative features under limited memory constraints. To overcome these limitations, we propose a support-vector-guided framework for class-incremental learning. The framework integrates an enhanced feature extractor with a Support Vector Machine classifier, which generates boundary-critical support vectors to guide both replay and distillation. Building on this architecture, we design a joint feature retention strategy that combines boundary proximity with feature diversity, and a Support Vector Distillation Loss that enforces dual alignment in the decision and semantic spaces. In addition, triple attention modules are incorporated into the feature extractor to enhance representation power. Extensive experiments on CIFAR-100 and Tiny-ImageNet demonstrate consistent improvements: with 5 tasks, our method achieves 71.68% and 58.61% average accuracy on the two benchmarks, outperforming strong baselines by 3.34% and 2.05%. These advantages hold across different task splits, highlighting the robustness and generalization of the proposed approach. Beyond benchmark evaluations, the framework also shows potential in few-shot and resource-constrained applications such as edge computing and mobile robotics.
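The boundary-critical selection step can be illustrated in a few lines: fit an SVM on the current task's features and keep its support vectors as replay exemplars, since those are the samples that define the decision boundary. The synthetic features below are stand-ins for deep features, and the paper's full strategy additionally mixes in diverse (non-boundary) samples and distills against the support vectors.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for deep features of two classes in the current task.
feats = np.vstack([rng.normal(0, 1, (200, 16)), rng.normal(2, 1, (200, 16))])
labels = np.array([0] * 200 + [1] * 200)

svm = SVC(kernel="linear").fit(feats, labels)
exemplars = feats[svm.support_]               # boundary-critical replay samples
print(f"kept {len(exemplars)} of {len(feats)} samples for the replay buffer")
```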
The demand for extended electric vehicle (EV) range necessitates advanced lightweighting strategies. This study introduces a materials genome approach, augmented by machine learning (ML), for optimizing lightweight composite designs for EVs. A comprehensive materials genome database was developed, encompassing composites based on carbon, glass, and natural fibers, and systematically recording critical parameters such as mechanical properties, density, cost, and environmental impact. Machine learning models, including Random Forest, Support Vector Machines, and Artificial Neural Networks, were employed to construct a predictive system for material performance. Material composition was then optimized using a multi-objective genetic algorithm. Experimental validation demonstrated that an optimized carbon fiber/bio-based resin composite achieved a 45% weight reduction compared to conventional steel while maintaining equivalent structural strength. The predictive accuracy of the models reached 94.2%. A cost-benefit analysis indicated that despite a 15% increase in material cost, overall vehicle energy consumption decreased by 12%, leading to an 18% total cost saving over a five-year operational lifecycle under a representative mid-size battery electric vehicle (BEV) operating scenario.
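The core of any such multi-objective search is Pareto filtering: keep candidates that are not dominated on all objectives at once. The sketch below minimizes density and cost while maximizing strength (negated so all objectives are minimized), with random values standing in for ML-predicted composite properties; a genetic algorithm would additionally evolve new candidates rather than filtering a fixed pool.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for ML-predicted properties of 500 candidate composites.
props = np.column_stack([
    rng.uniform(1.4, 2.0, 500),      # density (g/cm^3)  - minimize
    rng.uniform(20, 60, 500),        # cost ($/kg)       - minimize
    -rng.uniform(300, 900, 500),     # -strength (MPa)   - minimize (negated)
])

def pareto_mask(F):
    """Boolean mask of nondominated rows when all columns are minimized."""
    keep = np.ones(len(F), bool)
    for i, f in enumerate(F):
        if keep[i]:
            # a row is dominated by f if it is no better anywhere, worse somewhere
            dominated = np.all(F >= f, axis=1) & np.any(F > f, axis=1)
            keep &= ~dominated
    return keep

front = props[pareto_mask(props)]
print(f"{len(front)} Pareto-optimal candidates out of {len(props)}")
```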
Funding (black carbon aging study): supported by the National Natural Science Foundation of China (42505149, 41925023, U2342223, 42105069, and 91744208); the China Postdoctoral Science Foundation (2025M770303); the Fundamental Research Funds for the Central Universities (14380230); the Jiangsu Funding Program for Excellent Postdoctoral Talent; and the Jiangsu Collaborative Innovation Center of Climate Change.
Funding (benzene-to-cyclohexene hydrogenation study): supported by the CAS Basic and Interdisciplinary Frontier Scientific Research Pilot Project (XDB1190300, XDB1190302); the Youth Innovation Promotion Association CAS (Y2021056); the Joint Fund of Yulin University and the Dalian National Laboratory for Clean Energy (YLU-DNL Fund 2022007); and the Special Fund for Science and Technology Innovation Teams of Shanxi Province (202304051001007).
Funding (TreeX-Stack ovarian cancer study): supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) under grant number IMSIU-DDRSP2601.
Funding (WGCN graph clustering study): supported by the National Natural Science Foundation of China (62225303, 62403043, 62433004); the Beijing Natural Science Foundation (4244085); the Postdoctoral Fellowship Program of the China Postdoctoral Science Foundation (GZC20230203); and the China Postdoctoral Science Foundation (2023M740201).
Funding (solar cycle forecasting study): supported by the Academic Research Projects of Beijing Union University (ZK20202204); the National Natural Science Foundation of China (12250005, 12073040, 12273059, 11973056, 12003051, 11573037, 12073041, 11427901, 11572005, 11611530679, and 12473052); the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB0560000, XDA15052200, XDB09040200, XDA15010700, XDB0560301, and XDA15320102); and the Chinese Meridian Project (CMP).
Funding (RankXLAN stomach cancer study): the authors thank the Deanship of Research and Graduate Studies at King Khalid University, KSA, for funding this work through the Large Research Project under grant number RGP2/164/46.
Funding (VQC chaotic dynamics study): supported in part by the Beijing Natural Science Foundation (Grant No. 1232025), the Peng Huanwu Visiting Professor Program, and the Academy for Multidisciplinary Studies, Capital Normal University.
Funding (UAV obstacle avoidance study): supported by the National Natural Science Foundation of China (No. 62576349).
Funding (muon tomography transfer learning study): supported by the Research Program of the State Key Laboratory of Heavy Ion Science and Technology, Institute of Modern Physics, Chinese Academy of Sciences (No. HIST2025CS06); the National Natural Science Foundation of China (Nos. 12405402, 12475106, 12105327, and 12405337); and the Guangdong Basic and Applied Basic Research Foundation, China (No. 2023B1515120067).
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62162009; the Key Technologies R&D Program of Henan Province under Grant No. 242102211065; the Postgraduate Education Reform and Quality Improvement Project of Henan Province under Grant Nos. YJS2025GZZ36, YJS2024AL112, and YJS2024JD38; the Innovation Scientists and Technicians Troop Construction Projects of Henan Province under Grant No. CXTD2017099; and the Scientific Research Innovation Team of Xuchang University under Grant No. 2022CXTD003.
Abstract: With the increasing complexity of malware attack techniques, traditional detection methods face significant challenges, such as privacy preservation, data heterogeneity, and missing category information. To address these issues, we propose Federated Dynamic Prototype Learning (FedDPL) for malware classification, integrating Federated Learning with a specially designed K-means algorithm. Under the Federated Learning framework, model training occurs locally without data sharing, effectively protecting user data privacy and preventing the leakage of sensitive information. Furthermore, to tackle data heterogeneity and the lack of category information, FedDPL introduces a dynamic prototype learning mechanism that adaptively adjusts the clustering prototypes in both position and number. The dependency on predefined category numbers in typical K-means and its variants is thus significantly reduced, improving clustering performance and, in principle, enabling more accurate detection of malicious behavior. Experimental results confirm that FedDPL excels at malware classification, demonstrating superior accuracy, robustness, and privacy protection.
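A minimal sketch of a dynamic prototype mechanism in the spirit described: prototypes move like K-means centroids, over-dispersed clusters are split, and empty prototypes are dropped, so the prototype count adapts. This is one illustrative reading, not the published FedDPL algorithm; the threshold and split rule are assumptions.

```python
import numpy as np

def dynamic_prototypes(X, k_init=2, split_thresh=1.0, n_rounds=20, seed=0):
    """Toy adaptive K-means: prototypes update as cluster means, clusters
    with high dispersion are split in two, and empty prototypes are dropped,
    so the number of prototypes need not be fixed in advance."""
    rng = np.random.default_rng(seed)
    protos = X[rng.choice(len(X), k_init, replace=False)]
    for _ in range(n_rounds):
        # Assign each point to its nearest prototype.
        d = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        new_protos = []
        for j in range(len(protos)):
            members = X[assign == j]
            if len(members) == 0:
                continue                      # drop empty prototype
            center = members.mean(axis=0)
            spread = np.linalg.norm(members - center, axis=1).mean()
            if spread > split_thresh and len(members) > 1:
                # Split: nudge two copies apart along a random direction.
                eps = rng.normal(scale=0.1, size=X.shape[1])
                new_protos.extend([center + eps, center - eps])
            else:
                new_protos.append(center)
        protos = np.array(new_protos)
    return protos

# Toy usage: three well-separated 2D blobs, starting from only two prototypes.
X = np.vstack([np.random.default_rng(s).normal(m, 0.3, size=(60, 2))
               for s, m in ((1, 0.0), (2, 3.0), (3, 6.0))])
protos = dynamic_prototypes(X)
```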
Abstract: With the increasing complexity of vehicular networks and the proliferation of connected vehicles, Federated Learning (FL) has emerged as a critical framework for decentralized model training while preserving data privacy. However, efficient client selection and adaptive weight allocation in heterogeneous, non-IID environments remain challenging. To address these issues, we propose Federated Learning with Client Selection and Adaptive Weighting (FedCW), a novel algorithm that leverages adaptive client selection and dynamic weight allocation to optimize model convergence in real-time vehicular networks. FedCW selects clients based on their Euclidean distance from the global model and dynamically adjusts aggregation weights to balance data diversity and model convergence. Experimental results show that FedCW significantly outperforms existing FL algorithms such as FedAvg, FedProx, and SCAFFOLD, particularly in non-IID settings, achieving faster convergence, higher accuracy, and reduced communication overhead. These findings demonstrate that FedCW is an effective solution for enhancing FL performance in heterogeneous, edge-based computing environments.
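The stated selection-and-weighting idea can be sketched directly on flattened parameter vectors: rank clients by Euclidean distance to the global model, keep the closest, and weight aggregation inversely to distance. The exact rules in FedCW may differ; this is a hedged illustration of the idea as described.

```python
import numpy as np

def fedcw_round(global_w, client_ws, n_select=5, eps=1e-8):
    """One illustrative FedCW-style round: rank clients by the Euclidean
    distance of their local weights to the global model, keep the closest
    `n_select`, and aggregate with weights inversely proportional to
    distance. The selection count and weighting rule are assumptions."""
    dists = np.array([np.linalg.norm(w - global_w) for w in client_ws])
    chosen = np.argsort(dists)[:n_select]
    inv = 1.0 / (dists[chosen] + eps)
    alphas = inv / inv.sum()                 # adaptive aggregation weights
    return sum(a * client_ws[i] for a, i in zip(alphas, chosen))

# Toy usage with flattened parameter vectors of increasing divergence.
rng = np.random.default_rng(0)
g = rng.normal(size=100)
clients = [g + rng.normal(scale=s, size=100) for s in (0.1, 0.2, 0.5, 1.0, 2.0, 3.0)]
new_global = fedcw_round(g, clients, n_select=3)
```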
基金Supported by School-level Project of Shaoyang Industry Polytechnic College(SKY24A06)Science and Technology Plan(Special Fund Subsidy)of Shaoyang City(2024PT4070)General Research Project of Hunan Provincial Department of Education in 2025(25C1457).
Abstract: In the context of rural revitalization and the development of smart agriculture, image classification technology based on deep learning has emerged as a crucial tool for the digital monitoring and intelligent prevention and control of agricultural diseases. This paper provides a systematic review of the evolution of algorithms in this field. Addressing challenges such as domain drift and limited global awareness in classical convolutional neural networks (CNNs) applied to complex agricultural environments, the paper focuses on the latest advancements in vision transformers (ViT) and their hybrid architectures for enhancing cross-domain robustness and fine-grained recognition. In response to the scarce long-tail data and limited edge computing power of real-world scenarios, the paper explores solutions based on few-shot learning and ultra-lightweight network deployment. Finally, a forward-looking analysis is presented of the application paradigms of multimodal feature fusion, vision-based large models, and explainable artificial intelligence (AI) in smart plant protection. This analysis aims to offer theoretical insights for developing efficient and transparent intelligent diagnostic systems for agricultural diseases, thereby supporting the advancement of digital agriculture and the building of a strong agricultural nation.
Funding: Supported by the National Social Science Fund of China, "Research on the Ecological Symbiotic Development and Dynamic Governance of Manufacturing Entrepreneurship Platform" [Grant No. 22BGL048].
Abstract: Inclusive green growth (IGG) is a crucial pathway to high-quality economic development, with standardization serving as a key enabler. Standardization plays a critical role in reducing coordination costs and improving resource allocation efficiency by facilitating rule harmonization, factor integration, and collaborative governance. Examining IGG from a standardization perspective helps clarify the mechanisms through which economic, environmental, and social objectives can be jointly realized, and offers new insights into the institutionalized and sustainable pursuit of multiple development goals. However, how standardization promotes IGG by coordinating economic growth, environmental performance, and social equity remains insufficiently explored in the existing literature. Using panel data from 283 prefecture-level Chinese cities (2012–2021), this study treats the comprehensive standardization reform pilot as a quasi-natural experiment and applies a double machine learning framework to test whether standardization promotes IGG. The analysis further explores the mediating roles of technological innovation, green finance, and employment quality, and examines heterogeneity across geographic location, resource endowment, industrial base, and city hierarchy. It also evaluates the regional coordination effects of standardization. Results show that standardization significantly advances IGG, though its impact varies with regional and structural characteristics. Standardization enhances IGG by strengthening innovation, expanding green finance, and improving job quality. Moreover, it helps bridge geographic divides, narrow interregional disparities, and enhance coordination. These findings offer empirical evidence for policymakers to design targeted standardization strategies that support sustainable and equitable urban development.
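For readers unfamiliar with the double machine learning framework used here, a generic partialling-out sketch with cross-fitting (in the style of Chernozhukov et al., 2018) is shown below; it is not the paper's exact specification, and the nuisance models and toy data are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def dml_partial_out(X, d, y, n_splits=5, seed=0):
    """Minimal double ML (partialling-out) estimate of the effect of a
    treatment d on an outcome y given controls X, with cross-fitting:
    residualize y and d out-of-fold, then regress residual on residual."""
    res_y = np.zeros_like(y, dtype=float)
    res_d = np.zeros_like(d, dtype=float)
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        m_y = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        m_d = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        res_y[test] = y[test] - m_y.predict(X[test])   # outcome residuals
        res_d[test] = d[test] - m_d.predict(X[test])   # treatment residuals
    return LinearRegression().fit(res_d.reshape(-1, 1), res_y).coef_[0]

# Toy usage: the true effect is 2.0, which the estimate should approximate.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
d = X[:, 0] + rng.normal(size=500)
y = 2.0 * d + X[:, 1] + rng.normal(size=500)
theta_hat = dml_partial_out(X, d, y)
```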
Funding: Supported by the European Research Council (ERC) under Grant Agreement No. 951424 (Water-Futures) and by the Republic of Cyprus through the Deputy Ministry of Research, Innovation and Digital Policy.
Abstract: In today's connected world, the generation of massive streaming data across diverse domains has become commonplace. Concept drift, class imbalance, label scarcity, and the emergence of new classes jointly degrade representation stability, bias learning toward outdated distributions, and reduce the resilience and reliability of detection in dynamic environments. This paper proposes a streaming class-incremental learning (SCIL) framework to address these issues. The SCIL framework integrates an autoencoder (AE) with a multi-layer perceptron for multi-class prediction, employs a dual-loss strategy (classification and reconstruction) for prediction and new class detection, uses corrected pseudo-labels for online training, manages classes with queues, and applies oversampling to handle imbalance. The rationale behind the method's structure is elucidated through ablation studies, and a comprehensive experimental evaluation is performed on both real-world and synthetic datasets featuring class imbalance, incremental classes, and concept drift. Our results demonstrate that SCIL outperforms strong baselines and state-of-the-art methods. In line with our commitment to Open Science, we make our code and datasets available to the community.
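The dual-loss core of the framework, an autoencoder whose latent code feeds an MLP classifier trained on cross-entropy plus reconstruction error, can be sketched in a few lines of PyTorch. Layer sizes, the loss weight, and the use of reconstruction error for new-class detection are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SCILNet(nn.Module):
    """Autoencoder with an MLP classification head trained under a dual loss
    (cross-entropy + reconstruction). High reconstruction error on a sample
    can be used to flag a potential new class."""
    def __init__(self, in_dim=64, latent=16, n_classes=5):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, in_dim))
        self.clf = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, n_classes))

    def forward(self, x):
        z = self.enc(x)
        return self.clf(z), self.dec(z)

model = SCILNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()

x = torch.randn(32, 64)                      # toy mini-batch from the stream
y = torch.randint(0, 5, (32,))
logits, recon = model(x)
loss = ce(logits, y) + 0.5 * mse(recon, x)   # dual loss; 0.5 is an assumed weight
opt.zero_grad()
loss.backward()
opt.step()
```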
Funding: Supported by SungKyunKwan University and the BK21 FOUR (Graduate School Innovation) program funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF).
Abstract: With the recent increase in data volume and diversity, traditional text representation techniques struggle to capture context, particularly in environments with sparse data. To address these challenges, this study proposes a new model, the Masked Joint Representation Model (MJRM). MJRM approximates the original hypothesis by leveraging multiple elements in a limited context, and dynamically adapts to changes in data characteristics through three main components. First, masking-based representation learning, termed selective dynamic masking, integrates topic modeling and sentiment clustering to generate and train multiple instances across different data subsets, whose predictions are then aggregated with optimized weights. This design alleviates sparsity, suppresses noise, and preserves contextual structure. Second, regularization-based improvements are applied. Third, techniques for handling sparse data are used for final inference. As a result, MJRM improves performance by up to 4% over existing AI techniques. In our experiments, we analyzed the contribution of each factor, demonstrating that masking, dynamic learning, and the aggregation of multiple instances complement one another. This shows that a masking-based multi-learning strategy is effective for context-aware sparse text classification and remains useful in challenging situations such as data shortage or shifts in data distribution. We expect the approach to extend to diverse fields such as sentiment analysis, spam filtering, and domain-specific document classification.
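The aggregation step, training multiple instances on different data subsets and combining their predictions with optimized weights, might look roughly like the following, with validation accuracy standing in for the optimized weights. This is a loose illustration; MJRM's actual subset construction uses topic modeling and sentiment clustering, which are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_instances(subsets):
    """Train one classifier per data subset (an illustrative stand-in for
    MJRM's masked instances)."""
    return [LogisticRegression(max_iter=1000).fit(X, y) for X, y in subsets]

def aggregate(models, X_val, y_val, X_test):
    """Weight each instance by its validation accuracy (a simple stand-in
    for 'optimized weights'), then average predicted probabilities. Assumes
    every subset contains all classes so the probability columns align."""
    accs = np.array([m.score(X_val, y_val) for m in models])
    w = accs / accs.sum()
    probs = sum(wi * m.predict_proba(X_test) for wi, m in zip(w, models))
    return probs.argmax(axis=1)

# Toy usage: three random subsets of a shared binary problem.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(100, 5)) for _ in range(3)]
ys = [(x[:, 0] > 0).astype(int) for x in Xs]
models = train_instances(list(zip(Xs, ys)))
preds = aggregate(models, Xs[0], ys[0], rng.normal(size=(10, 5)))
```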
Funding: Supported by the Gansu Provincial Natural Science Foundation (Grant No. 25JRRA074); the Gansu Provincial Key R&D Science and Technology Program (Grant No. 24YFGA060); and the National Natural Science Foundation of China (Grant No. 62161019).
Abstract: Modern intelligent systems, such as autonomous vehicles and face recognition, must continuously adapt to new scenarios while preserving their ability to handle previously encountered situations. However, when neural networks learn new classes sequentially, they suffer from catastrophic forgetting: the tendency to lose knowledge of earlier classes. This challenge, which lies at the core of class-incremental learning, severely limits the deployment of continual learning systems in real-world applications with streaming data. Existing approaches, including rehearsal-based methods and knowledge distillation techniques, have attempted to address this issue but often struggle to preserve decision boundaries and discriminative features under limited memory constraints. To overcome these limitations, we propose a support-vector-guided framework for class-incremental learning. The framework integrates an enhanced feature extractor with a Support Vector Machine classifier, which generates boundary-critical support vectors to guide both replay and distillation. Building on this architecture, we design a joint feature retention strategy that combines boundary proximity with feature diversity, and a Support Vector Distillation Loss that enforces dual alignment in the decision and semantic spaces. In addition, triple attention modules are incorporated into the feature extractor to enhance representation power. Extensive experiments on CIFAR-100 and Tiny-ImageNet demonstrate consistent improvements. With 5 tasks, our method achieves 71.68% and 58.61% average accuracy on CIFAR-100 and Tiny-ImageNet, outperforming strong baselines by 3.34% and 2.05%, respectively. These advantages hold across different task splits, highlighting the robustness and generalization of the proposed approach. Beyond benchmark evaluations, the framework also shows potential for few-shot and resource-constrained applications such as edge computing and mobile robotics.
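One plausible reading of support-vector-guided replay is sketched below: fit an SVM on extracted deep features and keep its support vectors as boundary-critical exemplars for the memory buffer. The paper additionally combines boundary proximity with feature diversity; only the boundary part is illustrated here, and the per-class cap is an assumption.

```python
import numpy as np
from sklearn.svm import SVC

def select_replay_exemplars(features, labels, per_class=20, C=1.0):
    """Fit a linear SVM on deep features and keep support vectors as
    boundary-critical replay exemplars, at most `per_class` per class."""
    svm = SVC(kernel="linear", C=C).fit(features, labels)
    sv_idx = svm.support_                    # indices of support vectors
    keep = []
    for c in np.unique(labels):
        cls_sv = [i for i in sv_idx if labels[i] == c][:per_class]
        keep.extend(cls_sv)
    return np.array(keep)

# Toy usage on random "deep features" with a linearly separable label.
feats = np.random.default_rng(0).normal(size=(200, 16))
labs = (feats[:, 0] > 0).astype(int)
buffer_idx = select_replay_exemplars(feats, labs)
```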
Abstract: The demand for extended electric vehicle (EV) range necessitates advanced lightweighting strategies. This study introduces a materials genome approach, augmented by machine learning (ML), for optimizing lightweight composite designs for EVs. A comprehensive materials genome database was developed, encompassing composites based on carbon, glass, and natural fibers and systematically recording critical parameters such as mechanical properties, density, cost, and environmental impact. Machine learning models, including Random Forest, Support Vector Machines, and Artificial Neural Networks, were employed to build a predictive system for material performance. Material composition was then optimized using a multi-objective genetic algorithm. Experimental validation demonstrated that an optimized carbon fiber/bio-based resin composite achieved a 45% weight reduction compared with conventional steel while maintaining equivalent structural strength. The predictive accuracy of the models reached 94.2%. A cost-benefit analysis indicated that despite a 15% increase in material cost, overall vehicle energy consumption decreased by 12%, yielding an 18% total cost saving over a five-year operational lifecycle under a representative mid-size battery electric vehicle (BEV) operational scenario.
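A toy version of the composition search, Dirichlet-sampled mass fractions evolved by a genetic algorithm against a scalarized objective, is sketched below. The actual method is multi-objective (e.g., Pareto-based) and uses the trained ML property models; the weighted-sum fitness, the three-component composition, and the placeholder predictors here are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_composition(n):
    """Random (fiber, matrix, filler) mass fractions summing to 1 (Dirichlet)."""
    return rng.dirichlet(np.ones(3), size=n)

def fitness(pop, predict_strength, predict_density, predict_cost):
    """Scalarized multi-objective score; the three predictors stand in for
    the trained ML property models, and the weights are placeholders."""
    return (1.0 * predict_strength(pop)
            - 0.5 * predict_density(pop)
            - 0.3 * predict_cost(pop))

def ga_step(pop, scores, elite=10, mut=0.05):
    """One GA generation: keep elites, blend-crossover, mutate, renormalize."""
    order = np.argsort(scores)[::-1]
    parents = pop[order[:elite]]
    idx = rng.integers(0, elite, size=(len(pop) - elite, 2))
    alpha = rng.random((len(pop) - elite, 1))
    children = alpha * parents[idx[:, 0]] + (1 - alpha) * parents[idx[:, 1]]
    children += rng.normal(scale=mut, size=children.shape)
    children = np.clip(children, 1e-6, None)
    children /= children.sum(axis=1, keepdims=True)  # fractions sum to 1
    return np.vstack([parents, children])

# Toy usage with linear placeholder property models over the fractions.
pop = random_composition(100)
strength = lambda p: p @ np.array([1.0, 0.6, 0.3])
density = lambda p: p @ np.array([0.8, 1.0, 0.5])
cost = lambda p: p @ np.array([1.0, 0.4, 0.2])
for _ in range(50):
    pop = ga_step(pop, fitness(pop, strength, density, cost))
```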