Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of +1.7 to 2.6 points in accuracy and 0.8 to 1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
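The three-phase schedule described in this abstract can be sketched as a loss-weighting function over training steps. The phase boundaries (30% and 80% of training) and the 0.5/0.5 joint weighting below are illustrative assumptions, not values reported by the paper:

```python
def tscl_weights(step, total_steps, p1=0.3, p2=0.8):
    """Return (prediction_weight, explanation_weight) for a training step.

    Phase boundaries p1 and p2 are fractions of total training; the
    values here are illustrative hyperparameters, not the paper's.
    """
    frac = step / total_steps
    if frac < p1:      # phase (i): prediction-only
        return 1.0, 0.0
    elif frac < p2:    # phase (ii): joint prediction-explanation
        return 0.5, 0.5
    else:              # phase (iii): explanation-only
        return 0.0, 1.0

def tscl_loss(pred_loss, expl_loss, step, total_steps):
    """Combine the two task losses according to the current phase."""
    w_pred, w_expl = tscl_weights(step, total_steps)
    return w_pred * pred_loss + w_expl * expl_loss
```

Because the schedule only reweights existing objectives, it adds no parameters and negligible cost, which is consistent with the abstract's efficiency claim.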
The viscosity of refining slags plays a critical role in metallurgical processes. However, obtaining accurate viscosity data remains challenging due to the complexities of high-temperature experiments, and practice often relies on empirical models with limited predictive capabilities. This study focuses on the influence of optical basicity on viscosity in CaO-Al_(2)O_(3)-based refining slags, leveraging machine learning to address data scarcity and improve prediction accuracy. An automated framework for algorithm integration, parameter tuning, and evaluation ranking (Auto-APE) is employed to develop customized data-driven models for various slag systems, including CaO-Al_(2)O_(3)-SiO_(2), CaO-Al_(2)O_(3)-CaF_(2), CaO-Al_(2)O_(3)-SiO_(2)-MgO, and CaO-Al_(2)O_(3)-SiO_(2)-MgO-CaF_(2). By incorporating optical basicity as a key feature, the models achieve an average validation error of 8.0% to 15.1%, significantly outperforming traditional empirical models. Additionally, symbolic regression is introduced to rapidly construct domain-specific features, such as optical basicity-like descriptors, offering a potential breakthrough in performance prediction for small datasets. This work highlights the critical role of domain-specific knowledge in understanding and predicting viscosity, providing a robust machine learning-based approach for optimizing refining slag properties.
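The key feature named in this abstract, optical basicity, is conventionally computed from mole fractions via the Duffy-Ingram mixing rule. A minimal sketch follows; the tabulated pure-oxide values are commonly cited figures from the literature (not taken from this paper), and fluorides such as CaF_(2) need a separate convention that is omitted here:

```python
# Theoretical optical basicity of an oxide melt (Duffy-Ingram form):
#   Lambda = sum(x_i * n_i * Lambda_i) / sum(x_i * n_i)
# where x_i is the mole fraction, n_i the oxygen atoms per formula unit,
# and Lambda_i the optical basicity of the pure oxide.
OXIDES = {
    # oxide: (oxygen atoms per formula unit, pure-oxide optical basicity)
    "CaO":   (1, 1.00),
    "SiO2":  (2, 0.48),
    "Al2O3": (3, 0.60),
    "MgO":   (1, 0.78),
}

def optical_basicity(mole_fractions):
    """Composition-weighted optical basicity of an oxide slag."""
    num = sum(x * OXIDES[ox][0] * OXIDES[ox][1]
              for ox, x in mole_fractions.items())
    den = sum(x * OXIDES[ox][0] for ox, x in mole_fractions.items())
    return num / den
```

A descriptor like this collapses a multi-component composition into one physically meaningful scalar, which is why it helps data-driven models trained on small viscosity datasets.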
Forecasting landslide deformation is challenging due to the influence of various internal and external factors on the occurrence of systemic and localized heterogeneities. Despite the potential to improve landslide predictability, deep learning has yet to be sufficiently explored for the complex deformation patterns associated with landslides and is inherently opaque. Herein, we developed a holistic landslide deformation forecasting method that considers spatiotemporal correlations of landslide deformation by integrating domain knowledge into interpretable deep learning. By spatially capturing the interconnections between multiple deformations from different observation points, our method contributes to the understanding and forecasting of systematic landslide behavior. By integrating domain knowledge relevant to each observation point and merging internal properties with external variables, our method accounts for local heterogeneity and identifies temporal deformation patterns in different landslide zones. Case studies involving reservoir-induced landslides and creeping landslides demonstrated that our approach (1) enhances the accuracy of landslide deformation forecasting, (2) identifies significant contributing factors and their influence on spatiotemporal deformation characteristics, and (3) demonstrates how identifying these factors and patterns facilitates landslide forecasting. Our research offers a promising and pragmatic pathway toward a deeper understanding and forecasting of complex landslide behaviors.
Despite significant progress in the Prognostics and Health Management (PHM) domain using systems that learn patterns from data, machine learning (ML) still faces challenges related to limited generalization and weak interpretability. A promising approach to overcoming these challenges is to embed domain knowledge into the ML pipeline, enhancing the model with additional pattern information. In this paper, we review the latest developments in PHM, encapsulated under the concept of Knowledge Driven Machine Learning (KDML). We propose a hierarchical framework to define KDML in PHM, which includes scientific paradigms, knowledge sources, knowledge representations, and knowledge embedding methods. Using this framework, we examine current research to demonstrate how various forms of knowledge can be integrated into the ML pipeline and provide a roadmap for specific usage. Furthermore, we present several case studies that illustrate specific implementations of KDML in the PHM domain, including inductive experience, physical models, and signal processing. We analyze the improvements in generalization capability and interpretability that KDML can achieve. Finally, we discuss the challenges, potential applications, and usage recommendations of KDML in PHM, with a particular focus on the critical need for interpretability to ensure trustworthy deployment of artificial intelligence in PHM.
Topographic maps, as essential tools and sources of information for geographic research, contain precise spatial locations and rich map features, and they illustrate spatio-temporal information on the distribution of and differences among various surface features. Currently, topographic maps are mainly stored in raster and vector formats. Extraction of the spatio-temporal knowledge in the maps—such as spatial distribution patterns, feature relationships, and dynamic evolution—still primarily relies on manual interpretation. However, manual interpretation is time-consuming and laborious, especially for large-scale, long-term map knowledge extraction and application. With the development of artificial intelligence technology, it is possible to improve the automation level of map knowledge interpretation. Therefore, the present study proposes an automatic interpretation method for raster topographic map knowledge based on deep learning. To address the limitations of current data-driven intelligent technology in learning map spatial relations and cognitive logic, we establish a formal description of map knowledge by mapping the relationship between map knowledge and features, thereby ensuring interpretation accuracy. Subsequently, deep learning techniques are employed to extract map features automatically, and the spatio-temporal knowledge is constructed by combining formal descriptions of geographic feature knowledge. Validation experiments demonstrate that the proposed method effectively achieves automatic interpretation of spatio-temporal knowledge of geographic features in maps, with an accuracy exceeding 80%. The findings of the present study contribute to machine understanding of spatio-temporal differences in map knowledge and advance the intelligent interpretation and utilization of cartographic information.
Defect detection based on computer vision is a critical component in ensuring the quality of industrial products. However, existing detection methods encounter several challenges in practical applications, including the scarcity of labeled samples, the limited adaptability of pre-trained models, and data heterogeneity in distributed environments. To address these issues, this research proposes an unsupervised defect detection method, FLAME (Federated Learning with Adaptive Multi-Model Embeddings). The method comprises three stages: (1) Feature learning stage: this work proposes FADE (Feature-Adaptive Domain-Specific Embeddings), a framework that employs Gaussian noise injection to simulate defective patterns and implements a feature discriminator for defect detection, thereby enhancing the pre-trained model's representation of industrial imagery. (2) Knowledge distillation co-training stage: a multi-model feature knowledge distillation mechanism is introduced. Through feature-level knowledge transfer between the global model and historical local models, the current local model is guided to learn better feature representations from the global model. This approach prevents local models from converging to local optima and mitigates performance degradation caused by data heterogeneity. (3) Model parameter aggregation stage: participating clients utilize weighted averaging aggregation to synthesize an updated global model, facilitating efficient knowledge consolidation. Experimental results demonstrate that FADE improves the average image-level Area Under the Receiver Operating Characteristic Curve (AUROC) by 7.34% compared to methods directly utilizing pre-trained models. In federated learning environments, FLAME's multi-model feature knowledge distillation mechanism outperforms the classic FedAvg algorithm by 2.34% in average image-level AUROC, while exhibiting superior convergence properties.
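The weighted averaging aggregation named in stage (3) is the standard FedAvg operation: each client's parameters are averaged in proportion to its local sample count. A minimal sketch with plain Python lists as parameter tensors (the dict layout is an illustrative assumption, not FLAME's actual data structure):

```python
def aggregate(client_params, client_sizes):
    """FedAvg-style weighted average of client model parameters.

    client_params: list of dicts mapping layer name -> list of floats.
    client_sizes:  local sample count per client (aggregation weights).
    """
    total = sum(client_sizes)
    global_params = {}
    for name in client_params[0]:
        n = len(client_params[0][name])
        global_params[name] = [
            sum(p[name][i] * s for p, s in zip(client_params, client_sizes)) / total
            for i in range(n)
        ]
    return global_params
```

Weighting by sample count keeps clients with more data from being drowned out by small ones, which matters under the data heterogeneity the abstract highlights.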
In the history of medical culture worldwide, the exchange and transmission of medical knowledge has formed an important part of mutual learning among different cultures, and it has increasingly shown unique academic value in the study of the history of knowledge. Traditional Eastern medicine (such as Chinese medicine, Indian Ayurvedic medicine, Persian medicine, and Arabic medicine) and the medical systems of the ancient Western world (including Greek medicine and Roman medicine) have left precious literature and texts, cultural relics (for example, pills, preparations, and medical instruments), folklore, and legends, which truly record the processes of learning, transplantation, fusion, and succession that followed the encounters of different medical systems over at least the past two thousand years.
The electro-chemo-mechanical mechanism is critical for understanding the initiation and propagation of lithium (Li) dendrites in solid-state lithium metal batteries (SSLMBs). Li dendrites often nucleate within surface defects in the solid-state electrolyte, leading to internal short circuits that hinder the practical application of SSLMBs. While conventional experimental and finite element methods provide valuable insights, they are often costly, time-consuming, and inefficient for capturing the complicated stress evolution inside the solid-state electrolyte. In this study, we propose a novel machine learning strategy that integrates prior knowledge and physics-informed constraints to predict the von Mises stress distribution induced by internal defects of the solid-state electrolyte. High-quality training datasets generated using a multiphysics simulation framework and key findings from previous studies were incorporated as physics-guided constraints to enhance the prediction reliability and physical consistency of the machine learning models. A modified UNet architecture with a squeeze-and-excitation module demonstrates remarkably high accuracy in stress prediction and exhibits excellent robustness and generalization across a wide range of defect scenarios. This model allows us to efficiently characterize the electro-chemo-mechanical failure of solid-state electrolytes, thereby guiding microstructural modifications and facilitating the design of SSLMBs for practical applications.
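Physics-guided constraints of the kind this abstract describes are typically implemented as penalty terms added to the data loss. The sketch below uses one generic constraint as an illustration (von Mises stress is non-negative by definition, so negative predictions are penalized); the paper's actual constraints and weighting are not specified here:

```python
import numpy as np

def physics_informed_loss(pred, target, lam=0.1):
    """Data loss plus a physics-guided penalty (illustrative only).

    The penalty term enforces non-negativity of the predicted von
    Mises stress field; lam is an assumed weighting hyperparameter.
    """
    data_loss = np.mean((pred - target) ** 2)          # standard MSE
    physics_penalty = np.mean(np.minimum(pred, 0.0) ** 2)  # negative stress
    return data_loss + lam * physics_penalty
```

Adding such terms steers the network toward physically admissible fields even in regions where simulation data is sparse, which is the stated motivation for the physics-guided constraints.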
The “Opinions on Comprehensively Deepening Curriculum Reform to Fulfill the Fundamental Task of Strengthening Moral Education”, issued by China’s Ministry of Education in 2015, explicitly identified Project-Based Learning (PBL) as a key strategy for cultivating students’ core competencies. Since then, PBL has been widely implemented as a pilot initiative in primary and secondary schools, gaining increasing influence. Analyzing the intellectual foundations of PBL research in China can offer valuable insights into its theoretical and practical dimensions. This study uses CiteSpace to examine 156 PBL-related articles from the CSSCI database, revealing that the knowledge base of PBL research is primarily built on two major domains. The first is the theoretical foundation, characterized by frequently cited literature focusing on the conceptual framework, educational value, interdisciplinary approaches, core competency cultivation, and instructional objectives of PBL. The second is empirical research, where highly cited studies include case analyses across K–12 settings, general high schools, and higher education institutions. Moving forward, future research on PBL should explore its meaning and value from a dual-subject and integrated perspective, expand case studies to include vocational education, and further promote the interdisciplinary development of core competencies through PBL.
Existing wireless networks are flooded with video data transmissions, and the demand for high-speed and low-latency video services continues to surge. This has brought challenges to networks in the form of congestion as well as the need for more resources and more dedicated caching schemes. Recently, Multi-access Edge Computing (MEC)-enabled heterogeneous networks, which leverage edge caches for proximity delivery, have emerged as a promising solution to all of these problems. Designing an effective edge caching scheme is, however, critical to their success in the face of limited resources. We propose a novel Knowledge Graph (KG)-based Dueling Deep Q-Network (KG-DDQN) for cooperative caching in MEC-enabled heterogeneous networks. The KG-DDQN scheme leverages a KG to uncover video relations, providing valuable insights into user preferences for the caching scheme. Specifically, the KG guides the selection of related videos as caching candidates (i.e., actions in the DDQN), thus providing a rich reference for implementing a personalized caching scheme while also improving the decision efficiency of the DDQN. Extensive simulation results validate the convergence effectiveness of KG-DDQN, and it also outperforms baselines regarding cache hit rate and service delay.
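The two mechanisms this abstract combines can be shown in a few lines: the dueling-network value/advantage combination, and restricting the action set to KG-suggested candidates. The scalar value and advantage vector stand in for network outputs; the candidate list stands in for videos the KG relates to current demand (all illustrative assumptions):

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling DQN head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    return value + advantages - np.mean(advantages)

def select_cache_action(value, advantages, kg_candidates):
    """Greedy action choice restricted to KG-suggested caching candidates.

    Restricting the argmax to kg_candidates is what shrinks the
    decision space and improves the DDQN's decision efficiency.
    """
    q = dueling_q(value, advantages)
    return max(kg_candidates, key=lambda a: q[a])
```

Subtracting the mean advantage makes the value/advantage decomposition identifiable, and the KG filter keeps the agent from wasting exploration on unrelated videos.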
Staple crops are the cornerstone of the food supply but are frequently threatened by plant diseases. Effective disease management, including disease identification and severity assessment, helps to better address these challenges. Currently, methods for disease severity assessment typically rely on calculating the area proportion of disease segmentation regions or on classification networks. However, these methods require large amounts of labeled data, and classification networks fail to quantify lesion proportions, leading to inaccurate evaluations. To address these issues, we propose an automated framework for disease severity assessment that combines multi-task learning and knowledge-driven large-model segmentation techniques. This framework includes an image information processor, a lesion and leaf segmentation module, and a disease severity assessment module. First, the image information processor utilizes a multi-task learning strategy to analyze input images comprehensively, ensuring a deep understanding of disease characteristics. Second, the lesion and leaf segmentation module employs prompt-driven large-model technology to accurately segment diseased areas and entire leaves, providing detailed visual analysis. Finally, the disease severity assessment module objectively evaluates the severity of the disease against professional grading standards by calculating lesion area proportions. Additionally, we have developed a comprehensive database of diseased leaf images from major crops, including several task-specific datasets. Experimental results demonstrate that our framework can accurately identify and assess the types and severity of crop diseases, even without extensive labeled data. Code and data are available at http://dkp-ads.samlab.cn/.
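The final assessment step this abstract describes, computing the lesion-area proportion from the two segmentation masks and mapping it to a grade, can be sketched directly. The grade thresholds below are illustrative placeholders, not the professional grading standard the paper uses:

```python
import numpy as np

def severity_grade(lesion_mask, leaf_mask, thresholds=(0.05, 0.25, 0.50)):
    """Grade disease severity from boolean segmentation masks.

    The proportion is lesion pixels within the leaf divided by total
    leaf pixels; thresholds (illustrative, not an official standard)
    map it to a grade in 0..len(thresholds).
    """
    leaf_pixels = np.count_nonzero(leaf_mask)
    ratio = np.count_nonzero(lesion_mask & leaf_mask) / leaf_pixels
    grade = sum(ratio >= t for t in thresholds)  # count thresholds passed
    return ratio, grade
```

Because the ratio comes from segmentation rather than a classifier's label, the lesion proportion is quantified explicitly, which is the inaccuracy the abstract says pure classification networks cannot avoid.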
Fault sensing in wind turbine (WT) generator bearings is essential for ensuring reliability and holding down maintenance costs. Feeding raw sensor data to a machine learning (ML) model often overlooks the interdependencies between system elements. This study proposes a new hybrid method that combines domain knowledge, via knowledge graphs (KGs), with traditional feature-based data. Incorporating contextual relationships through graph embedding methods such as Node2Vec captures meaningful information, such as the relationships among key parameters (e.g., wind speed, rotor Revolutions Per Minute (RPM), and temperature), in enriched feature representations. These node embeddings, when concatenated with the original data, allow the model to learn and generalize better. As shown by results on experimental data, the augmented ML model (with the KG) predicts substantially better, in terms of accuracy and error measures, than traditional ML methods. Paired t-test analysis confirms the statistical significance of this improvement. Moreover, graph-based feature importance increases the interpretability of the model and helps to uncover structurally significant variables that are otherwise ignored by common methods. The approach provides an effective, knowledge-guided way to execute intelligent fault detection on WT systems.
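Node2Vec, named in this abstract, learns embeddings by running biased random walks over the graph and feeding them to a skip-gram model. The sketch below shows only the walk-generation step in its unbiased special case (return and in-out parameters p = q = 1), over a toy parameter graph that is an assumption for illustration:

```python
import random

def random_walks(adj, walk_len=5, walks_per_node=10, seed=0):
    """Generate unbiased random walks over an adjacency dict.

    This is the p = q = 1 special case of Node2Vec's walk sampling;
    the full method biases transitions and then trains a skip-gram
    model on the walks to produce the node embeddings.
    """
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk, node = [start], start
            for _ in range(walk_len - 1):
                node = rng.choice(adj[node])  # uniform next-hop choice
                walk.append(node)
            walks.append(walk)
    return walks

# Toy KG over key parameters (illustrative, not the paper's graph)
adj = {"wind_speed": ["rotor_rpm"],
       "rotor_rpm": ["wind_speed", "bearing_temp"],
       "bearing_temp": ["rotor_rpm"]}
walks = random_walks(adj)
```

The learned embeddings would then be concatenated with the tabular features, which is how the hybrid model receives its structural context.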
Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study were open-sourced, providing resources for processing CTPs, KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine and the integration of multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
During the 17th and 18th centuries, medical exchanges between Japan and China were frequent and intensive. In the 17th century, due to the wars of the Ming-Qing transition, numerous Chinese physicians came to Japan, bringing with them advanced medical techniques and newly published medical texts. In the early 18th century, following Tokugawa Yoshimune's (徳川吉宗) implementation of medical reform policies, many Chinese physicians arrived in Japan. There, they exchanged knowledge with Japanese physicians and facilitated the publication of Chinese medical texts in Japan. These exchanges significantly increased attention to Chinese medical works, particularly the Shang Han Lun (《伤寒论》 Treatise on Cold Damage), within Edo medical circles. This had a profound impact on physicians of the Japanese Kohō school (古方派) and significantly contributed to shaping Kampō medicine into its contemporary form. From the perspectives of intellectual history and knowledge exchange, this paper explores the circulation of medical knowledge between China and Japan during the early modern period, examining its profound historical influence on Japanese medicine. The study specifically aims to clarify the authentic meaning of "Ancient Learning (复古)" and to correct the prevailing academic misconception that the Kohō school focused exclusively on reviving the Shang Han Lun.
Extrapolation on Temporal Knowledge Graphs (TKGs) aims to predict future knowledge from a set of historical Knowledge Graphs in chronological order. The temporally adjacent facts in TKGs naturally form event sequences, called event evolution patterns, implying informative temporal dependencies between events. Recently, many extrapolation works on TKGs have been devoted to modelling these evolution patterns, but the task is still far from resolved because most existing works simply encode these patterns into entity representations while overlooking the significant information implied by the relations of evolution patterns. However, the authors realise that the temporal dependencies inherent in the relations of these event evolution patterns may guide follow-up event prediction to some extent. To this end, a Temporal Relational Context-based Temporal Dependencies Learning Network (TRenD) is proposed to explore the temporal context of relations for more comprehensive learning of event evolution patterns, especially those temporal dependencies caused by the interactive patterns of relations. TRenD incorporates a semantic context unit to capture semantic correlations between relations and a structural context unit to learn the interaction patterns of relations. By learning the temporal contexts of relations semantically and structurally, the authors gain insight into the underlying event evolution patterns, enabling the extraction of more comprehensive historical information for future prediction. Experimental results on benchmark datasets demonstrate the superiority of the model.
This paper proposes a deep learning-based 3D LiDAR perception framework designed for applications such as autonomous robots and vehicles. To address the high dependency on large-scale annotated data—an inherent limitation of deep learning models—this study introduces a hybrid perception architecture that incorporates expert-driven LiDAR processing techniques into the deep neural network. Traditional 3D LiDAR processing methods typically remove ground planes and apply distance- or density-based clustering for object detection. In this work, such expert knowledge is encoded as feature-level inputs and fused with the deep network, thereby mitigating the data dependency of conventional learning-based approaches. Specifically, the proposed method combines two expert algorithms—Patchwork++ for ground segmentation and DBSCAN for clustering—with a PointPillars-based LiDAR detection network. We design four hybrid versions of the network depending on the stage and method of integrating expert features into the feature map of the deep model. Among these, Version 4 incorporates a modified neck structure in PointPillars and introduces a new Cluster 2D Pseudo-Map Branch that utilizes cluster-level pseudo-images generated from Patchwork++ and DBSCAN. This version achieved a +3.88% improvement in mean Average Precision (mAP) compared to the baseline PointPillars. The results demonstrate that embedding expert-based perception logic into deep neural architectures can effectively enhance performance and reduce dependency on extensive training datasets, offering a promising direction for robust 3D LiDAR object detection in real-world scenarios.
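A cluster-level pseudo-image of the kind the abstract's Cluster 2D Pseudo-Map Branch consumes can be sketched as a bird's-eye-view rasterization of DBSCAN-labelled, non-ground points. The grid size, extent, and per-cell statistic below are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def cluster_pseudo_map(points, labels, grid=(32, 32), extent=20.0):
    """Rasterize clustered (ground-removed) points into a BEV image.

    points: iterable of (x, y) coordinates in metres.
    labels: DBSCAN cluster labels (-1 marks noise, which is skipped).
    Each cell stores how many distinct clusters fall into it; a real
    branch might instead encode occupancy or per-cluster statistics.
    """
    h, w = grid
    cell_labels = [[set() for _ in range(w)] for _ in range(h)]
    for (x, y), lab in zip(points, labels):
        if lab < 0:                      # drop DBSCAN noise points
            continue
        i = int((x + extent) / (2 * extent) * h)
        j = int((y + extent) / (2 * extent) * w)
        if 0 <= i < h and 0 <= j < w:    # ignore points outside the map
            cell_labels[i][j].add(lab)
    return np.array([[len(s) for s in row] for row in cell_labels])
```

A map like this gives the detection network an explicit, expert-derived object prior alongside the learned pillar features.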
This paper explores the growing trend of digitized and audio learning in contemporary research, learning, and society. As technology advances and digital platforms become more accessible, the field of knowledge has witnessed a significant shift toward the popularity and examination of both digitized and audio content across various media forms. Changes in the way literary texts are accessed, disseminated, and consumed have paved the way for the emergence of digitized content, which transforms traditional texts into new forms and mediums, including e-books, audiobooks, internet research, ChatGPT, Grammarly, virtual reality (VR), Zoom teaching/learning, and so on. The study examines the potential embedded in digital and audio learning, and it criticizes the existing dismissal of digitized and internet resources as inadequate for making intellectual claims. As a prescriptive study, the paper employed a qualitative research method and used documentary observation as the instrument for data collection, scrutinizing various works of research conducted using digital sources in comparison to research done using traditional book and physical library sources. The paper maintains that, owing to media technology and artificial intelligence, digitized learning and e-learning are becoming faster and more popular among contemporary scholars and students. It concludes that digitized learning and teaching should be acknowledged as a tool for educational sustainability and development.
With the increasing constraints of hardware devices, there is a growing demand for compact models that can be deployed on device endpoints. Knowledge distillation, a widely used technique for model compression and knowledge transfer, has gained significant attention in recent years. However, traditional distillation approaches compare the knowledge of individual samples indirectly through class prototypes, overlooking the structural relationships between samples. Although recent distillation methods based on contrastive learning can capture relational knowledge, their relational constraints often distort the positional information of the samples, leading to compromised performance in the distilled model. To address these challenges and further enhance the performance of compact models, we propose a novel approach, termed contrastive learning-based multi-level knowledge distillation (CLMKD). The CLMKD framework introduces three key modules: class-guided contrastive distillation, gradient relation contrastive distillation, and semantic similarity distillation. These modules are effectively integrated into a unified framework to extract feature knowledge from multiple levels, capturing not only the representational consistency of individual samples but also their higher-order structure and semantic similarity. We evaluate the proposed CLMKD method on multiple image classification datasets, and the results demonstrate its superior performance compared to state-of-the-art knowledge distillation methods.
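For context on the distillation objective these multi-level methods build upon, the classic logit-matching loss is the temperature-scaled KL divergence between teacher and student distributions. This is the standard Hinton-style formulation, not CLMKD's specific modules; a single-sample numpy sketch:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax of logits at temperature T."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_kl(teacher_logits, student_logits, T=2.0):
    """Classic KD objective: KL(teacher softened || student softened).

    The T^2 factor keeps gradient magnitudes comparable across
    temperatures, as in the standard formulation.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * np.sum(p * np.log(p / q))
```

Raising T softens both distributions so the student also learns the teacher's relative preferences among wrong classes, which is precisely the "dark knowledge" that per-sample losses transfer and that relational methods like CLMKD extend to structure between samples.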
Modern intelligent systems, such as autonomous vehicles and face recognition, must continuously adapt to new scenarios while preserving their ability to handle previously encountered situations. However, when neural networks learn new classes sequentially, they suffer from catastrophic forgetting—the tendency to lose knowledge of earlier classes. This challenge, which lies at the core of class-incremental learning, severely limits the deployment of continual learning systems in real-world applications with streaming data. Existing approaches, including rehearsal-based methods and knowledge distillation techniques, have attempted to address this issue but often struggle to effectively preserve decision boundaries and discriminative features under limited memory constraints. To overcome these limitations, we propose a support vector-guided framework for class-incremental learning. The framework integrates an enhanced feature extractor with a Support Vector Machine classifier, which generates boundary-critical support vectors to guide both replay and distillation. Building on this architecture, we design a joint feature retention strategy that combines boundary proximity with feature diversity, and a Support Vector Distillation Loss that enforces dual alignment in the decision and semantic spaces. In addition, triple attention modules are incorporated into the feature extractor to enhance representation power. Extensive experiments on CIFAR-100 and Tiny-ImageNet demonstrate effective improvements: with 5 tasks, our method achieves 71.68% and 58.61% average accuracy, outperforming strong baselines by 3.34% and 2.05%, respectively. These advantages are consistently observed across different task splits, highlighting the robustness and generalization of the proposed approach. Beyond benchmark evaluations, the framework also shows potential in few-shot and resource-constrained applications such as edge computing and mobile robotics.
Abstract: Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of +1.7 to +2.6 points in accuracy and 0.8 to 1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
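The three-phase schedule reads naturally as a step-dependent weighting of the two losses. The sketch below illustrates that idea only; the phase boundaries, weight values, and function names are invustrative inventions here, not the paper's tuned schedule.

```python
def tscl_loss_weights(step, total_steps, p1=0.3, p2=0.8):
    """Return (prediction_weight, explanation_weight) for a training step.

    Phase boundaries p1/p2 are fractions of total training; the specific
    values are illustrative placeholders, not the paper's settings.
    """
    frac = step / total_steps
    if frac < p1:      # phase (i): prediction-only warm-up
        return 1.0, 0.0
    elif frac < p2:    # phase (ii): joint prediction-explanation
        return 0.5, 0.5
    else:              # phase (iii): explanation-only refinement
        return 0.0, 1.0

# The total loss at a step would then be w_pred * L_pred + w_expl * L_expl.
```

At inference time the schedule is irrelevant, which is consistent with the abstract's claim of no additional inference cost.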
Funding: Supported by the National Key Research and Development Program of China (No. 2023YFB3712401), the National Natural Science Foundation of China (No. 52274301), the Aeronautical Science Foundation of China (No. 2023Z0530S6005), and the Ningbo Yongjiang Talent-Introduction Programme (No. 2022A-023-C).
Abstract: The viscosity of refining slags plays a critical role in metallurgical processes. However, obtaining accurate viscosity data remains challenging due to the complexities of high-temperature experiments, so practice often relies on empirical models with limited predictive capabilities. This study focuses on the influence of optical basicity on viscosity in CaO-Al2O3-based refining slags, leveraging machine learning to address data scarcity and improve prediction accuracy. An automated framework for algorithm integration, parameter tuning, and evaluation ranking (Auto-APE) is employed to develop customized data-driven models for various slag systems, including CaO-Al2O3-SiO2, CaO-Al2O3-CaF2, CaO-Al2O3-SiO2-MgO, and CaO-Al2O3-SiO2-MgO-CaF2. By incorporating optical basicity as a key feature, the models achieve an average validation error of 8.0% to 15.1%, significantly outperforming traditional empirical models. Additionally, symbolic regression is introduced to rapidly construct domain-specific features, such as optical-basicity-like descriptors, offering a potential breakthrough in performance prediction for small datasets. This work highlights the critical role of domain-specific knowledge in understanding and predicting viscosity, providing a robust machine-learning-based approach for optimizing refining slag properties.
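Optical basicity itself is a simple weighted average, which makes it a cheap feature to compute from slag composition. A minimal sketch, assuming the usual Duffy-style definition with oxygen-count weighting; the numeric Λ values below are approximate literature figures, not data from this study.

```python
# Approximate Duffy-style optical basicity values for common slag oxides.
# Treat these as illustrative inputs, not authoritative data.
LAMBDA = {"CaO": 1.00, "Al2O3": 0.60, "SiO2": 0.48, "MgO": 0.78}
# Oxygen atoms per formula unit of each oxide (the weighting factor).
N_OXYGEN = {"CaO": 1, "Al2O3": 3, "SiO2": 2, "MgO": 1}

def optical_basicity(mole_fractions):
    """Oxygen-weighted optical basicity of an oxide slag.

    Lambda = sum(x_i * n_i * L_i) / sum(x_i * n_i), where x_i is the
    mole fraction and n_i the oxygen count of component i.
    """
    num = sum(x * N_OXYGEN[o] * LAMBDA[o] for o, x in mole_fractions.items())
    den = sum(x * N_OXYGEN[o] for o, x in mole_fractions.items())
    return num / den
```

A feature like this can be appended to the composition vector before model training.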
Funding: Supported by the Postdoctoral Fellowship Program of CPSF (Grant No. GZB20230685) and the National Science Foundation of China (Grant No. 42277161).
Abstract: Forecasting landslide deformation is challenging due to the influence of various internal and external factors on the occurrence of systemic and localized heterogeneities. Despite its potential to improve landslide predictability, deep learning has yet to be sufficiently explored for the complex deformation patterns associated with landslides, and it is inherently opaque. Herein, we developed a holistic landslide deformation forecasting method that considers spatiotemporal correlations of landslide deformation by integrating domain knowledge into interpretable deep learning. By spatially capturing the interconnections between multiple deformations from different observation points, our method contributes to the understanding and forecasting of systematic landslide behavior. By integrating specific domain knowledge relevant to each observation point and merging internal properties with external variables, our method accounts for local heterogeneity, identifying temporal deformation patterns in different landslide zones. Case studies involving reservoir-induced landslides and creeping landslides demonstrated that our approach (1) enhances the accuracy of landslide deformation forecasting, (2) identifies significant contributing factors and their influence on spatiotemporal deformation characteristics, and (3) shows how identifying these factors and patterns facilitates landslide forecasting. Our research offers a promising and pragmatic pathway toward a deeper understanding and forecasting of complex landslide behaviors.
Funding: Supported in part by the Science Center for Gas Turbine Project (Project No. P2022-DC-I-003-001) and the National Natural Science Foundation of China (Grant No. 52275130).
Abstract: Despite significant progress in the Prognostics and Health Management (PHM) domain using systems that learn patterns from data, machine learning (ML) still faces challenges related to limited generalization and weak interpretability. A promising approach to overcoming these challenges is to embed domain knowledge into the ML pipeline, enhancing the model with additional pattern information. In this paper, we review the latest developments in PHM, encapsulated under the concept of Knowledge-Driven Machine Learning (KDML). We propose a hierarchical framework to define KDML in PHM, which includes scientific paradigms, knowledge sources, knowledge representations, and knowledge embedding methods. Using this framework, we examine current research to demonstrate how various forms of knowledge can be integrated into the ML pipeline and provide a roadmap to specific usage. Furthermore, we present several case studies that illustrate specific implementations of KDML in the PHM domain, drawing on inductive experience, physical models, and signal processing. We analyze the improvements in generalization capability and interpretability that KDML can achieve. Finally, we discuss the challenges, potential applications, and usage recommendations of KDML in PHM, with a particular focus on the critical need for interpretability to ensure trustworthy deployment of artificial intelligence in PHM.
Funding: Supported by the Deep-time Digital Earth (DDE) Big Science Program (No. GJ-C03-SGF-2025-004), the National Natural Science Foundation of China (No. 42394063), and the Sichuan Science and Technology Program (No. 2025ZNSFSC0325).
Abstract: Topographic maps, as essential tools and sources of information for geographic research, contain precise spatial locations and rich map features, and they illustrate spatio-temporal information on the distribution of and differences among various surface features. Currently, topographic maps are mainly stored in raster and vector formats. Extraction of the spatio-temporal knowledge in the maps (such as spatial distribution patterns, feature relationships, and dynamic evolution) still primarily relies on manual interpretation. However, manual interpretation is time-consuming and laborious, especially for large-scale, long-term map knowledge extraction and application. With the development of artificial intelligence technology, it has become possible to improve the automation of map knowledge interpretation. Therefore, the present study proposes an automatic interpretation method for raster topographic map knowledge based on deep learning. To address the limitations of current data-driven intelligent technology in learning map spatial relations and cognitive logic, we establish a formal description of map knowledge by mapping the relationship between map knowledge and features, thereby ensuring interpretation accuracy. Subsequently, deep learning techniques are employed to extract map features automatically, and the spatio-temporal knowledge is constructed by combining formal descriptions of geographic feature knowledge. Validation experiments demonstrate that the proposed method effectively achieves automatic interpretation of the spatio-temporal knowledge of geographic features in maps, with an accuracy exceeding 80%. The findings of the present study contribute to machine understanding of spatio-temporal differences in map knowledge and advance the intelligent interpretation and utilization of cartographic information.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 32171909, 52205254, and 32301704; the Guangdong Basic and Applied Basic Research Foundation under Grants 2023A1515011255 and 2024A1515010199; the Scientific Research Projects of Universities in Guangdong Province under Grants 2024ZDZX1042 and 2024ZDZX3057; and the Ji-Hua Laboratory Open Project under Grant X220931UZ230.
Abstract: Defect detection based on computer vision is a critical component in ensuring the quality of industrial products. However, existing detection methods encounter several challenges in practical applications, including the scarcity of labeled samples, the limited adaptability of pre-trained models, and data heterogeneity in distributed environments. To address these issues, this research proposes an unsupervised defect detection method, FLAME (Federated Learning with Adaptive Multi-Model Embeddings). The method comprises three stages. (1) Feature learning stage: this work proposes FADE (Feature-Adaptive Domain-Specific Embeddings), a framework that employs Gaussian noise injection to simulate defective patterns and implements a feature discriminator for defect detection, thereby enhancing the pre-trained model's representation of industrial imagery. (2) Knowledge distillation co-training stage: a multi-model feature knowledge distillation mechanism is introduced. Through feature-level knowledge transfer between the global model and historical local models, the current local model is guided to learn better feature representations from the global model. This approach prevents local models from converging to local optima and mitigates performance degradation caused by data heterogeneity. (3) Model parameter aggregation stage: participating clients utilize weighted averaging aggregation to synthesize an updated global model, facilitating efficient knowledge consolidation. Experimental results demonstrate that FADE improves the average image-level Area Under the Receiver Operating Characteristic Curve (AUROC) by 7.34% compared to methods directly utilizing pre-trained models. In federated learning environments, FLAME's multi-model feature knowledge distillation mechanism outperforms the classic FedAvg algorithm by 2.34% in average image-level AUROC, while exhibiting superior convergence properties.
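The weighted averaging aggregation of stage (3) is the standard FedAvg-style update: parameters are averaged with each client weighted by its local sample count. A minimal sketch with plain Python lists standing in for tensors; the function name and data layout are illustrative assumptions, not the paper's code.

```python
def fedavg(client_params, client_sizes):
    """Weighted averaging aggregation over client model parameters.

    client_params: list of dicts mapping parameter name -> list of floats.
    client_sizes:  local sample count per client (aggregation weights).
    Real implementations operate on framework tensors, not lists.
    """
    total = sum(client_sizes)
    agg = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        agg[name] = [
            sum(p[name][i] * s for p, s in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return agg
```

Clients with more local data thus pull the global model more strongly toward their optimum.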
Abstract: In the history of medical culture worldwide, the exchange and transmission of medical knowledge has formed an important part of mutual learning among different cultures, and has increasingly shown unique academic value in the study of the history of knowledge. Traditional Eastern medicine (such as Chinese medicine, Indian Ayurvedic medicine, Persian medicine, and Arabic medicine) and other medical systems of the ancient Western world (including Greek and Roman medicine) have left precious literature and texts, cultural relics (for example, pills, preparations, and medical instruments), folklore, and legends, which truly record the process of learning, transplantation, fusion, and succession that followed the encounter of different medical systems over at least the past two thousand years.
Funding: Financially supported by the National Natural Science Foundation of China (22479067) and the Yunnan Young Talents Program under the "Xingdian Talent Support Plan" (KKXX202551007).
Abstract: The electro-chemo-mechanical mechanism is critical for understanding the initiation and propagation of lithium (Li) dendrites in solid-state lithium metal batteries (SSLMBs). Li dendrites often nucleate within surface defects in the solid-state electrolyte, leading to internal short circuits that hinder the practical application of SSLMBs. While conventional experimental and finite element methods provide valuable insights, they are often costly, time-consuming, and inefficient for capturing the complicated stress evolution inside the solid-state electrolyte. In this study, we propose a novel machine learning strategy that integrates prior knowledge and physics-informed constraints to predict the von Mises stress distribution induced by internal defects of the solid-state electrolyte. High-quality training datasets generated using a multiphysics simulation framework, together with key findings from previous studies, were incorporated as physics-guided constraints to enhance the prediction reliability and physical consistency of the machine learning models. Employing a modified UNet architecture with a squeeze-and-excitation module, the model demonstrates remarkably high accuracy in stress prediction and exhibits excellent robustness and generalization across a wide range of defect scenarios. This model allows us to efficiently characterize the electro-chemo-mechanical failure of the solid-state electrolyte, thereby guiding microstructural modifications and facilitating the design of SSLMBs for practical applications.
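A squeeze-and-excitation block recalibrates channels by pooling each channel to a scalar, passing the pooled vector through a small bottleneck, and rescaling channels by the resulting sigmoid gates. The NumPy sketch below shows only that mechanism, under assumed weight shapes; it is not the paper's UNet implementation.

```python
import numpy as np

def squeeze_excitation(feature_map, w1, w2):
    """Squeeze-and-excitation channel gating on a (C, H, W) feature map.

    Squeeze: global average pool per channel. Excite: two dense layers
    (ReLU then sigmoid) producing per-channel scales. Assumed shapes:
    w1 is (C//r, C), w2 is (C, C//r) for reduction ratio r.
    """
    c = feature_map.shape[0]
    squeezed = feature_map.reshape(c, -1).mean(axis=1)       # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)                  # ReLU bottleneck
    scales = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))            # sigmoid gates, (C,)
    return feature_map * scales[:, None, None]               # rescale channels
```

With zero-initialized weights every gate is 0.5, so the block starts as a uniform attenuation and learns channel-specific emphasis during training.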
Funding: Provincial-Level Quality Engineering Project, Preschool Education Teacher Training Base of Fuyang Normal University (Project No. 2023cyts023); University-Level Research Team Project, Collaborative Innovation Center for Basic Education in Northern Anhui (Project No. kytd202418).
Abstract: The “Opinions on Comprehensively Deepening Curriculum Reform to Fulfill the Fundamental Task of Strengthening Moral Education”, issued by China’s Ministry of Education in 2015, explicitly identified Project-Based Learning (PBL) as a key strategy for cultivating students’ core competencies. Since then, PBL has been widely implemented as a pilot initiative in primary and secondary schools, gaining increasing influence. Analyzing the intellectual foundations of PBL research in China can offer valuable insights into its theoretical and practical dimensions. This study uses CiteSpace to examine 156 PBL-related articles from the CSSCI database, revealing that the knowledge base of PBL research is primarily built on two major domains. The first is the theoretical foundation, characterized by frequently cited literature focusing on the conceptual framework, educational value, interdisciplinary approaches, core competency cultivation, and instructional objectives of PBL. The second is empirical research, where highly cited studies include case analyses across K-12 settings, general high schools, and higher education institutions. Moving forward, future research on PBL should explore its meaning and value from a dual-subject and integrated perspective, expand case studies to include vocational education, and further promote the interdisciplinary development of core competencies through PBL.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62201419 and 62372357), the Natural Science Foundation of Chongqing (CSTB2023NSCQ-LMX0032), and the ISN State Key Laboratory.
Abstract: Existing wireless networks are flooded with video data transmissions, and the demand for high-speed, low-latency video services continues to surge. This has brought challenges to networks in the form of congestion, as well as the need for more resources and more dedicated caching schemes. Recently, Multi-access Edge Computing (MEC)-enabled heterogeneous networks, which leverage edge caches for proximity delivery, have emerged as a promising solution to these problems. Designing an effective edge caching scheme is critical to their success, however, in the face of limited resources. We propose a novel Knowledge Graph (KG)-based Dueling Deep Q-Network (KG-DDQN) for cooperative caching in MEC-enabled heterogeneous networks. The KG-DDQN scheme leverages a KG to uncover relations between videos, providing valuable insights into user preferences for the caching scheme. Specifically, the KG guides the selection of related videos as caching candidates (i.e., actions in the DDQN), thus providing a rich reference for implementing a personalized caching scheme while also improving the decision efficiency of the DDQN. Extensive simulation results validate the convergence effectiveness of the KG-DDQN, and it also outperforms baselines regarding cache hit rate and service delay.
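For reference, the dueling architecture behind the DDQN splits the Q-function into a state value and per-action advantages, recombined with a mean-subtracted advantage term for identifiability. A minimal sketch of that recombination only; the caching-specific state and action encodings are omitted.

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine dueling streams: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Subtracting the mean advantage is the standard identifiability trick
    of the dueling DQN architecture. Here the action vector would index
    candidate videos proposed via the knowledge graph.
    """
    adv = np.asarray(advantages, dtype=float)
    return value + adv - adv.mean()
```

The greedy caching decision is then simply the argmax over the combined Q-values.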
Funding: Supported by the National Key Research and Development Program of China (2024YFD2001100, 2024YFE0214300), the National Natural Science Foundation of China (62162008), Guizhou Provincial Science and Technology Projects ([2024]002, CXTD[2023]027), the Guizhou Province Youth Science and Technology Talent Project ([2024]317), the Guiyang Guian Science and Technology Talent Training Project ([2024]2-15), and the Guizhou Province Graduate Education Innovation Program Project (2024YJSKYJJ096).
Abstract: Staple crops are the cornerstone of the food supply but are frequently threatened by plant diseases. Effective disease management, including disease identification and severity assessment, helps to better address these challenges. Currently, methods for disease severity assessment typically rely on calculating the area proportion of disease segmentation regions or on using classification networks for severity assessment. However, these methods require large amounts of labeled data and, when using classification networks, fail to quantify lesion proportions, leading to inaccurate evaluations. To address these issues, we propose an automated framework for disease severity assessment that combines multi-task learning and knowledge-driven large-model segmentation techniques. This framework includes an image information processor, a lesion and leaf segmentation module, and a disease severity assessment module. First, the image information processor utilizes a multi-task learning strategy to analyze input images comprehensively, ensuring a deep understanding of disease characteristics. Second, the lesion and leaf segmentation module employs prompt-driven large-model technology to accurately segment diseased areas and entire leaves, providing detailed visual analysis. Finally, the disease severity assessment module objectively evaluates the severity of the disease according to professional grading standards by calculating lesion area proportions. Additionally, we have developed a comprehensive database of diseased leaf images from major crops, including several task-specific datasets. Experimental results demonstrate that our framework can accurately identify and assess the types and severity of crop diseases, even without extensive labeled data. Code and data are available at http://dkp-ads.samlab.cn/.
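Once lesion and leaf masks are available, the final assessment step reduces to an area ratio bucketed by a grading standard. A sketch of that step with made-up thresholds; real grading standards are crop- and disease-specific, so the numbers below are placeholders only.

```python
import numpy as np

def severity_grade(lesion_mask, leaf_mask, thresholds=(0.05, 0.25, 0.50)):
    """Grade disease severity from binary lesion/leaf segmentation masks.

    Severity ratio = lesion area / leaf area, then bucketed by thresholds
    into grades 0 (mild) .. len(thresholds) (severe). Threshold values
    here are illustrative placeholders, not an official standard.
    """
    leaf_area = np.count_nonzero(leaf_mask)
    if leaf_area == 0:
        raise ValueError("empty leaf mask")
    ratio = np.count_nonzero(lesion_mask) / leaf_area
    grade = int(sum(ratio >= t for t in thresholds))
    return ratio, grade
```

Because the grade is derived from the ratio rather than predicted directly, the lesion proportion stays available for reporting.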
Abstract: Fault sensing in wind turbine (WT) generator bearings is essential for ensuring reliability and holding down maintenance costs. Feeding raw sensor data to a machine learning (ML) model often overlooks the interdependencies between system elements. This study proposes a new hybrid method that combines domain knowledge, via knowledge graphs (KGs), with traditional feature-based data. Incorporating contextual relationships through graph embedding methods such as Node2Vec can capture meaningful information, such as the relationships among key parameters (e.g., wind speed, rotor revolutions per minute (RPM), and temperature), in the enriched feature representations. These node embeddings, when combined with the original data, allow the model to learn and generalize better. As shown by results on experimental data, the KG-augmented ML model predicts much better, in terms of accuracy and error measures, than traditional ML methods. Paired t-test analysis confirms the statistical significance of this improvement. Moreover, graph-based feature importance increases the interpretability of the model and helps to uncover structurally significant variables that are otherwise ignored by common methods. The approach provides a knowledge-guided way to perform intelligent fault detection on WT systems.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62101087), the China Postdoctoral Science Foundation (Grant No. 2021MD703942), the Chongqing Postdoctoral Research Project Special Funding, China (Grant No. 2021XM2016), the Science Foundation of Chongqing Municipal Commission of Education, China (Grant No. KJQN202100642), and the Chongqing Natural Science Foundation, China (Grant No. cstc2021jcyj-msxmX0834).
Abstract: Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study are open-sourced, providing resources for processing CTPs, the KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advances in translational medicine and the integration of multi-omics data. We aim to inspire further innovation in multi-source data integration and to support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
Funding: Financed by grants from the JSPS KAKENHI Grant-in-Aid for Young Scientists (No. 24K15918) and the 2024 Inamori Research Grant Program.
Abstract: During the 17th and 18th centuries, medical exchanges between Japan and China were frequent and intensive. In the 17th century, owing to the wars of the Ming-Qing transition, numerous Chinese physicians came to Japan, bringing with them advanced medical techniques and newly published medical texts. In the early 18th century, following Tokugawa Yoshimune’s (徳川吉宗) implementation of medical reform policies, many more Chinese physicians arrived in Japan. There, they exchanged knowledge with Japanese physicians and facilitated the publication of Chinese medical texts in Japan. These exchanges significantly increased attention to Chinese medical works, particularly the Shang Han Lun (《伤寒论》, Treatise on Cold Damage), within Edo medical circles. This had a profound impact on physicians of the Japanese Kohō school (古方派) and significantly contributed to shaping Kampō medicine into its contemporary form. From the perspectives of intellectual history and knowledge exchange, this paper explores the circulation of medical knowledge between China and Japan during the early modern period, examining its profound historical influence on Japanese medicine. The study specifically aims to clarify the authentic meaning of “Ancient Learning” (复古) and to correct the prevailing academic misconception that the Kohō school focused exclusively on reviving the Shang Han Lun.
Funding: Supported in part by the National Natural Science Foundation of China (No. 62302507) and funding from the Harbin Institute of Technology (Shenzhen) (No. 20210035).
Abstract: Extrapolation on Temporal Knowledge Graphs (TKGs) aims to predict future knowledge from a set of historical Knowledge Graphs in chronological order. Temporally adjacent facts in TKGs naturally form event sequences, called event evolution patterns, implying informative temporal dependencies between events. Recently, many extrapolation works on TKGs have been devoted to modelling these evolution patterns, but the task is still far from resolved because most existing works simply rely on encoding these patterns into entity representations while overlooking the significant information implied by the relations of evolution patterns. However, the authors realise that the temporal dependencies inherent in the relations of these event evolution patterns may guide follow-up event prediction to some extent. To this end, a Temporal Relational Context-based Temporal Dependencies Learning Network (TRenD) is proposed to explore the temporal context of relations for more comprehensive learning of event evolution patterns, especially those temporal dependencies caused by interactive patterns of relations. TRenD incorporates a semantic context unit to capture semantic correlations between relations, and a structural context unit to learn the interaction patterns of relations. By learning the temporal contexts of relations semantically and structurally, the authors gain insights into the underlying event evolution patterns, enabling the model to extract more comprehensive historical information for future prediction. Experimental results on benchmark datasets demonstrate the superiority of the model.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2023-00245084), by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean Government (MOTIE) (RS-2024-00415938, HRD Program for Industrial Innovation), and by Soonchunhyang University.
Abstract: This paper proposes a deep learning-based 3D LiDAR perception framework designed for applications such as autonomous robots and vehicles. To address the high dependency on large-scale annotated data, an inherent limitation of deep learning models, this study introduces a hybrid perception architecture that incorporates expert-driven LiDAR processing techniques into the deep neural network. Traditional 3D LiDAR processing methods typically remove ground planes and apply distance- or density-based clustering for object detection. In this work, such expert knowledge is encoded as feature-level inputs and fused with the deep network, thereby mitigating the data dependency issue of conventional learning-based approaches. Specifically, the proposed method combines two expert algorithms, Patchwork++ for ground segmentation and DBSCAN for clustering, with a PointPillars-based LiDAR detection network. We design four hybrid versions of the network depending on the stage and method of integrating expert features into the feature map of the deep model. Among these, Version 4 incorporates a modified neck structure in PointPillars and introduces a new Cluster 2D Pseudo-Map Branch that utilizes cluster-level pseudo-images generated from Patchwork++ and DBSCAN. This version achieved a +3.88% improvement in mean Average Precision (mAP) compared to the baseline PointPillars. The results demonstrate that embedding expert-based perception logic into deep neural architectures can effectively enhance performance and reduce dependency on extensive training datasets, offering a promising direction for robust 3D LiDAR object detection in real-world scenarios.
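A cluster-level pseudo-image is essentially a rasterisation of clustered points into a fixed 2D grid that a convolutional branch can consume. The toy sketch below shows the idea with a binary occupancy map; the grid size, extent, and function name are arbitrary choices here, not the paper's design.

```python
import numpy as np

def cluster_pseudo_image(points, labels, grid=(8, 8), extent=4.0):
    """Rasterise clustered LiDAR points into a 2D occupancy pseudo-image.

    points: (N, 2) array of x, y coordinates in metres.
    labels: per-point cluster id, with -1 marking noise (DBSCAN convention).
    Cells covered by any clustered (non-noise) point are set to 1.
    """
    img = np.zeros(grid, dtype=np.float32)
    h, w = grid
    for (x, y), lab in zip(points, labels):
        if lab == -1:  # skip DBSCAN noise points
            continue
        i = int((x + extent) / (2 * extent) * h)
        j = int((y + extent) / (2 * extent) * w)
        if 0 <= i < h and 0 <= j < w:
            img[i, j] = 1.0
    return img
```

In a pipeline like the paper's, a map of this kind (per cluster or per cell) would be fed to the extra branch alongside the pillar features.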
Abstract: This paper explores the growing trend of digitized and audio learning in contemporary research, learning, and society. As technology advances and digital platforms become more accessible, the field of knowledge has witnessed a significant shift towards the popularity and examination of both digitized and audio content across various media forms. The ways literary texts are accessed, disseminated, and consumed have paved the way for the emergence of digitized content, which transforms traditional texts into new forms and mediums, including e-books, audiobooks, internet research, ChatGPT, Grammarly, virtual reality (VR), Zoom teaching and learning, and so on. The study examines the potential embedded in digital and audio learning. It criticizes the existing dismissal of digitized and internet resources as inadequate for making intellectual claims. As a prescriptive study, the paper employed a qualitative research method and used documentary observation as the instrument for data collection, scrutinizing various works of research conducted using digital sources in comparison to research done using traditional book and physical library sources. The paper maintains that, owing to media technology and artificial intelligence, digitized e-learning is becoming more popular, faster, and more contemporary among scholars and students. It concludes that there is a need to begin to acknowledge digitized learning and teaching as a tool for educational sustainability and development.
Funding: Supported in part by the National Natural Science Foundation of China (Grants 62262005, 61976107, and 61962010), the Research Foundation for Talented Scholars of Southwest University (Grant SWU-KR24002), the High-Level Innovative Talents in Guizhou Province program (Grant GCC[2023]033), the Natural Science Research Project of the Department of Education of Guizhou Province (Grant QJJ[2024]009), the Guizhou Provincial Department of Science and Technology (Grant QKHCG-DXGA[2025]-ZD002), and the High Performance Computing (HPC) clusters at Southwest University.
Abstract: With the increasing constraints of hardware devices, there is a growing demand for compact models that can be deployed on device endpoints. Knowledge distillation, a widely used technique for model compression and knowledge transfer, has gained significant attention in recent years. However, traditional distillation approaches compare the knowledge of individual samples indirectly through class prototypes, overlooking the structural relationships between samples. Although recent distillation methods based on contrastive learning can capture relational knowledge, their relational constraints often distort the positional information of the samples, leading to compromised performance in the distilled model. To address these challenges and further enhance the performance of compact models, we propose a novel approach, termed contrastive learning-based multi-level knowledge distillation (CLMKD). The CLMKD framework introduces three key modules: class-guided contrastive distillation, gradient relation contrastive distillation, and semantic similarity distillation. These modules are effectively integrated into a unified framework to extract feature knowledge at multiple levels, capturing not only the representational consistency of individual samples but also their higher-order structure and semantic similarity. We evaluate the proposed CLMKD method on multiple image classification datasets, and the results demonstrate its superior performance compared to state-of-the-art knowledge distillation methods.
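As background for the contrastive modules, a generic contrastive (InfoNCE-style) distillation loss treats the teacher embedding of the same sample as the positive and other samples' teacher embeddings as negatives, which is what lets it transfer inter-sample structure rather than only per-sample outputs. The sketch below is that generic loss, not CLMKD's three specific modules.

```python
import numpy as np

def contrastive_distill_loss(student, teacher, tau=0.5):
    """InfoNCE-style contrastive distillation between embedding batches.

    student, teacher: (N, D) embedding matrices for the same N samples.
    For each sample, its own teacher embedding is the positive and all
    other teacher embeddings are negatives; tau is the temperature.
    """
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    logits = s @ t.T / tau                        # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # push matched pairs together
```

A student whose embeddings align with the teacher's sample-by-sample gets a lower loss than one whose embeddings are permuted across samples.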
Funding: Supported by the Gansu Provincial Natural Science Foundation (Grant No. 25JRRA074), the Gansu Provincial Key R&D Science and Technology Program (Grant No. 24YFGA060), and the National Natural Science Foundation of China (Grant No. 62161019).
Abstract: Modern intelligent systems, such as autonomous vehicles and face recognition, must continuously adapt to new scenarios while preserving their ability to handle previously encountered situations. However, when neural networks learn new classes sequentially, they suffer from catastrophic forgetting: the tendency to lose knowledge of earlier classes. This challenge, which lies at the core of class-incremental learning, severely limits the deployment of continual learning systems in real-world applications with streaming data. Existing approaches, including rehearsal-based methods and knowledge distillation techniques, have attempted to address this issue but often struggle to effectively preserve decision boundaries and discriminative features under limited memory constraints. To overcome these limitations, we propose a support vector-guided framework for class-incremental learning. The framework integrates an enhanced feature extractor with a Support Vector Machine classifier, which generates boundary-critical support vectors to guide both replay and distillation. Building on this architecture, we design a joint feature retention strategy that combines boundary proximity with feature diversity, and a Support Vector Distillation Loss that enforces dual alignment in the decision and semantic spaces. In addition, triple attention modules are incorporated into the feature extractor to enhance representation power. Extensive experiments on CIFAR-100 and Tiny-ImageNet demonstrate effective improvements. With 5 tasks, our method achieves 71.68% and 58.61% average accuracy on CIFAR-100 and Tiny-ImageNet, outperforming strong baselines by 3.34% and 2.05%. These advantages are consistently observed across different task splits, highlighting the robustness and generalization of the proposed approach. Beyond benchmark evaluations, the framework also shows potential in few-shot and resource-constrained applications such as edge computing and mobile robotics.
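The boundary-proximity half of such a retention strategy can be illustrated with a linear SVM: samples are ranked by distance to the separating hyperplane, and the closest ones are kept as boundary-critical exemplars. This is a sketch of the selection idea only, with hypothetical names; the paper's strategy additionally mixes in a feature-diversity criterion.

```python
import numpy as np

def boundary_exemplars(features, w, b, k):
    """Pick the k samples closest to a linear SVM decision boundary.

    Distance of sample x to the hyperplane w.x + b = 0 is
    |w.x + b| / ||w||; the closest samples are the boundary-critical
    ones a support-vector-guided replay buffer would retain.
    """
    feats = np.asarray(features, dtype=float)
    dist = np.abs(feats @ w + b) / np.linalg.norm(w)
    return np.argsort(dist)[:k]  # indices of the k nearest samples
```

In an incremental setting this selection would run per old class when its task ends, filling the limited replay memory with the most boundary-relevant samples.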