Abstract: Recently, audio–visual speech recognition (AVSR) has attracted increasing attention. However, most existing works simplify the complex challenges of real-world applications and focus only on scenarios with two speakers and perfectly aligned audio-video clips. In this work, we study the effect of speaker number and modal misalignment in the AVSR task, and propose an end-to-end AVSR framework under a more realistic condition. Specifically, we propose a speaker-number-aware mixture-of-experts (SA-MoE) mechanism to explicitly model the characteristic differences between scenarios with different speaker numbers, and a cross-modal realignment (CMR) module for robust handling of asynchronous inputs. We also exploit the underlying difficulty differences and introduce a new training strategy named challenge-based curriculum learning (CBCL), which forces the model to focus on difficult, challenging data instead of simple data to improve efficiency.
Abstract: With the emergence of pre-trained models, current neural networks can deliver task performance comparable to that of humans. However, we know little about the fundamental working mechanism of pre-trained models: we do not know how they reach such performance or how the model solves the task. For example, given a task, humans learn from easy to hard, whereas the model learns in random order. Undeniably, difficulty-insensitive learning has led to great success in natural language processing (NLP), but little attention has been paid to the effect of text difficulty in NLP. We propose a human learning matching index (HLM Index) to investigate the effect of text difficulty. Experimental results show: 1) LSTM exhibits more human-like learning behavior than BERT. Additionally, UID-SuperLinear gives the best evaluation of text difficulty among four text difficulty criteria. Among nine tasks, performance on some tasks is related to text difficulty, whereas on others it is not. 2) A model trained on easy data performs best on both easy and medium test data, whereas a model trained on hard data performs well only on hard test data. 3) Training the model from easy to hard leads to quicker convergence.
Abstract: Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of +1.7-2.6 points in accuracy and 0.8-1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
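The three-phase schedule in the abstract above can be illustrated with a small loss-weighting helper. This is only a sketch under stated assumptions: the phase boundaries (30% and 80% of training), the equal 0.5/0.5 weights in the joint phase, and the names `PhaseWeights`, `tscl_weights`, and `combined_loss` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhaseWeights:
    """Weights applied to the prediction loss and the explanation loss."""
    prediction: float
    explanation: float


def tscl_weights(step: int, total_steps: int,
                 boundaries: Tuple[float, float] = (0.3, 0.8)) -> PhaseWeights:
    """Return loss weights for the current step of a three-phase curriculum.

    The boundaries are assumed fractions of training at which the schedule
    moves from prediction-only to joint, and from joint to explanation-only.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = step / total_steps
    first, second = boundaries
    if progress < first:
        return PhaseWeights(1.0, 0.0)   # phase (i): prediction-only
    if progress < second:
        return PhaseWeights(0.5, 0.5)   # phase (ii): joint (assumed equal weights)
    return PhaseWeights(0.0, 1.0)       # phase (iii): explanation-only


def combined_loss(pred_loss: float, expl_loss: float, w: PhaseWeights) -> float:
    """Weighted sum of the two objectives for the current phase."""
    return w.prediction * pred_loss + w.explanation * expl_loss
```

In a training loop, `tscl_weights(step, total_steps)` would be evaluated once per step and its result passed to `combined_loss`. Nothing about the model changes, which matches the abstract's claim that no architectural changes are required.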
Funding: the National Key Research and Development Program of China (No. 2020AAA0108901).
Abstract: This paper presents a novel algorithm for training robotic arms to manipulate cloth by leveraging reinforcement learning and curriculum learning approaches. Traditional cloth manipulation algorithms rely heavily on predefined action primitives and assumptions about cloth dynamics, introducing significant prior knowledge. To circumvent this limitation, we use reinforcement learning to train our cloth folding agent. To take full advantage of reinforcement learning, we propose a semi-sparse reward function incorporating folding accuracy and a curriculum scheme to accelerate training and improve policy stability. We validate the proposed method by implementing it in the StableBaselines3 framework and training the agent with the soft actor-critic algorithm in our virtual environment, which is based on a physics-based cloth simulator. Our results demonstrate the benefits of the curriculum learning scheme, which increases sample efficiency and accelerates the training process compared with previous reinforcement learning cloth manipulation methods.
Funding: the Suzhou Municipal Health and Family Planning Commission's Key Diseases Diagnosis and Treatment Program (No. LCzX202001); the Science and Technology Development Project of Suzhou (Nos. SS2019012 and SKY2021031); the Youth Innovation Promotion Association CAS (No. 2021324); the Medical Research Project of Jiangsu Provincial Health and Family Planning Commission (No. M2020068).
Abstract: The Gleason grade group (GG) is an important basis for assessing the malignancy of prostate cancer, but it requires invasive biopsy to obtain pathology. To noninvasively evaluate GG, an automatic prediction method is proposed based on a multi-scale convolutional neural network with an ensemble attention module trained with curriculum learning. First, a lesion-attention map based on the image of the region of interest is proposed in combination with the bottleneck attention module to make the network focus more on the lesion area. Second, the feature pyramid network is incorporated so that the network better learns the multi-scale information of the lesion area. Finally, in network training, a curriculum based on the consistency gap between the visual evaluation and the pathological grade is proposed, which further improves the prediction performance of the network. Experimental results show that the proposed method outperforms the traditional network model in predicting GG. The quadratic weighted Kappa is 0.4711 and the positive predictive value for predicting clinically significant cancer is 0.9369.
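For readers unfamiliar with the quadratic weighted Kappa reported above, the sketch below computes it from scratch using the standard definition, kappa = 1 - sum(w * O) / sum(w * E) with weights w_ij = (i - j)^2 / (k - 1)^2. The function name and the assumption that grades are encoded as integers 0..k-1 are illustrative; this is not the paper's evaluation code.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Quadratic weighted Cohen's kappa for integer labels in [0, num_classes)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    k, n = num_classes, len(y_true)

    # Observed confusion matrix O[i][j]: true grade i, predicted grade j.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = row_totals[i] * col_totals[j] / n  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when all ratings share one class")
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 means perfect agreement and 0 means agreement no better than chance. Because the weights grow quadratically, a prediction two grades away from the pathology is penalized four times as heavily as a prediction one grade away.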
Funding: the 111 Project (No. BP0719010); the Project of the Science and Technology Commission of Shanghai Municipality (No. 18DZ2270700).
Abstract: By leveraging data from a fully labeled source domain, unsupervised domain adaptation (UDA) improves classification performance on an unlabeled target domain through explicit discrepancy minimization of data distributions or adversarial learning. As an enhancement, category alignment is involved during adaptation to reinforce target feature discrimination by utilizing model predictions. However, two problems remain unexplored: pseudo-label inaccuracy incurred by wrong category predictions on the target domain, and distribution deviation caused by overfitting on the source domain. In this paper, we propose a model-agnostic two-stage learning framework, which greatly reduces flawed model predictions using a soft pseudo-label strategy and avoids overfitting on the source domain with a curriculum learning strategy. Theoretically, it decreases the combined risk in the upper bound of the expected error on the target domain. In the first stage, we train a model with a distribution alignment-based UDA method to obtain soft semantic labels on the target domain with rather high confidence. To avoid overfitting on the source domain, in the second stage, we propose a curriculum learning strategy to adaptively control the weighting between losses from the two domains, so that the focus of training gradually shifts from the source distribution to the target distribution as prediction confidence on the target domain increases. Extensive experiments on two well-known benchmark datasets validate the universal effectiveness of our proposed framework in improving the performance of the top-ranked UDA algorithms and demonstrate its consistently superior performance.
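The second-stage idea above, shifting weight from the source loss to the target loss as target confidence grows, can be sketched as follows. The abstract does not give the weighting rule, so the linear mapping from mean confidence to a weight `lam`, and all function names here, are assumptions made purely for illustration.

```python
import math


def mean_confidence(probs):
    """Average of the highest class probability over a batch of target predictions."""
    return sum(max(p) for p in probs) / len(probs)


def target_weight(confidence, num_classes):
    """Map mean confidence to a weight in [0, 1].

    Assumed rule: 0 at chance level (1 / num_classes), 1 at full confidence.
    Requires num_classes >= 2.
    """
    chance = 1.0 / num_classes
    lam = (confidence - chance) / (1.0 - chance)
    return min(1.0, max(0.0, lam))


def soft_cross_entropy(soft_labels, probs, eps=1e-12):
    """Cross-entropy between soft pseudo-labels and predicted probabilities."""
    total = 0.0
    for q, p in zip(soft_labels, probs):
        total += -sum(qi * math.log(pi + eps) for qi, pi in zip(q, p))
    return total / len(probs)


def curriculum_loss(source_loss, target_loss, lam):
    """Blend the two domain losses; a larger lam puts more focus on the target."""
    return (1.0 - lam) * source_loss + lam * target_loss
```

Early in training, target predictions sit near chance level, so `lam` stays close to 0 and the labeled source loss dominates. As confidence rises, the soft pseudo-label loss on the target domain takes over.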
Funding: Fundamental Research Funds for the Central Universities, China (No. 2232023D-19).
Abstract: Text-to-SQL is the task of translating a natural language query into a structured query language. Existing text-to-SQL approaches focus on improving the model’s architecture while ignoring the relationship between queries and table schemas and the differences in difficulty between examples in the dataset. To tackle these challenges, a two-stage curriculum learning framework for text-to-SQL (TSCL-SQL) is proposed in this paper. To exploit the relationship between the queries and the table schemas, a schema identification pre-training task is proposed to make the model choose the correct table schema from a set of candidates for a specific query. To leverage the differences in difficulty between examples, curriculum learning is applied to the text-to-SQL task, accompanied by an automatic curriculum learning solution, including a difficulty scorer and a training scheduler. Experiments show that the framework proposed in this paper is effective.
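A difficulty scorer paired with a training scheduler, as mentioned above, could look roughly like the sketch below. The keyword-counting heuristic, the linear pacing function, the `start_fraction` parameter, and the `"sql"` dictionary key are all hypothetical choices for illustration; the abstract does not describe how TSCL-SQL scores or schedules examples.

```python
import math

# Clauses that tend to make a query harder (an assumed heuristic).
SQL_KEYWORDS = ("JOIN", "GROUP BY", "ORDER BY", "HAVING",
                "UNION", "INTERSECT", "EXCEPT")


def difficulty(sql: str) -> int:
    """Score a target SQL query by counting complex clauses and nested SELECTs."""
    upper = sql.upper()
    score = sum(upper.count(keyword) for keyword in SQL_KEYWORDS)
    score += max(0, upper.count("SELECT") - 1)  # each extra SELECT is a subquery
    return score


def available_examples(examples, step, total_steps, start_fraction=0.25):
    """Return the easiest slice of the data that is unlocked at this step.

    The unlocked fraction grows linearly from start_fraction to 1.0.
    """
    ranked = sorted(examples, key=lambda ex: difficulty(ex["sql"]))
    progress = min(1.0, step / total_steps)
    fraction = start_fraction + (1.0 - start_fraction) * progress
    count = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:count]
```

At each step a mini-batch would be sampled from `available_examples(...)`, so the model sees simple single-table queries first and nested or multi-clause queries only later in training.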
Abstract: This paper delves into the challenges and opportunities in the current educational system and proposes an innovative talent cultivation model that integrates science, industry, and education. Through an analysis of issues such as problems with university construction mechanisms, inadequate alignment between schools and enterprises, the disconnection between theory and practice, and a lack of awareness of innovation and entrepreneurship education, this paper explores a model using geography-related majors in higher education as an example. It discusses talent cultivation strategies based on innovation, professionalism, and practical education. Additionally, this paper explores a new teaching practice model for research-based learning curriculum design, as well as the construction and implementation of the curriculum system.
Funding: supported by the National Natural Science Foundation of China (U23B2032 and U2241214) and the Postgraduate Scientific Research Innovation Project of Hunan Province (CX20220055).
Abstract: Currently, multi-UAV collision detection and avoidance faces many challenges, such as navigating in cluttered environments with dynamic obstacles while equipped with low-cost perception devices having a limited field of view (FOV). To this end, we propose a communication-aided collision detection and avoidance method based on curriculum reinforcement learning (CRL). This method integrates perception and communication data to improve environmental understanding, allowing UAVs to handle potential collisions that might otherwise go unnoticed. Furthermore, given the challenges in policy learning caused by the substantial differences in scale between perception and communication data, we employ a two-stage training approach in which the network is expanded from part to whole. In the first stage, we train a partial policy network in an obstacle-free environment for inter-UAV collision avoidance. In the second stage, the full network is trained in a complex environment with obstacles, enabling both inter-UAV collision avoidance and obstacle avoidance. Experiments with PX4 software-in-the-loop (SITL) simulations and real flights demonstrate that our method outperforms state-of-the-art baselines, including the DRL-based method and NH-ORCA (Non-Holonomic Optimal Reciprocal Collision Avoidance), in terms of reliability of collision avoidance. In addition, the proposed method achieves zero-shot transfer from simulation to real-world environments that were never experienced during training.
Funding: supported by the National Natural Science Foundation of China (Nos. 22178383 and 21706282), the Beijing Natural Science Foundation (No. 2232021), and the Research Foundation of China University of Petroleum (Beijing) (No. 2462020BJRC004).
Abstract: Shuttle tanker scheduling is an important task in the offshore oil and gas transportation process, which involves operating time window fulfillment, optimal transportation planning, and proper inventory management. However, conventional approaches such as Mixed Integer Linear Programming (MILP) or metaheuristic algorithms often suffer from long running times. In this paper, a Graph Pointer Network (GPN) based Hierarchical Curriculum Reinforcement Learning (HCRL) method is proposed to solve the Shuttle Tankers Scheduling Problem (STSP). The model is trained to divide the STSP into voyage and operation stages and to generate routing and inventory management decisions sequentially. An asynchronous training strategy is developed to address the coupling between stages. Comparison experiments demonstrate that the proposed HCRL method achieves 12% shorter tour lengths on average compared to heuristic algorithms. Additional experiments validate its generalizability to unseen instances and scalability to larger instances.
Funding: co-supported by the National Natural Science Foundation of China (Grant No. 62103432), the China Postdoctoral Science Foundation (Grant No. 284881), and the Young Talent Fund of the University Association for Science and Technology in Shaanxi, China (Grant No. 20210108).
Abstract: The influence of a disturbing gravity field on the impact points of long-range vehicles (LRVs) has become increasingly prominent and is an important factor affecting the accuracy of impact point prediction (IPP). To achieve high-accuracy and fast IPP for LRVs under the influence of a disturbing gravity field, a data-driven multi-level IPP method is proposed to balance prediction accuracy and real-time performance. At the first level, the impact point of the current flight state is predicted based on elliptical trajectory theory, and the impact deviation of the elliptical trajectory (ID-ET) is calculated. At the second and third levels, a neural network (NN) model is established to learn the ID-ET caused by the J2 term and re-entry aerodynamic drag as well as that caused by the disturbing gravity field. To improve the NN prediction performance, an auxiliary circle is applied to decouple the ID-ET. To reduce the difficulty of NN learning, a training strategy is designed based on the idea of curriculum learning, which improves training accuracy. At the same time, a hybrid sample generation strategy is proposed to improve the NN generalization ability. A detailed simulation experiment is designed to analyze the advantages and computational complexity of the proposed method. The simulation results show that the proposed model has high prediction accuracy, strong generalization ability, and good real-time performance under the influence of the disturbing gravity field and re-entry aerodynamic drag. Among the 317,360 samples contained in the training and test sets, the 3σ prediction error was 6.21 m. On an STM32F407 single-chip microcomputer, the IPP required 3.415 ms. The proposed method can provide support for the design of guidance algorithms and is applicable to engineering practice.
Funding: Supported in part by the Australian Research Council (ARC) (Nos. FL-170100117, DP-180103424, IC-190100031, and LE-200100049).
Abstract: Concept learning constructs visual representations that are connected to linguistic semantics, which is fundamental to vision-language tasks. Although promising progress has been made, existing concept learners are still vulnerable to attribute perturbations and out-of-distribution compositions during inference. We ascribe the bottleneck to a failure to explore the intrinsic semantic hierarchy of visual concepts, e.g., {red, blue, ...} ∈ "color" subspace, yet cube ∈ "shape". In this paper, we propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces (i.e., visual superordinates). With only natural visual question answering data, our model first acquires the semantic hierarchy from a linguistic view and then explores mutually exclusive visual superordinates under the guidance of the linguistic hierarchy. In addition, quasi-center visual concept clustering and superordinate shortcut learning schemes are proposed to enhance the discrimination and independence of concepts within each visual superordinate. Experiments demonstrate the superiority of the proposed framework under diverse settings: it improves overall answering accuracy by a relative 7.5% for reasoning with perturbations and 15.6% for compositional generalization tests.
Abstract: This study provides a systematic analysis of the resource-consuming training of deep reinforcement-learning (DRL) agents for simulated low-speed automated driving (AD). In Unity, this study established two case studies: garage parking and navigating an obstacle-dense area. Our analysis involves training a path-planning agent with real-time-only sensor information. This study addresses research questions insufficiently covered in the literature, exploring curriculum learning (CL), agent generalization (knowledge transfer), computation distribution (CPU vs. GPU), and mapless navigation. CL proved necessary for the garage scenario and beneficial for obstacle avoidance. It involved adjustments at different stages, including terminal conditions, environment complexity, and reward function hyperparameters, guided by their evolution over multiple training attempts. Fine-tuning the simulation tick and decision period parameters was crucial for effective training. Abstracting high-level concepts (e.g., obstacle avoidance) requires training the agent in environments that are sufficiently complex in terms of the number of obstacles. While blogs and forums discuss training machine learning models in Unity, scientific articles on DRL agents for AD remain scarce. Because agent development requires considerable training time and difficult procedures, there is a growing need to support such research through scientific means. In addition to our findings, we contribute to the R&D community by releasing our environment as open source.
Funding: Project supported by the National Natural Science Foundation of China (No. 62572423).
Abstract: Recently, audio-visual speech recognition (AVSR) has attracted increasing attention. However, most existing works simplify the complex challenges of real-world applications and focus only on scenarios with two speakers and perfectly aligned audio-video clips. In this work, we study the effect of speaker number and modal misalignment on the AVSR task and propose an end-to-end AVSR framework under a more realistic condition. Specifically, we propose a speaker-number-aware mixture-of-experts (SA-MoE) mechanism to explicitly model the characteristic differences between scenarios with different speaker numbers, and a cross-modal realignment (CMR) module for robust handling of asynchronous inputs. We also exploit the underlying difference in difficulty and introduce a new training strategy named challenge-based curriculum learning (CBCL), which forces the model to focus on difficult, challenging data instead of simple data to improve efficiency.
Funding: Supported by the National Natural Science Foundation of China (Nos. U22B2059, 62176079), the National Natural Science Foundation of Heilongjiang Province, China (No. YQ 2022F005), and the Industry-University-Research Innovation Foundation of China University (No. 2021ITA05009).
Abstract: With the emergence of pre-trained models, current neural networks can achieve task performance comparable to that of humans. However, we know little about the fundamental working mechanism of pre-trained models: we do not know how they reach such performance or how the model solves the task. For example, given a task, humans learn from easy to hard, whereas the model learns in random order. Undeniably, difficulty-insensitive learning has led to great success in natural language processing (NLP), but little attention has been paid to the effect of text difficulty in NLP. We propose a human learning matching index (HLM Index) to investigate the effect of text difficulty. Experimental results show: 1) LSTM exhibits more human-like learning behavior than BERT. Additionally, UID-SuperLinear gives the best evaluation of text difficulty among four text difficulty criteria. Among nine tasks, performance on some is related to text difficulty, whereas performance on others is not. 2) A model trained on easy data performs best on both easy and medium test data, whereas a model trained on hard data performs well only on hard test data. 3) Training the model from easy to hard leads to quicker convergence.
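The easy-to-hard training order in finding 3 can be illustrated with a minimal sketch. Token count stands in for the difficulty score here purely as an assumption; the abstract's own criteria (such as UID-SuperLinear) are not reproduced, and the function names `difficulty` and `curriculum_stages` are hypothetical.

```python
from typing import Callable, List, Sequence


def difficulty(text: str) -> float:
    """Placeholder difficulty score: number of whitespace-separated tokens.

    This is an assumption for illustration, not a criterion from the abstract.
    """
    return float(len(text.split()))


def curriculum_stages(
    samples: Sequence[str],
    n_stages: int,
    score: Callable[[str], float] = difficulty,
) -> List[List[str]]:
    """Sort samples from easy to hard and split them into consecutive stages."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    ordered = sorted(samples, key=score)
    # Ceiling division so that every sample lands in some stage.
    stage_size = max(1, -(-len(ordered) // n_stages))
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


if __name__ == "__main__":
    data = [
        "the cat sat",
        "a dog",
        "the quick brown fox jumps over the lazy dog",
        "rain fell all night on the empty town",
    ]
    for stage, batch in enumerate(curriculum_stages(data, n_stages=2), start=1):
        # A real training loop would fit the model on each stage in turn.
        print(stage, batch)
```

A difficulty-insensitive baseline would shuffle `samples` instead of sorting them, so comparing the two orderings only requires swapping that one step.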