Cemented tailings backfill (CTB) with initial defects is more prone to destabilization damage under the influence of various unfavorable factors during mining. To investigate the influence of such defects on the stability of underground mining engineering, this paper simulates different degrees of initial defects inside the CTB by adding different contents of air-entraining agent (AEA), investigates the acoustic emission (AE) RA/AF eigenvalues of CTB with different AEA contents under uniaxial compression, and adopts several denoising algorithms (e.g., moving average smoothing, median filtering, and outlier detection) to improve the accuracy of the data. The variance and autocorrelation coefficients of the RA/AF parameters were analyzed in conjunction with critical slowing down (CSD) theory. The results show that the AE RA/AF values can be used to characterize the progressive damage evolution of CTB. Denoising the AE signals reduces the effects of extraneous noise and anomalous spikes. Changes in the variance curves provide clear precursor information, while abrupt changes in the autocorrelation coefficient can be used as an auxiliary localization warning signal. The dramatic increase in the variance and autocorrelation coefficient curves during the compression-tightening stage, which is influenced by the initial defects, can lead to false warnings. As the initial defects of the CTB increase, its instability precursor time and instability time are prolonged, the peak stress decreases, and the time difference between the precursor and the instability damage becomes smaller. The results provide a new method for real-time monitoring and early warning of CTB instability damage.
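The two critical-slowing-down indicators named in the abstract above, variance and autocorrelation, can be illustrated with a minimal sketch. The window length, the median-filter width, the sample values, and all function names are assumptions chosen for illustration; the paper's actual processing pipeline is not given in the abstract.

```python
from statistics import median, pvariance


def median_filter(values, width=3):
    """Replace each sample by the median of a centred window (shorter at the edges)."""
    half = width // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def lag1_autocorrelation(window):
    """Lag-1 autocorrelation coefficient of one window; 0.0 if the window is constant."""
    mean = sum(window) / len(window)
    denom = sum((x - mean) ** 2 for x in window)
    if denom == 0:
        return 0.0
    num = sum((window[i] - mean) * (window[i + 1] - mean) for i in range(len(window) - 1))
    return num / denom


def csd_indicators(series, window=5):
    """Return (variance, lag-1 autocorrelation) for every sliding window of the series."""
    result = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        result.append((pvariance(chunk), lag1_autocorrelation(chunk)))
    return result


# Hypothetical RA/AF-like values with one spurious spike at index 3.
raw = [1.0, 1.1, 0.9, 9.0, 1.0, 1.2, 1.5, 2.1, 3.0, 4.2]
smoothed = median_filter(raw, width=3)
indicators = csd_indicators(smoothed, window=5)
```

A sustained rise in both indicators toward the end of `indicators` is the kind of pattern the abstract treats as precursor information; any warning threshold would have to be calibrated on real data.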
With the development of economic globalization, distributed manufacturing is becoming more and more prevalent. Recently, integrated scheduling of distributed production and assembly has attracted much attention. This research studies a distributed flexible job shop scheduling problem with assembly operations. Firstly, a mixed integer programming model is formulated to minimize the maximum completion time. Secondly, a Q-learning-assisted coevolutionary algorithm is presented to solve the model: (1) multiple populations are developed to seek the required decisions simultaneously; (2) an encoding and decoding method based on problem features is applied to represent individuals; (3) a hybrid approach of heuristic rules and random methods is employed to acquire a high-quality population; (4) three evolutionary strategies with crossover and mutation methods are adopted to enhance exploration capabilities; (5) three neighborhood structures based on problem features are constructed, and a Q-learning-based iterative local search method is devised to improve exploitation abilities. The Q-learning approach is applied to intelligently select better neighborhood structures. Finally, a group of instances is constructed to perform comparison experiments. The effectiveness of the Q-learning approach is verified by comparing the developed algorithm with its variant without the Q-learning method. Three renowned meta-heuristic algorithms are used in comparison with the developed algorithm. The comparison results demonstrate that the designed method exhibits better performance in coping with the formulated problem.
Optimizing convolutional neural networks (CNNs) for IoT attack detection remains a critical yet challenging task due to the need to balance multiple performance metrics beyond mere accuracy. This study proposes a unified and flexible optimization framework that leverages metaheuristic algorithms to automatically optimize CNN configurations for IoT attack detection. Unlike conventional single-objective approaches, the proposed method formulates a global multi-objective fitness function that integrates accuracy, precision, recall, and model size (a speed/model complexity penalty) with adjustable weights. This design enables both single-objective and weighted-sum multi-objective optimization, allowing adaptive selection of optimal CNN configurations for diverse deployment requirements. Two representative metaheuristic algorithms, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), are employed to optimize CNN hyperparameters and structure. At each generation/iteration, the best configuration is selected as the most balanced solution across optimization objectives, i.e., the one achieving the maximum value of the global objective function. Experimental validation on two benchmark datasets, Edge-IIoT and CIC-IoT2023, demonstrates that the proposed GA- and PSO-based models significantly enhance detection accuracy (94.8%–98.3%) and generalization compared with manually tuned CNN configurations, while maintaining compact architectures. The results confirm that the multi-objective framework effectively balances predictive performance and computational efficiency. This work establishes a generalizable and adaptive optimization strategy for deep learning-based IoT attack detection and provides a foundation for future hybrid metaheuristic extensions in broader IoT security applications.
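The weighted-sum fitness described in the abstract above might look roughly like the following sketch. The exact form of the size penalty, the normalization by `max_size`, the weight values, and the candidate dictionaries are all assumptions; the abstract only states that accuracy, precision, recall, and model size are combined with adjustable weights and that the configuration with the maximum global objective is selected.

```python
def global_fitness(accuracy, precision, recall, model_size, weights, max_size):
    """Weighted sum of three quality metrics minus a normalized model-size penalty."""
    w_acc, w_prec, w_rec, w_size = weights
    size_penalty = min(model_size / max_size, 1.0)  # assumed normalization to [0, 1]
    return w_acc * accuracy + w_prec * precision + w_rec * recall - w_size * size_penalty


def select_best(candidates, weights, max_size):
    """Pick the candidate configuration with the highest global fitness."""
    return max(
        candidates,
        key=lambda c: global_fitness(
            c["accuracy"], c["precision"], c["recall"], c["size"], weights, max_size
        ),
    )


# Hypothetical evaluated CNN configurations (metrics and parameter counts are made up).
candidates = [
    {"name": "large", "accuracy": 0.98, "precision": 0.97, "recall": 0.96, "size": 4_000_000},
    {"name": "compact", "accuracy": 0.97, "precision": 0.97, "recall": 0.97, "size": 500_000},
]

# Setting every weight but one to zero recovers single-objective optimization.
best_accuracy_only = select_best(candidates, (1.0, 0.0, 0.0, 0.0), max_size=5_000_000)
best_balanced = select_best(candidates, (0.3, 0.2, 0.2, 0.3), max_size=5_000_000)
```

In a GA or PSO loop, `global_fitness` would be evaluated for every individual or particle at each generation or iteration.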
Abstract: Reinforcement learning is widely used in artificial intelligence, automatic control, and related fields. However, ordinary reinforcement learning algorithms, such as Q-learning with a lookup table, cannot cope with extremely complex and dynamic environments because of the huge state space. To reduce the state space, a modular neural network Q-learning algorithm is proposed, which combines the Q-learning algorithm with neural networks and a modular method. A feed-forward neural network, an Elman neural network, and a radial basis function neural network are separately employed to construct this algorithm. The results show that the Elman neural network Q-learning algorithm performs best when the same neural network training method, i.e., the gradient descent error back-propagation algorithm, is applied.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52374156).
Abstract: To address low learning efficiency and inadequate path safety in spraying robot navigation within complex obstacle-rich environments, where dense, dynamic, and unpredictable obstacles challenge conventional methods, this paper proposes a hybrid algorithm integrating Q-learning and an improved A*-Artificial Potential Field (A-APF) method. Centered on the Q-learning framework, the algorithm leverages safety-oriented guidance generated by A-APF and employs a dynamic coordination mechanism that adaptively balances exploration and exploitation. The proposed system comprises four core modules: (1) an environment modeling module that constructs grid-based obstacle maps; (2) an A-APF module that combines heuristic search from the A* algorithm with repulsive force strategies from APF to generate guidance; (3) a Q-learning module that learns optimal state-action values (Q-values) through spraying robot-environment interaction and a reward function emphasizing path optimality and safety; and (4) a dynamic optimization module that ensures adaptive cooperation between Q-learning and A-APF through exploration rate control and environment-aware constraints. Simulation results demonstrate that the proposed method significantly enhances path safety in complex underground mining environments. Quantitative results indicate that, compared to the traditional Q-learning algorithm, the proposed method shortens training time by 42.95% and reduces training failures from 78 to just 3. Compared to the static fusion algorithm, it further reduces both training time (by 10.78%) and training failures (by 50%), thereby improving overall training efficiency.
Abstract: Optimization is the key to efficient utilization of resources in structural design. Because of the complex nature of truss systems, this study presents a method based on metaheuristic modelling that minimises structural weight under stress and frequency constraints. Two new algorithms, the Red Kite Optimization Algorithm (ROA) and the Secretary Bird Optimization Algorithm (SBOA), are applied to five benchmark trusses with 10, 18, 37, 72, and 200 bars. Both algorithms are evaluated against benchmarks in the literature. The results indicate that SBOA always reaches a lighter optimal design, with structural weight reductions ranging from 0.02% to 0.15% compared to ROA, and up to 6%–8% compared to conventional algorithms. In addition, SBOA achieves 15%–20% faster convergence and a 10%–18% reduction in computational time, with a smaller standard deviation over independent runs, which demonstrates its robustness and reliability. The adaptive exploration mechanism of SBOA, especially its Levy flight-based search strategy, clearly improves optimization performance for both low- and high-dimensional trusses. The research promotes bio-inspired optimization techniques by demonstrating the viability of SBOA as a reliable model for large-scale structural design that provides significant enhancements in performance and convergence behavior.
Funding: Supported by the Science and Technology Fund of TNU-Thai Nguyen University of Science.
Abstract: We study the split common solution problem with multiple output sets for monotone operator equations in Hilbert spaces. To solve this problem, we propose two new parallel algorithms. We establish a weak convergence theorem for the first and a strong convergence theorem for the second.
Funding: Supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project, number PNURSP2025R757, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: The integration of High-Altitude Platform Stations (HAPS) with Reconfigurable Intelligent Surfaces (RIS) represents a critical advancement for next-generation wireless networks, offering unprecedented opportunities for ubiquitous connectivity. However, existing research reveals significant gaps in dynamic resource allocation, joint optimization, and equitable service provisioning under varying channel conditions, limiting practical deployment of these technologies. This paper addresses these challenges by proposing a novel Fairness-Aware Deep Q-Learning (FAIR-DQL) framework for joint resource management and phase configuration in HAPS-RIS systems. Our methodology employs a comprehensive three-tier algorithmic architecture integrating adaptive power control, priority-based user scheduling, and dynamic learning mechanisms. The FAIR-DQL approach utilizes advanced reinforcement learning with experience replay and fairness-aware reward functions to balance competing objectives while adapting to dynamic environments. Key findings demonstrate substantial improvements: 9.15 dB SINR gain, 12.5 bps/Hz capacity, 78% power efficiency, and a 0.82 fairness index. The framework achieves rapid 40-episode convergence with consistent delay performance. These contributions establish new benchmarks for fairness-aware resource allocation in aerial communications, enabling practical HAPS-RIS deployments in rural connectivity, emergency communications, and urban networks.
Funding: Funding from the European Commission through the Ruralities project (grant agreement no. 101060876).
Abstract: In this paper, we propose a new privacy-aware transmission scheduling algorithm for 6G ad hoc networks. This system enables end nodes to select the optimum time and scheme to transmit private data safely. In 6G dynamic heterogeneous infrastructures, unstable links and non-uniform hardware capabilities create critical issues regarding security and privacy. Traditional protocols are often too computationally heavy to allow 6G services to achieve their expected Quality-of-Service (QoS). As the transport network is built of ad hoc nodes, there is no guarantee about their trustworthiness or behavior, and transversal functionalities are delegated to the end nodes. However, while security can be guaranteed in end-to-end solutions, privacy cannot, as all intermediate nodes still have to handle the data packets they are transporting. In addition, traditional schemes for private anonymous ad hoc communications are vulnerable to modern intelligent attacks based on learning models. The proposed scheme fills this gap. Findings show that, when the proposed technology is used, the probability of a successful intelligent attack is reduced by up to 65% compared to ad hoc networks with no privacy protection strategy, while congestion probability can remain below 0.001%, as required in 6G services.
Abstract: Test case prioritization and ranking play a crucial role in software testing by improving fault detection efficiency and ensuring software reliability. While prioritization selects the most relevant test cases for optimal coverage, ranking further refines their execution order to detect critical faults earlier. This study investigates machine learning techniques to enhance both prioritization and ranking, contributing to more effective and efficient testing processes. We first employ advanced feature engineering alongside ensemble models, including Gradient Boosting, Support Vector Machines, Random Forests, and Naive Bayes classifiers, to optimize test case prioritization, achieving an accuracy score of 0.98847 and significantly improving the Average Percentage of Fault Detection (APFD). Subsequently, we introduce a deep Q-learning framework combined with a Genetic Algorithm (GA) to refine test case ranking within priority levels. This approach achieves a rank accuracy of 0.9172, demonstrating robust performance despite the increasing computational demands of specialized variation operators. Our findings highlight the effectiveness of stacked ensemble learning and reinforcement learning in optimizing test case prioritization and ranking. This integrated approach improves testing efficiency, reduces late-stage defects, and improves overall software stability. The study provides valuable information for AI-driven testing frameworks, paving the way for more intelligent and adaptive software quality assurance methodologies.
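For readers unfamiliar with the APFD metric cited in the abstract above, the sketch below computes it with the standard definition APFD = 1 − (TF_1 + … + TF_m)/(n·m) + 1/(2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test that reveals fault i. The test names and the fault matrix are hypothetical, and the function assumes every fault is revealed by at least one test in the ordering.

```python
def apfd(order, faults_detected_by):
    """Average Percentage of Fault Detection for one test execution order.

    order: list of test names in execution order.
    faults_detected_by: dict mapping a test name to the set of faults it reveals.
    """
    n = len(order)
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in faults_detected_by.get(test, ()):
            # Keep only the earliest position at which each fault is revealed.
            first_position.setdefault(fault, position)
    m = len(first_position)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one detected fault")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix: which faults each test case reveals.
faults_detected_by = {
    "t1": {"f1"},
    "t2": set(),
    "t3": {"f2", "f3"},
    "t4": {"f1"},
}

good_order = ["t3", "t1", "t2", "t4"]  # fault-revealing tests first
poor_order = ["t2", "t4", "t1", "t3"]  # the most revealing test runs last

good_score = apfd(good_order, faults_detected_by)
poor_score = apfd(poor_order, faults_detected_by)
```

A higher value means faults are revealed earlier in the run, which is why a prioritization or ranking method is judged by how much it raises APFD.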
Funding: Support from the National Key R&D Program of China (Grant No. 2020YFB1711100).
Abstract: To address the issue of abnormal energy consumption fluctuations in the converter steelmaking process, an integrated diagnostic method combining the gray wolf optimization (GWO) algorithm, support vector machine (SVM), and K-means clustering was proposed. Eight input parameters, derived from molten iron conditions and external factors, were selected as feature variables. A GWO-SVM model was developed to accurately predict the energy consumption of individual heats. Based on the prediction results, the mean absolute percentage error and maximum relative error of the test set were employed as criteria to identify heats with abnormal energy usage. For these heats, the K-means clustering algorithm was used to determine benchmark values of influencing factors from similar steel grades, enabling root-cause diagnosis of excessive energy consumption. The proposed method was applied to real production data from a converter in a steel plant. The analysis reveals that heat sample No. 44 exhibits abnormal energy consumption, due to gas recovery being 1430.28 kg of standard coal below the benchmark level. A secondary contributing factor is a steam recovery shortfall of 237.99 kg of standard coal. This integrated approach offers a scientifically grounded tool for energy management in converter operations and provides valuable guidance for optimizing process parameters and enhancing energy efficiency.
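The abstract above does not spell out how the test-set errors are turned into an abnormality criterion. One possible reading, sketched below purely as an assumption, is to take the maximum relative error observed on the test set as a threshold and flag any heat whose prediction error exceeds it. All numbers and function names are hypothetical, and the predictions stand in for the output of a GWO-SVM model that is not reproduced here.

```python
def relative_errors(actual, predicted):
    """Absolute relative error of each prediction, as a fraction of the actual value."""
    return [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = relative_errors(actual, predicted)
    return 100.0 * sum(errors) / len(errors)


def flag_abnormal(actual, predicted, threshold):
    """Indices of heats whose relative error exceeds the threshold (a fraction)."""
    errors = relative_errors(actual, predicted)
    return [i for i, err in enumerate(errors) if err > threshold]


# Hypothetical test-set energy consumption (actual vs. model prediction).
test_actual = [100.0, 110.0, 95.0, 105.0]
test_predicted = [98.0, 112.0, 96.0, 103.0]
test_mape = mape(test_actual, test_predicted)

# Assumed decision rule: the worst test-set error bounds "normal" behaviour.
threshold = max(relative_errors(test_actual, test_predicted))

# Hypothetical new heats to screen.
new_actual = [102.0, 130.0, 99.0]
new_predicted = [101.0, 108.0, 100.0]
abnormal_heats = flag_abnormal(new_actual, new_predicted, threshold)
```

Heats flagged this way would then go to the clustering step for root-cause diagnosis against benchmark values.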
Funding: Financial support from the National Natural Science Foundation of China (Grant Nos. 52374123 and 51974144), the Project of Liaoning Provincial Department of Education (Grant No. LJKZ0340), and the Liaoning Revitalization Talents Program (Grant No. XLYC2211085) is gratefully acknowledged.
Abstract: Q-learning is a classical reinforcement learning method with broad applicability. It can respond effectively to environmental changes and provide flexible strategies, making it suitable for solving robot path-planning problems. However, Q-learning faces challenges in search and update efficiency. To address these issues, we propose an improved Q-learning (IQL) algorithm. We use an enhanced Ant Colony Optimization (ACO) algorithm to optimize Q-table initialization. We also introduce the UCH mechanism to refine the reward function and overcome the exploration dilemma. The IQL algorithm is extensively tested in three grid environments of different scales. The results validate the accuracy of the method and demonstrate superior path-planning performance compared to traditional approaches. The algorithm reduces the number of trials required for convergence, improves learning efficiency, and enables faster adaptation to environmental changes. It also enhances stability and accuracy by reducing the standard deviation of trials to zero. On grid maps of different sizes, IQL achieves higher expected returns. Compared with the original Q-learning algorithm, IQL improves performance by 12.95%, 18.28%, and 7.98% on 10×10, 20×20, and 30×30 maps, respectively. The proposed algorithm has promising applications in robotics, path planning, intelligent transportation, aerospace, and game development.
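As background for the abstract above, here is a minimal sketch of the baseline tabular Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ·max Q(s′,·) − Q(s,a)], on a one-dimensional corridor. The environment, rewards, and hyperparameters are assumptions made for illustration. The `initial_q` argument only marks where a heuristic starting table (produced by ACO in the paper) could be plugged in; neither the ACO initialization nor the UCH mechanism is reproduced.

```python
import random


def train_q_table(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2,
                  initial_q=None, seed=0):
    """Tabular Q-learning on a corridor: start at state 0, goal at the last state.

    Actions: 0 = move left (clamped at state 0), 1 = move right.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # Copy a caller-supplied starting table, or fall back to all zeros.
    q = [row[:] for row in initial_q] if initial_q else [[0.0, 0.0] for _ in range(n_states)]
    for _episode in range(episodes):
        state = 0
        for _step in range(100):  # step cap so an episode always terminates
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = max(range(2), key=lambda a: q[state][a])  # exploit
            next_state = max(0, state - 1) if action == 0 else state + 1
            reached_goal = next_state == goal
            reward = 1.0 if reached_goal else -0.01
            # No bootstrap term at the terminal state.
            target = reward if reached_goal else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if reached_goal:
                break
    return q


q_table = train_q_table()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
```

With an all-zero table the agent has to discover the goal by trial and error, which is the inefficiency that a better-informed initial table is meant to reduce.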
Funding: National Key Research and Development Program of China, No. 2023YFC3006704; National Natural Science Foundation of China, No. 42171047; CAS-CSIRO Partnership Joint Project of 2024, No. 177GJHZ2023097MI.
Abstract: Accurate prediction of flood events is important for flood control and risk management. Machine learning techniques have contributed greatly to advances in flood prediction, and existing studies have mainly focused on predicting flood resource variables using single or hybrid machine learning techniques. However, class-based flood prediction, which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies, has rarely been investigated. This study proposed an approach for predicting flood regime metrics and event classes that couples machine learning algorithms with clustering-deduced membership degrees. Five algorithms were adopted for this exploration. Results showed that the class membership degrees accurately determined event classes, with class hit rates up to 100% compared with the four classes clustered from nine regime metrics. The nonlinear algorithms (Multiple Linear Regression, Random Forest, and least squares-Support Vector Machine) outperformed the linear techniques (Multiple Linear Regression and Stepwise Regression) in predicting flood regime metrics. The proposed approach predicted flood event classes well, with average class hit rates of 66.0%-85.4% and 47.2%-76.0% in the calibration and validation periods, respectively, particularly for the slow and late flood events. The predictive capability of the proposed approach for flood regime metrics and classes was considerably stronger than that of the hydrological modeling approach.
Funding: Projects (52374138, 51764013) supported by the National Natural Science Foundation of China; Project (20204BCJ22005) supported by the Training Plan for Academic and Technical Leaders of Major Disciplines of Jiangxi Province, China; Project (2019M652277) supported by the China Postdoctoral Science Foundation; Project (20192ACBL21014) supported by the Natural Science Youth Foundation Key Projects of Jiangxi Province, China.
Abstract: Cemented tailings backfill (CTB) with initial defects is more prone to destabilization damage under the influence of various unfavorable factors during mining. To investigate the influence of such defects on the stability of underground mining engineering, this paper simulates different degrees of initial defects inside the CTB by adding different contents of air-entraining agent (AEA), investigates the acoustic emission (AE) RA/AF eigenvalues of CTB with different AEA contents under uniaxial compression, and adopts several denoising algorithms (e.g., moving average smoothing, median filtering, and outlier detection) to improve the accuracy of the data. The variance and autocorrelation coefficients of the RA/AF parameters were analyzed in conjunction with critical slowing down (CSD) theory. The results show that the acoustic emission RA/AF values can be used to characterize the progressive damage evolution of CTB. The denoising algorithms reduced the effects of extraneous noise and anomalous spikes in the AE signals. Changes in the variance curves provide clear precursor information, while abrupt changes in the autocorrelation coefficient can be used as an auxiliary warning signal. The dramatic increase in the variance and autocorrelation coefficient curves during the compression-tightening stage, which is influenced by the initial defects, can lead to false warnings. As the initial defects of the CTB increase, its instability precursor time and instability time are prolonged, the peak stress decreases, and the interval between the instability precursor and the instability damage shortens. The results provide a new method for real-time monitoring and early warning of CTB instability damage.
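The two critical-slowing-down indicators named in the abstract, variance and autocorrelation, are commonly computed over a sliding window. The sketch below is a generic version under stated assumptions: the window length, the use of population variance, and the lag-1 autocorrelation are illustrative choices, since the abstract does not specify the parameters used in the study.

```python
def rolling_variance(series, window):
    """Population variance of each consecutive window of the series."""
    out = []
    for i in range(window, len(series) + 1):
        chunk = series[i - window:i]
        mean = sum(chunk) / window
        out.append(sum((x - mean) ** 2 for x in chunk) / window)
    return out


def rolling_lag1_autocorr(series, window):
    """Lag-1 autocorrelation coefficient of each consecutive window.

    Returns 0.0 for a window with zero variance, where the coefficient
    is undefined.
    """
    out = []
    for i in range(window, len(series) + 1):
        chunk = series[i - window:i]
        mean = sum(chunk) / window
        denom = sum((x - mean) ** 2 for x in chunk)
        num = sum((chunk[k] - mean) * (chunk[k + 1] - mean)
                  for k in range(window - 1))
        out.append(num / denom if denom > 0 else 0.0)
    return out


# Made-up RA/AF-like values, not data from the study.
signal = [1.0, 1.1, 0.9, 1.0, 1.6, 0.4, 2.0, 0.1]
variance_curve = rolling_variance(signal, window=4)
autocorr_curve = rolling_lag1_autocorr(signal, window=4)
```

A sustained rise in either curve ahead of failure is the kind of precursor the abstract describes; the early compaction-stage rise it warns about would appear in the same curves, so in practice a precursor rule would need to exclude that stage.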
Abstract: With the development of economic globalization, distributed manufacturing is becoming more and more prevalent. Recently, integrated scheduling of distributed production and assembly has attracted much attention. This research studies a distributed flexible job shop scheduling problem with assembly operations. Firstly, a mixed integer programming model is formulated to minimize the maximum completion time. Secondly, a Q-learning-assisted coevolutionary algorithm is presented to solve the model: (1) Multiple populations are developed to seek required decisions simultaneously; (2) An encoding and decoding method based on problem features is applied to represent individuals; (3) A hybrid approach of heuristic rules and random methods is employed to acquire a high-quality population; (4) Three evolutionary strategies having crossover and mutation methods are adopted to enhance exploration capabilities; (5) Three neighborhood structures based on problem features are constructed, and a Q-learning-based iterative local search method is devised to improve exploitation abilities. The Q-learning approach is applied to intelligently select better neighborhood structures. Finally, a group of instances is constructed to perform comparison experiments. The effectiveness of the Q-learning approach is verified by comparing the developed algorithm with its variant without the Q-learning method. Three renowned meta-heuristic algorithms are used in comparison with the developed algorithm. The comparison results demonstrate that the designed method exhibits better performance in coping with the formulated problem.
Abstract: Optimizing convolutional neural networks (CNNs) for IoT attack detection remains a critical yet challenging task due to the need to balance multiple performance metrics beyond mere accuracy. This study proposes a unified and flexible optimization framework that leverages metaheuristic algorithms to automatically optimize CNN configurations for IoT attack detection. Unlike conventional single-objective approaches, the proposed method formulates a global multi-objective fitness function that integrates accuracy, precision, recall, and model size (speed/model complexity penalty) with adjustable weights. This design enables both single-objective and weighted-sum multi-objective optimization, allowing adaptive selection of optimal CNN configurations for diverse deployment requirements. Two representative metaheuristic algorithms, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), are employed to optimize CNN hyperparameters and structure. At each generation/iteration, the best configuration is selected as the most balanced solution across optimization objectives, i.e., the one achieving the maximum value of the global objective function. Experimental validation on two benchmark datasets, Edge-IIoT and CIC-IoT2023, demonstrates that the proposed GA- and PSO-based models significantly enhance detection accuracy (94.8%–98.3%) and generalization compared with manually tuned CNN configurations, while maintaining compact architectures. The results confirm that the multi-objective framework effectively balances predictive performance and computational efficiency. This work establishes a generalizable and adaptive optimization strategy for deep learning-based IoT attack detection and provides a foundation for future hybrid metaheuristic extensions in broader IoT security applications.
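A weighted-sum fitness of the kind described can be sketched as follows. The abstract names the four terms and says the weights are adjustable, but it gives neither the exact formula nor the normalization, so the sign convention, the size penalty scaled to [0, 1], and all names below are assumptions made for illustration.

```python
def global_fitness(accuracy, precision, recall, size_penalty,
                   weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of three quality metrics minus a model-size penalty.

    Assumptions: all inputs lie in [0, 1]; size_penalty could be, for
    example, the parameter count divided by a chosen maximum.
    """
    w_acc, w_prec, w_rec, w_size = weights
    return (w_acc * accuracy + w_prec * precision
            + w_rec * recall - w_size * size_penalty)


def select_best(candidates, weights):
    """Return the candidate configuration with the highest global fitness."""
    return max(
        candidates,
        key=lambda c: global_fitness(c["accuracy"], c["precision"],
                                     c["recall"], c["size_penalty"], weights),
    )


# Made-up candidate CNN configurations, not results from the study.
candidates = [
    {"name": "large", "accuracy": 0.98, "precision": 0.97,
     "recall": 0.97, "size_penalty": 0.90},
    {"name": "compact", "accuracy": 0.96, "precision": 0.95,
     "recall": 0.96, "size_penalty": 0.20},
]
accuracy_only = select_best(candidates, weights=(1.0, 0.0, 0.0, 0.0))
balanced = select_best(candidates, weights=(0.25, 0.25, 0.25, 0.25))
```

Setting all weight on accuracy recovers single-objective optimization and picks the larger model here, while equal weights favor the compact one, which is the trade-off the framework is meant to expose. In a GA or PSO loop this function would score each individual or particle.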