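Abstract: Semi-supervised learning methods use a small amount of labeled data together with a large amount of unlabeled data to improve learning performance. Tri-training is a classic disagreement-based semi-supervised learning method, but it may introduce label noise during learning. To reduce the prediction bias that label noise in Tri-training causes on unlabeled data and to learn a better semi-supervised classification model, cross-entropy is used in place of the error rate to better reflect the gap between the model's predictions and the true distribution, and it is combined with convex optimization to reduce label noise and safeguard model performance. On this basis, three algorithms are proposed: a cross-entropy-based Tri-training algorithm, a safe Tri-training algorithm, and a cross-entropy-based safe Tri-training algorithm. The proposed methods are validated on benchmark datasets including the UCI (University of California Irvine) Machine Learning Repository, and significance tests further verify their performance from a statistical perspective. Experimental results show that the proposed semi-supervised learning methods outperform the traditional Tri-training algorithm in classification performance, with the cross-entropy-based safe Tri-training algorithm achieving higher classification performance and generalization ability.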
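The motivation for replacing the error rate with cross-entropy can be illustrated with a minimal sketch. The two sets of predicted probabilities below are hypothetical toy values, not data from the paper, and the functions are generic definitions rather than the authors' implementation. Both classifiers misclassify the same single sample, so their error rates are identical, yet cross-entropy separates them because it also accounts for how much probability is assigned to the true class.

```python
import math

def error_rate(probs, labels):
    """Fraction of samples whose arg-max class differs from the label."""
    wrong = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted != y:
            wrong += 1
    return wrong / len(labels)

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for p, y in zip(probs, labels):
        total -= math.log(max(p[y], eps))  # eps guards against log(0)
    return total / len(labels)

# Hypothetical toy predictions for four samples with two classes.
labels = [0, 1, 1, 0]
classifier_a = [[0.9, 0.1], [0.1, 0.9], [0.2, 0.8], [0.4, 0.6]]
classifier_b = [[0.55, 0.45], [0.45, 0.55], [0.45, 0.55], [0.05, 0.95]]

for name, probs in (("A", classifier_a), ("B", classifier_b)):
    print(name, error_rate(probs, labels), round(cross_entropy(probs, labels), 4))
```

Classifier B is barely confident on the samples it gets right and very confident on the one it gets wrong, which the error rate cannot see but cross-entropy penalizes.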
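Abstract: Tri-Training is a semi-supervised algorithm whose initial classifiers depend heavily on the labeled samples: when labeled samples are scarce, the classifiers are relatively weak, which directly affects subsequent iterations. To address this, the IFS-Tri-Training (Tri-Training based on intuitionistic fuzzy sets) algorithm is proposed. It introduces the SOM algorithm to construct intuitionistic fuzzy sets, so that the classifiers judge unlabeled samples comprehensively under multiple factors, which raises the utilization of unlabeled samples and thereby expands the labeled sample set during iteration. Experiments on several UCI datasets show that classifier performance improves and that the process of learning from unlabeled samples is the key factor affecting the classifiers.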
Funding: Supported by the National Natural Science Foundation of China (No. 39980005)
Abstract: This paper reports the local minimum problem that arises when the current greedy algorithm is used to train the empirical potential function of protein folding on 8623 non-native structures of 31 globular proteins, and a solution to the problem based on the simulated annealing algorithm. The simulated annealing algorithm is indispensable for developing and testing highly refined empirical potential functions.
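The contrast between greedy descent and simulated annealing can be sketched on a toy problem. The one-dimensional energy landscape below is hypothetical and has nothing to do with the protein potential function itself; the cooling schedule and parameter values are likewise assumptions chosen for illustration. Greedy descent stops at the first local minimum, while annealing, started from that point, occasionally accepts uphill moves and can therefore leave it.

```python
import math
import random

# Hypothetical toy landscape: local minimum at index 1, global minimum at index 6.
ENERGY = [5.0, 3.0, 4.0, 6.0, 2.0, 1.0, 0.0, 3.0]

def neighbors(i):
    """Indices adjacent to i that lie inside the landscape."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(ENERGY)]

def greedy_descent(start):
    """Move to the lowest neighbor until no neighbor improves the energy."""
    current = start
    while True:
        best = min(neighbors(current), key=ENERGY.__getitem__)
        if ENERGY[best] >= ENERGY[current]:
            return current
        current = best

def simulated_annealing(start, t0=5.0, cooling=0.995, steps=2000, seed=0):
    """Metropolis acceptance with geometric cooling; returns the best index visited."""
    rng = random.Random(seed)
    current = best = start
    temperature = t0
    for _ in range(steps):
        candidate = rng.choice(neighbors(current))
        delta = ENERGY[candidate] - ENERGY[current]
        # Always accept improvements; accept uphill moves with probability exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if ENERGY[current] < ENERGY[best]:
                best = current
        temperature *= cooling
    return best

stuck = greedy_descent(0)
refined = simulated_annealing(stuck)
print("greedy:", stuck, ENERGY[stuck], "annealed:", refined, ENERGY[refined])
```

Because the annealer tracks the best state it has visited and starts from the greedy result, its answer is never worse than the greedy one.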
Abstract: The adoption of 5G for Railways (5G-R) is expanding, particularly in high-speed trains, due to the benefits offered by 5G technology. High-speed trains must provide seamless connectivity and Quality of Service (QoS) to ensure passengers have a satisfactory experience throughout their journey. Installing base stations in urban environments can improve coverage but can dramatically degrade the user experience because of interference. In particular, when a user with a mobile phone is a passenger on a high-speed train travelling between urban centres, the coverage and the 5G resources in general need to be adequate so as not to diminish the user's experience of the service. The use of macro, pico, and femto cells may optimize the utilization of 5G resources. In this paper, a Genetic Algorithm (GA)-based approach to address the challenges of 5G network planning for 5G-R services is presented. The network is divided into three cell types (macro, pico, and femto cells), and the optimization process is designed to balance three key objectives: providing comprehensive coverage, minimizing interference, and maximizing energy efficiency. The study focuses on environments with high user density, such as high-speed trains, where reliable and high-quality connectivity is critical. Simulations demonstrate the effectiveness of the GA-driven framework in optimizing coverage and performance in such scenarios. The algorithm is compared with the Particle Swarm Optimisation (PSO) and Simulated Annealing (SA) methods. The GA offers a strong balance between coverage and efficiency, achieving significantly higher coverage than PSO while maintaining competitive energy efficiency and interference levels. Its steady fitness improvement and adaptability make it well suited to scenarios where wide coverage is a priority alongside acceptable performance trade-offs.
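The general shape of a GA that trades coverage against interference and energy use can be sketched as follows. Everything in this example is an assumption made for illustration: the 12 candidate sites along a track, the rule that a site covers itself and its two adjacent segments, the scalarized fitness with its weights, and the tournament selection and one-point crossover operators. The abstract does not specify the paper's encoding, fitness function, or operators.

```python
import random

N_SITES = 12  # hypothetical number of candidate cell sites along the track

def fitness(genome, energy_weight=0.3, overlap_weight=0.2):
    """Covered segments minus penalties for active sites (energy) and overlap (interference)."""
    covered = set()
    overlap = 0
    for i, active in enumerate(genome):
        if not active:
            continue
        for seg in (i - 1, i, i + 1):
            if 0 <= seg < N_SITES:
                if seg in covered:
                    overlap += 1  # segment already served by another active site
                covered.add(seg)
    return len(covered) - energy_weight * sum(genome) - overlap_weight * overlap

def evolve(pop_size=20, generations=40, mutation_rate=0.05, seed=0):
    """Elitist GA; returns the best genome and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_SITES)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best genome over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents, each the best of three random genomes.
            a, b = (max(rng.sample(population, 3), key=fitness) for _ in range(2))
            cut = rng.randrange(1, N_SITES)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = evolve()
print(best, history[0], history[-1])
```

Elitism is what makes the best fitness non-decreasing across generations in this sketch; a multi-objective formulation would replace the single weighted score used here.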
Abstract: The combination of case-based reasoning (CBR) and a genetic algorithm (GA) is applied to the problem of failure mode identification in aeronautical component failure analysis. Several implementation issues, such as the selection of matching attributes, the calculation of similarity measures, weight learning, and training evaluation policies, are carefully studied. Test applications show that an accuracy of 74.67% can be achieved with 75 evenly distributed failure cases covering 3 failure modes, and that the learned weight vector transfers well to the other 2 failure modes, achieving a recognition accuracy of 73.3%. The approach is also shown to generalize well to the recognition of even more mixed failure modes.
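Why the attribute weights matter in case retrieval can be shown with a minimal sketch of weighted similarity matching. The attribute names, the three stored cases, and both weight vectors are hypothetical, and the similarity measure (weighted fraction of matching attributes) is one common choice rather than the measure used in the paper. In the paper the weights are learned by a GA; here they are simply given.

```python
def weighted_similarity(query, case_attributes, weights):
    """Weighted fraction of attributes on which the query and the stored case agree."""
    total = sum(weights.values())
    matched = sum(
        w for attr, w in weights.items()
        if query.get(attr) == case_attributes.get(attr)
    )
    return matched / total

def retrieve(query, case_base, weights):
    """Return the failure mode of the most similar stored case."""
    best = max(
        case_base,
        key=lambda case: weighted_similarity(query, case["attributes"], weights),
    )
    return best["mode"]

# Hypothetical case base with one case per failure mode.
case_base = [
    {"attributes": {"surface": "beach marks", "load": "cyclic", "environment": "dry"},
     "mode": "fatigue"},
    {"attributes": {"surface": "dimples", "load": "static", "environment": "dry"},
     "mode": "overload"},
    {"attributes": {"surface": "pitting", "load": "static", "environment": "humid"},
     "mode": "corrosion"},
]

query = {"surface": "beach marks", "load": "static", "environment": "humid"}
learned = {"surface": 0.6, "load": 0.3, "environment": 0.1}   # assumed learned weights
uniform = {"surface": 1.0, "load": 1.0, "environment": 1.0}   # unweighted baseline

print(retrieve(query, case_base, learned), retrieve(query, case_base, uniform))
```

The same query is matched to different failure modes under the two weight vectors, which is why learning the weights is treated as a central implementation issue.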
Abstract: Overfitting is one of the main problems restricting the application of neural networks. The traditional OBD (Optimal Brain Damage) algorithm can avoid overfitting effectively, but it must train the network repeatedly and is therefore computationally inefficient. In this paper, the Marquardt algorithm is incorporated into the OBD algorithm, and a new network pruning method, Dynamic Optimal Brain Damage (DOBD), is introduced. The algorithm simplifies a network and obtains good generalization by dynamically deleting weight parameters with low sensitivity, where sensitivity is defined as the change in the error function value with respect to a change in the weights. A simplified method is also presented by which sensitivities can be calculated during training with little computation. A rule for determining the lower limit of sensitivity below which unnecessary weights are deleted, together with other control methods for pruning and training, is introduced. The training process is analyzed theoretically, and the reasons why DOBD trains much faster than OBD while still avoiding overfitting are given.
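Sensitivity-based pruning can be illustrated with the saliency used by classic OBD, which estimates the increase in error from deleting weight w_k as 0.5 * h_kk * w_k^2, where h_kk is the corresponding diagonal entry of the Hessian of the error. This sketch shows that classic criterion only; the simplified sensitivity calculation and the threshold rule of DOBD are not given in the abstract, and the weights, Hessian diagonal, and threshold below are hypothetical.

```python
def obd_saliencies(weights, hessian_diag):
    """Classic OBD saliency 0.5 * h_kk * w_k**2 for each weight."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]

def prune(weights, hessian_diag, threshold):
    """Zero out every weight whose saliency falls below the threshold."""
    saliencies = obd_saliencies(weights, hessian_diag)
    return [0.0 if s < threshold else w for w, s in zip(weights, saliencies)]

# Hypothetical weights and diagonal second derivatives of the error.
weights = [2.0, -0.1, 0.5, 1.5]
hessian_diag = [1.0, 4.0, 0.2, 0.01]

print(obd_saliencies(weights, hessian_diag))
print(prune(weights, hessian_diag, threshold=0.015))
```

In this toy example the weight 1.5 is removed while the much smaller weight -0.1 is kept, because the error surface is nearly flat along the former and sharply curved along the latter; magnitude alone is not the criterion.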