Funding: Supported by the Natural Science Basic Research Program of Shaanxi (Program No. 2024JC-YBMS-026).
Abstract: When dealing with imbalanced datasets, the traditional support vector machine (SVM) tends to produce a classification hyperplane that is biased towards the majority class and therefore exhibits poor robustness. This paper proposes a high-performance classification algorithm designed specifically for imbalanced datasets. The proposed method first uses a biased second-order cone programming support vector machine (B-SOCP-SVM) to identify the support vectors (SVs) and non-support vectors (NSVs) in the imbalanced data. It then applies the synthetic minority over-sampling technique to the support vectors of the minority class (SV-SMOTE) and applies random under-sampling multiple times to the non-support vectors of the majority class (NSV-RUS). Combining the resulting minority-class dataset with each of the majority-class datasets yields multiple new balanced datasets. Finally, SOCP-SVM is used to classify each dataset, and the final result is obtained through an ensemble algorithm that integrates the individual classifiers. Experimental results demonstrate that the proposed method performs well on imbalanced datasets.
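The resampling stage described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the support vectors and non-support vectors have already been identified (the B-SOCP-SVM step is not shown), and the function names, the neighbour count `k`, and the `target_size` parameter are hypothetical.

```python
import random

def smote_like(points, n_new, k=2, rng=None):
    """Create n_new synthetic points by interpolating between a randomly
    chosen point and one of its k nearest neighbours (SMOTE-style)."""
    if len(points) < 2:
        raise ValueError("need at least two points to interpolate")
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # Rank the other points by squared Euclidean distance to the base point.
        others = [p for p in points if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def build_balanced_sets(minority_svs, majority_nsvs, target_size, n_sets, seed=0):
    """Oversample the minority SVs once, then pair them with n_sets
    independent random undersamples of the majority NSVs."""
    rng = random.Random(seed)
    n_new = target_size - len(minority_svs)
    minority = list(minority_svs) + smote_like(minority_svs, n_new, rng=rng)
    # Each undersample is drawn without replacement from the majority NSVs.
    return [(minority, rng.sample(majority_nsvs, target_size)) for _ in range(n_sets)]

# Hypothetical toy data: 3 minority SVs and 20 majority NSVs in 2-D.
minority_svs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority_nsvs = [(float(i), 5.0) for i in range(20)]
balanced = build_balanced_sets(minority_svs, majority_nsvs, target_size=6, n_sets=3)
```

Each pair in `balanced` would then be passed to a separate classifier, whose outputs are combined in the final ensemble step.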
Funding: The Aeronautical Science Foundation of China (Nos. 20170968002, 20230003068002); the National Major Science and Technology Projects of China (Nos. J2019-II-0022-0043, J2019-VII-0013-0153).
Abstract: Assessing imprecise time-variant reliability in engineering is a critical task, because it must account simultaneously for the variability of structural properties and loads over time and for the uncertainty arising from ambiguous parameters. To estimate the Imprecise Time-variant Failure Probability Function (ITFPF) and derive the imprecise reliability results as a byproduct, Adaptive Combination Augmented Line Sampling (ACALS) is proposed. It consists of three integrated features: Augmented Line Sampling (ALS), an adaptive strategy, and an optimal combination. ALS is adopted as an efficient analysis tool to obtain the failure probability function with respect to the imprecise parameters. The adaptive strategy then applies ALS iteratively while considering the imprecise parameters and time simultaneously. Finally, the optimal combination algorithm collects all result components in a way that minimizes the Coefficient of Variation (C.o.V.) of the ITFPF estimate. Overall, the proposed ACALS method outperforms the original ALS method by efficiently estimating the ITFPF while guaranteeing a minimal C.o.V. The proposed approach can thus serve as an effective tool for imprecise time-variant reliability analysis in real engineering applications. Several examples are presented to demonstrate the superiority of the proposed approach in estimating the ITFPF.
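The abstract does not give the combination formula. One standard way to combine several estimates so that the result has minimal variance is inverse-variance weighting, sketched below. It assumes the component estimates are independent and unbiased; whether ACALS uses exactly this rule is an assumption, and the function name and numbers are hypothetical.

```python
import math

def combine_min_variance(estimates, variances):
    """Combine independent, unbiased estimates with weights proportional to
    1/variance, which minimizes the variance of the weighted average."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    weights = [w / total for w in inverse]          # weights sum to 1
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_var = 1.0 / total                      # variance of the combination
    cov = math.sqrt(combined_var) / abs(combined)   # coefficient of variation
    return combined, combined_var, cov

# Hypothetical failure-probability estimates from two runs and their variances.
p_hat, var_hat, cov_hat = combine_min_variance([1.0e-3, 1.2e-3], [1.0e-8, 4.0e-8])
```

The combined variance is never larger than the smallest component variance, which is why this weighting is a natural candidate when the goal is a minimal C.o.V.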
Funding: Supported by the Natural Science Research Project of Anhui Educational Committee (2023AH030041), the National Natural Science Foundation of China (42277136), and the Anhui Province Young and Middle-aged Teacher Training Action Project (DTR2023018).
Abstract: The selection of negative samples significantly influences landslide susceptibility assessment, especially when establishing the relationship between landslides and environmental factors in regions with complex geological conditions. Traditional sampling strategies commonly used in landslide susceptibility models can misrepresent the distribution of negative samples, causing a deviation from actual geological conditions. This, in turn, degrades the discriminative ability and generalization performance of the models. To address this issue, we propose a novel approach for selecting negative samples to enhance the quality of machine learning models. We choose the Liangshan Yi Autonomous Prefecture, located in southwestern Sichuan, China, as the case study. This area, characterized by complex terrain, frequent tectonic activity, and steep slope erosion, experiences recurrent landslides, making it an ideal setting for validating the proposed method. We calculate the contribution values of environmental factors using the Relief algorithm to construct the feature space, apply the Target Space Exteriorization Sampling (TSES) method to select negative samples, calculate landslide probability values by Random Forest (RF) modeling, and then create regional landslide susceptibility maps. We evaluate the performance of the RF model optimized by the Environmental Factor Selection-based TSES (EFSTSES) method using standard performance metrics. The model achieved an accuracy (ACC) of 0.962, a precision (PRE) of 0.961, and an area under the curve (AUC) of 0.962. These findings demonstrate that the EFSTSES-based model effectively mitigates the negative sample imbalance issue, enhances the differentiation between landslide and non-landslide samples, and reduces misclassification, particularly in geologically complex areas. These improvements offer valuable insights for disaster prevention, land use planning, and risk mitigation strategies.
Funding: Supported by the Open Research Fund Program of LIESMARS (WKL(00)0302).
Abstract: Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and it has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and we develop two sampling-based DBSCAN (SDBSCAN) algorithms. One algorithm introduces the sampling technique inside DBSCAN, and the other applies the sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
Funding: Supported by the National Natural Science Foundation of China (61273070, 61203092), the Enterprise-college-institute Cooperative Project of Jiangsu Province (BY2015019-21), the 111 Project (B12018), and the Fundamental Research Funds for the Central Universities (JUSRP51733B).
Abstract: For a class of non-uniform output sampling hybrid systems with actuator faults and bounded disturbances, an iterative learning fault diagnosis algorithm is proposed. First, to measure the impact of a fault on the system between consecutive output sampling instants, the actual fault function is transformed into an equivalent fault model by using the integral mean value theorem; the non-uniform sampling hybrid system is then converted into a continuous system with time-varying delay based on the output delay method. Next, an observer-based fault diagnosis filter with a virtual fault is designed to estimate the equivalent fault, and an iterative learning regulation algorithm is chosen to update the virtual fault repeatedly so that it approximates the actual equivalent fault after some iterative learning trials. The algorithm can therefore detect and estimate system faults adaptively. Simulation results for an electro-mechanical control system model with different types of faults illustrate the feasibility and effectiveness of the algorithm.
Funding: Sponsored by the Fundamental Research Funds for the Central Universities (Grant No. CDJZR14130006).
Abstract: To prevent cracking of the workpiece during hot stamping, this paper proposes a hybrid optimization method based on Hammersley sequence sampling (HSS), finite element analysis, a backpropagation (BP) neural network, and a genetic algorithm (GA). The mechanical properties of high-strength boron steel are characterized on the basis of uniaxial tensile tests at elevated temperatures. The samples of process parameters are chosen via HSS, which encourages exploration throughout the design space and hence improves the chance of finding the global optimum in the solution space. Meanwhile, numerical simulation is carried out to predict the forming quality for the optimized design. A BP neural network model is developed to obtain the mathematical relationship between the optimization goal and the design variables, and the genetic algorithm is used to optimize the process parameters. Finally, the results of numerical simulation are compared with those of a production experiment to demonstrate that the proposed optimization strategy is feasible.
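For readers unfamiliar with Hammersley sequence sampling, the sketch below generates a Hammersley point set in the unit hypercube using the usual construction: the first coordinate is i/n, and the remaining coordinates are radical inverses in successive prime bases. The function names are hypothetical, and mapping the points to the actual process parameter ranges is not shown because the abstract does not list them.

```python
def radical_inverse(i, base):
    """Mirror the base-`base` digits of integer i about the radix point."""
    inverse, fraction = 0.0, 1.0 / base
    while i > 0:
        inverse += (i % base) * fraction
        i //= base
        fraction /= base
    return inverse

def hammersley(n, dim):
    """Return n points in [0, 1)^dim: first coordinate i/n, the rest are
    radical inverses in the first dim-1 prime bases."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    if dim < 1 or dim - 1 > len(primes):
        raise ValueError("dim must be between 1 and 11 in this sketch")
    return [
        [i / n] + [radical_inverse(i, primes[d]) for d in range(dim - 1)]
        for i in range(n)
    ]

points = hammersley(4, 2)
# -> [[0.0, 0.0], [0.25, 0.5], [0.5, 0.25], [0.75, 0.75]]
```

Because the points are deterministic and spread evenly, a small number of expensive simulations can cover the design space more uniformly than the same number of purely random draws.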
Abstract: This paper introduces the Particle Swarm Optimization (PSO) algorithm to enhance the Latin Hypercube Sampling (LHS) process. The key objective is to mitigate the long computation times and low computational accuracy typically encountered when applying Monte Carlo Simulation (MCS) to LHS for probabilistic power flow calculations. The PSO method optimizes the sample distribution, enhances global search capability, and significantly boosts computational efficiency. To validate its effectiveness, the proposed method was applied to the IEEE-34 and IEEE-118 node systems containing wind power. Its performance was then compared with Latin Hypercube Importance Sampling (LHIS), which integrates importance sampling with the Monte Carlo method. The comparison indicates that the PSO-enhanced method significantly improves the uniformity and representativeness of the sampling. This enhancement leads to a reduction in data errors and an improvement in both computational accuracy and convergence speed.
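As background for the LHS step, the following sketch draws a plain Latin hypercube sample in the unit hypercube: each dimension is split into n equal strata, one point is drawn per stratum, and the strata are paired at random across dimensions. The PSO refinement proposed in the paper is not shown, and the function name and seed are hypothetical.

```python
import random

def latin_hypercube(n, dim, seed=0):
    """Return n points in [0, 1)^dim with exactly one point in each of the
    n equal-width strata along every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(dim):
        # One uniform draw inside each stratum [k/n, (k+1)/n).
        column = [(k + rng.random()) / n for k in range(n)]
        rng.shuffle(column)  # random pairing of strata across dimensions
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n)]

samples = latin_hypercube(5, 3)
```

The random pairing in the shuffle step is what an optimizer such as PSO could plausibly improve, for example by searching for pairings that spread the points more evenly.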
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 10674016 and 10875013, and the Specialized Research Foundation for the Doctoral Program of Higher Education under Grant No. 20080027005.
Abstract: We introduce the potential-decomposition strategy (PDS), which can be used in Markov chain Monte Carlo sampling algorithms. PDS can be designed to make particles move in a modified potential that favors diffusion in phase space; then, by rejecting some trial samples, the target distribution can be sampled in an unbiased manner. Furthermore, if the accepted trial samples are insufficient, they can be recycled as initial states to form more unbiased samples. This strategy can greatly improve efficiency when the original potential has multiple metastable states separated by large barriers. We apply PDS to the 2D Ising model and to a double-well potential model with a large barrier, demonstrating in these two representative examples that convergence is accelerated by orders of magnitude.
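To make the setting concrete, the sketch below runs a plain Metropolis sampler on a one-dimensional double-well potential. It is the baseline that a strategy like PDS aims to accelerate, not PDS itself, whose details the abstract does not give. The potential form, barrier height, step size, and function names are all hypothetical choices.

```python
import math
import random

def double_well(x, barrier=5.0):
    """Double-well potential with minima at x = -1 and x = +1."""
    return barrier * (x * x - 1.0) ** 2

def metropolis(potential, x0, n_steps, step=0.5, beta=1.0, seed=0):
    """Sample exp(-beta * potential(x)) with symmetric uniform proposals."""
    rng = random.Random(seed)
    x, u = x0, potential(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        u_new = potential(x_new)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if u_new <= u or rng.random() < math.exp(-beta * (u_new - u)):
            x, u = x_new, u_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps

chain, acceptance_rate = metropolis(double_well, x0=-1.0, n_steps=2000)
```

With a high barrier the chain rarely crosses between the two wells, which is the slow-convergence problem the abstract refers to.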
Abstract: AIM: To detect blood withdrawal in patients with arterial blood pressure monitoring in order to increase patient safety and provide better sample dating. METHODS: Blood pressure information obtained from a patient monitor was fed as a real-time data stream to an experimental medical framework. This framework was connected to an analytical application that observes changes in systolic, diastolic, and mean pressure to determine anomalies in the continuous data stream. Detection was based on an increased mean blood pressure caused by the closing of the withdrawal three-way tap and an absence of systolic and diastolic measurements during this manipulation. To evaluate the proposed algorithm, measured data from animal studies in healthy pigs were used. RESULTS: Using this novel approach for processing real-time measurement data from arterial pressure monitoring, the exact time of blood withdrawal could be detected successfully both retrospectively and in real time. The algorithm detected 422 of 434 (97%) blood withdrawals for blood gas analysis in the retrospective analysis of 7 study trials. Additionally, 64 sampling events for other procedures, such as laboratory and activated clotting time analyses, were detected. The proposed algorithm achieved a sensitivity of 0.97, a precision of 0.96, and an F1 score of 0.97. CONCLUSION: Arterial blood pressure monitoring data can be used to accurately identify individual blood samplings in order to reduce sample mix-ups and thereby increase patient safety.
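The reported metrics can be cross-checked with a short calculation. The sketch below derives the sensitivity from the stated counts (422 of 434) and then the F1 score from that sensitivity and the reported precision. Note that 0.96 is the rounded precision from the abstract; the underlying false-positive count is not given, so the F1 value here is only an approximate check.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual events that were detected (recall)."""
    return true_positives / (true_positives + false_negatives)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

detected, total = 422, 434
sens = sensitivity(detected, total - detected)   # 422 / 434
f1 = f1_score(0.96, sens)                        # 0.96 is the rounded reported precision

print(round(sens, 2), round(f1, 2))  # prints: 0.97 0.97
```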
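Abstract: Smart substations undergoing retrofit have complex sampling modes, and relay protection devices struggle to detect slight anomalies in the sampling circuit, so hidden circuit defects are exposed with a serious delay. To address this problem, the sampling modes and secondary equipment configuration of smart substations during the retrofit period are analyzed, and a method for detecting anomalies in relay protection sampling circuits based on the comparison of same-source recorded waveform data is proposed. First, the bidirectional encoder representations from transformers (BERT) language model and the cosine similarity algorithm are used to match the channels of same-source recorded data. Then, resampling and the Manhattan distance are used to unify the sampling frequency of the waveforms and align them in the time domain. Finally, an improved algorithm is proposed based on the dynamic time warping (DTW) algorithm and is combined with the sampling point offset to set the anomaly criterion for the sampling circuit. Case analysis shows that the method can match same-source channels of recorded data and align the waveforms consistently, and that, compared with the traditional DTW algorithm, the improved DTW algorithm identifies abnormal states with higher sensitivity and accuracy. The anomaly criterion can effectively detect abnormal states of the relay protection sampling circuit, ensuring the safe and reliable operation of smart substations.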
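The three distance measures named in this abstract can be sketched as below. This shows the classical DTW recurrence, not the improved variant proposed in the paper, and the BERT embedding step is omitted: `cosine_similarity` simply takes two numeric vectors. Function names and sample waveforms are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def manhattan(a, b):
    """Sum of absolute point-by-point differences of two aligned waveforms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def dtw_distance(a, b):
    """Classical dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a, step in b, or step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

reference = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]   # same shape, shifted by one sample
shift_cost = dtw_distance(reference, delayed)
```

Here `shift_cost` is zero because DTW absorbs the one-sample delay, which illustrates why a pure DTW distance alone may miss timing anomalies and why the abstract combines it with the sampling point offset.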
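Abstract: To address the low computational efficiency of the classical Markov chain Monte Carlo (MCMC) algorithm when solving for river water pollution source information (discharge amount, discharge time, and discharge location), which is caused by the choice of initial point and a low acceptance rate, a two-dimensional pollutant diffusion model was built with the COMSOL simulation software, and different algorithms were compared to analyze the influence of these two factors on the pollution source tracing results. On this basis, a two-stage multi-chain Metropolis-Hastings algorithm based on equidistant random sampling (ERS-TSMH) is proposed. Simulation results show that the traditional MH algorithm and the TSMH algorithm tend to fall into local optima or fail to converge; the acceptance rate of the former is about 20%, whereas that of the latter reaches nearly 50%. The multi-chain ERS-MH algorithm improves inversion accuracy but converges only after about 10,000 iterations, which is inefficient. The multi-chain ERS-TSMH algorithm maintains tracing accuracy while converging after about 5,000 iterations, significantly improving efficiency and showing high stability and reliability.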
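The abstract does not define equidistant random sampling precisely. One plausible reading, assumed here, is that the prior range of a parameter is split into equal-width segments and one random starting point is drawn per segment, giving each chain a different region to start from. The function name, the parameter range, and the chain count below are hypothetical.

```python
import random

def equidistant_random_starts(low, high, n_chains, seed=0):
    """Split [low, high) into n_chains equal segments and draw one uniform
    starting point inside each segment (one per chain)."""
    rng = random.Random(seed)
    width = (high - low) / n_chains
    return [low + (k + rng.random()) * width for k in range(n_chains)]

# Hypothetical example: starting guesses for a discharge location in [0, 100).
starts = equidistant_random_starts(0.0, 100.0, n_chains=4)
```

Spreading the starting points this way reduces the chance that every chain begins near the same local optimum, which matches the initial-point problem the abstract describes.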