To solve multi-class fault diagnosis tasks, a decision tree support vector machine (DTSVM), which combines SVM and decision trees using the concept of dichotomy, is proposed. Since the classification performance of DTSVM depends heavily on its structure, a genetic algorithm is introduced into the formation of the decision tree to cluster the classes so that the distance between the clustering centers of the two sub-classes is maximized; the most separable classes are thus separated at each node of the decision tree. Numerical simulations on three datasets, compared against the "one-against-all" and "one-against-one" methods, demonstrate that the proposed method has better performance and higher generalization ability than the two conventional methods.
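The tree-building idea in the DTSVM abstract can be illustrated with a small sketch. It is not the authors' implementation: an exhaustive search over two-group partitions stands in for the genetic algorithm, class centers are assumed to be given as plain coordinate tuples, and the names `centroid`, `best_split`, and `build_tree` are hypothetical. In a full DTSVM, one binary SVM would then be trained at each internal node of the resulting tree.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def best_split(class_centers):
    """Split class labels into two groups whose group centers are farthest apart.

    class_centers maps a class label to its center (tuple of floats).
    Exhaustive search is an assumption made here for clarity; the abstract
    uses a genetic algorithm, which scales to more classes.
    """
    labels = sorted(class_centers)
    first, rest = labels[0], labels[1:]
    best = None
    # Keep `first` in the left group so each partition is visited once;
    # k stops short of len(rest) so the right group is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(label for label in labels if label not in left)
            gap = dist(
                centroid([class_centers[label] for label in left]),
                centroid([class_centers[label] for label in right]),
            )
            if best is None or gap > best[0]:
                best = (gap, left, right)
    return best[1], best[2]


def build_tree(class_centers):
    """Recursively split until every leaf holds a single class label."""
    labels = sorted(class_centers)
    if len(labels) == 1:
        return labels[0]
    left, right = best_split(class_centers)
    return (
        build_tree({label: class_centers[label] for label in left}),
        build_tree({label: class_centers[label] for label in right}),
    )


centers = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (10.0, 0.0), "D": (10.0, 1.0)}
tree = build_tree(centers)
```

With these toy centers, the root separates the two far-apart pairs first, so the hardest decisions (A versus B, C versus D) are pushed down to the leaves.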
Scientists have introduced new methods for capturing energy from ocean waves. Specifically, they have focused on a type of wave energy converter (WEC) that is nonbuoyant (i.e., a body that cannot float). Typically, the WEC is most effective when it is in resonance, which occurs when the natural frequency of the WEC aligns with that of the ocean waves. Therefore, accurately predicting the movement of the WEC is crucial for adjusting its system to resonate with the incoming waves for optimal performance. In this study, artificial intelligence techniques, namely random forest, extra trees (ET), and support vector machines, are developed to forecast the vertical movement of a nonbuoyant WEC. The developed models require two input variables: the water wave height and the wave period. Approximately 4500 data points, comprising nonlinear water wave height and period obtained from a laboratory experiment, are used as the input for these models, with the resulting vertical movement as the output. When the three models are compared on processing speed and accuracy, the ET model stands out as the most efficient. Finally, the ET model is tested using data from a real ocean setting.
Support vector machines and a Kalman-like observer are used for fault detection and isolation in a variable-speed horizontal-axis wind turbine composed of three blades and a full converter. The support vector approach is data-based and therefore does not rely on detailed process knowledge. It is based on structural risk minimization, which enhances generalization even with a small training data set, and it allows for process nonlinearity through flexible kernels. In this work, a radial basis function is used as the kernel. Different parts of the process are investigated, including actuator and sensor faults. With duplicated sensors, sensor faults in blade pitch positions, generator speed, and rotor speed can be detected. Stuck-measurement faults can be detected within 2 sampling periods. The detection time of offset/scaled measurements depends on the severity of the fault and on the process dynamics when the fault occurs. The converter torque actuator fault can be detected within 2 sampling periods. Faults in the actuators of the pitch systems are more difficult to detect because they affect only the transitory state (which is very fast) and not the final stationary state. Therefore, two methods are considered and compared for fault detection and isolation of this fault: support vector machines and a Kalman-like observer. The advantages and disadvantages of each method are discussed. On the one hand, training support vector machines on transitory states requires a large amount of data from different situations, but the fault detection and isolation results are robust to variations in the input/operating point. On the other hand, the observer is model-based and therefore does not require training, and it allows identification of the fault level, which is useful for fault reconfiguration. However, the observability of the system is ensured only under specific conditions related to the dynamics of the inputs and outputs. The whole fault detection and isolation scheme is evaluated using a wind turbine benchmark with a real sequence of wind speed.
This paper presents the fault diagnosis of a face milling tool based on a machine learning approach. During machining, spindle vibration signals in the feed direction are acquired under healthy and faulty conditions of the milling tool. A set of discrete wavelet features is extracted from the vibration signals using the discrete wavelet transform (DWT). The decision tree technique is used to select the significant features from all extracted wavelet features. C-support vector classification (C-SVC) and ν-support vector classification (ν-SVC) models of the support vector machine (SVM), with different kernel functions, are used to study and classify the tool condition based on the selected features. The results show that C-SVC performs better than ν-SVC, achieving 94.5% classification accuracy for face milling of the special steel alloy 42CrMo4.
Credit card fraud data is highly imbalanced: it contains an overwhelmingly large portion of nonfraudulent transactions and a small portion of fraudulent transactions. The measures used to judge the veracity of detection algorithms therefore become critical to deploying a model that accurately scores fraudulent transactions while taking into account case imbalance and the cost of identifying a case as genuine when it is, in fact, fraudulent. In this paper, a new criterion for judging classification algorithms, which considers the cost of misclassification, is proposed, and several undersampling techniques are compared using this criterion. In addition, a weighted support vector machine (SVM) algorithm that considers the financial cost of misclassification is introduced and proves more practical for credit card fraud detection than traditional methodologies. This weighted SVM uses transaction balances as weights for fraudulent transactions and a uniform weight for nonfraudulent transactions. The results show that this strategy greatly improves the performance of credit card fraud detection.
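The weighting scheme described in the fraud-detection abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's criterion: fraud is encoded as label 1, a missed fraud is assumed to cost its transaction balance, a false alarm is assumed to cost a fixed amount, and the function names and the `false_alarm_cost` parameter are hypothetical.

```python
def sample_weights(labels, balances, legit_weight=1.0):
    """Weight each fraud (label 1) by its balance and each legitimate case uniformly."""
    return [bal if y == 1 else legit_weight for y, bal in zip(labels, balances)]


def misclassification_cost(labels, predictions, balances, false_alarm_cost=1.0):
    """Total cost of a set of predictions.

    Assumption for illustration: a missed fraud costs its transaction
    balance, and a false alarm costs a fixed handling amount.
    """
    cost = 0.0
    for y, pred, bal in zip(labels, predictions, balances):
        if y == 1 and pred == 0:      # fraud scored as genuine
            cost += bal
        elif y == 0 and pred == 1:    # genuine transaction flagged as fraud
            cost += false_alarm_cost
    return cost


labels = [1, 0, 1, 0]
balances = [500.0, 20.0, 30.0, 75.0]
weights = sample_weights(labels, balances)
cost = misclassification_cost(labels, [0, 1, 1, 0], balances)
```

Weights of this kind could be passed to an SVM trainer that accepts per-sample weights; for example, scikit-learn's `SVC.fit` takes a `sample_weight` argument. Comparing classifiers by `misclassification_cost` rather than accuracy penalizes a single missed high-balance fraud more than several low-cost false alarms.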
BACKGROUND Delayed wound healing is a common clinical complication following radical surgery for gastric cancer and adversely affects patient prognosis. With advances in artificial intelligence, machine learning offers a promising approach for developing predictive models that can identify high-risk patients and support early clinical intervention. AIM To construct machine learning-based risk prediction models for delayed wound healing after gastric cancer surgery to support clinical decision-making. METHODS We reviewed a total of 514 patients who underwent radical gastric cancer surgery under general anesthesia from January 1, 2014 to December 30, 2023. Seventy percent of the dataset was selected as the training set and 30% as the validation set. Decision trees, support vector machines, and logistic regression were used to construct risk prediction models. Model performance was evaluated using accuracy, recall, precision, F1 index, the area under the receiver operating characteristic curve, and the decision curve. RESULTS This study included five variables: sex, elderly status, duration of abdominal drainage, preoperative white blood cell (WBC) count, and absolute neutrophil count. These variables were selected based on their clinical relevance and statistical significance in predicting delayed wound healing. The decision tree model outperformed the logistic regression and support vector machine models in both the training and validation sets. Specifically, the decision tree model achieved higher accuracy, F1 index, recall, and area under the curve (AUC) values. The support vector machine model also performed better than logistic regression, with higher accuracy, recall, and F1 index but a slightly lower AUC. The key variables of sex, elderly status, duration of abdominal drainage, preoperative WBC count, and absolute neutrophil count were found to be strong predictors of delayed wound healing. Patients with a longer duration of abdominal drainage had a significantly higher risk of delayed wound healing, with a risk ratio of 1.579 compared with those with a shorter duration. Similarly, preoperative WBC count, sex, elderly status, and absolute neutrophil count were associated with a higher risk of delayed wound healing, highlighting the importance of these variables in the model. CONCLUSION A model that identifies high-risk patients based on sex, elderly status, duration of abdominal drainage, preoperative WBC count, and absolute neutrophil count can provide valuable insights for clinical decision-making.
Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan, containing 93,925 samples per feature (seven well logs and one facies log), were used to classify four facies. Data preprocessing and preparation involved two processes: data cleaning and feature scaling. Several machine learning algorithms, including Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GB) for classification, were tested using different iterations and various combinations of features and parameters. The support vector machine with a radial kernel achieved a training accuracy of 72.49% without grid search and 64.02% with grid search, while the blind-well test scores were 71.01% and 69.67%, respectively. The Decision Tree (DT) hyperparameter optimization model showed an accuracy of 64.15% for training and 67.45% for testing. In comparison, the Decision Tree coupled with grid search yielded better results, with a training score of 69.91% and a testing score of 67.89%. The model was validated using the blind-well validation approach, which achieved an accuracy of 69.81%. Three algorithms were used to generate the gradient-boosting model. The Gradient Boosting classifier achieved an accuracy of 71.57% during training and 69.89% during testing. The Grid Search model achieved a higher accuracy of 72.14% during testing. The Extreme Gradient Boosting model had the lowest accuracy, with only 66.13% for training and 66.12% for testing. For validation, the Gradient Boosting (GB) classifier achieved an accuracy of 75.41% on the blind-well test, while Gradient Boosting with Grid Search achieved 71.36%. The Enhanced Random Forest and Random Forest with Bagging algorithms were the most effective, with validation accuracies of 78.30% and 79.18%, respectively. However, the Random Forest and Random Forest with Grid Search models displayed significant variance between their training and testing scores, indicating potential overfitting. Random Forest (RF) and Gradient Boosting (GB) are highly effective for facies classification because they handle complex relationships and provide high predictive accuracy. The choice between the two depends on specific project requirements, including interpretability, computational resources, and the nature of the data.
Many animals possess actively movable tactile sensors on their heads to explore the near-range space. During locomotion, an antenna is used for near-range orientation, for example, in detecting, localizing, probing, and negotiating obstacles. The bionic tactile sensor used in the present work was inspired by the antenna of stick insects. The sensor is able to detect an obstacle and its location in three-dimensional (3D) space. The vibration signals are analyzed in the frequency domain using the Fast Fourier Transform (FFT) to estimate the distances. Signal processing algorithms, an Artificial Neural Network (ANN), and a Support Vector Machine (SVM) are used for the analysis and prediction processes. These three prediction techniques are compared for both distance estimation and material classification. When estimating distances, the accuracy deteriorates towards the tip of the probe because of the change in the vibration modes. Since the vibration data within that region have a high variance, the accuracy of both distance estimation and material classification is lower towards the tip. The change in vibration mode is analyzed mathematically, and a solution is proposed to estimate the distance along the full range of the probe.
In recent years, binary image steganography has developed so rapidly that research on binary image steganalysis has become more important for information security. Most state-of-the-art binary image steganographic schemes identify the flippable pixels in order to minimize the embedding distortion. For this reason, the stego images generated by these schemes maintain visual quality, and it is hard for a steganalyzer to capture the embedding trace in the spatial domain. However, distortion maps can be calculated for cover and stego images, and the difference between them is significant. In this paper, a novel binary image steganalytic scheme based on a distortion level co-occurrence matrix is proposed. The proposed scheme first generates the corresponding distortion maps for cover and stego images. Then the co-occurrence matrix is constructed on the distortion level maps to represent the features of cover and stego images. Finally, a support vector machine with a Gaussian kernel is used to classify the features. Experimental results demonstrate that, compared with prior steganalytic methods, the proposed scheme can effectively detect stego images.
Support Vector Clustering (SVC) is a kernel-based unsupervised clustering method. The main drawback of SVC is the high computational complexity of obtaining the adjacency matrix that describes the connectivity of each pair of points. Based on the proximity graph model [3], the Euclidean distance in Hilbert space is calculated using a Gaussian kernel and serves as the criterion for generating a minimum spanning tree (MST) with Kruskal's algorithm. The cost of connectivity estimation is then lowered by checking only the linkages between the edges that form the main stem of the MST, for which a non-compatibility degree is newly defined to support edge selection during linkage estimation. This new approach is analyzed experimentally. The results show that the revised algorithm performs better than the proximity graph model, with faster speed, improved clustering quality, and strong noise suppression, which makes SVC scalable to large data sets.
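The first step of the SVC abstract, a kernel-induced distance fed into Kruskal's algorithm, can be sketched as below. For a Gaussian kernel $K(x,y)=\exp(-q\lVert x-y\rVert^2)$, the squared feature-space distance is $K(x,x)+K(y,y)-2K(x,y)=2-2K(x,y)$. The kernel width `q`, the function names, and the complete-graph edge set are assumptions made for illustration; the main-stem selection and the non-compatibility degree from the abstract are not reproduced here.

```python
from math import exp, sqrt


def gaussian_kernel(x, y, q=1.0):
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2)."""
    return exp(-q * sum((a - b) ** 2 for a, b in zip(x, y)))


def feature_distance(x, y, q=1.0):
    """Euclidean distance between the images of x and y in feature space.

    Uses ||phi(x) - phi(y)||^2 = 2 - 2 K(x, y), valid because K(x, x) = 1.
    """
    return sqrt(max(0.0, 2.0 - 2.0 * gaussian_kernel(x, y, q)))


def kruskal_mst(points, q=1.0):
    """Minimum spanning tree over all point pairs, weighted by feature distance."""
    n = len(points)
    edges = sorted(
        (feature_distance(points[i], points[j], q), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # edge joins two separate components
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


points = [(0.0,), (1.0,), (5.0,)]
mst = kruskal_mst(points)
```

Because the Gaussian kernel saturates, distant pairs all have feature distances close to $\sqrt{2}$, so the tree's short edges are the informative ones for deciding which points belong to the same cluster.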
Students in South African universities come from different socio-cultural backgrounds, countries, and high schools. This suggests that these students have different experiences, which affect how well they grasp information in class, as they potentially view tuition through different lenses. Current practice in universities for supporting the academic performance of students includes the use of tutors, mobile devices for first-year students, student assistants, and different feedback measures. The problem with current practice is that students are quitting university in high numbers. In this study, knowledge has been drawn from data through the use of machine learning algorithms. Bayesian networks, support vector machines (SVMs), and decision tree algorithms were used individually to construct predictive models for the academic performance of students. The best model was constructed using an SVM; it gave a prediction accuracy of 72.87% and a prediction cost of 139. The model predicts the performance of students in advance of the year-end examination outcome. The results suggest that South African universities must recognize the diversity of the student population and thus provide students with better support and equip them with the knowledge that will enable them to tap into their full potential and enhance their skills.
Every second, a large volume of useful data is created on social media about various kinds of online purchases and in other forms of reviews. In particular, review data for purchased products grows enormously in different database repositories every day. Most of the review data is useful to new customers for their future purchases, as well as to companies that want to view customer feedback about various products. Data mining and machine learning techniques are commonly used to analyse and visualise such data and to understand the potential use of items purchased online. Customers express the quality of products through their sentiments about the items purchased from different online companies. This research work analyses the sentiments in headphone review data collected from online repositories. For the analysis, machine learning techniques such as Support Vector Machines, Naive Bayes, Decision Trees, and Random Forest algorithms, together with a hybrid method, are applied to assess quality via the customers' sentiments. The accuracy and performance of the chosen algorithms are also analysed for the three types of sentiment: positive, negative, and neutral.
The significance of precise energy usage forecasts has been highlighted by the increasing need for sustainability and energy efficiency across a range of industries. To improve the precision and transparency of energy consumption projections, this study investigates the combination of machine learning (ML) methods with Shapley additive explanations (SHAP) values. The study evaluates three distinct models: a Linear Regressor, a Support Vector Regressor (SVR), and a Decision Tree Regressor, which was extended to a Random Forest Regressor. These models were deployed with shareable, plot-interpretable explainable artificial intelligence techniques to improve trust in the AI. The findings suggest that the developed models are superior to the conventional models discussed in prior studies, with Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) values close to zero. In detail, the Random Forest Regressor shows an MAE of 0.001 for predicting the house prices, whereas the SVR gives an MAE of 0.21 and an RMSE of 0.24. Such outcomes reflect the possibility of combining advanced AI models with explainable AI to predict energy consumption more accurately while also explaining the models' decision-making procedures. In addition to increasing prediction accuracy, this strategy gives stakeholders comprehensible insights, which facilitates improved decision-making and fosters confidence in AI-powered energy solutions. The outcomes show how well ML and SHAP work together to enhance prediction performance and guarantee transparency in energy usage projections.
This study aims to evaluate the effectiveness of machine learning techniques for predicting groundwater fluctuations in arid and semi-arid regions using data from the Gravity Recovery and Climate Experiment (GRACE) satellite mission. The primary objective is to develop accurate predictive models for groundwater level changes by leveraging the unique capabilities of GRACE satellite data in conjunction with advanced machine learning algorithms. Three widely used machine learning models, namely DT, SVM, and RF, were employed to analyze and model the relationship between GRACE satellite data and groundwater fluctuations in South Khorasan Province, Iran. The study utilized 151 months of GRACE data spanning 2002 to 2017, which were correlated with piezometer well data available in the study area. The JPL model was selected based on its strong correlation (R² = 0.9368) with the observed data. The machine learning models were trained and validated using a 70/30 split of the data, and their performance was evaluated using various statistical metrics, including RMSE, R², and NSE. The results demonstrated the suitability of machine learning approaches for modeling groundwater fluctuations using GRACE satellite data. The DT model exhibited the best performance during the calibration stage, with an R² value of 0.95, an RMSE of 20.655, and an NSE of 0.96. The SVM and RF models achieved R² values of 0.79 and 0.65 and NSE values of 0.86 and 0.71, respectively. For the prediction stage, the DT model maintained its high efficiency, with an RMSE of 1.48, an R² of 0.87, and an NSE of 0.90, indicating its robustness in predicting future groundwater fluctuations using GRACE data. The study highlights the potential of machine learning techniques, particularly decision trees, in conjunction with GRACE satellite data for accurate prediction and monitoring of groundwater fluctuations in arid and semi-arid regions. The findings demonstrate the effectiveness of the DT model in capturing the complex relationships between GRACE data and groundwater dynamics, providing reliable predictions and insights for sustainable groundwater management strategies.
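The three evaluation metrics named in the groundwater abstract have standard definitions, sketched below for reference. The abstract does not say how they were computed, so two points are assumptions: R² is taken as the squared Pearson correlation between observed and simulated series, and NSE is the usual Nash-Sutcliffe efficiency, $1-\sum(o_i-s_i)^2/\sum(o_i-\bar{o})^2$. The function names are hypothetical.

```python
from math import sqrt


def rmse(observed, simulated):
    """Root mean squared error between two equal-length series."""
    n = len(observed)
    return sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def r_squared(observed, simulated):
    """Squared Pearson correlation (assumed meaning of R^2 here)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


observed = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated but scaled by 2
```

The `biased` series shows why the metrics are reported together: its R² is 1 because it is perfectly correlated with the observations, yet its NSE is negative because the scaling error makes it worse than simply predicting the observed mean.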
Finger vein recognition is a biometric technique that identifies individuals using their unique finger vein patterns. It is reported to have high accuracy and rapid processing speed. In addition, it is impossible to steal a vein pattern located inside the finger. We propose a new identification method for finger vascular patterns using a weighted local binary pattern (LBP) and a support vector machine (SVM). This research is novel in the following three ways. First, holistic codes are extracted through the LBP method without using a vein detection procedure. This reduces the processing time and the complexity of detecting finger vein patterns. Second, we classify the local areas from which the LBP codes are extracted into three categories using the SVM classifier: local areas that include a large amount (LA), a medium amount (MA), and a small amount (SA) of vein patterns. Third, different weights are assigned to the extracted LBP codes according to the local area type (LA, MA, or SA) from which they were extracted. The optimal weights are determined empirically in terms of the accuracy of finger vein recognition. Experimental results show that the equal error rate (EER) of the proposed method is significantly lower than that obtained without the proposed method or with a conventional method.
Posterior probability support vector machines (PPSVMs) prove robust against noise and outliers and need fewer stored support vectors (SVs). Gonen et al. (2008) extended PPSVMs to the multiclass case using both single-machine and multimachine approaches. However, these extensions suffer from low classification efficiency, a high computational burden, and, more importantly, unclassifiable regions. To achieve higher classification efficiency and accuracy with fewer SVs, a binary tree of PPSVMs for the multiclass classification problem is proposed in this letter. Moreover, a Fisher ratio separability measure is adopted to determine the tree structure. Several experiments on handwritten recognition datasets are included to illustrate the proposed approach. Specifically, the binary tree of PPSVMs accelerated by Fisher ratio separability obtains overall test accuracy that is at least comparable to, if not higher than, that of other multiclass algorithms, while using significantly fewer SVs and much less test time.
Landslides are abundant in mountainous regions, where they are responsible for substantial damage and losses. The A1 Highway, an important road in Algeria, passes in places through mountainous and/or semi-mountainous areas. Previous studies of landslide susceptibility mapping conducted near this road using statistical and expert methods have yielded ordinary results. In this research, we are interested in how machine learning techniques help to increase the accuracy of landslide susceptibility maps in the vicinity of the A1 Highway corridor. To do this, an important section at Ain Bouziane (NE Algeria) is chosen as a case study to evaluate landslide susceptibility using three different machine learning methods, namely random forest (RF), support vector machine (SVM), and boosted regression tree (BRT). First, an inventory map and nine input factors were prepared for the landslide susceptibility mapping (LSM) analyses. The three models were constructed to find the areas most susceptible to this phenomenon. The results were assessed by calculating the receiver operating characteristic (ROC) curve, the standard error (Std. error), and the confidence interval (CI) at 95%. The RF model reached the highest predictive accuracy (AUC = 97.2%) compared with the other models. The outcomes of this research show that the machine learning models obtained are able to predict future landslide locations in this important road section. In addition, their application improves the accuracy of LSMs near the road corridor. The machine learning models may become an important prediction tool for identifying landslide mitigation actions.
As a primary defense technique, intrusion detection is becoming more and more significant, since network security is one of the most critical issues in the world. We present an adaptive collaborative intrusion detection method to improve the safety of a network. A self-adaptive and collaborative intrusion detection model is built by applying the Environments-classes, agents, roles, groups, and objects (E-CARGO) model. The objects, roles, agents, and groups are designed using decision trees (DTs) and support vector machines (SVMs), and adaptive scheduling mechanisms are set up. The KDD CUP 1999 data set is used to verify the effectiveness of the method. The experimental results demonstrate the feasibility and efficiency of the proposed collaborative and adaptive intrusion detection method. The proposed method is also shown to outperform methods that use a set of single-type support vector machines (SVMs) in terms of detection precision rate and recall rate.
Funding: Supported by the National Natural Science Foundation of China (60604021, 60874054).
文摘To solve the multi-class fault diagnosis tasks,decision tree support vector machine(DTSVM),which combines SVM and decision tree using the concept of dichotomy,is proposed.Since the classification performance of DTSVM highly depends on its structure,to cluster the multi-classes with maximum distance between the clustering centers of the two sub-classes,genetic algorithm is introduced into the formation of decision tree,so that the most separable classes would be separated at each node of decisions tree.Numerical simulations conducted on three datasets compared with"one-against-all"and"one-against-one"demonstrate the proposed method has better performance and higher generalization ability than the two conventional methods.
文摘Scientists have introduced new methods for capturing energy from ocean waves.Specifically,scientists have focused on a type of wave energy converter(WEC)that is nonbuoyant(i.e.,a body that cannot float).Typically,the WEC is most effective when it is in resonance,which occurs when the natural frequency of the WEC aligns with that of the ocean waves.Therefore,accurately predicting the movement of the WEC is crucial for adjusting its system to resonate with the incoming waves for optimal performance.In this study,artificial intelligence techniques,such as random forest,extra trees(ET),and support vector machines,are created to forecast the vertical movement of a nonbuoyant WEC.The developed models require two variables as input,namely,the water wave height and its time period.A total of approximately 4500 data points,which include nonlinear water wave height and duration ob-tained from a laboratory experiment,are used as the input for these models,with the resulting vertical movement as the output.When comparing the three models based on their processing speed and accuracy,the ET model stands out as the most efficient.Ultimately,the ET model is tested using data from a real ocean setting.
文摘Support vector machines and a Kalman-like observer are used for fault detection and isolation in a variable speed horizontalaxis wind turbine composed of three blades and a full converter. The support vector approach is data-based and is therefore robust to process knowledge. It is based on structural risk minimization which enhances generalization even with small training data set and it allows for process nonlinearity by using flexible kernels. In this work, a radial basis function is used as the kernel. Different parts of the process are investigated including actuators and sensors faults. With duplicated sensors, sensor faults in blade pitch positions,generator and rotor speeds can be detected. Faults of type stuck measurements can be detected in 2 sampling periods. The detection time of offset/scaled measurements depends on the severity of the fault and on the process dynamics when the fault occurs. The converter torque actuator fault can be detected within 2 sampling periods. Faults in the actuators of the pitch systems represents a higher difficulty for fault detection which is due to the fact that such faults only affect the transitory state(which is very fast) but not the final stationary state. Therefore, two methods are considered and compared for fault detection and isolation of this fault: support vector machines and a Kalman-like observer. Advantages and disadvantages of each method are discussed. On one hand, support vector machines training of transitory states would require a big amount of data in different situations, but the fault detection and isolation results are robust to variations in the input/operating point. On the other hand, the observer is model-based, and therefore does not require training, and it allows identification of the fault level, which is interesting for fault reconfiguration. But the observability of the system is ensured under specific conditions, related to the dynamics of the inputs and outputs. 
The whole fault detection and isolation scheme is evaluated using a wind turbine benchmark with a real sequence of wind speed.
Abstract: This paper presents the fault diagnosis of a face milling tool based on a machine learning approach. While machining, spindle vibration signals in the feed direction under healthy and faulty conditions of the milling tool are acquired. A set of discrete wavelet features is extracted from the vibration signals using the discrete wavelet transform (DWT) technique. The decision tree technique is used to select significant features out of all extracted wavelet features. C-support vector classification (C-SVC) and ν-support vector classification (ν-SVC) models with different kernel functions of support vector machine (SVM) are used to study and classify the tool condition based on the selected features. From the results obtained, C-SVC performs better than ν-SVC and gives 94.5% classification accuracy for face milling of the special steel alloy 42CrMo4.
Abstract: Credit card fraud data is highly imbalanced, with an overwhelmingly large portion of nonfraudulent transactions and a small portion of fraudulent transactions. The measures used to judge the veracity of the detection algorithms become critical to the deployment of a model that accurately scores fraudulent transactions, taking into account case imbalance and the cost of identifying a case as genuine when, in fact, the case is a fraudulent transaction. In this paper, a new criterion to judge classification algorithms, which considers the cost of misclassification, is proposed, and several undersampling techniques are compared by this new criterion. At the same time, a weighted support vector machine (SVM) algorithm considering the financial cost of misclassification is introduced, proving to be more practical for credit card fraud detection than traditional methodologies. This weighted SVM uses transaction balances as weights for fraudulent transactions and a uniform weight for nonfraudulent transactions. The results show that this strategy greatly improves the performance of credit card fraud detection.
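The cost-sensitive idea in this abstract can be illustrated with a small sketch. The abstract does not give the exact criterion, so the cost rule below is an assumption made for illustration: a missed fraud costs its transaction amount and a false alarm costs a fixed fee. The names `misclassification_cost`, `sample_weights`, and `false_alarm_cost` are hypothetical. The weighting helper follows the scheme the abstract does describe: the transaction balance as the weight for fraudulent cases and a uniform weight for the rest.

```python
def misclassification_cost(y_true, y_pred, amounts, false_alarm_cost=1.0):
    """Total monetary cost of a set of predictions (assumed cost rule).

    Labels: 1 = fraudulent, 0 = legitimate.
    A missed fraud costs the transaction amount; a false alarm costs a
    fixed, hypothetical review fee.
    """
    cost = 0.0
    for truth, pred, amount in zip(y_true, y_pred, amounts):
        if truth == 1 and pred == 0:      # fraud scored as genuine
            cost += amount
        elif truth == 0 and pred == 1:    # genuine flagged as fraud
            cost += false_alarm_cost
    return cost


def sample_weights(y_true, amounts, legit_weight=1.0):
    """Per-sample training weights: balance for frauds, uniform otherwise."""
    return [amount if truth == 1 else legit_weight
            for truth, amount in zip(y_true, amounts)]


y_true = [1, 0, 1, 0]
y_pred = [0, 1, 1, 0]
amounts = [250.0, 40.0, 10.0, 5.0]
print(misclassification_cost(y_true, y_pred, amounts))
print(sample_weights(y_true, amounts))
```

Weights of this kind could be passed to any SVM trainer that accepts per-sample weights. Two classifiers with the same accuracy can differ widely on such a cost measure if one of them misses the large transactions.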
Funding: Supported by the Shandong Province Traditional Chinese Medicine Technology Project, No. Q-2023147; the Weifang Health Commission Research Project, No. WFWSJK-2023-033; the Weifang City Science and Technology Development Plan (Medical Category), No. 2023YX057; the Weifang Medical University 2022 Campus Level Education and Teaching Reform and Research Project, No. 2022YB051; the Norman Bethune Public Welfare Foundation, No. ezmr2023-037; and the Special Research Project on Optimized Management of Acute Pain, Wu Jieping Medical Foundation.
文摘BACKGROUND Delayed wound healing is a common clinical complication following gastric cancer radical surgery,adversely affecting patient prognosis.With advances in artificial intelligence,machine learning offers a promising approach for developing predictive models that can identify high-risk patients and support early clinical intervention.AIM To construct machine learning-based risk prediction models for delayed wound healing after gastric cancer surgery to support clinical decision-making.METHODS We reviewed a total of 514 patients who underwent gastric cancer radical surgery under general anesthesia from January 1,2014 to December 30,2023.Seventy percent of the dataset was selected as the training set and 30%as the validation set.Decision trees,support vector machines,and logistic regression were used to construct a risk prediction model.The performance of the model was evaluated using accuracy,recall,precision,F1 index,and area under the receiver operating characteristic curve and decision curve.RESULTS This study included five variables:Sex,elderly,duration of abdominal drainage,preoperative white blood cell(WBC)count,and absolute value of neutrophils.These variables were selected based on their clinical relevance and statistical significance in predicting delayed wound healing.The results showed that the decision tree model outperformed the logistic regression and support vector machine models in both the training and validation sets.Specifically,the decision tree model achieved higher accuracy,F1 index,recall,and area under the curve(AUC)values.The support vector machine model also demonstrated better performance than logistic regression,with higher accuracy,recall,and F1 index,but a slightly lower AUC.The key variables of sex,elderly,duration of abdominal drainage,preoperative WBC count,and absolute value of neutrophils were found to be strong predictors of delayed wound healing.Patients with longer duration of abdominal drainage had a significantly 
higher risk of delayed wound healing, with a risk ratio of 1.579 compared to those with shorter duration of abdominal drainage. Similarly, preoperative WBC count, sex, elderly, and absolute value of neutrophils were associated with a higher risk of delayed wound healing, highlighting the importance of these variables in the model. CONCLUSION The model, which identifies high-risk patients based on sex, elderly, duration of abdominal drainage, preoperative WBC count, and absolute value of neutrophils, can provide valuable insights for clinical decision-making.
文摘Machine learning techniques and a dataset of five wells from the Rawat oilfield in Sudan containing 93,925 samples per feature(seven well logs and one facies log) were used to classify four facies. Data preprocessing and preparation involve two processes: data cleaning and feature scaling. Several machine learning algorithms, including Linear Regression(LR), Decision Tree(DT), Support Vector Machine(SVM),Random Forest(RF), and Gradient Boosting(GB) for classification, were tested using different iterations and various combinations of features and parameters. The support vector radial kernel training model achieved an accuracy of 72.49% without grid search and 64.02% with grid search, while the blind-well test scores were 71.01% and 69.67%, respectively. The Decision Tree(DT) Hyperparameter Optimization model showed an accuracy of 64.15% for training and 67.45% for testing. In comparison, the Decision Tree coupled with grid search yielded better results, with a training score of 69.91% and a testing score of67.89%. The model's validation was carried out using the blind well validation approach, which achieved an accuracy of 69.81%. Three algorithms were used to generate the gradient-boosting model. During training, the Gradient Boosting classifier achieved an accuracy score of 71.57%, and during testing, it achieved 69.89%. The Grid Search model achieved a higher accuracy score of 72.14% during testing. The Extreme Gradient Boosting model had the lowest accuracy score, with only 66.13% for training and66.12% for testing. For validation, the Gradient Boosting(GB) classifier model achieved an accuracy score of 75.41% on the blind well test, while the Gradient Boosting with Grid Search achieved an accuracy score of 71.36%. The Enhanced Random Forest and Random Forest with Bagging algorithms were the most effective, with validation accuracies of 78.30% and 79.18%, respectively. 
However, the Random Forest and Random Forest with Grid Search models displayed significant variance between their training and testing scores, indicating the potential for overfitting. Random Forest(RF) and Gradient Boosting(GB) are highly effective for facies classification because they handle complex relationships and provide high predictive accuracy. The choice between the two depends on specific project requirements, including interpretability, computational resources, and data nature.
Abstract: Many animals possess actively movable tactile sensors in their heads to explore the near-range space. During locomotion, an antenna is used in near-range orientation, for example, in detecting, localizing, probing, and negotiating obstacles. The bionic tactile sensor used in the present work was inspired by the antenna of stick insects. The sensor is able to detect an obstacle and its location in three-dimensional (3D) space. The vibration signals are analyzed in the frequency domain using the Fast Fourier Transform (FFT) to estimate the distances. Signal processing algorithms, Artificial Neural Network (ANN), and Support Vector Machine (SVM) are used for the analysis and prediction processes. These three prediction techniques are compared for both distance estimation and material classification. When estimating the distances, the accuracy of estimation deteriorates towards the tip of the probe due to the change in the vibration modes. Since the vibration data within that region have a high variance, the accuracy of distance estimation and material classification is lower towards the tip. The change in vibration mode is mathematically analyzed, and a solution is proposed to estimate the distance along the full range of the probe.
Funding: This work is supported by the National Natural Science Foundation of China (No. U1736118), the Natural Science Foundation of Guangdong (No. 2016A030313350), the Special Funds for Science and Technology Development of Guangdong (No. 2016KZ010103), the Key Project of Scientific Research Plan of Guangzhou (No. 201804020068), the Fundamental Research Funds for the Central Universities (No. 16lgjc83 and No. 17lgjc45), and the Science and Technology Planning Project of Guangdong Province (Grant No. 2017A040405051).
Abstract: In recent years, binary image steganography has developed so rapidly that research on binary image steganalysis has become more important for information security. Most state-of-the-art binary image steganographic schemes select flippable pixels so as to minimize the embedding distortion. For this reason, the stego images generated by these schemes maintain visual quality, and it is hard for a steganalyzer to capture the embedding trace in the spatial domain. However, the distortion maps can be calculated for cover and stego images, and the difference between them is significant. In this paper, a novel binary image steganalytic scheme is proposed, which is based on the distortion level co-occurrence matrix. The proposed scheme first generates the corresponding distortion maps for cover and stego images. Then the co-occurrence matrix is constructed on the distortion level maps to represent the features of cover and stego images. Finally, a support vector machine based on the Gaussian kernel is used to classify the features. Experimental results demonstrate that, compared with prior steganalytic methods, the proposed scheme can effectively detect stego images.
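The co-occurrence step described in this abstract can be sketched as follows. The abstract does not say how distortion levels are computed or which pixel offsets are used, so this sketch assumes an integer-valued level map is already available and counts horizontally adjacent pairs by default. The function name `cooccurrence_matrix` and its parameters are hypothetical.

```python
def cooccurrence_matrix(level_map, num_levels, offset=(0, 1)):
    """Count how often level a is followed by level b at the given offset.

    level_map  : 2D list of integers in [0, num_levels)
    offset     : (row step, column step); (0, 1) means the right neighbour
    Returns a num_levels x num_levels list of pair counts.
    """
    d_row, d_col = offset
    rows, cols = len(level_map), len(level_map[0])
    counts = [[0] * num_levels for _ in range(num_levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:   # skip pairs leaving the map
                counts[level_map[r][c]][level_map[r2][c2]] += 1
    return counts


levels = [[0, 1, 1],
          [2, 0, 1]]
print(cooccurrence_matrix(levels, num_levels=3))
```

Flattening the matrix (optionally after normalising the counts) would give a fixed-length feature vector that a Gaussian-kernel SVM could classify.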
Funding: The National High Technology Research and Development Program of China (No. 863511930009)
Abstract: Support Vector Clustering (SVC) is a kernel-based unsupervised clustering method. The main drawback of SVC is its high computational complexity in obtaining the adjacency matrix that describes the connectivity of each pair of points. Based on the proximity graph model [3], the Euclidean distance in Hilbert space is calculated using a Gaussian kernel and serves as the criterion for generating a minimum spanning tree (MST) using Kruskal's algorithm. The cost of connectivity estimation is then reduced by checking only the linkages between the edges that form the main stem of the MST, for which a non-compatibility degree is newly defined to support edge selection during linkage estimation. This new approach is analyzed experimentally. The results show that the revised algorithm performs better than the proximity graph model, with faster speed, better clustering quality, and strong noise suppression ability, which makes SVC scalable to large data sets.
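The first two steps of this abstract, the kernel-induced distance and the minimum spanning tree, can be sketched as below. This is a sketch under stated assumptions, not the authors' implementation: it uses the standard identity for a Gaussian kernel, where the squared feature-space distance is 2 - 2*exp(-q*||x - y||^2), and a plain Kruskal pass over all point pairs. The kernel width `q` and the function names are hypothetical, and the main-stem selection and non-compatibility degree are not covered.

```python
import math


def kernel_distance(x, y, q=1.0):
    """Euclidean distance between x and y in the Gaussian-kernel feature space."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    # K(x, x) = K(y, y) = 1 for a Gaussian kernel, so d^2 = 2 - 2 K(x, y).
    return math.sqrt(max(0.0, 2.0 - 2.0 * math.exp(-q * squared)))


def kruskal_mst(points, q=1.0):
    """Return MST edges (i, j, distance) over all pairs using Kruskal's algorithm."""
    n = len(points)
    edges = sorted(
        (kernel_distance(points[i], points[j], q), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # edge joins two components: keep it
            parent[root_i] = root_j
            tree.append((i, j, dist))
    return tree


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
for i, j, dist in kruskal_mst(points):
    print(i, j, round(dist, 3))
```

With a Gaussian kernel the distance saturates near sqrt(2) for far-apart points, so long MST edges are natural candidates for cluster boundaries. Building all pairs is quadratic in the number of points, so this sketch only suits small examples.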
文摘Students in South African Universities come from different socio-cultural backgrounds, countries and high schools. This suggests that these students have different experiences which impact on their levels of grasping information in class as they potentially use different lenses on tuition. The current practice in Universities in contributing to the academic performance of students includes the use of tutors, the use of mobile devices for first year students, use of student assistants and the use of different feedback measures. What is problematic about the current practice is that students are quitting university in high numbers. In this study, knowledge has been drawn from data through the use of machine learning algorithms. Bayesian networks, support vector machines (SVMs) and decision trees algorithms were used individually in this work to construct predictive models for the academic performance of students. The best model was constructed using SVM and it gave a prediction of 72.87% and a prediction cost of 139. The model does predict the performance of students in advance of the year-end examinations outcome. The results suggest that South African Universities must recognize the diversity in student population and thus provide students with better support and equip them with the necessary knowledge that will enable them to tap into their full potential and thus enhance their skills.
Abstract: Every second, a large volume of useful data is created on social media about various kinds of online purchases and other forms of reviews. In particular, review data on purchased products grows enormously in different database repositories every day. Most of the review data is useful to new customers for their further purchases, as well as to existing companies for viewing customer feedback about various products. Data mining and machine learning techniques are commonly used to analyse such data, to visualise it, and to understand the potential use of items purchased online. Customers express the quality of products through their sentiments about the items purchased from different online companies. In this research work, the sentiments in Headphone review data collected from online repositories are analysed. For this analysis, machine learning techniques such as Support Vector Machines, Naive Bayes, Decision Trees, and Random Forest algorithms, together with a hybrid method, are applied to assess quality via the customers' sentiments. The accuracy and performance of the selected algorithms are also analysed based on three types of sentiment: positive, negative, and neutral.
Abstract: The significance of precise energy usage forecasts has been highlighted by the increasing need for sustainability and energy efficiency across a range of industries. In order to improve the precision and openness of energy consumption projections, this study investigates the combination of machine learning (ML) methods with Shapley additive explanations (SHAP) values. The study evaluates three distinct models: the first is a Linear Regressor, the second is a Support Vector Regressor (SVR), and the third is a Decision Tree Regressor, which was extended to a Random Forest Regressor. These models were deployed with the use of Shareable, Plot-interpretable Explainable Artificial Intelligence techniques to improve trust in the AI. The findings suggest that the developed models are superior to the conventional models discussed in prior studies, with Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) values that are close to ideal. In detail, the Random Forest Regressor shows an MAE of 0.001 for predicting the house prices, whereas the SVR gives an MAE of 0.21 and an RMSE of 0.24. Such outcomes reflect the possibility of combining advanced AI models with Explainable AI, both for more accurate prediction of energy consumption and for explaining the models' decision-making procedures. In addition to increasing prediction accuracy, this strategy gives stakeholders comprehensible insights, which facilitates improved decision-making and fosters confidence in AI-powered energy solutions. The outcomes show how well ML and SHAP work together to enhance prediction performance and guarantee transparency in energy usage projections.
Abstract: This study aims to evaluate the effectiveness of machine learning techniques for predicting groundwater fluctuations in arid and semi-arid regions using data from the Gravity Recovery and Climate Experiment (GRACE) satellite mission. The primary objective is to develop accurate predictive models for groundwater level changes by leveraging the unique capabilities of GRACE satellite data in conjunction with advanced machine learning algorithms. Three widely used machine learning models, namely DT, SVM, and RF, were employed to analyze and model the relationship between GRACE satellite data and groundwater fluctuations in South Khorasan Province, Iran. The study utilized 151 months of GRACE data spanning from 2002 to 2017, which were correlated with piezometer well data available in the study area. The JPL model was selected based on its strong correlation (R²=0.9368) with the observed data. The machine learning models were trained and validated using a 70/30 split of the data, and their performance was evaluated using various statistical metrics, including RMSE, R², and NSE. The results demonstrated the suitability of machine learning approaches for modeling groundwater fluctuations using GRACE satellite data. The DT model exhibited the best performance during the calibration stage, with an R² value of 0.95, RMSE of 20.655, and NSE of 0.96. The SVM and RF models achieved R values of 0.79 and 0.65, and NSE values of 0.86 and 0.71, respectively. For the prediction stage, the DT model maintained its high efficiency, with an RMSE of 1.48, R² of 0.87, and NSE of 0.90, indicating its robustness in predicting future groundwater fluctuations using GRACE data. The study highlights the potential of machine learning techniques, particularly Decision Trees, in conjunction with GRACE satellite data, for accurate prediction and monitoring of groundwater fluctuations in arid and semi-arid regions. The findings demonstrate the effectiveness of the DT model in capturing the complex relationships between GRACE data and groundwater dynamics, providing reliable predictions and insights for sustainable groundwater management strategies.
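For readers unfamiliar with the evaluation metrics named in this abstract, the sketch below computes RMSE, the Nash-Sutcliffe efficiency (NSE), and a squared Pearson correlation from paired observed and simulated values. These are the standard textbook definitions. The abstract does not state which definition of R² the study used, so the squared-correlation form here is an assumption, and the function names are illustrative.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)


observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, simulated), nse(observed, simulated), r_squared(observed, simulated))
```

NSE and the squared correlation can disagree: a simulation that is perfectly correlated with the observations but biased keeps a squared correlation of 1 while its NSE drops.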
Funding: Project (No. R112002105070020 (2010)) supported by the National Research Foundation of Korea (NRF) through the Biometrics Engineering Research Center (BERC) at Yonsei University
文摘Finger vein recognition is a biometric technique which identifies individuals using their unique finger vein patterns. It is reported to have a high accuracy and rapid processing speed. In addition, it is impossible to steal a vein pattern located inside the finger. We propose a new identification method of finger vascular patterns using a weighted local binary pattern (LBP) and support vector machine (SVM). This research is novel in the following three ways. First, holistic codes are extracted through the LBP method without using a vein detection procedure. This reduces the processing time and the complexities in detecting finger vein patterns. Second, we classify the local areas from which the LBP codes are extracted into three categories based on the SVM classifier: local areas that include a large amount (LA), a medium amount (MA), and a small amount (SA) of vein patterns. Third, different weights are assigned to the extracted LBP code according to the local area type (LA, MA, and SA) from which the LBP codes were extracted. The optimal weights are determined empirically in terms of the accuracy of the finger vein recognition. Experimental results show that our equal error rate (EER) is significantly lower compared to that without the proposed method or using a conventional method.
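The weighted LBP matching idea can be sketched in a few lines. This is an illustration only: the basic 3x3 LBP operator is standard, but the abstract does not give the matching formula or the weight values, so the weighted Hamming distance and the weights for the LA, MA, and SA area types below are assumptions. All names (`lbp_code`, `weighted_hamming`, `AREA_WEIGHTS`) are hypothetical.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch: a neighbour >= centre sets its bit."""
    centre = patch[1][1]
    # Neighbours in clockwise order starting at the top-left corner.
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= centre:
            code |= 1 << bit
    return code


# Hypothetical weights per local-area type (large, medium, small vein content).
AREA_WEIGHTS = {"LA": 3.0, "MA": 2.0, "SA": 1.0}


def weighted_hamming(codes_a, codes_b, weights):
    """Weighted fraction of differing LBP bits between two code sequences."""
    mismatch = sum(w * bin(a ^ b).count("1")
                   for a, b, w in zip(codes_a, codes_b, weights))
    return mismatch / (8 * sum(weights))


patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 6, 3]]
print(lbp_code(patch))
weights = [AREA_WEIGHTS["LA"], AREA_WEIGHTS["SA"]]
print(weighted_hamming([0b1111, 0], [0, 0], weights))
```

A larger weight makes bit mismatches in vein-rich areas count more towards the final distance, which matches the abstract's idea of weighting codes by area type.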
Funding: Project (Nos. 60874104 and 70971020) supported by the National Natural Science Foundation of China
Abstract: Posterior probability support vector machines (PPSVMs) have proved robust against noise and outliers and need to store fewer support vectors (SVs). Gonen et al. (2008) extended PPSVMs to the multiclass case by both single-machine and multimachine approaches. However, these extensions suffer from low classification efficiency, high computational burden, and, more importantly, unclassifiable regions. To achieve higher classification efficiency and accuracy with fewer SVs, a binary tree of PPSVMs for the multiclass classification problem is proposed in this letter. Moreover, a Fisher ratio separability measure is adopted to determine the tree structure. Several experiments on handwritten recognition datasets are included to illustrate the proposed approach. Specifically, the Fisher ratio separability accelerated binary tree of PPSVMs obtains an overall test accuracy that is at least comparable to, if not higher than, that of other multiclass algorithms, while using significantly fewer SVs and much less test time.
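A Fisher-ratio style separability score between two classes can be sketched as below. The letter's exact definition is not given in this abstract, so the form used here is an assumed common variant: the squared difference of the class means divided by the sum of the class variances, summed over features. The function name `fisher_ratio` is hypothetical.

```python
def fisher_ratio(class_a, class_b):
    """Sum over features of (mean_a - mean_b)^2 / (var_a + var_b).

    class_a, class_b : lists of equal-length feature tuples.
    A larger score means the two classes are easier to separate.
    """
    score = 0.0
    for d in range(len(class_a[0])):
        a = [sample[d] for sample in class_a]
        b = [sample[d] for sample in class_b]
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        var_a = sum((v - mean_a) ** 2 for v in a) / len(a)
        var_b = sum((v - mean_b) ** 2 for v in b) / len(b)
        spread = var_a + var_b
        gap = (mean_a - mean_b) ** 2
        if spread > 0:
            score += gap / spread
        elif gap > 0:
            score = float("inf")   # zero variance but distinct means
    return score


near = fisher_ratio([(0.0,), (2.0,)], [(1.0,), (3.0,)])
far = fisher_ratio([(0.0,), (2.0,)], [(4.0,), (6.0,)])
print(near, far)
```

When building a binary tree of classifiers, a score like this could be computed for candidate class groupings so that the most separable groups are split near the root.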
Abstract: Landslides are abundant in mountainous regions. They are responsible for substantial damages and losses in those areas. The A1 Highway, which is an important road in Algeria, passes in places through mountainous and/or semi-mountainous areas. Previous studies of landslide susceptibility mapping conducted near this road using statistical and expert methods have yielded ordinary results. In this research, we investigate how machine learning techniques can help increase the accuracy of landslide susceptibility maps in the vicinity of the A1 Highway corridor. To do this, an important section at Ain Bouziane (NE Algeria) is chosen as a case study to evaluate landslide susceptibility using three different machine learning methods, namely, random forest (RF), support vector machine (SVM), and boosted regression tree (BRT). First, an inventory map and nine input factors were prepared for the landslide susceptibility mapping (LSM) analyses. The three models were constructed to find the areas most susceptible to this phenomenon. The results were assessed by calculating the receiver operating characteristic (ROC) curve, the standard error (Std. error), and the confidence interval (CI) at 95%. The RF model reached the highest predictive accuracy (AUC=97.2%) compared with the other models. The outcomes of this research showed that the obtained machine learning models were able to predict future landslide locations in this important road section. In addition, their application improves the accuracy of LSMs near the road corridor. The machine learning models may become an important prediction tool for identifying landslide alleviation actions.
Funding: Supported in part by the National Natural Science Foundation of China (61772141, 61673123), the Guangdong Provincial Science & Technology Project (2015B090901016, 2016B010108007), the Guangdong Education Department Project (Guangdong Higher Education letter 2015[133]), and the Guangzhou Science & Technology Project (201508010067, 201604020145, 201604046017, and 2016201604030034)
Abstract: As a primary defense technique, intrusion detection is becoming more and more significant, since network security is one of the most critical issues in the world. We present an adaptive collaboration intrusion detection method to improve the safety of a network. A self-adaptive and collaborative intrusion detection model is built by applying the Environments-Classes, Agents, Roles, Groups, and Objects (E-CARGO) model. The objects, roles, agents, and groups are designed by using decision trees (DTs) and support vector machines (SVMs), and adaptive scheduling mechanisms are set up. The KDD CUP 1999 data set is used to verify the effectiveness of the method. The experimental results demonstrate the feasibility and efficiency of the proposed collaborative and adaptive intrusion detection method. The proposed method is also shown to be superior to methods that use a set of single-type support vector machines (SVMs) in terms of detection precision rate and recall rate.