In optoelectronics, certain types of data are difficult to annotate accurately, such as high-resolution optoelectronic imaging or imaging in certain special spectral ranges. Weakly supervised learning can provide a more reliable approach in these situations. Current popular approaches mainly adopt classification-based class activation maps (CAM) as initial pseudo labels to solve the task.
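The CAM idea mentioned above can be illustrated with a minimal sketch. It assumes the usual formulation in which the map for a class is the weighted sum of the last convolutional feature maps, using that class's classifier weights, and it assumes a simple min-max normalisation followed by a fixed threshold to obtain a pseudo label. The function names, the toy feature maps, and the threshold of 0.5 are hypothetical and are not taken from the paper.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) using one class's K weights."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def pseudo_label_mask(cam, threshold=0.5):
    """Min-max normalise the map, then threshold it into a binary pseudo label."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant map
    return [[1 if (v - lo) / span >= threshold else 0 for v in row] for row in cam]


# Toy input (hypothetical): two 2x2 feature maps and the weights of one class.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.0]
cam = class_activation_map(feature_maps, class_weights)
mask = pseudo_label_mask(cam)
```

In practice the thresholded map only marks the most discriminative region, which is why such masks are treated as initial pseudo labels rather than final annotations.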
With rapid urbanization and exponential population growth in China, two-wheeled vehicles have become a popular mode of transportation, particularly for short-distance travel. However, due to a lack of safety awareness, traffic violations by two-wheeled vehicle riders have become a widespread concern, contributing to urban traffic risks. Currently, significant human and material resources are allocated to monitoring and intercepting non-compliant riders to ensure safe driving behavior. To enhance the safety, efficiency, and cost-effectiveness of traffic monitoring, automated detection systems based on image processing algorithms can be employed to identify traffic violations from eye-level video footage. In this study, we propose a robust detection algorithm specifically designed for two-wheeled vehicles, which serves as a fundamental step toward intelligent traffic monitoring. Our approach integrates a novel convolutional and attention mechanism to improve detection accuracy and efficiency. Additionally, we introduce a semi-supervised training strategy that leverages a large number of unlabeled images to enhance the model's learning capability by extracting valuable background information. This method enables the model to generalize effectively to diverse urban environments and varying lighting conditions. We evaluate our proposed algorithm on a custom-built dataset, and experimental results demonstrate its superior performance, achieving an average precision (AP) of 95% and a recall (R) of 90.6%. Furthermore, the model requires only 25.7 GFLOPs of computation while achieving a high processing speed of 249 FPS, making it highly suitable for deployment on edge devices. Compared to existing detection methods, our approach significantly enhances the accuracy and robustness of two-wheeled vehicle identification while ensuring real-time performance.
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results in human action recognition under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as the dictionary used to express the query sample, and the minimal reconstruction error indicates the corresponding class. However, learning a discriminative dictionary remains difficult. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, we construct a novel classification model that consists of a representation-constrained term and a coefficient incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison with other state-of-the-art models.
This study proposes a supervised learning method that does not rely on labels. We use variables associated with the label as indirect labels, and construct an indirect physics-constrained loss based on the physical mechanism to train the model. During training, the model prediction is mapped through a projection matrix to the space of values that conform to the physical mechanism, and the model is then trained on the indirect labels. The final prediction of the model conforms to the physical mechanism linking the indirect label and the label, and also satisfies the constraints of the indirect label. The study also develops projection matrix normalization and prediction covariance analysis to ensure that the model can be fully trained. Finally, the effect of physics-constrained indirect supervised learning is verified on a well log generation problem.
Transition prediction has always been a frontier issue in aerodynamics. In this paper, a supervised learning model with a probability interpretation is developed for transition judgment based on experimental data. It addresses the shortcomings of the point detection method used in experiments, namely that often only a single transition point can be obtained and that comparison of multi-point data is necessary. First, the Variable-Interval Time Average (VITA) method is used to transform the fluctuating pressure signal measured on the airfoil surface into a sequence of states, which is described by a Markov chain model. Second, a feature vector consisting of the one-step transition matrix and its stationary distribution is extracted. Then, the Hidden Markov Model (HMM) is used to pre-classify the feature vectors marked using the traditional Root Mean Square (RMS) criterion. Finally, a classification model with a probability interpretation is established, and cross-validation is used for model validation. The results show that the developed model is effective and reliable, and that it generalizes well across Reynolds numbers. The model is analyzed theoretically in depth, and the effect of its parameters is studied in detail. Compared with the traditional RMS criterion, the developed classification model yields a reasonable transition zone. In addition, the developed model does not require comparison of multi-point data. The developed supervised learning model provides new ideas for transition detection in flight experiments and other experiments.
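The feature extraction step described above (one-step transition matrix plus stationary distribution) can be sketched as follows. This is only an illustration under stated assumptions: the state sequence is taken as given (the VITA step that produces it is not shown), states are integers starting at 0, and the stationary distribution is approximated by power iteration. The function names and the toy sequence are hypothetical.

```python
def transition_matrix(states, n_states):
    """Estimate the one-step transition matrix from an integer state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets a uniform row so every row sums to 1.
        matrix.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return matrix


def stationary_distribution(matrix, iterations=1000):
    """Approximate pi with pi = pi P by repeated multiplication (power iteration)."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Toy two-state sequence (hypothetical), e.g. 0 = quiet and 1 = burst.
states = [0, 0, 1, 0, 1, 1, 0, 0]
P = transition_matrix(states, n_states=2)
pi = stationary_distribution(P)
# Feature vector: flattened transition matrix followed by the stationary distribution.
feature_vector = [p for row in P for p in row] + pi
```

Power iteration is used here only because it needs no linear algebra library; it assumes the chain is irreducible and aperiodic so that the iteration converges.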
When labeled data are scarce, conventional methods struggle in radar emitter recognition. To solve this problem, an optimized cooperative semi-supervised learning method for radar emitter recognition based on a small amount of labeled data is developed. First, a small amount of labeled data is randomly sampled using the bootstrap method, and the loss functions of three common deep learning networks are improved by combining the uniform distribution with the cross-entropy function to reduce the overconfidence of softmax classification. Subsequently, the dataset obtained after sampling is used to train the three improved networks to build the initial model. In addition, the unlabeled data are preliminarily screened through dynamic time warping (DTW) and then input into the previously trained initial model for judgment. If the judgment results of two or more networks are consistent, the unlabeled data are labeled and added to the labeled dataset. Lastly, the three network models are trained on the expanded labeled dataset, and the final model is built. As revealed by the simulation results, the semi-supervised learning method adopted in this paper is capable of exploiting a small amount of labeled data and largely achieves the accuracy of labeled-data recognition.
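Two of the steps above lend themselves to a short sketch: a cross-entropy loss softened with a uniform distribution, and the rule that an unlabeled sample is accepted only when at least two of the three networks agree. The sketch assumes that the combination of uniform distribution and cross-entropy takes the form of standard label smoothing; the abstract does not confirm the exact formula, and the function names and the value of epsilon are hypothetical.

```python
import math
from collections import Counter


def smoothed_cross_entropy(probs, target, epsilon=0.1):
    """Cross-entropy against a target mixed with a uniform distribution.

    probs: softmax outputs (all > 0); target: index of the true class.
    The smoothed target is (1 - epsilon) * one_hot + epsilon / n_classes.
    """
    n = len(probs)
    loss = 0.0
    for k, p in enumerate(probs):
        q = (1.0 - epsilon) * (1.0 if k == target else 0.0) + epsilon / n
        loss -= q * math.log(p)
    return loss


def agreed_label(predictions, min_votes=2):
    """Return the label predicted by at least min_votes networks, else None."""
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes >= min_votes else None


# Three networks vote on one unlabeled sample (toy predictions).
pseudo = agreed_label([2, 2, 1])    # two networks agree, so the sample is labeled
rejected = agreed_label([0, 1, 2])  # no agreement, so the sample stays unlabeled
loss = smoothed_cross_entropy([0.7, 0.2, 0.1], target=0)
```

Samples for which `agreed_label` returns `None` would simply remain in the unlabeled pool.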
Log-linear models and, more recently, neural network models used for supervised relation extraction require substantial amounts of training data and time, limiting their portability to new relations and domains. To this end, we propose a training representation based on the dependency paths between entities in a dependency tree, which we call lexicalized dependency paths (LDPs). We show that this representation is fast, efficient and transparent. We further propose representations utilizing entity types and their subtypes to refine our model and alleviate the data sparsity problem. We apply lexicalized dependency paths to supervised learning using the ACE corpus and show that it can achieve a performance level similar to other state-of-the-art methods and even surpass them on several categories.
To address automatic defect detection and process control in welding and arc additive manufacturing, the paper monitors current, voltage, audio, and other data during the welding process and extracts the minimum value, standard deviation, and deviation from the voltage and current data. It extracts features such as root mean square, spectral centroid, and zero-crossing rate from the audio data, fuses the features extracted from the multiple sensor signals, and establishes multiple supervised and unsupervised machine learning models, which are used to detect abnormalities in the welding process. The experimental results show that the established models have high accuracy: among the supervised models, AdaBoost reaches a balanced accuracy of 0.957, and among the unsupervised models, Isolation Forest reaches a balanced accuracy of 0.909.
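The three audio features named above can be computed as in the following sketch. It uses common textbook definitions (root mean square of the samples, fraction of adjacent sample pairs that change sign, and magnitude-weighted mean frequency of a discrete Fourier transform); the paper's exact definitions and framing parameters are not given, so the function names, the naive DFT, and the toy signal are assumptions.

```python
import cmath
import math


def root_mean_square(signal):
    """Square root of the mean squared amplitude."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ (needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency, using a naive O(n^2) DFT."""
    n = len(signal)
    weighted = total = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(signal))
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total else 0.0


# Toy frame (hypothetical): a 2 Hz tone sampled at 16 Hz for one second.
tone = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
features = [root_mean_square(tone), zero_crossing_rate(tone), spectral_centroid(tone, 16)]
```

A fused feature vector in the sense of the abstract would concatenate such audio features with the statistics extracted from the voltage and current signals.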
In the aerospace industry, gears are the most common parts of a mechanical transmission system. Gear pitting faults can cause the transmission system to fail and lead to safety disasters. Diagnosing the gear pitting condition directly from the raw vibration signal has always been a challenging problem. In this paper, a novel method named augmented deep sparse autoencoder (ADSAE) is proposed. The method can diagnose gear pitting faults from relatively few raw vibration signal data. It is based mainly on the theory of pitting fault diagnosis and combines data augmentation with the deep sparse autoencoder algorithm for the fault diagnosis of gear wear. The effectiveness of the proposed method is validated by experiments on six types of gear pitting conditions. The results show that the ADSAE method effectively increases the generalization ability and robustness of the network with very high accuracy. The method can effectively diagnose different gear pitting conditions and shows a clear trend according to the severity of gear wear faults. The results obtained by the ADSAE method are compared with those obtained by other common deep learning methods. This paper provides an important insight into gear fault diagnosis based on deep learning and has potential practical application value.
In the soft sensor field, just-in-time learning (JITL) is an effective approach to modeling nonlinear and time-varying processes. However, most similarity criteria in JITL are computed in the input space only and ignore important output information, which may lead to inaccurate construction of the relevant sample set. To solve this problem, we propose a novel supervised feature extraction method suitable for regression problems, called supervised local and non-local structure preserving projections (SLNSPP), in which both input and output information can be easily and effectively incorporated through a newly defined similarity index. SLNSPP not only retains the virtue of locality preserving projections but also prevents faraway points from becoming close after projection, which endows SLNSPP with powerful discriminating ability. These two properties of SLNSPP are desirable for JITL, as they are expected to enhance the accuracy of similar sample selection. Consequently, we present an SLNSPP-JITL framework for developing adaptive soft sensors, including a sparse learning strategy to limit the scale and update frequency of the database. Finally, two case studies are conducted with benchmark datasets to evaluate the performance of the proposed schemes. The results demonstrate the effectiveness of LNSPP and SLNSPP.
Internet traffic classification is vital to network operation and management. Traditional classification methods such as port mapping and payload analysis are becoming increasingly difficult to apply as newly emerged applications (e.g. Peer-to-Peer) use dynamic port numbers, masquerading techniques and encryption to avoid detection. This paper presents a machine learning (ML) based traffic classification scheme, which offers solutions to a variety of network activities and provides a platform for evaluating the performance of the classifiers. The impact of dataset size, feature selection, number of application types and ML algorithm selection on classification performance is analyzed and demonstrated by the following experiments: (1) Genetic algorithm based feature selection can dramatically reduce the cost without diminishing classification accuracy. (2) The chosen ML algorithms can achieve high classification accuracy. In particular, REPTree and C4.5 outperform the other ML algorithms when computational complexity and accuracy are both taken into account. (3) Larger datasets and fewer application types result in better classification accuracy. Finally, early detection using only the first several packets is proposed for real-time network activity, and preliminary results show it to be feasible.
For electroencephalogram (EEG) pattern recognition in brain computer interfaces (BCI), this paper presents a classification method based on a probabilistic neural network (PNN) with supervised learning. It applies the recognition rate of the training samples to the learning process of the network parameters. Learning vector quantization is employed to group the training samples, and a genetic algorithm (GA) is used to train the network's smoothing parameters and hidden central vectors for determining the hidden neurons. Using the standard dataset Ia of BCI Competition 2003 and comparing with other classification methods, the experimental results show that this approach achieves the best pattern recognition performance, with a classification accuracy of 93.8%, an improvement of more than 5 percentage points over the best result of the competition (88.7%). This technology provides an effective way to classify EEG in practical BCI systems.
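The decision rule of a basic probabilistic neural network can be sketched as below. The sketch deliberately simplifies the method in the abstract: it keeps every training sample as a hidden centre and uses one fixed smoothing parameter, whereas the paper selects centres with learning vector quantization and tunes the smoothing parameters with a genetic algorithm. The function name, the value of sigma, and the toy data are hypothetical.

```python
import math


def pnn_classify(sample, training_set, sigma=0.5):
    """Classify with Gaussian Parzen-window scores averaged per class.

    training_set maps each class label to a list of pattern vectors.
    Returns the winning label and the per-class scores.
    """
    scores = {}
    for label, patterns in training_set.items():
        total = 0.0
        for pattern in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(sample, pattern))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))  # pattern-layer unit
        scores[label] = total / len(patterns)                 # summation-layer unit
    return max(scores, key=scores.get), scores


# Toy two-class data (hypothetical feature vectors).
training_set = {
    "left": [[0.0, 0.0], [0.0, 1.0]],
    "right": [[5.0, 5.0], [6.0, 5.0]],
}
label, scores = pnn_classify([0.2, 0.4], training_set)
```

The smoothing parameter controls how far each training pattern's influence reaches, which is why the abstract treats it as a quantity to be learned rather than fixed by hand.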
As the fundamental infrastructure of the Internet, the optical network carries a great amount of Internet traffic, and faults can cause great financial losses. Fault location is therefore very important for the operation and maintenance of optical networks. Because of the complex relationships among network elements at the topology level, boards at the network element level, and components at the board level, precise fault location is hard for traditional methods. In recent years, machine learning, especially deep learning, has been applied to many complex problems because it can find potential non-linear mappings from inputs to outputs. In this paper, we introduce supervised machine learning and propose a complete process for fault location. First, we use data preprocessing, data annotation, and data augmentation to process the originally collected data and build a high-quality dataset. Then, two machine learning algorithms (convolutional neural networks and deep neural networks) are applied to the dataset. The evaluation on commercial optical networks shows that this process helps improve the quality of the dataset and that the two algorithms perform well on fault location.
A novel algorithm is presented for supervised inductive learning by integrating a genetic algorithm with a bottom-up induction process. The hybrid learning algorithm has been implemented in C on a personal computer (386DX/40). The performance of the algorithm has been evaluated by applying it to the 11-multiplexer problem, and the results show that the algorithm's accuracy is higher than that of the others [5, 12, 13].
A method is proposed that applies input-output clustering to reduce the number of samples in large data sets. The proposed method clusters the output data into groups and clusters the input data in accordance with the groups of output data. Then, a set of prototypes is selected from the clustered input data, and the inessential data can ultimately be discarded from the data set. The proposed method can reduce the effect of outliers because only the prototypes are used. The method is applied to reduce the data set in regression problems. Two standard synthetic data sets and three standard real-world data sets are used for evaluation. The root-mean-square errors of support vector regression models trained with the original data sets and with the corresponding instance-reduced data sets are compared. In the experiments, the proposed method provides good results on the reduction and reconstruction of the standard synthetic and real-world data sets. The numbers of instances of the synthetic data sets are decreased by 25%-69%. The reduction rates for the real-world data sets of automobile miles per gallon and the 1990 census in CA are 46% and 57%, respectively. The reduction rate of 96% for the electrocardiogram (ECG) data set is very good and results from the redundant and periodic nature of ECG signals. For all of the data sets, the regression results are similar to those from the corresponding original data sets. Therefore, the regression performance of the proposed method is good while only a fraction of the data is needed in the training process.
Soft sensing has been widely used in the chemical industry to build online monitors for variables that cannot be measured online or can be measured online only at high cost. One inherent difficulty is the insufficiency of training samples, because labeled data are limited. In addition, the traditional soft-sensing structure has no online correction mechanism, so the forecast may be incorrect if the working condition changes. In this work, a semi-supervised learning (SSL) method is proposed to build the soft-sensing model using unlabeled data. Meanwhile, an online correction mechanism is proposed to establish a soft-sensing approach. The mechanism estimates the input variables at each step with a prediction model and calibrates the output variables with a compensation model. The experimental results show that the proposed method has better prediction accuracy and generalization ability than other approaches.
The coronavirus disease 2019 (COVID-19) has severely disrupted both human life and the health care system. Timely diagnosis and treatment have become increasingly important; however, the distribution and size of lesions vary widely among individuals, making it challenging to diagnose the disease accurately. This study proposes a deep-learning disease diagnosis model based on weakly supervised learning and clustering visualization (W_CVNet) that fuses classification with segmentation. First, the data are preprocessed. An optimizable weakly supervised segmentation preprocessing method (O-WSSPM) is used to remove redundant data and solve the category imbalance problem. Second, a deep-learning fusion method is used for feature extraction and classification. A dual asymmetric complementary bilinear feature extraction method (D-CBM) is used to fully extract complementary features, which solves the problem of insufficient feature extraction by a single deep learning network. Third, an unsupervised learning method based on Fuzzy C-Means (FCM) clustering is used to segment and visualize COVID-19 lesions, enabling physicians to accurately assess lesion distribution and disease severity. In this study, 5-fold cross-validation was used, and the results showed that the network had an average classification accuracy of 85.8%, outperforming six recent advanced classification models. W_CVNet can effectively assist physicians with automated diagnosis to determine whether the disease is present and, in the case of COVID-19 patients, to further predict the area of the lesion.
This study proposes an architecture for the prediction of extremist human behaviour from projected suicide bombings. By linking 'dots' of police data comprising scattered information on people, groups, logistics, locations, communication, and spatiotemporal characteristics across different social media groups, the proposed architecture will generate useful information. This information will, in turn, help the police both in predicting potential terrorist events and in investigating previous events. Furthermore, the architecture will aid in the identification of criminals and their associates and handlers. Terrorism is psychological warfare, which, in the broadest sense, can be defined as the use of deliberate violence for economic, political or religious purposes. In this study, a supervised learning-based approach was adopted to develop the proposed architecture. The dataset was prepared from the suicide bomb blast data of Pakistan obtained from the South Asia Terrorism Portal (SATP). When the proposed architecture was simulated, the supervised learning-based classifiers naïve Bayes and Hoeffding Tree reached 72.17% accuracy. One additional benefit of this study is the ability to predict the target audience of potential suicide bomb blasts, which may be used to eliminate future threats or, at least, minimise the number of casualties and other property losses.
We present an integrated approach based on supervised and unsupervised learning techniques to improve the accuracy of six predictive models. The models were developed to predict the outcome of a course of tuberculosis treatment, and their accuracy needs to be improved because they are not as precise as required. The integrated supervised and unsupervised learning method (ISULM) is proposed as a new way to improve model accuracy. A dataset of 6450 Iranian TB patients under DOTS therapy was used first to select the significant predictors and then to develop six predictive models using decision tree, Bayesian network, logistic regression, multilayer perceptron, radial basis function, and support vector machine algorithms. The developed models were integrated with k-means clustering analysis to calculate a more accurate predicted outcome of the tuberculosis treatment course. The results were then evaluated to compare prediction accuracy before and after applying ISULM. Recall, precision, F-measure, and ROC area were also used to assess the validity of the models, along with the percentage change showing how much the models differ before and after ISULM. ISULM improved the prediction accuracy of all applied classifiers by between 4% and 10%. The largest and smallest improvements in prediction accuracy were shown by logistic regression and support vector machine, respectively. Pre-learning by k-means clustering, which relocates the objects and puts similar cases in the same group, can improve classification accuracy when supervised and unsupervised learning are integrated.
Funding (two-wheeled vehicle detection study): Supported by the Natural Science Foundation Project of Fujian Province, China (Grant No. 2023J011439 and No. 2019J01859).
Funding (human action recognition study): This research was funded by the National Natural Science Foundation of China (21878124, 31771680 and 61773182).
Funding (physics-constrained indirect supervised learning study): Partially funded by the National Natural Science Foundation of China (Grants 51520105005 and U1663208).
Funding (transition prediction study): Supported by the National Key Laboratory of Science and Technology on Aerodynamic Design and Research Foundation, China.
文摘Transition prediction has always been a frontier issue in the field of aerodynamics.A supervised learning model with probability interpretation for transition judgment based on experimental data was developed in this paper.It solved the shortcomings of the point detection method in the experiment,that which was often only one transition point could be obtained,and comparison of multi-point data was necessary.First,the Variable-Interval Time Average(VITA)method was used to transform the fluctuating pressure signal measured on the airfoil surface into a sequence of states which was described by Markov chain model.Second,a feature vector consisting of one-step transition matrix and its stationary distribution was extracted.Then,the Hidden Markov Model(HMM)was used to pre-classify the feature vectors marked using the traditional Root Mean Square(RMS)criteria.Finally,a classification model with probability interpretation was established,and the cross-validation method was used for model validation.The research results show that the developed model is effective and reliable,and it has strong Reynolds number generalization ability.The developed model was theoretically analyzed in depth,and the effect of parameters on the model was studied in detail.Compared with the traditional RMS criterion,a reasonable transition zone can be obtained using the developed classification model.In addition,the developed model does not require comparison of multi-point data.The developed supervised learning model provides new ideas for the transition detection in flight experiments and other experiments.
Abstract: Rare labeled data are difficult to recognize with conventional methods in radar emitter recognition. To solve this problem, an optimized cooperative semi-supervised learning method for radar emitter recognition based on a small amount of labeled data is developed. First, a small amount of labeled data is randomly sampled using the bootstrap method, and the loss functions of three common deep learning networks are improved: the uniform distribution and the cross-entropy function are combined to reduce the overconfidence of softmax classification. Subsequently, the dataset obtained after sampling is used to train the three improved networks so as to build the initial model. In addition, the unlabeled data are preliminarily screened through dynamic time warping (DTW) and then fed into the previously trained initial model for judgment. If the judgments of two or more networks are consistent, the unlabeled data are labeled and added to the labeled dataset. Lastly, the three network models are trained on the expanded labeled dataset, and the final model is built. As the simulation results show, the semi-supervised learning method adopted in this paper can exploit a small amount of labeled data and essentially achieves the accuracy of recognition with labeled data.
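Two of the steps above lend themselves to a short sketch: mixing a uniform distribution into the cross-entropy target, and labeling a sample only when at least two of the three networks agree. The abstract does not give the exact loss, so the label-smoothing form below, the `eps` parameter, and all function names are assumptions for illustration.

```python
import math


def smoothed_cross_entropy(probs, target, eps=0.1):
    """Cross-entropy against a one-hot target mixed with the uniform distribution.

    probs must be strictly positive softmax outputs; eps is a hypothetical
    mixing weight (eps=0 recovers ordinary cross-entropy).
    """
    k = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1.0 - eps) * (1.0 if i == target else 0.0) + eps / k
        loss -= q * math.log(p)
    return loss


def agreed_label(predictions):
    """Return the label shared by at least two of three networks, else None."""
    a, b, c = predictions
    if a == b or a == c:
        return a
    if b == c:
        return b
    return None


def pseudo_label(unlabeled, predictions_per_sample):
    """Keep only the samples on which a majority label exists."""
    labeled = []
    for x, preds in zip(unlabeled, predictions_per_sample):
        y = agreed_label(preds)
        if y is not None:
            labeled.append((x, y))
    return labeled
```

Samples returned by `pseudo_label` would then be appended to the labeled set before the final round of training; the DTW screening step is not shown here.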
Abstract: Log-linear models and, more recently, neural network models used for supervised relation extraction require substantial amounts of training data and time, limiting their portability to new relations and domains. To this end, we propose a training representation based on the dependency paths between entities in a dependency tree, which we call lexicalized dependency paths (LDPs). We show that this representation is fast, efficient and transparent. We further propose representations utilizing entity types and their subtypes to refine our model and alleviate the data sparsity problem. We apply lexicalized dependency paths to supervised learning using the ACE corpus and show that it can achieve a performance level similar to other state-of-the-art methods and even surpass them on several categories.
Abstract: To address automatic defect detection and process control in welding and arc additive manufacturing, the paper monitors the current, voltage, audio, and other data during the welding process and extracts the minimum value, standard deviation, and deviation from the voltage and current data. It extracts spectral features such as root mean square, spectral centroid, and zero-crossing rate from the audio data, fuses the features extracted from the multiple sensor signals, and establishes several supervised and unsupervised machine learning models, which are used to detect abnormalities in the welding process. The experimental results show that the established models have high accuracy: among the supervised models, AdaBoost reaches a balanced accuracy of 0.957, and the unsupervised Isolation Forest model reaches a balanced accuracy of 0.909.
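The audio and electrical features named above can be computed with a few lines of standard-library Python. This is only a sketch under assumed definitions (population standard deviation, magnitude-weighted centroid over the non-negative frequency bins, a naive DFT suitable for short frames); the paper's exact feature definitions are not given in the abstract.

```python
import cmath
import math
import statistics


def electrical_features(samples):
    """Minimum and (population) standard deviation of a voltage or current trace."""
    return {"min": min(samples), "std": statistics.pstdev(samples)}


def rms(signal):
    """Root mean square of an audio frame."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency, using a naive O(n^2) DFT."""
    n = len(signal)
    num = den = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        mag = abs(coeff)
        num += (k * sample_rate / n) * mag
        den += mag
    return num / den if den else 0.0


# Hypothetical test frame: a 2 Hz tone sampled at 16 Hz for one second.
tone = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
fused = [rms(tone), spectral_centroid(tone, 16), zero_crossing_rate(tone)]
```

The `fused` list stands in for the fused feature vector; in practice it would be concatenated with the electrical features and passed to a classifier or anomaly detector.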
Funding: Supported by the Natural Science Foundation of China (No. 51675089).
Abstract: In the aerospace industry, gears are the most common parts of a mechanical transmission system. Gear pitting faults can cause the transmission system to fail and lead to safety disasters. Diagnosing the gear pitting condition directly from the raw vibration signal has always been a challenging problem. In this paper, a novel method named augmented deep sparse autoencoder (ADSAE) is proposed. The method can diagnose gear pitting faults from relatively few raw vibration signal data. It is based on the theory of pitting fault diagnosis and combines data augmentation with the deep sparse autoencoder algorithm for the fault diagnosis of gear wear. The effectiveness of the proposed method is validated by experiments on six types of gear pitting conditions. The results show that the ADSAE method can effectively increase network generalization ability and robustness with very high accuracy. The method can effectively diagnose different gear pitting conditions and shows a clear trend according to the severity of gear wear faults. The results obtained by the ADSAE method are compared with those obtained by other common deep learning methods. This paper provides insight into gear fault diagnosis based on deep learning and has potential practical application value.
Funding: Supported by the National Natural Science Foundation of China (61273160) and the Fundamental Research Funds for the Central Universities (14CX06067A, 13CX05021A).
Abstract: In the soft sensor field, just-in-time learning (JITL) is an effective approach to modeling nonlinear and time-varying processes. However, most similarity criteria in JITL are computed in the input space only and ignore important output information, which may lead to inaccurate construction of the relevant sample set. To solve this problem, we propose a novel supervised feature extraction method suitable for regression problems, called supervised local and non-local structure preserving projections (SLNSPP), in which both input and output information can be easily and effectively incorporated through a newly defined similarity index. SLNSPP not only retains the virtue of locality preserving projections but also prevents faraway points from becoming close after projection, which endows SLNSPP with powerful discriminating ability. These two properties of SLNSPP are desirable for JITL, as they are expected to enhance the accuracy of similar sample selection. Consequently, we present an SLNSPP-JITL framework for developing adaptive soft sensors, including a sparse learning strategy to limit the scale and the update frequency of the database. Finally, two case studies are conducted with benchmark datasets to evaluate the performance of the proposed schemes. The results demonstrate the effectiveness of LNSPP and SLNSPP.
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2005AA121620, 2006AA01Z232), the Zhejiang Provincial Natural Science Foundation of China (No. Y1080935), and the Research Innovation Program for Graduate Students in Jiangsu Province (No. CX07B_ 110zF).
Abstract: Internet traffic classification is vital to network operation and management. Traditional classification methods such as port mapping and payload analysis are becoming increasingly difficult as newly emerged applications (e.g. Peer-to-Peer) use dynamic port numbers, masquerading techniques and encryption to avoid detection. This paper presents a machine learning (ML) based traffic classification scheme, which offers solutions to a variety of network activities and provides a platform for evaluating classifier performance. The impact of dataset size, feature selection, number of application types and ML algorithm selection on classification performance is analyzed and demonstrated by the following experiments: (1) Genetic algorithm based feature selection can dramatically reduce the cost without diminishing classification accuracy. (2) The chosen ML algorithms can achieve high classification accuracy. In particular, REPTree and C4.5 outperform the other ML algorithms when computational complexity and accuracy are both taken into account. (3) Larger datasets and fewer application types result in better classification accuracy. Finally, early detection with only the first several packets is proposed for real-time network activity, and preliminary results show it to be feasible.
Funding: Supported by the National Natural Science Foundation of China (No. 30570485) and the Shanghai "Chen Guang" Project (No. 09CG69).
Abstract: Addressing electroencephalogram (EEG) pattern recognition in brain computer interface (BCI), this paper presents a classification method based on a probabilistic neural network (PNN) with supervised learning. It applies the recognition rate of training samples to the learning process of the network parameters. Learning vector quantization is employed to group the training samples, and a genetic algorithm (GA) is used to train the network's smoothing parameters and hidden central vectors for determining the hidden neurons. Using the standard dataset I(a) of BCI Competition 2003 and comparing with other classification methods, the experimental results show that this approach gives the best pattern recognition performance: the classification accuracy reaches 93.8%, an improvement of over 5% on the best result (88.7%) of the competition. This technology provides an effective approach to EEG classification in practical BCI systems.
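For readers unfamiliar with probabilistic neural networks, the decision rule can be sketched as a Gaussian Parzen-window score per class. The sketch below assumes a single shared smoothing parameter `sigma` and hand-picked pattern-layer centres; in the paper these are obtained by learning vector quantization and a genetic algorithm, neither of which is shown. All names and values are hypothetical.

```python
import math


def pnn_classify(x, prototypes, sigma):
    """Classify x with a basic probabilistic neural network.

    prototypes maps each class label to a list of centre vectors (pattern
    layer); sigma is the Gaussian smoothing parameter. Returns the winning
    label and the per-class scores (summation layer outputs).
    """
    scores = {}
    for label, centres in prototypes.items():
        total = 0.0
        for c in centres:
            d2 = sum((a - b) ** 2 for a, b in zip(x, c))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        # Averaging removes the bias toward classes with more centres.
        scores[label] = total / len(centres)
    return max(scores, key=scores.get), scores


# Hypothetical two-class example with two-dimensional features.
centres = {"class_0": [[0.0, 0.0], [0.0, 1.0]], "class_1": [[5.0, 5.0]]}
label, scores = pnn_classify([0.2, 0.3], centres, sigma=1.0)
```

A smaller `sigma` makes each centre's influence more local, which is why tuning the smoothing parameters matters for recognition rate.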
Abstract: As the fundamental infrastructure of the Internet, the optical network carries a great amount of Internet traffic, and faults can cause great financial losses. Fault location is therefore very important for the operation and maintenance of optical networks. Because of the complex relationships among network elements at the topology level, boards at the network element level, and components at the board level, precise fault location is hard for traditional methods. In recent years, machine learning, especially deep learning, has been applied to many complex problems, because it can find potential non-linear mappings from inputs to outputs. In this paper, we use supervised machine learning to propose a complete process for fault location. First, we apply data preprocessing, data annotation, and data augmentation to the originally collected data to build a high-quality dataset. Then, two machine learning algorithms (convolutional neural networks and deep neural networks) are applied to the dataset. Evaluation on commercial optical networks shows that this process helps improve the quality of the dataset, and that the two algorithms perform well on fault location.
Abstract: A novel algorithm is presented for supervised inductive learning by integrating a genetic algorithm with a bottom-up induction process. The hybrid learning algorithm has been implemented in C on a personal computer (386DX/40). The performance of the algorithm has been evaluated by applying it to the 11-multiplexer problem, and the results show that the algorithm's accuracy is higher than that of the others [5, 12, 13].
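The 11-multiplexer benchmark mentioned above is small enough to state in code: three address bits select one of eight data bits, and the learner must reproduce the selected bit. The sketch below assumes the common convention that the address bits come first, most significant bit first; the paper may order them differently, and the function names are illustrative.

```python
from itertools import product


def multiplexer11(bits):
    """Target concept: bits[0:3] form an address that selects one of bits[3:11]."""
    if len(bits) != 11:
        raise ValueError("expected 11 bits")
    address = bits[0] * 4 + bits[1] * 2 + bits[2]
    return bits[3 + address]


def accuracy(hypothesis):
    """Fraction of all 2**11 = 2048 inputs on which a hypothesis matches the target."""
    cases = list(product((0, 1), repeat=11))
    correct = sum(hypothesis(b) == multiplexer11(b) for b in cases)
    return correct / len(cases)
```

A learned rule set can be scored by wrapping it in a function of the 11 bits and passing it to `accuracy`; a constant guess scores 0.5, which gives a baseline for comparison.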
Funding: Supported by the Chiang Mai University Research Fund under contract number T-M5744.
Abstract: A method that uses input-output clustering to reduce the number of samples in large data sets is proposed. The proposed method clusters the output data into groups and clusters the input data in accordance with the groups of output data. Then, a set of prototypes is selected from the clustered input data, and the inessential data can ultimately be discarded from the data set. The proposed method can reduce the effect of outliers because only the prototypes are used. The method is applied to reduce the data set in regression problems. Two standard synthetic data sets and three standard real-world data sets are used for evaluation. The root-mean-square errors of support vector regression models trained with the original data sets and with the corresponding instance-reduced data sets are compared. In the experiments, the proposed method provides good results on the reduction and the reconstruction of the standard synthetic and real-world data sets. The numbers of instances of the synthetic data sets are decreased by 25%-69%. The reduction rates for the real-world data sets of automobile miles per gallon and the 1990 census in CA are 46% and 57%, respectively. The reduction rate of 96% for the electrocardiogram (ECG) data set is very good, owing to the redundant and periodic nature of ECG signals. For all of the data sets, the regression results are similar to those from the corresponding original data sets. The regression performance of the proposed method is therefore good while only a fraction of the data is needed in the training process.
Funding: The National Natural Science Foundation of China (Nos. 61374110 and 61074060) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20120073110017).
Abstract: Soft sensing has been widely used in the chemical industry to build online monitors of variables that cannot be measured online, or can be measured online only at high cost. One inherent difficulty is the insufficiency of training samples, because labeled data are limited. In addition, the traditional soft-sensing structure has no online correction mechanism, so the forecast may be incorrect if the working condition changes. In this work, a semi-supervised learning (SSL) method is proposed to build the soft-sensing model by making use of unlabeled data. An online correction mechanism is also proposed to establish the soft-sensing approach. The mechanism estimates the input variables at each step with a prediction model and calibrates the output variables with a compensation model. The experimental results show that the proposed method has better prediction accuracy and generalization ability than other approaches.
Funding: Funded by the Open Foundation of Anhui Engineering Research Center of Intelligent Perception and Elderly Care, Chuzhou University (No. 2022OPA03), the Higher Education Natural Science Foundation of Anhui Province (No. KJ2021B01), and the Innovation Team Projects of Universities in Guangdong (No. 2022KCXTD057).
Abstract: The coronavirus disease 2019 (COVID-19) has severely disrupted both human life and the health care system. Timely diagnosis and treatment have become increasingly important; however, the distribution and size of lesions vary widely among individuals, making it challenging to diagnose the disease accurately. This study proposed a deep-learning disease diagnosis model based on weakly supervised learning and clustering visualization (W_CVNet) that fused classification with segmentation. First, the data were preprocessed. An optimizable weakly supervised segmentation preprocessing method (O-WSSPM) was used to remove redundant data and solve the category imbalance problem. Second, a deep-learning fusion method was used for feature extraction and classification. A dual asymmetric complementary bilinear feature extraction method (D-CBM) was used to fully extract complementary features, which solved the problem of insufficient feature extraction by a single deep learning network. Third, an unsupervised learning method based on Fuzzy C-Means (FCM) clustering was used to segment and visualize COVID-19 lesions, enabling physicians to accurately assess lesion distribution and disease severity. In this study, 5-fold cross-validation was used, and the results showed that the network had an average classification accuracy of 85.8%, outperforming six recent advanced classification models. W_CVNet can effectively assist physicians in automated diagnosis to determine whether the disease is present and, for COVID-19 patients, to further predict the lesion area.
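The Fuzzy C-Means step above can be illustrated on a deliberately simplified case: one-dimensional intensity values rather than full images. The sketch uses the standard FCM membership and centre updates; the fuzzifier `m=2`, the evenly spaced initial centres, and the function name are assumptions and not the settings used for W_CVNet.

```python
def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, iters=100, tol=1e-9):
    """Standard FCM on scalar values; requires n_clusters >= 2 and m > 1.

    Returns the cluster centres and the membership rows (one per value)
    computed in the final iteration before the last centre update.
    """
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centres spread evenly over the data range.
    centres = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centres]
            if 0.0 in dists:
                # A value sitting exactly on a centre belongs fully to it.
                hit = dists.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(n_clusters)]
            else:
                row = [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
            memberships.append(row)
        new_centres = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            new_centres.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, memberships


# Hypothetical intensities: a dark background group and a bright lesion-like group.
centres, memberships = fuzzy_c_means_1d([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
```

Thresholding each membership row (for example, taking the cluster with the largest membership) turns the soft assignment into a segmentation mask, while the memberships themselves can be displayed as a confidence map.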
Abstract: This study proposes an architecture for the prediction of extremist human behaviour from projected suicide bombings. By linking 'dots' of police data comprising scattered information on people, groups, logistics, locations, communication, and spatiotemporal characteristics on different social media groups, the proposed architecture will generate useful information. This information will, in turn, help the police both in predicting potential terrorist events and in investigating previous events. Furthermore, this architecture will aid in the identification of criminals and their associates and handlers. Terrorism is psychological warfare, which, in the broadest sense, can be defined as the use of deliberate violence for economic, political or religious purposes. In this study, a supervised learning-based approach was adopted to develop the proposed architecture. The dataset was prepared from the suicide bomb blast data of Pakistan obtained from the South Asia Terrorism Portal (SATP). When the proposed architecture was simulated, the supervised learning-based classifiers naïve Bayes and Hoeffding Tree reached 72.17% accuracy. An additional benefit of this study is the ability to predict the target audience of potential suicide bomb blasts, which may be used to eliminate future threats or, at least, minimise the number of casualties and other property losses.
Abstract: We present an integrated approach based on supervised and unsupervised learning techniques to improve the accuracy of six predictive models. The models were developed to predict the outcome of a tuberculosis treatment course, and their accuracy needs to be improved because they are not as precise as necessary. The integrated supervised and unsupervised learning method (ISULM) is proposed as a new way to improve model accuracy. A dataset of 6450 Iranian TB patients under DOTS therapy was used first to select the significant predictors and then to develop six predictive models using decision tree, Bayesian network, logistic regression, multilayer perceptron, radial basis function, and support vector machine algorithms. The developed models were integrated with k-means clustering analysis to calculate a more accurate predicted outcome of the tuberculosis treatment course. The results were then evaluated to compare prediction accuracy before and after applying ISULM. Recall, precision, F-measure, and ROC area were also used to assess the validity of the models, together with the percentage change, which shows how much the models differ before and after ISULM. ISULM improved the prediction accuracy of all applied classifiers, by between 4% and 10%. The largest and smallest improvements in prediction accuracy were shown by logistic regression and support vector machine, respectively. Pre-learning by k-means clustering to relocate the objects and put similar cases in the same group can improve classification accuracy when integrating supervised and unsupervised learning.