Weakly Supervised Semantic Segmentation (WSSS), which relies only on image-level labels, has attracted significant attention for its cost-effectiveness and scalability. Existing methods mainly enhance inter-class distinctions and employ data augmentation to mitigate semantic ambiguity and reduce spurious activations. However, they often neglect the complex contextual dependencies among image patches, resulting in incomplete local representations and limited segmentation accuracy. To address these issues, we propose the Context Patch Fusion with Class Token Enhancement (CPF-CTE) framework, which exploits contextual relations among patches to enrich feature representations and improve segmentation. At its core, the Contextual-Fusion Bidirectional Long Short-Term Memory (CF-BiLSTM) module captures spatial dependencies between patches and enables bidirectional information flow, yielding a more comprehensive understanding of spatial correlations. This strengthens feature learning and segmentation robustness. Moreover, we introduce learnable class tokens that dynamically encode and refine class-specific semantics, enhancing discriminative capability. By effectively integrating spatial and semantic cues, CPF-CTE produces richer and more accurate representations of image content. Extensive experiments on PASCAL VOC 2012 and MS COCO 2014 validate that CPF-CTE consistently surpasses prior WSSS methods.
Most Convolutional Neural Network (CNN) interpretation techniques visualize only the dominant cues that the model relies on, but there is no guarantee that these represent all the evidence the model uses for classification. This limitation becomes critical when hidden secondary cues, potentially more meaningful than the visualized ones, remain undiscovered. This study introduces CasCAM (Cascaded Class Activation Mapping) to address this fundamental limitation through counterfactual reasoning. By asking "if this dominant cue were absent, what other evidence would the model use?", CasCAM progressively masks the most salient features and systematically uncovers the hierarchy of classification evidence hidden beneath them. Experimental results demonstrate that CasCAM effectively discovers the full spectrum of reasoning evidence and can be universally applied with nine existing interpretation methods.
Accurate prediction of flood events is important for flood control and risk management. Machine learning techniques have contributed greatly to advances in flood prediction, and existing studies have mainly focused on predicting flood resource variables using single or hybrid machine learning techniques. However, class-based flood prediction, which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies, has rarely been investigated. This study proposed an approach for predicting flood regime metrics and event classes that couples machine learning algorithms with clustering-deduced membership degrees. Five algorithms were adopted for this exploration. Results showed that the class membership degrees accurately determined event classes, with class hit rates up to 100%, compared with the four classes clustered from nine regime metrics. The nonlinear algorithms (Multiple Linear Regression, Random Forest, and least squares-Support Vector Machine) outperformed the linear techniques (Multiple Linear Regression and Stepwise Regression) in predicting flood regime metrics. The proposed approach predicted flood event classes well, with average class hit rates of 66.0%-85.4% and 47.2%-76.0% in the calibration and validation periods, respectively, particularly for slow and late flood events. The predictive capability of the proposed approach for flood regime metrics and classes was considerably stronger than that of the hydrological modeling approach.
In image analysis, high-precision semantic segmentation predominantly relies on supervised learning. Despite significant advancements driven by deep learning techniques, challenges such as class imbalance and dynamic performance evaluation persist. Traditional weighting methods, often based on pre-computed class counts, tend to overemphasize certain classes while neglecting others, particularly rare sample categories. Approaches like focal loss and other rare-sample segmentation techniques introduce multiple hyperparameters that require manual tuning, leading to increased experimental costs due to their instability. This paper proposes a novel CAWASeg framework to address these limitations. Our approach leverages Grad-CAM to generate class activation maps, identifying key feature regions that the model focuses on during decision-making. We introduce a Comprehensive Segmentation Performance Score (CSPS) to dynamically evaluate model performance by converting these activation maps into pseudo masks and comparing them with the ground truth. Additionally, we design two adaptive weights for each class: a Basic Weight (BW) and a Ratio Weight (RW), which the model adjusts during training based on real-time feedback. Extensive experiments on the COCO-Stuff, CityScapes, and ADE20k datasets demonstrate that our CAWASeg framework significantly improves segmentation performance for rare sample categories while enhancing overall segmentation accuracy. The proposed method offers a robust and efficient solution for addressing class imbalance in semantic segmentation tasks.
Legal case classification involves the categorization of legal documents into predefined categories, which facilitates legal information retrieval and case management. However, real-world legal datasets often suffer from class imbalances due to the uneven distribution of case types across legal domains. This leads to biased model performance, in the form of high accuracy for overrepresented categories and underperformance for minority classes. To address this issue, in this study, we propose a data augmentation method that selectively masks unimportant terms within a document while preserving key terms from the perspective of the legal domain. This approach enhances data diversity and improves the generalization capability of conventional models. Our experiments demonstrate consistent improvements achieved by the proposed augmentation strategy in terms of accuracy and F1 score across all models, validating the effectiveness of the proposed method in legal case classification.
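The masking idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_augment`, the `[MASK]` placeholder, the `mask_prob` parameter, and the hand-picked set of key terms are all assumptions made for illustration. The abstract does not say how key terms are identified or how masking is scheduled.

```python
import random


def mask_augment(tokens, key_terms, mask_prob=0.3, mask_token="[MASK]", seed=None):
    """Return a copy of `tokens` in which non-key terms are randomly masked.

    Hypothetical sketch: `key_terms` is assumed to be a set of lowercase,
    domain-important words that must never be masked.
    """
    rng = random.Random(seed)
    augmented = []
    for tok in tokens:
        if tok.lower() in key_terms:
            # Key legal terms are always preserved.
            augmented.append(tok)
        elif rng.random() < mask_prob:
            # Unimportant terms are masked with probability `mask_prob`.
            augmented.append(mask_token)
        else:
            augmented.append(tok)
    return augmented


if __name__ == "__main__":
    sentence = "The court held that the contract was void".split()
    print(mask_augment(sentence, {"court", "contract", "void"}, mask_prob=0.5, seed=0))
```

Calling the function several times with different seeds yields several variants of one document, which is one plausible way such augmentation could add diversity for minority classes.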
The high polymorphism of major histocompatibility complex class Ⅱ (MHC-Ⅱ) alleles and limited immunopeptidomic data hinder pan-species epitope prediction. In this study, leveraging the predictive power of AlphaFold (AF) and the conserved structural features of the core region of MHC-Ⅱ-binding peptides, derived from a comprehensive analysis of MHC-Ⅱ structure data in the PDB database, we developed a new tool, AF-prediction (AF-pred), with explicit quantitative criteria for MHC-Ⅱ-restricted epitope prediction. We validated AF-pred across human, porcine, bovine, and bat MHC-Ⅱ molecules through large-scale in silico analyses using known immunopeptidome datasets (1000 positive and 1000 negative antigenic peptides), together with in vitro binding assays and crystallographic characterization of newly predicted epitopes. Using uncharacterized bat MHC-Ⅱ structures, we demonstrated that AF-pred's amino-acid interaction prediction underpins its pan-prediction capability and the underlying rationale of the method. Conversely, this characteristic limits the prediction of atypical MHC-Ⅱ peptide-binding modes. Compared with sequence-based tools, AF-pred demonstrates enhanced cross-species MHC-Ⅱ binding prediction, with higher accuracy and interpretability, and further reveals that iterative AF updates improve AF-pred performance. AF-pred has the potential to facilitate the development of novel T-cell epitope vaccines and advance the "One Health" initiative.
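Safe automatic lane changing is key to autonomous driving. To accurately identify the lane-change state of moving vehicles and ensure driving safety, a vehicle lane-change recognition model based on a multi-class support vector machine (Multi-class SVM) is designed. Vehicle trajectory data for US Highway 101 are selected from the NGSIM dataset and classified, and the lane-change process is divided into a car-following stage, a lane-change preparation stage, and a lane-change execution stage. Grid search combined with particle swarm optimization (Grid Search-PSO) is used to tune the penalty parameter C and the kernel parameter g of the SVM model. The multi-class SVM lane-change recognition model is then trained and tested on the sample data, reaching a test accuracy of 97.68%. The results show that the model identifies vehicle behavior states during lane changing well and provides support for research on lane-change stages.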
To address the limitations of rapid fault diagnosis for blast furnaces, a novel strategy based on a cost-conscious least squares support vector machine (LS-SVM) is proposed. First, modified discrete particle swarm optimization is applied to optimize the feature selection and the LS-SVM parameters. Second, a cost-conscious formula is presented for the fitness function, which accounts in detail for training time, recognition accuracy, and feature selection. The CLS-SVM algorithm is presented to improve the performance of the LS-SVM classifier. The new method selects the best fault features in much shorter time, uses fewer support vectors, and generalizes better in blast furnace fault diagnosis. Third, a gradual change binary tree is established for blast furnace fault diagnosis. It is a multi-class classification method based on a center-of-gravity formula for cluster distance. A gradual change classification percentage is used to select samples randomly. The proposed method raises the speed of diagnosis, optimizes the classification accuracy, and has good generalization ability for blast furnace fault diagnosis.
Hierarchical Support Vector Machine (H-SVM) is faster in training and classification than other common multi-class SVMs such as "1-V-R" and "1-V-1". In this paper, a new multi-class fault diagnosis algorithm based on H-SVM is proposed and applied to an aero-engine. Before SVM training, the training data are first clustered according to their class-center Euclidean distances in some feature spaces. Samples that are close to one another are assigned to the same sub-classes for training, which gives the H-SVM a reasonable hierarchical structure and good generalization performance. Instead of the common C-SVM, the ν-SVM is selected as the binary classifier, in which the parameter ν varies only from 0 to 1 and can be determined more easily. The simulation results show that the designed H-SVMs can quickly diagnose multi-class single faults and combination faults of the gas path components of an aero-engine. The fault classifiers have good diagnosis accuracy and remain robust even when the measurement inputs are disturbed by noise.
For strip steel surface defect samples, a multi-class classification method was proposed based on enhanced least squares twin support vector machines (ELS-TWSVMs) and a binary tree. First, a pruning-region sample-center method with an adjustable pruning scale was used to prune the data samples. This method could reduce the classifier's training time and testing time. Second, ELS-TWSVM was proposed to classify the data samples. By introducing an error variable contribution parameter and a weight parameter, ELS-TWSVM could restrain the impact of noise samples and achieve better classification accuracy. Finally, multi-class classification algorithms of ELS-TWSVM were proposed by combining ELS-TWSVM with a complete binary tree. Experiments were carried out on two-dimensional datasets and strip steel surface defect datasets. The experiments showed that the multi-class classification methods of ELS-TWSVM had higher classification speed and accuracy for datasets with large-scale, unbalanced, and noisy samples.
The basic idea of multi-class classification is decomposition: a multi-class classification task is split into several binary classification tasks. To improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One vs. Rest split to decompose the multi-class classification task into binary classification tasks. K-means is then used to down-sample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled dataset is passed to MTRL, and multiple binary classifiers are trained together. With the help of MTRL, this method can use the inter-task association to train the model and improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris dataset, Wine dataset, Multiple Features dataset, Wireless Indoor Localization dataset, and Avila dataset.
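The One vs. Rest decomposition mentioned in the abstract above can be sketched as follows. This is only the generic decomposition step, under the assumption that each binary task labels one class as +1 and all others as -1. The K-means down-sampling and the MTRL joint training are not reproduced here, and the helper names `one_vs_rest_tasks` and `predict_from_scores` are hypothetical.

```python
def one_vs_rest_tasks(labels):
    """Split a multi-class label list into one binary label list per class.

    For class c, a sample is labeled +1 if it belongs to c and -1 otherwise.
    """
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def predict_from_scores(scores):
    """Pick the class whose binary classifier gives the highest score.

    `scores` maps each class to the real-valued output of its (assumed)
    binary classifier for a single sample.
    """
    return max(scores, key=scores.get)


if __name__ == "__main__":
    tasks = one_vs_rest_tasks(["setosa", "versicolor", "virginica", "setosa"])
    for cls, binary_labels in tasks.items():
        print(cls, binary_labels)
    print(predict_from_scores({"setosa": -0.2, "versicolor": 0.7, "virginica": 0.1}))
```

A problem with K classes therefore yields K binary tasks, which is what makes it natural to treat them as related tasks for joint training.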
Focusing on strip steel surface defect classification, a novel support vector machine with an adjustable hyper-sphere (AHSVM) is formulated, and a new multi-class classification method is proposed. Derived from support vector data description, AHSVM uses a hyper-sphere to solve the classification problem. AHSVM obeys two principles: margin maximization and inner-class dispersion minimization. Moreover, the hyper-sphere of AHSVM is adjustable, which makes the final classification hyper-sphere optimal for the training dataset. In addition, AHSVM is combined with a binary tree to solve multi-class classification of steel surface defects. A scheme for pruning samples in the mapped feature space is provided, which reduces the number of training samples while maintaining classification accuracy and thus improves classification speed. Finally, testing experiments are carried out on eight types of strip steel surface defects. Experimental results show that the multi-class AHSVM classifier achieves satisfactory classification accuracy and efficiency.
Based on the framework of support vector machines (SVM) using the one-against-one (OAO) strategy, a new multi-class kernel method based on a directed acyclic graph (DAG) and probabilistic distance is proposed to raise multi-class classification accuracy. The topology of the DAG is constructed by rearranging the sequence of nodes in the graph. A DAG is equivalent to operating SVMs on a list in a guided order, and the classification performance depends on the sequence of nodes in the graph. The Jeffries-Matusita distance (JMD) is introduced to estimate the separability of each class, and the implementation list is initialized with all classes organized in a certain sequence. To verify the effectiveness of the proposed method, numerical analysis is conducted on UCI data and hyperspectral data. Comparative studies using standard OAO and DAG classification methods are also conducted, and the results illustrate the better performance and higher accuracy of the proposed JMD-DAG method.
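As a rough illustration of how the Jeffries-Matusita distance measures class separability, the sketch below computes it for univariate Gaussian class models through the Bhattacharyya distance B, using the common form JM = 2(1 - exp(-B)), which ranges from 0 (identical classes) to 2 (fully separable). The Gaussian assumption, the one-dimensional features, and the function names are simplifications for illustration. The abstract does not specify how the distance is estimated or how the DAG list is ordered from it.

```python
import math
from itertools import combinations


def bhattacharyya_gauss(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def jm_distance(mu1, var1, mu2, var2):
    """Jeffries-Matusita distance in the form 2 * (1 - exp(-B))."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_gauss(mu1, var1, mu2, var2)))


def rank_class_pairs(class_stats):
    """Sort class pairs from least to most separable.

    `class_stats` maps a class name to an assumed (mean, variance) pair.
    """
    pairs = []
    for a, b in combinations(sorted(class_stats), 2):
        pairs.append((a, b, jm_distance(*class_stats[a], *class_stats[b])))
    return sorted(pairs, key=lambda item: item[2])


if __name__ == "__main__":
    stats = {"a": (0.0, 1.0), "b": (1.0, 1.0), "c": (10.0, 1.0)}
    for a, b, d in rank_class_pairs(stats):
        print(a, b, round(d, 4))
```

A ranking like this is one plausible input for deciding which class pairs a DAG should evaluate first. The actual ordering rule used by the JMD-DAG method is not given in the abstract.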
The accurate identification and classification of various power quality disturbances are keys to ensuring high-quality electrical energy. In this study, the statistical characteristics of the disturbance signal of wavelet transform coefficients and wavelet transform energy distribution constitute feature vectors. These vectors are then trained and tested using SVM multi-class algorithms. Experimental results demonstrate that the SVM multi-class algorithms, which use the Gaussian radial basis function, exponential radial basis function, and hyperbolic tangent function as basis functions, are suitable methods for power quality disturbance classification.
The acoustic vibration signal of a tank is decomposed into a sum of intrinsic mode functions (IMFs) by the multi-resolution empirical mode decomposition (EMD) method. The instantaneous frequency is obtained, and a feature transformation matrix is computed from the class scatter matrix. The multi-dimensional scale energy vector is mapped into a low-dimensional eigenvector, and classification feature extraction is realized. This method sufficiently separates the features of different sound targets. The test results indicate that it is effective.
Because e-business serves a variety of customers with different navigational patterns and demands, a multi-class queuing network is a natural performance model for it. Open multi-class queuing network (QN) models are based on the assumption that no service center is saturated by the combined loads of all the classes. Several formulas are used to calculate performance measures, including throughput, residence time, queue length, response time, and the average number of requests. The solution technique for closed multi-class QN models is an approximate mean value analysis (MVA) algorithm based on three key equations, because the exact algorithm has very large time and space requirements. Because mixed multi-class QN models include both open and closed classes, the open classes should be eliminated to create a closed multi-class QN so that the closed model algorithm can be applied. Corresponding examples are given to show how to apply the algorithms mentioned in this article. These examples indicate that a multi-class QN is a reasonably accurate model of e-business and can be solved efficiently.
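For the open multi-class case, the performance measures named in the abstract above can be computed with the standard textbook formulas for load-independent queueing centers: utilization U_k = sum over classes of lambda_c * D_ck, residence time R_ck = D_ck / (1 - U_k), and queue length Q_ck = lambda_c * R_ck. The sketch below assumes these standard formulas. The abstract does not list the exact equations the article uses, and the function name and the sample numbers are hypothetical.

```python
def open_multiclass_qn(arrival_rates, demands):
    """Solve an open multi-class queueing network with queueing centers.

    arrival_rates[c] is the arrival rate of class c (requests per second).
    demands[c][k] is the service demand of class c at center k (seconds).
    Returns (utilization per center, residence time per class and center,
    response time per class, queue length per class and center).
    """
    n_centers = len(demands[0])
    # Each center's utilization is the sum of the per-class utilizations.
    util = [
        sum(lam * d[k] for lam, d in zip(arrival_rates, demands))
        for k in range(n_centers)
    ]
    if any(u >= 1.0 for u in util):
        # The open model is only valid when no center is saturated.
        raise ValueError("a service center is saturated")
    residence = [[d[k] / (1.0 - util[k]) for k in range(n_centers)] for d in demands]
    response = [sum(row) for row in residence]
    queue = [[lam * r for r in row] for lam, row in zip(arrival_rates, residence)]
    return util, residence, response, queue


if __name__ == "__main__":
    # Two hypothetical customer classes visiting a CPU and a disk.
    util, residence, response, queue = open_multiclass_qn(
        arrival_rates=[2.0, 1.0],
        demands=[[0.1, 0.2], [0.3, 0.1]],
    )
    print("utilization:", util)
    print("response times:", response)
```

In an open model the throughput of each class equals its arrival rate, so only utilization, residence time, response time, and queue length need to be computed.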
The requirement for guaranteed Quality of Service (QoS) has become essential because numerous network-based applications are now available, such as video conferencing, data streaming, and data transfer. This has led to the multi-class switch architecture, which caters to different QoS requirements. The introduction of a threshold in the multi-class switch to solve the starvation problem in the loss-sensitive class has increased the mean delay of the delay-sensitive class. In this research, a new scheduling architecture is introduced to improve the mean delay of the delay-sensitive class when the threshold is active. The proposed architecture has been simulated under uniform and non-uniform traffic to show the performance of the switch in terms of mean delay. The results show that the proposed architecture achieves better performance than Weighted Fair Queueing (WFQ) and Priority Queue (PQ).
A multi-class method based on the Error Correcting Output Codes algorithm is proposed to improve attack recognition in Wireless Sensor Networks. To enhance the accuracy of attack detection, the multi-class method is constructed with a Hadamard matrix and two-class Support Vector Machines. To minimize the complexity of the algorithm, a sparse coding method is applied in this paper. The comprehensive experimental results show that this modified multi-class method has a better attack detection rate than three other coding algorithms, and its time efficiency is higher than that of the Hadamard coding algorithm.
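The general idea of Hadamard-based Error Correcting Output Codes can be sketched as follows: each class receives a row of a Hadamard matrix as its codeword, each column defines one two-class problem, and a sample is assigned to the class whose codeword is nearest in Hamming distance to the binary classifier outputs. The sketch assumes a Sylvester-type Hadamard matrix with the all-ones first column dropped. The binary SVMs are replaced by simulated outputs, and the sparse coding modification described in the abstract is not reproduced. All function names are hypothetical.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def ecoc_codebook(num_classes):
    """One +1/-1 codeword per class, taken from rows of a Hadamard matrix."""
    n = 2
    while n < num_classes:
        n *= 2
    # The first column is all ones and defines no useful binary task, so drop it.
    return [row[1:] for row in hadamard(n)[:num_classes]]


def ecoc_decode(outputs, codebook):
    """Return the index of the codeword closest to `outputs` in Hamming distance."""
    distances = [sum(o != c for o, c in zip(outputs, code)) for code in codebook]
    return distances.index(min(distances))


if __name__ == "__main__":
    codes = ecoc_codebook(8)
    received = list(codes[5])
    received[2] = -received[2]  # simulate one wrong binary classifier
    print(ecoc_decode(received, codes))
```

With eight classes the codewords have length seven and differ pairwise in four positions, so a single wrong binary classifier output can still be corrected during decoding.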
The Internet of Medical Things (IoMT) will come to be of great importance in the mediation of medical disputes, as it is emerging as the core of intelligent medical treatment. First, IoMT can track the entire medical treatment process in order to provide detailed trace data in medical dispute resolution. Second, IoMT can infiltrate the ongoing treatment and provide timely intelligent decision support to medical staff. This information includes recommendation of similar historical cases, guidance for medical treatment, alerting of hired dispute profiteers, etc. The multi-label classification of medical dispute documents (MDDs) plays an important role as a front-end process for intelligent decision support, especially in the recommendation of similar historical cases. However, MDDs usually appear as long texts containing a large amount of redundant information, and there is a serious distribution imbalance in the dataset, which directly leads to weaker classification performance. Accordingly, in this paper, a multi-label classification method based on key sentence extraction is proposed for MDDs. The method is divided into two parts. First, the attention-based hierarchical bi-directional long short-term memory (BiLSTM) model is used to extract key sentences from documents; second, random comprehensive sampling Bagging (RCS-Bagging), which is an ensemble multi-label classification model, is employed to classify MDDs based on key sentence sets. The use of this approach greatly improves the classification performance. Experiments show that the performance of the two models proposed in this paper is remarkably better than that of the baseline methods.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2023-00249743).
Funding: National Key Research and Development Program of China, No. 2023YFC3006704; National Natural Science Foundation of China, No. 42171047; CAS-CSIRO Partnership Joint Project of 2024, No. 177GJHZ2023097MI.
Funding: Supported by the Funds for Central-Guided Local Science and Technology Development (Grant No. 202407AC110005), Key Technologies for the Construction of a Whole-Process Intelligent Service System for Neuroendocrine Neoplasm, and by the 2023 Opening Research Fund of Yunnan Key Laboratory of Digital Communications (YNJTKFB-20230686, YNKLDC-KFKT-202304).
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [RS-2021-II211341, Artificial Intelligence Graduate School Program (Chung-Ang University)], and by the Chung-Ang University Graduate Research Scholarship in 2024.
Funding: Supported by the National Key Research and Development Program of China (grant number 2021YFD1800100 to N.Z.), the National Natural Science Foundation of China (grant number 32172871 to N.Z.), and the 2115 Talent Development Program of China Agricultural University to N.Z. This study was supported by the High-performance Computing Platform of China Agricultural University.
Funding: Sponsored by the National Natural Science Foundation of China (60843007, 61050006)
Abstract: To address the limitations of rapid fault diagnosis of blast furnaces, a novel strategy based on a cost-conscious least squares support vector machine (LS-SVM) is proposed. Firstly, modified discrete particle swarm optimization is applied to optimize the feature selection and the LS-SVM parameters. Secondly, a cost-conscious formula is presented for the fitness function; it accounts for training time, recognition accuracy and feature selection. The resulting CLS-SVM algorithm is presented to increase the performance of the LS-SVM classifier. The new method can select the best fault features in much shorter time and has fewer support vectors and better generalization performance in blast furnace fault diagnosis. Thirdly, a gradual change binary tree is established for blast furnace fault diagnosis. It is a multi-class classification method based on the center-of-gravity distance formula for clusters. A gradual change classification percentage is used to select samples randomly. The proposed method raises the speed of diagnosis, optimizes the classification accuracy and has good generalization ability for blast furnace fault diagnosis.
Funding: University Science Foundation of Jiangsu Province (04KJD510018)
Abstract: Hierarchical Support Vector Machine (H-SVM) is faster in training and classification than other common multi-class SVMs such as "1-V-R" and "1-V-1". In this paper, a new multi-class fault diagnosis algorithm based on H-SVM is proposed and applied to an aero-engine. Before SVM training, the training data are first clustered according to their class-center Euclidean distances in some feature spaces. Samples that are close to one another are placed in the same sub-class for training, which gives the H-SVM a reasonable hierarchical construction and good generalization performance. Instead of the common C-SVM, the v-SVM is selected as the binary classifier, in which the parameter v varies only from 0 to 1 and can be determined more easily. The simulation results show that the designed H-SVMs can quickly diagnose multi-class single faults and combination faults of the gas path components of an aero-engine. The fault classifiers have good diagnosis accuracy and remain robust even when the measurement inputs are disturbed by noise.
Funding: Sponsored by the National Natural Science Foundation of China (61050006)
Abstract: For strip steel surface defect samples, a multi-class classification method is proposed based on enhanced least squares twin support vector machines (ELS-TWSVMs) and a binary tree. Firstly, a pruning region samples center method with an adjustable pruning scale is used to prune the data samples. This method can reduce the classifier's training time and testing time. Secondly, ELS-TWSVM is proposed to classify the data samples. By introducing an error variable contribution parameter and a weight parameter, ELS-TWSVM can restrain the impact of noise samples and achieve better classification accuracy. Finally, multi-class classification algorithms of ELS-TWSVM are proposed by combining ELS-TWSVM with a complete binary tree. Experiments were carried out on two-dimensional datasets and strip steel surface defect datasets. The experiments showed that the multi-class classification methods of ELS-TWSVM had higher classification speed and accuracy on large-scale, unbalanced and noisy datasets.
Funding: Supported by the National Natural Science Foundation of China (61703131, 61703129, 61701148, 61703128)
Abstract: The basic idea of multi-class classification is decomposition: a multi-class classification task is decomposed into several binary classification tasks. To improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One vs. Rest split to decompose the multi-class classification task into binary classification tasks. K-means is then used to downsample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled dataset is fed to MTRL, and multiple binary classifiers are trained together. With the help of MTRL, this method can exploit the association between tasks to train the model and thereby improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris dataset, Wine dataset, Multiple Features dataset, Wireless Indoor Localization dataset and Avila dataset.
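The first two steps described in the abstract above (One vs. Rest decomposition, then K-means downsampling) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's procedure: the abstract does not say how K-means selects the retained samples, so keeping the sample nearest each centroid is an assumed choice, the deterministic initialization is a simplification, and the MTRL training step is not shown. The function names are hypothetical.

```python
def one_vs_rest_tasks(labels):
    """Map each class c to a binary label vector: +1 where label == c, else -1."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_downsample(points, k, n_iter=20):
    """Return sorted indices of representative points (assumed selection rule).

    Runs plain Lloyd iterations, then keeps the sample nearest each centroid.
    Initialization uses the first k points, purely to keep the sketch
    deterministic; a real implementation would likely use a better scheme.
    """
    if k >= len(points):
        return list(range(len(points)))
    centroids = [list(p) for p in points[:k]]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: _sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # leave an empty cluster's centroid where it was
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    chosen = []
    for c in centroids:
        idx = min(range(len(points)), key=lambda i: _sq_dist(points[i], c))
        if idx not in chosen:
            chosen.append(idx)
    return sorted(chosen)
```

In this reading, each binary task produced by `one_vs_rest_tasks` would have its samples reduced with `kmeans_downsample` before the binary classifiers are trained jointly.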
Abstract: Focusing on the classification of strip steel surface defects, a novel support vector machine with adjustable hyper-sphere (AHSVM) is formulated, and a new multi-class classification method is proposed. Originating from support vector data description, AHSVM adopts a hyper-sphere to solve the classification problem. AHSVM follows two principles: margin maximization and inner-class dispersion minimization. Moreover, the hyper-sphere of AHSVM is adjustable, which makes the final classification hyper-sphere optimal for the training dataset. In addition, AHSVM is combined with a binary tree to solve multi-class classification of steel surface defects. A scheme for pruning samples in the mapped feature space is provided, which reduces the number of training samples while maintaining classification accuracy and thus improves classification speed. Finally, experiments are carried out on eight types of strip steel surface defects. Experimental results show that the multi-class AHSVM classifier achieves satisfactory classification accuracy and efficiency.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 61201310), the Fundamental Research Funds for the Central Universities (Grant No. HIT.NSRIF.201160), and the China Postdoctoral Science Foundation (Grant No. 20110491067)
Abstract: Based on the framework of support vector machines (SVM) using the one-against-one (OAO) strategy, a new multi-class kernel method based on a directed acyclic graph (DAG) and probabilistic distance is proposed to raise multi-class classification accuracy. The topology of the DAG is constructed by rearranging the sequence of nodes in the graph. A DAG is equivalent to operating SVMs on a list in a guided manner, and the classification performance depends on the sequence of nodes in the graph. The Jeffries-Matusita distance (JMD) is introduced to estimate the separability of each class, and the implementation list is initialized with all classes arranged in a certain sequence. To verify the effectiveness of the proposed method, numerical analysis is conducted on UCI data and hyperspectral data. Comparative studies using the standard OAO and DAG classification methods are also conducted, and the results illustrate the better performance and higher accuracy of the proposed JMD-DAG method.
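For reference, one commonly used form of the Jeffries-Matusita distance between two classes is given below. The abstract does not state which form the paper uses, so the Gaussian class model and the symbols $\mu_i$, $\Sigma_i$ (class means and covariance matrices) and $B_{ij}$ (Bhattacharyya distance) are assumptions made for illustration.

```latex
\begin{aligned}
B_{ij} &= \tfrac{1}{8}\,(\mu_i-\mu_j)^{\top}\Sigma^{-1}(\mu_i-\mu_j)
        + \tfrac{1}{2}\ln\frac{|\Sigma|}{\sqrt{|\Sigma_i|\,|\Sigma_j|}},
        \qquad \Sigma = \tfrac{1}{2}(\Sigma_i+\Sigma_j),\\
\mathrm{JM}_{ij} &= 2\left(1-e^{-B_{ij}}\right), \qquad 0 \le \mathrm{JM}_{ij} < 2 .
\end{aligned}
```

Under this form, values close to 2 indicate well-separated class pairs and values close to 0 indicate heavily overlapping ones, which is the kind of separability score that could be used to order nodes in the DAG.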
Abstract: The accurate identification and classification of various power quality disturbances are key to ensuring high-quality electrical energy. In this study, the statistical characteristics of the wavelet transform coefficients of the disturbance signal, together with the wavelet transform energy distribution, constitute the feature vectors. These vectors are then used to train and test SVM multi-class algorithms. Experimental results demonstrate that SVM multi-class algorithms using the Gaussian radial basis function, exponential radial basis function, and hyperbolic tangent function as kernel functions are suitable methods for power quality disturbance classification.
Abstract: The acoustic vibration signal of a tank is decomposed into a sum of intrinsic mode functions (IMFs) by the multi-resolution empirical mode decomposition (EMD) method. The instantaneous frequency is obtained, and the feature transformation matrix is computed from the class scatter matrix. The multi-dimensional scale energy vector is mapped into a low-dimensional eigenvector, and feature extraction for classification is realized. This method sufficiently separates the features of different sound targets. The test results indicate that it is effective.
Abstract: Because e-business serves a variety of customers with different navigational patterns and demands, the multi-class queuing network (QN) is a natural performance model for it. Open multi-class QN models are based on the assumption that no service center is saturated by the combined loads of all the classes. Several formulas are used to calculate performance measures, including throughput, residence time, queue length, response time and the average number of requests. The solution technique for closed multi-class QN models is an approximate mean value analysis (MVA) algorithm based on three key equations, because the exact algorithm requires a huge amount of time and space. Because mixed multi-class QN models include some open and some closed classes, the open classes should be eliminated to create a closed multi-class QN so that the closed model algorithm can be applied. Some corresponding examples are given to show how to apply the algorithms mentioned in this article. These examples indicate that the multi-class QN is a reasonably accurate model of e-business and can be solved efficiently.
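The open multi-class case described above can be illustrated with a short sketch. The abstract does not list its formulas, so the code uses the standard textbook relations for load-independent queuing devices as an assumption: utilization U_i = Σ_r λ_r·D_ir, residence time R_ir = D_ir / (1 − U_i), queue length n_ir = λ_r·D_ir / (1 − U_i), and response time R_r = Σ_i R_ir. The function name and the dictionary layout are hypothetical.

```python
def solve_open_multiclass_qn(arrival_rates, demands):
    """Solve an open multi-class QN with load-independent queuing devices.

    arrival_rates: {class_name: arrival rate lambda_r}
    demands:       {device_name: {class_name: service demand D_ir}}
    Raises ValueError if any device is saturated (total utilization >= 1),
    since the open-model formulas assume no saturated service center.
    """
    utilization, residence, queue_len = {}, {}, {}
    for dev, per_class in demands.items():
        # Total device utilization: U_i = sum over classes of lambda_r * D_ir
        u_total = sum(lam * per_class.get(c, 0.0)
                      for c, lam in arrival_rates.items())
        if u_total >= 1.0:
            raise ValueError(f"device {dev!r} is saturated (U = {u_total:.3f})")
        utilization[dev] = u_total
        for c, lam in arrival_rates.items():
            d = per_class.get(c, 0.0)
            residence[(dev, c)] = d / (1.0 - u_total)        # R_ir
            queue_len[(dev, c)] = lam * d / (1.0 - u_total)  # n_ir
    # Response time of a class is the sum of its residence times over devices.
    response = {c: sum(residence[(dev, c)] for dev in demands)
                for c in arrival_rates}
    return {"utilization": utilization, "residence_time": residence,
            "queue_length": queue_len, "response_time": response}
```

In an open model the throughput of each class equals its arrival rate, so it is not computed separately here.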
Abstract: The requirement for guaranteed Quality of Service (QoS) has become essential because numerous network-based applications are now available, such as video conferencing, data streaming, data transfer and many more. This has led to the multi-class switch architecture, which caters for different QoS requirements. The introduction of a threshold in the multi-class switch to solve the starvation problem in the loss-sensitive class has increased the mean delay for the delay-sensitive class. In this research, a new scheduling architecture is introduced to improve the mean delay of the delay-sensitive class when the threshold is active. The proposed architecture has been simulated under uniform and non-uniform traffic to show the performance of the switch in terms of mean delay. The results show that the proposed architecture achieves better performance than Weighted Fair Queueing (WFQ) and Priority Queue (PQ).
Abstract: A multi-class method based on the Error Correcting Output Codes (ECOC) algorithm is proposed to improve attack recognition in Wireless Sensor Networks. To enhance the accuracy of attack detection, the multi-class method is constructed with a Hadamard matrix and two-class Support Vector Machines. To minimize the complexity of the algorithm, a sparse coding method is applied. Comprehensive experimental results show that this modified multi-class method achieves a better attack detection rate than three other coding algorithms, and its time efficiency is higher than that of the Hadamard coding algorithm.
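The combination of a Hadamard matrix with binary classifiers mentioned above can be sketched as an ECOC code matrix plus Hamming-distance decoding. This is a generic illustration, not the paper's modified method: the Sylvester construction, the choice to drop the constant first column, and the function names are assumptions, and the sparse coding step and the SVM training are not shown.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # [[H, H], [H, -H]] doubles the size while keeping rows orthogonal.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def ecoc_code_matrix(num_classes):
    """One +1/-1 code word per class; each column defines one binary problem."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    n = 2
    while n < num_classes:
        n *= 2
    # Drop the first column: it is all +1 and defines no useful binary split.
    return [row[1:] for row in hadamard(n)[:num_classes]]


def ecoc_decode(code_matrix, outputs):
    """Return the index of the class whose code word is nearest in Hamming distance.

    `outputs` holds the +1/-1 decisions of the binary classifiers, one per column.
    """
    def hamming(a, b):
        return sum(1 for x, y in zip(a, b) if x != y)
    return min(range(len(code_matrix)),
               key=lambda c: hamming(code_matrix[c], outputs))
```

With eight classes the code words have length seven and differ pairwise in four positions, so a single wrong binary decision still decodes to the correct class.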
Funding: Supported by the National Key R&D Program of China (2018YFC0830200, Zhang B, www.most.gov.cn), the Fundamental Research Funds for the Central Universities (2242018S30021 and 2242017S30023, Zhou S, www.seu.edu.cn), and the Open Research Fund from the Key Laboratory of Computer Network and Information Integration in Southeast University, Ministry of Education, China (3209012001C3, Zhang B, www.seu.edu.cn).
Abstract: The Internet of Medical Things (IoMT) will come to be of great importance in the mediation of medical disputes, as it is emerging as the core of intelligent medical treatment. First, IoMT can track the entire medical treatment process in order to provide detailed trace data for medical dispute resolution. Second, IoMT can be embedded in ongoing treatment and provide timely intelligent decision support to medical staff. This support includes recommendation of similar historical cases, guidance for medical treatment, alerts about hired dispute profiteers, etc. The multi-label classification of medical dispute documents (MDDs) plays an important role as a front-end process for intelligent decision support, especially in the recommendation of similar historical cases. However, MDDs usually appear as long texts containing a large amount of redundant information, and there is a serious distribution imbalance in the dataset, which directly leads to weaker classification performance. Accordingly, in this paper, a multi-label classification method based on key sentence extraction is proposed for MDDs. The method is divided into two parts. First, an attention-based hierarchical bi-directional long short-term memory (BiLSTM) model is used to extract key sentences from documents; second, random comprehensive sampling Bagging (RCS-Bagging), which is an ensemble multi-label classification model, is employed to classify MDDs based on key sentence sets. The use of this approach greatly improves the classification performance. Experiments show that the performance of the two models proposed in this paper is remarkably better than that of the baseline methods.