Journal articles: 3,290 results found
Gate-Attention and Dual-End Enhancement Mechanism for Multi-Label Text Classification (cited by: 1)
1
Authors: Jieren Cheng, Xiaolong Chen, Wenghang Xu, Shuai Hua, Zhu Tang, Victor S. Sheng. Computers, Materials & Continua (SCIE, EI), 2023, No. 11, pp. 1779-1793 (15 pages)
In Multi-Label Text Classification (MLTC), the dual challenges of extracting rich semantic features from text and discerning inter-label relationships have spurred innovative approaches. Many studies in semantic feature extraction have turned to external knowledge to augment the model's grasp of textual content, often overlooking intrinsic textual cues such as label statistical features, even though these endogenous insights naturally align with the classification task. In our paper, to complement this focus on intrinsic knowledge, we introduce a novel Gate-Attention mechanism. This mechanism integrates statistical features from the text itself into the semantic representation, enhancing the model's capacity to understand and represent the data. Additionally, to address the task of mining label correlations, we propose a Dual-end enhancement mechanism. This mechanism mitigates the information loss and erroneous transmission inherent in traditional long short-term memory propagation. We conducted extensive experiments on the AAPD and RCV1-2 datasets. These experiments confirm the efficacy of both the Gate-Attention mechanism and the Dual-end enhancement mechanism. Our final model outperforms the baseline model, attesting to its robustness. These findings underscore the importance of taking into account not just external knowledge but also the inherent characteristics of textual data when building MLTC models.
Keywords: multi-label text classification; feature extraction; label distribution information; sequence generation
Multi-Label Machine Learning Classification of Cardiovascular Diseases
2
Authors: Chih-Ta Yen, Jung-Ren Wong, Chia-Hsang Chang. Computers, Materials & Continua, 2025, No. 7, pp. 347-363 (17 pages)
In its 2023 global health statistics, the World Health Organization noted that noncommunicable diseases (NCDs) remain the leading cause of disease burden worldwide, with cardiovascular diseases (CVDs) resulting in more deaths than the three other major NCDs combined. In this study, we developed a method that can comprehensively detect which CVDs are present in a patient. Specifically, we propose a multi-label classification method that utilizes photoplethysmography (PPG) signals and physiological characteristics from public datasets to classify four types of CVDs and related conditions: hypertension, diabetes, cerebral infarction, and cerebrovascular disease. Our approach to multi-disease classification of CVDs using PPG signals achieves the highest classification performance when encompassing the broadest range of disease categories, thereby offering a more comprehensive assessment of human health. We employ a multi-label classification strategy to simultaneously predict the presence or absence of multiple diseases. Specifically, we first apply the Savitzky-Golay (S-G) filter to the PPG signals to reduce noise and then transform them into statistical features. We integrate the processed PPG signals with individual physiological features as a multimodal input, thereby expanding the learned feature space. Notably, even with a simple machine learning method, this approach can achieve relatively high accuracy. The proposed method achieved a maximum F1-score of 0.91, a minimum Hamming loss of 0.04, and an accuracy of 0.95. Thus, our method represents an effective and rapid solution for detecting multiple diseases simultaneously, which is beneficial for comprehensively managing CVDs.
Keywords: photoplethysmography; machine learning; health management; multi-label classification; cardiovascular disease
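The abstract above reports F1-score and Hamming loss for a four-label task. As a reading aid, here is a minimal sketch of how these two multi-label metrics are commonly computed. The label order and the toy predictions are assumptions made up for illustration, not data from the paper, and the paper may use a different F1 averaging scheme than the micro average shown.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives over all labels first."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += (t == 1 and p == 1)
            fp += (t == 0 and p == 1)
            fn += (t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical label order: hypertension, diabetes,
# cerebral infarction, cerebrovascular disease.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0]]

print(hamming_loss(y_true, y_pred))  # 2 wrong slots out of 12
print(micro_f1(y_true, y_pred))      # 4 TP, 1 FP, 1 FN
```

Hamming loss penalizes every wrong label slot equally, which is why it can look small on sparse label sets even when some patients are misclassified.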
Multi-Label Movie Genre Classification with Attention Mechanism on Movie Plots
3
Authors: Faheem Shaukat, Naveed Ejaz, Rashid Kamal, Tamim Alkhalifah, Sheraz Aslam, Mu Mu. Computers, Materials & Continua, 2025, No. 6, pp. 5595-5622 (28 pages)
Automated and accurate movie genre classification is crucial for content organization, recommendation systems, and audience targeting in the film industry. Although most existing approaches focus on audiovisual features such as trailers and posters, text-based classification remains underexplored despite its accessibility and semantic richness. This paper introduces the Genre Attention Model (GAM), a deep learning architecture that integrates transformer models with a hierarchical attention mechanism to extract and leverage contextual information from movie plots for multi-label genre classification. To assess its effectiveness, we evaluate multiple transformer-based models, including Bidirectional Encoder Representations from Transformers (BERT), A Lite BERT (ALBERT), Distilled BERT (DistilBERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), eXtreme Learning Network (XLNet) and Decoding-enhanced BERT with Disentangled Attention (DeBERTa). Experimental results demonstrate the superior performance of DeBERTa-based GAM, which employs a two-tier hierarchical attention mechanism: word-level attention highlights key terms, while sentence-level attention captures critical narrative segments, ensuring a refined and interpretable representation of movie plots. Evaluated on three benchmark datasets (Trailers12K, Large Movie Trailer Dataset-9 (LMTD-9), and MovieLens37K), GAM achieves micro-average precision scores of 83.63%, 83.32%, and 83.34%, respectively, surpassing state-of-the-art models. Additionally, GAM is computationally efficient, requiring just 6.10 Giga Floating Point Operations Per Second (GFLOPS), making it a scalable and cost-effective solution. These results highlight the growing potential of text-based deep learning models in genre classification and GAM's effectiveness in improving predictive accuracy while maintaining computational efficiency. With its robust performance, GAM offers a versatile and scalable framework for content recommendation, film indexing, and media analytics, providing an interpretable alternative to traditional audiovisual-based classification techniques.
Keywords: multi-label classification; artificial intelligence; movie genre classification; hierarchical attention mechanisms; natural language processing; content recommendation; text-based genre classification; explainable AI (Artificial Intelligence); transformer models; BERT
Multi-label text classification model based on semantic embedding (cited by: 4)
4
Authors: Yan Danfeng, Ke Nan, Gu Chao, Cui Jianfei, Ding Yiqi. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2019, No. 1, pp. 95-104 (10 pages)
Text classification assigns a document to one or more classes or categories according to its content, which makes it easier for users to obtain data. Because text data is polysemous, multi-label classification can handle it more comprehensively, and multi-label text classification has become a key problem in data mining. To improve the performance of multi-label text classification, semantic analysis is embedded into the classification model to carry out label correlation analysis, and the structure, objective function and optimization strategy of this model are designed. Then, a convolutional neural network (CNN) model based on semantic embedding is introduced. Finally, the Zhihu dataset is used for evaluation. The results show that this model outperforms related work in terms of recall and area under the curve (AUC).
Keywords: multi-label text classification; convolution neural network; semantic analysis
Robust Multi-Label Cartoon Character Classification on the Novel Kral Sakir Dataset Using Deep Learning Techniques
5
Authors: Candan Tumer, Erdal Guvenoglu, Volkan Tunali. Computers, Materials & Continua, 2025, No. 12, pp. 5135-5158 (24 pages)
Automated cartoon character recognition is crucial for applications in content indexing, filtering, and copyright protection, yet it faces a significant challenge in animated media due to high intra-class visual variability, where characters frequently alter their appearance. To address this problem, we introduce the novel Kral Sakir dataset, a public benchmark of 16,725 images specifically curated for the task of multi-label cartoon character classification under these varied conditions. This paper conducts a comprehensive benchmark study, evaluating the performance of state-of-the-art pretrained Convolutional Neural Networks (CNNs), including DenseNet, ResNet, and VGG, against a custom baseline model trained from scratch. Our experiments, evaluated using F1-Score, accuracy, and Area Under the ROC Curve (AUC), demonstrate that fine-tuning pretrained models is a highly effective strategy. The best-performing model, DenseNet121, achieved an F1-Score of 0.9890 and an accuracy of 0.9898, significantly outperforming our baseline CNN (F1-Score of 0.9545). The findings validate the power of transfer learning for this domain and establish a strong performance benchmark. The introduced dataset provides a valuable resource for future research into developing robust and accurate character recognition systems.
Keywords: cartoon character recognition; multi-label classification; deep learning; transfer learning; predictive modelling; artificial intelligence-enhanced (AI-Enhanced) systems; Kral Sakir dataset
Text categorization based on fuzzy classification rules tree (cited by: 2)
6
Authors: 郭玉琴, 袁方, 刘海博. Journal of Southeast University (English Edition) (EI, CAS), 2008, No. 3, pp. 339-342 (4 pages)
The conventional fuzzy class-association method scans the classifier repeatedly to classify new texts, which is inefficient. To deal with this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space in storing a large set of rules when there are many repeated words in the rules. In comparison with classification rules, fuzzy classification rules contain not only words, but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction of an FCR-tree and its structure are different from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
Keywords: text categorization; fuzzy classification association rule; classification rules tree; fuzzy classification rules tree
Parallel naive Bayes algorithm for large-scale Chinese text classification based on Spark (cited by: 22)
7
Authors: LIU Peng, ZHAO Hui-han, TENG Jia-yu, YANG Yan-yan, LIU Ya-feng, ZHU Zong-wei. Journal of Central South University (SCIE, EI, CAS, CSCD), 2019, No. 1, pp. 1-12 (12 pages)
The sharp increase in the amount of Chinese text data on the Internet has significantly prolonged the processing time needed to classify it. To solve this problem, this paper proposes and implements a parallel naive Bayes algorithm (PNBA) for Chinese text classification based on Spark, a parallel in-memory computing platform for big data. The algorithm parallelizes the entire training and prediction process of the naive Bayes classifier, mainly by adopting the programming model of resilient distributed datasets (RDD). For comparison, a PNBA based on Hadoop is also implemented. The test results show that, in the same computing environment and for the same text sets, the Spark PNBA is clearly superior to the Hadoop PNBA in terms of key indicators such as speedup ratio and scalability. Therefore, Spark-based parallel algorithms can better meet the requirements of large-scale Chinese text data mining.
Keywords: Chinese text classification; naive Bayes; Spark; Hadoop; resilient distributed dataset; parallelization
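Naive Bayes parallelizes well because its training statistics are plain counts, which can be computed per data partition and then summed. The sketch below illustrates that idea in ordinary Python without Spark: `count_partition` plays the role of a map step and `merge` the role of a reduce step. The function names, the Laplace smoothing, and the English placeholder tokens (standing in for pre-segmented Chinese words) are all assumptions for illustration, not the paper's PNBA implementation.

```python
import math
from collections import Counter, defaultdict


def count_partition(docs):
    """Map step: class and per-class token counts for one partition."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    for tokens, label in docs:
        class_counts[label] += 1
        token_counts[label].update(tokens)
    return class_counts, token_counts


def merge(a, b):
    """Reduce step: counts are additive, so partial results simply sum."""
    class_counts = a[0] + b[0]
    token_counts = defaultdict(Counter)
    for part in (a[1], b[1]):
        for label, counts in part.items():
            token_counts[label].update(counts)
    return class_counts, token_counts


def predict(tokens, model):
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    class_counts, token_counts = model
    vocab = {t for counts in token_counts.values() for t in counts}
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        total = sum(token_counts[label].values())
        score = math.log(class_counts[label] / n_docs)
        for t in tokens:
            score += math.log((token_counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Two hypothetical partitions of a tiny labelled corpus.
partition_a = [(["stock", "rise"], "finance"), (["team", "match"], "sports")]
partition_b = [(["stock", "fund"], "finance"), (["match", "champion"], "sports")]

model = merge(count_partition(partition_a), count_partition(partition_b))
print(predict(["stock", "fund"], model))
```

Because `merge` is associative and commutative, partitions can be combined in any order, which is the property a distributed reduce relies on.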
Multi-label dimensionality reduction and classification with extreme learning machines (cited by: 9)
8
Authors: Lin Feng, Jing Wang, Shenglan Liu, Yao Xiao. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2014, No. 3, pp. 502-513 (12 pages)
Driven by real applications such as text categorization and image classification, multi-label learning has become an active research topic in recent years, and much attention has been paid to multi-label classification algorithms. Because the high dimensionality of multi-label datasets may cause the curse of dimensionality and will hamper the classification process, a dimensionality reduction algorithm named multi-label kernel discriminant analysis (MLKDA) is proposed to reduce the dimensionality of multi-label datasets. Using the kernel trick, MLKDA processes the multiple labels as a whole and realizes nonlinear dimensionality reduction with an idea similar to linear discriminant analysis (LDA). In the classification of multi-label data, the extreme learning machine (ELM) is an efficient algorithm that maintains good accuracy. MLKDA combined with ELM performs well in multi-label learning experiments on several datasets. Experiments on both static data and data streams show that MLKDA outperforms multi-label dimensionality reduction via dependence maximization (MDDM) and multi-label linear discriminant analysis (MLDA) on balanced datasets and when correlation between tags is stronger, and that ELM is also a good choice for multi-label classification.
Keywords: multi-label; dimensionality reduction; kernel trick; classification
Review of Text Classification Methods on Deep Learning (cited by: 14)
9
Authors: Hongping Wu, Yuling Liu, Jingwen Wang. Computers, Materials & Continua (SCIE, EI), 2020, No. 6, pp. 1309-1321 (13 pages)
Text classification has become an increasingly crucial topic in natural language processing. Traditional text classification methods based on machine learning have many disadvantages, such as dimension explosion, data sparsity and limited generalization ability. Focusing on deep learning, this paper presents an extensive study of text classification models, including Convolutional Neural Network-based (CNN-based), Recurrent Neural Network-based (RNN-based) and attention mechanism-based models. Many studies have shown that text classification methods based on deep learning outperform traditional methods when processing large-scale and complex datasets. The main reasons are that deep learning methods avoid a cumbersome feature extraction process and achieve higher prediction accuracy on large sets of unstructured data. In this paper, we also summarize the shortcomings of traditional text classification methods and introduce the deep learning text classification process, including text preprocessing, distributed representation of text, model construction and performance evaluation.
Keywords: text classification; deep learning; distributed representation; CNN; RNN; attention mechanism
MII: A Novel Text Classification Model Combining Deep Active Learning with BERT (cited by: 8)
10
Authors: Anman Zhang, Bohan Li, Wenhuan Wang, Shuo Wan, Weitong Chen. Computers, Materials & Continua (SCIE, EI), 2020, No. 6, pp. 1499-1514 (16 pages)
Active learning has been widely used to reduce the labeling cost of supervised learning. By selecting specific instances to train the model, the performance of the model can be improved within a limited number of steps. However, little work has paid attention to the effectiveness of active learning on it. In this paper, we propose a deep active learning model with bidirectional encoder representations from transformers (BERT) for text classification. BERT takes advantage of the self-attention mechanism to integrate contextual information, which helps accelerate the convergence of training. For the active learning process, we design an instance selection strategy based on posterior probabilities: Margin, Intra-correlation and Inter-correlation (MII). Selected instances are characterized by small margin, low intra-cohesion and high inter-cohesion. We conduct extensive experiments and analysis with our methods. The effect of the learner is compared, and the effects of the sampling strategy and text classification are assessed on three real datasets. The results show that our method outperforms the baselines in terms of accuracy.
Keywords: active learning; instance selection; deep neural network; text classification
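The "small margin" criterion mentioned in the abstract above is a standard active learning heuristic: an instance is informative when the model's two most probable classes are nearly tied. The sketch below shows only that margin component, under the assumption that each unlabeled instance comes with a list of posterior class probabilities. The intra- and inter-correlation terms of MII are not reproduced here, and the probabilities are invented for illustration.

```python
def margin(probs):
    """Gap between the two largest posterior probabilities."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_smallest_margin(posteriors, k):
    """Indices of the k instances the model is least decided about."""
    ranked = sorted(range(len(posteriors)), key=lambda i: margin(posteriors[i]))
    return ranked[:k]


# Hypothetical posteriors for four unlabeled texts over three classes.
posteriors = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],
    [0.55, 0.44, 0.01],
    [0.34, 0.33, 0.33],  # nearly uniform, smallest margin
]

print(select_smallest_margin(posteriors, 2))
```

The selected instances would then be sent for labeling and added to the training set before the next fine-tuning round.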
An improved TF-IDF approach for text classification (cited by: 6)
11
Authors: 张云涛, 龚玲, 王永成. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2005, No. 1, pp. 49-55 (7 pages)
This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach which uses confidence, support and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments on science and technology texts gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
Keywords: term frequency/inverse document frequency (TF-IDF); text classification; confidence; support; characteristic words
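For context, the conventional TF-IDF weighting that the abstract above uses as its baseline can be sketched as follows. This is one common formulation (length-normalized term frequency times the natural log of inverse document frequency) and is an assumption, since the abstract does not state which variant the paper compares against. The confidence, support and synonym handling of the improved approach are not shown, and the toy documents are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = ln(number of documents / documents containing the term)
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical tokenized documents.
docs = [
    ["laser", "optics", "laser"],
    ["optics", "lens"],
    ["genome", "protein"],
]

weights = tf_idf(docs)
print(weights[0])
```

In the first document, "laser" outweighs "optics" because it occurs more often there and appears in fewer documents overall.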
Dimensionality Reduction by Mutual Information for Text Classification (cited by: 2)
12
Authors: 刘丽珍, 宋瀚涛, 陆玉昌. Journal of Beijing Institute of Technology (EI, CAS), 2005, No. 1, pp. 32-36 (5 pages)
The framework of a text classification system is presented, and the high dimensionality of the feature space in text classification is studied. Mutual information is a widely used information-theoretic measure of the stochastic dependency between discrete random variables. This measure is used as a criterion to reduce the high dimensionality of feature vectors in text classification on the Web. Feature selection and feature conversion, including linear and non-linear conversions, are performed by maximizing mutual information. Entropy is used and extended to find suitable features in pattern recognition systems. This lays a favorable foundation for text classification mining.
Keywords: text classification; mutual information; dimensionality reduction
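Mutual information as a feature selection criterion can be illustrated with the textbook definition for two discrete variables, here a binary "term present" indicator T and the class label C: I(T; C) = sum over (t, c) of p(t, c) * log2(p(t, c) / (p(t) p(c))). The sketch below estimates the probabilities from document counts. It is a generic illustration with invented data, not the paper's specific selection or conversion procedure.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term):
    """I(T; C) in bits, where T indicates whether `term` occurs in a document."""
    n = len(docs)
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    term_counts, label_counts = Counter(), Counter()
    for (present, label), count in joint.items():
        term_counts[present] += count
        label_counts[label] += count
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = term_counts[present] / n
        p_y = label_counts[label] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical documents represented as sets of terms.
docs = [
    {"goal", "match"},
    {"match", "team"},
    {"stock", "market"},
    {"market", "bank"},
]
labels = ["sports", "sports", "finance", "finance"]

scores = {t: mutual_information(docs, labels, t) for t in ("match", "goal")}
print(scores)
```

Terms would be ranked by this score and only the highest-scoring ones kept, which shrinks the feature vector while retaining the terms most dependent on the class.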
A Multi-Label Classification Algorithm Based on Label-Specific Features (cited by: 2)
13
Authors: QU Huaqiao, ZHANG Shichao, LIU Huawen, ZHAO Jianmin. Wuhan University Journal of Natural Sciences (CAS), 2011, No. 6, pp. 520-524 (5 pages)
To address the problem of multi-label classification, a multi-label classification algorithm based on label-specific features is proposed in this paper. In this algorithm, we first compute feature density on the positive and negative instance sets of each class and then select mk features of high density from the positive and negative instance sets of each class, respectively; the intersection is taken as the label-specific features of the corresponding class. Finally, multi-label data are classified on the basis of the label-specific features. The algorithm can show the label-specific features of each class. Experiments show that our proposed method, the MLSF algorithm, performs significantly better than other state-of-the-art multi-label learning approaches.
Keywords: multi-label classification; label-specific features; feature's value; density
Term-Based Pooling in Convolutional Neural Networks for Text Classification (cited by: 2)
14
Authors: Shuifei Zeng, Yan Ma, Xiaoyan Zhang, Xiaofeng Du. China Communications (SCIE, CSCD), 2020, No. 4, pp. 109-124 (16 pages)
To achieve good results with convolutional neural networks (CNNs) on the text classification task, a term-based pooling operation in CNNs is proposed. First, the method combines the convolution results of several convolution kernels and then applies a pooling operation to the combined results; on this basis, three CNN models (named TBCNN, MCT-CNN and MMCT-CNN) are constructed and the corresponding algorithms are described in detail. Second, experiments and analyses are designed to show the effects of three key parameters (convolution kernel, combination kernel number and word embedding) on the three CNN models and to further demonstrate the effect of the proposed models. The experimental results show that, compared with the traditional approach to text classification in CNNs, the term-based pooling method is feasible and achieves better performance.
Keywords: convolutional neural networks; term-based pooling; text classification
Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets (cited by: 1)
15
Authors: Shuo Xu, Yuefu Zhang, Xin An, Sainan Pi. Journal of Data and Information Science (CSCD), 2024, No. 2, pp. 81-103 (23 pages)
Purpose: Many science, technology and innovation (STI) resources are attached with several different labels. To automatically assign the resulting labels to an instance of interest, many approaches with good performance on benchmark datasets have been proposed for the multi-label classification task in the literature. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification also have intrinsic differences in their approaches to data processing and feature selection, which in turn impact the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, it is expected that the efficacy of multi-label classification tasks will be significantly improved, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical structure of labels and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
Keywords: multi-label classification; real-world datasets; hierarchical structure; classification system; label correlation; machine learning
Feature selection algorithm for text classification based on improved mutual information (cited by: 1)
16
Authors: 丛帅, 张积宾, 徐志明, 王宇颖. Journal of Harbin Institute of Technology (New Series) (EI, CAS), 2011, No. 3, pp. 144-148 (5 pages)
To address the poor performance of text classification when the traditional mutual information (MI) formula is used, a feature selection algorithm based on improved mutual information is proposed. The algorithm builds on earlier improved mutual information methods, which enhance the MI value of negative characteristics and take feature frequency into account, and adds the concepts of concentration degree and dispersion degree. Formulas that embody concentration degree and dispersion degree are constructed, and the improved mutual information is implemented on this basis. In this paper, the feature selection algorithm based on improved mutual information is applied to a text classifier based on Biomimetic Pattern Recognition and compared with several other feature selection methods. The experimental results show that the improved mutual information feature selection method greatly enhances performance compared with traditional mutual information feature selection methods, and that its performance is better than that of information gain. By introducing the concepts of concentration degree and dispersion degree, the improved mutual information feature selection method greatly improves the performance of the text classification system.
Keywords: text classification; feature selection; improved mutual information; Biomimetic Pattern Recognition
Study on Multi-Label Classification of Medical Dispute Documents (cited by: 2)
17
Authors: Baili Zhang, Shan Zhou, Le Yang, Jianhua Lv, Mingjun Zhong. Computers, Materials & Continua (SCIE, EI), 2020, No. 12, pp. 1975-1986 (12 pages)
The Internet of Medical Things (IoMT) will come to be of great importance in the mediation of medical disputes, as it is emerging as the core of intelligent medical treatment. First, IoMT can track the entire medical treatment process in order to provide detailed trace data for medical dispute resolution. Second, IoMT can infiltrate the ongoing treatment and provide timely intelligent decision support to medical staff. This information includes recommendation of similar historical cases, guidance for medical treatment, alerting of hired dispute profiteers, etc. The multi-label classification of medical dispute documents (MDDs) plays an important role as a front-end process for intelligent decision support, especially in the recommendation of similar historical cases. However, MDDs usually appear as long texts containing a large amount of redundant information, and there is a serious distribution imbalance in the dataset, which directly leads to weaker classification performance. Accordingly, in this paper, a multi-label classification method based on key sentence extraction is proposed for MDDs. The method is divided into two parts. First, an attention-based hierarchical bi-directional long short-term memory (BiLSTM) model is used to extract key sentences from documents; second, random comprehensive sampling Bagging (RCS-Bagging), an ensemble multi-label classification model, is employed to classify MDDs based on key sentence sets. This approach greatly improves classification performance. Experiments show that the performance of the two models proposed in this paper is remarkably better than that of the baseline methods.
Keywords: Internet of Medical Things (IoMT); medical disputes; medical dispute document (MDD); multi-label classification (MLC); key sentence extraction; class imbalance
Short Text Classification Based on Improved ITC 被引量:1
18
作者 Liangliang Li Shouning Qu 《Journal of Computer and Communications》 2013年第4期22-27,共6页
The long text classification has got great achievements, but short text classification still needs to be perfected. In this paper, at first, we describe why we select the ITC feature selection algorithm not the conven... The long text classification has got great achievements, but short text classification still needs to be perfected. In this paper, at first, we describe why we select the ITC feature selection algorithm not the conventional TFIDF and the superiority of the ITC compared with the TFIDF, then we conclude the flaws of the conventional ITC algorithm, and then we present an improved ITC feature selection algorithm based on the characteristics of short text classification while combining the concepts of the Documents Distribution Entropy with the Position Distribution Weight. The improved ITC algorithm conforms to the actual situation of the short text classification. The experimental results show that the performance based on the new algorithm was much better than that based on the traditional TFIDF and ITC. 展开更多
Keywords: ITC; text classification; short text
A Short Text Classification Model for Electrical Equipment Defects Based on Contextual Features (Cited by: 1)
19
Authors: LI Peipei, ZENG Guohui, HUANG Bo, YIN Ling, SHI Zhicai, HE Chuanpeng, LIU Wei, CHEN Yu 《Wuhan University Journal of Natural Sciences》 CAS CSCD 2022, No. 6, pp. 465-475 (11 pages)
Defect information for substation equipment is usually recorded in the form of text. Due to the irregular spoken expressions of equipment inspectors, the defect information lacks sufficient contextual information and becomes more ambiguous. To solve the problem of sparse data lacking semantic features in the classification process, a short text classification model for defects in electrical equipment that fuses contextual features is proposed. The model uses bi-directional long short-term memory in short text classification to obtain the contextual semantics of short text data. Also, the attention mechanism is introduced to assign weights to different information in the context. Meanwhile, this model optimizes the convolutional neural network parameters with the help of the genetic algorithm for extracting salient features. According to the experimental results, the model can effectively realize the classification of power equipment defect text. In addition, the model was tested on an automotive parts repair dataset provided by the project partners, thus enabling the effective application of the method in specific industrial scenarios.
Keywords: short text classification; genetic algorithm; convolutional neural network; attention mechanism
Novel Machine Learning–Based Approach for Arabic Text Classification Using Stylistic and Semantic Features (Cited by: 1)
20
Authors: Fethi Fkih, Mohammed Alsuhaibani, Delel Rhouma, Ali Mustafa Qamar 《Computers, Materials & Continua》 SCIE EI 2023, No. 6, pp. 5871-5886 (16 pages)
Text classification is an essential task for many applications related to the Natural Language Processing domain. It can be applied in many fields, such as Information Retrieval, Knowledge Extraction, and Knowledge modeling. Despite the importance of this task, Arabic Text Classification tools still suffer from many problems and remain incapable of responding to the increasing volume of Arabic content that circulates on the web or resides in large databases. This paper introduces a novel machine learning-based approach that exclusively uses hybrid (stylistic and semantic) features. First, we clean the Arabic documents and translate them to English using translation tools. Consequently, the semantic features are automatically extracted from the translated documents using an existing database of English topics. Besides, the model automatically extracts from the textual content a set of stylistic features such as word and character frequencies and punctuation. Therefore, we obtain 3 types of features: semantic, stylistic and hybrid. Using a different type of feature each time, we performed an in-depth comparison study of nine well-known Machine Learning models to evaluate our approach and used a standard Arabic corpus. The obtained results show that Neural Network outperforms other models and provides good performances using hybrid features (F1-score = 0.88%).
Keywords: Arabic text classification; machine learning; stylistic features; semantic features; topics