With the rapid development of the Internet, the volume of text information has grown explosively. Massive amounts of text data, such as news, social media posts, and academic literature, are constantly emerging, and manually classifying and managing these texts has become time-consuming and inefficient, failing to meet practical needs. The continuous progress of natural language processing technology, especially the rise of deep learning methods, provides strong technical support for automatic text classification. Deep learning models can automatically mine the essential features of text from massive samples and capture deep semantic representations, avoiding the tedious process of manually designing rules and features. In practical applications, text data often coexists with data of other modalities (such as images and audio). Through feature learning on multimodal data, the information from multiple modalities can be mapped into a joint vector space to obtain a unified representation of the data, which can make text classification more accurate. In recent years, pre-trained language models such as BERT and GPT have achieved remarkable results. These models learn a general language representation through unsupervised pre-training on large-scale corpora and are then fine-tuned on specific text classification tasks, which can significantly improve classification performance and further promote research on automatic text classification. Automatic text classification can sort massive text data into different categories quickly and accurately, which facilitates information storage, retrieval, and management.
For example, in library document management and enterprise document management, automatic classification can greatly improve work efficiency and save labor costs. In social media and online public opinion monitoring, automatic text classification can quickly identify texts with different themes and emotional tendencies. This helps institutions follow the dynamics of public opinion in a timely manner and gives governments, enterprises, and other institutions a basis for formulating response strategies. In customer service, such as online customer service and customer feedback processing, automatic text classification can identify the types of questions customers raise and their emotional tendencies, so that customer consultation and problem classification can be automated to improve the efficiency and quality of service. Automatic text classification is an important task in natural language processing, and progress in this area can inform other natural language processing tasks: its techniques and methods can be applied and extended in tasks such as sentiment analysis, machine translation, and question answering. Automatic text classification technology can also be widely used in fields such as financial risk assessment, medical text analysis, and legal document classification, where it helps professionals quickly sift and process large amounts of text and improves work efficiency and decision-making accuracy.
Although researchers in the automatic text classification (ATC) field have done a wide range of work based on support vector machines (SVM), almost all existing approaches rely on empirical model selection algorithms. Attempts at automatic model selection in practical, large-scale text classification systems have been limited. In this paper, we propose a new model selection algorithm that utilizes the DDAG learning architecture. This architecture yields a new large-scale text classifier with very good performance. Experimental results show that the proposed algorithm is efficient and has the necessary generalization capability when handling large-scale multi-class text classification tasks.
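The abstract does not spell out the DDAG architecture or the proposed model selection algorithm, so the following Python sketch is only a rough illustration of the general DDAG idea for multi-class prediction, assuming DDAG refers to the decision directed acyclic graph scheme built from pairwise binary classifiers. Everything in it is hypothetical: the names `train_pairwise`, `ddag_predict`, `TOY_DATA`, and `DECIDERS` are invented for illustration, the 2-D vectors stand in for document feature vectors, and a nearest-centroid rule is swapped in where a trained pairwise SVM would normally sit. It is not the authors' method.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
PairwiseDecider = Callable[[Vector], int]


def centroid(rows: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(
    data: Dict[int, List[Vector]]
) -> Dict[Tuple[int, int], PairwiseDecider]:
    """Build one binary decider per unordered class pair (i, j) with i < j.

    Assumption: each decider is a nearest-centroid rule used as a simple
    stand-in for a pairwise SVM; it returns either i or j.
    """
    centroids = {label: centroid(rows) for label, rows in data.items()}
    deciders: Dict[Tuple[int, int], PairwiseDecider] = {}
    for i, j in combinations(sorted(data), 2):
        # Bind i and j as defaults so each closure keeps its own pair.
        def decide(x: Vector, i: int = i, j: int = j) -> int:
            d_i = squared_distance(x, centroids[i])
            d_j = squared_distance(x, centroids[j])
            return i if d_i <= d_j else j

        deciders[(i, j)] = decide
    return deciders


def ddag_predict(
    x: Vector,
    classes: Sequence[int],
    deciders: Dict[Tuple[int, int], PairwiseDecider],
) -> int:
    """Walk the DDAG: test first vs. last remaining class, drop the loser."""
    remaining = sorted(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = deciders[(first, last)](x)
        if winner == first:
            remaining.pop()    # the last class is eliminated
        else:
            remaining.pop(0)   # the first class is eliminated
    return remaining[0]


# Hypothetical toy data: three classes of 2-D "document vectors".
TOY_DATA: Dict[int, List[Vector]] = {
    0: [[0.0, 0.0], [0.0, 1.0]],
    1: [[5.0, 5.0], [6.0, 5.0]],
    2: [[10.0, 0.0], [10.0, 1.0]],
}
DECIDERS = train_pairwise(TOY_DATA)

if __name__ == "__main__":
    for point in ([0.2, 0.3], [5.4, 4.8], [9.5, 0.7]):
        print(point, "->", ddag_predict(point, list(TOY_DATA), DECIDERS))
```

With K classes, this scheme trains K(K-1)/2 pairwise deciders but evaluates only K-1 of them per prediction, which is one reason DDAG-style architectures are attractive for large-scale multi-class tasks.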