Abstract: Applications built on social network text are developing rapidly, and studying text features and classification is of great value for extracting important information. This paper introduces the common feature selection algorithms and feature representation methods, the basic principles, advantages, and disadvantages of SVM and KNN, and the evaluation metrics for classification algorithms. For the mutual information feature selection function, it describes the processing flow, the shortcomings, and the optimizations. Because mutual information does not balance positively and negatively correlated features, a balance weight attribute factor and a feature difference factor are introduced to compensate. The experimental section describes the specific process: word segmentation, stop-word removal, application of various feature selection algorithms (including the optimized mutual information), and weighting with TF-IDF. Under the two classification algorithms, SVM and KNN, the feature selection algorithms are compared according to the evaluation metrics. Experiments show that the optimized mutual information feature selection performs well, and that it performs better with SVM than with KNN, which supports its validity.
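As a rough illustration of the baseline that the abstract sets out to improve, the sketch below scores terms with the standard document-level mutual information, MI(t, c) = log(P(t, c) / (P(t) P(c))), and keeps the top-scoring terms. It is not the optimized variant: the balance weight attribute factor and feature difference factor are not defined in the abstract and are not reproduced here. The function names and the toy documents are assumptions made for the example.

```python
import math
from collections import Counter

def mutual_information(docs, labels):
    """Return {(term, label): MI} for every term/label pair that co-occurs.

    Probabilities are estimated from document counts:
    MI(t, c) = log( N * n(t, c) / (n(t) * n(c)) ).
    """
    n_docs = len(docs)
    class_count = Counter(labels)
    term_count = Counter()
    joint_count = Counter()
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            term_count[term] += 1
            joint_count[(term, label)] += 1
    return {
        (term, label): math.log(
            joint * n_docs / (term_count[term] * class_count[label])
        )
        for (term, label), joint in joint_count.items()
    }

def select_features(docs, labels, k):
    """Keep the k terms with the highest MI over any class (ties broken alphabetically)."""
    best = {}
    for (term, _), score in mutual_information(docs, labels).items():
        best[term] = max(score, best.get(term, float("-inf")))
    return sorted(best, key=lambda t: (-best[t], t))[:k]

# Hypothetical, already segmented and stop-word-filtered documents.
docs = [["goal", "match"], ["goal", "team"], ["vote", "party"], ["vote", "match"]]
labels = ["sport", "sport", "politics", "politics"]

scores = mutual_information(docs, labels)
selected = select_features(docs, labels, 4)
```

In this toy corpus, "match" appears equally often in both classes, so its score is zero and it is the one term dropped. The selected terms would then be weighted (for example with TF-IDF) before being passed to a classifier.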
Funding: Supported by the Sanya Science and Technology Special Fund (project number 2019YD23).
Abstract: This article focuses on the Hainan Wenbifeng Pangu Cultural Tourist Area in Ding’an County, Hainan Province. Network text analysis was used to collect internet promotion information about the Wenbifeng Scenic Area. Data were gathered from five platforms (Xiaohongshu, TikTok, WeChat Official Accounts, Headlines Today, and Baidu) to understand the current situation of, and existing problems in, the tourism promotion of the Wenbifeng Scenic Area. The article summarizes and analyzes these issues. Finally, drawing on on-site research, it proposes targeted suggestions for tourism promotion in the Wenbifeng Scenic Area.
Abstract: Using the Shenlong Gorge Scenic Area in Nanchuan as a case study, this research adopts a network text analysis approach to examine the current state of tourism service management within the scenic area. Online tourist reviews from the Dianping platform were collected with Python and analyzed with ROST CM 6, focusing on high-frequency words, social semantic networks, and tourist sentiment. The findings describe the present state of tourism service management in the Shenlong Gorge Scenic Area and provide theoretical support and practical guidance for the scenic area’s management authorities. Based on the analysis, an optimized pathway for tourism service management is proposed to support the sustainable development of the Shenlong Gorge Scenic Area in Nanchuan, improve tourism service management, and enhance the quality of tourists’ service experiences.
Abstract: This paper investigates the role of rare (infrequent) terms in the accuracy of English text categorization using Polynomial Networks (PNs). To study their impact, different term reduction criteria and different term weighting schemes were tested on the Reuters corpus using PNs. Each term weighting scheme on each reduced term set was tested twice: once keeping the rare terms and once removing them. All the experiments conducted in this research show that keeping rare terms substantially improves the performance of Polynomial Networks in text categorization, regardless of the term reduction method, the number of terms used in classification, or the term weighting scheme adopted.
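The keep-versus-remove comparison described above can be pictured with a small sketch that builds two vocabularies from the same documents, one retaining rare terms and one dropping them. The abstract does not say how "rare" is defined, so the document-frequency threshold `min_df`, the function names, and the sample documents are all assumptions made for illustration. The Polynomial Network classifier itself is not shown.

```python
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so repeats within a document count once
    return df

def build_vocabulary(docs, min_df=1):
    """Return the sorted terms that appear in at least min_df documents."""
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)

# Hypothetical tokenized documents.
docs = [["oil", "price"], ["oil", "opec"], ["price", "oil"]]

vocab_with_rare = build_vocabulary(docs, min_df=1)     # rare terms kept
vocab_without_rare = build_vocabulary(docs, min_df=2)  # terms seen in one document dropped
```

Each vocabulary would then feed the same term weighting scheme and classifier, so that any difference in accuracy can be attributed to the rare terms.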
Funding: The Weapon Equipment Beforehand Research Foundation of China (No. 9140A19030314JB35275); the Army Technology Element Foundation of China (No. A157167).
Abstract: Reliability parameter selection is very important during equipment project design and demonstration. This paper addresses the problem of selecting reliability parameters and deciding how many to use. To solve it, text mining is first used to extract features from text data and reduce the feature set, and a frequent pattern tree (FPT) of the text data is constructed so that frequent item-sets among the key factors can be inferred with the frequent pattern growth (FP-growth) algorithm. Then, on the basis of a fuzzy Bayesian network (FBN) and the sample distribution, the paper fuzzifies the key attributes that form association relationships in the frequent item-sets, together with their main parameters, eliminates subjective influence factors, and obtains the conditional mutual information and the maximum-weight directed tree among all attribute variables. Furthermore, a hybrid model is established by inferring the fuzzy prior and conditional probabilities and deriving a parameter learning method. Finally, an example indicates that the model is credible and effective.
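To make the notion of a frequent item-set concrete, the sketch below counts the support of every item combination by brute force and keeps those that meet a minimum support. This is deliberately not the frequent pattern tree construction named in the abstract: it enumerates all subsets, which is only practical for tiny inputs, but it returns the same item-sets a tree-based miner would find on the same data. The function name, the threshold, and the sample transactions are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset_tuple: support} for item-sets in at least min_support transactions.

    Naive enumeration of every subset of every transaction; exponential in
    transaction length, so suitable only for small examples.
    """
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # canonical order so tuples are comparable
        for size in range(1, len(unique) + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}

# Hypothetical key factors extracted from four text records.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]

frequent = frequent_itemsets(transactions, min_support=2)
```

With a minimum support of 2, the pairs ("a", "b") and ("a", "c") survive while ("b", "c") and the full triple do not, since each of those occurs in only one record.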