The Internet has become a significant source for sharing information and for expressing users' opinions about products and their interests, together with the associated aspects. It is essential to learn about product reviews; however, to react to such reviews, extracting aspects of the entity to which these reviews belong is equally important. Aspect-based Sentiment Analysis (ABSA) refers to aspects extracted from an opinionated text. The literature proposes different approaches for ABSA; however, most research focuses on supervised approaches, which require labeled datasets with manual sentiment polarity labeling and aspect tagging. This study proposes a semi-supervised approach with minimal human supervision to extract aspect terms by detecting the aspect categories. Hence, the study deals with two main sub-tasks in ABSA, namely Aspect Category Detection (ACD) and Aspect Term Extraction (ATE). In the first sub-task, aspect categories are extracted using topic modeling and further filtered by an oracle, and they are fed to zero-shot learning as the prompts and the augmented text. The predicted categories are the input for finding similar phrases, curated by extracting meaningful phrases (e.g., nouns, proper nouns, and Named Entity Recognition (NER) entities), to detect the aspect terms. The study sets a baseline accuracy for the two main sub-tasks in ABSA on the Multi-Aspect Multi-Sentiment (MAMS) dataset along with SemEval-2014 Task 4 subtask 1 to show that the proposed approach helps detect aspect terms via aspect categories.
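The second sub-task described above (matching curated candidate phrases to predicted categories) can be illustrated with a minimal sketch. This is not the study's implementation: the three-dimensional vectors, the `extract_aspect_terms` helper, and the 0.8 threshold are hypothetical, and cosine similarity over a hand-written table stands in for whatever embedding model and similarity measure the authors actually use.

```python
import math

# Hypothetical toy "embeddings". A real system would obtain vectors from a
# trained embedding model; these numbers are made up for illustration only.
TOY_VECTORS = {
    "food": (0.9, 0.1, 0.0),
    "service": (0.1, 0.9, 0.0),
    "pasta": (0.8, 0.2, 0.1),
    "waiter": (0.2, 0.8, 0.1),
    "evening": (0.1, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_aspect_terms(categories, candidates, threshold=0.8):
    """For each predicted category, keep the candidate phrases whose
    similarity to the category reaches the (assumed) threshold."""
    result = {}
    for category in categories:
        category_vec = TOY_VECTORS[category]
        scored = [(cosine(category_vec, TOY_VECTORS[p]), p) for p in candidates]
        # Highest-scoring phrases first.
        scored.sort(reverse=True)
        result[category] = [p for score, p in scored if score >= threshold]
    return result


# Candidates would come from noun / proper-noun / NER extraction.
terms = extract_aspect_terms(["food", "service"], ["pasta", "waiter", "evening"])
print(terms)
```

With these toy vectors, "pasta" is assigned to the "food" category and "waiter" to "service", while "evening" falls below the threshold for both.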
Effective analysis of large text collections remains a challenging problem given the growing volume of available text data. Recently, text mining techniques have been rapidly developed for automatically extracting key information from massive text data. Topic modeling, one of the novel techniques that extract a thematic structure from documents, is widely used to generate text summaries and foster an overall understanding of the corpus content. Although powerful, this technique may not be directly applicable to general analytics scenarios, since the topics and topic-document relationships are often presented probabilistically in models. Moreover, information that plays an important role in knowledge discovery, for example, times and authors, is hardly reflected in topic modeling for comprehensive analysis. In this paper, we address this issue by presenting a visual analytics system, VISTopic, to help users make sense of large document collections based on topic modeling. VISTopic first extracts a set of hierarchical topics using a novel hierarchical latent tree model (HLTM) (Liu et al., 2014). Specifically, a topic view accounting for the model features is designed for overall understanding and interactive exploration of the topic organization. To leverage multi-perspective information for visual analytics, VISTopic further provides an evolution view to reveal the trend of topics and a document view to show details of topical documents. Three case studies based on the dataset of the IEEE VIS conference demonstrate the effectiveness of our system in gaining insights from large document collections.
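To make the probabilistic topic-document relationship and the idea of a topic trend concrete, here is a minimal sketch under stated assumptions. It does not reproduce VISTopic or HLTM: the documents, years, topic names, and the helpers `topic_trend` and `dominant_topic` are all hypothetical, and a simple per-year average stands in for whatever aggregation the evolution view actually uses.

```python
from collections import defaultdict

# Hypothetical documents: (publication year, {topic: probability}).
# In practice the distributions would come from a fitted topic model.
docs = [
    (2012, {"graphs": 0.7, "volume rendering": 0.3}),
    (2012, {"graphs": 0.2, "volume rendering": 0.8}),
    (2013, {"graphs": 0.9, "volume rendering": 0.1}),
]


def topic_trend(documents):
    """Average each topic's probability over the documents of each year."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for year, distribution in documents:
        counts[year] += 1
        for topic, probability in distribution.items():
            totals[year][topic] += probability
    return {
        year: {topic: total / counts[year] for topic, total in topics.items()}
        for year, topics in totals.items()
    }


def dominant_topic(distribution):
    """Collapse a probabilistic assignment to its single most likely topic."""
    return max(distribution, key=distribution.get)


trend = topic_trend(docs)
for year in sorted(trend):
    print(year, trend[year])
print(dominant_topic(docs[1][1]))
```

The per-year averages are the kind of series an evolution view could plot, and `dominant_topic` shows one way to turn a probabilistic assignment into a discrete label for a document view.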
Funding: This project is funded by a grant proposal (Ref: YBCB2009041-44) of Huawei Technologies Noah’s Ark Lab.