Against the background of urban regeneration, “reducing quantity and improving quality” has been the core demand for planning and development in Beijing’s central urban areas, and it applies to the development of the city’s urban vitality spaces as well. Meanwhile, in the digital age, users’ sentiment needs for urban spaces continue to increase, making it necessary to consider the sentiment connection between people and the built environment when creating vitality spaces. This study examines two typical examples of urban regeneration vitality spaces in Beijing, “Xidan The New” and “Beijing Fun,” and uses multidisciplinary sentiment connection theory as a lever to introduce the thinking, technology, and methods of data science, addressing the lack of attention to small-scale built environment features in vitality space research. The study integrates top-down sentiment connection scale data, qualitative research data, and bottom-up big data sentiment analysis to explore the characteristics of small-scale vitality spaces that can stimulate positive sentiment experiences for users, and to bring the discussion of spatial vitality back to the human scale and real three-dimensional spatial experience. Finally, the study summarizes the elements of small-scale spatial vitality and constructs evaluation indicators for sentiment connection to vitality spaces. The goal is to expand and refine the research scope of urban vitality spaces, update the corresponding design and evaluation methods, and ensure that the spatial expression of “vitality” truly reflects liveliness and responds to human nature.
In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations’ enterprise systems. This gives organizations unprecedented opportunities to mine big data for valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the development of a novel big data stream analytics framework, BDSASA, that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research is that organizations can apply our big data stream analytics framework to analyze consumers’ product preferences, and hence develop more effective marketing and production strategies.
Background: Weibo is a Twitter-like micro-blog platform in China where people post about real-life events and express their feelings in short texts. Since the outbreak of the Covid-19 pandemic, thousands of people have expressed their concerns and worries about the outbreak via Weibo, showing the existence of public panic. Methods: This paper proposes a sentiment analysis approach to discover public panic. First, we use Octoparse to obtain Weibo posts about the hot topic "Covid-19 Pandemic". Second, we break those sentences down into independent words and clean the data by removing stop words. Then, we use a sentiment score function that handles negative words, adverbs, and sentiment words to compute the sentiment score of each Weibo post. Results: We observe the distribution of sentiment scores and derive a benchmark for evaluating public panic. We also apply the same process to mass sentiment under other topics to test the effectiveness of the sentiment function, which shows that our function works well.
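The scoring step described above can be sketched as follows. This is a minimal illustration, not the authors' function: the tiny English lexicons, the degree weights, and the rule that a negation or adverb applies to the next sentiment word are all assumptions made for the example. A real Weibo pipeline would also need Chinese word segmentation and full sentiment dictionaries.

```python
# Hypothetical toy lexicons (assumptions for illustration only).
SENTIMENT = {"good": 1.0, "happy": 1.0, "bad": -1.0, "afraid": -1.0, "worried": -1.0}
NEGATIONS = {"not", "never", "no"}
DEGREE = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
STOP_WORDS = {"the", "a", "i", "am", "is", "of"}


def sentiment_score(tokens):
    """Sum sentiment-word scores, flipped by negations and scaled by degree adverbs."""
    score = 0.0
    sign = 1.0    # flipped by each negation seen before a sentiment word
    weight = 1.0  # scaled by each degree adverb seen before a sentiment word
    for tok in tokens:
        if tok in STOP_WORDS:
            continue  # data cleaning: ignore stop words
        if tok in NEGATIONS:
            sign = -sign
        elif tok in DEGREE:
            weight *= DEGREE[tok]
        elif tok in SENTIMENT:
            score += sign * weight * SENTIMENT[tok]
            sign, weight = 1.0, 1.0  # modifiers apply to one sentiment word only
    return score


post = "i am not very happy".split()
print(sentiment_score(post))
```

Scoring every post this way yields a distribution of scores, from which a threshold (the "benchmark" in the abstract) could be chosen.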
With the huge increase in the popularity of Twitter in recent years, the ability to draw information about public sentiment from Twitter data has become an area of immense interest. Numerous methods of determining the sentiment of tweets, both in general and with regard to a specific topic, have been developed; however, most of these operate in a batch learning environment where instances may be passed over multiple times. Since Twitter data in real-world situations are far more similar to a stream environment, we propose several algorithms that classify the sentiment of tweets in a data stream. We were able to determine whether a tweet was subjective or objective with an error rate as low as 0.24 and an F-score as high as 0.85. For the determination of positive or negative sentiment in subjective tweets, an error rate as low as 0.23 and an F-score as high as 0.78 were achieved.
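The difference between batch and stream evaluation can be sketched with a "test-then-train" loop, in which each instance is used for prediction first and for learning second, and is never revisited. The `MajorityClassLearner` below is a hypothetical placeholder, not one of the algorithms proposed in the paper; any incremental classifier with `predict` and `learn` methods could take its place.

```python
from collections import Counter


class MajorityClassLearner:
    """Placeholder incremental learner: always predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, text):
        # No prediction is possible before any label has been seen.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, text, label):
        self.counts[label] += 1


def prequential_error(stream, learner):
    """Test-then-train: predict each instance once, then learn from it."""
    errors = seen = 0
    for text, label in stream:
        if learner.predict(text) != label:
            errors += 1
        learner.learn(text, label)
        seen += 1
    return errors / seen if seen else 0.0


# Hypothetical labelled tweets, consumed in a single pass.
stream = [("tweet a", "pos"), ("tweet b", "pos"), ("tweet c", "neg"), ("tweet d", "pos")]
print(prequential_error(iter(stream), MajorityClassLearner()))
```

Because the stream is consumed as an iterator, the loop cannot make a second pass over the data, which is the constraint that separates the stream setting from batch learning.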
This study is an exploratory analysis of applying natural language processing techniques, namely Term Frequency-Inverse Document Frequency (TF-IDF) and sentiment analysis, to Twitter data. The novelty of this work lies in determining the overall sentiment of a politician’s tweets based on the TF-IDF values of the terms used in their published tweets. By calculating the TF-IDF value of terms from the corpus, this work displays the correlation between TF-IDF score and polarity. The results show that calculating the TF-IDF scores of the corpus allows for a more accurate representation of the overall polarity, since terms are weighted by their uniqueness and relevance rather than just by the frequency at which they appear in the corpus.
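The idea of weighting polarity by TF-IDF rather than by raw frequency can be sketched as below. This is an illustration under stated assumptions, not the study's procedure: the `POLARITY` lexicon is hypothetical, TF-IDF is the plain `tf * log(N / df)` variant, and the overall polarity is taken as a TF-IDF-weighted average of term polarities.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical polarity lexicon (assumption for illustration).
POLARITY = {"great": 1.0, "strong": 0.5, "terrible": -1.0}


def weighted_polarity(docs):
    """Average term polarity, weighted by each term's TF-IDF score."""
    total = norm = 0.0
    for doc_weights in tf_idf(docs):
        for term, weight in doc_weights.items():
            if term in POLARITY:
                total += weight * POLARITY[term]
                norm += weight
    return total / norm if norm else 0.0


tweets = [["great", "economy"], ["great", "jobs"], ["terrible", "economy"]]
print(weighted_polarity(tweets))
```

In this toy corpus "great" appears twice and "terrible" once, so a raw frequency count leans positive, while the rarer "terrible" receives a higher TF-IDF weight and pulls the weighted polarity below zero.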
Opinion (sentiment) analysis on big data streams, from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews, gives organizations in every field opportunities to discover valuable intelligence from massive user-generated text streams. However, traditional content analysis frameworks handle the unprecedented volume of unstructured text streams and the complexity of real-time opinion analysis inefficiently. In this paper, we propose a parallel real-time sentiment analysis system, the Social Media Data Stream Sentiment Analysis Service (SMDSSAS), which performs multiple phases of sentiment analysis on social media text streams in real time with two fully analytic opinion mining models, to cope with the scale of text data streams and the complexity of sentiment analysis on unstructured text streams. We propose two aspect-based opinion mining models, a Deterministic and a Probabilistic sentiment model, for real-time sentiment analysis on data streams related to a user-given topic. Experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, for real-time analysis of public opinion toward two presidential candidates, showed that the proposed system correctly predicted Donald Trump as the winner of the 2016 Presidential election. Cross-validation results showed that the proposed sentiment models, together with the real-time streaming components of our framework, analyzed opinions on the two presidential candidates effectively, with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, improvements of 1% to 22% over results in the existing literature.
Microblog is a social platform with a huge user community and massive data. We propose a semantic recommendation mechanism for microblogs based on sentiment analysis. First, the keywords and sensibility words used by this mechanism are extracted through natural language processing, including segmentation, lexical analysis, and strategy selection. Then, we query a background knowledge base built on linked open data (LOD) using the users' basic information. The experimental results show that the recommendation accuracy is in the range of 70% to 89% with sentiment analysis and semantic queries. Compared with traditional recommendation methods, this method satisfies users' requirements considerably better.
People's attitudes towards public events or products may change over time rather than remaining in the same state. Understanding how sentiments change over time is an interesting and important problem with many applications. Given a certain public event or product, a user's sentiments expressed in a microblog stream can be regarded as a vector. In this paper, we define a novel problem of sentiment evolution analysis and develop a simple yet effective method to detect sentiment evolution at the user level for public events. We first propose a multidimensional sentiment model with a hierarchical structure to model users' complicated sentiments. Based on this model, we use the FP-growth tree algorithm to mine frequent sentiment patterns and perform sentiment evolution analysis using Kullback-Leibler divergence. Moreover, we develop an improved Affinity Propagation algorithm to detect why people change their sentiments. Experimental evaluations on real data sets show that sentiment evolution analysis can be carried out effectively using the method proposed in this article.
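The role of Kullback-Leibler divergence in detecting sentiment change can be sketched as follows. This is a minimal illustration under assumptions: the sentiment categories, the two time windows, and the additive smoothing constant `eps` are hypothetical, and the abstract does not say how the authors construct the distributions being compared.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """D_KL(P || Q) for two discrete distributions given as equal-length lists.

    A small eps is added to every entry (an assumption of this sketch) so that
    zero probabilities do not cause division by zero or log(0).
    """
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sp, sq = sum(p), sum(q)  # renormalize after smoothing
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(p, q))


# Hypothetical share of (positive, neutral, negative) posts by one user
# in two consecutive time windows.
week_1 = [0.7, 0.2, 0.1]
week_2 = [0.2, 0.3, 0.5]

print(kl_divergence(week_1, week_1))  # identical windows: no evolution
print(kl_divergence(week_2, week_1))  # larger value: sentiment has shifted
```

A divergence near zero indicates a stable sentiment state, while a value above some chosen threshold would flag a window in which the user's sentiment evolved. KL divergence is asymmetric, so the order of the two windows matters.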
Social media data have created a paradigm shift in assessing situational awareness during natural disasters and emergencies such as wildfires, hurricanes, and tropical storms. Twitter, as an emerging data source, is an effective and innovative digital platform for observing trends from the perspective of social media users who are direct or indirect witnesses of a calamitous event. This paper collects and analyzes Twitter data related to the recent wildfire in California to perform a trend analysis by identifying firsthand and credible information from Twitter users. The work investigates tweets on the wildfire and classifies them by witness type: 1) direct witnesses and 2) indirect witnesses. The collected and analyzed information can be useful to law enforcement agencies and humanitarian organizations for communicating and verifying situational awareness during wildfire hazards. Trend analysis is an aggregated approach that includes sentiment analysis and topic modeling, performed through domain-expert manual annotation and machine learning. Trend analysis ultimately builds a fine-grained analysis to assess evacuation routes and provide valuable information to frontline emergency responders.
Funding (urban vitality spaces study): Supported by the National Natural Science Foundation of China Youth Project (No. 52208005); the first batch of Industry-Education Collaboration Projects of the Ministry of Education in 2024 (No. 230805329162932, King Far-CES “Human Factors and Ergonomics” project); and the Beijing Social Science Foundation Youth Project (No. 22GLC063).
Funding (microblog semantic recommendation study): Supported by the National Natural Science Foundation of China (60803160 and 61272110); the Key Projects of the National Social Science Foundation of China (11&ZD189); the Natural Science Foundation of Hubei Province (2013CFB334); the Natural Science Foundation of the Educational Agency of Hubei Province (Q20101110); the State Key Lab of Software Engineering Open Foundation of Wuhan University (SKLSE2012-09-07); the Teaching Research Project of Hubei Province (2011s005); and the Wuhan Key Technology Support Program (2013010602010216).
Funding and acknowledgements (sentiment evolution study): The authors would like to thank the reviewers for their detailed reviews and constructive comments, which have helped improve the quality of this paper. The research was supported in part by the National Basic Research Program of China (973 Program, No. 2013CB329601, No. 2013CB329604), the National Natural Science Foundation of China (No. 91124002, 61372191, 61472433, 61202362, 11301302), and the China Postdoctoral Science Foundation (2013M542560). All opinions, findings, conclusions, and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.