74 articles found
Investigation of the relationship between number of tweets and USDTRY exchange rate with wavelet coherence and transfer entropy analysis
1
Authors: Cengiz Karatas, Sukriye Tuysuz, Kazim Berk Kucuklerli, Veysel Ulusoy 《Financial Innovation》 2025, No. 1, pp. 755-774 (20 pages)
Predicting the currency exchange rate is crucial for financial agents, risk managers, and policymakers. Traditional approaches use publicly announced news on macroeconomic and financial variables as predictors of currency exchange. However, the rise of social media may have changed the source of information. For instance, tweets can help investors make informed decisions about the foreign exchange (FX) market by reflecting market sentiment and opinion. From another aspect, changes in currency exchange may incite agents to post tweets. Are tweets good predictors of currency exchange? Is the relationship between tweets and currency exchange bidirectional? To answer these questions, we investigate the comovement/causality between the number of #dolar (“enflasyon”, resp.) tweets and the USDTRY exchange rate using wavelet coherence and transfer entropy (TE). Wavelet coherence allows us to determine the relationship between the number of tweets and the USDTRY rate in the time-frequency domain. TE enables us to quantify the net information flow between the number of tweets and USDTRY. Data from October 2020 to March 2022 were used. The results remain robust regardless of the frequency of the retained data (daily or hourly) and the method used (wavelet or TE). Based on our results, USDTRY is correlated with the number of #dolar (#inflation) tweets mainly in the short run and a few times in the medium run. These relationships change through time and frequency (wavelet analysis results). However, the results from TE indicate a bidirectional relationship between the number of #dolar (#inflation) tweets and the USDTRY exchange rate. The influence of the exchange rate on the number of tweets is highly pronounced. Financial agents, risk managers, policymakers, and investors should therefore pay moderate attention to the number of #dolar (#inflation) tweets when trading or forecasting the USD-TRY exchange rate.
Keywords: tweets; tweets number; Currency exchange; Wavelet coherence; Transfer entropy
ReTweeting Analysis and Prediction in Microblogs: An Epidemic Inspired Approach (Cited by: 11)
2
Authors: 王昊, 李义萍, 冯卓楠, 冯铃 《China Communications》 SCIE CSCD 2013, No. 3, pp. 13-24 (12 pages)
Microblogs currently play an important role in social communication. Hot topics being tweeted can become popular within a very short time as a result of retweeting. Understanding retweeting behavior is desirable for a number of tasks such as topic detection, personalized message recommendation, and fake information monitoring and prevention. Interestingly, the propagation of tweets bears some similarity to the spread of infectious diseases. We present a method to model the spread of tweets in microblogs based on the classic Susceptible-Infectious-Susceptible (SIS) epidemic model, which was developed in the medical field for the spread of infectious diseases. On the basis of this model, future retweeting trends can be predicted. Our experiments on data obtained from the Chinese micro-blogging website Sina Weibo show that the proposed model has lower predictive error than four commonly used prediction methods.
Keywords: tweets; retweeting; prediction; SIS epidemic model
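As a rough illustration of the SIS dynamics named in the abstract above, the following Python sketch integrates the textbook SIS equation dI/dt = beta * I * (1 - I) - gamma * I with a simple Euler step. It is not the authors' model: the function name `simulate_sis`, the parameter values, and the reading of "infected" as "users currently retweeting a topic" are assumptions made only for this example.

```python
def simulate_sis(beta, gamma, i0, steps, dt=1.0):
    """Euler integration of the classic SIS equation.

    beta  : infection (here: retweet adoption) rate, assumed for illustration
    gamma : recovery (here: loss of interest) rate, assumed for illustration
    i0    : initial infected fraction, between 0 and 1
    Returns the infected fraction at every step, including the start.
    """
    fractions = [i0]
    i = i0
    for _ in range(steps):
        # Growth from contacts between infected and susceptible users,
        # minus infected users who return to the susceptible state.
        i = i + dt * (beta * i * (1.0 - i) - gamma * i)
        i = min(1.0, max(0.0, i))  # keep the fraction in [0, 1]
        fractions.append(i)
    return fractions


if __name__ == "__main__":
    trend = simulate_sis(beta=0.5, gamma=0.2, i0=0.01, steps=200)
    # When beta > gamma the fraction settles near 1 - gamma / beta.
    print(round(trend[-1], 4))
```

When beta is smaller than gamma the same function shows the topic dying out, which is the qualitative behavior a retweet-trend predictor built on SIS would rely on.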
Tweet Sentiment Analysis (TSA) for Cloud Providers Using Classification Algorithms and Latent Semantic Analysis (Cited by: 1)
3
Authors: Ioannis Karamitsos, Saeed Albarhami, Charalampos Apostolopoulos 《Journal of Data Analysis and Information Processing》 2019, No. 4, pp. 276-294 (19 pages)
The availability and advancement of cloud computing service models such as IaaS, SaaS, and PaaS, which introduce on-demand self-service, auto scaling, easy maintenance, and pay as you go, have dramatically transformed the way organizations design and operate their datacenters. However, some organizations still have many concerns, such as security, governance, lack of expertise, and migration. The purpose of this paper is to discuss cloud computing customers’ opinions, feedback, attitudes, and emotions towards cloud computing services using sentiment analysis. The associated aim is to help people and organizations understand the benefits and challenges of cloud services from the general public’s perspective, as well as opinions about existing cloud providers, focusing on three main cloud providers: Azure, Amazon Web Services (AWS), and Google Cloud. The methodology is based on sentiment analysis applied to tweets extracted from the social media platform Twitter via its search API. We extracted a sample of 11,000 tweets, with each cloud provider accounting for a roughly equal proportion of the tweets, based on relevant hashtags and keywords. The analysis starts by combining the tweets to find the overall polarity about cloud computing, then breaks the tweets down to find the specific polarity for each cloud provider. The Bing and NRC lexicons are employed to measure the polarity and emotion of the terms in the tweets. The overall polarity classification of the tweets across all cloud providers shows 68.5% positive and 31.5% negative. More specifically, Azure shows 63.8% positive and 36.2% negative tweets, Google Cloud shows 72.6% positive and 27.4% negative tweets, and AWS shows 69.1% positive and 30.9% negative tweets.
Keywords: Azure; AWS; Google Cloud; Machine Learning; Sentiment Analysis; tweets
Dragonfly Optimization with Deep Learning Enabled Sentiment Analysis for Arabic Tweets (Cited by: 1)
4
Authors: Aisha M. Mashraqi, Hanan T. Halawani 《Computer Systems Science & Engineering》 SCIE EI 2023, No. 8, pp. 2555-2570 (16 pages)
Sentiment Analysis (SA) is one of the Machine Learning (ML) techniques that has been investigated by several researchers in recent years, especially due to the evolution of novel data collection methods focused on social media. In the literature, it has been reported that more SA data is created for the English language than for any other language. It is challenging to perform SA on Arabic Twitter data owing to the informal nature and rich morphology of the Arabic language. An earlier study on SA for Arabic Twitter focused mostly on automatic extraction of features from the text. Neural word embedding has been employed in the literature, since it is less labor-intensive than automatic feature engineering. By ignoring the context of sentiment, most word-embedding models follow the syntactic data of words. The current study presents a new Dragonfly Optimization with Deep Learning Enabled Sentiment Analysis for Arabic Tweets (DFODL-SAAT) model. The aim of the presented DFODL-SAAT model is to distinguish the sentiments of opinions that are tweeted in the Arabic language. First, data cleaning and pre-processing steps are performed to convert the input tweets into a useful format. In addition, the TF-IDF model is used as a feature extractor to generate the feature vectors. An Attention-based Bidirectional Long Short Term Memory (ABLSTM) technique is then applied for identification and classification of sentiments. Finally, the hyperparameters of the ABLSTM model are optimized using the DFO algorithm. The performance of the proposed DFODL-SAAT model was validated using a benchmark dataset and the outcomes were investigated under different aspects. The experimental outcomes highlight the superiority of the DFODL-SAAT model over recent approaches.
Keywords: Natural language processing; sentiment analysis; Arabic tweets; deep learning; metaheuristics; lexicon approach
A Machine Learning Approach to Cyberbullying Detection in Arabic Tweets
5
Authors: Dhiaa Musleh, Atta Rahman, Mohammed Abbas Alkherallah, Menhal Kamel Al-Bohassan, Mustafa Mohammed Alawami, Hayder Ali Alsebaa, Jawad Ali Alnemer, Ghazi Fayez Al-Mutairi, May Issa Aldossary, Dalal A. Aldowaihi, Fahd Alhaidari 《Computers, Materials & Continua》 SCIE EI 2024, No. 7, pp. 1033-1054 (22 pages)
With the rapid growth of internet usage, a new situation has been created that enables bullying. Cyberbullying has increased over the past decade, and it has the same adverse effects as face-to-face bullying, such as anger, sadness, anxiety, and fear. With the anonymity people get on the internet, they tend to be more aggressive and express their emotions freely without considering the effects, which can be a reason for the increase in cyberbullying and is the main motive behind the current study. This study presents a thorough background of cyberbullying and the techniques used to collect, preprocess, and analyze the datasets. Moreover, a comprehensive review of the literature was conducted to identify research gaps and effective techniques and practices in cyberbullying detection in various languages, and it was deduced that there is significant room for improvement in the Arabic language. As a result, the current study focuses on the investigation of shortlisted machine learning algorithms in natural language processing (NLP) for the classification of Arabic datasets collected from Twitter (also known as X). In this regard, support vector machine (SVM), Naive Bayes (NB), Random Forest (RF), Logistic Regression (LR), Bootstrap aggregating (Bagging), Gradient Boosting (GBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) were shortlisted and investigated due to their effectiveness in similar problems. Finally, the scheme was evaluated with well-known performance measures such as accuracy, precision, recall, and F1-score. XGBoost exhibited the best performance with 89.95% accuracy, which is promising compared to the state of the art.
Keywords: Supervised machine learning; ensemble learning; cyberbullying; Arabic tweets; NLP
Quantum Particle Swarm Optimization with Deep Learning-Based Arabic Tweets Sentiment Analysis
6
Authors: Badriyya B. Al-onazi, Abdulkhaleq Q.A. Hassan, Mohamed K. Nour, Mesfer Al Duhayyim, Abdullah Mohamed, Amgad Atta Abdelmageed, Ishfaq Yaseen, Gouse Pasha Mohammed 《Computers, Materials & Continua》 SCIE EI 2023, No. 5, pp. 2575-2591 (17 pages)
Sentiment Analysis (SA), a Machine Learning (ML) technique, is often applied in the literature. The SA technique is specifically applied to data collected from social media sites. Earlier research on the SA of tweets mostly aimed at automating the feature extraction process. Against this background, the current study introduces a novel method called Quantum Particle Swarm Optimization with Deep Learning-Based Sentiment Analysis on Arabic Tweets (QPSODL-SAAT). The presented QPSODL-SAAT model determines and classifies the sentiments of tweets written in Arabic. Initially, data pre-processing is performed to convert the raw tweets into a useful format. Then, the word2vec model is applied to generate the feature vectors. The Bidirectional Gated Recurrent Unit (BiGRU) classifier is used to identify and classify the sentiments. Finally, the QPSO algorithm is used for optimal fine-tuning of the hyperparameters of the BiGRU model. The proposed QPSODL-SAAT model was experimentally validated using standard datasets. An extensive comparative analysis was conducted, and the proposed model achieved a maximum accuracy of 98.35%. The outcomes confirmed the superiority of the proposed QPSODL-SAAT model over the other approaches, such as the Surface Features (SF), Generic Embeddings (GE), Arabic Sentiment Embeddings constructed using the Hybrid (ASEH) model, and the Bidirectional Encoder Representations from Transformers (BERT) model.
Keywords: Sentiment analysis; Arabic tweets; quantum particle swarm optimization; deep learning; word embedding
Dynamic Spatio-Temporal Tweet Mining for Event Detection: A Case Study of Hurricane Florence (Cited by: 1)
7
Authors: Mahdi Farnaghi, Zeinab Ghaemi, Ali Mansourian 《International Journal of Disaster Risk Science》 SCIE CSCD 2020, No. 3, pp. 378-393 (16 pages)
Extracting information about emerging events in large study areas through spatiotemporal and textual analysis of geotagged tweets provides the possibility of monitoring the current state of a disaster. This study proposes dynamic spatio-temporal tweet mining as a method for dynamic event extraction from geotagged tweets in large study areas. It introduces the use of a modified version of ordering points to identify the clustering structure (OPTICS) to address the intrinsic heterogeneity of Twitter data. To precisely calculate the textual similarity, three state-of-the-art text embedding methods, Word2vec, GloVe, and FastText, were used to capture both syntactic and semantic similarities. The impact of the selected embedding algorithms on the quality of the outputs was studied. Different combinations of spatial and temporal distances with the textual similarity measure were investigated to improve the event detection outcomes. The proposed method was applied to a case study related to 2018 Hurricane Florence. The method was able to precisely identify events of varied sizes and densities before, during, and after the hurricane. The feasibility of the proposed method was quantitatively evaluated using the Silhouette coefficient and qualitatively discussed. The proposed method was also compared to an implementation based on the standard density-based spatial clustering of applications with noise (DBSCAN) algorithm, where it showed more promising results.
Keywords: Disaster management; Hurricane Florence; Natural language processing; Spatio-temporal tweet analysis; tweet clustering; Twitter
A visual-textual fused approach to automated tagging of flood-related tweets during a flood event (Cited by: 3)
8
Authors: Xiao Huang, Cuizhen Wang, Zhenlong Li, Huan Ning 《International Journal of Digital Earth》 SCIE EI 2019, No. 11, pp. 1248-1264 (17 pages)
In recent years, social media such as Twitter have received much attention as a new data source for rapid flood awareness. The timely response and large coverage provided by citizen sensors significantly compensate for the limitations of non-timely remote sensing data and spatially isolated river gauges. However, automatic extraction of flood tweets from a massive tweets pool remains a challenge. Taking the Houston Flood in 2017 as a study case, this paper presents an automated flood tweets extraction approach that mines both the visual and textual information a tweet contains. A CNN architecture was designed to classify the visual content of flood pictures during the Houston Flood. A sensitivity test was then applied to extract flood-sensitive keywords that were further used to refine the CNN classified results. A duplication test was finally performed to trim the database by removing duplicated pictures to create the flood tweets pool for the flood event. The results indicated that coupling CNN classification results with flood-sensitive words in tweets allows a significant increase in precision while keeping the recall rate at a high level. The elimination of tweets containing duplicated pictures greatly contributes to higher spatio-temporal relevance to the flood.
Keywords: Data mining; flood; social media; CNN; tweets; geotagging
An enhanced cosine-based visual technique for the robust tweets data clustering (Cited by: 3)
9
Authors: Narasimhulu K., Meena Abarna K.T., Sivakumar B. 《International Journal of Intelligent Computing and Cybernetics》 EI 2021, No. 2, pp. 170-184 (15 pages)
Purpose: The purpose of the paper is to study the multiple viewpoints required to access more informative similarity features among tweet documents, which is useful for achieving robust tweet data clustering results. Design/methodology/approach: Let N be the number of tweet documents for topic extraction. In the initial tweet pre-processing step, unwanted text, punctuation, and other symbols are removed, and tokenization and stemming are performed. Bag-of-features are determined for the tweets; the tweets are then modelled with the obtained bag-of-features during topic extraction. Approximate topic features are extracted for every tweet document. The sets of topic features of the N documents are treated as multi-viewpoints. The key idea of the proposed work is to use multi-viewpoints in the similarity features computation. The following figure illustrates multi-viewpoint-based cosine similarity computation for five tweet documents (here N = 5); the corresponding documents are defined in a projected space with five viewpoints, say v_1, v_2, v_3, v_4, and v_5. For example, similarity features between two documents (viewpoints v_1 and v_2) are computed with respect to the other three viewpoints (v_3, v_4, and v_5), unlike the single viewpoint of the traditional cosine metric. Findings: Healthcare problems with tweets data. Topic models play a crucial role in the classification of health-related tweets by finding topics (or health clusters) instead of computing term frequency and inverse document frequency (TF-IDF) for unlabelled tweets. Originality/value: Topic models play a crucial role in the classification of health-related tweets by finding topics (or health clusters) instead of computing TF-IDF for unlabelled tweets.
Keywords: tweets data clustering; Topic models; TF-IDF; Similarity features; Visual technique; VAT; cVAT; MVCS-VAT
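The multi-viewpoint idea in the abstract above can be sketched in Python as follows. The abstract does not give the paper's exact formula, so this uses one common formulation as an assumption: the similarity of documents i and j is the average, over every other document h, of the cosine between (d_i - d_h) and (d_j - d_h). The function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multi_viewpoint_similarity(docs, i, j):
    """Average cosine between docs[i] and docs[j] as seen from every other document."""
    viewpoints = [h for h in range(len(docs)) if h not in (i, j)]
    if not viewpoints:
        raise ValueError("need at least one document besides i and j")
    total = 0.0
    for h in viewpoints:
        # Shift the origin to the viewpoint document before comparing.
        di = [a - b for a, b in zip(docs[i], docs[h])]
        dj = [a - b for a, b in zip(docs[j], docs[h])]
        total += cosine(di, dj)
    return total / len(viewpoints)


if __name__ == "__main__":
    # Five toy topic-feature vectors, echoing the N = 5 case in the abstract.
    docs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 2.0], [2.0, 2.0]]
    print(multi_viewpoint_similarity(docs, 0, 2))
```

The traditional cosine metric corresponds to a single viewpoint fixed at the origin; averaging over the other documents makes the score depend on where the pair sits relative to the rest of the collection.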
Public Emotional Diffusion over COVID-19 Related Tweets Posted by Major Public Health Agencies in the United States (Cited by: 1)
10
Authors: Haixu Xi, Chengzhi Zhang, Yi Zhao, Sheng He 《Data Intelligence》 EI 2022, No. 1, pp. 66-87 (22 pages)
Since the end of 2019, the worldwide COVID-19 outbreak has not only presented challenges for government agencies in addressing public health emergencies, but also tested their capacity to deal with public opinion on social media and respond to social emergencies. To understand the impact of COVID-19 related tweets posted by the major public health agencies in the United States on public emotion, this paper studied public emotional diffusion in the tweets network, including its process and characteristics, taking the Twitter users of four official public health systems in the United States as an example. We extracted the interactions between tweets in the COVID-19-Tweet Ids data set and drew the tweets diffusion network. We proposed a method to measure the characteristics of the emotional diffusion network, with which we analyzed the changes in public emotional intensity and the proportion of emotional polarity, and investigated the emotional influence of key nodes and users and the emotional diffusion of tweets by tweeting time, tweet topic, and posting agency. The results show that the emotional polarity of tweets changed from negative to positive as pandemic management measures improved. The public’s emotional polarity on pandemic related topics tends to be negative; the emotional intensity of management measures such as pandemic medical services turns from positive to negative to the greatest extent, while the emotional intensity of pandemic related knowledge changes the most. The tweets posted by the Centers for Disease Control and Prevention and the Food and Drug Administration of the United States have a broad impact on public emotions, and the emotional spread of tweets’ polarity eventually forms a very close proportion of opposite emotions.
Keywords: Emotional diffusion; tweets; COVID-19; Pandemic management; US Public Health Agency
Study of language distribution in informal scientific communication from the perspective of scientific tweets (Cited by: 1)
11
Authors: YU Houqiang, DONG Ke, WANG Yuefen, ZHANG Chengzhi 《Journal of Library Science in China》 2018, No. 1, pp. 175-176 (2 pages)
Language is a medium of scientific communication. The language distribution of scientific communication reflects the status of global scientific power. Based on scientific tweets, the study reveals the language distribution in informal scientific communication.
Keywords: study of language distribution; informal scientific communication; the perspective of scientific tweets
Research Topic Identification in Online Information Content Ecosystem Governance and Its Coordination with Netizen Needs
12
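Authors: 王协舟, 李奕扉 《情报科学》 PKU Core 2025, No. 8, pp. 68-77 (10 pages)
[Purpose/Significance] To explore the coordination between research output on the governance of the online information content ecosystem and the needs of netizens, and to offer ideas and references for making that research more applicable to real problems and more topically stimulating. [Method/Process] Using BERTopic topic modeling on representative academic papers from CNKI and representative topics from Sina Weibo, the study identifies overall and stage-specific topic types and their evolution trends, and computes the coordination relationship with a Word2vec model. [Results/Conclusions] Post topics and research topics show a low degree of coordination, a lag in topic coordination, and an imbalance among coordination types. In future, academia should advance research on the protection of personality rights and interests, strengthen research on platform information distribution, and pursue cross-cutting research on hot topics. [Innovation/Limitations] The study produces a fairly rigorous description of the overall distribution, temporal characteristics, and coordination relationships of research topics, and provides a more systematic framework for analyzing coordination with netizen needs. It does not analyze the internal coordination within research topics or within netizen-need topics, nor the evolution among topics of the same type.
Keywords: online information content ecosystem governance; research topic identification; post topic identification; needs coordination; BERTopic topic modeling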
A Comparative Study of Tweet Sentiment Analysis Based on Human and ChatGPT Annotation
13
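Authors: 杨艺, 黄镜月, 贺品尧, 荣婷 《重庆工商大学学报(自然科学版)》 2025, No. 4, pp. 95-101 (7 pages)
Objective: To address the difficulty of annotating data for specific tweet sentiment analysis tasks and the unsatisfactory classification results caused by inaccurate annotation, a machine annotation method is proposed to study how the performance of deep learning models differs when classifying the sentiment of human-annotated and machine-annotated tweet data. Methods: Under a unified label scheme, the tweet data were annotated both manually and through the ChatGPT model API, and a hybrid BERT-TextCNN deep learning model was then used to classify the sentiment of the human-annotated and ChatGPT-annotated datasets. Results: The experiments show that the human-annotated dataset yields higher accuracy and credibility in overall performance. On some tweets, however, the ChatGPT large model, with a knowledge base richer than a human annotator's, can produce interpretable annotations that are more objective and scientific, giving it a certain advantage in sentiment classification results. Human and machine annotation each have strengths and weaknesses, from which it follows that machine annotation is a feasible annotation method for text sentiment classification tasks. Conclusion: In practical applications, the two annotation methods can be flexibly selected and combined according to task requirements, making full use of the strengths of both to achieve better analysis performance.
Keywords: human annotation; ChatGPT annotation; tweets; sentiment analysis; BERT-TextCNN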
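Textual Attribute Features of Scientific Tweets and an Analysis of Their Effect on the Cascade Evolution Trends of Scientific Papers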
14
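Authors: 曹仁猛, 许小可, 王贤文 《情报学报》 PKU Core 2025, No. 10, pp. 1259-1271 (13 pages)
Scientific tweets are an important vehicle for disseminating scientific papers on social media. Revealing how the textual attribute features of scientific tweets affect the dissemination of scientific papers can help science communicators optimize their strategies, widen the reach of scientific information, and promote academic exchange and public engagement. Based on a sample of more than 50,000 papers and 400,000 scientific tweets, this paper analyzes the cascade evolution trends of scientific tweets with different textual attribute features along three dimensions: tweet text content, multimedia information, and emoji. The study finds that using a paper's highlights as the tweet text and adding visual elements such as images, videos, and emoji to the tweet significantly widen the dissemination of scientific papers. This effect appears in the early stage of dissemination and is further strengthened in subsequent dissemination, forming a "the strong get stronger" trend, that is, a cascade Matthew effect. By introducing a computational communication perspective into altmetrics research, this paper helps researchers understand the process and patterns of scientific paper dissemination on social media more deeply and comprehensively, thereby revealing the mechanisms behind it.
Keywords: altmetrics; cascade diffusion; scientific tweets; dissemination effect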
Microblog Hot Topic Discovery Based on an Improved CURE Algorithm (Cited by: 12)
15
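Authors: 杨长春, 周猛, 叶施仁, 徐小松 《计算机仿真》 CSCD PKU Core 2013, No. 11, pp. 383-387 (5 pages)
Because of the large volume of information on microblog platforms, and in order to identify hot posts accurately, this paper proposes an improved algorithm based on the classic CURE clustering algorithm to discover hot microblog topics. A sample dataset of 20,391 Chinese microblog posts was selected. Representing the posts as a sparse matrix reduces the dimensionality of the high-dimensional data, which greatly improves the accuracy and speed of computation. Starting from the selection of representative points in CURE hierarchical clustering, the representative points are converted into a post seed set, the shrink factor is adjusted, and the exclusion of outlier posts is strengthened. An improved CURE algorithm is designed on the ideas of CURE hierarchical clustering to discover hot microblog topics. Experiments show that the improved CURE hierarchical clustering algorithm treats 74.65% of the dataset as outliers, which improves the accuracy of the algorithm, and accurately captures the "head" of the long-tail effect, so that hot microblog topics can be discovered more intuitively.
Keywords: sparse matrix; hot topics; hierarchical clustering algorithm; post seed set; improved hierarchical clustering algorithm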
A Twitter Sentiment Classification Method Using Sentiment Feature Vectors (Cited by: 7)
16
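Authors: 易顺明, 易昊, 周国栋 《小型微型计算机系统》 CSCD PKU Core 2016, No. 11, pp. 2454-2458 (5 pages)
Sentiment analysis of public media content is a basic task in analyzing public sentiment. Classic feature extraction methods based on term-frequency feature vectors mainly use term frequency as the basis for text classification, but term frequency is not closely related to sentiment information. This paper proposes a sentiment classification method for Twitter tweets based on sentiment feature vectors. The method first performs data cleaning, lemmatization, part-of-speech tagging, and word vectorization on the tweets. Next, it matches the words against a sentiment lexicon. Finally, it uses each word's positive and negative sentiment values to generate a sentiment feature vector and trains models with machine learning methods such as MNB and SVM to classify the sentiment of the tweets. The experimental results show that the Twitter sentiment classification method using sentiment feature vectors achieves better classification performance.
Keywords: tweets; sentiment classification; sentiment lexicon; sentiment feature vector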
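The lexicon-matching step described in the abstract above might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the tiny `TOY_LEXICON`, its scores, and the four-element feature layout are invented for the example and are not taken from the paper or from any real sentiment lexicon.

```python
# Hypothetical lexicon: word -> (positive score, negative score).
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "bad": (0.0, 0.625),
    "love": (0.5, 0.0),
}


def sentiment_features(tokens, lexicon):
    """Build [positive sum, negative sum, net score, match rate] for one tweet.

    tokens  : already cleaned and lemmatized words of the tweet
    lexicon : mapping from lowercase word to (positive, negative) scores
    """
    positive = 0.0
    negative = 0.0
    matched = 0
    for token in tokens:
        scores = lexicon.get(token.lower())
        if scores is None:
            continue  # word not covered by the lexicon
        positive += scores[0]
        negative += scores[1]
        matched += 1
    match_rate = matched / len(tokens) if tokens else 0.0
    return [positive, negative, positive - negative, match_rate]


if __name__ == "__main__":
    print(sentiment_features(["Good", "movie", "bad", "ending"], TOY_LEXICON))
```

A vector like this would then be passed to a classifier such as the MNB or SVM models the abstract mentions; the match rate is included because unmatched slang and abbreviations are a known weakness of lexicon lookups on tweets.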
Research on Topic Popularity Prediction for Sina Weibo (Cited by: 7)
17
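Authors: 熊小兵, 周刚, 黄永忠, 马俊 《信息工程大学学报》 2012, No. 4, pp. 496-502 (7 pages)
As a new form of online social network, microblogs have gradually become an important platform for obtaining and sharing information. Taking Sina Weibo, the largest microblog site in China, as its subject, this paper focuses on predicting the popularity of microblog topics. About 40 GB of microblog topic information was collected as the research dataset, from which user attributes and topic content attributes related to topic popularity were extracted. Based on a correlation analysis of these attributes, a quantitative description of topic popularity that takes both user attributes and content attributes into account is proposed. The paper carries out a detailed principal component analysis of the attributes that influence topic popularity, identifies four attributes as the basis for popularity prediction, and builds a linear prediction model of popularity. The model predicts topic popularity well, with an R² of 0.89.
Keywords: microblog; topic popularity; prediction; principal component analysis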
A Visual Analytics Method for the Spatio-Temporal Evolution of Tweet Topics Based on Geotags (Cited by: 3)
18
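Authors: 孙国道, 周志秀, 李思, 刘义鹏, 梁荣华 《计算机科学》 CSCD PKU Core 2019, No. 8, pp. 42-49 (8 pages)
On social media, the content of the tweets users post records a wide range of information related to those users. The text covers the topics contained in the tweets as well as temporal and spatial information, and analyzing the spatio-temporal evolution of topics from this information is of considerable research value. For tweet data, a visual analytics pipeline is designed to mine tweet information and, through user interaction, show the spatio-temporal evolution of tweet topics from multiple perspectives. First, based on part of the historical tweet data, global geographic space is partitioned into regions using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm combined with Thiessen polygons. Then, for a topic of interest queried by the user, all related tweets are retrieved through an index and the information is bound to the cluster centers. Finally, the spatio-temporal evolution of the topic is shown through several purpose-designed visualization views that combine a time-series clustering algorithm with an adaptive algorithm. Experiments and analysis on tweet data collected and stored through the API provided by the official Twitter site show that the improved adaptive layout algorithm for the visualization views effectively solves the problem of overlapping graphics and fully presents the spatio-temporal evolution patterns of tweets, and that the partitioning of geographic regions and the visualization components effectively help researchers analyze the spatio-temporal evolution of tweets and the distribution of hot topics of global concern.
Keywords: tweet topics; visual analytics pipeline; adaptive layout algorithm; clustering; spatio-temporal evolution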
Research on Personalized Information Stream Recommendation for Microblogs (Cited by: 2)
19
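Authors: 闫光辉, 陈勇, 赵红运, 任亚缙 《计算机工程与设计》 CSCD PKU Core 2014, No. 6, pp. 2013-2016, 2036 (5 pages)
To address the problem of recommending personalized microblog content that matches the interests and preferences of microblog users, this paper draws on the idea of collaborative filtering and, based on the TF-IDF model, combines single-term vectors with multi-term vectors to compute the similarity of microblog information streams and estimate user interest. By further analyzing the user cold-start problem and the characteristics of personalization, the method effectively lowers the ranking of irrelevant microblog content and optimizes the ordering of users' microblog feeds. On a Sina Weibo dataset, comparative experiments were carried out against existing microblog recommendation methods based on cosine similarity and tag vectors, and the results demonstrate the effectiveness of the algorithm.
Keywords: microblog recommendation; information retrieval; collaborative filtering; personalization; cold start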
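For readers unfamiliar with the TF-IDF similarity that the abstract above builds on, here is a minimal Python sketch of the baseline idea: weight each term by term frequency times inverse document frequency, then compare posts by cosine similarity. It covers plain single-term TF-IDF only; the paper's combination of single-term and multi-term vectors is not reproduced, and the function names and sample tokens are assumptions for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict term -> weight) per tokenized doc."""
    n_docs = len(docs)
    # Document frequency: in how many docs each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_sparse(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    posts = [
        ["weather", "rain", "today"],
        ["rain", "flood", "warning"],
        ["football", "match", "tonight"],
    ]
    vectors = tfidf_vectors(posts)
    print(cosine_sparse(vectors[0], vectors[1]))  # posts sharing "rain"
    print(cosine_sparse(vectors[0], vectors[2]))  # posts with no shared terms
```

A recommender in this style would rank candidate posts by their similarity to posts the user has already engaged with; terms that appear in every post get weight zero, which is why very common words do not drive the ranking.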
A Matching Algorithm between Twitter Tweets and the SentiWordNet Sentiment Lexicon (Cited by: 2)
20
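Authors: 易顺明, 周洪斌, 周国栋 《南京师范大学学报(工程技术版)》 CAS 2016, No. 3, pp. 41-47, 53 (8 pages)
In research on Twitter sentiment classification, a common approach is to match the words in a tweet against synonym entries in a sentiment lexicon and look up the corresponding sentiment values. However, tweets are written casually and contain many slang terms, abbreviations, and special symbols, so many words cannot be matched to entries in the lexicon, and a low match rate directly degrades sentiment classification performance. Targeting the linguistic characteristics of Twitter, this paper proposes a matching algorithm between Twitter tweets and the sentiment lexicon SentiWordNet. The algorithm first preprocesses tweet content through data cleaning, substitution, part-of-speech tagging, and lemmatization, and adds named entity recognition, word segmentation of hashtag content, negation handling based on Word Clusters, and phrase matching. The experimental results show that the match rate with this method can exceed 90%.
Keywords: tweets; sentiment classification; SentiWordNet; matching algorithm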