Abstract: This paper explores users' influence on microblogs. First, drawing on social network theory, we analyze the structure of the information transmission network formed by the following and followed relationships among microblog users. Informed by an analysis of microblog user behavior, the paper then presents a model for calculating the weights of users' influence. It proposes the U-R model, which evaluates users' influence by combining the PageRank algorithm with an analysis of user behaviors. In the U-R model, the effect of user behaviors is explored, and PageRank is applied to evaluate the importance and influence of every user in a microblog network by repeatedly iterating each user's U-R value. Users in a microblog network can then be ranked by their U-R values. Finally, the validity of the U-R model is demonstrated with a real-life numerical example.
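The abstract above does not give the U-R formula, so the following is only a minimal sketch of the plain PageRank iteration that the U-R model is said to build on. The behavior-based weighting is not reproduced; the function name, the damping factor of 0.85, and the toy follower graph are illustrative assumptions.

```python
def pagerank(follows, damping=0.85, iterations=100):
    """Plain PageRank on a follower graph (not the paper's U-R model).

    follows maps each user to the set of users they follow, so an edge
    u -> v means "u follows v" and passes part of u's score to v.
    """
    users = set(follows)
    for followed in follows.values():
        users.update(followed)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Every user receives the same baseline share on each pass.
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            followed = follows.get(u, set())
            if followed:
                # Split u's damped score evenly among the users u follows.
                share = damping * rank[u] / len(followed)
                for v in followed:
                    new_rank[v] += share
            else:
                # A user who follows nobody spreads their score over everyone.
                share = damping * rank[u] / n
                for v in users:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical toy network: each key follows the users in its set.
follows = {
    "alice": {"carol"},
    "bob": {"carol", "alice"},
    "carol": {"alice"},
    "dave": {"carol"},
}
scores = pagerank(follows)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph `carol` is followed by three users and `alice` by two, so they take the top two places in `ranking`. A behavior-aware variant would replace the even split with weights derived from user activity.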
Abstract: In this paper, we propose to detect a special group of microblog users: the "marionette" users, who are created or employed by backstage "puppeteers", either through programs or manually. Unlike normal users, who access microblogs for information sharing or social communication, marionette users perform specific tasks to earn financial profit. For example, they follow certain users to increase their "statistical popularity", or retweet certain tweets to amplify their "statistical impact". The fabricated follower and retweet counts not only mislead normal users but also seriously impair microblog-based applications, such as hot tweet selection and expert finding. Detecting marionette users is challenging because puppeteers employ complicated strategies to generate marionette users that behave similarly to normal users. To tackle this challenge, we take into account two types of discriminative information: 1) individual user tweeting behavior and 2) the social interactions among users. By integrating both types of information into a semi-supervised probabilistic model, we can effectively distinguish marionette users from normal ones. Applying the proposed model to Sina Weibo, one of the most popular microblog platforms in China, we find that it detects marionette users with an F-measure close to 0.9. In addition, we apply the model to calculate the marionette ratio among the followers of the 200 most followed microbloggers and among the retweeters of the 50 most retweeted posts on Sina Weibo. To speed up detection and reduce the cost of feature generation, we further propose a lightweight model that uses fewer features to identify marionettes among retweeters.
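For readers unfamiliar with the F-measure reported above, here is a small sketch of how it is computed from true and predicted labels. The label vectors are made up for illustration and are not data from the paper; `beta=1.0` gives the usual F1 score.

```python
def f_measure(true_labels, predicted_labels, beta=1.0):
    """F-measure for the positive class (1 = marionette, 0 = normal)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels: 4 real marionettes, of which 3 are detected,
# plus 1 normal user wrongly flagged.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score = f_measure(truth, predicted)
```

Here precision and recall are both 3/4, so `score` is 0.75. An F-measure close to 0.9 means precision and recall are both high at the same time.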
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61172072, the Beijing Natural Science Foundation under Grant No. 4112045, and the Fundamental Research Funds for the Central Universities under Grant No. 2011YJS215.
Abstract: Traditional algorithms for ranking user weight are usually based on a user's in-degree distribution and neglect both the differences among a user's followers and the features of the user's tweets. To analyze the factors that affect user weight, this paper examines data collected from the SINA Microblog network and finds that user influence and activity level are the dominant factors. The proposed algorithm evaluates user influence by the user's number of followers, the influence of those followers, and the reciprocity between users. A user's activity level is modeled by the user's participation and the quality of the user's tweets. The models are tested on different data groups to determine the parameters for the final calculation. Finally, the paper compares the computed results with the user ranking given by the official SINA application. The algorithm shows stronger stability in the fluctuation range of user weight values.
Funding: Supported by the National Natural Science Foundation of China (No. 61272362), the National Basic Research Program of China (973 Program) (No. 2013CB329606), and the High-Tech Development Plan of Xinjiang (No. 201212124).
Abstract: Recent progress in Web 2.0 applications has been accompanied by the rapid development of microblogs in China, which have become one of the most important channels for online communication, especially for sharing information. This paper conducts an in-depth investigation of big data modeling and analysis of the microblog ecosystem in China, using a real dataset containing over 17 million records of Sina Weibo users. First, we present a detailed analysis of the geography, gender, authentication status, education, and age of the microblog users in this dataset. Then we analyze the distributions of numerical features, propose a user influence formula, and calculate the influence of different kinds of microblog users. Finally, we perform a user content intention analysis to reveal what users care about most in their daily lives.
Funding: Supported by the National Natural Science Foundation of China (61272109) and the Natural Science Foundation of Hubei Province of China (2014CFB289).
Abstract: An individual's personal network is a basic object of study in sociology. This article analyzes and compares the personal network sizes of Sina Weibo users, based on over 2 billion tweets gathered from over 1.3 million users in 2012. We propose a new method for measuring user interactions, based on how an individual divides attention across contacts and how a user's characteristics affect those interactions. We find that the balance of attention differs considerably among Weibo users of different ages and genders. It varies in interesting ways both across groups of people and across modes of interaction.
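The abstract does not define its attention measure. As an illustration only, one common way to quantify how evenly a user divides attention across contacts is the normalized Shannon entropy of per-contact interaction counts. This choice of measure, the function name, and the sample counts are assumptions, not the authors' method.

```python
import math


def attention_balance(interaction_counts):
    """Normalized Shannon entropy of per-contact interaction counts.

    Returns a value from 0.0 (all attention on one contact) to 1.0
    (attention spread evenly over all listed contacts).
    """
    total = sum(interaction_counts)
    k = len(interaction_counts)
    if total == 0 or k < 2:
        return 0.0
    entropy = 0.0
    for count in interaction_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return entropy / math.log(k)


# Hypothetical counts of replies sent to each contact by three users.
even_user = attention_balance([5, 5, 5, 5])
skewed_user = attention_balance([8, 1, 1])
focused_user = attention_balance([10, 0, 0])
```

With these counts, `even_user` is 1.0, `focused_user` is 0.0, and `skewed_user` falls in between. Computing such a score separately for each interaction mode and demographic group would allow comparisons of the kind the abstract describes.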
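Abstract: A microblog trust network is constructed from microblog following relationships and social behaviors. By introducing a trust-based random walk model and combining it with the interest similarity between users, a microblog follower recommendation model is established. To improve the coverage of the follower recommendation system, social behaviors between users are incorporated into the trust computation, and Top-N recommendation is implemented. An empirical study is conducted on the KDD Cup 2012 Tencent Weibo data. The experimental results show the following: the recommendation algorithm achieves its best overall performance on the trust network that mixes multiple social behaviors; the recommendation list length has a considerable effect on the results, with the best performance at a length of 40; and, compared with mainstream recommendation algorithms, the improved trust-based random walk recommendation model achieves better results on multiple evaluation metrics, including recommendation precision and coverage. The findings offer a new method for microblog follower recommendation research and a new perspective on social recommendation in microblog networks.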
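The abstract does not specify the random walk model in detail. The sketch below uses a generic random walk with restart over a weighted trust graph, followed by a Top-N selection, as a stand-in for the general idea. It omits the interest-similarity term, and the function names, the restart probability, and the trust weights are all hypothetical.

```python
def random_walk_with_restart(trust, source, restart=0.15, iterations=100):
    """Visit probabilities of a walk that starts (and restarts) at source.

    trust maps each user to a dict {neighbor: trust weight}; the walker
    moves to a neighbor with probability proportional to its weight.
    """
    users = set(trust)
    for neighbors in trust.values():
        users.update(neighbors)
    prob = {u: 0.0 for u in users}
    prob[source] = 1.0
    for _ in range(iterations):
        nxt = {u: 0.0 for u in users}
        nxt[source] += restart  # total probability is 1, so this is the restart mass
        for u, p in prob.items():
            neighbors = trust.get(u, {})
            total = sum(neighbors.values())
            if total > 0:
                for v, w in neighbors.items():
                    nxt[v] += (1.0 - restart) * p * w / total
            else:
                # Dead end: return to the source user.
                nxt[source] += (1.0 - restart) * p
        prob = nxt
    return prob


def recommend(trust, source, n=2):
    """Top-N users to follow, excluding the source and users already trusted."""
    prob = random_walk_with_restart(trust, source)
    known = set(trust.get(source, {})) | {source}
    candidates = [u for u in prob if u not in known]
    return sorted(candidates, key=prob.get, reverse=True)[:n]


# Hypothetical trust network; weights might combine follows, retweets, mentions.
trust = {
    "u1": {"u2": 0.9, "u3": 0.1},
    "u2": {"u4": 1.0},
    "u3": {"u5": 1.0},
    "u4": {"u1": 1.0},
    "u5": {"u1": 1.0},
}
top = recommend(trust, "u1", n=2)
```

In this toy network `u1` trusts `u2` far more than `u3`, so `u4` (trusted by `u2`) is ranked ahead of `u5` (trusted by `u3`). The parameter `n` corresponds to the recommendation list length discussed in the abstract.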