Various stakeholders, such as researchers, government agencies, businesses, and research laboratories, require a large volume of reliable scientific research outcomes, including research articles and patent data, to support their work. These data are crucial for a variety of applications, such as advancing scientific research, conducting business evaluations, and undertaking policy analysis. However, collecting such data is often a time-consuming and laborious task. Consequently, many users turn to openly accessible data for their research. Existing open dataset releases, however, typically suffer from a lack of relationships between different data sources and limited temporal coverage. To address this issue, we present a new open dataset, the Intelligent Innovation Dataset (IIDS), which comprises six interrelated datasets spanning nearly 120 years, encompassing paper information, paper citation relationships, patent details, patent legal statuses, and funding information. The extensive contextual and temporal coverage of the IIDS dataset will provide researchers, practitioners, and policy makers with comprehensive data support, enabling them to conduct in-depth scientific research and comprehensive data analyses.
Dear readers, It is our pleasure to present six articles in Volume 6, Issue 4 of the Journal of Social Computing. To help readers navigate this diverse content, we have organized the papers into three thematic clusters: (1) Theoretical development in information diffusion, particularly the distribution of biases across social categories and cultural contexts. (2) Methodological contributions to contemporary research, including social experiments using multiple LLM agents and computational recommendation methods. (3) Integrated approaches combining machine learning and statistical modeling, using mental health as a case study to explore and test theory.
This essay examines the intricate relationship between large language models (LLMs) and privacy, investigating the ethical and practical issues stemming from cutting-edge artificial intelligence (AI) technologies. The research delves into the evolving understanding of privacy in the digital era, with a specific emphasis on the risks posed by anthropomorphic AI design. The analysis highlights critical privacy concerns: (1) Trust and accountability: The lack of true moral agency in AI systems complicates traditional notions of trust and responsibility; (2) Contextual integrity: Nissenbaum’s contextual integrity framework serves as a tool to explore privacy issues in general and with LLMs in particular; (3) Data collection challenges: LLMs collect extensive user data, often without explicit consent, potentially breaching contextual privacy norms; (4) Anthropomorphism risks: Human-like AI interfaces can foster over-trust, leading users to share sensitive information inappropriately. This article underscores that privacy is a complex, multidimensional concept profoundly shaped by technological, cultural, and social forces. As AI technologies continue to advance, safeguarding privacy will necessitate a nuanced approach that strikes a balance between individual rights, societal needs, and technological progress. We conclude with user-oriented guidelines and future research directions, offering a comprehensive framework for understanding and addressing the privacy implications of LLMs.
Dear readers, Welcome to the first issue of volume 6 of the Journal of Social Computing! We present six articles that highlight the interplay between artificial intelligence, computational modeling, and research resources in addressing both methodological challenges and real-world societal issues. These papers are grouped into three thematic clusters: (1) AI and computational methods for social and economic research, (2) data-driven insights into real-world social challenges, and (3) research resources for scientific and policy advancements.
Dear readers, Welcome to the second issue of the sixth volume of the Journal of Social Computing! This issue presents six interdisciplinary research articles that explore how modern computational tools can help us better understand and make sense of complex systems. These include social behavior, technological development, scientific discovery, and the evolving capabilities of artificial intelligence. To guide readers through this diverse content, the articles are grouped into three thematic clusters: (1) Understanding Human and Machine Behavior, (2) Modeling Social and Legislative Systems, and (3) Extracting Insights from Scientific and Technological Data.
The exponential growth of social media has heightened concerns about digital addiction and its mental health consequences, particularly among younger populations. Existing digital health tools, including conversational agents and large language models, offer real-time support but often neglect the predictive value of structured behavioural data. This study introduces a machine learning framework to assess digital addiction risk using 3200 anonymised self-reports comprising screen time, social media engagement, sleep duration, and mental health indicators. Across multiple models, categorical boosting (CatBoost) achieves the highest performance (precision = 85.4%, receiver operating characteristic-area under the curve (ROC-AUC) = 0.93), outperforming extreme gradient boosting (XGBoost) and graph neural networks (GNNs). A linear regression model provides interpretable correlations between behavioural variables and addiction risk. Structural equation modelling (SEM) reveals that anxiety and depression mediate the relationship between digital behaviours and addiction risk, offering causal insights into these pathways. Feature importance analysis identifies excessive screen time, frequent social media checking, and reduced sleep as the most influential predictors. To translate findings into practice, K-means clustering generates behavioural risk profiles, enabling personalised, data-driven recommendations. While clinical validation remains a next step, this framework demonstrates how predictive modelling and clustering can inform scalable, noninvasive digital health interventions. By integrating machine learning with causal modelling and personalised intervention design, this study advances computational approaches to digital addiction and contributes to the broader discourse on artificial intelligence applications in mental health and social computing.
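A minimal sketch of the kind of gradient-boosted risk classifier described above, using scikit-learn's GradientBoostingClassifier as a stand-in for CatBoost. The behavioural features and risk labels are entirely synthetic inventions for illustration, not the study's 3200 self-reports:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
# Hypothetical behavioural features: daily screen hours, checks per day, sleep hours
screen = rng.uniform(1, 12, n)
checks = rng.poisson(30, n).astype(float)
sleep = rng.uniform(4, 9, n)
X = np.column_stack([screen, checks, sleep])
# Synthetic risk label: more screen time and less sleep raise the risk
logit = 0.5 * screen + 0.05 * checks - 0.8 * sleep + rng.normal(0, 1, n)
y = (logit > np.median(logit)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```

Here `clf.feature_importances_` plays the role of the paper's feature importance analysis; on real data, which predictors dominate is an empirical question, not a given.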
Morally controversial content, such as offensive and hateful images over social media, is especially challenging to categorize, given widespread disagreement in how people interpret and evaluate this content. Numerous studies argue that a range of subjective biases, such as partisan differences in moral reasoning, lead people not only to diverge in their classifications of controversial content, but also to resist any attempts to change their classification judgments via social influence. Yet, recent large-scale analyses of classification patterns over social media suggest that separate populations, such as Democrats and Republicans, can reach surprising levels of agreement in the categorization of inflammatory content like fake news and hate speech, despite considerable differences in their moral reasoning and worldview. This poses a fundamental puzzle: how can populations of diverse individuals who disagree in the interpretation of controversial content nevertheless arrive at highly similar decisions for the classification and removal of such content? Here, we use an online platform to test the hypothesis that structural symmetries in information exchange networks can synchronize convergence on decisions regarding the classification and removal of controversial images across independent networks, leading them to independently reproduce consistent systems of classification. We find that isolated individuals diverge considerably in their classification of controversial content, whereas separate, structurally similar networks independently synchronize in their classifications and content removal decisions, reducing partisan biases across all networks. We also find that, compared to subjects evaluating content individually in the control condition, participants within synchronizing networks reported significantly more positive feelings about their task and experienced significantly less emotional stress when evaluating controversial content.
Understanding influencers’ perspectives and predicting public sentiment are crucial for event assessment and guidance in computational social systems, enabling more informed decision-making. However, this task is inherently challenging due to the unstructured, context-sensitive, and heterogeneous nature of online communication. To address these challenges, we propose a novel intelligent computational framework, Multi-domain Opinion Leader Agents Emotion Prediction (MOAEP). Our framework comprises three key components: (1) An Automatic Question Generation (AQG) module employing “Who, What, Where, When, Why, and How” (5W1H) questioning to systematically explore topic dimensions; (2) A Multi-domain Opinion Leader Agents (MOA) module that integrates enhanced Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to produce domain-specific responses; and (3) An emotion prediction engine that synthesizes agent interactions to forecast collective emotional responses, enabling proactive social computing analysis that surpasses conventional post-event methods. Experimental results demonstrate the framework’s efficacy: the AQG module generates high-fidelity outputs, while the influencer agents maintain consistent performance, achieving an average “Generative Pre-trained Transformer 4” (GPT-4) evaluation score of 6.85 (on a 0-10 scale) across multiple dimensions. In a social media conflict case study, the “Russia-Ukraine War”, our framework successfully predicts key influencers’ perspectives and aligns emotional forecasts with observed real-world sentiment trends. These findings underscore the potential of MOAEP to provide actionable insights for decision-making in computational social science.
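The AQG module's 5W1H questioning can be illustrated with a toy template expansion. The templates below are illustrative assumptions, not the framework's actual prompts:

```python
# Minimal sketch of 5W1H-style automatic question generation (AQG).
# These templates are invented for illustration, not MOAEP's real prompts.
W5H1 = {
    "Who":   "Who are the key actors involved in {topic}?",
    "What":  "What events define {topic}?",
    "Where": "Where is {topic} unfolding?",
    "When":  "When did {topic} begin and how has it evolved?",
    "Why":   "Why did {topic} occur?",
    "How":   "How is {topic} likely to develop?",
}

def generate_questions(topic: str) -> list[str]:
    """Expand each 5W1H template for a given topic string."""
    return [tpl.format(topic=topic) for tpl in W5H1.values()]

questions = generate_questions("the Russia-Ukraine War")
```

In the full framework, each question would be answered by a domain-specific LLM agent with retrieval; the template stage itself is this simple.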
This research introduced a predictive modeling framework to analyze the temporal dynamics of social media discourse during three global movements: the Mahsa Amini Protests (2022), South African Unrest (2021), and the Black Lives Matter Movement (2020). By utilizing Twitter data, the study developed a Protest Social Media Archetype to capture the evolution of tweet activity during key protest periods, identifying common patterns and notable variations in public engagement. Techniques such as LOESS regression and correlation analysis were used to model fluctuations in online activity and assess the impact of social media on public mobilization during socio-political unrest. The findings revealed distinct temporal signatures for each movement, showing similarities in initial engagement and differences in sustained activity across various contexts. These insights underscore the role of digital platforms in protest organization and global solidarity, offering a framework for future studies on digital activism and protest dynamics.
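A LOESS-style smoother of the sort used to model fluctuations in tweet activity can be sketched directly in NumPy. The tricube-weighted local linear fit below and the synthetic decaying tweet counts are illustrative, not the study's data or exact implementation:

```python
import numpy as np

def loess(x, y, frac=0.4):
    """Locally weighted linear regression (LOESS-style smoother).

    For each point, fit a weighted least-squares line over the nearest
    frac * n neighbours, with tricube weights that fade with distance.
    """
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))
    y_hat = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]                 # k nearest neighbours
        h = max(float(d[idx].max()), 1e-12)     # local bandwidth
        w = (1.0 - (d[idx] / h) ** 3) ** 3      # tricube kernel
        sw = np.sqrt(w)
        A = np.column_stack([np.ones(k), x[idx]])
        beta, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y[idx], rcond=None)
        y_hat[i] = beta[0] + beta[1] * x[i]
    return y_hat

# Hypothetical daily tweet counts: a protest spike decaying over 30 days
t = np.arange(30, dtype=float)
rng = np.random.default_rng(1)
counts = 1000.0 * np.exp(-t / 8.0) + rng.normal(0, 30, 30)
smooth = loess(t, counts)
```

The `frac` parameter controls how local the fit is; smaller values track short bursts of engagement, larger values recover the movement-level trend.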
Defining poverty based on relative income, intended to identify individuals who are significantly worse off than the mainstream living standard, is widely adopted by more developed countries and regions, such as those in the Organization for Economic Co-operation and Development (OECD). There are, however, different ways income is counted: sometimes only earnings (e.g., from employment), sometimes also government transfers (e.g., social welfare distributions), and sometimes income generated from savings and investments as well. While some simpler form of income may be used for calculating relative poverty for ease of measurement (or other practical considerations), the intention of the relative poverty definition should be based on full income from all sources (including assets). This paper studies a method for evaluating the inaccuracy caused by using a simpler (and easier to measure) income distribution and understanding where the inaccuracy comes from. We test our method on a 2000-household dataset from the Hong Kong Special Administrative Region to evaluate the relative poverty approach once adopted there. We also recommend practical alternatives: focusing on economically active households only, or using disposable income instead of market income. We show how much such alternatives can improve accuracy and explain why.
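A common operationalization of relative poverty is the share of households below 50% of median income. The sketch below contrasts a market-income measure with a fuller disposable-income measure; the income distributions are synthetic inventions for illustration, not the Hong Kong dataset:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
# Hypothetical monthly household incomes (synthetic, not the Hong Kong data)
market = rng.lognormal(mean=9.6, sigma=0.7, size=n)   # earnings only
transfers = rng.uniform(0, 3000, size=n)              # government transfers
disposable = market + transfers                       # fuller income measure

def poverty_rate(income, line_frac=0.5):
    """Share of households below line_frac * median income (relative poverty)."""
    line = line_frac * np.median(income)
    return float(np.mean(income < line))

rate_market = poverty_rate(market)
rate_disposable = poverty_rate(disposable)
```

Comparing the two rates makes the paper's point concrete: the income concept sets both the poverty line (via the median) and the ranking of households, so switching from market to disposable income changes the measured incidence of poverty, not just individual classifications.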
Video classification typically requires large labeled datasets, which are costly and time-consuming to obtain. This paper proposes a novel Active Learning (AL) framework to improve video classification performance while minimizing the human annotation effort. Unlike passive learning methods that randomly select samples for labeling, our approach actively identifies the most informative unlabeled instances to be annotated. Specifically, we develop batch mode AL techniques that select useful videos based on uncertainty and diversity sampling. The algorithm then extracts a diverse set of representative keyframes from the queried videos. Human annotators only need to label these keyframes instead of watching the full videos. We implement this approach by leveraging recent advances in deep neural networks for visual feature extraction and sequence modeling. Our experiments on benchmark datasets demonstrate that our method achieves significant improvements in video classification accuracy with less training data. This enables more efficient video dataset construction and could make large-scale video annotation more feasible. Our AL framework minimizes the human effort needed to train accurate video classifiers.
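One generic way to realize batch-mode uncertainty-plus-diversity selection is predictive entropy followed by greedy farthest-point selection. This is a sketch under assumed inputs (classifier probabilities and video feature vectors, both random stand-ins here), not the paper's exact algorithm:

```python
import numpy as np

def select_batch(probs, feats, batch_size=5, pool_factor=4):
    """Batch-mode AL query: entropy for uncertainty, then greedy
    farthest-point selection for diversity within the uncertain pool."""
    # 1) Uncertainty: predictive entropy per unlabeled sample
    ent = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    pool = np.argsort(ent)[::-1][: batch_size * pool_factor]
    # 2) Diversity: greedily add the pool sample farthest from those chosen
    chosen = [int(pool[0])]
    while len(chosen) < batch_size:
        dists = np.linalg.norm(
            feats[pool][:, None, :] - feats[chosen][None, :, :], axis=-1
        ).min(axis=1)
        chosen.append(int(pool[int(np.argmax(dists))]))
    return chosen

rng = np.random.default_rng(3)
probs = rng.dirichlet(np.ones(4), size=100)   # stand-in classifier outputs
feats = rng.normal(size=(100, 16))            # stand-in video features
batch = select_batch(probs, feats)
```

Restricting diversity selection to a pool of the most uncertain samples keeps both criteria active: every queried video is informative, and no two queried videos are redundant.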
Dear Readers, We present six articles in this issue that display a wide range of methods used in social computing. These six articles can be classified into three categories: (1) theoretical perspectives and formal models, (2) simulation modeling, and (3) computational models. A theoretical perspective offers deep reflection on social phenomena.
The problem of Point-Of-Interest (POI) recommendation, based on the user’s historical check-in records, determines whether a user checks in at a specific POI. However, the user-POI data exhibit a long-tail distribution phenomenon. To mitigate the sparsity of check-in data, it is a good idea to exploit the rich attributes of POIs and recommend POIs both geography-wise and category-wise. Generally, this problem is treated as two specific tasks with feature combination, ignoring cross-task dependencies and feature disentanglement. To address the aforementioned problems, this paper proposes a novel joint framework named InteractPOI, enabling two-stage interaction between geography-wise and category-wise POI recommendations. Specifically, this paper comprehensively considers the sequence effect and the neighbor effect from both the geography and category perspectives. For the first-stage interaction, we design a disentangled graph embedding model to distinguish different geography-wise and category-wise influencing factors. For the second-stage interaction, we integrate a gating mechanism for feature fusion with a complementary algorithm for interactive optimization. Extensive experiments on two datasets demonstrate the superiority of the proposed model.
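A gating mechanism for fusing geography-wise and category-wise features can take the general form of an element-wise sigmoid gate. This is an assumption about the typical shape of such a module, not InteractPOI's exact design, and the weights are random stand-ins for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 8
h_geo = rng.normal(size=d)   # geography-wise POI feature (stand-in)
h_cat = rng.normal(size=d)   # category-wise POI feature (stand-in)

# Random stand-ins for learned gate parameters
W = rng.normal(scale=0.1, size=(d, 2 * d))
b = np.zeros(d)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Element-wise gate decides how much each dimension takes from each view
g = sigmoid(W @ np.concatenate([h_geo, h_cat]) + b)
fused = g * h_geo + (1.0 - g) * h_cat
```

Because `g` lies strictly between 0 and 1 per dimension, the fused vector interpolates between the two views rather than simply concatenating them, letting each feature dimension lean on whichever view is more informative.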
Many panel data decision problems in real life exhibit obvious structural similarities and lag effects among decision objects or indicators, which are difficult to handle effectively with traditional panel data analysis methods. To deal with these problems, considering the structural characteristics of panel data and the lag effect, we combine grey incidence analysis with panel data across multiple structural dimensions, such as scale volume, development trend, and volatility, to establish an indicator-type grey structural incidence analysis model, and we use it to analyze and identify factors influencing the technological innovation of industrial enterprises. The results show that the proposed method fully considers the structural characteristics of panel data and the lag effect; it can handle panel data decision problems and provides new methodological support for grey incidence analysis.
In this paper, we propose and implement a systematic pipeline for the automatic classification of AI-related documents extracted from large-scale literature databases. This process results in the creation of an AI-related literature dataset named DeepDiveAI. The dataset construction pipeline integrates expert knowledge with the capabilities of advanced models, structured into two primary stages. In the first stage, expert-curated classification datasets are used to train a Long Short-Term Memory (LSTM) model, which performs coarse-grained classification of AI-related records from large-scale datasets. In the second stage, a large language model, specifically Qwen2.5 Plus, is employed to annotate a random 10% of the initially coarse set of classified AI-related records. These annotated records are subsequently used to train a Bidirectional Encoder Representations from Transformers (BERT) based binary classifier, further refining the coarse set to produce the final DeepDiveAI dataset. Evaluation results indicate that the proposed pipeline achieves both accuracy and efficiency in identifying AI-related literature from large-scale datasets.
The paper critically evaluates the global discourses on algorithmic fairness, reviews key Western literature on artificial intelligence (AI) fairness, identifies twelve documented cases of algorithmic discrimination in Western contexts, and extends them for their analytical relevance to non-Western socio-political environments. The study applies these frameworks in particular to the Indian context and proposes that India’s entrenched socio-cultural structures—caste, religion, language, regional identity, and minority status—tend to misalign with Western paradigms of fairness. The study identifies identity-specific factors unique to India that are likely to contribute to algorithmic oppression if left unaddressed. A critical policy analysis of the major documents shaping India’s AI and digital governance landscape reveals that these critical factors remain largely unacknowledged. Indian policy responses tend to replicate Western techno-legal models without engaging indigenous socio-structural realities. The paper concludes that ethical AI governance in India must transcend imported normative models and instead be rooted in context-sensitive approaches that accommodate the nation’s distinct social fabric, in order to prevent algorithm-induced structural discrimination and ensure inclusive algorithmic justice.
Common legislative prediction methods often emphasize bill content or social relationships. This paper, motivated by the insight that similar policy texts reflect comparable political ideologies and can lead to similar voting outcomes, proposes a deep learning method that exploits attention mechanisms to incorporate semantic similarity between bills into legislative prediction models. Our approach uses attention scores to identify bills that are most similar to the one being predicted, and combines the encoded features of these similar bills as additional auxiliary information. By integrating these related features, the model goes beyond the semantic information of individual bills, leading to a more comprehensive use of roll-call data. Empirical results show that utilizing bill similarity along with traditional social relationships, voter characteristics, and bill content significantly improves performance in terms of accuracy, recall, precision, and F1 score compared to models that ignore bill similarity. The results also confirm that legislators tend to maintain consistent views or voting patterns on bills that are similar in nature. In addition, we demonstrate that the attention mechanism is more effective than conventional similarity measures, such as cosine similarity and Euclidean distance, in capturing the similarities between bills.
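Using attention scores to weight similar bills can be sketched as scaled dot-product attention over bill encodings. The dimensions and random encodings below are illustrative assumptions, not the paper's trained representations:

```python
import numpy as np

def attention_context(query, keys, values):
    """Scaled dot-product attention: score stored bills against the target
    bill, softmax the scores, and mix the stored bills' features."""
    scale = np.sqrt(query.shape[-1])
    scores = keys @ query / scale                 # one score per stored bill
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights /= weights.sum()
    return weights, weights @ values              # auxiliary context vector

rng = np.random.default_rng(5)
q = rng.normal(size=32)          # encoding of the bill being predicted (stand-in)
K = rng.normal(size=(10, 32))    # encodings of 10 previously seen bills
V = rng.normal(size=(10, 32))    # features aggregated as auxiliary input
w, aux = attention_context(q, K, V)
```

Unlike a fixed cosine or Euclidean measure, the key and value projections here would be learned end-to-end with the prediction task, which is the intuition behind the paper's finding that attention captures bill similarity more effectively.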
In the global venture capital (VC) landscape, cross-community collaboration is vital for foreign VC firms, especially in markets like China, where the business environment and the guanxi culture present unique challenges. Using co-investment data from 2000 to 2014, this study identifies seven communities through a semi-supervised detection method, categorizing them by the predominance of domestic or foreign VCs. Cross-community collaboration refers to partnerships between VC firms from different communities, involving at least one domestic and one foreign VC. Logistic regression analysis reveals that industry distance does not significantly impact cross-community collaboration. However, industry hotness and local knowledge positively moderate this relationship. In the Chinese context, signaling theory suggests that cross-community collaborations between foreign and domestic VCs act as a signal of credibility. Guanxi, characterized by trust and reciprocity, encourages foreign VCs to foster long-term relationships with domestic counterparts, helping them bridge industrial and cultural gaps. Additionally, industry hotness and local experience reduce investment risk and uncertainty, leading foreign VCs to engage more frequently in cross-community collaborations that link domestic and foreign ecosystems. This study integrates signaling theory with guanxi in the cross-community VC context, emphasizing the strategic role of syndication as a signal in emerging markets like China.
The human-centric visual analysis field thrives on rich video datasets that explore human behaviours and interactions. Yet, a gap persists in datasets covering both human pose estimation and parsing challenges. In this study, a notable effort has been made to develop a dedicated dataset named “Single Person Video-in-Person (SP-VIP)” to suit this research scenario, resolving the lack of a universal dataset supporting three major human-centric visual analysis methods. The SP-VIP dataset was derived by extracting videos from the VIP dataset, which was initially designed exclusively for parsing-related tasks. The VIP dataset did not encompass provisions for pose estimation or human activity recognition, both crucial elements of human-centric visual analysis. To bridge this gap, the SP-VIP dataset was meticulously curated with a specific focus on single-person activities. Videos in the newly created dataset are split into frames with semantic labels and joint values for each frame. To assess the performance of the tailored dataset, a novel architecture, the Single-person Parsing and Pose Network (SPPNet), was employed, using a deep ConvNet for parsing while simultaneously performing pose estimation with the stacked hourglass method. To demonstrate the effectiveness of the newly created dataset, extensive experiments were performed on this architecture, which produced favourable results with a pixel accuracy of 88.50%, a mean accuracy of 60.50%, and a mean Intersection over Union (IoU) of 49.30%, signifying enhanced performance.
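The reported metrics (pixel accuracy, mean per-class accuracy, mean IoU) are standard confusion-matrix quantities for semantic parsing. A small self-contained sketch on a toy label map, not SP-VIP data:

```python
import numpy as np

def parsing_metrics(pred, gt, num_classes):
    """Pixel accuracy, mean per-class accuracy, and mean IoU from label maps."""
    pred, gt = pred.ravel(), gt.ravel()
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt, pred), 1)                 # rows: ground truth, cols: prediction
    pixel_acc = np.diag(conf).sum() / conf.sum()
    per_class = np.diag(conf) / np.maximum(conf.sum(axis=1), 1)
    union = conf.sum(axis=1) + conf.sum(axis=0) - np.diag(conf)
    iou = np.diag(conf) / np.maximum(union, 1)
    return pixel_acc, per_class.mean(), iou.mean()

# Toy 4x4 label maps with 3 semantic classes (illustrative only)
gt = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 1, 1], [2, 2, 2, 2]])
pred = gt.copy()
pred[0, 0] = 2                                     # one mislabelled pixel
pa, ma, miou = parsing_metrics(pred, gt, 3)
```

Note why the three numbers can diverge, as in the paper's 88.50% / 60.50% / 49.30%: pixel accuracy is dominated by large classes, mean accuracy weights classes equally, and IoU additionally penalizes false positives.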
The rise of online social platforms has enhanced connectivity and access to information. Still, it has also enabled the proliferation of malicious social bots that threaten platform security and disrupt social order. In this paper, we introduce a unified framework for defining and classifying malicious social bots along three dimensions: behavior, interaction, and operation. We then present a comprehensive review of social bot detection methods, tracing their evolution from traditional machine learning techniques to deep learning architectures and graph neural networks, with particular emphasis on recent advances in group-level detection. We also explore the emerging paradigm of Large Language Model (LLM) based bot detection. This paper reviews the current state of research, identifies key challenges, and outlines future directions. It provides a cohesive foundation for building more robust detection frameworks to counter the evolving threats posed by malicious social bots.
文摘Various stakeholders,such as researchers,government agencies,businesses,and research laboratories require a large volume of reliable scientific research outcomes including research articles and patent data to support their work.These data are crucial for a variety of application,such as advancing scientific research,conducting business evaluations,and undertaking policy analysis.However,collecting such data is often a time-consuming and laborious task.Consequently,many users turn to using openly accessible data for their research.However,these existing open dataset releases typically suffer from lack of relationship between different data sources and a limited temporal coverage.To address this issue,we present a new open dataset,the Intelligent Innovation Dataset(IIDS),which comprises six interrelated datasets spanning nearly 120 years,encompassing paper information,paper citation relationships,patent details,patent legal statuses,and funding information.The extensive contextual and extensive temporal coverage of the IIDS dataset will provide researchers and practitioners and policy maker with comprehensive data support,enabling them to conduct in-depth scientific research and comprehensive data analyses.
文摘Dear readers,It is our pleasure to present six articles in Volume 6,Issue 4 of the Journal of Social Computing.To help readers navigate this diverse content,we have organized the papers into three thematic clusters:(1)Theoretical development in information diffusion,particularly the distribution of biases across social categories and cultural contexts.(2)Methodological contributions to contemporary research,including social experiments using multiple LLM agents and computational recommendation methods.(3)Integrated approaches combining machine learning and statistical modeling,using mental health as a case study to explore and test theory.
文摘This essay examines the intricate relationship between large language models(LLMs)and privacy,investigating the ethical and practical issues stemming from cutting-edge artificial intelligence(AI)technologies.The research delves into the evolving understanding of privacy in the digital era,with a specific emphasis on the risks posed by anthropomorphic AI design.The analysis highlights critical privacy concerns:(1)Trust and accountability:The lack of true moral agency in AI systems complicates traditional notions of trust and responsibility;(2)Nissenbaum’s Contextual Integrity Framework as a tool to explore privacy issues in general and with LLM;(3)Data collection challenges:LLMs collect extensive user data,often without explicit consent,potentially breaching contextual privacy norms;(4)Anthropomorphism risks:Human-like AI interfaces can foster over-trust,leading users to share sensitive information inappropriately.This article underscores that privacy is a complex,multidimensional concept profoundly shaped by technological,cultural,and social forces.As AI technologies continue to advance,safeguarding privacy will necessitate a nuanced approach that strikes a balance between individual rights,societal needs,and technological progress.We conclude with useroriented guidelines and future research directions,offering a comprehensive framework for understanding and addressing the privacy implications of LLMs.
文摘Dear readers,Welcome to the first issue of volume 6 of the Journal of Social Computing!We present six articles that highlight the interplay between artificial intelligence,computational modeling,and research resources in addressing both methodological challenges and real-world societal issues.These papers are grouped into three thematic clusters:(1)AI and computational methods for social and economic research,(2)data-driven insights into real-world social challenges,and(3)research resources for scientific and policy advancements.
Abstract: Dear readers, welcome to the second issue of the sixth volume of the Journal of Social Computing! This issue presents six interdisciplinary research articles that explore how modern computational tools can help us better understand and make sense of complex systems, including social behavior, technological development, scientific discovery, and the evolving capabilities of artificial intelligence. To guide readers through this diverse content, the articles are grouped into three thematic clusters: (1) Understanding Human and Machine Behavior, (2) Modeling Social and Legislative Systems, and (3) Extracting Insights from Scientific and Technological Data.
Abstract: The exponential growth of social media has heightened concerns about digital addiction and its mental health consequences, particularly among younger populations. Existing digital health tools, including conversational agents and large language models, offer real-time support but often neglect the predictive value of structured behavioural data. This study introduces a machine learning framework to assess digital addiction risk using 3200 anonymised self-reports comprising screen time, social media engagement, sleep duration, and mental health indicators. Across multiple models, categorical boosting (CatBoost) achieves the highest performance (precision = 85.4%, receiver operating characteristic-area under the curve (ROC-AUC) = 0.93), outperforming extreme gradient boosting (XGBoost) and graph neural networks (GNNs). A linear regression model provides interpretable correlations between behavioural variables and addiction risk. Structural equation modelling (SEM) reveals that anxiety and depression mediate the relationship between digital behaviours and addiction risk, offering causal insights into these pathways. Feature importance analysis identifies excessive screen time, frequent social media checking, and reduced sleep as the most influential predictors. To translate findings into practice, K-means clustering generated behavioural risk profiles, enabling personalised, data-driven recommendations. While clinical validation remains a next step, this framework demonstrates how predictive modelling and clustering can inform scalable, non-invasive digital health interventions. By integrating machine learning with causal modelling and personalised intervention design, this study advances computational approaches to digital addiction and contributes to the broader discourse on artificial intelligence applications in mental health and social computing.
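CatBoost itself is a third-party library, but the ROC-AUC metric reported above has a simple rank-based definition that can be sketched with the standard library alone. The labels and scores below are hypothetical, not from the study's data:

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC: the probability that a randomly chosen
    positive example is scored above a randomly chosen negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores for six users (1 = at-risk, 0 = not at-risk)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(round(roc_auc(labels, scores), 2))  # → 0.89
```

A perfect ranking of positives above negatives yields 1.0; random scoring yields about 0.5, which is why the study's 0.93 indicates strong discrimination.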
Funding: Supported by the Content Moderation Research Award granted by Facebook.
Abstract: Morally controversial content, such as offensive and hateful images over social media, is especially challenging to categorize, given widespread disagreement in how people interpret and evaluate this content. Numerous studies argue that a range of subjective biases, such as partisan differences in moral reasoning, lead people not only to diverge in their classifications of controversial content, but also to resist any attempts to change their classification judgments via social influence. Yet, recent large-scale analyses of classification patterns over social media suggest that separate populations, such as Democrats and Republicans, can reach surprising levels of agreement in the categorization of inflammatory content like fake news and hate speech, despite considerable differences in their moral reasoning and worldview. This poses a fundamental puzzle: how can populations of diverse individuals who disagree in the interpretation of controversial content nevertheless arrive at highly similar decisions for the classification and removal of such content? Here, we use an online platform to test the hypothesis that structural symmetries in information exchange networks can synchronize convergence on decisions regarding the classification and removal of controversial images across independent networks, leading them to independently reproduce consistent systems of classification. We find that isolated individuals diverge considerably in their classification of controversial content, whereas separate, structurally similar networks independently synchronize in their classifications and content removal decisions, reducing partisan biases across all networks. We also find that, compared to subjects evaluating content individually in the control condition, participants within synchronizing networks reported significantly more positive feelings about their task and experienced significantly less emotional stress when evaluating controversial content.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62301510, 62271455, and 72474198) and by the Public Computing Cloud, CUC.
Abstract: Understanding influencers' perspectives and predicting public sentiment are crucial for event assessment and guidance in computational social systems, enabling more informed decision-making. However, this task is inherently challenging due to the unstructured, context-sensitive, and heterogeneous nature of online communication. To address these challenges, we propose a novel intelligent computational framework, Multi-domain Opinion Leader Agents Emotion Prediction (MOAEP). Our framework comprises three key components: (1) an Automatic Question Generation (AQG) module employing "Who, What, Where, When, Why, and How" (5W1H) questioning to systematically explore topic dimensions; (2) a Multi-domain Opinion Leader Agents (MOA) module that integrates enhanced Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to produce domain-specific responses; and (3) an emotion prediction engine that synthesizes agent interactions to forecast collective emotional responses, enabling proactive social computing analysis that surpasses conventional post-event methods. Experimental results demonstrate the framework's efficacy: the AQG module generates high-fidelity outputs, while the influencer agents maintain consistent performance, achieving an average Generative Pre-trained Transformer 4 (GPT-4) evaluation score of 6.85 (on a 0-10 scale) across multiple dimensions. In a social media conflict case study, the Russia-Ukraine War, our framework successfully predicts key influencers' perspectives and aligns emotional forecasts with observed real-world sentiment trends. These findings underscore the potential of MOAEP to provide actionable insights for decision-making in computational social science.
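The actual AQG module uses LLMs; as a minimal sketch of the 5W1H expansion step it describes, the templates below are illustrative stand-ins, not the framework's real prompts:

```python
# Illustrative 5W1H templates (hypothetical; the real AQG module
# generates questions with an LLM rather than fixed templates).
TEMPLATES = {
    "Who":   "Who are the key actors involved in {topic}?",
    "What":  "What has happened so far in {topic}?",
    "Where": "Where is {topic} taking place?",
    "When":  "When did the key events of {topic} occur?",
    "Why":   "Why did {topic} unfold this way?",
    "How":   "How is {topic} likely to develop?",
}

def generate_questions(topic):
    """Expand a topic into one probing question per 5W1H dimension."""
    return [t.format(topic=topic) for t in TEMPLATES.values()]

for q in generate_questions("the Russia-Ukraine War"):
    print(q)
```

Each generated question would then be routed to the domain-specific opinion leader agents for answering.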
Abstract: This research introduces a predictive modeling framework to analyze the temporal dynamics of social media discourse during three global movements: the Mahsa Amini Protests (2022), the South African Unrest (2021), and the Black Lives Matter Movement (2020). Utilizing Twitter data, the study developed a Protest Social Media Archetype to capture the evolution of tweet activity during key protest periods, identifying common patterns and notable variations in public engagement. Techniques such as LOESS regression and correlation analysis were used to model fluctuations in online activity and assess the impact of social media on public mobilization during socio-political unrest. The findings revealed distinct temporal signatures for each movement, showing similarities in initial engagement and differences in sustained activity across various contexts. These insights underscore the role of digital platforms in protest organization and global solidarity, offering a framework for future studies on digital activism and protest dynamics.
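LOESS smoothing, the core technique named above, can be sketched in a few lines: fit a weighted linear regression around each evaluation point, with a tricube kernel over the nearest fraction of the data. The daily tweet counts below are hypothetical, not from the study:

```python
def loess(xs, ys, x0, frac=0.5):
    """Locally weighted linear regression (LOESS) at a single point x0,
    using a tricube kernel over the nearest `frac` of the data."""
    n = len(xs)
    k = max(2, int(frac * n))
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12            # bandwidth: k-th nearest distance
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    # Weighted least-squares fit of y = a + b*x
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    num = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    den = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs)) or 1e-12
    b = num / den
    a = my - b * mx
    return a + b * x0

# Hypothetical daily tweet counts around a protest peak: smooth day 3
days   = [0, 1, 2, 3, 4, 5, 6]
tweets = [10, 40, 90, 160, 150, 80, 30]
print(round(loess(days, tweets, 3, frac=0.6), 1))
```

Evaluating this at every day produces the smoothed engagement curve whose shape the archetype compares across movements.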
Funding: Supported by Oxfam Hong Kong (No. C2202-P2), with additional matching grant support (No. ISG220208) from Saint Francis University.
Abstract: Defining poverty based on relative income, with the intent of identifying individuals who are significantly worse off than the mainstream living standard, is widely adopted by more developed countries and regions, such as those in the Organization for Economic Co-operation and Development (OECD). There are, however, different ways income is counted: sometimes only earnings (e.g., from employment), sometimes also government transfers (e.g., social welfare distributions), and sometimes income generated from savings and investments as well. While a simpler form of income may be used for calculating relative poverty for ease of measurement (or other practical considerations), the intention of the relative poverty definition should be based on full income from all sources (including assets). This paper studies a method for evaluating the inaccuracy caused by using a simpler (and easier to measure) income distribution and understanding where the inaccuracy comes from. We test our method on a 2000-household dataset from the Hong Kong Special Administrative Region to evaluate the relative poverty approach once adopted there. We also recommend practical alternatives: focusing on economically active households only, or using disposable income instead of market income. We show how much such alternatives can improve accuracy and explain why. 
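The core measurement problem can be made concrete with a toy example. The sketch below (all household incomes are hypothetical) sets a relative poverty line at 50% of the median, a common convention, and shows how the set of households classified as poor shifts when the simpler market-income distribution is used in place of full income:

```python
def poverty_line(incomes, ratio=0.5):
    """Relative poverty line: `ratio` times the median income."""
    s = sorted(incomes)
    n = len(s)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    return ratio * median

def poor_set(incomes, line):
    """Indices of households falling below the line."""
    return {i for i, inc in enumerate(incomes) if inc < line}

# Hypothetical six-household sample: market income (earnings only)
# vs. full income (earnings + transfers + asset income)
market = [0, 5000, 9000, 15000, 20000, 30000]
full   = [4000, 8000, 11000, 16000, 22000, 34000]

poor_market = poor_set(market, poverty_line(market))
poor_full   = poor_set(full, poverty_line(full))
misclassified = poor_market ^ poor_full  # households the simpler measure gets wrong
print(sorted(misclassified))  # → [1]
```

Here household 1 is poor under market income but not under full income, because transfers lift it above the (also shifted) line; the paper's method quantifies exactly this kind of divergence at scale.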
Abstract: Video classification typically requires large labeled datasets, which are costly and time-consuming to obtain. This paper proposes a novel Active Learning (AL) framework to improve video classification performance while minimizing the human annotation effort. Unlike passive learning methods that randomly select samples for labeling, our approach actively identifies the most informative unlabeled instances to be annotated. Specifically, we develop batch-mode AL techniques that select useful videos based on uncertainty and diversity sampling. The algorithm then extracts a diverse set of representative keyframes from the queried videos. Human annotators only need to label these keyframes instead of watching the full videos. We implement this approach by leveraging recent advances in deep neural networks for visual feature extraction and sequence modeling. Our experiments on benchmark datasets demonstrate that our method achieves significant improvements in video classification accuracy with less training data. This enables more efficient video dataset construction and could make large-scale video annotation more feasible. Our AL framework minimizes the human effort needed to train accurate video classifiers.
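The batch-mode selection step combining uncertainty and diversity can be sketched as follows; this is a generic illustration of the idea, not the paper's exact algorithm, and the softmax outputs, feature vectors, and the 0.5 distance threshold are all hypothetical:

```python
import math

def entropy(probs):
    """Prediction entropy: higher means the classifier is less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_batch(probs_per_video, features, k):
    """Batch-mode AL sketch: rank unlabeled videos by uncertainty
    (entropy), then greedily keep only mutually diverse picks."""
    ranked = sorted(range(len(probs_per_video)),
                    key=lambda i: entropy(probs_per_video[i]),
                    reverse=True)
    batch = []
    for i in ranked:
        # Diversity: skip a video whose features are too close (L1
        # distance <= 0.5, a hypothetical threshold) to a chosen one
        if all(sum(abs(a - b) for a, b in zip(features[i], features[j])) > 0.5
               for j in batch):
            batch.append(i)
        if len(batch) == k:
            break
    return batch

# Hypothetical softmax outputs and 2-D feature embeddings for 4 videos
probs = [[0.9, 0.1], [0.5, 0.5], [0.55, 0.45], [0.2, 0.8]]
feats = [[0.0, 0.0], [1.0, 1.0], [1.1, 1.0], [3.0, 3.0]]
print(select_batch(probs, feats, 2))  # → [1, 3]
```

Video 2 is nearly as uncertain as video 1 but is skipped as redundant (its features almost coincide), so the annotation budget goes to the more diverse video 3 instead.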
Abstract: Dear readers, we present six articles in this issue that display a wide range of methods used in social computing. These six articles can be classified into three categories: (1) theoretical perspectives and formal models, (2) simulation modeling, and (3) computational models. A theoretical perspective offers deep reflection on social phenomena.
Funding: Funded by the National Natural Science Foundation of China (Nos. 62172090 and 62202209), the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China (No. 93K-9-2024-04), and the Jiangsu Province Higher Education Basic Science (Natural Science) Foundation (No. 24KJB520014).
Abstract: The problem of Point-Of-Interest (POI) recommendation, based on a user's historical check-in records, determines whether the user will check in at a specific POI. However, user-POI data exhibit a long-tail distribution. To mitigate the sparsity of check-in data, it is a good idea to exploit the rich attributes of POIs and recommend POIs both geography-wise and category-wise. Generally, this problem is treated as two specific tasks with feature combination, ignoring cross-task dependencies and feature disentanglement. To address these problems, this paper proposes a novel joint framework named InteractPOI, enabling two-stage interaction between geography-wise and category-wise POI recommendations. Specifically, this paper comprehensively considers both the sequence effect and the neighbor effect, geography-wise and category-wise. For the first-stage interaction, we design a disentangled graph embedding model to distinguish the different geography-wise and category-wise influencing factors. For the second-stage interaction, we integrate a gating mechanism for feature fusion with a complementary algorithm for interactive optimization. Extensive experiments on two datasets demonstrate the superiority of the proposed model.
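The gating mechanism for feature fusion mentioned in the second stage can be illustrated with a minimal per-dimension gate; this is a generic sketch of gated fusion, not InteractPOI's actual architecture, and the embeddings and gate weight below are hypothetical rather than learned:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(geo_feat, cat_feat, w, b=0.0):
    """Sketch of gated feature fusion: a sigmoid gate decides, per
    dimension, how much geography-wise vs. category-wise signal to
    keep. In a real model, w and b would be learned parameters."""
    fused = []
    for g, c in zip(geo_feat, cat_feat):
        gate = sigmoid(w * (g - c) + b)      # scalar gate per dimension
        fused.append(gate * g + (1 - gate) * c)
    return fused

geo = [0.8, 0.1, 0.5]   # hypothetical geography-wise POI embedding
cat = [0.2, 0.9, 0.5]   # hypothetical category-wise POI embedding
print([round(v, 3) for v in gated_fusion(geo, cat, w=4.0)])
```

Each fused value is a convex combination of the two views, so the gate can lean toward whichever signal dominates a dimension without discarding the other.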
Funding: Partially funded by the National Natural Science Foundation of China (Nos. 71503103 and 72372059); the National Social Science Foundation of China (Nos. 19FGLB031 and 22AJL002); the National Statistical Science Research Program of China (No. 2024LZ015); the Outstanding Youth in Social Sciences of Jiangsu Province; the Qinglan Project of Jiangsu Province; the Engineering Research Center of Integration and Application of Digital Learning Technology, Ministry of Education (No. 1321005); the Educational Planning Project of Jiangsu Province (No. ZYJN/2024/01); the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. SJCX241336); the Fundamental Research Funds for the Central Universities (Nos. JUSRP622047 and JUSRP321016); the Soft Science Foundation of Wuxi City (No. KX-24-A15); and the Jiangsu Province Science and Technology Think Tank Program Youth Project (No. JSKX0125058).
Abstract: Many panel data decision problems in real life exhibit obvious structural similarities and lag effects among decision objects or indicators, and they are difficult to solve effectively with traditional panel data analysis methods. To deal with these problems, considering the structural characteristics of panel data and the lag effect, and working across multiple structural dimensions such as scale volume, development trend, and volatility, we combine grey incidence analysis with panel data to establish an indicator-type grey structural incidence analysis model, and we use it to analyze and identify factors influencing the technological innovation of industrial enterprises. The results show that the proposed method fully accounts for the structural characteristics of panel data and the lag effect; it can handle panel data decision problems and provides new methodological support for grey incidence analysis.
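Classical grey incidence analysis, on which the model above builds, measures how closely candidate factor sequences track a reference sequence. The sketch below implements Deng's standard (non-panel, non-structural) grey incidence degree, so it simplifies the indicator-type model considerably, and the yearly figures are hypothetical:

```python
def grey_incidence(reference, comparisons, rho=0.5):
    """Deng's grey incidence degrees between a reference sequence and
    several comparison sequences (simplified, non-panel sketch).
    rho is the conventional distinguishing coefficient."""
    # Normalize each sequence by its initial value
    ref = [x / reference[0] for x in reference]
    norm = [[x / seq[0] for x in seq] for seq in comparisons]
    deltas = [[abs(r - c) for r, c in zip(ref, seq)] for seq in norm]
    flat = [d for row in deltas for d in row]
    dmin, dmax = min(flat), max(flat)   # global extremes over all sequences
    return [sum((dmin + rho * dmax) / (d + rho * dmax) for d in row) / len(row)
            for row in deltas]

# Hypothetical yearly innovation output (reference) vs. two candidate factors
innovation = [100, 120, 150, 180]
factor_a   = [50, 61, 74, 92]    # tracks innovation closely
factor_b   = [50, 48, 52, 49]    # nearly flat
deg = grey_incidence(innovation, [factor_a, factor_b])
print(deg[0] > deg[1])  # → True: factor_a has the higher incidence degree
```

Ranking factors by incidence degree is the basic identification step; the paper's contribution extends this to panel data structure and lag effects.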
Funding: Supported by the National Key R&D Program of China (No. 2022ZD0116205).
Abstract: In this paper, we propose and implement a systematic pipeline for the automatic classification of AI-related documents extracted from large-scale literature databases. This process results in the creation of an AI-related literature dataset named DeepDiveAI. The dataset construction pipeline integrates expert knowledge with the capabilities of advanced models and is structured into two primary stages. In the first stage, expert-curated classification datasets are used to train a Long Short-Term Memory (LSTM) model, which performs coarse-grained classification of AI-related records from large-scale datasets. In the second stage, a large language model, specifically Qwen2.5 Plus, is employed to annotate a random 10% of the initially classified coarse set of AI-related records. These annotated records are subsequently used to train a Bidirectional Encoder Representations from Transformers (BERT) based binary classifier, further refining the coarse set to produce the final DeepDiveAI dataset. Evaluation results indicate that the proposed pipeline achieves both accuracy and efficiency in identifying AI-related literature from large-scale datasets.
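The LSTM and BERT models cannot be reproduced in a few lines, but the two-stage structure of the pipeline, a cheap high-recall coarse filter followed by a stricter refinement pass, can be sketched with trivial keyword-based stand-ins. Everything below (keywords, records, both filters) is illustrative, not the paper's actual classifiers:

```python
# Structural sketch of a two-stage filtering pipeline, with trivial
# keyword stand-ins in place of the LSTM (stage 1) and the
# LLM-annotated BERT classifier (stage 2).
AI_KEYWORDS = {"neural", "learning", "transformer", "reinforcement"}

def coarse_filter(record):
    """Stage 1 stand-in: cheap, high-recall keyword screen."""
    return any(k in record.lower() for k in AI_KEYWORDS)

def fine_filter(record):
    """Stage 2 stand-in: stricter check refining the coarse set."""
    r = record.lower()
    return "learning" in r or "transformer" in r

def build_dataset(records):
    coarse = [r for r in records if coarse_filter(r)]
    return [r for r in coarse if fine_filter(r)]

records = [
    "Deep learning for protein folding",
    "Neural correlates of sleep in mice",   # 'neural' but not AI-related
    "A transformer model for code search",
    "Soil chemistry of wetlands",
]
print(build_dataset(records))
```

The neuroscience record illustrates why a second stage is needed: the coarse screen admits it on a surface keyword, and the refinement pass removes it.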
Abstract: The paper critically evaluates the global discourses on algorithmic fairness, reviews key Western literature on artificial intelligence (AI) fairness, identifies twelve documented cases of algorithmic discrimination in Western contexts, and extends them for their analytical relevance to non-Western socio-political environments. The study applies these frameworks in particular to the Indian context and proposes that India's entrenched socio-cultural structures, including caste, religion, language, regional identity, and minority status, tend to misalign with Western paradigms of fairness. The study identifies identity-specific factors unique to India that are likely to contribute to algorithmic oppression if unaddressed. A critical policy analysis of major documents shaping India's AI and digital governance landscape reveals that these critical factors remain largely unacknowledged. Indian policy responses tend to replicate Western techno-legal models without engaging indigenous socio-structural realities. The paper concludes that ethical AI governance in India must transcend imported normative models and instead be rooted in context-sensitive approaches that accommodate the nation's distinct social fabric, in order to prevent algorithm-induced structural discrimination and ensure inclusive algorithmic justice.
Abstract: Common legislative prediction methods often emphasize bill content or social relationships. Motivated by the insight that similar policy texts reflect comparable political ideologies and can lead to similar voting outcomes, this paper proposes a deep learning method that exploits attention mechanisms to incorporate semantic similarity between bills into legislative prediction models. Our approach uses attention scores to identify the bills most similar to the one being predicted and combines the encoded features of these similar bills as additional auxiliary information. By integrating these related features, the model goes beyond the semantic information of individual bills, leading to a more comprehensive use of roll-call data. Empirical results show that utilizing bill similarity along with traditional social relationships, voter characteristics, and bill content significantly improves performance in terms of accuracy, recall, precision, and F1 score compared to models that ignore bill similarity. The results also confirm that legislators tend to maintain consistent views or voting patterns on bills that are similar in nature. In addition, we demonstrate that the attention mechanism is more effective than conventional similarity measures, such as cosine similarity and Euclidean distance, in capturing the similarities between bills.
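The attention step described above, scoring past bills against the current one and pooling their encoded features as auxiliary input, can be sketched as basic dot-product attention. The bill encodings below are hypothetical two-dimensional vectors, not outputs of the paper's encoder:

```python
import math

def attention_pool(query, keys, values):
    """Sketch of attention over previously voted bills: score each past
    bill against the current one, softmax the scores, and return the
    attention weights plus the weighted sum of encoded features."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, pooled

# Hypothetical bill encodings: the current bill and three past bills
current = [1.0, 0.0]
past_enc = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]
weights, aux = attention_pool(current, past_enc, past_enc)
print(max(range(3), key=lambda i: weights[i]))  # → 0: the most similar past bill
```

Unlike a fixed measure such as cosine similarity, the attention scores here would, in the full model, be computed from learned projections, which is what lets the mechanism adapt its notion of similarity to the voting task.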
Abstract: In the global venture capital (VC) landscape, cross-community collaboration is vital for foreign VC firms, especially in markets like China, where the business environment and the guanxi culture present unique challenges. Using co-investment data from 2000 to 2014, this study identifies seven communities through a semi-supervised detection method, categorizing them by the predominance of domestic or foreign VCs. Cross-community collaboration refers to partnerships between VC firms from different communities, involving at least one domestic and one foreign VC. Logistic regression analysis reveals that industry distance does not significantly impact cross-community collaboration. However, industry hotness and local knowledge positively moderate this relationship. In the Chinese context, signaling theory suggests that cross-community collaborations between foreign and domestic VCs act as a signal of credibility. Guanxi, characterized by trust and reciprocity, encourages foreign VCs to foster long-term relationships with domestic counterparts, helping them bridge industrial and cultural gaps. Additionally, industry hotness and local experience reduce investment risk and uncertainty, leading foreign VCs to engage more frequently in cross-community collaborations that link domestic and foreign ecosystems. This study integrates signaling theory with guanxi in the cross-community VC context, emphasizing the strategic role of syndication as a signal in emerging markets like China.
Abstract: The human-centric visual analysis field thrives on rich video datasets that explore human behaviours and interactions. Yet a gap persists in datasets covering both human pose estimation and parsing challenges. In this study, a notable effort has been made to develop a dedicated dataset named Single Person Video-in-Person (SP-VIP) to suit this research scenario, resolving the lack of a universal dataset supporting three major human-centric visual analysis methods. The SP-VIP dataset was derived by extracting videos from the VIP dataset, which was initially designed exclusively for parsing-related tasks and did not encompass provisions for pose estimation, a crucial element for human activity recognition. To bridge this gap, the SP-VIP dataset was meticulously curated with a specific focus on single-person activities. Videos in the newly created dataset are split into frames with semantic labels and joint values for each frame. To assess the performance of the tailored dataset, a novel architecture, the Single-person Parsing and Pose Network (SPPNet), was employed, using a deep ConvNet for parsing while simultaneously performing pose estimation with the stacked hourglass method. To demonstrate the effectiveness of the newly created dataset, extensive experiments were performed on the discussed architecture, which produced favourable results with a pixel accuracy of 88.50%, a mean accuracy of 60.50%, and a mean Intersection over Union (IoU) of 49.30%, signifying enhanced performance.
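The parsing metrics reported above have standard definitions that are easy to sketch. The toy 8-pixel masks below are hypothetical, used only to show how pixel accuracy and mean IoU are computed from per-pixel class labels:

```python
def seg_metrics(pred, truth, num_classes):
    """Pixel accuracy and mean IoU for flat lists of per-pixel labels,
    matching the evaluation metrics reported for SPPNet."""
    correct = sum(p == t for p, t in zip(pred, truth))
    pixel_acc = correct / len(truth)
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        if union:                       # skip classes absent from both masks
            ious.append(inter / union)
    mean_iou = sum(ious) / len(ious)
    return pixel_acc, mean_iou

# Toy 8-pixel example with 2 classes (0 = background, 1 = person)
truth = [0, 0, 0, 1, 1, 1, 1, 0]
pred  = [0, 0, 1, 1, 1, 1, 0, 0]
acc, miou = seg_metrics(pred, truth, 2)
print(round(acc, 3), round(miou, 3))  # → 0.75 0.6
```

Mean IoU penalizes boundary errors per class rather than per pixel, which is why it typically sits well below pixel accuracy, as in the 49.30% vs. 88.50% figures above.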
Funding: Supported by the National Natural Science Foundation of China (No. 62302213) and the Key Laboratory of Social Computing and Cognitive Intelligence (Dalian University of Technology), Ministry of Education, China.
Abstract: The rise of online social platforms has enhanced connectivity and access to information, but it has also enabled the proliferation of malicious social bots that threaten platform security and disrupt social order. In this paper, we introduce a unified framework for defining and classifying malicious social bots along three dimensions: behavior, interaction, and operation. We then present a comprehensive review of social bot detection methods, tracing their evolution from traditional machine learning techniques to deep learning architectures and graph neural networks, with particular emphasis on recent advances in group-level detection. We also explore the emerging paradigm of Large Language Model (LLM) based bot detection. This paper reviews the current state of research, identifies key challenges, and outlines future directions, providing a cohesive foundation for building more robust detection frameworks to counter the evolving threats posed by malicious social bots.