2 articles found
Efficient Arabic Essay Scoring with Hybrid Models: Feature Selection, Data Optimization, and Performance Trade-Offs
1
Authors: Mohamed Ezz, Meshrif Alruily, Ayman Mohamed Mostafa, Alaa S. Alaerjan, Bader Aldughayfiq, Hisham Allahem, Abdulaziz Shehab. Computers, Materials & Continua, 2026, Issue 1, pp. 2274-2301 (28 pages)
Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language’s complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
Keywords: Automated essay scoring; text-based features; vector-based features; embedding-based features; feature selection; optimal data efficiency
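The two metrics reported in this abstract, R² and accuracy within a 0.5-point tolerance, can be illustrated with a minimal sketch. The function names, the inclusive tolerance check, and the sample scores are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy_within_tolerance(y_true, y_pred, tol=0.5):
    """Fraction of predictions whose absolute error is at most `tol`.

    Treating the boundary (error == tol) as a hit is an assumption.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Made-up scores purely for illustration; not data from the paper.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
model_scores = [1.0, 2.5, 3.0, 3.0, 5.0]

r2 = r_squared(human_scores, model_scores)
acc = accuracy_within_tolerance(human_scores, model_scores)
print(f"R^2 = {r2:.3f}, accuracy within 0.5 = {acc:.1%}")
```

With a tolerance-based accuracy, a prediction that is off by half a point still counts as correct, which is why this figure can sit alongside a regression metric such as R².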
From Symbols to Embeddings: A Tale of Two Representations in Computational Social Science (cited by: 5)
2
Authors: Huimin Chen, Cheng Yang, Xuanming Zhang, Zhiyuan Liu, Maosong Sun, Jianbin Jin. Journal of Social Computing, 2021, Issue 2, pp. 103-156 (54 pages)
Computational Social Science (CSS), aiming at utilizing computational methods to address social science problems, is a recent emerging and fast-developing field. The study of CSS is data-driven and significantly benefits from the availability of online user-generated contents and social networks, which contain rich text and network data for investigation. However, these large-scale and multi-modal data also present researchers with a great challenge: how to represent data effectively to mine the meanings we want in CSS? To explore the answer, we give a thorough review of data representations in CSS for both text and network. Specifically, we summarize existing representations into two schemes, namely symbol-based and embedding-based representations, and introduce a series of typical methods for each scheme. Afterwards, we present the applications of the above representations based on the investigation of more than 400 research articles from 6 top venues involved with CSS. From the statistics of these applications, we unearth the strength of each kind of representations and discover the tendency that embedding-based representations are emerging and obtaining increasing attention over the last decade. Finally, we discuss several key challenges and open issues for future directions. This survey aims to provide a deeper understanding and more advisable applications of data representations for CSS researchers.
Keywords: Computational Social Science (CSS); symbol-based representation; embedding-based representation; social network
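The distinction this survey draws between symbol-based and embedding-based representations can be sketched for text as follows. This is a toy illustration only: the vocabulary, the two-dimensional vectors in `toy_embeddings`, and all function names are hypothetical and hand-picked, not taken from the survey, and the network side of the survey is not covered.

```python
import math
from collections import Counter


def bag_of_words(text, vocab):
    """Symbol-based: one count per vocabulary word, no notion of similarity."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def mean_embedding(text, table):
    """Embedding-based: average the dense vectors of the known words."""
    dim = len(next(iter(table.values())))
    vecs = [table[w] for w in text.lower().split() if w in table]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical vocabulary and hand-picked toy vectors (not learned).
vocab = ["happy", "glad", "sad"]
toy_embeddings = {
    "happy": [0.9, 0.1],
    "glad": [0.8, 0.2],
    "sad": [-0.9, 0.1],
}

symbolic_sim = cosine(bag_of_words("happy", vocab), bag_of_words("glad", vocab))
embedding_sim = cosine(
    mean_embedding("happy", toy_embeddings),
    mean_embedding("glad", toy_embeddings),
)
print(f"symbol-based: {symbolic_sim:.2f}, embedding-based: {embedding_sim:.2f}")
```

In the symbol-based view the two near-synonyms share no dimension and so have zero similarity, whereas the dense toy vectors place them close together.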