Funding: Supported in part by grants from the National Science and Technology Major Project, China (Grant No. 2021ZD0111802), the National Natural Science Foundation of China (Grant Nos. 72188101, 62406096, and 62376086), and the Fundamental Research Funds for the Central Universities, China (Grant No. JZ2024HGQB0093).
Abstract: Student cognitive modeling is a fundamental task in the intelligent education field. It serves as the basis for various downstream applications, such as student profiling, personalized educational content recommendation, and adaptive testing. Cognitive Diagnosis (CD) and Knowledge Tracing (KT) are the two mainstream categories of student cognitive modeling: CD measures cognitive ability over a limited time span (e.g., an exam), while KT measures the dynamics of learning ability over a long period (e.g., a year of learning records). Recent efforts have been dedicated to developing open-source code libraries for student cognitive modeling. However, existing libraries often focus on one category and overlook the relationship between the two. Additionally, these libraries lack sufficient modularization, which hinders reusability. To address these limitations, we have developed EduStudio, a PyTorch-based library that unifies CD and KT for student cognitive modeling. The design philosophy of EduStudio is twofold. From a horizontal perspective, EduStudio employs modularization that separates the main pipeline steps of each algorithm. From a vertical perspective, we use inheritance-style templates to implement each module. We also provide eco-services for EduStudio, such as a repository that collects resources on student cognitive modeling and a leaderboard that compares models. Our open-source project is available at edustudio.ai.
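The two design ideas in the abstract above (pipeline steps separated into modules, and modules implemented as templates that subclasses inherit from) can be illustrated with a minimal sketch. This is not EduStudio's API: every class and method name below is hypothetical, the "models" are toy accuracy counters rather than PyTorch models, and the record layout is an assumption made only for the example.

```python
from abc import ABC, abstractmethod


class BaseModelTemplate(ABC):
    """Hypothetical template: fixes the pipeline order; subclasses fill in steps."""

    def run(self, records):
        # Horizontal view: the pipeline is split into separate steps.
        data = self.prepare(records)
        return self.predict(data)

    def prepare(self, records):
        # Default data step shared by every subclass: drop unlabeled records.
        return [r for r in records if r.get("correct") is not None]

    @abstractmethod
    def predict(self, data):
        ...


class MeanAccuracyCD(BaseModelTemplate):
    """Toy stand-in for cognitive diagnosis: one static score per student."""

    def predict(self, data):
        totals = {}
        for r in data:
            hits, n = totals.get(r["student"], (0, 0))
            totals[r["student"]] = (hits + r["correct"], n + 1)
        return {s: hits / n for s, (hits, n) in totals.items()}


class RunningAccuracyKT(BaseModelTemplate):
    """Toy stand-in for knowledge tracing: a score after each interaction."""

    def predict(self, data):
        traces, totals = {}, {}
        for r in data:  # records are assumed to be in chronological order
            hits, n = totals.get(r["student"], (0, 0))
            hits, n = hits + r["correct"], n + 1
            totals[r["student"]] = (hits, n)
            traces.setdefault(r["student"], []).append(hits / n)
        return traces


records = [
    {"student": "s1", "correct": 1},
    {"student": "s1", "correct": 0},
    {"student": "s2", "correct": 1},
    {"student": "s1", "correct": None},  # unlabeled, removed by prepare()
]
cd_result = MeanAccuracyCD().run(records)
kt_result = RunningAccuracyKT().run(records)
```

Both subclasses reuse the inherited `prepare` step and override only `predict`, which is the kind of reuse a template hierarchy is meant to enable: the static (CD-like) and sequential (KT-like) variants share one pipeline and differ in a single module.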
Funding: Funded by the National Natural Science Foundation of China (Nos. U23A20319, 62441227, 62441239, 62472394, 62202443, and 62506352), the Anhui Province Science and Technology Innovation Project (No. 202423k09020011), and the Anhui Provincial Science and Technology Major Project (No. 2023z020006).
Abstract: With the rapid advancement of Large Language Models (LLMs), LLM-based Open-domain Question Answering (OpenQA) methods have benefited from the emergent understanding and answering capabilities that massive parameter counts enable, compared with traditional methods. However, most of these methods face two critical challenges: how to integrate knowledge into LLMs effectively, and how to adaptively generate results in specific answer formats. To address these challenges, we propose a novel framework, GenKI, which aims to improve OpenQA performance by exploring knowledge integration and controllable generation on LLMs simultaneously. Specifically, we first train a dense passage retrieval model to retrieve associated knowledge from a given knowledge base. We then introduce a novel knowledge integration model that incorporates the retrieved knowledge into instructions during fine-tuning to strengthen the model. Furthermore, to enable controllable generation in LLMs, we leverage a fine-tuned LLM and an ensemble framework based on text consistency that accounts for coherence, fluency, and answer-format assurance. Extensive experiments on three datasets with diverse answer formats demonstrate the effectiveness of GenKI in comparison with state-of-the-art baselines. Moreover, ablation studies reveal a linear relationship between the frequency of retrieved knowledge and the model's ability to recall knowledge accurately against the ground truth. Tests on an out-of-domain scenario and a knowledge-base-independence scenario further affirm the robustness and controllability of GenKI. The code for GenKI is available at https://github.com/USTC-StarTeam/GenKI.
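The first two steps described above (dense retrieval from a knowledge base, then placing the retrieved knowledge into an instruction) can be sketched as follows. This is an illustrative toy under stated assumptions, not the GenKI implementation: the hand-written vectors stand in for a trained dense encoder, the inner-product ranking is a common choice that the abstract does not confirm, and the function names and instruction template are hypothetical.

```python
def dot(u, v):
    # Inner product between two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))


def retrieve(query_vec, passages, k=2):
    # Rank passages by inner-product similarity and keep the top k texts.
    ranked = sorted(passages, key=lambda p: dot(query_vec, p["vec"]), reverse=True)
    return [p["text"] for p in ranked[:k]]


def build_instruction(question, knowledge, answer_format):
    # Hypothetical instruction template that embeds the retrieved knowledge
    # and states the expected answer format.
    lines = [
        "Answer the question using the knowledge below.",
        f"Answer format: {answer_format}",
        "Knowledge:",
    ]
    lines += [f"- {item}" for item in knowledge]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Toy knowledge base; the vectors are made up for illustration only.
passages = [
    {"text": "Paris is the capital of France.", "vec": [0.9, 0.1, 0.0]},
    {"text": "The Nile is a river in Africa.", "vec": [0.0, 0.2, 0.9]},
    {"text": "France is a country in Europe.", "vec": [0.7, 0.3, 0.1]},
]
query_vec = [1.0, 0.0, 0.0]  # assumed embedding of the question

top = retrieve(query_vec, passages, k=2)
prompt = build_instruction(
    "What is the capital of France?", top, "a short entity name"
)
```

In a fine-tuning setup, a prompt built this way would be paired with the ground-truth answer as one training example; the consistency-based ensemble step from the abstract is not covered by this sketch.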
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61502077, 61631005) and the Fundamental Research Funds for the Central Universities (ZYGX2014Z012).
Abstract: Career indecision is a difficult obstacle confronting adolescents. Traditional vocational assessment research measures it by means of questionnaires and diagnoses the potential sources of career indecision. Based on the diagnostic outcomes, career counselors develop treatment plans tailored to students. However, because of personal motives and the architecture of the mind, students may find it difficult to know themselves, and questionnaire outcomes may not fully reflect their inner states. Self-perception theory suggests that students' behavior can serve as a clue for inference. We therefore propose a data-driven framework that forecasts students' career choices upon graduation from their behavior in and around the campus, which can play an important role in supporting career counseling and career guidance. An evaluation on 10M behavior records of over four thousand students shows the potential of the framework for this purpose.