Abstract: With the rapid development of large language models (LLMs), LLM-based conversational assistants are becoming a new way for students to learn. Through question-and-answer interaction, a conversational assistant generates answers that help students solve problems and improve learning efficiency. However, existing conversational assistants ignore students' individual needs and cannot provide the personalized answers required to "teach students according to their aptitude." We therefore propose a personalized conversational-assistant framework based on student ability perception. The framework comprises two main modules: a student ability perception module and a personalized answer generation module. The ability perception module mines a student's knowledge mastery by analyzing the student's answer records, and the answer generation module produces personalized answers according to the student's ability. On top of this framework, we design three implementation paradigms (instruction-based, small-model-driven, and agent-based) to examine the framework's practical effectiveness. The instruction-based assistant uses the LLM's reasoning ability to infer knowledge mastery from answer records and guide personalized answer generation; the small-model-driven assistant uses a deep knowledge tracing (DKT) model to estimate the student's knowledge mastery; the agent-based assistant uses an LLM agent to orchestrate tools for ability perception, personalization checking, and answer revision. Comparative experiments on ChatGLM (Chat General Language Model) and GPT4o_mini show that all three paradigms enable the LLM to provide personalized answers, with the agent-based paradigm achieving higher accuracy, indicating that it better perceives student ability and generates personalized answers.
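The ability perception module above mines knowledge mastery from a student's answer records. As a minimal illustrative sketch of that idea (not the paper's DKT model or prompt design), one could keep a recency-weighted correctness average per knowledge concept; the record format and the `decay` factor here are assumptions for illustration only:

```python
from collections import defaultdict

def estimate_mastery(records, decay=0.8):
    """Estimate per-concept mastery from a student's answer records.

    records: list of (concept, correct) pairs in chronological order.
    Each concept's mastery is an exponentially weighted average of
    correctness, so recent answers count more. The decay value is a
    hypothetical smoothing factor, not taken from the paper.
    """
    mastery = {}
    for concept, correct in records:
        x = 1.0 if correct else 0.0
        if concept not in mastery:
            mastery[concept] = x  # first observation initializes the estimate
        else:
            mastery[concept] = decay * mastery[concept] + (1 - decay) * x
    return mastery

records = [("fractions", True), ("fractions", False), ("algebra", True)]
mastery = estimate_mastery(records)
```

A generation module could then condition its answer on which concepts fall below a mastery threshold.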
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62236004, 62206078, and 62476073).
Abstract: The primary objective of Chinese grammatical error correction (CGEC) is to detect and correct errors in Chinese sentences. Recent research shows that large language models (LLMs) have been applied to CGEC with significant results. For LLMs, selecting appropriate reference examples can help improve their performance. However, existing methods predominantly rely on text similarity for example retrieval, a strategy that frequently mismatches actual error patterns and retrieves lexically similar yet grammatically irrelevant sentences. To address this problem, we propose a method named RE², which retrieves appropriate examples with explanations of grammatical errors. Instead of using the text similarity of the input sentence, we use explanations of grammatical errors to select reference examples, which are then used by LLMs to improve the performance of CGEC. We conduct experiments on two CGEC datasets and create a high-quality grammatical error explanation (GEE) dataset, which is not only used in our research but also serves as a valuable resource for future studies in both CGEC and GEE. The experimental results on the two datasets indicate that our proposed method effectively improves the performance of CGEC.
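The core idea above is to rank reference examples by similarity of their error explanations rather than of the sentences themselves. A toy sketch of that retrieval step follows; bag-of-words cosine similarity stands in for whatever encoder the paper actually uses, and the pool format is an assumption:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve_examples(query_explanation, pool, k=2):
    """Rank reference examples by similarity of their error explanations.

    pool: list of (explanation, example) pairs. The similarity function
    is a stand-in; any explanation encoder could replace it.
    """
    q = Counter(query_explanation.split())
    ranked = sorted(pool, key=lambda p: -cosine(q, Counter(p[0].split())))
    return [example for _, example in ranked[:k]]
```

The top-k examples would then be placed in the LLM prompt as in-context demonstrations.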
Abstract: Empirical Risk Minimization (ERM) models often rely on spurious correlations between features and labels during the learning process, leading to shortcut learning behavior that undermines robustness and generalization performance. Current research mainly targets identifying or mitigating a single shortcut; however, in real-world scenarios, cues within the data are diverse and unknown. In empirical studies, we reveal that models rely more on strong shortcuts than on weak ones, with their performance under multiple shortcuts typically falling between that of the individual shortcuts. To address these challenges, we propose MiMu, a novel method integrated with Transformer-based ERMs and designed to Mitigate Multiple shortcut learning behaviors, which incorporates a self-calibration strategy and a self-improvement strategy. In the source model, we first apply the self-calibration strategy to prevent the model from relying on shortcuts and making overconfident predictions. Then, we design a self-improvement strategy in the target model to further reduce the reliance on multiple shortcuts. Its random mask strategy randomly masks partial attention positions to diversify the focus of the target model, avoiding fixation on a fixed region. Meanwhile, the adaptive attention alignment module aligns attention weights to the calibrated source model, without the need for post-hoc attention maps or supervision. Finally, extensive experiments conducted on Natural Language Processing (NLP) and Computer Vision (CV) tasks demonstrate the effectiveness of MiMu in improving robustness and generalization abilities.
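The random mask strategy described above drops a random subset of attention positions each step so the target model cannot fixate on one region. A minimal sketch of such a mask generator is below; the mask ratio, uniform sampling, and 0/1 mask format are illustrative assumptions rather than the paper's exact schedule:

```python
import random

def random_attention_mask(seq_len, mask_ratio=0.2, seed=None):
    """Randomly mask a fraction of attention positions.

    Returns a 0/1 list over seq_len positions; positions with 0 would be
    excluded from attention, forcing the model to diversify its focus.
    Resampling each training step varies which regions are hidden.
    """
    rng = random.Random(seed)
    n_mask = int(seq_len * mask_ratio)
    masked = set(rng.sample(range(seq_len), n_mask))
    return [0 if i in masked else 1 for i in range(seq_len)]
```

In practice the mask would be applied as an additive -inf bias to attention logits before the softmax.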
Funding: Grants from the National Key Research and Development Program of China (2016YFB1000904), the National Natural Science Foundation of China (Grant Nos. 61325010 and U1605251), and the Fundamental Research Funds for the Central Universities of China (WK2350000001). Le Wu gratefully acknowledges the support of the Open Project Program of the National Laboratory of Pattern Recognition (201700017) and the Fundamental Research Funds for the Central Universities (JZ2016HGBZ0749). Yong Ge acknowledges the support of the National Natural Science Foundation of China (NSFC, Grant Nos. 61602234 and 61572032).
Abstract: Recently, many online Karaoke (KTV) platforms have been released, where music lovers sing songs. Meanwhile, the system automatically evaluates user proficiency according to their singing behavior. Recommending appropriate songs to users can encourage singers' participation and improve users' loyalty to these platforms. However, this is not an easy task due to the unique characteristics of these platforms. First, since users may not achieve high system-evaluated scores on their favorite songs, how to balance user preferences with singing proficiency for song recommendation remains open. Second, the sparsity of user-song interaction behavior may greatly impact the recommendation task. To solve these two challenges, in this paper we propose an information-fused song recommendation model that considers the unique characteristics of the singing data. Specifically, we first devise a pseudo-rating matrix by combining users' singing behavior and the system evaluations, so that both users' preferences and proficiency are leveraged. We then mitigate the data sparsity problem by fusing users' and songs' rich information into the matrix factorization of the pseudo-rating matrix. Finally, extensive experimental results on a real-world dataset show the effectiveness of our proposed model.
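The pseudo-rating matrix above fuses two signals per user-song cell: how much the user likes the song and how well the system scored their singing. A toy sketch of one such cell is below; the play-count proxy for preference and the linear trade-off weight `alpha` are assumptions for illustration, not the paper's exact formulation:

```python
def pseudo_rating(play_count, max_plays, system_score, alpha=0.5):
    """Fuse singing preference and proficiency into one pseudo-rating.

    play_count / max_plays approximates preference (how often the user
    sings the song relative to their most-sung song); system_score in
    [0, 1] is the platform's proficiency evaluation. alpha is a
    hypothetical trade-off weight between the two signals.
    """
    preference = play_count / max_plays if max_plays else 0.0
    return alpha * preference + (1 - alpha) * system_score
```

The resulting matrix of pseudo-ratings would then be factorized, with user and song side information fused in to counter sparsity.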
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018AAAO10190, the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20191420, the National Natural Science Foundation of China under Grant No. 61632016, the Natural Science Research Project of Jiangsu Higher Education Institutions under Grant No. 17KJA520003, the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Suda-Toycloud Data Intelligence Joint Laboratory.
Abstract: Entity linking (EL) is the task of determining the identity of textual entity mentions given a predefined knowledge base (KB). Plenty of existing efforts have been made on this task using either "local" information (contextual information of the mention in the text) or "global" information (relations among candidate entities). However, either local or global information might be insufficient, especially when the given text is short. To get richer local and global information for entity linking, we propose to enrich the context information for mentions by retrieving extra contexts from the web through web search engines (WSE). Based on this intuition, we make two novel attempts. The first adds web-searched results into an embedding-based method to expand the mention's local information, where we try two different methods to help generate high-quality web contexts: one applies the attention mechanism and the other uses abstract extraction. The second uses the web contexts to extend the global information, i.e., finding and utilizing more relevant mentions from the web contexts with a graph-based model. Finally, we combine the two proposed models to use both the extended local and global information from the extra web contexts. Our empirical study on six real-world datasets shows that using extra web contexts to extend the local and global information effectively improves the F1 score of entity linking.
Funding: This work was funded by the National Natural Science Foundation of China (62001205), the National Key R&D Program of China (2021YFF1200804), and the Shenzhen Science and Technology Innovation Committee (2022410129, KCXFZ20201221173400001, and SGDX2020110309280100).
Abstract: Large language models (LLMs) have made unprecedented progress, demonstrating human-like language proficiency and an extraordinary ability to encode complex knowledge. The emergence of high-level cognitive capabilities in LLMs, such as in-context learning and complex reasoning, suggests a path toward the realization of artificial general intelligence (AGI). However, we lack scientific theories and tools to assess and interpret such an emergence of advanced intelligence in LLMs. Artificial intelligence (AI) has been extensively applied in various areas of fundamental science to accelerate scientific research.
Funding: National Natural Science Foundation of China (Grant Nos. 62376166, 62306188, and 61876113) and the National Key R&D Program of China (No. 2022YFC3303504).
Abstract: Discourse relation classification is a fundamental task for discourse analysis, which is essential for understanding the structure and connections of texts. Implicit discourse relation classification aims to determine the relationship between adjacent sentences and is very challenging because it lacks explicit discourse connectives as linguistic cues and sufficient annotated training data. In this paper, we propose a discriminative instance selection method to construct synthetic implicit discourse relation data from easy-to-collect explicit discourse relations. An expanded instance consists of an argument pair and its sense label. We introduce the argument pair type classification task, which aims to distinguish between implicit and explicit argument pairs and to select the explicit argument pairs that are most similar to natural implicit argument pairs for data expansion. We also propose a simple label-smoothing technique to assign robust sense labels to the selected argument pairs. We evaluate our method on PDTB 2.0 and PDTB 3.0. The results show that our method consistently improves the performance of the baseline model and achieves competitive results with state-of-the-art models.
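The label-smoothing step above assigns a softened sense label to each selected argument pair instead of a hard one-hot label. A standard sketch of that operation follows; the epsilon value is a common default, not necessarily the paper's setting:

```python
def smooth_labels(hard_label, num_classes, epsilon=0.1):
    """Smooth a one-hot sense label for a selected explicit argument pair.

    The true class keeps probability 1 - epsilon; the remaining epsilon
    mass is spread uniformly over the other classes, so the synthetic
    instance's label is less overconfident about its sense.
    """
    off = epsilon / (num_classes - 1)
    return [1.0 - epsilon if c == hard_label else off
            for c in range(num_classes)]
```

The smoothed distribution is then used as the training target in place of the one-hot label.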
Funding: Supported by the National Key R&D Program of China (2022ZD0160605), the National Natural Science Foundation of China (61976002), the University Synergy Innovation Program of Anhui Province (GXXT-2022-036), the Natural Science Foundation of Anhui Province (No. 2208085J18), the National Natural Science Foundation of China under Grant 62106006, and the Natural Science Foundation of Anhui Higher Education Institutions (No. 2022AH040014).
Abstract: The primary goal of visible-infrared person re-identification (VI-ReID) is to match pedestrian photos obtained during the day and at night. The majority of existing methods simply generate auxiliary modalities to reduce the modality discrepancy for cross-modality matching. They capture modality-invariant representations but ignore the extraction of modality-specific representations that can aid in distinguishing among various identities of the same modality. To alleviate these issues, this work provides a novel specific and shared representations learning (SSRL) model for VI-ReID that learns modality-specific and modality-shared representations. We design a shared branch in SSRL to bridge the image-level gap and learn modality-shared representations, while a specific branch retains the discriminative information of visible images to learn modality-specific representations. In addition, we propose intra-class aggregation and inter-class separation learning strategies to optimize the distribution of feature embeddings at a fine-grained level. Extensive experimental results on two challenging benchmark datasets, SYSU-MM01 and RegDB, demonstrate the superior performance of SSRL over state-of-the-art methods.
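Intra-class aggregation and inter-class separation, as described above, pull embeddings of the same identity together while pushing different identities apart. As a minimal illustration of the two quantities such a loss would act on (not the paper's actual loss formulation), one can measure the mean sample-to-centroid distance within classes against the mean distance between class centroids:

```python
import math
from collections import defaultdict

def intra_inter_stats(embeddings, labels):
    """Mean sample-to-centroid distance (intra-class aggregation target)
    and mean centroid-to-centroid distance (inter-class separation).

    A training objective along these lines would shrink the first value
    and grow the second; the Euclidean metric here is an assumption.
    """
    groups = defaultdict(list)
    for emb, y in zip(embeddings, labels):
        groups[y].append(emb)
    # One centroid per identity class.
    centroids = {y: tuple(sum(c) / len(pts) for c in zip(*pts))
                 for y, pts in groups.items()}
    intra = [math.dist(emb, centroids[y])
             for emb, y in zip(embeddings, labels)]
    cents = list(centroids.values())
    inter = [math.dist(cents[i], cents[j])
             for i in range(len(cents)) for j in range(i + 1, len(cents))]
    return sum(intra) / len(intra), sum(inter) / len(inter)
```

A well-trained embedding space should show a small first value relative to the second.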