Resistive Random Access Memory (RRAM), with its in-memory computing capability, is regarded as an efficient platform for neural network acceleration. Pruning compresses models by removing redundant weights, thereby saving hardware resources in RRAM-based neural network accelerators. However, existing structured pruning methods for RRAM tend to degrade accuracy because of their overly coarse pruning granularity, and they generally overlook numerical regularities among weights, leaving this latent redundancy unexploited and making it difficult to further improve compression ratio and hardware efficiency while preserving accuracy. To address this, this paper proposes a weight-reconstruction-based pruning method for memristive neural networks. An integer-scaling weight reconstruction strategy extracts and shares the numerical commonality among weights while discarding the numerical components that have little impact on accuracy; only the essential weight information is mapped onto the RRAM crossbar for inference, yielding a compressed weight representation. A progressive retraining mechanism then reintroduces the discarded information as a gradually decaying guidance signal, effectively recovering model accuracy while maintaining the compression ratio and hardware efficiency. Experimental results show that, compared with existing methods, the proposed method achieves up to 1.2x, 1.2x, and 1.3x improvements in model compression ratio, area efficiency, and energy efficiency, respectively, with almost no loss of model accuracy.
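The integer-scaling reconstruction and decaying guidance described above can be sketched as follows. This is a minimal illustration under assumed details: the grouping of weights into matrix rows, the choice of the first row as the shared base vector, and the linear decay schedule are all hypothetical choices for illustration, not the paper's exact formulation.

```python
import numpy as np

def integer_scale_reconstruct(W):
    """Approximate each weight row as an integer multiple of one
    shared base row. Only the base row and the integer scale
    factors would need to be mapped onto the RRAM crossbar."""
    base = W[0]  # illustrative choice of shared base vector
    # Integer scale per row: rounded least-squares coefficient.
    scales = np.rint(W @ base / (base @ base)).astype(int)
    approx = scales[:, None] * base[None, :]
    residual = W - approx  # the discarded, low-importance part
    return base, scales, residual

def guided_weights(approx, residual, step, total_steps):
    """Progressive retraining sketch: re-inject the residual with a
    decaying coefficient so the network gradually adapts to the
    compressed representation alone."""
    alpha = max(0.0, 1.0 - step / total_steps)  # linear decay 1 -> 0
    return approx + alpha * residual

# Toy check: rows that really are integer multiples of a common base
# are represented losslessly by (base, scales) alone.
rng = np.random.default_rng(0)
base_true = rng.normal(size=8)
W = np.stack([k * base_true for k in (1, 2, 3, -1)])
base, scales, residual = integer_scale_reconstruct(W)
```

In this toy case the residual is exactly zero; for real weights the nonzero residual is what the progressive retraining compensates for.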
Conversational recommender systems (CRSs) focus on refining preferences and providing personalized recommendations through natural language interactions and dialogue history. Large language models (LLMs) have shown outstanding performance across various domains, prompting researchers to investigate their applicability to recommendation systems. However, due to the lack of task-specific knowledge and an inefficient feature extraction process, LLMs still perform suboptimally on recommendation tasks. Therefore, external knowledge sources, such as knowledge graphs (KGs) and knowledge bases (KBs), are often introduced to address the issue of data sparsity. Compared with KGs, KBs offer higher retrieval efficiency, making them more suitable for scenarios where LLMs serve as recommenders. To this end, we introduce LLMKB, a novel framework integrating LLMs with KBs for enhanced retrieval generation. LLMKB first leverages structured knowledge to create mapping dictionaries, extracting entity-relation information from heterogeneous knowledge to construct KBs. Then, LLMKB calibrates the embeddings of user information representations against the documents in the KBs through retrieval-model fine-tuning. Finally, LLMKB employs retrieval-augmented generation to produce recommendations based on fused text inputs, followed by post-processing. Experimental results on two public CRS datasets demonstrate the effectiveness of our framework. Our code is publicly available at https://anonymous.4open.science/r/LLMKB-6FD0.
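The KB-construction and retrieval steps of the pipeline above can be sketched as follows. This is a simplified stand-in under assumed details: the triple format, the bag-of-words cosine similarity (in place of a fine-tuned retrieval model), and the prompt layout are all hypothetical illustrations, not LLMKB's actual components.

```python
import math
from collections import Counter

def build_kb(triples):
    # Flatten entity-relation triples into textual KB documents.
    return [f"{h} {r} {t}" for h, r, t in triples]

def embed(text):
    # Toy embedding: bag-of-words term counts (a fine-tuned dense
    # retriever would replace this in the real framework).
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, kb, k=2):
    # Rank KB documents by similarity to the user query, keep top-k.
    q = embed(query)
    return sorted(kb, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

triples = [("Inception", "directed_by", "Christopher Nolan"),
           ("Inception", "genre", "sci-fi"),
           ("Titanic", "directed_by", "James Cameron")]
kb = build_kb(triples)
docs = retrieve("sci-fi movie by Christopher Nolan", kb, k=2)
# Retrieved documents are fused with the dialogue into the LLM prompt.
prompt = "Context: " + " | ".join(docs) + "\nUser: recommend a movie."
```

The fused `prompt` would then be passed to the LLM for retrieval-augmented generation, with the response post-processed into final recommendations.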