Abstract
Graph convolutional networks (GCNs) are good at capturing the overall structure and semantics of text and the global correlations in text data, but their classification performance depends on the quality of the training data. This paper therefore proposes a legal case classification model based on word embedding constraints and a graph convolutional network. Starting from legal case data, the model builds a heterogeneous graph that encodes global word co-occurrence information and feeds it into a GCN to induce and classify judicial cases. In parallel, a pre-trained word embedding module extracts semantic relatedness between locally consecutive words. A cosine similarity layer constrains the high-dimensional word feature vectors so that global word co-occurrence features and local semantic relatedness features work together during classification. On a legal document dataset, the model achieves the best F1 scores both when all labels are available and when only some are. Hyperparameter sensitivity tests support the soundness of the model structure, and visualization of the text classification features further confirms its effectiveness.
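The two signals named in the abstract, global word co-occurrence and a cosine similarity constraint against pre-trained embeddings, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how co-occurrence edges are weighted or what form the constraint takes, so the sliding-window PMI weighting, the `1 - cosine` penalty, and the names `pmi_edges`, `cosine`, and `cosine_constraint` are all assumptions made for illustration. The sketch also assumes that graph-side word features and pre-trained embeddings have the same dimension.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(docs, window=3):
    """Return {(w1, w2): pmi} for word pairs with positive PMI.

    Assumption: edges are weighted by pointwise mutual information over
    sliding windows; the abstract only says "global word co-occurrence".
    """
    win_count = 0
    word_freq = Counter()
    pair_freq = Counter()
    for tokens in docs:
        # A document shorter than the window contributes a single window.
        n_windows = max(1, len(tokens) - window + 1)
        for i in range(n_windows):
            uniq = sorted(set(tokens[i:i + window]))
            win_count += 1
            word_freq.update(uniq)
            pair_freq.update(combinations(uniq, 2))
    edges = {}
    for (a, b), n_ab in pair_freq.items():
        pmi = math.log(n_ab * win_count / (word_freq[a] * word_freq[b]))
        if pmi > 0:
            edges[(a, b)] = pmi
    return edges


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cosine_constraint(graph_feats, pretrained):
    """Mean (1 - cosine) over words present in both maps.

    Assumption: a hypothetical penalty that is 0 when the graph-side word
    features point in the same direction as the pre-trained embeddings.
    """
    shared = [w for w in graph_feats if w in pretrained]
    if not shared:
        return 0.0
    total = sum(1.0 - cosine(graph_feats[w], pretrained[w]) for w in shared)
    return total / len(shared)


# Toy corpus of tokenized case descriptions (made-up tokens).
docs = [
    ["theft", "property", "stolen"],
    ["theft", "property", "fine"],
    ["contract", "breach", "damages"],
]
edges = pmi_edges(docs, window=3)
penalty = cosine_constraint({"theft": [1.0, 0.0]}, {"theft": [0.0, 1.0]})
```

In a full model, a penalty of this kind would typically be added to the classification loss with a weighting coefficient, which is one plausible reading of the hyperparameter sensitivity tests mentioned in the abstract.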
Authors
孟春运
谈镇
栾力
ABEO Timothy Apasiba
MENG Chunyun; TAN Zhen; LUAN Li; ABEO Timothy Apasiba (School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212100, China; School of Public Affairs, University of Science and Technology of China, Hefei 230026, China; Faculty of Applied Science and Technology, Tamale Technical University, Tamale 00233, Ghana)
Source
Journal of Jiangsu University of Science and Technology: Natural Science Edition (《江苏科技大学学报(自然科学版)》)
2025, No. 2, pp. 84-91 (8 pages)
Funding
National Social Science Fund of China, Key Project (16AJL008).
Keywords
deep learning
data mining
text classification
judicial efficiency
graph convolutional neural network