Abstract: Digital intelligence transformation has become essential in the international shipping industry, and both governments and enterprises are actively working to integrate diverse datasets. The maritime and shipping domain involves a vast array of document types containing complex, large-scale, and often disorganized knowledge and relationships. Managing these documents effectively is crucial for developing a Large Language Model (LLM) for the maritime domain that lets practitioners access and use this information. A Knowledge Graph (KG) can enhance knowledge retrieval, giving more accurate responses and enabling context-aware reasoning. This paper presents a framework for constructing a knowledge graph from maritime and shipping documents using GraphRAG, a hybrid tool that combines graph-based retrieval with generation. The paper details the extraction of entities and relationships from these documents and the KG construction process. The KG is then integrated with an LLM to build a Q&A system, which significantly improves answer accuracy compared with traditional LLMs. In addition, the KG construction process is up to 50% faster than conventional LLM-based approaches. The study offers a promising approach to digital intelligence in shipping that improves knowledge accessibility and decision-making.
Funding: Natural Science Foundation of Shandong Province of China (Y2004A04); Natural Science Foundation of Shandong Province of China (Y2006A12); Foundation of Ministry of Fujian Province Education of China (JA04268).
Abstract: Based on the definition of the class shortest path in a weighted rough graph, this paper presents a class shortest path algorithm for weighted rough graphs, which extends the classical shortest path algorithm. An application to relationship mining demonstrates its effectiveness.
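For orientation, the following is a minimal sketch of one classical shortest path algorithm (Dijkstra's) on an ordinary weighted graph. It is not the class shortest path algorithm of the paper, whose rough-graph definitions are not given in the abstract, and the abstract does not say which classical algorithm is extended. The function name `dijkstra` and the toy graph are assumptions made for illustration.

```python
import heapq
from typing import Dict


def dijkstra(graph: Dict[str, Dict[str, float]], source: str) -> Dict[str, float]:
    """Return the shortest distance from `source` to every reachable node.

    `graph` maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        # Skip stale heap entries that were superseded by a shorter path.
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy graph (hypothetical data): the direct edge a->c costs more than a->b->c.
graph = {"a": {"b": 2.0, "c": 5.0}, "b": {"c": 1.0}, "c": {}}
distances = dijkstra(graph, "a")
```

Here `distances["c"]` is 3.0, because the route through `b` is cheaper than the direct edge.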
Funding: The authors received specific funding for this study under grant No. 218051360020XN113.
Abstract: To address current problems in the informatization of higher education management, this paper proposes a development scheme for a collaborative education management platform. The main technology has three parts. First, the distributed databases are integrated, and a two-tier linked list is used to provide dynamic data access. Second, a relation graph displays the data of each student so that data can be shared visually. Third, a collaborative information security mechanism is implemented from three aspects to ensure that data is shared legally. Finally, the platform is developed in Java. It can help improve the effectiveness of educating students.
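Abstract: Alzheimer's Disease (AD) is a chronic neurodegenerative disease, and accurate classification supports early diagnosis so that targeted treatment and intervention can begin in time. This paper proposes a new AD classification method, the Graph neural network with nearest Neighborhood AgGrEgation (GraphNAGE). First, graph data modeling represents the AD data samples as graph data. A feature selection method based on Mutual Information (MI) selects the most important volumetric features from each sample's 114-dimensional volumetric features of the Cerebral Cortex and Subcortical Regions Of Interest (CCS-ROI), and these are used for node modeling. A relation modeling method based on similarity measures is proposed, which models the relationships between samples using the most important volumetric features, genetic information, demographic information, and cognitive scores. GraphNAGE is then constructed: for each node, nearest-neighborhood sampling is performed based on the weights of the edges incident to that node, mean aggregation combines the data of the sampled neighbor nodes and the central node, and a fully connected layer followed by a Softmax layer performs the AD classification. In experiments on the TADPOLE (The Alzheimer's Disease Prediction Of Longitudinal Evolution) dataset, the proposed method achieves an accuracy (ACC) of 98.20%, an F1 score of 97.34%, and an Area Under Curve (AUC) of 97.80%. The results show that the proposed method makes full use of the correlations between AD data samples and outperforms traditional AD classification methods based on machine learning, deep learning, and Graph Neural Networks (GNN).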
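The sampling and aggregation step described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes that "nearest" means the k neighbors with the largest edge weights and that the central node contributes equally to the mean. The function names, the parameter `k`, and the toy data are hypothetical.

```python
from typing import Dict, List


def sample_nearest_neighbors(
    edges: Dict[int, Dict[int, float]], node: int, k: int
) -> List[int]:
    """Return up to k neighbors of `node` with the largest edge weights."""
    neighbors = edges.get(node, {})
    # Sort by descending weight; break ties by node id for determinism.
    ranked = sorted(neighbors.items(), key=lambda item: (-item[1], item[0]))
    return [nbr for nbr, _ in ranked[:k]]


def mean_aggregate(
    features: Dict[int, List[float]],
    edges: Dict[int, Dict[int, float]],
    node: int,
    k: int,
) -> List[float]:
    """Average the feature vectors of `node` and its sampled neighbors."""
    selected = [node] + sample_nearest_neighbors(edges, node, k)
    dim = len(features[node])
    return [sum(features[n][d] for n in selected) / len(selected) for d in range(dim)]


# Toy data (hypothetical): node 3 is weakly connected and is not sampled.
features = {0: [1.0, 0.0], 1: [3.0, 2.0], 2: [5.0, 4.0], 3: [100.0, 100.0]}
edges = {0: {1: 0.9, 2: 0.8, 3: 0.1}}
aggregated = mean_aggregate(features, edges, 0, k=2)
```

In the method described above, the aggregated vector would then pass through a fully connected layer and a Softmax layer; those layers are omitted here.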
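Abstract: Pesticide registration texts are information-dense, have complex logical structure, contain entities separated by large spans, and show heterogeneous entity lengths. To handle these characteristics and to overcome the triple overlap, exposure bias, and redundant computation that affect traditional joint extraction methods, this study proposes a Multi-feature fusion single-stage entity and relation joint extraction model (MF-SERel). First, the encoding layer fuses semantic and syntactic features to enrich the character vector representations and improve the model's ability to represent complex corpora. Second, in the multi-dimensional tagging framework layer, an HT-BES multi-dimensional tagging strategy is proposed to solve the overlapping triple problem. Using a parallel scoring function and a fine-grained classification component, joint entity and relation extraction is converted into a multi-label tagging task over the relation dimension. This process contains no interdependent steps, so tagging is performed in parallel in a single stage, which avoids exposure bias and reduces redundant computation. Finally, the decoding layer decodes entity-relation triples from the predicted fine-grained classification labels. The proposed model is compared with the baseline models GraphRel, CasRel, and TPLinker on a pesticide dataset (Pesticide registration dataset, PRD) and a public dataset (Dataset of unstructured information extraction, DuIE). The results show that MF-SERel performs well on both. On PRD, MF-SERel improves inference speed by 20% and the F1 score by 2.3%, indicating good knowledge mining ability on pesticide registration texts. On DuIE, MF-SERel improves inference speed by 54% and the F1 score by 1.7%, demonstrating good generalization ability. In summary, the proposed MF-SERel model provides a new method for the structured extraction of knowledge in the pesticide domain.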
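Abstract: A knowledge graph for the sheep disease domain is a prerequisite for sheep disease prevention and control and for intelligent diagnosis and treatment. To address the fuzzy semantic boundaries, overlapping entity roles, and complex relation semantics of sheep disease texts, this study proposes a knowledge graph construction method based on the CaRoMHPE model (CasRel-based model combined with RoBERTa, multi-scale cross-attention mechanism, and hybrid position encoding in multi-head attention). First, based on the characteristics of the sheep disease corpus, a sheep disease dataset with 9 entity types and 8 relation types was constructed. It covers the key entities and relations across the whole process of sheep disease diagnosis and treatment and provides data for the entity and relation extraction task. Next, taking CasRel (cascade relational triple extraction) as the base model, RoBERTa-wwm-ext (robustly optimized BERT approach) replaces BERT (bidirectional encoder representations from transformers) as the pre-trained encoder to strengthen the model's understanding of context and its handling of complex language structures. A multi-scale cross-attention mechanism is added after the subject tagging module to better refine the semantic relationships between entities, and hybrid position encoding (HPE) is incorporated to improve the multi-head attention mechanism, strengthening entity boundary delimitation and role discrimination in relation extraction. The results show that the model's precision, recall, and F1 score for knowledge extraction reach 94.70%, 94.04%, and 94.37%, which are 9.14, 9.21, and 9.18 percentage points higher than the CasRel model, improving entity and relation extraction for sheep disease information. Finally, based on the extracted triples, semantic embedding and a cosine similarity algorithm are used to remove duplicate synonyms and resolve potential ambiguities, producing a normalized knowledge graph that provides strong support for intelligent sheep disease diagnosis and treatment.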
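The final normalization step, merging synonymous entities by cosine similarity over embeddings, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the embedding model, the similarity threshold, or the merging rule, so the greedy first-match strategy, the 0.95 threshold, the function names, and the two-dimensional toy vectors are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_synonyms(embeddings: Dict[str, List[float]], threshold: float) -> Dict[str, str]:
    """Map each entity name to the first earlier name it is similar enough to."""
    canonical: List[str] = []
    mapping: Dict[str, str] = {}
    for name, vec in embeddings.items():
        match = next(
            (c for c in canonical if cosine_similarity(vec, embeddings[c]) >= threshold),
            None,
        )
        if match is None:
            canonical.append(name)
            mapping[name] = name
        else:
            mapping[name] = match
    return mapping


def normalize_triples(triples: List[Triple], mapping: Dict[str, str]) -> List[Triple]:
    """Rewrite subjects and objects to canonical names and drop duplicates."""
    seen = set()
    result: List[Triple] = []
    for subj, rel, obj in triples:
        triple = (mapping.get(subj, subj), rel, mapping.get(obj, obj))
        if triple not in seen:
            seen.add(triple)
            result.append(triple)
    return result


# Toy embeddings (hypothetical values, not produced by a real model).
embeddings = {"sheep pox": [1.0, 0.0], "ovine pox": [0.99, 0.1], "fever": [0.0, 1.0]}
triples = [("sheep pox", "has_symptom", "fever"), ("ovine pox", "has_symptom", "fever")]
mapping = merge_synonyms(embeddings, threshold=0.95)
normalized = normalize_triples(triples, mapping)
```

With these toy vectors, "ovine pox" maps to "sheep pox", so the two triples collapse into one. A real pipeline would also need the ambiguity handling mentioned in the abstract, which this sketch does not attempt.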