Abstract: Against the background of rapid scientific and technological progress, the trend toward media integration keeps strengthening. This pushes the original concept of media management to change continually and give way to a new concept suited to the development needs of media integration. This paper outlines media integration, analyzes the influence of the integration trend on the media management concept, and explores the direction in which that concept is evolving, with the aim of providing a reference for the development of the media industry.
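Abstract: [Purpose/Significance] In recent years, with the rapid development of social media platforms, Multimodal Named Entity Recognition (MNER) has become a research topic of wide interest. Recent studies show that vision-language models based on vision Transformers outperform traditional methods based on object detectors, but a systematic study of MNER models built on vision-language Transformers is still lacking. [Method/Process] To address this gap, this paper proposes a new end-to-end framework for studying in depth how to design and train fully Transformer-based vision-language MNER models. The framework takes into account all key elements of model design, including multimodal feature extraction, the multimodal fusion module, and the decoding architecture. [Results/Conclusions] Experimental results show that the proposed model outperforms all baseline models, including methods based on large language models, and achieves the best overall metrics on both datasets. Specifically, the model obtains overall F1 scores of 80.06% and 94.27% on the Twitter-2015 and Twitter-2017 datasets, respectively, improvements of 1.34% and 3.80% over the current state-of-the-art vision-language model. In addition, the model generalizes better than the baseline models in cross-dataset evaluation.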