Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" major project, Grant/Award Number: 2020AAA0108703.
Abstract: Sentiment analysis is a fine-grained analysis task that aims to identify the sentiment polarity of a specified sentence. Existing methods for Chinese sentiment analysis consider sentiment features from only a single pole and scale, so they cannot fully exploit sentiment feature information and their performance is less than ideal. To resolve this problem, the authors propose a new method, GP-FMLNet, that integrates both glyph and phonetic information, and design a novel feature matrix learning process for phonetic features with which to model words that have the same pinyin information but different glyph information. The method addresses the problem of misspelled words influencing sentiment polarity prediction results. Specifically, the authors iteratively mine character, glyph, and pinyin features from the input comment sentences. They then use soft attention and matrix compound modules to model the phonetic features, which enables the model to keep focusing on dynamic-setting words in various positions and to eliminate the impact of deceptive-setting ones. Experiments on six public datasets show that the proposed model fully utilises the glyph and phonetic information and improves on the performance of existing Chinese sentiment analysis algorithms.
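The phenomenon that motivates the phonetic features (characters that share a pinyin reading but differ in glyph, as happens when a homophone is typed by mistake) can be illustrated with a small sketch. This is not the GP-FMLNet feature matrix learning process, whose details the abstract does not give; the `PINYIN` table and the `homophone_matrix` function are hypothetical names introduced here, and tones are ignored for simplicity.

```python
# Toy, toneless pinyin lookup (an assumption for illustration only);
# a real system would use a full pronunciation lexicon.
PINYIN = {"在": "zai", "再": "zai", "见": "jian", "件": "jian", "好": "hao"}


def homophone_matrix(chars):
    """Return M where M[i][j] == 1 iff chars i and j share a pinyin
    reading but are different glyphs, else 0."""
    n = len(chars)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sound_i = PINYIN.get(chars[i])
            sound_j = PINYIN.get(chars[j])
            same_sound = sound_i is not None and sound_i == sound_j
            if same_sound and chars[i] != chars[j]:
                matrix[i][j] = 1
    return matrix


chars = ["在", "再", "见", "好"]
M = homophone_matrix(chars)
for row in M:
    print(row)
```

In this toy input only 在 and 再 are homophones, so the matrix is symmetric with a single off-diagonal pair set to 1.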
Funding: Funded by the Swedish Transport Administration, Sweden, through project F AUTO (part I: TRV 2018/41347 and part II: TRV 2020/138317).
Abstract: ...human-automation collaboration. This problem is particularly pronounced in time-constrained, safety-critical domains such as Air Traffic Management. A visual representation should help operators understand why the system initiates the communication, when the operator must act, and the consequences of not responding to the cue. Data glyphs can present multidimensional data, including temporal data, in a compact format to facilitate this type of communication. In this paper, we propose a glyph design for communication initialization for highly automated systems in Air Traffic Management, Vessel Traffic Service, and Train Traffic Management. The design was assessed by experts in these domains in three workshop sessions. The results showed that the number of glyphs to be presented simultaneously and the type of situation were domain-specific glyph design aspects that needed to be adjusted for each work domain. The results also showed that the core of the glyph design could be reused between domains, and that the operators could successfully interpret the temporal data representations. We discuss similarities and differences in the applicability of the glyph design between the domains and provide suggestions for future work based on the results of this study.
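Abstract: Accurate entity recognition in legal documents can greatly help advance "smart justice", but named entity recognition for legal documents currently suffers from a lack of public datasets, poor recognition of low-frequency, rare, and long entities, and insufficient capture of syntactic information. This paper therefore proposes an entity definition scheme for civil cases, constructs a dataset of civil case legal documents, and proposes the GLYCE-ONLSTM-CRF (GOC) model to recognize entities in legal documents. The model's embedding layer is based on the BERT pre-trained model and incorporates Chinese character glyph features; an ONLSTM (Ordered Neuron Long Short Term Memory Networks) layer then learns the hierarchical structure of sentences, and a conditional random field (CRF) algorithm produces the final output. In experiments on the constructed civil case dataset, the F1 score on the test set improved by 5.15%, demonstrating the model's superiority and offering a new approach to named entity recognition in legal documents.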
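For readers unfamiliar with how an F1 score is typically computed for named entity recognition, the following is a minimal sketch of entity-level F1 over BIO tag sequences. It assumes the BIO tagging scheme and exact-span matching, neither of which the abstract confirms; the `PER` and `ORG` labels and both function names are hypothetical and are not taken from the paper's entity definition scheme.

```python
def extract_entities(tags):
    """Collect (start, end, label) spans from a BIO tag sequence.
    `end` is exclusive. Stray I- tags without a matching B- are ignored."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: a prediction counts only if span and label both match."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]
score = entity_f1(gold, pred)
print(score)
```

Here the predicted `ORG` span is one token too short, so only one of the two gold entities is matched exactly; precision and recall are both 0.5.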
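Abstract: Multimodality is an effective technique for improving Chinese Named Entity Recognition (CNER) performance, because it can use different information sources to enrich the semantic and boundary information of Chinese characters. However, existing methods often overlook the fine-grained correlations among the multimodal information of Chinese characters. This work therefore proposes AGMN, a model based on an attention mechanism and glyph structure. It incorporates lexical information and feeds tri-modal features into an attention-based model, so that the radical and image features of Chinese characters can be combined with their character and word features, thereby improving recognition performance. Experiments on four public CNER datasets show that, compared with the baseline model, AGMN improves F1 by 6.94%, 3.29%, 0.6%, and 0.8% on the weibo, resume, msra, and ontonotes datasets, respectively.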
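The general idea of combining several modality features through attention can be sketched as follows. This is a generic scaled dot-product attention over three toy vectors, not the AGMN architecture, which the abstract does not specify; the function names, the choice of the character/word vector as the query, and the two-dimensional feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, modality_vectors):
    """Weight each modality vector by its scaled dot product with the
    query, then return the weights and the weighted sum."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * v for q, v in zip(query, vec)) / scale
        for vec in modality_vectors
    ]
    weights = softmax(scores)
    fused = [
        sum(w * vec[d] for w, vec in zip(weights, modality_vectors))
        for d in range(len(query))
    ]
    return weights, fused


# Toy 2-dimensional features for one character (hypothetical values).
char_word = [1.0, 0.0]
radical = [1.0, 0.0]
image = [0.0, 1.0]

weights, fused = attention_fuse(char_word, [char_word, radical, image])
print(weights, fused)
```

Modalities whose features align with the query receive larger weights, so in this toy case the image vector, which is orthogonal to the query, contributes least to the fused representation.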