Abstract: With the development of large-scale text processing, the dimensionality of the text feature space has grown larger and larger, which has introduced many difficulties for natural language processing. How to reduce this dimensionality has become a practical problem in the field. Here we present two clustering methods, i.e., concept association and concept abstraction, to achieve this goal. The first refers to keyword clustering based on the co-occurrence of …
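As an illustration of the concept-association idea, the sketch below clusters keywords by the similarity of their document co-occurrence profiles. The helper names, the agglomerative clustering step, and the toy data are assumptions for demonstration only; the truncated abstract does not specify the paper's exact procedure.

```python
# Illustrative sketch of co-occurrence-based keyword clustering ("concept
# association"). The choice of agglomerative clustering is an assumption,
# not the method described in the (truncated) abstract.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cooccurrence_matrix(docs_keywords, vocab):
    """Count how often each pair of keywords appears in the same document."""
    index = {w: i for i, w in enumerate(vocab)}
    co = np.zeros((len(vocab), len(vocab)))
    for kws in docs_keywords:
        ids = [index[w] for w in kws if w in index]
        for i in ids:
            for j in ids:
                if i != j:
                    co[i, j] += 1
    return co

def cluster_keywords(docs_keywords, vocab, n_concepts=2):
    """Group keywords whose co-occurrence profiles are similar."""
    co = cooccurrence_matrix(docs_keywords, vocab)
    labels = AgglomerativeClustering(n_clusters=n_concepts).fit_predict(co)
    return {w: int(lbl) for w, lbl in zip(vocab, labels)}

if __name__ == "__main__":
    docs = [["neural", "network", "training"],
            ["network", "training", "gradient"],
            ["corpus", "keyword", "clustering"],
            ["keyword", "clustering", "dimension"]]
    vocab = sorted({w for d in docs for w in d})
    print(cluster_keywords(docs, vocab, n_concepts=2))
```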
Funding: Supported by the Beijing Natural Science Foundation under Grant No. L247008.
Abstract: Multi-modal Named Entity Recognition (MNER) is a vision-language task that uses images as auxiliary information to detect and classify named entities in an input sentence. Recent studies find that visual information is helpful for Named Entity Recognition (NER), but the difference between the two modalities is not carefully considered. Approaches that rely on different pre-trained models do not reduce the gap between textual and visual features, and giving the two modalities equal weight often leads to wrong predictions because of noise in the visual information. To reduce this bias, we propose a Masked Multi-modal Attention Fusion approach for MNER, named MMAF. First, we use image captioning to generate a textual representation of the image, which is combined with the original sentence. Then, to obtain textual and visual features, we map the multi-modal inputs into a shared space and stack Multi-modal Attention Fusion layers that perform full interaction between the two modalities. We add a Multi-modal Attention Mask to highlight the importance of certain words in the sentence, improving entity detection. Finally, we obtain a Multi-modal Attention-based representation for each word and perform entity labeling with a CRF decoder. Experiments show that our method outperforms state-of-the-art models by 0.23% and 0.84% on the Twitter 2015 and 2017 MNER datasets respectively, demonstrating its effectiveness.
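A minimal PyTorch sketch of the fusion step described in the abstract, assuming BERT-style word features, ResNet-style region features, a single attention fusion layer, and a linear emission head standing in for the full CRF decoder; all dimension choices, masking details, and module names below are illustrative assumptions, not the paper's implementation.

```python
# Sketch: project text and image features into a shared space, fuse them with
# masked multi-modal attention, and produce per-word tag scores that a CRF
# layer (e.g. pytorch-crf) would decode.
import torch
import torch.nn as nn

class MaskedMultiModalFusion(nn.Module):
    def __init__(self, text_dim=768, vis_dim=2048, hidden=256, num_tags=9):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden)   # map word features into shared space
        self.vis_proj = nn.Linear(vis_dim, hidden)     # map image regions into shared space
        self.fusion = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.emission = nn.Linear(hidden, num_tags)    # emission scores for a CRF decoder

    def forward(self, text_feats, vis_feats, key_mask=None):
        # text_feats: (B, T, text_dim); vis_feats: (B, R, vis_dim)
        q = self.text_proj(text_feats)
        kv = torch.cat([q, self.vis_proj(vis_feats)], dim=1)
        # key_mask (B, T+R) can mask noisy visual regions so that important
        # words dominate the attention (an approximation of the attention mask).
        fused, _ = self.fusion(q, kv, kv, key_padding_mask=key_mask)
        return self.emission(fused)                    # (B, T, num_tags)

if __name__ == "__main__":
    model = MaskedMultiModalFusion()
    text = torch.randn(2, 12, 768)     # e.g. BERT word features
    image = torch.randn(2, 49, 2048)   # e.g. ResNet region features
    print(model(text, image).shape)    # torch.Size([2, 12, 9])
```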