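Abstract: Specific emitter identification (SEI) safeguards Internet of Things data security by analyzing the hardware characteristics of device signals. Existing deep learning methods for SEI are limited by the number of available samples, rely too heavily on large numbers of labeled samples, and cannot learn highly discriminative representations, which results in poor identification performance. To address these problems, a few-shot SEI method based on sample interpolation (Mixup) augmentation is proposed. First, Mixup augmentation is used to expand the number of radio signal samples, addressing the shortage of labeled samples. Second, a variant triplet network (triplet margin network based on CVNN, CVNN-TMN) is built on a Siamese neural network and complex-valued neural networks (CVNN) to improve the model's generalization ability and discriminability, achieving accurate identification of specific emitters in few-shot scenarios. Experimental results show that, compared with several existing state-of-the-art SEI methods and under different training/test split ratios, the proposed CVNN-TMN improves identification accuracy by 5% to 30% overall, demonstrating the strong discriminability of the constructed CVNN-TMN model.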
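The Mixup step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Mixup formulation (a convex combination of two samples and their one-hot labels with a weight drawn from a Beta distribution); the function name `mixup`, the `alpha` value, and the toy complex I/Q samples are illustrative assumptions, not details taken from the paper.

```python
import random

def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Blend two samples and their one-hot labels (hypothetical helper).

    lam is drawn from Beta(alpha, alpha), as in the standard Mixup recipe;
    the paper's actual sampling scheme is not stated in the abstract.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam

# Toy complex baseband samples from two emitters (made-up values).
sig_a, lab_a = [1 + 1j, 0.5 - 0.5j, -1 + 0j], [1.0, 0.0]
sig_b, lab_b = [0 + 1j, -0.5 + 0.5j, 1 - 1j], [0.0, 1.0]

mixed_x, mixed_y, lam = mixup(sig_a, lab_a, sig_b, lab_b, rng=random.Random(0))
```

Each call yields one additional training sample, so repeated calls over random pairs are one way to enlarge a small labeled set.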
Funding: Fully supported by the National Natural Science Foundation of China (61871422), the Natural Science Foundation of Sichuan Province (2023NSFSC1422), and Central Universities of Southwest Minzu University (ZYN2022032).
Abstract: The real electromagnetic environment contains all kinds of unknown and known signals, which hinders the development of practical cognitive radio applications. However, most existing signal recognition models struggle to discover unknown signals while recognizing known ones. In this paper, a compact manifold mixup feature-based open-set recognition approach (OR-CMMF) is proposed to address this problem. First, the approach uses the center loss to constrain decision boundaries, which yields compact latent signal feature representations and extends the low-confidence feature space. Second, the latent signal feature representations are used to construct synthetic representations as substitutes for unknown signal categories. These constructed representations can then occupy the extended low-confidence space. Finally, the approach applies the distillation loss to adjust the decision boundaries between the known signal categories and the constructed unknown-category substitutes so that it accurately discovers unknown signals. Simulation experiments on two public datasets, RML2016.10a and ORACLE, show that OR-CMMF outperforms other state-of-the-art open-set recognition methods in comprehensive recognition performance and running time.
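The first two steps of this abstract can be sketched in a few lines. The sketch assumes the commonly used center loss (half the mean squared distance between each latent feature and its class center) and assumes that the synthetic substitutes are built by interpolating latent features of two different known classes, as the "manifold mixup" name suggests. The abstract does not give the exact formulas, so `center_loss`, `mix_latents`, and all values below are illustrative only.

```python
def center_loss(features, labels, centers):
    """Half the mean squared distance of each feature to its class center."""
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((fi - ci) ** 2 for fi, ci in zip(f, centers[y]))
    return 0.5 * total / len(features)

def mix_latents(f_a, f_b, lam=0.5):
    """Interpolate two latent vectors (assumed to come from different classes)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(f_a, f_b)]

# Toy 2-D latent features for two known classes (made-up values).
features = [[0.0, 0.0], [2.0, 2.0]]
labels = [0, 1]
centers = {0: [0.0, 0.0], 1: [1.0, 1.0]}

loss = center_loss(features, labels, centers)
substitute = mix_latents(features[0], features[1], lam=0.5)
```

Minimizing the loss pulls each class toward its center, which frees space between classes; the interpolated point lies in that in-between region and could be labeled as an "unknown" substitute.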
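Abstract: Medical named entity recognition identifies named entities in unstructured medical text and plays an important role in many downstream tasks. Because medical named entities are complex, annotation requires experts with domain knowledge, which leads to a severe scarcity of labeled data in the medical domain. To address this problem, an entity aware mask local mixup data augmentation (EALMDA) method for named entity recognition is proposed. First, an entity-aware mask channel extracts key elements and masks the non-entity parts in order to preserve the core semantics. Second, masked sentences are fused through a linear combination of two sampling strategies, contextual entity similarity and k-nearest neighbors, which preserves the core semantics while increasing sample diversity. Finally, after a sequence linearization operation, the sentences are fed into a generative model to obtain augmented samples. On five mainstream medical named entity recognition datasets, including NCBI-disease, comparative experiments against mainstream data augmentation baselines in simulated low-resource scenarios show that the proposed method significantly outperforms the baselines.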
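The sequence linearization step can be illustrated with a small sketch. It assumes one common form of linearization for NER augmentation, in which each entity tag is inserted directly before its token so that a tagged sentence becomes a single flat sequence. The abstract does not specify the exact scheme, and the function name `linearize`, the BIO tag names, and the example sentence are hypothetical.

```python
def linearize(tokens, tags):
    """Insert each non-O tag in front of its token, giving one flat sequence."""
    out = []
    for tok, tag in zip(tokens, tags):
        if tag != "O":
            out.append(tag)
        out.append(tok)
    return out

# Made-up example in BIO format.
tokens = ["patients", "with", "cystic", "fibrosis"]
tags = ["O", "O", "B-Disease", "I-Disease"]

flat = linearize(tokens, tags)
# flat == ["patients", "with", "B-Disease", "cystic", "I-Disease", "fibrosis"]
```

A flat sequence of this kind lets a text generation model produce tokens and labels together, so the generated samples arrive already annotated.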