Funding: Supported in part by the Natural Science Foundation of China under Grants 61972169 and U1536203, in part by the National Key Research and Development Program of China (2016QY01W0200), and in part by the Major Scientific and Technological Project of Hubei Province (2018AAA068 and 2019AAA051).
Abstract: The majority of existing graph-network-based few-shot models focus on a node-similarity update mode. The lack of adequate information intensifies the risk of overtraining. In this paper, we propose a novel Multihead Attention Graph Network to excavate discriminative relations and fulfill effective information propagation. For the edge update, node-level attention evaluates the similarity between two nodes, and distribution-level attention extracts deeper global relations. The cooperation between these two parts provides a discriminative and comprehensive expression for the edge features. For the node update, we use label-level attention to soften the noise of irrelevant nodes and optimize the update direction. The proposed model is verified through extensive experiments on two few-shot benchmarks, MiniImageNet and CIFAR-FS. The results suggest that our method has strong noise immunity and converges quickly. Its classification accuracy outperforms most state-of-the-art approaches.
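The node-level similarity step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the scaled dot-product score, the row-wise softmax, and the function name `attention_edges` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def attention_edges(nodes):
    """Return a row-normalised edge matrix from pairwise node similarity.

    Assumption: similarity is a scaled dot product followed by a softmax
    over each row. The abstract only says that node-level attention
    "evaluates the similarity between two nodes".
    """
    dim = len(nodes[0])
    edges = []
    for query in nodes:
        # Scaled dot-product score between this node and every node.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in nodes]
        # Numerically stable softmax over the row.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        edges.append([e / total for e in exps])
    return edges

# Hypothetical 2-D node embeddings: nodes 0 and 1 are identical, node 2 differs.
nodes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = attention_edges(nodes)
```

In this toy input, node 0 assigns the same weight to itself and to node 1, and a smaller weight to the dissimilar node 2.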
Abstract: Fine-tuning is a popular approach to the few-shot object detection problem. In this paper, we introduce a new perspective on it. We formulate the few-shot novel tasks as a type of distribution shifted from the ground-truth distribution. We introduce the concept of imaginary placeholder masks to show that this distribution shift is essentially a composite of in-distribution (ID) and out-of-distribution (OOD) shifts. Our empirical investigation shows that it is important to balance the trade-off between adapting to the available few-shot distribution and preserving the distribution-shift robustness of the pre-trained model. We explore improvements to few-shot fine-tuning transfer in the few-shot object detection (FSOD) setting from three aspects. First, we explore the LinearProbe-Finetuning (LP-FT) technique to balance this trade-off and mitigate the feature distortion problem. Second, we explore the effectiveness of a protection freezing strategy for query-based object detectors to keep their OOD robustness. Third, we use ensembling methods to circumvent feature distortion. All these techniques are integrated into a single method called BIOT (Balanced ID-OOD Transfer). Evaluation results show that our method is simple yet effective and general in tapping the FSOD potential of query-based object detectors. It outperforms the current SOTA method in many FSOD settings and has promising scaling capability.
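The two-stage idea behind LP-FT (train only the head on frozen features first, then fine-tune everything) can be sketched on a toy problem. The scalar model `y = a * (w * x)`, the learning rates, the step counts, and the function names below are hypothetical choices for illustration; this is not the BIOT implementation or a detector.

```python
def mse(a, w, xs, ys):
    """Mean squared error of the toy model y_hat = a * (w * x)."""
    return sum((a * w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grads(a, w, xs, ys):
    """Gradients of the MSE with respect to head a and backbone w."""
    n = len(xs)
    ga = sum(2 * (a * w * x - y) * w * x for x, y in zip(xs, ys)) / n
    gw = sum(2 * (a * w * x - y) * a * x for x, y in zip(xs, ys)) / n
    return ga, gw

def lp_ft(a, w, xs, ys, probe_steps=50, ft_steps=50,
          probe_lr=0.05, ft_lr=0.01):
    # Stage 1, linear probing: the backbone weight w stays frozen,
    # only the head a is updated.
    for _ in range(probe_steps):
        ga, _ = grads(a, w, xs, ys)
        a -= probe_lr * ga
    w_after_probe = w
    # Stage 2, fine-tuning: head and backbone are updated together,
    # here with a smaller learning rate (an assumption).
    for _ in range(ft_steps):
        ga, gw = grads(a, w, xs, ys)
        a -= ft_lr * ga
        w -= ft_lr * gw
    return a, w, w_after_probe

# Hypothetical few-shot data with target y = 2x; w = 1.0 plays the role
# of the pre-trained backbone, a = 0.0 is a freshly initialised head.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
head, backbone, backbone_after_probe = lp_ft(0.0, 1.0, xs, ys)
final_loss = mse(head, backbone, xs, ys)
```

Because the head is already well fitted when fine-tuning starts, the backbone weight barely moves in stage 2, which is the toy analogue of limiting feature distortion.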
Funding: Supported by the National Natural Science Foundation of China (11974286, 11904274), the Shaanxi Young Science and Technology Star Program (2021KJXX-07), and the Fundamental Research Funding for Characteristic Disciplines (G2022WD0235).
Abstract: Taking the real and imaginary parts of the complex sound pressure of the sound field as features, a transfer learning model is constructed. A convolutional neural network (CNN) is pre-trained on a large amount of underwater acoustic data from a preselected sea area and then retrained on few-shot underwater acoustic data from the test sea area to study the underwater sound source ranging problem. The S5 voyage data of the SWellEX-96 experiment are used to verify the proposed method, to estimate the range of the shallow source in the experiment, and to compare the range estimation performance of four methods: matched field processing (MFP), generalized regression neural network (GRNN), traditional CNN, and transfer learning. The experimental results show that the transfer learning model based on a residual CNN can effectively perform range estimation in few-shot scenarios, and its estimation performance is markedly better than that of the other methods.
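The feature construction mentioned in this abstract (real and imaginary parts of the complex sound pressure) can be sketched as follows. The two-channel layout, the function name `pressure_to_channels`, and the sample values are assumptions for illustration; the abstract does not specify how the features are arranged before they enter the CNN.

```python
def pressure_to_channels(pressures):
    """Split complex pressure samples into a real and an imaginary channel.

    Assumption: one complex value per hydrophone, arranged as two
    parallel channels [real parts, imaginary parts].
    """
    real_channel = [p.real for p in pressures]
    imag_channel = [p.imag for p in pressures]
    return [real_channel, imag_channel]

# Hypothetical complex pressures received at three array elements.
samples = [complex(1.0, 0.0), complex(0.0, 2.0), complex(-0.5, 0.25)]
channels = pressure_to_channels(samples)
```

Keeping both parts preserves the phase information that would be lost if only the magnitude were used.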
Funding: Supported by the National Natural Science Foundation of China under Grants 62172059 and 62072055, the Hunan Provincial Natural Science Foundation of China under Grant 2020JJ4626, the Scientific Research Fund of Hunan Provincial Education Department of China under Grant 19B004, the "Double First-class" International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant 2018IC25, and the Young Teacher Growth Plan Project of Changsha University of Science and Technology under Grant 2019QJCZ076.
Abstract: Deep-learning-based object detection now explores strategies for training networks on less data while achieving the effect of training on a large dataset. However, existing methods usually do not strike a balance between the number of network parameters and the amount of training data. As a result, the information provided by a small number of images is insufficient to optimize the model parameters, which leads to unsatisfactory detection results. To improve the accuracy of few-shot object detection, this paper proposes a network based on the transformer and high-resolution feature extraction (THR). High-resolution feature extraction maintains the resolution representation of the image. Channel and spatial attention are used to make the network focus on the features that are more useful for the object. In addition, the recently popular transformer is used to fuse the features of the existing objects. This compensates for the shortcomings of previous networks by making full use of existing object features. Experiments on the Pascal VOC and MS-COCO datasets show that the THR network achieves better results than previous mainstream few-shot object detection methods.
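Abstract: The main challenge in graph-based fake review detection is how, in an attributed graph with only a few labeled positive samples, to effectively aggregate neighbor information across the different relations in the graph and to improve the sensitivity and generalization ability of graph representation learning with respect to anomalous nodes. To address this challenge, a multi-information fusion graph deviation network based on meta-learning (Meta-MGDN) is proposed. Multi-view partitioning and multi-information fusion modules are constructed to fully mine the structural and attribute information of users, items, and ratings, so that the network captures information from multiple aspects and uncovers the relations between review nodes. A multi-view neighborhood difference aggregation module is designed to merge neighborhood information with self-neighborhood difference information, so that the network attends to both the correlations and the differences between nodes, which improves its sensitivity to anomalous nodes. Finally, a meta-learning framework is introduced in which multiple auxiliary networks enrich the target network's experience in learning few-shot tasks, so that high generalization ability is maintained in few-shot fake review detection scenarios. Experiments on real public review datasets show that Meta-MGDN outperforms advanced baselines in graph-based fake review detection.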
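The idea of merging neighborhood information with self-neighborhood difference information can be sketched for a single node. Mean aggregation, concatenation as the merge step, and the function name `aggregate_with_difference` are assumptions for illustration; the abstract does not state which operators the module uses.

```python
def aggregate_with_difference(features, neighbors, node):
    """Concatenate the neighbour mean with (self - neighbour mean).

    Assumption: mean pooling over neighbours and concatenation as the
    merge step; the abstract only says the two kinds of information
    are merged.
    """
    nbrs = neighbors[node]
    dim = len(features[node])
    # Neighbourhood information: average feature of the neighbours.
    mean = [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]
    # Difference information: how far the node deviates from its neighbours.
    diff = [features[node][k] - mean[k] for k in range(dim)]
    return mean + diff

# Hypothetical review graph: node 0 is connected to nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [1.0, 2.0], 2: [3.0, 4.0]}
neighbors = {0: [1, 2]}
embedding = aggregate_with_difference(features, neighbors, 0)
```

A node that differs strongly from its neighbors gets a large difference component, which is the signal an anomaly detector can pick up.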
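Abstract: To reduce the dependence of object detection for pomelos and other fruits on large amounts of annotated data, this paper proposes a pomelo fractal-tree image generation and augmentation method that incorporates vision-language models. The method needs only 3 to 5 unannotated real images to generate a large-scale annotated training dataset without any training. First, a text-prompt-based zero-shot segmentation model (Grounded Segment Anything Model, Grounded SAM) extracts pomelo tree components. Then the Stable Diffusion model generates random backgrounds from text prompts. Finally, an improved fractal tree algorithm generates pomelo trees to increase diversity and realism. The experiments use a lightweight version of YOLO v10 for validation. On a self-built pomelo object detection dataset collected in unstructured environments, when the training set contains 0, 8, 16, 32, and 64 real images, the method raises the model's mean average precision at intersection over union thresholds from 0.50 to 0.95 (mAP50-95) by 662.3%, 24.9%, 13.7%, 8.8%, and 1.8%, respectively. With 221 real images and 512 generated images in the training set, the model reaches its best performance: precision 76.9%, recall 62.7%, mAP50 70.3%, and mAP50-95 38.4%. When the method is transferred to an orange object detection task, the performance gains at the same data scales are 212.9%, 16.5%, 14.0%, 5.2%, and 4.1%, respectively. With 1302 real images and 512 generated images in the training set, the model likewise reaches its best performance: precision 90.3%, recall 87.8%, mAP50 94.0%, and mAP50-95 54.0%. The results show that the image generation and augmentation method effectively expands the training data in zero-shot and few-shot learning scenarios, improves the detection performance of the lightweight YOLO v10 model, and generalizes well.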
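For readers unfamiliar with fractal trees, the following minimal sketch generates the branch segments of a basic binary fractal tree. It is the textbook recursive construction, not the improved algorithm from the paper, and the branching angle, shrink factor, and function name `fractal_tree` are assumptions for illustration.

```python
import math

def fractal_tree(x, y, angle, length, depth,
                 spread=math.pi / 6, shrink=0.7):
    """Return the line segments of a basic binary fractal tree.

    Each branch ends in two child branches rotated by +/- spread and
    scaled by shrink. All parameter values are hypothetical defaults.
    """
    if depth == 0:
        return []
    # End point of the current branch.
    x2 = x + length * math.cos(angle)
    y2 = y + length * math.sin(angle)
    segments = [((x, y), (x2, y2))]
    # Recurse into the left and right child branches.
    for delta in (-spread, spread):
        segments += fractal_tree(x2, y2, angle + delta,
                                 length * shrink, depth - 1, spread, shrink)
    return segments

# Trunk starts at the origin and grows straight up; four levels of branching.
segments = fractal_tree(0.0, 0.0, math.pi / 2, 1.0, depth=4)
```

Randomizing `spread` and `shrink` per branch is one simple way to obtain varied tree shapes, although the abstract does not say how the paper's improved algorithm achieves its diversity.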