Funding: Project supported by the Hi-Tech Research and Development Program (863) of China (No. 2003AA119010) and the China-American Digital Academic Library (CADAL) Project (No. CADAL2004002)
Abstract: As digital images rapidly grow in popularity on the Internet, semantic image classification has become an important research topic. However, well-known content-based image classification methods do not overcome the so-called semantic gap problem, in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly because the extracted textual features are too limited to represent the images accurately. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context, and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and a link-based correlation model, are used to capture the correlation among different feature modalities and the topical dependency among images induced by the link structure. We propose a new collective classification model called the relational support vector classifier (RSVC), based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61232012 and 61202279, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA12090, the Natural Science Foundation of Zhejiang Province of China under Grant No. LR13F020001, and the Doctoral Fund of Ministry of Education of China under Grant No. 20120101110134
Abstract: The problem of detecting community structures in a social network has been extensively studied in recent years, but most existing methods rely solely on the network structure and neglect the context information of the social relations. The main reason is that a context-rich network offers too much flexibility and complexity for automatic or manual modulation of the multifaceted context in the analysis process. We address the challenging problem of incorporating context information into community analysis with a novel visual analysis mechanism. Our approach consists of two stages: interactive discovery of salient context, and iterative context-guided community detection. Central to the analysis process is a context relevance model (CRM) that visually characterizes the influence of a given set of contexts on the variation of the detected communities, and discloses the community structure in specific context configurations. The extracted relevance is used to drive an iterative visual reasoning process, in which the community structures are progressively discovered. We introduce a suite of visual representations to encode the community structures, the context, and the CRM. In particular, we propose an enhanced parallel coordinates representation to depict the context and community structures, which allows for interactive data exploration and community investigation. Case studies on several datasets demonstrate the efficiency and accuracy of our approach.
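Abstract: To address the difficulty and low accuracy of wildlife detection caused by small sample sizes and highly variable target scales in wildlife datasets, a few-shot wildlife detection algorithm based on multi-scale context extraction (MS-FSWD) is proposed. First, a multi-scale context extraction module strengthens the model's perception of wildlife at different scales and improves detection performance. Second, Res2Net is introduced as the strong classification network of the prototype calibration module to correct the classification scores output by the classifier. Third, a shuffle attention mechanism is added to the RPN to enhance the feature maps of target regions and suppress background information. Finally, the balanced L1 loss is used as the localization loss function to improve target localization. Experimental results show that, compared with the DeFRCN algorithm, MS-FSWD improves novel-class AP50 by 9.9% and 6.6% on the 1-shot and 3-shot detection tasks, respectively, on the few-shot wildlife dataset FSWA; on the public PASCAL VOC dataset, MS-FSWD achieves a maximum improvement of 12.6%. Compared with the VFA algorithm, novel-class AP50 improves by 3.3% on the 10-shot task of PASCAL VOC Novel Set 3.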
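The balanced L1 localization loss named in this abstract can be illustrated with a minimal sketch. It assumes the formulation commonly attributed to Libra R-CNN, with illustrative defaults alpha = 0.5 and gamma = 1.5; the abstract does not state which parameters MS-FSWD uses, and the function name `balanced_l1` is hypothetical.

```python
import math


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for a single regression residual x (hypothetical helper).

    alpha and gamma are assumed defaults, not values taken from the abstract.
    """
    # b is fixed by the constraint alpha * ln(b + 1) = gamma, which makes
    # the gradient continuous at |x| = 1.
    b = math.exp(gamma / alpha) - 1.0
    ax = abs(x)
    if ax < 1.0:
        # Inlier branch: its gradient is alpha * ln(b * |x| + 1).
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    # Outlier branch: linear with slope gamma; the constant term keeps the
    # loss value continuous at |x| = 1.
    return gamma * ax + gamma / b - alpha


if __name__ == "__main__":
    for residual in (0.0, 0.25, 1.0, 3.0):
        print(residual, balanced_l1(residual))
```

In a detector, this per-coordinate loss would typically be summed over the four box regression targets; how MS-FSWD weights it is not specified in the abstract.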
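Abstract: To address the small targets, target occlusion, and complex backgrounds found in unmanned aerial vehicle (UAV) aerial images, an object detection network based on efficient feature extraction and a large receptive field (efficient feature and large receptive field network, EFLF-Net) is proposed. The detection layer architecture is optimized to reduce the miss rate for small targets; a new building block is integrated into the backbone network to improve feature extraction efficiency; a content-aware feature reassembly module and a large selective kernel network are introduced to enhance the neck network's contextual awareness of occluded targets; and the Wise-IoU loss function is adopted to improve the stability of bounding box regression. Experimental results on the VisDrone2019 dataset show that EFLF-Net improves average precision by 5.2% over the baseline model. Compared with existing representative object detection algorithms, the method achieves better detection results on UAV aerial images containing small targets, mutual occlusion, and complex backgrounds.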
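Wise-IoU, like other IoU-based regression losses, builds on the intersection over union between a predicted box and a ground-truth box. The sketch below shows only that plain IoU and the basic 1 - IoU loss, not the Wise-IoU weighting itself, whose details the abstract does not give. Boxes are assumed to be (x1, y1, x2, y2) tuples, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss, the base term that IoU-family losses extend."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(iou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
```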