As the popularity of digital images on the Internet rapidly increases, semantic image classification has become an important research topic. However, well-known content-based image classification methods do not overcome the so-called semantic gap problem, in which low-level visual features cannot represent the high-level semantic content of images. Image classification using visual and textual information often performs poorly, since the extracted textual features are frequently too limited to accurately represent the images. In this paper, we propose a semantic image classification approach using multi-context analysis. For a given image, we model the relevant textual information as its multi-modal context and regard the related images connected by hyperlinks as its link context. Two kinds of context analysis models, i.e., cross-modal correlation analysis and a link-based correlation model, are used to capture the correlation among different modalities of features and the topical dependency among images induced by the link structure. We propose a new collective classification model called the relational support vector classifier (RSVC), based on the well-known Support Vector Machines (SVMs) and the link-based correlation model. Experiments showed that the proposed approach significantly improved classification accuracy over that of SVM classifiers using visual and/or textual features.
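The collective-classification idea above can be illustrated with a minimal sketch (a simplification, not the paper's RSVC implementation): each image starts with a score from a content-only classifier, and scores are then iteratively smoothed with those of hyperlinked images, so the link context influences the final label. All names and the blending parameter `alpha` here are illustrative assumptions.

```python
# Hypothetical sketch of collective classification over a hyperlink graph:
# each node keeps its own base evidence, blended with the average score of
# its linked neighbors, and the process is iterated to a near-fixed point.

def collective_classify(base_scores, links, alpha=0.6, iters=20):
    """base_scores: {node: score in [0, 1]} from a content-only classifier.
    links: {node: [neighbors]} hyperlink structure.
    alpha: weight on the node's own evidence vs. its neighbors'."""
    scores = dict(base_scores)
    for _ in range(iters):
        new = {}
        for node, own in base_scores.items():
            nbrs = links.get(node, [])
            if nbrs:
                nbr_avg = sum(scores[n] for n in nbrs) / len(nbrs)
                new[node] = alpha * own + (1 - alpha) * nbr_avg
            else:
                new[node] = own
        scores = new
    return scores

# A weakly classified image (0.55) linked to two confident positives
# is pulled toward the positive class by its link context.
base = {"a": 0.9, "b": 0.85, "c": 0.55}
links = {"c": ["a", "b"], "a": ["c"], "b": ["c"]}
final = collective_classify(base, links)
print(final["c"] > base["c"])  # → True
```

The blending mirrors the intuition that link structure carries topical dependency: a node's final score is a compromise between its own content evidence and its neighborhood.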
Object detection is one of the most important tasks in remote sensing image interpretation. Most current deep-learning-based remote sensing object detection models rely on predefined anchor boxes and often ignore the contextual information of the scene, which limits detection performance and generalization ability. To address this, this paper proposes a Scene Related Anchor-Free YOLO network (SRAF-YOLO) for object detection in remote sensing images. SRAF-YOLO first introduces a scene-enhanced multi-scale feature extraction module, which fuses scene features with object features to generate scene-enhanced features rich in scene context, and further applies multi-scale operations to extract multi-scale features containing scene semantics, effectively injecting scene context information. On this basis, a scene-assisted anchor-free detection head is designed, which uses the scene information in the feature maps to constrain object category prediction and thus improve detection accuracy, while the anchor-free structure effectively reduces the computation of anchor-related parameters. Experimental results on the RSOD and NWPU VHR-10 datasets show that SRAF-YOLO improves detection accuracy by fusing scene information with the anchor-free mechanism, achieving mean average precision (mAP) of 94.58% and 95.95%, respectively, improvements of 1.51% and 3.0% over the YOLOv8 baseline, and outperforming the other compared methods. Validation on an external dataset further confirms that the algorithm has good generalization ability.
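The mAP figures reported above rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes. As a minimal illustration (not the paper's code), an IoU helper for axis-aligned boxes given as `(x1, y1, x2, y2)` corners might look like:

```python
# IoU between two axis-aligned boxes (x1, y1, x2, y2); the building block
# under detection metrics such as mAP: a prediction counts as a true
# positive only if its IoU with a ground-truth box exceeds a threshold.

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

# Two 2x2 boxes overlapping in a 1x1 region: intersection 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # → 0.14285714285714285 (1/7)
```

Per-class average precision is then computed over the precision-recall curve induced by such matches, and mAP averages it across classes.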
The problem of detecting community structures in a social network has been extensively studied in recent years, but most existing methods rely solely on the network structure and neglect the context information of the social relations. The main reason is that a context-rich network offers too much flexibility and complexity for automatic or manual modulation of the multifaceted context in the analysis process. We address the challenging problem of incorporating context information into community analysis with a novel visual analysis mechanism. Our approach consists of two stages: interactive discovery of salient context, and iterative context-guided community detection. Central to the analysis process is a context relevance model (CRM) that visually characterizes the influence of a given set of contexts on the variation of the detected communities and discloses the community structure under specific context configurations. The extracted relevance is used to drive an iterative visual reasoning process in which the community structures are progressively discovered. We introduce a suite of visual representations to encode the community structures, the context, and the CRM. In particular, we propose an enhanced parallel-coordinates representation to depict the context and community structures, which allows for interactive data exploration and community investigation. Case studies on several datasets demonstrate the efficiency and accuracy of our approach.
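The core of context-guided detection can be sketched in a heavily simplified form (an illustrative assumption, not the paper's CRM pipeline): restrict the network to edges whose context matches a chosen configuration, then run a community detector on the filtered graph. Here connected components stand in for a real detector, and all names are hypothetical.

```python
# Hypothetical sketch: filter a context-labeled edge list by the selected
# context configuration, then take connected components of what remains
# as the (context-dependent) community structure.

from collections import defaultdict


def communities_under_context(edges, allowed_contexts):
    """edges: list of (u, v, context) triples.
    allowed_contexts: set of context labels to keep.
    Returns the connected components of the filtered graph."""
    adj = defaultdict(set)
    nodes = set()
    for u, v, ctx in edges:
        nodes.update((u, v))
        if ctx in allowed_contexts:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # iterative DFS over the filtered graph
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            comp.add(n)
            stack.extend(adj[n] - seen)
        comps.append(comp)
    return comps


edges = [("a", "b", "work"), ("b", "c", "work"),
         ("c", "d", "family"), ("d", "e", "family")]
# Keeping only "work" edges connects a, b, c and leaves d, e isolated.
print(sorted(map(sorted, communities_under_context(edges, {"work"}))))
# → [['a', 'b', 'c'], ['d'], ['e']]
```

Varying `allowed_contexts` and observing how the components change is the kind of sensitivity the CRM characterizes interactively.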
Funding: Project supported by the Hi-Tech Research and Development Program (863) of China (No. 2003AA119010) and the China-American Digital Academic Library (CADAL) Project (No. CADAL2004002).
Funding: supported by the National Natural Science Foundation of China under Grant Nos. 61232012 and 61202279, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA12090, the Natural Science Foundation of Zhejiang Province of China under Grant No. LR13F020001, and the Doctoral Fund of the Ministry of Education of China under Grant No. 20120101110134.