546 articles found
High-dimensional features of adaptive superpixels for visually degraded images (Cited by: 1)
1
Authors: LIAO Feng-feng, CAO Ke-ye, ZHANG Yu-xiang, LIU Sheng. 《Optoelectronics Letters》, EI, 2019, No. 3, pp. 231-235 (5 pages)
This study presents a novel and highly efficient superpixel algorithm, depth-fused adaptive superpixel (DFASP), which can generate accurate superpixels in a degraded image. In many applications, particularly in real scenes, vision degradation such as motion blur, overexposure, and underexposure often occurs. Well-known color-based superpixel algorithms cannot produce accurate superpixels in degraded images because vision degradation makes the color information ambiguous. To eliminate this ambiguity, we use depth and color information to generate superpixels. We map the depth and color information to a high-dimensional feature space. Then, we develop a fast multilevel clustering algorithm to produce superpixels. Furthermore, we design an adaptive mechanism to adjust the color and depth information automatically during pixel clustering. Experimental results demonstrate that DFASP outperforms state-of-the-art superpixel methods in boundary recall, undersegmentation error, run time, and achievable segmentation accuracy.
Keywords: high-dimensional features; visually degraded images
A Concise and Varied Visual Features-Based Image Captioning Model with Visual Selection
2
Authors: Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Naveed Ahmed, Mohammed Ali Alshara. 《Computers, Materials & Continua》, SCIE EI, 2024, No. 11, pp. 2873-2894 (22 pages)
Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. However, providing image captioning models with the capability of selecting the most relevant visual features from the input image and attending to them can significantly improve the utilization of these features. Consequently, this leads to enhanced captioning network performance. In light of this, we present an image captioning framework that efficiently exploits the extracted representations of the image. Our framework comprises three key components: the Visual Feature Detector module (VFD), the Visual Feature Visual Attention module (VFVA), and the language model. The VFD module is responsible for detecting a subset of the most pertinent features from the local visual features, creating an updated visual features matrix. Subsequently, the VFVA directs its attention to the visual features matrix generated by the VFD, resulting in an updated context vector employed by the language model to generate an informative description. Integrating the VFD and VFVA modules introduces an additional layer of processing for the visual features, thereby contributing to enhancing the image captioning model’s performance. Using the MS-COCO dataset, our experiments show that the proposed framework competes well with state-of-the-art methods, effectively leveraging visual representations to improve performance. The implementation code can be found here: https://github.com/althobhani/VFDICM (accessed on 30 July 2024).
Keywords: visual attention; image captioning; visual feature detector; visual feature visual attention
A Visual Indoor Localization Method Based on Efficient Image Retrieval (Cited by: 1)
3
Authors: Mengyan Lyu, Xinxin Guo, Kunpeng Zhang, Liye Zhang. 《Journal of Computer and Communications》, 2024, No. 2, pp. 47-66 (20 pages)
The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally used scene-specific 3D representations or were trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. Addressing this issue, this paper proposed a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) was employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensured accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) was designed to resolve relative pose ambiguity caused by the essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image was computed, thereby achieving indoor user pose estimation. The proposed indoor localization method was characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrated a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset. These results convincingly illustrated the outstanding performance of the proposed indoor localization method.
Keywords: Visual Indoor Positioning; Feature Point Matching; Image Retrieval; Position Calculation; Five-Point Method
Structured Computational Modeling of Human Visual System for No-reference Image Quality Assessment
4
Authors: Wen-Han Zhu, Wei Sun, Xiong-Kuo Min, Guang-Tao Zhai, Xiao-Kang Yang. 《International Journal of Automation and Computing》, EI CSCD, 2021, No. 2, pp. 204-218 (15 pages)
Objective image quality assessment (IQA) plays an important role in various visual communication systems, which can automatically and efficiently predict the perceived quality of images. The human eye is the ultimate evaluator for visual experience, thus the modeling of human visual system (HVS) is a core issue for objective IQA and visual experience optimization. The traditional model based on black box fitting has low interpretability and it is difficult to guide the experience optimization effectively, while the model based on physiological simulation is hard to integrate into practical visual communication services due to its high computational complexity. For bridging the gap between signal distortion and visual experience, in this paper, we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of HVS. According to the mechanism of the human brain, we divide the visual signal processing into a low-level visual layer, a middle-level visual layer and a high-level visual layer, which conduct pixel information processing, primitive information processing and global image information processing, respectively. The natural scene statistics (NSS) based features, deep features and free-energy based features are extracted from these three layers. The support vector regression (SVR) is employed to aggregate features to the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ and TID2013) demonstrate that our proposed metric is highly competitive with or outperforms the state-of-the-art NR IQA measures.
Keywords: image quality assessment (IQA); no-reference (NR); structural computational modeling; human visual system; visual feature extraction
Bag-of-visual-words model for artificial pornographic images recognition
5
Authors: 李芳芳, 罗四伟, 刘熙尧, 邹北骥. 《Journal of Central South University》, SCIE EI CAS CSCD, 2016, No. 6, pp. 1383-1389 (7 pages)
It is illegal to spread and transmit pornographic images over the internet, either in real or in artificial format. The traditional methods are designed to identify real pornographic images and they are less efficient in dealing with artificial images. Therefore, criminals turn to releasing artificial pornographic images in some specific scenes, e.g., in social networks. To efficiently identify artificial pornographic images, a novel bag-of-visual-words based approach is proposed in this work. In the bag-of-words (BoW) framework, speeded-up robust feature (SURF) is first adopted for feature extraction, then a visual vocabulary is constructed through K-means clustering and images are represented by an improved BoW encoding method, and finally the visual words are fed into a learning machine for training and classification. Different from the traditional BoW method, the proposed method sets a weight on each visual word according to the number of features that each cluster contains. Moreover, a non-binary encoding method and cross-matching strategy are utilized to improve the discriminative power of the visual words. Experimental results indicate that the proposed method outperforms the traditional method.
Keywords: artificial pornographic image; bag-of-words (BoW); speeded-up robust feature (SURF) descriptors; visual vocabulary
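The weighted bag-of-words encoding described in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract only says that each visual word is weighted "according to the number of features that each cluster contains", so weighting each bin by its cluster's share of the training descriptors is an assumption, and the function names are hypothetical. SURF extraction and K-means are taken as already done, so the vocabulary is passed in as a list of centroids.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster centroid) closest to a descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: math.dist(descriptor, vocabulary[k]))

def cluster_sizes(training_descriptors, vocabulary):
    """Number of training descriptors assigned to each visual word."""
    sizes = [0] * len(vocabulary)
    for d in training_descriptors:
        sizes[nearest_word(d, vocabulary)] += 1
    return sizes

def weighted_bow(image_descriptors, vocabulary, sizes):
    """Non-binary BoW histogram with per-word weights.

    Assumption: the weight of a word is its cluster's share of all
    training descriptors; the paper may use a different weighting.
    """
    total = sum(sizes)
    hist = [0.0] * len(vocabulary)
    for d in image_descriptors:
        hist[nearest_word(d, vocabulary)] += 1.0   # count, not 0/1 presence
    return [h * (s / total) for h, s in zip(hist, sizes)]
```

The histogram keeps occurrence counts rather than presence flags, which is one reading of "non-binary encoding"; the cross-matching strategy mentioned in the abstract is not sketched here.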
Multi-source image fusion algorithm based on fast weighted guided filter (Cited by: 6)
6
Authors: WANG Jian, YANG Ke, REN Ping, QIN Chunxia, ZHANG Xiufei. 《Journal of Systems Engineering and Electronics》, SCIE EI CSCD, 2019, No. 5, pp. 831-840 (10 pages)
In the last few years, guided image fusion algorithms have become more and more popular. However, the current algorithms cannot eliminate halo artifacts. We propose an image fusion algorithm based on a fast weighted guided filter. Firstly, the source images are separated into a series of high and low frequency components. Secondly, three visual features of the source image are extracted to construct a decision graph model. Thirdly, a fast weighted guided filter is proposed to optimize the result obtained in the previous step and reduce the time complexity by considering the correlation among neighboring pixels. Finally, the image obtained in the previous step is combined with the weight map to realize the image fusion. The proposed algorithm is applied to multi-focus, visible-infrared and multi-modal images respectively, and the final results show that the algorithm effectively removes the halo artifacts of the merged images with higher efficiency, and is better than the traditional method in both subjective visual quality and objective evaluation.
Keywords: fast guided filter; image fusion; visual feature; decision map
Content-based retrieval based on binary vectors for 2-D medical images
7
Authors: 龚鹏, 邹亚东, 洪海. 《吉林大学学报(信息科学版)》, CAS, 2003, No. S1, pp. 127-130 (4 pages)
In medical research and clinical diagnosis, automated or computer-assisted classification and retrieval methods are highly desirable to offset the high cost of manual classification and manipulation by medical experts. To facilitate the decision-making in the health-care and the related areas, in this paper, a two-step content-based medical image retrieval algorithm is proposed. Firstly, in the preprocessing step, the image segmentation is performed to distinguish image objects, and on the basis of the ...
Keywords: Content-based image retrieval; Medical images; Feature space; Spatial relationship; Visual information retrieval
Historical Arabic Images Classification and Retrieval Using Siamese Deep Learning Model
8
Authors: Manal M. Khayyat, Lamiaa A. Elrefaei, Mashael M. Khayyat. 《Computers, Materials & Continua》, SCIE EI, 2022, No. 7, pp. 2109-2125 (17 pages)
Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical faded colored images. Thus, there have been many efforts to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. Then, the distances among the extracted feature vectors are measured and a similarity score is generated using a Siamese deep neural network. The Siamese model was at first built and trained from scratch, but it did not generate high evaluation metrics. Thus, we re-built it from the VGG19 pre-trained deep learning model to generate higher evaluation metrics. Afterward, three different distance metrics combined with the Sigmoid activation function were tested to find the most accurate method for measuring the similarities among the retrieved images. The highest evaluation parameters were obtained using the Cosine distance metric. Moreover, the Graphics Processing Unit (GPU) was utilized to run the code instead of the Central Processing Unit (CPU). This step optimized the execution further since it expedited both the training and the retrieval time. After extensive experimentation, we reached a satisfactory solution, recording F-scores of 0.98 and 0.99 for the classification and for the retrieval, respectively.
Keywords: visual features vectors; deep learning models; distance methods; similar image retrieval
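As a rough illustration of the similarity head described in the abstract above, the sketch below combines the cosine distance between two feature vectors with a sigmoid to produce a score in (0, 1). The abstract does not say how the distance is fed into the sigmoid, so the linear map with `weight` and `bias` is an assumption standing in for learned parameters, and `similarity_score` is a hypothetical name. The VGG19 feature extractor is omitted; plain Python lists stand in for its output vectors.

```python
import math

def cosine_distance(u, v):
    """1 - cos(angle between u and v); 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def similarity_score(u, v, weight=-4.0, bias=2.0):
    """Map a distance to a similarity in (0, 1).

    weight and bias are hypothetical placeholders for parameters a
    Siamese network would learn; a negative weight makes small
    distances produce high scores.
    """
    return sigmoid(weight * cosine_distance(u, v) + bias)
```

With these placeholder values, identical vectors score about 0.88 and orthogonal vectors about 0.12, so a threshold at 0.5 would separate them.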
Sparse representation of global features of visual images in human primary visual cortex: Evidence from fMRI (Cited by: 1)
9
Authors: ZHAO SongNian, YAO Li, JIN Zhen, XIONG XiaoYun, WU Xia, ZOU Qi, YAO GuoZheng, CAI XiaoHong, LIU YiJun. 《Chinese Science Bulletin》, SCIE EI CAS, 2008, No. 14, pp. 2165-2174 (10 pages)
In fMRI experiments on object representation in visual cortex, we designed two types of stimuli: one is the gray face image and its line drawing, and the other is the illusion and its corresponding completed illusion. Both of them have the same global features with different minute details so that the results of fMRI experiments can be compared with each other. The first kind of visual stimuli was used in a block design fMRI experiment, and the second was used in an event-related fMRI experiment. Comparing and analyzing interesting visual cortex activity patterns and blood oxygenation level dependent (BOLD)-fMRI signal, we obtained results to show some invariance of global features of visual images. A plausible explanation about the invariant mechanism is related with the cooperation of synchronized response to the global features of the visual image with a feedback of shape perception from higher cortex to cortex V1, namely the integration of global features and embodiment of sparse representation and distributed population code.
Keywords: visual cortex; visual imaging; fMRI representation; biology
Multi-label image classification for ground reconnaissance robots based on the GLF-ViT algorithm
10
Authors: 杨成山, 王明, 郭东兵, 赵爱军. 《火力与指挥控制》, PKU Core, 2026, No. 2, pp. 168-173 (6 pages)
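Existing multi-label image classification algorithms face several challenges in ground reconnaissance robot tasks, including complex backgrounds, strong noise interference, and significant scale differences among targets, which limit the effectiveness of visual feature extraction. To address this, a global-local feature fusion algorithm based on the ViT model (GLF-ViT) is proposed. It uses a self-attention mechanism to select high-response regions and strengthen local feature representation, and combines these with global features to achieve cross-scale collaborative modeling. Experiments on the PASCAL VOC2012 dataset show that GLF-ViT effectively fuses global and local features and offers some advantage in visual feature extraction.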
Keywords: multi-label image classification; ViT model; feature fusion; self-attention mechanism; feature extraction
Improved PL-VINS based on closed-loop image correction and line-feature clustering
11
Authors: 张原玮, 王祝, 姚万业, 王天宁. 《仪器仪表学报》, PKU Core, 2026, No. 1, pp. 340-352 (13 pages)
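In environments with illumination changes and repetitive textures, existing visual-inertial navigation systems (VINS) extract too few features and have a high feature mismatch rate, so pose estimation accuracy and system robustness fall short of application requirements. To address this, an improved PL-VINS algorithm is proposed that improves feature extraction under illumination changes and feature matching in repetitive-texture environments. Specifically, in the image preprocessing module, a closed-loop gamma correction method iteratively adjusts image brightness until it reaches the desired value, increasing the number of extractable features and thereby strengthening robustness under illumination changes. In the line-feature detection and tracking module, the intersections in the image plane of pairs of spatially parallel line segments are computed first and clustered into intersection clusters with weighted center points; line features are then clustered according to their distance and direction relative to the weighted center points, which improves the robustness of line-feature matching in repetitive-texture environments. In the back-end optimization module, the intersections of line features in the same cluster are added to the optimization as features, and a reprojection residual fusing point, line, and intersection features is constructed to improve pose estimation accuracy in repetitive-texture environments. Comparative tests on public datasets show that the improved PL-VINS reduces the mean absolute pose error by 17.4% relative to PL-VINS on the EuRoC dataset and by 12.2% relative to SuperVINS on the UMA-VI dataset. To further validate the algorithm, a test platform was built on a mobile robot for real-world testing. The real-world results show that the improved PL-VINS is more accurate and robust than the compared algorithms under illumination changes and repetitive textures.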
Keywords: visual localization; image correction; line-feature clustering; PL-VINS
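The closed-loop gamma correction step in the abstract above (adjust brightness iteratively until it reaches a desired value) might look roughly like the sketch below. The abstract does not give the update rule, so bisection on gamma is an assumption chosen because it is guaranteed to converge; `target`, `tol`, and the search bounds are hypothetical parameters, and a flat list of intensities in [0, 1] stands in for an image.

```python
def mean_brightness(pixels, gamma):
    """Mean intensity after gamma correction; pixels are floats in [0, 1]."""
    return sum(p ** gamma for p in pixels) / len(pixels)

def closed_loop_gamma(pixels, target=0.5, tol=1e-3, lo=0.05, hi=20.0, max_iter=60):
    """Search for a gamma whose corrected mean brightness is within tol of target.

    Mean brightness falls as gamma rises, so the sign of the measured
    error tells the loop which way to move gamma. Bisection is used as
    the update rule here; this is an assumption, not the paper's rule.
    """
    gamma = 1.0
    for _ in range(max_iter):
        gamma = 0.5 * (lo + hi)
        error = mean_brightness(pixels, gamma) - target
        if abs(error) <= tol:
            break
        if error > 0:       # still too bright: a larger gamma darkens
            lo = gamma
        else:               # too dark: a smaller gamma brightens
            hi = gamma
    return gamma
```

For an underexposed input the returned gamma is below 1, which brightens the image; the loop stops as soon as the measured brightness is close enough to the target, which is the feedback behaviour the abstract describes.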
Fine-grained image classification based on capsule networks and Transformer
12
Authors: 刘正华, 龚小玉, 梁彧骁, 梁艳洁. 《现代电子技术》, PKU Core, 2026, No. 8, pp. 137-144 (8 pages)
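Fine-grained flower image classification has important application value in variety identification, precision horticulture, and intelligent breeding, but the feature differences between morphologically similar varieties are small and complex backgrounds cause significant interference, so existing methods achieve limited recognition accuracy. To address this problem, a fine-grained image classification architecture based on capsule networks and the vision Transformer is proposed to improve feature representation and classification performance. First, a dual-frequency attention feature extraction module is designed; it uses parallel high-frequency and low-frequency branches combined with Sobel-gradient-based spatial attention, frequency-domain attention, and ECA channel attention to model texture edges and structural information efficiently. Second, a capsule-based vision Transformer framework is built, comprising a capsule visual embedding module and an improved capsule-aware Transformer encoder; it explicitly decouples capsule length from direction and introduces gated residuals and the squash nonlinearity to model local and global features jointly. Finally, a joint loss optimization strategy is proposed that optimizes training from the perspectives of discriminability, reconstruction, and generalization. Experimental results show that the proposed method achieves high recognition accuracy and strong robustness on the Flowers dataset, verifying its effectiveness and advancement in complex scenes.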
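Keywords: fine-grained image classification; vision Transformer; fusion mechanism; capsule network; joint loss optimization; dual-frequency attention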
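The squash nonlinearity and the length/direction decoupling mentioned in the abstract above can be illustrated with the standard capsule-network form, squash(v) = (‖v‖² / (1 + ‖v‖²)) · v / ‖v‖. Whether the paper uses exactly this form is an assumption; the helper `length_and_direction` and the `eps` guard are illustrative additions.

```python
import math

def squash(vector, eps=1e-9):
    """Standard capsule squash: keeps direction, maps length into [0, 1)."""
    sq_norm = sum(x * x for x in vector)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)   # eps guards the zero vector
    return [scale * x for x in vector]

def length_and_direction(vector, eps=1e-9):
    """Split a capsule into its length (a scalar) and its unit direction."""
    norm = math.sqrt(sum(x * x for x in vector))
    return norm, [x / (norm + eps) for x in vector]
```

After squashing, the capsule length can be read as a presence score while the direction carries the pose or attribute information, which is the separation the abstract refers to.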
A lightweight FPGA image-preprocessing accelerator scheme for visual navigation
13
Authors: 薛仁魁, 张杰, 李斌, 李萌, 吴洋. 《中国科学院大学学报(中英文)》, PKU Core, 2026, No. 2, pp. 277-287 (11 pages)
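To meet the need for accelerated front-end image processing in visual navigation, an image-preprocessing accelerator scheme based on a lightweight, low-cost FPGA is proposed. Through efficient pipeline design and parallel processing, the scheme integrates key functions including histogram equalization, FAST feature point detection, and time synchronization of multi-source sensor data. It addresses several technical difficulties: integrating multiple functions within limited hardware resources, meeting real-time requirements, balancing cost and performance, synchronizing multi-source sensor information in time, and achieving hardware-software co-design. The scheme is implemented on a lightweight FPGA from the Xilinx Zynq-7000 series and greatly reduces image processing latency while keeping cost low. With the FPGA running at 160 MHz, it processes 1280×720 images at 150 frames/s, providing a low-cost, high-performance front-end acceleration solution for visual navigation images.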
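Keywords: image accelerator; histogram equalization; feature point extraction; time synchronization; visual navigation; field-programmable gate array (FPGA)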
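For readers unfamiliar with the histogram equalization stage named in the abstract above, here is a software reference sketch of the usual CDF-based mapping. It is not the FPGA design from the paper, which is a pipelined hardware implementation; the function name and the flat-list image representation are illustrative assumptions.

```python
def equalize_histogram(pixels, levels=256):
    """CDF-based histogram equalization of integer gray levels in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    cdf, running = [], 0
    for count in hist:              # cumulative pixel count per gray level
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)   # CDF at the darkest level present
    n = len(pixels)
    if n == cdf_min:                # constant image: nothing to stretch
        return list(pixels)

    # Lookup table that spreads the occupied levels over the full range.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (n - cdf_min))) for c in cdf]
    return [lut[p] for p in pixels]
```

As a sanity check on the reported throughput, 1280 × 720 × 150 is 138,240,000 pixels per second, which is below the 160 MHz clock; that would be consistent with a pipeline accepting about one pixel per clock cycle, although the abstract does not state the per-clock pixel rate.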
A medical image segmentation model based on a dual-branch parallel encoding network
14
Authors: 吴灿辉, 王乐, 毛国君, 饶艳莺, 苏宇征. 《计算机科学与探索》, PKU Core, 2026, No. 4, pp. 1181-1192 (12 pages)
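In medical image segmentation tasks such as magnetic resonance imaging (MRI) and computed tomography (CT), blurred organ boundaries, hard-to-identify small organs, and overlapping abdominal organs significantly limit segmentation accuracy. Meanwhile, mainstream methods have their own limits: pure convolutional neural networks (CNNs) struggle to model long-range dependencies, while pure Transformers capture local detail poorly and are computationally expensive. To address these problems, a dual-branch encoder, DPEncoder, is proposed that exploits the complementarity between global and local features, and the medical image segmentation model PMCNet is built on it. DPEncoder uses a dual-branch parallel structure: one branch uses a visual state space model to capture the global context and long-range dependencies of the image; the other uses a convolutional neural network to extract fine local features and spatial detail. A channel multi-scale convolutional feature fusion module strengthens the model's ability to represent complex features and fuses global and local information in a complementary way. PMCNet follows a U-shaped structure consisting of the DPEncoder encoder, a corresponding decoder, and skip connections, and achieves high-precision segmentation of MRI or CT slices. Experimental results show that the proposed model improves the Dice score over the Mamba-based model Swin-UMamba by 4.06, 1.82, and 2.74 percentage points on the Synapse, ACDC, and AMOS2022 datasets, respectively, and also shows clear advantages over other advanced models.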
Keywords: medical image segmentation; dual-branch parallel encoder; multi-scale features; visual state space; convolutional neural network
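The Dice score used to compare models in the abstract above is 2·|P ∩ T| / (|P| + |T|) for a predicted region P and a ground-truth region T. A minimal per-class sketch over flattened label maps follows; the function name and the convention of returning 1.0 when a class is absent from both maps are assumptions, since evaluation toolkits differ on that edge case.

```python
def dice_score(pred, target, label):
    """Dice coefficient for one class over two equally long label sequences."""
    pred_hits = sum(1 for p in pred if p == label)
    target_hits = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_hits + target_hits == 0:
        return 1.0   # assumed convention: class absent in both maps counts as a match
    return 2.0 * overlap / (pred_hits + target_hits)
```

Multi-organ benchmarks typically average this value over classes and cases, so a gain of a few percentage points reflects a change in the mean of many such per-class scores.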
Analysis of holistic features of the visual road environment at pedestrian crosswalks
15
Authors: 任蔚溪, 陈雨人. 《交通与运输》, 2026, No. 1, pp. 88-93 (6 pages)
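Analyzing road environment features at pedestrian crosswalks from the standpoint of visual perception is important for improving pedestrian safety. Based on street-view images, visual road environment features are extracted automatically along three dimensions: semantic segmentation, color computation, and depth estimation. Hierarchical clustering divides the visual road environment at crosswalks into three types: nature-dominated with low saturation, building-dominated with high visual complexity, and open and balanced. The results show that the mean numbers of pedestrian crashes for the three types are 0.741, 0.457, and 0.380, respectively, with the open and balanced type being the safest for pedestrians. Compared with any single visual feature, the visual road environment category correlates more strongly with pedestrian crashes and better reflects how drivers and pedestrians perceive the scene as a whole.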
Keywords: pedestrian crosswalk; visual road environment; visual features; hierarchical clustering; street-view images
PRANet: A pseudo-region alignment network for image captioning
16
Authors: 赵勇勇, 刘金星, 任孟月. 《中阿科技论坛(中英文)》, 2026, No. 3, pp. 100-104 (5 pages)
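Image captioning is a typical cross-modal generation task whose core is accurately understanding the semantic relations among different objects in an image. Mainstream methods usually use pre-trained models to extract several types of visual features, such as grid and region features, and then generate the caption through feature interaction. These methods generally overlook a key problem: grid features and region features usually come from different pre-trained models, so they are hard to align in any real sense and a clear semantic gap separates them. Simple feature interaction alone leaves the model with insufficient generalization in complex and varied scenes, which degrades overall captioning performance. To solve this problem, the article proposes the pseudo-region alignment network (PRANet). The network builds a region-to-grid mapping module that generates corresponding pseudo-region features for the grid features, achieving efficient feature alignment at the level of spatial structure and thereby improving model performance. Experiments on the MSCOCO dataset show that the method achieves competitive performance.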
Keywords: image captioning; feature alignment; pseudo-region; PRANet; vision-language understanding
Analysis of color features in Monet's Water Lilies series based on image processing
17
Authors: 黄如平, 孟瑜. 《色彩》, 2026, No. 1, pp. 66-68 (3 pages)
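Monet's Water Lilies series is a model of Impressionist color expression; the works span nearly thirty years and show a continuous evolution of color structure and visual language. Taking image processing as its entry point, this paper extracts the dominant hues, lightness distribution, and warm-cool contrast of Water Lilies works from different periods to reveal their patterns of visual composition and paths of emotional expression. By quantifying image features, it establishes a link between artistic style and digital models, and explores the potential of color data in art teaching, digital creation, and the analysis of Eastern painting, offering a practical paradigm for combining art research with image technology.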
Keywords: Monet; Water Lilies series; color features; image processing; digital visual analysis
Brain functional network connectivity based on a visual task: visual information processing-related brain regions are significantly activated in the task state (Cited by: 2)
18
Authors: Yan-li Yang, Hong-xia Deng, Gui-yang Xing, Xiao-luan Xia, Hai-fang Li. 《Neural Regeneration Research》, SCIE CAS CSCD, 2015, No. 2, pp. 298-307 (10 pages)
It is not clear whether the method used in functional brain-network related research can be applied to explore the feature binding mechanism of visual perception. In this study, we investigated feature binding of color and shape in visual perception. Functional magnetic resonance imaging data were collected from 38 healthy volunteers at rest and while performing a visual perception task to construct brain networks active during resting and task states. Results showed that brain regions involved in visual information processing were obviously activated during the task. The components were partitioned using a greedy algorithm, indicating the visual network existed during the resting state. Z-values in the vision-related brain regions were calculated, confirming the dynamic balance of the brain network. Connectivity between brain regions was determined, and the result showed that occipital and lingual gyri were stable brain regions in the visual system network, the parietal lobe played a very important role in the binding process of color features and shape features, and the fusiform and inferior temporal gyri were crucial for processing color and shape information. Experimental findings indicate that understanding visual feature binding and cognitive processes will help establish computational models of vision, improve image recognition technology, and provide a new theoretical mechanism for feature binding in visual perception.
Keywords: nerve regeneration; functional magnetic resonance imaging; resting state; task state; brain network; module division; feature binding; Fisher’s Z transform; connectivity; visual stimuli; NSFC grants; neural regeneration
EGSNet: An Efficient Glass Segmentation Network Based on Multi-Level Heterogeneous Architecture and Boundary Awareness
19
Authors: Guojun Chen, Tao Cui, Yongjie Hou, Huihui Li. 《Computers, Materials & Continua》, SCIE EI, 2024, No. 12, pp. 3969-3987 (19 pages)
Existing glass segmentation networks have high computational complexity and large memory occupation, leading to high hardware requirements and time overheads for model inference, which is not conducive to efficiency-seeking real-time tasks such as autonomous driving. The inefficiency of the models is mainly due to employing homogeneous modules to process features of different layers. These modules require computationally intensive convolutions and weight calculation branches with numerous parameters to accommodate the differences in information across layers. We propose an efficient glass segmentation network (EGSNet) based on multi-level heterogeneous architecture and boundary awareness to balance the model performance and efficiency. EGSNet divides the feature layers from different stages into low-level understanding, semantic-level understanding, and global understanding with boundary guidance. Based on the information differences among the different layers, we further propose the multi-angle collaborative enhancement (MCE) module, which extracts the detailed information from shallow features, and the large-scale contextual feature extraction (LCFE) module to understand semantic logic through deep features. The models are trained and evaluated on the glass segmentation datasets HSO (Home-Scene-Oriented) and Trans10k-stuff, respectively, and EGSNet achieves the best efficiency and performance compared to advanced methods. In the HSO test set results, the IoU, Fβ, MAE (Mean Absolute Error), and BER (Balance Error Rate) of EGSNet are 0.804, 0.847, 0.084, and 0.085, and the GFLOPs (Giga Floating Point Operations Per Second) are only 27.15. Experimental results show that EGSNet significantly improves the efficiency of the glass segmentation task with better performance.
Keywords: image segmentation; multi-level heterogeneous architecture; feature differences
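Three of the metrics quoted in the abstract above (IoU, MAE, and BER) can be computed for a binary glass mask roughly as follows. The definitions are the ones commonly used in segmentation work and are assumed, not confirmed, to match the paper's evaluation code; the 0.5 threshold and the function name are hypothetical. The sketch expects both glass and non-glass pixels in the ground truth, otherwise the ratios are undefined.

```python
def glass_metrics(prob, truth, threshold=0.5):
    """IoU, MAE and BER for one predicted probability map against a 0/1 mask."""
    pred = [1 if p >= threshold else 0 for p in prob]
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    iou = tp / (tp + fp + fn)                                  # overlap / union
    mae = sum(abs(p - t) for p, t in zip(prob, truth)) / len(truth)
    ber = 1.0 - 0.5 * (tp / (tp + fn) + tn / (tn + fp))        # mean per-class error
    return {"iou": iou, "mae": mae, "ber": ber}
```

Higher is better for IoU, while lower is better for MAE and BER, which is why the abstract reports a high IoU alongside small MAE and BER values.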
Improved Blending Attention Mechanism in Visual Question Answering
20
Authors: Siyu Lu, Yueming Ding, Zhengtong Yin, Mingzhe Liu, Xuan Liu, Wenfeng Zheng, Lirong Yin. 《Computer Systems Science & Engineering》, SCIE EI, 2023, No. 10, pp. 1149-1161 (13 pages)
Visual question answering (VQA) has attracted more and more attention in computer vision and natural language processing. Scholars are committed to studying how to better integrate image features and text features to achieve better results in VQA tasks. Analysis of all features may cause information redundancy and heavy computational burden. Attention mechanism is a wise way to solve this problem. However, using single attention mechanism may cause incomplete concern of features. This paper improves the attention mechanism method and proposes a hybrid attention mechanism that combines the spatial attention mechanism method and the channel attention mechanism method. In the case that the attention mechanism will cause the loss of the original features, a small portion of image features were added as compensation. For the attention mechanism of text features, a self-attention mechanism was introduced, and the internal structural features of sentences were strengthened to improve the overall model. The results show that attention mechanism and feature compensation add 6.1% accuracy to multimodal low-rank bilinear pooling network.
Keywords: visual question answering; spatial attention mechanism; channel attention mechanism; image feature processing; text feature extraction