488 articles found
Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity
1
Authors: Chin-Sheng Chen, Kang-Yi Peng, Chien-Liang Huang, Chun-Wei Yeh. Journal of Signal and Information Processing, 2013, No. 3, pp. 114-119 (6 pages)
This paper presents a corner-based image alignment algorithm built on corner-based template matching and geometric parameter estimation. The algorithm consists of two stages: 1) a training phase and 2) a matching phase. In the training phase, a corner detection algorithm extracts the corners, which are then used to build the pyramid images. In the matching phase, corners are obtained with the same corner detection algorithm. The similarity measure is then determined from the differences between the gradient vectors of the corners in the template image and those in the inspection image. A parabolic function is further applied to evaluate the geometric relationship between the template and inspection images. Results show that corner-based template matching outperforms the original edge-based template matching in efficiency, and both are robust against nonlinear lighting changes. The accuracy and precision of corner-based image alignment are comparable to those of edge-based image alignment under the same conditions. In practice, the proposed algorithm demonstrates precision, efficiency, and robustness in image alignment for real-world applications.
Keywords: corner-based image alignment; corner detection; edge-based template matching; gradient vector
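The abstract above mentions a parabolic function but does not say how it is applied. One common use in template matching, assumed here purely for illustration, is to fit a parabola through the similarity scores at the best integer position and its two neighbours to estimate a sub-pixel offset. The function name `parabolic_peak_offset` is hypothetical and not taken from the paper.

```python
def parabolic_peak_offset(s_left, s_center, s_right):
    """Offset (in samples) of the extremum of the parabola passing through
    (-1, s_left), (0, s_center) and (1, s_right).

    Illustrative assumption: scores are similarity values sampled at the
    best integer position and its two neighbours along one axis.
    """
    denom = s_left - 2.0 * s_center + s_right
    if denom == 0.0:
        return 0.0  # the three points are collinear: no unique extremum
    return 0.5 * (s_left - s_right) / denom


# Scores sampled from -(x - 0.25)^2 at x = -1, 0, 1: the true peak is at 0.25.
offset = parabolic_peak_offset(-1.5625, -0.0625, -0.5625)
```

Applying the same fit independently along x and y is a common way to refine a translation estimate; whether the paper does this is not stated in the abstract.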
UniTrans:Unified Parameter-Efficient Transfer Learning and Multimodal Alignment for Large Multimodal Foundation Model
2
Authors: Jiakang Sun, Ke Chen, Xinyang He, Xu Liu, Ke Li, Cheng Peng. Computers, Materials & Continua, 2025, No. 4, pp. 219-238 (20 pages)
With advances in parameter-efficient transfer learning, it has become feasible to leverage large pre-trained language models for downstream tasks under low-cost and low-resource conditions. However, applying this technique to multimodal knowledge transfer introduces a significant challenge: ensuring alignment across modalities while minimizing the number of additional parameters required for downstream task adaptation. This paper introduces UniTrans, a framework aimed at facilitating efficient knowledge transfer across multiple modalities. UniTrans leverages Vector-based Cross-modal Random Matrix Adaptation to enable fine-tuning with minimal parameter overhead. To further enhance modality alignment, we introduce two key components: the Multimodal Consistency Alignment Module and the Query-Augmentation Side Network, specifically optimized for scenarios with extremely limited trainable parameters. Extensive evaluations on various cross-modal downstream tasks demonstrate that our approach surpasses state-of-the-art methods while using just 5% of their trainable parameters. Additionally, it achieves superior performance compared to fully fine-tuned models on certain benchmarks.
Keywords: parameter-efficient transfer learning; multimodal alignment; image captioning; image-text retrieval; visual question answering
Alignment-dependent ionization of molecules in near-circularly polarized intense laser fields
3
Authors: Jie Liu, Yong-Kang Zhang, Xiao-Lei Hao. Chinese Physics B, 2025, No. 5, pp. 347-354 (8 pages)
The alignment-dependent photoelectron spectrum is a valuable tool for mapping out the electronic structure of molecular orbitals. However, this approach may not be applicable to all molecules, such as CO₂, as the ionization process in a linearly polarized laser field involves contributions from orbitals other than the highest occupied molecular orbital (HOMO). Here, we theoretically investigate the ionization of N₂ and CO₂ in a near-circularly polarized laser field using the Coulomb-corrected strong-field approximation (CCSFA) method for molecules. In particular, we introduce a generalized dressed state into the CCSFA method to account for the impact of the laser field on the molecular initial state. The simulated alignment-dependent photoelectron momentum distributions (PMDs) of the two molecules exhibit markedly different behaviors, in excellent agreement with the experimental observations reported in [Phys. Rev. A 102, 013117 (2020)]. Our findings indicate that under a near-circularly polarized laser field, the alignment-dependent PMD of molecules originates primarily from the HOMO, in contrast to the situation under a linearly polarized laser field. Moreover, a satisfactory correlation between the alignment-dependent angular distribution and the orbital symmetry is observed, which suggests an effective approach for molecular orbital imaging.
Keywords: alignment; Coulomb-corrected strong-field approximation (CCSFA); photoelectron momentum distribution (PMD); image
FPCNet-based change detection for remote sensing images
4
Authors: LI Jiying, WANG Qi, SHI Hongping. Journal of Measurement Science and Instrumentation, 2025, No. 3, pp. 371-383 (13 pages)
This study addresses semantic misalignment and insufficient accuracy in edge detail and discrimination, which are common issues in deep learning-based change detection methods that rely on encoder-decoder frameworks. We propose a model called FlowDual-PixelClsObjectMec (FPCNet), which incorporates dual flow alignment in the decoding stage to rectify semantic discrepancies through streamlined feature correction and fusion. Furthermore, the model combines object-level similarity measurement with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and the building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
Keywords: remote sensing image; change detection; semantic misalignment; dual flow alignment; deep supervised discrimination
Multi-domain abdomen image alignment based on multi-scale diffeomorphic jointed network
5
Authors: LU Zhengwei, WANG Yong, GUAN Qiu, CHEN Yizhou, LIU Dongchun, XU Xinli. Optoelectronics Letters (indexed in EI), 2022, No. 10, pp. 628-634 (7 pages)
Recently, the generative adversarial network (GAN) has been extensively applied to the cross-modality conversion of medical images and has shown better performance than other image conversion algorithms. Hence, we propose a novel GAN-based multi-domain registration method named multiscale diffeomorphic jointed network of registration and synthesis (MDJRS-Net). The deviation of the generator in a GAN-based approach affects the alignment phase, so a joint training strategy is introduced to improve the generator by feeding back the structural loss contained in the deformation field. Meanwhile, the diffeomorphic property enables the network to generate deformation fields that are more anatomically plausible. Compared with the other methods, the average Dice score is improved by 1.95% for the computed tomography venous (CTV) to magnetic resonance imaging (MRI) registration task and by 1.92% for the CTV to computed tomography plain (CTP) task.
Keywords: network; alignment; image
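For readers unfamiliar with the Dice score reported above, a minimal sketch of the standard definition, Dice = 2|A∩B| / (|A| + |B|), on flattened binary masks follows. The function name and the convention of returning 1.0 for two empty masks are assumptions of this sketch, not details from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient of two equal-length binary masks (flattened)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# One overlapping voxel, two foreground voxels in each mask: 2*1 / (2+2).
score = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```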
A Two-Stage Algorithm of High Resolution Image Alignment for Mobile Applications
6
Authors: Ren-You Huang, Lan-Rong Dung, Tang-Suan Hong. Journal of Computer and Communications, 2016, No. 4, pp. 36-51 (16 pages)
Global motion estimation (GME) algorithms are widely applied in computer vision and video processing. Previous works usually use low image resolutions to meet real-time requirements (e.g., video stabilization). In some mobile applications (e.g., panoramic stitching of image sequences), however, high resolution is necessary to obtain a panoramic image of satisfactory quality, and the computational cost then becomes too high for the low power budget of a mobile device. The full search algorithm can find the global minimum at an extremely high computational cost, while typical fast algorithms may suffer from the local minimum problem. This paper proposes a fast algorithm for 2560 × 1920 high-resolution (HR) image sequences. The proposed method estimates the motion vector with a two-level coarse-to-fine scheme that uses only sparse reference blocks (25 blocks in this paper) at each level to determine the global motion vector, so the computational cost is significantly reduced. To increase the effective search range and robustness, the predictive motion vector (PMV) technique is used. A comparison of computational complexity shows that the proposed algorithm requires fewer addition operations than the typical three-step search (TSS) algorithm for estimating the global motion of HR images, without the local minimum problem. Quantitative evaluations show that the method is comparable to the full search algorithm (FSA), which is regarded as the gold-standard baseline.
Keywords: global motion estimation; block matching; high resolution; image alignment; mobile applications
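The two-level coarse-to-fine idea in the abstract above can be sketched as follows. This is a simplified illustration rather than the paper's algorithm: it matches one dense interior block instead of 25 sparse reference blocks, omits the predictive motion vector, and uses the sum of absolute differences (SAD) as an assumed cost. All function names are hypothetical.

```python
def sad(ref, cur, dx, dy, margin):
    """Sum of absolute differences between ref(x, y) and cur(x + dx, y + dy)
    over the interior region that keeps every index in bounds."""
    h, w = len(ref), len(ref[0])
    total = 0
    for y in range(margin, h - margin):
        for x in range(margin, w - margin):
            total += abs(ref[y][x] - cur[y + dy][x + dx])
    return total


def search(ref, cur, center, radius, margin):
    """Exhaustive search in a (2*radius+1)^2 window around `center`."""
    cx, cy = center
    best = None
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            cost = sad(ref, cur, dx, dy, margin)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def downsample(img):
    """Half-resolution image obtained by keeping every second pixel."""
    return [row[::2] for row in img[::2]]


def coarse_to_fine_motion(ref, cur, coarse_radius=2):
    """Estimate (dx, dy) such that ref(x, y) matches cur(x + dx, y + dy)."""
    # Level 1: exhaustive search on the half-resolution images.
    cdx, cdy = search(downsample(ref), downsample(cur), (0, 0),
                      coarse_radius, coarse_radius)
    # Level 2: refine by +/-1 pixel around the up-scaled coarse vector.
    margin = 2 * coarse_radius + 1
    return search(ref, cur, (2 * cdx, 2 * cdy), 1, margin)
```

The coarse level evaluates 25 candidates on a quarter of the pixels and the fine level only 9, instead of the 121 full-resolution candidates an exhaustive search over the same ±5 pixel range would need. That saving is the point of the scheme.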
Imaging alignment of rotational state-selected CH₃I molecule
7
Authors: Le-Le Song, Yan-Hui Wang, Xiao-Chun Wang, Hong-Tao Sun, Lan-Hai He, Si-Zuo Luo, Wen-Hui Hu, Dong-Xu Li, Wen-Hui Zhu, Ya-Nan Sun, Da-Jun Ding, Fu-Chun Liu. Chinese Physics B (indexed in SCIE, EI, CAS, CSCD), 2019, No. 2, pp. 167-172 (6 pages)
We experimentally and numerically investigate CH₃I molecular alignment using a femtosecond laser and a hexapole. The hexapole provides the single |111⟩ rotational state condition at a hexapole rod voltage of 4.5 kV. Based on this single rotational state, an enhanced alignment degree of 0.73 is achieved. Our experimental results agree with the simulation results. We experimentally obtain the ion velocity map images and show the influence of the initial rotational-state population. With the I⁺ ion images and angular distributions at different pump-probe delay times, the alignment and anti-alignment phenomena are further demonstrated. The molecules are under field-free conditions when the laser effect disappears completely at the full revival time. Our work shows that quantum control and spatial control of CH₃I molecules can be realized and that a molecular coordinate frame can be obtained for further molecular experiments.
Keywords: hexapole state selection; velocity map imaging; alignment
Novel registration algorithm for 3-D images captured from multiple views of object surface
8
Author: 衡伟 (Heng Wei). Journal of Southeast University (English Edition) (indexed in EI, CAS), 2005, No. 4, pp. 411-413 (3 pages)
A novel algorithm for 3-D surface image registration is proposed. It makes use of the array information of 3-D points and takes vector/vertex-like features as the basis for matching. This array information can be easily obtained when capturing the original 3-D images. The iterative least-mean-squared (LMS) algorithm is applied to adaptively optimize the transformation matrix parameters. Together, these measures effectively improve registration performance and speed up the matching process. Experimental results show that the aligned 3-D images give a good subjective impression. Although the algorithm focuses primarily on the human head model, it can also be used for other objects with small modifications.
Keywords: image alignment; 3-D image; 3-D capture; image registration; iterative least-mean-squared algorithm
Multi-level distribution alignment-based domain adaptation for segmentation of 3D neuronal soma images
9
Authors: Li Ma, Xuantai Xu, Xiaoquan Yang. Journal of Innovative Optical Health Sciences, 2025, No. 6, pp. 69-85 (17 pages)
Deep learning networks are increasingly exploited in the field of neuronal soma segmentation. However, annotating datasets is an expensive and time-consuming task. Unsupervised domain adaptation is an effective way to mitigate the problem, as it can learn an adaptive segmentation model by transferring knowledge from a richly labeled source domain. In this paper, we propose a multi-level distribution alignment-based unsupervised domain adaptation network (MDA-Net) for segmentation of 3D neuronal soma images. Distribution alignment is performed in both feature space and output space. In the feature space, features from different scales are adaptively fused to enhance the feature extraction capability for small target somata and are constrained to be domain invariant by an adversarial adaptation strategy. In the output space, local discrepancy maps that reveal the spatial structures of somata are constructed from the predicted segmentation results. Distribution alignment is then performed on the local discrepancy maps across domains to obtain a superior discrepancy map in the target domain, achieving refined segmentation of neuronal somata. Additionally, after a period of distribution alignment, a portion of target samples with high-confidence pseudo-labels are selected as training data, which assists in learning a more adaptive segmentation network. We verified the superiority of the proposed algorithm by comparing it with several domain adaptation networks on two 3D mouse brain neuronal soma datasets and one macaque brain neuronal soma dataset.
Keywords: unsupervised domain adaptation; multi-level distribution alignment; pseudo-labels; 3D neuronal soma images
A survey on image and video stitching (cited by: 9)
10
Authors: Wei LYU, Zhong ZHOU, Lang CHEN, Yi ZHOU. Virtual Reality & Intelligent Hardware, 2019, No. 1, pp. 55-83 (29 pages)
Image/video stitching is a technology for overcoming the field of view (FOV) limitation of images/videos. It stitches multiple overlapping images/videos to generate a wide-FOV image/video, and has been used in various fields such as sports broadcasting, video surveillance, street view, and entertainment. This survey reviews image/video stitching algorithms, with a particular focus on those developed in recent years. Image stitching first calculates the correspondences between multiple overlapping images, deforms and aligns the matched images, and then blends the aligned images to generate a wide-FOV image. A seamless method is always adopted to eliminate potential flaws such as ghosting and blurring caused by parallax or objects moving across the overlapping regions. Video stitching is a further extension of image stitching. It usually stitches selected frames of the original videos to generate a stitching template by performing image stitching algorithms, and the subsequent frames can then be stitched according to the template. Video stitching is more complicated with moving objects or violent camera movement, because these factors introduce jitter, shakiness, ghosting, and blurring. Foreground detection is usually combined with stitching to eliminate ghosting and blurring, while video stabilization algorithms are adopted to remove jitter and shakiness. This paper further discusses panoramic stitching as a special extension of image/video stitching. Panoramic stitching is currently the most widely used application of stitching. This survey reviews the latest image/video stitching methods, and introduces the fundamental principles, advantages, and weaknesses of image/video stitching algorithms. Image/video stitching faces long-term challenges such as wide baseline, large parallax, and low texture in the overlapping region. New technologies, such as deep learning-based semantic correspondence and 3D image stitching, may present new opportunities to address these issues. Finally, this survey discusses the challenges of image/video stitching and proposes potential solutions.
Keywords: image stitching; video stitching; panoramic stitching; registration; alignment; mesh optimization; deep learning; 3D stitching
Planning Margins to CTV for Image-Guided Whole Pelvis Prostate Cancer Intensity-Modulated Radiotherapy
11
Authors: Zhendong Wang, Kelin Wang, Fritz A. Lerma, Bei Liu, Pradip Amin, Byongyong Yi, Georges Hobeika, Cedric Yu. International Journal of Medical Physics, Clinical Engineering and Radiation Oncology, 2012, No. 2, pp. 23-31 (9 pages)
Purpose: We investigated margin recipes with different alignment techniques in image-guided intensity-modulated radiotherapy (IMRT) of whole pelvis prostate cancer patients. Materials and Methods: Forty-eight computed tomography (CT) scans of eight prostate cancer patients were investigated. Each patient had an initial planning CT scan and 5 consecutive serial CT scans during the course of treatment, all acquired with 3 mm slice separation and 0.94 mm resolution in the axial plane at 120 kVp on a PQ 5000 CT scanner. Three different whole pelvis planning margin recipes, ranging from 3 to 13 mm, were investigated. A unique IMRT plan was created with each PTV on the initial CT scan and was then registered to the 5 serial CT scans by bony alignment or by prostate gland-based alignment. The dose computed on each serial CT scan was accumulated back to the initial CT scan using deformable image registration for final dosimetric evaluation of the interplay between margin selection and alignment method. Results: Bony alignment and prostate gland-based alignment gave very similar results for the pelvic lymphatic nodes (PLNs), regardless of the margin around them. Prostate gland-based alignment greatly enhanced coverage of the prostate and SV, especially with small margins. Meanwhile, the soft-tissue alignment also raised the incidental dose to the rectum and reduced the dose to the bladder. With small to intermediate margins, only soft-tissue alignment gave acceptable mean coverage of the SV. A margin of 13 mm or more was needed for the PLNs to maintain good target coverage.
Conclusion: We recommend prostate-based alignment with margins less than or equal to 5 mm around the prostate and SV, and margins greater than or equal to 13 mm around the vascular spaces.
Keywords: prostate cancer; whole pelvis; image guidance; IMRT; margin; alignment
Use of Image processing software in Hip Joint surgery
12
Author: Rashmi Uddanwadiker. Advances in Bioscience and Biotechnology, 2011, No. 2, pp. 68-74 (7 pages)
The scope of this project was to investigate whether image processing techniques can be applied to the shaft alignment process. Shaft misalignment was calculated using the image processing software Visionbuilder. A further purpose of the project was to check whether the image processing technique can be used in bone transplant surgery. A model of the hip was used for the experiments, and Visionbuilder was used to match the profiles of the bone before and after implant.
Keywords: image processing; shaft alignment; hip joint; bone transplant
Determination of Phase Transitions of p,n-Alkyloxy Benzoic Acid Mesogens Using Legendre Moments and Image Analysis
13
Authors: S. Sreehari Sastry, C. Nageswara Rao, K. Mallika, S. Lakshminarayan, Ha Sie Tiong. World Journal of Condensed Matter Physics, 2013, No. 1, pp. 54-61 (8 pages)
Phase transition temperatures of p,n-alkyloxy benzoic acids (nOBA, n = 3 to 10 and 12) are investigated based on textural image analysis of liquid crystals. The analysis is carried out by computing Legendre moments. Textures of the homeotropically aligned compounds are recorded as a function of temperature using a polarizing optical microscope (POM) in orthoscopic mode, attached to a hot stage and a high-resolution camera. A recurrence formula based on Legendre polynomials is used to compute the moments of the liquid crystal textures. Discontinuities and fluctuations in the values of the Legendre moments as a function of temperature are related to the phase transition temperatures of the sample. The method is successful in confirming or detecting the phase transition temperatures, and the present findings are comparable with the literature.
Keywords: alkyloxybenzoic acids; homeotropic alignment; phase transitions; textures; Legendre moments; image analysis
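As a rough illustration of the Legendre-moment computation mentioned above, the sketch below evaluates Legendre polynomials with Bonnet's recurrence, (k+1)·P_{k+1}(x) = (2k+1)·x·P_k(x) − k·P_{k−1}(x), and forms a discrete moment of a grayscale image. The pixel-centre mapping onto [-1, 1] and the normalisation (2p+1)(2q+1)/(rows·cols) are assumptions of this sketch, since the abstract does not give the paper's discretisation.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) computed with Bonnet's recurrence."""
    p_prev, p = 1.0, x  # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def legendre_moment(image, p, q):
    """Discrete Legendre moment of order (p, q) of a 2-D list of gray levels.

    Assumption: pixel centres are mapped to [-1, 1] and the integral is
    approximated by a Riemann sum.
    """
    rows, cols = len(image), len(image[0])
    total = 0.0
    for j, row in enumerate(image):
        y = (2 * j + 1) / rows - 1.0
        for i, value in enumerate(row):
            x = (2 * i + 1) / cols - 1.0
            total += legendre(p, x) * legendre(q, y) * value
    return (2 * p + 1) * (2 * q + 1) * total / (rows * cols)
```

Tracking a few low-order moments of each recorded texture against temperature, and looking for jumps, would mirror the procedure the abstract describes.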
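A chain-code-based algorithm for comparing and screening rural road alignments (cited by: 2)
14
Authors: 范文涛, 孙翠羽, 崔应寿, 刘柳杨, 龙佳宁. 《交通运输研究》, 2025, No. 1, pp. 102-110 (9 pages)
To improve the efficiency and accuracy of verifying rural road electronic map data, an algorithmic model for comparing and screening rural road alignments based on chain code technology is proposed. First, chain codes are used to identify and extract routes from rural road electronic maps, obtaining alignment features such as each route's starting point, turning points, and direction changes, and enabling precise matching of the chain-code features of route alignments across geographic regions and time. Second, rural road alignment data for 2022 and 2023 from typical districts and counties in eastern, central, and western China are used to train the model and derive parameter thresholds for chain-code differences between route alignments. For rural road routes that exceed the thresholds, a convolutional neural network extracts the route alignment from remote sensing imagery, and the resulting chain-code features are used for comparison, screening, and image-based evaluation. Finally, the algorithm is applied to 4.599 million km of route data in the 2023 national rural road electronic map for computation and validation. The results show that, compared with the traditional "full overlap" method, the algorithm improves efficiency by 72.1%, raises the recognition rate from 64.5% to 90.6%, and raises accuracy from 95.7% to 97.3%. The study shows that the algorithm markedly improves the efficiency and accuracy of processing electronic map alignment data, can provide technical support for entering basic rural road data into databases, and can advance the digital development of rural roads.
Keywords: chain code algorithm; alignment comparison; rural roads; remote sensing imagery; convolutional neural network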
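The abstract above does not specify which chain code variant is used. As a minimal sketch, assuming the classic 8-direction Freeman chain code on a unit grid, a route can be encoded and its turning points located as follows. The names `DIRECTIONS`, `chain_code`, and `turning_points` are hypothetical.

```python
# 8-direction Freeman codes: 0 = east, increasing counter-clockwise
# (assumes the y axis points up).
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}


def chain_code(points):
    """Freeman chain code of a path given as consecutive grid points."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points are not 8-connected neighbours: {step}")
        codes.append(DIRECTIONS[step])
    return codes


def turning_points(points):
    """Points at which the chain code (the direction of travel) changes."""
    codes = chain_code(points)
    return [points[k] for k in range(1, len(codes)) if codes[k] != codes[k - 1]]


route = [(0, 0), (1, 0), (2, 0), (2, 1), (3, 2)]
codes = chain_code(route)      # [0, 0, 2, 1]
turns = turning_points(route)  # [(2, 0), (2, 1)]
```

Two versions of the same route could then be compared code by code, with a mismatch ratio checked against a threshold. The thresholds derived in the paper are not given in the abstract.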
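Few-shot semantic segmentation of intestinal polyp images based on a cross co-attention network (cited by: 1)
15
Authors: 张浩, 曹磊, 马利亚. 《中国数字医学》, 2025, No. 1, pp. 39-44 (6 pages)
Objective: To improve the performance of semantic segmentation models for intestinal polyp images on unseen targets in query images. Methods: A few-shot semantic segmentation method for intestinal polyp images based on a cross co-attention network is proposed. First, a pretrained VGG-16 network extracts visual features from the support and query images. Next, the support and query features are cross-fused between branches to promote alignment of feature semantics across branches. Finally, a parameter-free metric classifies the pixel at each position of the query image. Results: On four open-source intestinal polyp image datasets, including Kvasir-SEG, the foreground-background intersection over union (FB-IoU) of the proposed method exceeds that of U-Net, a classic medical image semantic segmentation model. Conclusion: The proposed method can accurately locate polyp regions in support and query images and achieves good segmentation performance.
Keywords: intestinal polyps; image semantic segmentation; cross co-attention network; semantic alignment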
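A remote sensing image change detection method with temporal-order independence and enhanced robustness
16
Authors: 杨景玉, 张文驰, 党建武, 王锋, 火久元. 《湖南大学学报(自然科学版)》 (PKU Core), 2025, No. 8, pp. 33-43 (11 pages)
In remote sensing image change detection, most deep learning methods adopt a Siamese network structure. Extensive experiments show, however, that the performance of such methods drops sharply when the order of the input images is swapped. For ChangeFormer, the intersection over union on the LEVIR-CD dataset falls by 79.86%, indicating insufficient temporal robustness that severely limits the practicality of change detection models. To address this, a change detection method that combines temporal alignment with cross-layer feature mixing, CINet (chronologic invariant network), is proposed. A temporal alignment module is designed for feature extraction; it reduces the temporal discrepancy between the two branches at the feature level through spatial mixing and temporal reconstruction of the feature maps. A cross-layer feature mixing module then uses full-scale connections and difference guidance to fully exploit the feature maps at every level of both branches, improving detection under different temporal orders. Finally, experiments on LEVIR-CD show that CINet reaches a recall of 90.63% and an intersection over union of 84.13%, which are 1.83 and 1.65 percentage points higher than ChangeFormer, respectively. Experiments on multiple datasets also show that the method still achieves good change detection results after the input order is changed, with better detection performance and stronger temporal robustness than other methods.
Keywords: remote sensing imagery; change detection; Siamese network; temporal alignment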
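A text-image person re-identification method with implicit multi-scale alignment and interaction
17
Authors: 孙锐, 杜云, 陈龙, 张旭东. 《软件学报》 (PKU Core), 2025, No. 10, pp. 4846-4863 (18 pages)
Text-image person re-identification aims to retrieve a target person from an image gallery using a text description. The main challenge is embedding image and text features into a common latent space to achieve cross-modal alignment. Many existing works use separately pretrained unimodal models to extract visual and textual features, then use partitioning or attention mechanisms to obtain explicit cross-modal alignment. However, these explicit alignment methods usually lack the low-level alignment ability needed to match multimodal features effectively, and using preset cross-modal correspondences to achieve explicit alignment may distort intra-modal information. A text-image person re-identification method with implicit multi-scale alignment and interaction is proposed. First, a semantically consistent feature pyramid network extracts multi-scale image features, and attention weights fuse features at different scales that contain global and local information. Second, a multivariate interaction attention mechanism learns the associations between image and text. This mechanism effectively captures correspondences between different visual features and textual information, narrows the gap between modalities, and achieves implicit multi-scale semantic alignment. In addition, a foreground-enhancement discriminator enhances the target person and extracts cleaner person features, which helps alleviate the information imbalance between image and text. Experiments on three mainstream text-image person re-identification datasets, CUHK-PEDES, ICFG-PEDES, and RSTPReid, show that the method effectively improves cross-modal retrieval performance, with Rank-1 scores 2% to 9% higher than SOTA algorithms.
Keywords: text-image person re-identification; implicit alignment; multi-scale fusion; multivariate interaction attention; semantic alignment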
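Text image super-resolution with structure-aware enhancement and cross-modal fusion
18
Authors: 朱仲杰, 张磊, 李沛, 屠仁伟, 白永强, 王玉儿. 《中国图象图形学报》 (PKU Core), 2025, No. 5, pp. 1364-1376 (13 pages)
Objective: Scene text image super-resolution is an emerging visual enhancement technique that raises the resolution of low-resolution text images to improve text readability. However, existing methods cannot effectively extract the dynamic structural features of text, so the resulting semantic prior cannot be effectively aligned and fused with image features, which degrades reconstruction quality and makes text recognition difficult. A cross-modal fusion super-resolution method based on dynamic perception of text structure is therefore proposed to improve text image quality and readability. Methods: First, a text-structure dynamic perception module is built. Through a direction-aware layer and a context association unit, it extracts multi-scale directional features of text and parses the contextual relations between neighboring characters, accurately capturing the dynamic structural features of the text image. Second, a semantic space alignment module is designed. It uses text mask information to promote the generation of a fine-grained text semantic prior and aligns the semantic prior with image features through an affine transformation. Finally, a cross-modal fusion module combines the text semantic prior with image features, promoting cross-modal interaction and fusion through adaptive weight allocation and outputting a high-resolution text image. Results: Compared with several mainstream methods on the real-world dataset TextZoom, the proposed method achieves an average recognition accuracy of 62.4% across three text recognizers, ASTER (attentional scene text recognizer), CRNN (convolutional recurrent neural network), and MORAN (multi-object rectified attention network), which is 2.8% higher than the second-best method. Its peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) are 21.9 dB and 0.789, ranking first and second respectively and ahead of most methods. Conclusion: By accurately capturing the dynamic structural features of text to guide the generation of a high-level text semantic prior, the method promotes alignment and fusion of the text and image modalities and effectively improves reconstruction quality and text readability.
Keywords: scene text image super-resolution (STISR); dynamic text-structure features; multi-scale directional features; semantic space alignment; cross-modal fusion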
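The PSNR figure quoted above follows the standard definition PSNR = 10·log10(MAX² / MSE). A minimal sketch for two equally sized grayscale images stored as nested lists is given below; the 8-bit peak value of 255 is an assumption, and the paper's exact evaluation protocol (color channel, cropping) is not stated in the abstract.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized 2-D images."""
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for r, s in zip(ref_row, res_row):
            squared_error += (r - s) ** 2
            count += 1
    mse = squared_error / count
    # Identical images have zero error; PSNR is unbounded in that case.
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the restored image is closer to the reference. A uniform error of 25.5 gray levels on an 8-bit scale, for example, corresponds to 20 dB.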
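Spatiotemporal distribution and evolution of abnormal river regimes in the wandering reach of the lower Yellow River (cited by: 1)
19
Authors: 秦梦春, 白玉川, 徐海珏, 刘军政, 白洋. 《水科学进展》 (PKU Core), 2025, No. 1, pp. 62-75 (14 pages)
Since 1985, abnormal river regimes have occurred frequently in the wandering reach of the lower Yellow River, making scientific management of the river more difficult. This paper uses deep learning to interpret remote sensing imagery accurately and, together with historical data, systematically studies the spatiotemporal distribution of abnormal river regimes in the wandering reach from 1985 to 2023, focusing on the evolution of three typical abnormal-regime sections in the upper, middle, and lower parts of the reach. To quantify how far the main flow deviates from the regulation line, the concept of a degree of deviation from the plan is proposed. The evolution of abnormal river regimes is then analyzed from morphological changes at three scales: local reach, single bend, and cross-section. The results show that after Xiaolangdi Reservoir began operating, the locations of abnormal river regimes shifted upstream, and they now occur mostly in the upper part of the wandering reach; low-flow, high-sediment conditions favor the formation of abnormal river regimes; sections whose degree of deviation from the plan is below 0.2 do not develop abnormal river regimes; and the development of abnormal river regimes is affected by bed material, with bend cutoffs beginning to scour at the thalweg point and progressing toward both sides. The results can serve as a reference for scientific management of the wandering reach of the lower Yellow River.
Keywords: abnormal river regime; spatiotemporal distribution; evolution process; remote sensing imagery; regulation line; wandering reach of the lower Yellow River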
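An image-text matching method based on reading strategy and semantic alignment (cited by: 1)
20
Authors: 甘凤梅, 夏英. 《重庆邮电大学学报(自然科学版)》 (PKU Core), 2025, No. 1, pp. 67-75 (9 pages)
For the image-text matching task in cross-media computing, an image-text matching method based on reading strategy and semantic alignment (reading-strategy and semantic alignment network, RSAN) is proposed. A region feature enhancement module based on the Transformer and a bidirectional gated recurrent unit (Bi-GRU) generates image region features with semantic relations to improve the accuracy of semantic alignment. A reading module with an overview branch and an intensive-reading branch aggregates global and local alignment to learn more accurate matching scores. Comprehensive experiments on the Flickr30K and MS-COCO datasets show that RSAN performs well in accuracy and efficiency compared with existing baseline models.
Keywords: image-text matching; feature enhancement; semantic alignment; similarity computation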