5 articles found
1. Side Information Generation Based on Hierarchical Motion Estimation in Distributed Video Coding (cited by: 4)
Authors: 刘荣科, 岳志, 陈长汶. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2009, No. 2, pp. 167-173 (7 pages).
The quality of the side information strongly affects the compression efficiency of a distributed video coding (DVC) system. Based on hierarchical motion estimation (HME), this article proposes a new side information generation algorithm that is integrated into the DVC system. First, forward motion estimation (FME) and bidirectional motion estimation (BME), both built on a variable-block-size HME algorithm, are used to obtain relatively accurate motion vectors. Second, a motion vector filter (MVF) is i...
Keywords: communication technology; video signal processing; hierarchical motion estimation; side information; motion vector filter; frame interpolation
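The motion estimation step mentioned in the abstract can be illustrated with a minimal single-level sketch. It assumes full-search block matching with a sum-of-absolute-differences (SAD) cost; the function name `best_motion_vector` and its parameters are hypothetical, and the article's variable-block-size hierarchical scheme is not reproduced here. A hierarchical version would typically repeat a search like this on progressively finer resolutions.

```python
import numpy as np

def best_motion_vector(ref_frame, cur_frame, top, left, block=4, search=2):
    """Full-search block matching: return ((dy, dx), sad) with the lowest SAD.

    The block at (top, left) in cur_frame is compared against every
    candidate block in ref_frame within +/- `search` pixels.
    """
    target = cur_frame[top:top + block, left:left + block].astype(np.int32)
    height, width = ref_frame.shape
    best_vec, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > height or x + block > width:
                continue
            cand = ref_frame[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_vec, best_sad = (dy, dx), sad
    return best_vec, best_sad
```

In a frame-interpolation setting, a vector found this way between two key frames would typically be halved to place the block in the intermediate frame; that step is left out of the sketch.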
2. AVCLNet: Multimodal Multispeaker Tracking Network Using Audio-Visual Contrastive Learning
Authors: Yihan Li, Yidi Li, Zhenhuan Xu, Hao Guo, Mengyuan Liu, Weiwei Wan. CAAI Transactions on Intelligence Technology, 2026, No. 1, pp. 238-255 (18 pages).
Audio-visual speaker tracking aims to determine the locations of multiple speakers in a scene by leveraging signals captured from multisensor platforms. Multimodal fusion methods can improve both the accuracy and robustness of speaker tracking. However, in complex multispeaker tracking scenarios, critical challenges such as cross-modal feature discrepancy, weak sound source localisation ambiguity and frequent identity switch errors remain unresolved. These severely hinder the modelling of speaker identity consistency and consequently lead to degraded tracking accuracy and unstable tracking trajectories. To this end, this paper proposes a multimodal multispeaker tracking network using audio-visual contrastive learning (AVCLNet). Heterogeneous modal representations are integrated into a unified space through audio-visual contrastive learning, which facilitates cross-modal feature alignment, mitigates cross-modal feature bias and enhances identity-consistent representations. In the audio-visual measurement stage, we design a vision-guided weak sound source weighted enhancement method, which leverages visual cues to establish cross-modal mappings and employs a spatiotemporal dynamic weighted mechanism to improve the detectability of weak sound sources. Furthermore, in the data association phase, a dual geometric constraint strategy that combines 2D and 3D spatial geometric information is introduced to reduce frequent identity switch errors. Experiments on the AV16.3 and CAV3D datasets show that AVCLNet outperforms state-of-the-art methods, demonstrating superior robustness in multispeaker scenarios.
Keywords: computer vision; machine perception; multimodal approaches; pattern recognition; video signal processing
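The abstract does not state which contrastive objective AVCLNet uses. As a rough illustration of how audio and visual embeddings can be pulled into a shared space, the sketch below assumes an InfoNCE-style loss in the audio-to-visual direction, where row i of each array belongs to the same speaker. The function name `info_nce_loss` and the `temperature` parameter are hypothetical and not taken from the paper.

```python
import numpy as np

def info_nce_loss(audio_emb, visual_emb, temperature=0.1):
    """InfoNCE-style loss (audio -> visual direction only).

    audio_emb, visual_emb: arrays of shape (n, d); row i of each is
    assumed to be a matching (positive) pair, all other rows negatives.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    v = visual_emb / np.linalg.norm(visual_emb, axis=1, keepdims=True)
    logits = a @ v.T / temperature
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))
```

Matched pairs yield a loss near zero, while mismatched pairs yield a large loss. A symmetric variant would average this value with the visual-to-audio direction.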
3. FAST WAY FOR MOVING OBJECT TRACKING BASED ON BALLOON SNAKE WITH REGION INFORMATION
Authors: 方挺, 杨忠, 沈春林. Transactions of Nanjing University of Aeronautics and Astronautics (EI), 2008, No. 1, pp. 37-42 (6 pages).
A novel Snake model with region information is proposed to detect and track moving objects. Generally, the region-information-based approach is sensitive to illumination changes and small movements in the background, while the edge-information-based approach often obtains incorrect results for ambiguous images. Both types of information are used in computing the image force. Edge-information-based features make the algorithm fast and robust, and region information lets the active contour energy function obtain correct results for ambiguous images. Furthermore, an automatic contour initialization method using double difference images is given to meet the requirements of video sequence tracking. In addition, a simple forecast step is added to the algorithm to estimate the position of the contour, which improves the convergence speed of the active contour. Experimental results show that the computation time of the algorithm is less than 0.1 s/frame, so it can be applied to a real-time system.
Keywords: video signal processing; target tracking; Snake model; region information
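The double difference image used for contour initialization can be sketched as follows. This is a minimal illustration that assumes grayscale frames and a fixed threshold; the names `double_difference_mask` and `bounding_box` and the default threshold are hypothetical and do not come from the paper. A pixel counts as moving only if it changed in both the backward and the forward frame difference, which suppresses the trail an object leaves in a single difference image.

```python
import numpy as np

def double_difference_mask(prev_frame, curr_frame, next_frame, threshold=15):
    """Boolean mask of pixels that changed in both consecutive differences."""
    # Cast to a signed type so the subtraction cannot wrap around.
    prev_f = prev_frame.astype(np.int32)
    curr_f = curr_frame.astype(np.int32)
    next_f = next_frame.astype(np.int32)
    diff_back = np.abs(curr_f - prev_f) > threshold
    diff_fwd = np.abs(next_f - curr_f) > threshold
    return diff_back & diff_fwd

def bounding_box(mask):
    """Return (top, left, bottom, right) of the True region, or None if empty."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])
```

The resulting bounding box could serve as a starting contour for the Snake; how the paper turns the double difference image into an initial contour is not described in the abstract.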
4. Salt and pepper noise removal in surveillance video based on low-rank matrix recovery
Authors: Yongxia Zhang, Yi Liu, Xuemei Li, Caiming Zhang. Computational Visual Media, 2015, No. 1, pp. 59-68 (10 pages).
This paper proposes a new algorithm based on low-rank matrix recovery to remove salt-and-pepper noise from surveillance video. Unlike single-image denoising techniques, noise removal from video sequences aims to utilize both temporal and spatial information. By grouping neighboring frames based on similarities of the whole images in the temporal domain, we formulate the problem of removing salt-and-pepper noise from a video tracking sequence as a low-rank matrix recovery problem. The resulting nuclear-norm and L1-norm related minimization problems can be efficiently solved by many recently developed methods. To determine the low-rank matrix, we use an averaging method based on other similar images. Our method can not only remove noise but also preserve edges and details. The performance of our proposed approach compares favorably to that of existing algorithms and gives better PSNR and SSIM results.
Keywords: multimedia computing; noise cancellation; signal denoising; sparse matrices; video signal processing; video surveillance
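Solvers for nuclear-norm and L1-norm minimization problems are commonly built from two proximal operators: elementwise soft thresholding for the L1 term and singular value thresholding for the nuclear norm. The sketch below shows only these two building blocks, on the assumption that a standard proximal-type solver is used; the abstract does not say which solver the paper adopts, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * L1 norm: shrink each entry toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * nuclear norm: shrink the singular values."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Scale column j of u by the shrunken j-th singular value.
    return (u * soft_threshold(s, tau)) @ vt
```

In a typical low-rank plus sparse decomposition, the low-rank estimate (the clean background shared by similar frames) is updated with `singular_value_threshold`, and the sparse estimate (impulse noise such as salt-and-pepper pixels) with `soft_threshold`.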
5. An improved partial SPIHT with classified weighted rate-distortion optimization for interferential multispectral image compression
Authors: 王柯俨, 吴成柯, 孔繁锵, 张磊. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2008, No. 5, pp. 331-333 (3 pages).
Based on an analysis of the properties of interferential multispectral images, a novel compression algorithm is presented that uses partial set partitioning in hierarchical trees (SPIHT) with classified weighted rate-distortion optimization. After wavelet decomposition, partial SPIHT is applied to each zero tree independently by adaptively selecting one of three coding modes according to the probability of the significant coefficients in each bitplane. Meanwhile, the interferential multispectral image is partitioned into two kinds of regions in terms of luminous intensity, and the rate-distortion slopes of the zero trees are then lifted with classified weights according to their distortion contribution to the constructed spectrum. Finally, a global rate-distortion optimization truncation is performed. Compared with conventional methods, the proposed algorithm not only improves the performance in the spatial domain but also reduces the distortion in the spectral domain.
Keywords: Boolean functions; Data compression; Electric distortion; Image coding; Image compression; Motion estimation; Optimization; Programming theory; Risk assessment; signal distortion; video signal processing; Wavelet decomposition; Wavelet transforms