Journal Articles
2 articles found.
1. Dynamic Global-Principal Component Analysis Sparse Representation for Distributed Compressive Video Sampling
Authors: 武明虎, 陈瑞, 李然, 周尚丽. China Communications (SCIE, CSCD), 2013, Issue 5, pp. 20-29 (10 pages).
Video reconstruction quality largely depends on the ability of the employed sparse domain to adequately represent the underlying video in Distributed Compressed Video Sensing (DCVS). In this paper, we propose a novel dynamic global-Principal Component Analysis (PCA) sparse representation algorithm for video based on the sparse-land model and non-local similarity. First, grouping by matching is performed at the decoder using key frames that were previously recovered. Second, we apply PCA to each group (sub-dataset) to compute the principal components, from which the sub-dictionary is constructed. Finally, the non-key frames are reconstructed from random measurement data using a Compressed Sensing (CS) reconstruction algorithm with sparse regularization. Experimental results show that our algorithm outperforms the DCT and K-SVD dictionaries.
Keywords: distributed video compressive sampling; global-PCA sparse representation; sparse-land model; non-local similarity
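The abstract describes a three-step decoder-side pipeline: group similar patches by matching against recovered key frames, build a per-group PCA sub-dictionary, then recover non-key frames by sparse-regularized CS. Below is a minimal Python sketch of the sub-dictionary step only, assuming patches are already vectorized and grouped; the energy threshold, the shrinkage weight lam, and the toy data are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def build_pca_subdictionary(group, energy=0.95):
    """group: (n_patches, patch_dim) array of similar patches gathered by
    matching from previously recovered key frames. Returns (D, mean), where
    the columns of D are the leading principal components of the group."""
    mean = group.mean(axis=0)
    # SVD of the centered group; the right singular vectors are the PCs.
    _, s, vt = np.linalg.svd(group - mean, full_matrices=False)
    var = s ** 2
    # Keep enough components to explain the requested fraction of variance.
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), energy)) + 1
    return vt[:k].T, mean  # D: (patch_dim, k), orthonormal columns

def shrinkage_restore(patch, D, mean, lam=0.1):
    """Code a patch in the PCA sub-dictionary and reconstruct it. Because D
    has orthonormal columns, l1-regularized coding reduces to soft
    thresholding of the transform coefficients."""
    coeffs = D.T @ (patch - mean)
    coeffs = np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)
    return mean + D @ coeffs

# Toy usage: 40 noisy variants of one vectorized 8x8 patch form a group.
rng = np.random.default_rng(0)
base = rng.standard_normal(64)
group = base + 0.1 * rng.standard_normal((40, 64))
D, mu = build_pca_subdictionary(group)
print(D.shape, np.linalg.norm(group[0] - shrinkage_restore(group[0], D, mu)))
```

In the full scheme this per-group dictionary would serve as the sparsifying basis inside the CS reconstruction of the non-key frames, rather than denoising patches directly as in this toy usage.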
2. CLIP4Video-Sampling: Global Semantics-Guided Multi-Granularity Frame Sampling for Video-Text Retrieval
Authors: Tao Zhang, Yu Zhang. Journal of Computer and Communications, 2024, Issue 11, pp. 26-36 (11 pages).
Video-text retrieval (VTR) is an essential task in multimodal learning, aiming to bridge the semantic gap between visual and textual data. Effective video frame sampling plays a crucial role in improving retrieval performance, as it determines the quality of the visual content representation. Traditional sampling methods, such as uniform sampling and optical-flow-based techniques, often fail to capture the full semantic range of videos, leading to redundancy and inefficiency. In this work, we propose CLIP4Video-Sampling, a global semantics-guided multi-granularity frame sampling strategy designed to optimize both computational efficiency and retrieval accuracy. By integrating multi-scale global and local temporal sampling and leveraging the powerful feature extraction capabilities of the CLIP (Contrastive Language-Image Pre-training) model, our method significantly outperforms existing approaches in both zero-shot and fine-tuned video-text retrieval tasks on popular datasets. CLIP4Video-Sampling reduces redundancy, ensures keyframe coverage, and serves as an adaptable pre-processing module for multimodal models.
Keywords: video sampling; multimodal large language model; text-video retrieval; CLIP model
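As a rough illustration of how global semantics can guide frame selection, the sketch below picks a diverse set of key frames in an embedding space. Here frame_embeddings stands in for per-frame CLIP image features; the greedy farthest-point rule, the centroid seed, and k are illustrative assumptions, not the paper's exact multi-granularity procedure.

```python
import numpy as np

def select_keyframes(frame_embeddings, k=8):
    """Greedy farthest-point sampling in embedding space: seed with the
    frame closest to the video's mean embedding (its global semantics),
    then repeatedly add the frame least similar to any frame already
    chosen, so the selected frames cover distinct visual content."""
    emb = frame_embeddings / np.linalg.norm(frame_embeddings, axis=1,
                                            keepdims=True)
    centroid = emb.mean(axis=0)
    chosen = [int(np.argmax(emb @ centroid))]  # most "globally typical" frame
    while len(chosen) < min(k, len(emb)):
        sim_to_chosen = emb @ emb[chosen].T  # (n, len(chosen)) cosine sims
        # Pick the frame whose best similarity to the chosen set is lowest.
        chosen.append(int(np.argmin(sim_to_chosen.max(axis=1))))
    return sorted(chosen)

# Toy usage with random stand-in embeddings (n frames x 512-d, CLIP-like).
rng = np.random.default_rng(0)
frames = rng.standard_normal((120, 512))
print(select_keyframes(frames, k=8))
```

A full pipeline would combine such a coarse global pass with finer local sampling around the selected anchors before feeding the frames to the retrieval model.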