591 articles found
Semantic-Based Video Retrieval Survey (cited by: 1)
1
Authors: Shaimaa Toriah Mohamed Toriah, Atef Zaki Ghalwash, Aliaa A. A. Youssif. 《Journal of Computer and Communications》, 2018, No. 8, pp. 28-44 (17 pages)
Digital data are growing tremendously owing to the rapid progress of digital devices, which makes capturing such data easy. Digital data include images, text, and video. Video is a rich source of information, so there is an urgent need to retrieve, organize, and automate videos. Video retrieval is a vital process in multimedia applications such as video search engines, digital museums, and video-on-demand broadcasting. In this paper, the different approaches to video retrieval are outlined and briefly categorized. Moreover, the different methods that bridge the semantic gap in video retrieval are discussed in more detail.
Keywords: semantic video retrieval; concept detectors; context-based concept fusion; semantic gap
Real-time and Automatic Close-up Retrieval from Compressed Videos
2
Authors: Ying Weng, Jianmin Jiang. 《International Journal of Automation and Computing》 (EI), 2008, No. 2, pp. 198-201 (4 pages)
This paper proposes a thorough scheme that uses a camera zooming descriptor with a two-level threshold to automatically retrieve close-ups directly from Moving Picture Experts Group (MPEG) compressed videos, based on camera motion analysis. A new algorithm for fast camera motion estimation in the compressed domain is presented. In the retrieval process, camera-motion-based semantic retrieval is built. To improve the coverage of the proposed scheme, close-up retrieval in all kinds of videos is investigated. Extensive experiments illustrate that the proposed scheme provides promising retrieval results in real-time, automatic application scenarios.
Keywords: camera motion analysis; close-up retrieval; Moving Picture Experts Group (MPEG); compressed videos
Automated neurosurgical video segmentation and retrieval system
3
Authors: Engin Mendi, Songul Cecen, Emre Ermisoglu, Coskun Bayrak. 《Journal of Biomedical Science and Engineering》, 2010, No. 6, pp. 618-624 (7 pages)
Medical video repositories play important roles in many health-related areas such as medical imaging, medical research and education, medical diagnostics, and the training of medical professionals. Due to the increasing availability of digital video data, indexing, annotating, and retrieving the information are crucial. Since performing these processes is both computationally expensive and time consuming, automated systems are needed. In this paper, we present a medical video segmentation and retrieval research initiative. We describe the key components of the system, including a video segmentation engine, an image retrieval engine, and an image quality assessment module. The aim of this research is to provide an online tool for indexing, browsing, and retrieving neurosurgical videotapes. This tool will allow people to retrieve the specific information they are interested in from a long videotape instead of looking through the entire content.
Keywords: video processing; video summarization; video segmentation; image retrieval; image quality assessment
A Sentence Retrieval Generation Network Guided Video Captioning
4
Authors: Ou Ye, Mimi Wang, Zhenhua Yu, Yan Fu, Shun Yi, Jun Deng. 《Computers, Materials & Continua》 (SCIE, EI), 2023, No. 6, pp. 5675-5696 (22 pages)
Currently, video captioning models based on an encoder-decoder mainly rely on a single video input source. The contents of video captioning are limited since few studies have employed external corpus information to guide the generation of video captions, which is not conducive to the accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which is utilized to extract the 2D features, 3D features, and object features of video data respectively. These features are decoded to generate textual sentences that conform to the video content for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve different sentences in an external corpus that are semantically similar to the above textual sentences. The candidate sentences are screened out through similarity measurement. Finally, a novel GPT-2 network model is constructed based on the GPT-2 network structure. The model introduces a designed random selector to randomly select predicted words with a high probability in the corpus, which is used to guide and generate textual sentences that are more in line with human natural language expressions. The proposed method is compared with several existing works by experiments. The results show that the indicators BLEU-4, CIDEr, ROUGE_L, and METEOR are improved by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT respectively. It can be seen that the proposed method can generate video captions with richer semantics than several state-of-the-art approaches.
Keywords: video captioning; encoder-decoder; sentence retrieval; external corpus; RS GPT-2 network model
Similar Video Retrieval via Order-Aware Exemplars and Alignment
5
Authors: Teruki Horie, Masato Uchida, Yasuo Matsuyama. 《Journal of Signal and Information Processing》, 2018, No. 2, pp. 73-91 (19 pages)
In this paper, we present machine learning algorithms and systems for similar video retrieval. Here, the query is itself a video. For the similarity measurement, exemplars, or representative frames in each video, are extracted by unsupervised learning. For this learning, we chose order-aware competitive learning. After obtaining a set of exemplars for each video, the similarity is computed. Because the numbers and positions of the exemplars are different in each video, we use a similarity computing method called M-distance, which generalizes existing global and local alignment methods using followers to the exemplars. To represent each frame in the video, this paper emphasizes the Frame Signature of the ISO/IEC standard so that the total system, along with its graphical user interface, becomes practical. Experiments on the detection of inserted plagiaristic scenes showed excellent precision-recall curves, with precision values very close to 1. Thus, the proposed system can work as a plagiarism detector for videos. In addition, this method can be regarded as the structuring of unstructured data via numerical labeling by exemplars. Finally, further sophistication of this labeling is discussed.
Keywords: similar video retrieval; exemplar learning; M-distance; sequence alignment; data structuring
Dynamic Hyperlinker: Innovative Solution for 3D Video Content Search and Retrieval
6
Authors: Mohammad Rafiq Swash, Amar Aggoun, Obaidullah Abdul Fatah, Bei Li. 《Journal of Computer and Communications》, 2016, No. 6, pp. 10-23 (14 pages)
Recently, 3D display technology and content creation tools have undergone rigorous development and, as a result, have been widely adopted by home and professional users. 3D digital repositories are growing and becoming ubiquitously available. However, searching and visualizing 3D content remains a great challenge. In this paper, we propose and present the development of a novel approach for creating hypervideos, which eases 3D content search and retrieval. It is called the dynamic hyperlinker for the 3D content search and retrieval process. It advances 3D multimedia navigability and searchability by creating dynamic links for selectable and clickable objects in the video scene while the user consumes the 3D video clip. The proposed system involves 3D video processing, such as detecting and tracking clickable objects, annotating objects, and metadata engineering including a 3D content descriptive protocol. Such a system attracts attention from both home and professional users, and more specifically from broadcasters and digital content providers. The experiment is conducted on full-parallax holoscopic 3D videos (also known as integral images).
Keywords: holoscopic 3D image; integral image; 3D video; 3D display; video search and retrieval; hyperlinker; hypervideo
CLIP4Video-Sampling: Global Semantics-Guided Multi-Granularity Frame Sampling for Video-Text Retrieval
7
Authors: Tao Zhang, Yu Zhang. 《Journal of Computer and Communications》, 2024, No. 11, pp. 26-36 (11 pages)
Video-text retrieval (VTR) is an essential task in multimodal learning, aiming to bridge the semantic gap between visual and textual data. Effective video frame sampling plays a crucial role in improving retrieval performance, as it determines the quality of the visual content representation. Traditional sampling methods, such as uniform sampling and optical flow-based techniques, often fail to capture the full semantic range of videos, leading to redundancy and inefficiencies. In this work, we propose CLIP4Video-Sampling, a global semantics-guided multi-granularity frame sampling strategy designed to optimize both computational efficiency and retrieval accuracy. By integrating multi-scale global and local temporal sampling and leveraging the CLIP (Contrastive Language-Image Pre-training) model's powerful feature extraction capabilities, our method significantly outperforms existing approaches in both zero-shot and fine-tuned video-text retrieval tasks on popular datasets. CLIP4Video-Sampling reduces redundancy, ensures keyframe coverage, and serves as an adaptable pre-processing module for multimodal models.
Keywords: video sampling; multimodal large language model; text-video retrieval; CLIP model
Sign Language Video Retrieval Based on Trajectory
8
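Authors: Shilin Zhang, Mei Gu. 《通讯和计算机(中英文版)》, 2010, No. 9, pp. 32-35 (4 pages)
Keywords: content-based video retrieval; sign language; edit distance; distance algorithm; color histogram; string; correction method; memory space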
Sign Video Retrieval under Complex Background
9
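Authors: Shilin Zhang, Mei Gu. 《通讯和计算机(中英文版)》, 2010, No. 8, pp. 14-19 (6 pages)
Keywords: video retrieval system; complex background; hidden Markov model; HMM model; sign language recognition; search problem; dynamic characteristics; motion features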
Video Retrieval Using Color and Spatial Information of Human Appearance
10
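Authors: Sofina Yakhu, Nikom Suvonvorn. 《通讯和计算机(中英文版)》, 2012, No. 6, pp. 636-643 (8 pages)
Keywords: content-based video retrieval; appearance color; spatial information; human-centered; video surveillance system; target search; video data; VR system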
Metric Learning for Semantic-Based Clothes Retrieval
11
Authors: YANG Bo, GUO Caili, LI Zheng. 《ZTE Communications》, 2022, No. 1, pp. 76-82 (7 pages)
Existing clothes retrieval methods mostly adopt binary supervision in metric learning. For each iteration, only the clothes belonging to the same instance are positive samples, and all other clothes are "indistinguishable" negative samples, which causes the following problem. The relevance between the query and candidates is only treated as relevant or irrelevant, which makes it difficult for the model to learn the continuous semantic similarities between clothes. Clothes that do not belong to the same instance are considered completely irrelevant and are uniformly pushed away from the query by an equal margin in the embedding space, which is not consistent with the ideal retrieval results. Motivated by this, we propose a novel method called semantic-based clothes retrieval (SCR). In SCR, we measure the semantic similarities between clothes and design a new adaptive loss based on these similarities. The margin in the proposed adaptive loss can vary with different semantic similarities between the anchor and negative samples. In this way, a more coherent embedding space can be learned, where candidates with higher semantic similarities are mapped closer to the query than those with lower ones. We use Recall@K and normalized Discounted Cumulative Gain (nDCG) as evaluation metrics to conduct experiments on the DeepFashion dataset and have achieved better performance.
Keywords: clothes retrieval; metric learning; semantic-based retrieval
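The abstract above evaluates retrieval with Recall@K and nDCG. As a rough illustration of what those metrics measure, here is a minimal Python sketch. It assumes one common definition of each: Recall@K as the fraction of relevant items found in the top K, and nDCG with linear gains and a log2 rank discount. The function names are hypothetical, and the paper may use different variants.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_ids[:k] if item in relevant)
    return hits / len(relevant)


def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), ranks start at 1."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ranking: two relevant items, one of them retrieved in the top 2.
    print(recall_at_k(["a", "b", "c", "d"], ["b", "d"], k=2))
    # Graded relevance of the returned items, in ranked order.
    print(ndcg_at_k([0, 1], k=2))
```

Unlike Recall@K, nDCG accepts graded gains, which is why it suits a setting where similarity between items is continuous rather than binary.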
Inference and retrieval of soccer event
12
Authors: SUN Xing-hua, YANG Jing-yu. 《通讯和计算机(中英文版)》, 2007, No. 3, pp. 18-32 (15 pages)
Keywords: soccer match; video extraction; context; Bayesian network; user-defined
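NewsVideoCAR: A Content-Based Browsing and Retrieval System for Video News Programs (cited by: 3)
13
Authors: 熊华, 老松杨, 吴玲琦, 李恒峰, 吴玲达, 李国辉. 《计算机工程》 (CAS, CSCD, PKU Core), 2000, No. 11, pp. 73-75 (3 pages)
This paper describes the architecture of the NewsVideoCAR system, the basic ideas behind its core technologies, and the key design points of its browsing interface.
Keywords: NewsVideoCAR; television news programs; program browsing and retrieval system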
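A Video-Text Retrieval Method Based on a Cross-Modal Attention Mechanism
14
Authors: 董闯, 栗伟, 巴聪, 覃文军. 《东北大学学报(自然科学版)》 (PKU Core), 2026, No. 1, pp. 75-81 (7 pages)
To address the problem that current video-text retrieval methods fail to jointly model temporal information and relevance information effectively, a video-text retrieval method based on a cross-modal attention mechanism is proposed. First, a pre-trained large-scale image-text model is used to extract embeddings of the text and the video frames, with knowledge transfer alleviating the heterogeneity between data of different modalities. Then, a joint text-frame cross-modal attention module simultaneously encodes the temporal information among video frames and the relevance information between video frames and text, capturing more competitive video feature representations. Finally, a cross-entropy loss function constrains model training. Comparative experiments show that the method effectively captures the temporal and relevance information of video frames and achieves competitive results on the MSR-VTT (Microsoft Research Video to Text) and LSMDC (Large Scale Movie Description Challenge) datasets.
Keywords: video-text retrieval; cross-modal; attention mechanism; knowledge transfer; video feature representation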
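The abstract above weights video frames by their relevance to the text. The following minimal Python sketch illustrates that general idea only; it is not the authors' module. It assumes plain dot-product scores and a softmax, and it ignores the temporal modeling the paper also performs. The function name and the `temperature` parameter are hypothetical.

```python
import math


def text_conditioned_pooling(text_vec, frame_vecs, temperature=1.0):
    """Pool frame embeddings into one video vector, weighting each frame
    by the softmax of its dot-product similarity to the text embedding."""
    scores = [
        sum(t * f for t, f in zip(text_vec, frame)) / temperature
        for frame in frame_vecs
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(text_vec)
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_vecs))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    # Toy 2-D embeddings: the first frame is aligned with the text, the second is not.
    pooled, weights = text_conditioned_pooling([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(weights, pooled)
```

A lower `temperature` sharpens the weights toward the single best-matching frame, while a higher value approaches plain mean pooling.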
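Pano Video: Camera Motion Modeling and a Method for Estimating Camera Motion Parameters from Video (cited by: 6)
15
Authors: 张茂军, 胡晓峰, 库锡树. 《中国图象图形学报(A辑)》 (CSCD), 1997, No. 8, pp. 623-628 (6 pages)
Camera motions such as translation, rotation, and zoom are modeled, and the motion model is combined with a method based on pixel brightness changes to estimate the camera motion parameters. The estimated parameters can be used to build a panorama from the video, and panoramas can be widely applied to video compression and retrieval. Experiments show that the method can be successfully applied to video compression and video retrieval in videoconferencing systems.
Keywords: camera; motion estimation; panorama; video compression; multimedia technology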
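A Survey of DETR-Based Video Moment Retrieval Methods
16
Authors: 高杜娟, 吴媛媛, 林文龙, 谢天圻, 嘉昊阳, 冯昭天. 《计算机工程与应用》 (PKU Core), 2026, No. 5, pp. 18-38 (21 pages)
Video moment retrieval aims to precisely localize specific segments in a video according to a natural-language query and is one of the important tasks in video understanding. Traditional methods rely on redundant candidate generation and hand-crafted feature design, making it hard to balance retrieval accuracy and computational efficiency. In recent years, end-to-end methods based on the Detection Transformer (DETR) have used learnable queries and direct regression prediction to simplify the framework while improving retrieval performance. This survey systematically reviews the key technical advances of DETR in video moment retrieval. It revisits the basic principles of the DETR model and its adaptations for this task, and classifies research on optimizing the DETR framework into three directions: feature enhancement based on input modeling, interaction-mechanism optimization based on cross-modal alignment, and decoder structure and moment regression mechanisms. Mainstream methods are systematically organized and compared in terms of retrieval accuracy. Drawing on experimental results, the influence of different optimization strategies on model performance is analyzed, and the performance differences of the methods on mainstream datasets are summarized. Finally, three future directions are discussed: generalization to real-world application scenarios, the move of cross-modal interaction toward semantic integration mechanisms, and extension to open-domain and personalized retrieval, providing theoretical reference and practical guidance for subsequent research.
Keywords: video moment retrieval; Detection Transformer (DETR); deep learning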
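Design and Implementation of the "Hyperdimensional New Vision" (超维新视界) Platform Based on Three.js
17
Authors: 夏燊, 余海洋, 刘敏娜, 高伟量, 倪艺轩, 冀利华. 《价值工程》, 2026, No. 4, pp. 165-168 (4 pages)
Against the backdrop of smart-campus construction in universities, providing new students with convenient and intuitive campus orientation services is of significant value. Focusing on navigation and online touring needs, this project builds a campus panoramic video platform that integrates panoramic display, route guidance, and an AI assistant. The system uses SpringBoot, Vue 3, and MySQL as its basic framework and uses Three.js to implement panoramic video rendering, scene switching, and multi-segment interactive playback, forming a "content + interaction, navigation + touring" service model. The platform introduces "咸师小匠", an AI bot trained with RAGFlow-based retrieval-augmentation technology, and builds a "scene material + official knowledge" resource system, which effectively reduces large-model hallucination in campus-domain question answering and improves answer accuracy. It also uses the Web Speech API for voice narration, enhancing the immersive experience. Practice shows that the platform effectively helps new students become familiar with the campus environment and has some value for wider adoption in digital orientation and smart-campus applications.
Keywords: panoramic video platform; smart campus; navigation service; retrieval-augmented generation; Three.js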
Video Segmentation by Acoustic Analysis
18
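Authors: Shilin Zhang, Mei Gu. 《通讯和计算机(中英文版)》, 2010, No. 10, pp. 33-36 (4 pages)
Keywords: video segmentation; acoustic analysis; television channels; silence detection; hierarchical structure; video recording; automatic segmentation; reuse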
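Data-Efficient Video Retrieval Based on Contrastive Learning (cited by: 1)
19
Authors: 凌非, 余京涛, 朱哲燕, 罗剑, 朱继祥, 陈先客, 董建锋. 《图学学报》 (PKU Core), 2025, No. 3, pp. 491-501 (11 pages)
The performance of video retrieval systems depends heavily on annotated data, and reducing reliance on costly manual annotation while improving performance is a key problem. To this end, a data-efficient video retrieval method based on contrastive learning is proposed, comprising two key optimization strategies. First, to build more diverse and effective training data, content-aware feature-level data augmentation is proposed, which uses a K-nearest-neighbor algorithm based on inter-frame similarity to capture deep semantic information and reduce dependence on annotated data. Second, a long-short dynamic sampling strategy is designed: by extracting long clips and their internal short clips from videos, it constructs positive pairs with multi-scale information for more effective contrastive learning, while dynamically adjusting the sampling length to improve data utilization. Experimental results on the SVD and UCF101 datasets show that the method significantly outperforms existing retrieval models. Extensive ablation experiments demonstrate that content-aware feature-level data augmentation improves model adaptability, and that long-short dynamic sampling is not only suitable for self-supervised learning but also improves the performance of semi-supervised models.
Keywords: contrastive learning; content awareness; feature augmentation; video retrieval; video representation learning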
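The long-short sampling idea in the abstract above (a long clip paired with a short clip taken from inside it) can be sketched roughly as follows. This Python sketch is an illustration under assumptions, not the authors' procedure. Clip lengths are fixed inputs here, whereas the paper adjusts them dynamically, and the function name and half-open frame ranges are hypothetical choices.

```python
import random


def sample_long_short(num_frames, long_len, short_len, rng=random):
    """Pick a long clip from the video and a short clip nested inside it.

    Returns two half-open frame ranges (start, end) that could serve as a
    positive pair for contrastive learning.
    """
    if not 0 < short_len <= long_len <= num_frames:
        raise ValueError("need 0 < short_len <= long_len <= num_frames")
    long_start = rng.randrange(num_frames - long_len + 1)
    # The short clip's offset is drawn inside the long clip, so it is always nested.
    short_start = long_start + rng.randrange(long_len - short_len + 1)
    return (
        (long_start, long_start + long_len),
        (short_start, short_start + short_len),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator so the sampling is reproducible
    long_clip, short_clip = sample_long_short(300, long_len=64, short_len=16, rng=rng)
    print(long_clip, short_clip)
```

Because the short clip always lies within the long one, the two clips share content at different temporal scales, which is the property the abstract relies on when forming positive pairs.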
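A Survey of Video-Text Retrieval Based on Joint Embedding Space (cited by: 1)
20
Authors: 董闯, 栗伟, 巴聪, 覃文军. 《中国图象图形学报》 (PKU Core), 2025, No. 5, pp. 1220-1237 (18 pages)
Video plays an important role in daily life. Faced with explosively growing video data, video-text retrieval gives users a convenient way to find information of interest. Video-text retrieval aims to use a text or video query entered by the user to retrieve the most relevant videos or texts from a video or text collection. This paper systematically organizes and reviews work on video-text retrieval based on a joint embedding space in order to understand the development of the field. First, existing work is classified and analyzed according to the four steps of joint-embedding-space video-text retrieval (video feature extraction, text feature extraction, video-text feature alignment, and the objective function), and the advantages and disadvantages of each type of method are described. Next, from an experimental perspective, benchmark datasets and evaluation metrics for video-text retrieval are presented, and the performance of typical models is compared on several commonly used datasets. Finally, the challenges and future directions of video-text retrieval are discussed.
Keywords: video-text retrieval (VTR); joint embedding space; feature extraction; feature alignment; multimodal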