Journal Articles
8 articles found
1. Video Summarization Approach Based on Binary Robust Invariant Scalable Keypoints and Bisecting K-Means
Authors: Sameh Zarif, Eman Morad, Khalid Amin, Abdullah Alharbi, Wail S. Elkilani, Shouze Tang. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3565-3583 (19 pages)
Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from long video series. The process of providing an abstract of an entire video that includes its most representative frames is known as static video summarization; it enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The method recognizes relevant frames by extracting BRISK keypoints and descriptors from video sequences. The frames' BRISK features are clustered using bisecting K-means, and each keyframe is determined by selecting the frame nearest to its cluster center. Rather than being supplied as a clustering parameter, the appropriate number of clusters is determined using the silhouette coefficient. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The proposed method's effectiveness is compared to that of existing methods using a variety of evaluation metrics, and it achieves a trade-off between computational cost and quality.
Keywords: BRISK; bisecting K-means; video summarization; keyframe extraction; shot detection
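A minimal sketch of the pipeline this abstract outlines, using OpenCV's BRISK detector and scikit-learn's BisectingKMeans: descriptors are mean-pooled per sampled frame, the cluster count is chosen by silhouette score, and the frame nearest each centroid becomes a keyframe. The sampling step, the pooling scheme, and the cluster-count search range are assumptions for illustration, not details taken from the paper.

```python
import cv2
import numpy as np
from sklearn.cluster import BisectingKMeans
from sklearn.metrics import silhouette_score

def extract_brisk_features(video_path, step=15):
    """Sample every `step`-th frame (assumed rate) and mean-pool its BRISK descriptors."""
    brisk = cv2.BRISK_create()
    cap = cv2.VideoCapture(video_path)
    feats, frame_ids, idx = [], [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            _, desc = brisk.detectAndCompute(gray, None)
            if desc is not None:
                feats.append(desc.mean(axis=0))  # one pooled vector per frame
                frame_ids.append(idx)
        idx += 1
    cap.release()
    return np.array(feats), frame_ids

def select_keyframes(feats, frame_ids, k_range=range(2, 11)):
    """Choose k by silhouette coefficient, then pick the frame nearest each centroid."""
    best_k = max(k_range, key=lambda k: silhouette_score(
        feats, BisectingKMeans(n_clusters=k, random_state=0).fit_predict(feats)))
    km = BisectingKMeans(n_clusters=best_k, random_state=0).fit(feats)
    keyframes = []
    for c in range(best_k):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        keyframes.append(frame_ids[members[np.argmin(dists)]])
    return sorted(keyframes)
```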
2. Adaptive Graph Convolutional Adjacency Matrix Network for Video Summarization
Authors: Jing Zhang, Guangli Wu, Shanshan Song. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 1947-1965 (19 pages)
Video summarization aims to select key frames or key shots to create summaries for fast retrieval, compression, and efficient browsing of videos. Graph neural networks efficiently capture information about graph nodes and their neighbors, but ignore the dynamic dependencies between nodes. To address this challenge, we propose an innovative Adaptive Graph Convolutional Adjacency Matrix Network (TAMGCN), leveraging the attention mechanism to dynamically adjust dependencies between graph nodes. Specifically, we first segment shots and extract features of each frame, then compute the representative features of each shot. Subsequently, we utilize the attention mechanism to dynamically adjust the adjacency matrix of the graph convolutional network to better capture the dynamic dependencies between graph nodes. Finally, we fuse temporal features extracted by a Bi-directional Long Short-Term Memory network with structural features extracted by the graph convolutional network to generate high-quality summaries. Extensive experiments are conducted on two benchmark datasets, TVSum and SumMe, yielding F1-scores of 60.8% and 53.2%, respectively. Experimental results demonstrate that our method outperforms most state-of-the-art video summarization techniques.
Keywords: attention mechanism; deep learning; graph neural network; key-shot; video summarization
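The central idea, letting attention scores act as a dynamic adjacency matrix for graph convolution, can be sketched in a few lines of PyTorch. The single attention head and layer sizes below are illustrative assumptions, not the TAMGCN architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionAdjacencyGCN(nn.Module):
    """One graph-convolution step whose adjacency matrix is computed by attention."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.transform = nn.Linear(dim, dim)

    def forward(self, shots):  # shots: (num_shots, dim) representative shot features
        # Scaled dot-product attention yields an input-dependent, soft adjacency.
        scores = self.query(shots) @ self.key(shots).T / shots.shape[-1] ** 0.5
        adj = F.softmax(scores, dim=-1)
        # Aggregate neighbors under the learned graph, then transform.
        return F.relu(self.transform(adj @ shots))
```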
3. Effective Video Summarization Approach Based on Visual Attention
Authors: Hilal Ahmad, Habib Ullah Khan, Sikandar Ali, Syed Ijaz Ur Rahman, Fazli Wahid, Hizbullah Khattak. Computers, Materials & Continua (SCIE, EI), 2022, No. 4, pp. 1427-1442 (16 pages)
Video summarization is applied to reduce redundancy and develop a concise representation of the key frames in a video. More recently, video summaries have been produced through visual attention modeling: frames that stand out visually are extracted as key frames based on human attention modeling theories. Such schemes have proven effective for video summaries; nevertheless, their high computational cost restricts their usability in everyday situations. In this context, we propose a key frame extraction (KFE) method built on an efficient and accurate visual attention model. The computational effort is minimized by utilizing dynamic visual highlighting based on the temporal gradient instead of traditional optical flow techniques. In addition, an efficient technique using the discrete cosine transform is utilized for static visual salience. The dynamic and static visual attention metrics are merged by means of a non-linear weighted fusion technique. Results of the system are compared with existing state-of-the-art techniques for accuracy. The experimental results of our proposed model indicate its efficiency and the high quality of the extracted key frames.
Keywords: KFE; video summarization; visual saliency; visual attention model
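The two attention cues the abstract names can be approximated cheaply, as in the sketch below: a temporal gradient in place of optical flow for the dynamic cue, and a DCT image-signature energy for the static cue. The fusion form and its exponent are placeholders; the paper's exact non-linear weighting is not reproduced here.

```python
import cv2
import numpy as np

SIZE = (64, 64)  # OpenCV's DCT needs even dimensions; downsampling also cuts cost

def dynamic_saliency(prev_gray, gray):
    """Temporal gradient: mean absolute intensity change between consecutive frames."""
    return np.abs(gray.astype(np.float32) - prev_gray.astype(np.float32)).mean()

def static_saliency(gray):
    """DCT image signature: keep only coefficient signs, invert, and measure energy."""
    small = cv2.resize(gray, SIZE).astype(np.float32)
    recon = cv2.idct(np.sign(cv2.dct(small)))
    return (recon ** 2).mean()

def fused_attention(dyn, sta, alpha=0.6):
    """Non-linear weighted fusion; this particular functional form is an assumption."""
    return (dyn ** alpha) * (sta ** (1.0 - alpha))
```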
4. An Efficient Method for Underwater Video Summarization and Object Detection Using YoLoV3
Authors: Mubashir Javaid, Muazzam Maqsood, Farhan Aadil, Jibran Safdar, Yongsung Kim. Intelligent Automation & Soft Computing (SCIE), 2023, No. 2, pp. 1295-1310 (16 pages)
Currently, worldwide industries and communities are concerned with building, expanding, and exploring the assets and resources found in the oceans and seas. More precisely, for stock assessment, archaeology, and surveillance, several cameras are installed undersea to collect videos. These large videos, however, require a lot of time and memory to process for relevant information. Hence, to automate this manual procedure of video assessment, an accurate and efficient automated system is a great necessity. From this perspective, we present a complete framework for video summarization and object detection in underwater videos. We employ a perceived motion energy (PME) method to first extract the keyframes, followed by an object detection model, namely YoloV3, to detect objects in underwater videos. The issues of blurriness and low contrast in underwater images are also addressed by applying an image enhancement method. The suggested framework has been evaluated on a publicly available brackish dataset. The proposed framework shows good performance and can ultimately assist marine researchers and scientists working in underwater archaeology, stock assessment, and surveillance.
Keywords: computer vision; deep learning; digital image processing; underwater video analysis; video summarization; object detection; YOLOV3
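An outline of how the stages described here might be wired together with standard OpenCV calls: CLAHE enhancement for low-contrast underwater frames, a frame-difference proxy for perceived motion energy, and YoloV3 inference via the dnn module. The cfg/weights file names are hypothetical placeholders, and the motion-energy proxy is a simplification of PME, not the paper's formulation.

```python
import cv2

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
# Hypothetical file names; any YOLOv3 Darknet cfg/weights pair would do here.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")

def enhance(frame):
    """Equalize only the luminance channel to counter low underwater contrast."""
    lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
    lab[:, :, 0] = clahe.apply(lab[:, :, 0])
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

def motion_energy(prev_frame, frame):
    """Crude stand-in for perceived motion energy: mean absolute frame difference."""
    return cv2.absdiff(prev_frame, frame).mean()

def detect(frame):
    """Run YOLOv3 on one (enhanced) keyframe and return the raw output tensors."""
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    return net.forward(net.getUnconnectedOutLayersNames())
```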
5. Video summarization via global feature difference optimization
Authors: ZHANG Yunzuo, LIU Yameng. Optoelectronics Letters (EI), 2023, No. 9, pp. 570-576 (7 pages)
Video summarization aims at selecting valuable clips for browsing videos with high efficiency. Previous approaches typically focus on aggregating temporal features while ignoring the potential role of visual representations in summarizing videos. In this paper, we present a global difference-aware network (GDANet) that exploits feature differences across frames and across the whole video as guidance to enhance visual features. Initially, a difference optimization module (DOM) is devised to enhance the discriminability of visual features, bringing gains in accurately aggregating temporal cues. Subsequently, a dual-scale attention module (DSAM) is introduced to capture informative contextual information. Finally, we design an adaptive feature fusion module (AFFM) so that the network adaptively learns context representations and performs feature fusion effectively. We have conducted experiments on benchmark datasets, and the empirical results demonstrate the effectiveness of the proposed framework.
Keywords: video summarization; global feature difference optimization
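The difference-as-guidance idea behind the DOM can be caricatured in PyTorch as below: each frame feature is gated by its deviation from its neighbor and from the video-level mean. The sigmoid gate is an assumed simplification for illustration, not the module from the paper.

```python
import torch
import torch.nn as nn

class DifferenceGuidance(nn.Module):
    """Gate frame features by frame-level and video-level feature differences."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, x):  # x: (num_frames, dim) per-frame visual features
        frame_diff = x - torch.roll(x, shifts=1, dims=0)  # difference across frames
        video_diff = x - x.mean(dim=0, keepdim=True)      # difference from the video mean
        g = self.gate(torch.cat([frame_diff, video_diff], dim=-1))
        return x * g  # difference-aware enhancement of the visual features
```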
6. Video summarization with a graph convolutional attention network (cited by 3)
Authors: Ping LI, Chao TANG, Xianghua XU. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2021, No. 6, pp. 902-913 (12 pages)
Video summarization has established itself as a fundamental technique for generating compact and concise video, which alleviates managing and browsing large-scale video data. Existing methods fail to fully consider the local and global relations among frames of a video, leading to deteriorated summarization performance. To address this problem, we propose a graph convolutional attention network (GCAN) for video summarization. GCAN consists of two parts, embedding learning and context fusion, where embedding learning includes a temporal branch and a graph branch. In particular, GCAN uses dilated temporal convolution to model local cues and temporal self-attention to exploit global cues for video frames. It learns graph embeddings via a multi-layer graph convolutional network to reveal the intrinsic structure of frame samples. The context fusion part combines the output streams from the temporal branch and graph branch to create context-aware representations of frames, on which importance scores are evaluated for selecting representative frames to generate the video summary. Experiments are carried out on two benchmark databases, SumMe and TVSum, showing that the proposed GCAN approach enjoys superior performance compared to several state-of-the-art alternatives in three evaluation settings.
Keywords: temporal learning; self-attention mechanism; graph convolutional network; context fusion; video summarization
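A compact sketch of GCAN's two-branch layout: dilated temporal convolution for local cues, one graph-convolution step over a similarity graph for structure, and a fusion head that scores frame importance. The cosine-similarity graph, single layers, and dimensions are assumptions standing in for the paper's multi-layer design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchScorer(nn.Module):
    """Temporal branch + graph branch, fused into per-frame importance scores."""
    def __init__(self, dim):
        super().__init__()
        # Dilated temporal convolution models local cues over the frame sequence.
        self.temporal = nn.Conv1d(dim, dim, kernel_size=3, dilation=2, padding=2)
        self.graph = nn.Linear(dim, dim)
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, x):  # x: (num_frames, dim) per-frame features
        t = self.temporal(x.T.unsqueeze(0)).squeeze(0).T  # local temporal cues
        # A soft similarity graph over frames (assumed; the paper learns embeddings).
        adj = F.softmax(F.cosine_similarity(
            x.unsqueeze(1), x.unsqueeze(0), dim=-1), dim=-1)
        g = F.relu(self.graph(adj @ x))                   # one graph-convolution step
        return torch.sigmoid(self.score(torch.cat([t, g], dim=-1))).squeeze(-1)
```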
7. Shot classification and replay detection for sports video summarization (cited by 1)
Authors: Ali JAVED, Amen ALI KHAN. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2022, No. 5, pp. 790-800 (11 pages)
Automated sports video summarization is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective video summarization framework based on shot classification and replay detection for field sports videos. Accurate shot classification is mandatory to better structure the input video for further processing, i.e., key event or replay detection. Therefore, we present a lightweight convolutional neural network based method for shot classification. We then analyze each shot for replay detection, specifically detecting the successive batches of logo transition frames that identify replay segments in sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train an extreme learning machine to classify frames as replay or non-replay. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.
Keywords: extreme learning machine; lightweight convolutional neural network; local octa-patterns; shot classification; replay detection; video summarization
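Of the pieces this abstract names, the extreme learning machine is the simplest to show; the sketch below trains one in closed form on precomputed frame features. The hidden width is arbitrary, and the local octa-pattern extraction step is omitted.

```python
import numpy as np

class ELM:
    """Minimal extreme learning machine: random hidden layer, closed-form output weights."""
    def __init__(self, in_dim, hidden=512, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((in_dim, hidden))  # fixed random projection
        self.b = rng.standard_normal(hidden)
        self.beta = None

    def fit(self, X, y):  # y in {-1, +1}: replay vs. non-replay frame labels
        H = np.tanh(X @ self.W + self.b)
        self.beta = np.linalg.pinv(H) @ y  # least-squares output weights, no backprop
        return self

    def predict(self, X):
        return np.sign(np.tanh(X @ self.W + self.b) @ self.beta)
```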
8. Technical features in the Portal to CADAL
Authors: 吴江琴, 庄越挺, 潘云鹤. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2005, No. 11, pp. 1249-1257 (9 pages)
The China-America Digital Academic Library Project (CADAL) is a collaborative project between universities and institutes in China and the USA, which aims to provide universal access to large-scale digital resources and to explore ways of applying multimedia and virtual reality technologies to digital libraries. The distinct characteristic of the resources in CADAL is that they contain not only one million digital books in different languages, but also terabytes of multimedia resources (images, video, and so on) used for education and research purposes. The Portal to CADAL must therefore provide both the traditional services of browsing and searching digital books and services for quickly retrieving and structurally browsing multimedia documents, along with visual presentation of retrieved results. In this paper, the underlying novel multimedia retrieval methods and visualization techniques used in the CADAL portal are investigated.
Keywords: CADAL; video summarization; multimedia retrieval; Chinese calligraphy; 3-D visualization