Convolutional BiLSTM Variational Sequence-To-Sequence Based Video Captioning for Capturing Intricate Temporal Dependencies

Abstract: In video understanding, the demand for accurate and contextually rich video captioning has surged with the increasing volume and complexity of multimedia content. This research introduces a video captioning approach built on a Convolutional Bidirectional Long Short-Term Memory (BiLSTM) based Variational Sequence-to-Sequence (CBVSS) model. The proposed framework captures intricate temporal dependencies within video sequences, enabling more nuanced and contextually relevant descriptions of dynamic scenes. However, optimizing its parameters for improved performance remains a crucial challenge. In response, this research uses Golden Eagle Optimization (GEO), a metaheuristic optimization technique, to fine-tune the parameters of the Convolutional BiLSTM variational sequence-to-sequence model. Applying GEO aims to enhance the ability of CBVSS to produce more exact and contextually rich video captions. The proposed method attains a higher overall Recall of 59.75% and Precision of 63.78% on both datasets. Additionally, the proposed CBVSS method demonstrated superior performance across both datasets, achieving the highest METEOR (25.67) and CIDEr (39.87) scores on the ActivityNet dataset, and further outperforming all compared models on the YouCook2 dataset with METEOR (28.67) and CIDEr (43.02), highlighting its effectiveness in generating semantically rich and contextually accurate video captions.
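The abstract reports Recall and Precision figures but does not say how they are computed. One common convention for comparing a generated caption against a reference is clipped unigram overlap. The sketch below assumes that convention, and the function name `unigram_precision_recall` is hypothetical; it may not match the paper's evaluation protocol.

```python
from collections import Counter


def unigram_precision_recall(candidate: str, reference: str) -> tuple[float, float]:
    """Clipped unigram overlap between one candidate and one reference caption.

    Assumption: whitespace tokenization and lowercasing; the paper's actual
    evaluation procedure is not described in the abstract.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Counter intersection keeps the minimum count per token, so a word
    # repeated in the candidate is credited at most as often as it
    # appears in the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    return precision, recall


if __name__ == "__main__":
    p, r = unigram_precision_recall(
        "a man is cooking pasta",
        "a man is cooking pasta in a kitchen",
    )
    print(f"precision={p:.3f} recall={r:.3f}")
```

Under this convention, precision penalizes words the model adds that are absent from the reference, while recall penalizes reference words the model omits.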

Authors: M. Gowri Shankar, D. Surendran

Affiliations: Department of Computer Science and Engineering; Department of Information Technology

Source: Journal of Bionic Engineering (仿生工程学报(英文版)), 2025, No. 5, pp. 2700-2716 (17 pages)

Keywords: Video captioning; Convolutional BiLSTM; Variational sequence-to-sequence model; Golden eagle optimization; Intricate temporal dependencies

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

