Funding: Shanxi Bethune Hospital Teaching Reform Project (2022JX06); Shanxi Provincial College Teaching Reform and Innovation Project (J20230467).
Abstract: Objective: To study the effect of applying short video combined with the BOPPPS teaching mode in clinical anesthesia practice. Method: Forty-eight students assigned to clinical anesthesia in digestive endoscopy at Shanxi Bethune Hospital from July 1, 2022 to April 1, 2023 were selected as research subjects. They were randomly divided into a control group (PowerPoint presentation teaching) and an observation group (short video combined with BOPPPS teaching), with 24 students in each group. After the internship, the students' theoretical and technical scores were tested, the effects of the two teaching modes were compared, and the students' satisfaction was surveyed. Results: The test scores of students in the observation group were significantly better than those in the control group (P<0.05). The short video combined with BOPPPS teaching mode significantly improved students' learning interest, operation skills, and memory (P<0.05). Student satisfaction in the observation group was higher than in the control group (P<0.05). Conclusion: In clinical practice, short video combined with the BOPPPS teaching mode proved effective and is worth further promotion and research.
Funding: Supported by the Innovation Project of Graduate Students of Jiangsu Province, China under Grants No. CXZZ12_0466, No. CXZZ11_0390; the National Natural Science Foundation of China under Grants No. 61071091, No. 61271240, No. 61201160, No. 61172118; the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province, China under Grant No. 12KJB510019; the Science and Technology Research Program of Hubei Provincial Department of Education under Grants No. D20121408, No. D20121402; and the Program for Research Innovation of Nanjing Institute of Technology Project under Grant No. CKJ20110006.
Abstract: In Distributed Compressed Video Sensing (DCVS), video reconstruction quality largely depends on how well the employed sparse domain represents the underlying video. In this paper, we propose a novel dynamic global Principal Component Analysis (PCA) sparse representation algorithm for video, based on the sparse-land model and nonlocal similarity. First, grouping by matching is performed at the decoder using previously recovered key frames. Second, we apply PCA to each group (sub-dataset) to compute the principal components, from which the sub-dictionary is constructed. Finally, the non-key frames are reconstructed from random measurement data using a Compressed Sensing (CS) reconstruction algorithm with sparse regularization. Experimental results show that our algorithm performs better than the DCT and K-SVD dictionaries.
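The second step above (PCA on one group of matched patches to obtain a sub-dictionary) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_subdictionary`, the patch size, the group size, and the number of retained atoms are all assumptions chosen for illustration, and random data stands in for patches taken from recovered key frames.

```python
import numpy as np


def pca_subdictionary(patches: np.ndarray, n_atoms: int) -> np.ndarray:
    """Build an orthonormal sub-dictionary from one group of similar patches.

    patches: array of shape (n_patches, patch_dim), one vectorized patch per row.
    Returns an array of shape (patch_dim, n_atoms) whose columns are the
    leading principal directions of the group (hypothetical helper).
    """
    # Center the group so the SVD yields principal components.
    centered = patches - patches.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_atoms].T


# Stand-in data: 40 vectorized 4x4 patches (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
group = rng.normal(size=(40, 16))

D = pca_subdictionary(group, n_atoms=8)

# Represent one centered patch in the sub-dictionary and project it back.
patch = group[0] - group.mean(axis=0)
coeffs = D.T @ patch
approx = D @ coeffs
```

Because the columns of `D` are orthonormal, the coefficients are a plain projection; a CS reconstruction with sparse regularization, as the abstract describes, would replace that projection with a solver working from random measurements.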
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60832003); the Key Laboratory of Advanced Display and System Application (Shanghai University), Ministry of Education, China (Grant No. P200902); and the Key Project of Science and Technology Commission of Shanghai Municipality (Grant No. 10510500500).
Abstract: Depth maps are used for virtual view synthesis in free-viewpoint television (FTV) systems. When depth maps are derived using existing depth estimation methods, depth distortions cause undesirable artifacts in the synthesized views. To solve this problem, a 3D video quality model based on depth maps (D-3DV) is proposed for virtual view synthesis and depth map coding in FTV applications. First, the relationships between distortions in the coded depth map and the rendered view are derived. Then, a precise 3DV quality model based on depth characteristics is developed for the synthesized virtual views. Finally, based on the D-3DV model, multilateral filtering is applied as a pre-processing filter to reduce rendering artifacts. Experimental results, evaluated by objective and subjective methods, indicate that the proposed D-3DV model can reduce the bit rate of depth coding and achieve better rendering quality.
Funding: Funded in part by the Major Projects of the National Social Science Fund of China (16ZDA054); the Postgraduate Research & Practice Innovation Program of Jiangsu Province of China (No. KYCX18_0999); and the Engineering Research Center for Software Testing and Evaluation of Fujian Province of China (ST2018004).
Abstract: With the proliferation of the internet, big data continues to grow exponentially, and video has become its largest source. Video big data introduces many technological challenges, including compression, storage, transmission, analysis, and recognition. The growing number of multimedia resources has created an urgent need for intelligent methods to organize and process them. Integrating Semantic Link Networks with multimedia resources offers a new prospect for organizing them by their semantics. The tags and surrounding texts of multimedia resources are used to measure their semantic association. Two evaluation methods, clustering and retrieval, are used to measure the semantic relatedness between images accurately and robustly. A Fuzzy Rule-Based Model for Semantic Content Extraction is designed, which performs classification with fuzzy rules. The extracted features are used to train a neural network with several layers, in which each layer of neurons is dedicated to measuring the weight toward different semantic events. Each neuron measures its weight according to features such as shape, size, direction, and speed. The object is identified by subtracting the background features, and the model is trained to detect it based on features such as size, shape, and direction. The weight measurement is performed according to the fuzzy rules and is based on the weight measures. These frameworks enhance video analytics and help video surveillance systems achieve better accuracy and precision.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 62272184 and 62402189); the China Postdoctoral Science Foundation (Nos. 2024M751012, 2025T180429, and GZC20230894); and the Postdoctor Project of Hubei Province (No. 2024HBBHCXB014).
Abstract: Video large language models (video-LLMs) have demonstrated impressive capabilities in multimodal understanding, but their potential as zero-shot evaluators for temporal consistency in video captions remains underexplored. Existing methods notably underperform in detecting critical temporal errors, such as missing, hallucinated, or misordered actions. To address this gap, we introduce two key contributions. (1) TimeJudge: a novel zero-shot framework that recasts temporal error detection as answering calibrated binary question pairs. It incorporates modality-sensitive confidence calibration and uses consistency-weighted voting for robust prediction aggregation. (2) TEDBench: a rigorously constructed benchmark featuring videos across four distinct complexity levels, specifically designed with fine-grained temporal error annotations to evaluate video-LLM performance on this task. Through a comprehensive evaluation of multiple state-of-the-art video-LLMs on TEDBench, we demonstrate that TimeJudge consistently yields substantial gains in terms of recall and F1-score without requiring any task-specific fine-tuning. Our approach provides a generalizable, scalable, and training-free solution for enhancing the temporal error detection capabilities of video-LLMs.
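The abstract does not give the formulas behind "binary question pairs" and "consistency-weighted voting", so the following is only one plausible reading, not the TimeJudge method itself. It assumes each pair consists of a direct question and its negation, that the model returns a probability of "yes" for each, and that a pair's vote is weighted by how well its two answers complement each other. The names `PairAnswer`, `pair_vote`, and `detect_error`, and the 0.5 threshold, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PairAnswer:
    """Model outputs for one binary question pair (assumed structure)."""
    p_yes_direct: float   # P("yes") for e.g. "Does the caption contain a temporal error?"
    p_yes_negated: float  # P("yes") for the negated form of the same question


def pair_vote(answer: PairAnswer) -> tuple[float, float]:
    """Return (error_score, consistency_weight) for one pair."""
    # A "yes" to the direct question and a "no" to the negated one both
    # count as evidence of an error; average the two.
    score = 0.5 * (answer.p_yes_direct + (1.0 - answer.p_yes_negated))
    # Consistent answers sum to 1; the weight shrinks as they drift apart.
    consistency = 1.0 - abs(answer.p_yes_direct + answer.p_yes_negated - 1.0)
    return score, consistency


def detect_error(answers: list[PairAnswer], threshold: float = 0.5) -> bool:
    """Aggregate pair votes, weighting each by its consistency."""
    votes = [pair_vote(a) for a in answers]
    total_weight = sum(weight for _, weight in votes)
    if total_weight == 0:
        return False  # no usable evidence; assumption: default to "no error"
    weighted = sum(score * weight for score, weight in votes) / total_weight
    return weighted > threshold


# Usage: one confident consistent pair, one contradictory pair, one mild dissent.
example = [PairAnswer(0.9, 0.1), PairAnswer(0.8, 0.8), PairAnswer(0.2, 0.7)]
flagged = detect_error(example)
```

In this sketch the contradictory pair (0.8, 0.8) contributes little because its consistency weight is low, which is the intuition a consistency-weighted vote is meant to capture.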
Abstract: Earthquakes pose a significant threat to life and property worldwide. Rapid and accurate assessment of earthquake damage is crucial for effective disaster response efforts. This study investigates the feasibility of employing deep learning models for damage detection using drone imagery. We explore the adaptation of models like VGG16 for object detection through transfer learning and compare their performance to established object detection architectures like YOLOv8 (You Only Look Once) and Detectron2. Our evaluation, based on various metrics including mAP, mAP50, and recall, demonstrates the superior performance of YOLOv8 in detecting damaged buildings within drone imagery, particularly for cases with moderate bounding box overlap. This finding suggests its potential suitability for real-world applications due to the balance between accuracy and efficiency. Furthermore, to enhance real-world feasibility, we explore two strategies for enabling the simultaneous operation of multiple deep learning models for video processing: frame splitting and threading. In addition, we optimize model size and computational complexity to facilitate real-time processing on resource-constrained platforms, such as drones. This work contributes to the field of earthquake damage detection by (1) demonstrating the effectiveness of deep learning models, including adapted architectures, for damage detection from drone imagery, (2) highlighting the importance of evaluation metrics like mAP50 for tasks with moderate bounding box overlap requirements, and (3) proposing methods for ensemble model processing and model optimization to enhance real-world feasibility. The potential for real-time damage assessment using drone-based deep learning models offers significant advantages for disaster response by enabling rapid information gathering to support resource allocation, rescue efforts, and recovery operations in the aftermath of earthquakes.
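The "moderate bounding box overlap" behind mAP50 refers to an intersection-over-union (IoU) threshold of 0.5: a predicted box counts as a match when its IoU with a ground-truth box is at least 0.5. A minimal sketch of that check follows; the `(x1, y1, x2, y2)` corner format and the helper names `iou` and `is_match_at_50` are assumptions for illustration, not taken from the study.

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box) -> bool:
    """True when the prediction would count as a match under mAP50."""
    return iou(pred, truth) >= 0.5


# Usage: a prediction shifted by 2 units still matches; shifted by 5 it does not.
truth_box = (0.0, 0.0, 10.0, 10.0)
close_pred = (2.0, 0.0, 12.0, 10.0)   # IoU = 80 / 120
far_pred = (5.0, 0.0, 15.0, 10.0)     # IoU = 50 / 150
```

The stricter mAP figure, as commonly reported by detection toolkits, averages over several IoU thresholds above 0.5, which is why a detector can rank differently on mAP and mAP50.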