Funding: the National Social Science Foundation of China (No. 21BTQ106), the Natural Science Foundation of Beijing (No. 7222187), and the Key Project of Innovation Cultivation Fund of the Seventh Medical Center of PLA General Hospital (No. qzx-2023-1).
Abstract: Behavioral scoring based on clinical observations remains the gold standard for screening, diagnosing, and evaluating infantile epileptic spasm syndrome (IESS). The accurate identification of seizures is crucial for clinical diagnosis and assessment. In this study, we propose an innovative seizure detection method based on video feature recognition of patient spasms. To capture the temporal characteristics of the spasm behavior presented in the videos effectively, we incorporate asymmetric convolutions and convolution–batch normalization–ReLU (CBR) modules. Specifically, within the 3D-ResNet residual blocks, we split the larger convolutional kernels into two asymmetric 3D convolutional kernels. These kernels are connected in series to enhance the ability of the convolutional layers to extract key local features, both horizontally and vertically. In addition, we introduce a 3D convolutional block attention module to enhance the spatial correlations between video frame channels efficiently. To improve the generalization ability, we design a composite loss function that combines cross-entropy loss with triplet loss to balance the classification and similarity requirements. We train and evaluate our method using the PLA IESS-VIDEO dataset, achieving an average seizure recognition accuracy of 90.59%, precision of 90.94%, and recall of 87.64%. To validate its generalization capability further, we conducted external validation using six different patient monitoring videos and compared the results with assessments by six human experts from various medical centers. The final test results demonstrate that our method achieved a recall of 0.6476, surpassing the average level achieved by human experts (0.5595), while attaining a high F1-score of 0.7219. These findings have substantial significance for the long-term assessment of patients with IESS.
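The composite loss mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting or the margin, so the `weight` and `margin` parameters, the function names, and the simple weighted-sum form below are assumptions made for illustration, not the authors' implementation.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stabilised log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the positive closer than the negative.

    The margin value is an assumption; the abstract does not state one.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def composite_loss(logits, label, anchor, positive, negative, weight=0.5, margin=1.0):
    """Classification term plus a weighted similarity term.

    `weight` is a hypothetical balancing factor between the two terms.
    """
    return cross_entropy(logits, label) + weight * triplet_loss(
        anchor, positive, negative, margin
    )
```

When the negative embedding is already farther from the anchor than the positive by at least the margin, the triplet term is zero and only the classification term contributes.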
Funding: We acknowledge funding from the National Natural Science Foundation of China under Grant Nos. 62101213 and 62103165, and the Shandong Provincial Natural Science Foundation under Grant Nos. ZR2020QF107, ZR2020MF137, and ZR2021QF043.
Abstract: Video-based person re-identification (Re-ID), a subset of retrieval tasks, faces challenges like uncoordinated sample capturing, viewpoint variations, occlusions, cluttered backgrounds, and sequence uncertainties. Recent advancements in deep learning have significantly improved video-based person Re-ID, laying a solid foundation for further progress in the field. In order to enrich researchers' insights into the latest research findings and prospective developments, we offer an extensive overview and meticulous analysis of contemporary video-based person Re-ID methodologies, with a specific emphasis on network architecture design and loss function design. First, we introduce methods based on network architecture design and loss function design from multiple perspectives, and analyze the advantages and disadvantages of these methods. Furthermore, we provide a synthesis of prevalent datasets and key evaluation metrics utilized within this field to assist researchers in assessing methodological efficacy and establishing benchmarks for performance evaluation. Lastly, through a critical evaluation of the experimental outcomes derived from various methodologies across four prominent public datasets, we identify promising research avenues and offer valuable insights to steer future exploration and innovation in this vibrant and evolving field of video-based person Re-ID. This comprehensive analysis aims to equip researchers with the necessary knowledge and strategic foresight to navigate the complexities of video-based person Re-ID, fostering continued progress and breakthroughs in this challenging yet promising research domain.
Funding: Supported by the National Key Research and Development Program of China (2017YFA0105203, 2017YFA0105201), the National Science Foundation of China (31771076, 81925011), the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) (XDB32040201), the Beijing Academy of Artificial Intelligence, and the Key-Area Research and Development Program of Guangdong Province (2019B030335001).
Abstract: Behavioral analysis of macaques provides important experimental evidence in the field of neuroscience. In recent years, video-based automatic animal behavior analysis has received widespread attention. However, methods capable of extracting and analyzing daily movement trajectories of macaques in their daily living cages remain underdeveloped, with previous approaches usually requiring specific environments to reduce interference from occlusion or environmental change. Here, we introduce a novel method, called MonkeyTrail, which satisfies the above requirements by frequently generating virtual empty backgrounds and using background subtraction to accurately obtain the foreground of moving animals. The empty background is generated by combining the frame difference method (FDM) and a deep learning-based model (YOLOv5). The entire setup can be operated with low-cost hardware and can be applied to the daily living environments of individually caged macaques. To test MonkeyTrail performance, we labeled a dataset containing >8000 video frames with the bounding boxes of macaques under various conditions as ground truth. Results showed that the tracking accuracy and stability of MonkeyTrail exceeded those of two deep learning-based methods (YOLOv5 and Single-Shot MultiBox Detector), the traditional frame difference method, and the naïve background subtraction method. Using MonkeyTrail to analyze long-term surveillance video recordings, we successfully assessed changes in animal behavior in terms of movement amount and spatial preference. Thus, these findings demonstrate that MonkeyTrail enables low-cost, large-scale daily behavioral analysis of macaques.
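The general idea of background subtraction against a regularly refreshed "empty" background can be sketched as follows. This is only an illustration under stated assumptions: frames are small grayscale images stored as nested lists, the animal's box is passed in by the caller (in the paper it would come from FDM combined with YOLOv5), and the function names, the threshold, and the refresh rule are hypothetical rather than MonkeyTrail's actual procedure.

```python
def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds a threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the foreground, or None if it is empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, flag in enumerate(row) if flag]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


def update_background(background, frame, box):
    """Refresh the background everywhere except inside the animal's box.

    Assumed rule: pixels outside the box are copied from the current frame,
    pixels inside the box keep their previous background value.
    """
    x_min, y_min, x_max, y_max = box
    return [
        [
            bg_px if (x_min <= x <= x_max and y_min <= y <= y_max) else fr_px
            for x, (bg_px, fr_px) in enumerate(zip(bg_row, fr_row))
        ]
        for y, (bg_row, fr_row) in enumerate(zip(background, frame))
    ]
```

Refreshing only the region outside the box lets the background follow gradual lighting changes without absorbing the animal itself into the background model.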
Funding: Supported by the National Natural Science Foundation of China (10772050).
Abstract: A new video-based measurement is proposed to collect and investigate traffic flow parameters. The output of the measurement is velocity-headway distance data pairs. Because density can be obtained directly as the reciprocal of headway distance, the data pairs have the advantage of better simultaneity than those from common detectors. To date, over 33 000 pairs of data have been collected from two road sections in the cities of Shanghai and Zhengzhou. By analyzing the video files recording traffic movements on urban expressways, the following issues are studied: laws of vehicle velocity changing with headway distance, proportions of different driving behaviors in the traffic system, and characteristics of traffic flow on snowy days. The results show that real road traffic is very complex, and factors such as location and climate need to be taken into consideration in the formulation of traffic flow models.
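The reciprocal relation between headway distance and density can be made concrete with a short sketch. The units (headway in metres, velocity in km/h), the function names, and the sample values are assumptions for illustration. The flow formula is the standard relation flow = density × velocity, not a result taken from the paper.

```python
def density_veh_per_km(headway_m):
    """Single-lane density as the reciprocal of headway distance (vehicles per km)."""
    if headway_m <= 0:
        raise ValueError("headway distance must be positive")
    return 1000.0 / headway_m


def flow_veh_per_hour(velocity_kmh, headway_m):
    """Flow from one velocity-headway pair via flow = density * velocity."""
    return velocity_kmh * density_veh_per_km(headway_m)


# Hypothetical velocity-headway pairs (km/h, m), not data from the study.
pairs = [(60.0, 25.0), (30.0, 12.5)]
flows = [flow_veh_per_hour(v, h) for v, h in pairs]
```

Because both quantities in a pair come from the same video frame, the derived density and flow refer to the same instant, which is the simultaneity advantage the abstract describes.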
Abstract: Many individuals with autism spectrum disorder (ASD) experience delays in the development of social and communication skills, which can limit their opportunities in higher education and employment, resulting in an overall negative impact on their quality of life. This systematic review identifies 15 studies that explored the effectiveness of Video-Based Interventions (VBIs) for those with ASD during the critical years of adolescence and young adulthood. The 15 studies described herein found VBIs to be an effective intervention for this population for the improvement of their vocational, daily living, and academic skills. In addition, VBIs allow for the maintenance and generalization of the different target behaviors that were examined. The majority of the studies located by this review also investigated the social validity of the intervention method with participants and caregivers and found these VBIs to have high social validity. Although a few studies that implemented VBIs to improve academic skills were located, the research on their use in this area was found to be lacking, indicating a gap in the research on VBIs. Increased usage of VBIs (including video modeling and video prompting) with the target population of those aged 15–28 with ASD is recommended, with specific attention given to the use of VBIs to improve the academic and social skills of adolescents and young adults with ASD.
Abstract: This paper deals with the error analysis of a novel navigation algorithm that uses as input the sequence of images acquired from a moving camera and a Digital Terrain (or Elevation) Map (DTM/DEM). More specifically, it has been shown that the optical flow derived from two consecutive camera frames can be used in combination with a DTM to estimate the position, orientation, and ego-motion parameters of the moving camera. As opposed to previous works, the proposed approach does not require an intermediate explicit reconstruction of the 3D world. In the present work, the sensitivity of the algorithm outlined above is studied. The main sources of error are identified to be the optical-flow evaluation and computation, the quality of the information about the terrain, the structure of the observed terrain, and the trajectory of the camera. By assuming an appropriate characterization of these error sources, a closed-form expression for the uncertainty of the pose and motion of the camera is first developed, and then the influence of these factors is confirmed using extensive numerical simulations. The main conclusion of this paper is that the proposed navigation algorithm generates accurate estimates for reasonable scenarios and error sources, and thus can be effectively used as part of a navigation system for autonomous vehicles.