Abstract: Background: It is generally difficult to obtain accurate pose and depth for a non-rigid moving object from a single RGB camera to create augmented reality (AR). In this study, we build an AR system from a single RGB camera for a non-rigid moving human by accurately computing pose and depth, for which the two key tasks are segmentation and monocular Simultaneous Localization and Mapping (SLAM). Most existing monocular SLAM systems are designed for static scenes, whereas in this AR system the human body is always moving and non-rigid. Methods: To make the SLAM system suitable for a moving human, we first segment the rigid part of the human in each frame. A segmented moving body part can be regarded as a static object, and the relative motion between each moving body part and the camera can be considered the motion of the camera. Typical SLAM systems designed for static scenes can then be applied. In the segmentation step of this AR system, we first employ the proposed BowtieNet, which adds the atrous spatial pyramid pooling (ASPP) of DeepLab between the encoder and decoder of SegNet, to segment the human in the original frame, and then we use color information to extract the face from the segmented human area. Results: Based on the human segmentation results and a monocular SLAM, this system can change the video background and add a virtual object to humans. Conclusions: Experiments on human image segmentation datasets show that BowtieNet obtains state-of-the-art human image segmentation performance and sufficient speed for real-time segmentation. Experiments on videos show that the proposed AR system can robustly add a virtual object to humans and can accurately change the video background.
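The ASPP idea mentioned in the abstract above (parallel atrous, i.e. dilated, convolutions at several rates) can be illustrated with a minimal one-dimensional sketch. This is not BowtieNet or DeepLab code: the function names `atrous_conv1d` and `aspp_1d`, the zero padding, and the rates `(1, 2, 4)` are assumptions chosen only for illustration, and real ASPP operates on 2D feature maps with learned kernels.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Same-size 1D atrous (dilated) correlation with zero padding.

    Kernel taps are spaced `rate` samples apart, so the receptive field
    grows with `rate` while the number of weights stays the same.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * rate  # tap position with holes of size `rate`
            if 0 <= idx < len(signal):     # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal: Sequence[float], kernel: Sequence[float],
            rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Apply the same kernel at several rates in parallel (one branch per rate).

    A real ASPP module would concatenate the branches along the channel axis;
    here each branch is simply returned as its own list.
    """
    return [atrous_conv1d(signal, kernel, r) for r in rates]
```

With a unit impulse as input, each branch shows how far apart its taps sit, which is the multi-scale context that the pooling pyramid is meant to capture.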
Funding: This research was financially supported by the Ministry of Small and Medium-sized Enterprises (SMEs) and Startups (MSS), Korea, under the "Regional Specialized Industry Development Program (R&D, S3091627)" supervised by the Korea Institute for Advancement of Technology (KIAT).
Abstract: A wide range of camera apps and online video conferencing services support changing the background in real time for aesthetic, privacy, and security reasons. Numerous studies show that Deep Learning (DL) is a suitable option for human segmentation, and that an ensemble of multiple DL-based segmentation models can improve the segmentation result. However, these approaches are not as effective when directly applied to image segmentation in a video. This paper proposes an Adaptive N-Frames Ensemble (AFE) approach for high-movement human segmentation in a video using an ensemble of multiple DL models. In contrast to a conventional ensemble, which executes multiple DL models simultaneously for every single video frame, the proposed AFE approach executes only a single DL model on the current video frame. It combines the segmentation outputs of previous frames into the final segmentation output when the frame difference is less than a particular threshold. Our method builds on the idea of the N-Frames Ensemble (NFE) method, which uses an ensemble of the image segmentations of the current video frame and previous video frames. However, NFE is not suitable for segmenting fast-moving objects in a video, nor for videos with low frame rates. The proposed AFE approach addresses these limitations of the NFE method. Our experiment uses three human segmentation models, namely Fully Convolutional Network (FCN), DeepLabv3, and Mediapipe. We evaluated our approach using 1711 videos of the TikTok50f dataset with a single-person view. The TikTok50f dataset is a reconstructed version of the publicly available TikTok dataset, obtained by cropping, resizing, and dividing it into videos of 50 frames each. This paper compares the proposed AFE with single models and the Two-Models Ensemble, as well as with the NFE models. The experimental results show that the proposed AFE is suitable for low-movement as well as high-movement human segmentation in a video.
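The adaptive rule described in the abstract above (run one model per frame, and reuse earlier outputs only while the frame difference stays below a threshold) can be sketched as follows. This is a hedged illustration, not the authors' implementation: the names `frame_difference` and `adaptive_ensemble`, the mean-absolute-difference motion measure, the round-robin choice of model, the averaging of masks, the 0.5 cut-off, and the default values of `n` and `threshold` are all assumptions that the abstract does not specify.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Frame = Sequence[float]  # flattened grayscale frame, values in [0, 1]
Mask = List[float]       # per-pixel foreground probability


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference (an assumed measure of movement)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def adaptive_ensemble(frames: Sequence[Frame],
                      models: Sequence[Callable[[Frame], Mask]],
                      n: int = 3,
                      threshold: float = 0.1) -> List[List[int]]:
    """Fuse the current mask with up to n-1 earlier masks while motion is low."""
    history: Deque[Mask] = deque(maxlen=n - 1)  # masks of previous frames
    previous_frame = None
    outputs = []
    for i, frame in enumerate(frames):
        model = models[i % len(models)]  # only one model runs per frame (round-robin, assumed)
        mask = list(model(frame))
        if previous_frame is not None and frame_difference(frame, previous_frame) >= threshold:
            history.clear()              # high movement: earlier masks are stale, drop them
        stack = list(history) + [mask]
        fused = [sum(pixel) / len(stack) for pixel in zip(*stack)]  # per-pixel average
        outputs.append([1 if p >= 0.5 else 0 for p in fused])
        history.append(mask)
        previous_frame = frame
    return outputs
```

In this sketch a static scene benefits from the masks of neighbouring frames, while an abrupt change falls back to the current frame's mask alone, which is the behaviour the abstract contrasts with plain NFE.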
Funding: Supported by the Key Project of the National Natural Science Foundation of China (No. 50635030) and the National Basic Research Program ("973" Program) of China (No. 2007CB616913). Support was also provided by the China Scholarship Council (CSC). We would also like to thank Karin Jespers and Sharon Warner of the Structure and Motion Laboratory for their support of the experimental work. JRH's contributions were supported by research grants BB/C516844/1 and BB/F01169/1 from the BBSRC, whom we thank.
Abstract: As one of the most important daily motor activities, human locomotion has been investigated intensively in recent decades. The locomotor functions and mechanics of the human lower limbs have become relatively well understood. However, our understanding of the motions and functional contributions of the human spine during locomotion is still very poor, and simultaneous in-vivo limb and spinal column motion data are scarce. The objective of this study is to investigate the delicate in-vivo kinematic coupling between different functional regions of the human spinal column during locomotion as a stepping stone to exploring the locomotor function of the human spine complex. A novel infrared reflective marker cluster system was constructed using stereophotogrammetry techniques to record the 3D in-vivo geometric shape of the spinal column and the segmental position and orientation of each functional spinal region simultaneously. Gait measurements of normal walking were conducted. The preliminary results show that the spinal column shape changes periodically in the frontal plane during locomotion. The segmental motions of different spinal functional regions appear to be strongly coupled, indicating that some synergistic strategy may be employed by the human spinal column to facilitate locomotion. In contrast to traditional medical imaging-based methods, the proposed technique can be used to investigate the dynamic characteristics of the spinal column, hence providing more insight into the functional biomechanics of the human spine.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60872117) and the Leading Academic Discipline Project of Shanghai Municipal Education Commission (Grant No. J50104).
Abstract: Face recognition provides a natural visual interface for human-computer interaction (HCI) applications. The process of face recognition, however, is hindered by variations in the appearance of face images caused by changes in lighting, expression, viewpoint, and aging, and by the introduction of occlusion. Although various algorithms have been presented, face recognition remains a very challenging topic. This paper proposes a novel approach to real-time face recognition for HCI. In view of the limits of popular approaches to foreground segmentation, a background subtraction method based on the wavelet multi-scale transform is developed to extract foreground objects. The optimal threshold is determined automatically, without any complex supervised training or manual experimental calibration. A robust real-time face recognition algorithm is presented, which combines projection matrices computed without iteration and kernel Fisher discriminant analysis (KFDA) to overcome some of the difficulties of real-world face recognition. The superior performance of the proposed algorithm is demonstrated through experimental comparison with other algorithms. The proposed algorithm can also be applied to video image sequences of natural HCI.
Abstract: Vascular diseases in the mesenteric lesions, including aneurysm, occlusion, and thrombosis, can cause severe symptoms, and appropriate diagnosis and treatment are essential for managing patients. With the development and improvement of imaging modalities, the diagnostic frequency of these vascular diseases in abdominal lesions is increasing, even for small changes in the vasculature. Among the various vascular diseases, fibromuscular dysplasia (FMD) and segmental arterial mediolysis (SAM) are noninflammatory, nonatherosclerotic arterial diseases that need to be diagnosed urgently because they can affect various organs and be lethal if appropriate management is not provided. However, because FMD and SAM are rare, their cause, prevalence, clinical characteristics including symptoms, imaging findings, pathological findings, management, and prognoses have not been systematically summarized. Consequently, neither standard diagnostic criteria nor therapeutic methodologies have been established to date. To systematically summarize this information and to compare these disease entities, we summarize the characteristics of FMD and SAM in the gastroenterological regions by reviewing the cases reported thus far. The summarized information will be helpful for physicians treating these patients in an emergency care unit and for the differential diagnosis of other diseases presenting with severe abdominal pain.
Funding: Supported by the MSIT (Ministry of Science and ICT), Republic of Korea, under the Convergence Security Core Talent Training Business Support Program (IITP-2025-RS-2023-00266605) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Abstract: Surveillance systems can take various forms, but gait-based surveillance is emerging as a powerful approach due to its ability to identify individuals without requiring their cooperation. Existing studies have suggested several approaches to gait recognition; nevertheless, the performance of existing systems is often degraded in real-world conditions by covariate factors such as occlusions, clothing changes, walking speed, and varying camera viewpoints. Furthermore, most existing research focuses on single-person gait recognition, whereas counting, tracking, detecting, and recognizing individuals in dual-subject settings with occlusions remains a challenging task. Therefore, this research proposes a variant of an automated gait model for occluded dual-subject walk scenarios. More precisely, we design a deep learning (DL)-based dual-subject gait model (DSG) involving three modules. The first module handles silhouette segmentation, localization, and counting (SLC) using Mask-RCNN with MobileNetV2. The second module uses a convolutional block attention module (CBAM)-based Siamese network for frame-level tracking with a modified gallery setting. Finally, region-based deep learning is used for dual-subject gait recognition. The proposed method, tested on the Shri Mata Vaishno Devi University (SMVDU)-Multi-Gait and Single-Gait datasets, achieves 94.00% segmentation, 58.36% tracking, and 63.04% gait recognition accuracy in dual-subject walk scenarios.
Funding: Supported by the "MOST" under Grant No. 103-2221-E-468-008-MY2.
Abstract: This paper presents a human detection system for a vision-based hospital surveillance environment. The system is composed of three subsystems: a background segmentation subsystem (BSS), a human feature extraction subsystem (HFES), and a human recognition subsystem (HRS). The codebook background model is applied in the BSS, histogram of oriented gradients (HOG) features are used in the HFES, and support vector machine (SVM) classification is employed in the HRS. By integrating these subsystems, human detection in a vision-based hospital surveillance environment is performed. Experimental results show that the proposed system can effectively detect most of the people in hospital surveillance video sequences.
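The feature-extraction and recognition stages named in the abstract above (HOG features followed by an SVM decision) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's system: it computes an unsigned 9-bin histogram for a single cell without block normalization, the names `hog_cell_histogram` and `linear_svm_decision` are hypothetical, the SVM weights would in practice come from training, and the codebook background model of the BSS is omitted entirely.

```python
import math
from typing import List, Sequence


def hog_cell_histogram(cell: Sequence[Sequence[float]], bins: int = 9) -> List[float]:
    """Unsigned gradient-orientation histogram for one grayscale cell.

    Gradients use centered differences on interior pixels; each pixel votes
    for its orientation bin (0 to 180 degrees) with its gradient magnitude.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold to unsigned orientation
            hist[int(angle // bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # L2 normalization, guard against zero
    return [v / norm for v in hist]


def linear_svm_decision(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear SVM score w.x + b; a positive score is read as 'human'."""
    return sum(f * w for f, w in zip(features, weights)) + bias
```

A full detector would tile the foreground region into many cells, concatenate their block-normalized histograms, and feed that vector to a trained classifier.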
Abstract: In this paper, we propose several improved neural networks and a training strategy using data augmentation to segment the human radius accurately and efficiently. This method provides pixel-level segmentation accuracy through the low-level features of the neural network and automatically distinguishes the classification of the radius. Versatility and applicability can be effectively improved by training on digital X-ray images obtained from digital X-ray imaging systems of different manufacturers.