Abstract: [Significance] In alignment with the national germplasm security strategy, current research efforts are accelerating the adoption of precision breeding in sheep. Within whole-genome selection, accurate phenotyping of body morphometrics is critical for assessing growth performance and breeding value. Traditional manual measurements are inefficient, prone to human error, and may cause stress to sheep, limiting their suitability for precision sheep management. By summarizing the applications of sheep body size measurement technologies and analyzing their development directions, this paper provides theoretical references and practical guidance for the research and application of non-contact sheep body size measurement.
[Progress] This review synthesizes progress across three principal methodological paradigms: two-dimensional (2D) image-based techniques, three-dimensional (3D) point cloud-based approaches, and integrated 2D-3D fusion systems. 2D methods, employing either handcrafted geometric features or deep learning-based keypoint detection algorithms, are cost-effective and operationally simple but sensitive to variation in imaging conditions and unable to capture critical circumference metrics. 3D point cloud approaches enable precise reconstruction of full animal morphology, supporting comprehensive body size acquisition with higher accuracy, yet face challenges including high hardware costs, complex data workflows, and sensitivity to posture variability. Hybrid 2D-3D fusion systems combine semantic richness from RGB imagery with geometric completeness from point clouds. Having been validated in other livestock species, e.g., cattle and pigs, these fusion systems have demonstrated excellent performance, providing important technical references and practical insights for sheep body size measurement.
[Conclusions and Prospects] Firstly, future research should focus on constructing large-scale, high-quality datasets for sheep body size measurement that encompass diverse breeds, growth stages, and environmental conditions, thereby enhancing model robustness and generalization. Secondly, the development of lightweight artificial intelligence models is essential. Techniques such as model compression, quantization, and algorithmic optimization can substantially reduce computational complexity and storage requirements, facilitating deployment in resource-constrained environments. Thirdly, the 3D point cloud processing pipeline should be streamlined to improve the efficiency of data acquisition, filtering, registration, and segmentation, while promoting the integration of low-cost, robust vision systems into practical farming scenarios. Fourthly, specific emphasis should be placed on improving the accuracy of curvilinear measurements, such as chest circumference, abdominal circumference, and shank circumference, through advances in pose standardization, refined 3D segmentation strategies, and multimodal data fusion. Finally, the cross-fertilization of sheep body size measurement technologies with analogous methods for other livestock species offers a promising pathway for mutual learning and collaborative innovation, accelerating the industrialization of automated sheep morphometric systems and supporting the development of intelligent, data-driven pasture management practices.
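The circumference metrics named in this abstract (chest, abdominal, shank) are the ones that 2D images cannot capture. As a rough illustration only, and not a method taken from the reviewed studies, the sketch below assumes a point cloud that has already been aligned so the body axis runs along x. It cuts a thin slice at a chosen position and takes the perimeter of the slice's convex hull as the girth. The function names, the slice width, and the convex-hull approximation are all assumptions made for this example.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half
    return lower[:-1] + upper[:-1]


def slice_circumference(cloud, x_center, half_width):
    """Approximate girth at x_center from an aligned (x, y, z) point cloud.

    Hypothetical helper: keeps points with |x - x_center| <= half_width,
    projects them onto the y-z plane, and returns the hull perimeter.
    """
    section = [(y, z) for (x, y, z) in cloud if abs(x - x_center) <= half_width]
    hull = convex_hull(section)
    if len(hull) < 3:
        return 0.0
    return sum(
        math.dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull))
    )
```

A convex hull behaves like a tape pulled tight around the body, so it bridges concavities and is sensitive to stray points; real pipelines would need the filtering, registration, and segmentation steps the abstract mentions before a slice like this is meaningful.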
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61973065, U20A20197, and 61973063.
Abstract: Previous multi-view 3D human pose estimation methods neither correlate different human joints within each view nor explicitly model learnable correlations between the same joints in different views, meaning that skeleton structure information is not utilized and multi-view pose information is not completely fused. Moreover, existing graph convolutional operations do not consider the specificity of different joints and different views of pose information when processing skeleton graphs, so the correlation weights between nodes in the graph and their neighborhood nodes are shared. Existing Graph Convolutional Networks (GCNs) cannot extract global and deep-level skeleton structure information and view correlations efficiently. To solve these problems, pre-estimated multi-view 2D poses are organized as a multi-view skeleton graph that explicitly fuses skeleton priors and view correlations to handle the occlusion problem, with the skeleton-edge and symmetry-edge representing the structural correlations between adjacent joints in each view of the skeleton graph and the view-edge representing the view correlations between the same joints in different views. So that the graph convolution operation mines detailed and sufficient skeleton structure information and view correlations, different correlation weights are assigned to different categories of neighborhood nodes and further assigned to each node in the graph. Based on this graph convolution operation, a Residual Graph Convolution (RGC) module is designed as the basic module and combined with a simplified Hourglass architecture to construct Hourglass-GCN as our 3D pose estimation network. Hourglass-GCN, with a symmetrical and concise architecture, processes three scales of multi-view skeleton graphs to extract local-to-global scale and shallow-to-deep level skeleton features efficiently. Experimental results on the common large 3D pose datasets Human3.6M and MPI-INF-3DHP show that Hourglass-GCN outperforms some excellent methods in 3D pose estimation accuracy.
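The multi-view skeleton graph and the category-specific weights described in this abstract can be pictured with a small sketch. It is a simplified illustration, not the authors' Hourglass-GCN: features are scalars, the weights are fixed numbers rather than learned parameters, only the per-category weighting is shown (not the further per-node assignment), and the names `build_multiview_graph` and `typed_graph_conv` are hypothetical.

```python
from collections import defaultdict


def build_multiview_graph(num_joints, num_views, bones, symmetric_pairs):
    """Return {node: [(neighbor, edge_type), ...]} with nodes as (view, joint).

    Three edge categories follow the abstract: "skeleton" (adjacent joints
    in one view), "symmetry" (left/right counterparts in one view), and
    "view" (the same joint seen from two different views).
    """
    graph = defaultdict(list)

    def connect(a, b, edge_type):
        graph[a].append((b, edge_type))
        graph[b].append((a, edge_type))

    for v in range(num_views):
        for i, j in bones:
            connect((v, i), (v, j), "skeleton")
        for i, j in symmetric_pairs:
            connect((v, i), (v, j), "symmetry")
    for j in range(num_joints):
        for v in range(num_views):
            for w in range(v + 1, num_views):
                connect((v, j), (w, j), "view")
    return graph


def typed_graph_conv(features, graph, weights):
    """One aggregation step with a separate weight per neighbor category.

    Each node combines its own feature with the mean feature of its
    neighbors in each edge category, scaled by that category's weight.
    """
    out = {}
    for node, x in features.items():
        sums = defaultdict(float)
        counts = defaultdict(int)
        for neighbor, edge_type in graph[node]:
            sums[edge_type] += features[neighbor]
            counts[edge_type] += 1
        value = weights["self"] * x
        for edge_type, total in sums.items():
            value += weights[edge_type] * total / counts[edge_type]
        out[node] = value
    return out
```

With a toy skeleton of a root joint and two hips seen from two views, the root has skeleton and view neighbors but no symmetry neighbor, so its output differs from that of the hips even when every input feature is identical. That difference is what a single shared neighbor weight could not express.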
Abstract: With the rapid progress of artificial intelligence (AI) technology and the mobile internet, 3D hand pose estimation has become critical to various intelligent application areas, e.g., human-computer interaction. To avoid the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation, this paper proposes a novel multi-modal multi-view (MMV) 3D hand pose estimation system. The system introduces a joint registration-before-translation (RT) and translation-before-registration (TR) conditional generative adversarial network (cGAN) to train a multi-modal registration network, and then employs multi-modal feature fusion to achieve high-quality estimation, with low hardware and software costs in both data acquisition and processing. Experimental results demonstrate that the MMV system is effective and feasible in various scenarios. The MMV system is promising for use in a broad range of intelligent application areas.
Funding: Supported by the Program of Entrepreneurship and Innovation Ph.D. in Jiangsu Province (JSSCBS20211175), the School Ph.D. Talent Funding (Z301B2055), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (21KJB520002).
Abstract: 3D human pose estimation is a major focus area in the field of computer vision and plays an important role in practical applications. This article summarizes the framework and research progress related to estimation from monocular RGB images and videos. An overall perspective of methods integrated with deep learning is introduced. Novel image-based and video-based inputs are proposed as the analysis framework. From this viewpoint, common problems are discussed. The diversity of human postures usually leads to problems such as occlusion and ambiguity, and the lack of training datasets often results in poor generalization ability of the model. Regression methods are crucial for solving such problems. Considering image-based input, the multi-view method is commonly used to solve occlusion problems. Here, the multi-view method is analyzed comprehensively. For video-based input, human prior knowledge of restricted motion is used to predict human postures. In addition, structural constraints are widely used as prior knowledge. Furthermore, weakly supervised learning methods are studied and discussed for these two types of inputs to improve the model generalization ability. The problem of insufficient training datasets must also be considered, especially because 3D datasets are usually biased and limited. Finally, emerging and popular datasets and evaluation indicators are discussed. The characteristics of the datasets and the relationships of the indicators are explained and highlighted. Thus, this article can be useful and instructive for researchers who lack experience and find this field confusing. In addition, by providing an overview of 3D human pose estimation, this article sorts and refines recent studies on the topic. It describes core problems and commonly useful methods, and discusses the scope for further research.
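The abstract refers to evaluation indicators without naming them. One indicator widely reported on 3D pose benchmarks such as Human3.6M is the mean per-joint position error (MPJPE); whether the article covers exactly this metric is an assumption here. A minimal sketch, with `mpjpe` as a hypothetical helper name and joints given as (x, y, z) tuples in a shared unit:

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error for one pose.

    Both arguments are equal-length sequences of (x, y, z) joint
    coordinates; the result is the average Euclidean distance between
    corresponding joints, in the same unit as the inputs.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("poses must be non-empty and have the same joint count")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

For example, a two-joint pose whose joints are off by 3 and 4 units respectively has an MPJPE of 3.5. Protocols differ on whether poses are root-aligned or rigidly aligned before this distance is computed, so reported numbers are only comparable under the same protocol.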
Funding: Supported by the National Key R&D Program of China (2018YFB1004600), the National Natural Science Foundation of China (61502187, 61876211), and the National Science Foundation Grant CNS (1951952).
Abstract: The field of vision-based three-dimensional (3D) human hand shape and pose estimation has attracted significant attention recently owing to its key role in various applications, such as natural human-computer interaction. With the availability of large-scale annotated hand datasets and the rapid development of deep neural networks (DNNs), numerous DNN-based data-driven methods have been proposed for accurate and rapid hand shape and pose estimation. Nonetheless, complicated hand articulation, depth and scale ambiguities, occlusions, and finger similarity remain challenging. In this study, we present a comprehensive survey of state-of-the-art 3D hand shape and pose estimation approaches using RGB-D cameras. Related RGB-D cameras, hand datasets, and a performance analysis are also discussed to provide a holistic view of recent achievements. We also discuss the research potential of this rapidly growing field.
Funding: Supported by Universiti Sains Malaysia under FRGS Grant Number FRGS/1/2020/STG07/USM/02/12 (203.PKOMP.6711930) and FRGS Grant Number 304PTEKIND.6316497.USM.
Abstract: This article presents a comprehensive survey of deep learning-based (DL-based) human pose estimation (HPE) intended to help researchers in the domain of computer vision. HPE is among the fastest-growing research domains of computer vision and is used to solve several problems in human endeavours. After a detailed introduction, three different human body models are presented, followed by the main stages of HPE and two pipelines of two-dimensional (2D) HPE. The details of the four components of HPE are also presented. The keypoint output formats of two popular 2D HPE datasets and the most cited DL-based HPE articles since the breakthrough year are both shown in tabular form. This study intends to highlight the limitations of published reviews and surveys with respect to presenting a systematic review of current DL-based solutions for 2D HPE. It also provides a detailed survey to guide new and existing researchers working on DL-based 2D HPE models. Finally, some future research directions in the field of HPE, such as limited data on disabled persons and multi-training DL-based models, are identified to encourage researchers and promote the growth of HPE research.