Three-dimensional (3D) reconstruction based on structured light is widely used in industrial measurement. To address the high mismatch rate and poor real-time performance caused by factors such as system jitter and noise, a lightweight stripe-image feature extraction algorithm based on the You Only Look Once v4 (YOLOv4) network is proposed. First, MobileNetV3 is used as the backbone network to extract features. Then the Mish activation function and the Complete Intersection over Union (CIoU) loss function are used to compute the improved bounding-box regression loss, which improves the accuracy and real-time performance of feature detection. Simulation results show that the improved model is only 52 MB in size, the mean average precision (mAP) for fringe-image data reconstruction reaches 82.11%, and the 3D point cloud restoration rate reaches 90.1%. Compared with the existing model, it has clear advantages and can satisfy the accuracy and real-time requirements of reconstruction tasks on resource-constrained equipment.
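The two named components, Mish and the CIoU loss, have standard closed forms, so a minimal sketch is given here for orientation. It is not the authors' implementation: the box format `(x1, y1, x2, y2)` and the function names are assumptions made for illustration, and boxes are assumed to have positive width and height.

```python
import math


def mish(x):
    """Mish activation: x * tanh(softplus(x)). Overflows for very large x."""
    return x * math.tanh(math.log1p(math.exp(x)))


def ciou_loss(pred, target):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Plain IoU from the intersection and union areas.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union
    # Squared distance between the two box centres.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(px2, tx2) - min(px1, tx1)) ** 2 \
        + (max(py2, ty2) - min(py1, ty1)) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((tx2 - tx1) / (ty2 - ty1))
        - math.atan((px2 - px1) / (py2 - py1))
    ) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v


same = ciou_loss((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0))
apart = ciou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
```

Unlike plain IoU, the loss still separates two non-overlapping boxes by their centre distance, which is why `apart` exceeds 1.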
Funding: Supported by the National Key Research and Development Program of China (2020YFB1807500); the National Natural Science Foundation of China (62072360, 62001357, 62172438, 61901367); the Key Research and Development Plan of Shaanxi Province (2021ZDLGY02-09, 2023-GHZD-44, 2023-ZDLGY-54); the Natural Science Foundation of Guangdong Province of China (2022A1515010988); the Key Project on Artificial Intelligence of Xi'an Science and Technology Plan (2022JH-RGZN-0003, 2022JH-RGZN-0103, 2022JH-CLCJ-0053); the Xi'an Science and Technology Plan (20RGZN0005); and the Proof-of-Concept Fund from Hangzhou Research Institute of Xidian University (GNYZ2023QC0201).
Abstract: The high bandwidth and low latency of 6G network technology enable the successful application of monocular 3D object detection on vehicle platforms. Pseudo-LiDAR based on monocular 3D object detection is a low-cost, low-power alternative to LiDAR solutions in autonomous driving. However, this technique has two problems: (1) the poor quality of the generated Pseudo-LiDAR point clouds, which results from the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because existing LiDAR-based 3D detection networks neglect the global geometric structure of point clouds. We therefore propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a point cloud confidence sampling strategy based on a 3D Gaussian distribution that assigns low confidence to points with large depth-estimation error and filters them out according to that confidence. We then present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture global geometric features in the point cloud. Finally, our detection framework is based on Point-Voxel-RCNN (PV-RCNN), with high-quality Pseudo-LiDAR and enriched geometric features as input. Experimental results show that our method achieves satisfactory results in monocular 3D object detection.
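The confidence-sampling step can be pictured with a small sketch. The abstract does not say how the 3D Gaussian is parameterised, so the centre `mean`, the per-axis `sigma`, and the `threshold` below are all hypothetical choices made for illustration only.

```python
import math


def gaussian_confidence(point, mean, sigma):
    """Unnormalised axis-aligned 3D Gaussian score in (0, 1]."""
    exponent = sum(((p - m) / s) ** 2 for p, m, s in zip(point, mean, sigma))
    return math.exp(-0.5 * exponent)


def confidence_sample(points, mean, sigma, threshold=0.3):
    """Keep only the points whose confidence reaches the threshold."""
    return [p for p in points
            if gaussian_confidence(p, mean, sigma) >= threshold]


# Two points near the assumed object centre and one with a large depth error.
points = [(0.0, 0.0, 10.0), (0.2, 0.1, 10.5), (0.1, 0.0, 18.0)]
kept = confidence_sample(points, mean=(0.0, 0.0, 10.0), sigma=(1.0, 1.0, 1.0))
```

The third point sits 8 units away along the depth axis, so its score is close to zero and it is dropped.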
Funding: National Natural Science Foundation of China (Nos. 12002085 and 51603039); Shanghai Pujiang Program, China (No. 19PC002); Fundamental Research Funds for the Central Universities, China (No. 2232019D3-58); Initial Research Funds for Young Teachers of Donghua University, China (No. 104-07-0053088).
Abstract: Methods of digital human modeling have been developed and used to reflect human shape features. However, most published work has focused on dynamic visualization or fashion design rather than on the high-accuracy modeling that medical and rehabilitation scenarios demand. As a step toward high-accuracy modeling of human legs based on non-uniform rational B-splines (NURBS), this work presents a method for extracting the required quasi-grid network of feature points for human legs. Given a 3D-scanned human body, the leg is first segmented and placed in a standardized position. The leg is then re-sampled with a set of equidistant cross sections. By analyzing leg circumferences and circumferential curvature, the characteristic sections of the leg and the characteristic points on those sections are identified according to human anatomy and shape features. The resulting collection can be arranged into a grid of data points for knot calculation and high-accuracy shape reconstruction in future work.
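The re-sampling and circumference analysis can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's procedure: points are snapped to the nearest of `n_sections` equidistant planes along z (at least two planes), and each section is assumed to be star-shaped around its centroid so that sorting by polar angle gives a valid outline.

```python
import math


def slice_by_height(points, z_min, z_max, n_sections):
    """Assign each (x, y, z) point to the nearest equidistant section plane."""
    step = (z_max - z_min) / (n_sections - 1)
    sections = [[] for _ in range(n_sections)]
    for x, y, z in points:
        idx = round((z - z_min) / step)
        if 0 <= idx < n_sections:
            sections[idx].append((x, y))
    return sections


def circumference(section):
    """Perimeter of the closed polygon formed by ordering points by angle."""
    cx = sum(x for x, _ in section) / len(section)
    cy = sum(y for _, y in section) / len(section)
    ordered = sorted(section, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    return sum(
        math.dist(ordered[i], ordered[(i + 1) % len(ordered)])
        for i in range(len(ordered))
    )


# Toy "leg": a square outline at z = 0 and a square twice as large at z = 1.
square = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
points = [(x, y, 0.0) for x, y in square] \
    + [(2 * x, 2 * y, 1.0) for x, y in square]
sections = slice_by_height(points, z_min=0.0, z_max=1.0, n_sections=2)
girths = [circumference(s) for s in sections]
```

Characteristic sections could then be picked where the girth sequence has extrema, which is one plausible reading of the circumference analysis described above.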
Abstract: Feature extraction is the most critical step in the classification of multispectral images. Classification accuracy is mainly influenced by the feature sets selected to classify the image. In the past, handcrafted feature sets were used, which do not adapt to different image domains. To overcome this, an evolutionary learning method is developed to learn spatial-spectral features for classification automatically. A modified Firefly Algorithm (FA), which achieves maximum classification accuracy with a reduced feature set, is proposed for feature selection. To extract the most efficient features from the data set, we use a 3-D discrete wavelet transform, which decomposes the multispectral image in all three dimensions. To select spatial and spectral features, we study three window-based approaches, namely overlapping window (OW-3DFS), non-overlapping window (NW-3DFS), and adaptive window cube (AW-3DFS), as well as a pixel-based technique. A fivefold Multiclass Support Vector Machine (MSVM) is used for classification. Experiments conducted on the Madurai LISS IV multispectral image show that the adaptive window approach increases the classification accuracy.
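The difference between the overlapping and non-overlapping window schemes comes down to the stride used when placing window cubes over the image. The sketch below only enumerates window origins. The stride of 1 for the overlapping case is an assumption, and the adaptive window scheme is not covered because the abstract does not describe its rule.

```python
def window_starts(length, size, stride):
    """Start indices of windows of `size` that fit inside `length`."""
    return list(range(0, length - size + 1, stride))


def window_origins(rows, cols, size, overlapping=True):
    """Top-left corners of spatial windows; each spans all spectral bands."""
    stride = 1 if overlapping else size
    return [
        (r, c)
        for r in window_starts(rows, size, stride)
        for c in window_starts(cols, size, stride)
    ]


overlapping = window_origins(6, 6, size=3, overlapping=True)
disjoint = window_origins(6, 6, size=3, overlapping=False)
```

On a 6 x 6 image with 3 x 3 windows, the overlapping scheme yields 16 cubes and the non-overlapping scheme yields 4, which shows the trade-off between feature density and redundancy.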
Funding: National Natural Science Foundation of China (Nos. 41861054, 41371423, 61966010); National Key R&D Program of China (No. 2016YFB0502105).
Abstract: Hole repair is an important part of point cloud data processing in airborne 3-dimensional (3D) laser scanning. Because mountain surfaces are fragmented and irregular, the conventional mathematical cloud-based point cloud hole repair method performs poorly when 3D laser scanning is applied to mountain mapping. To solve this problem, we propose repairing the valley and ridge lines first and then repairing the point cloud hole. The method has three main steps. First, the valley and ridge feature lines are extracted by GIS slope analysis. Then, the parts of the valley and ridge lines missing in the hole are repaired by mathematical interpolation, and the repaired results are edited and inserted into the original point cloud. Finally, the traditional repair method is used to repair the point cloud hole whose valley and ridge lines have been restored. Three experiments were designed and carried out on the east bank of the Xiaobaini River to test the proposed method. The results show that, compared with the direct point cloud hole repair method in Geomagic Studio, the average repair error of the proposed method within the 16 m buffer zone of the valley and ridge lines decreases from 56.31 cm to 31.49 cm, a significant improvement in repair performance.
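The second step, restoring the part of a ridge or valley line that falls inside a hole, can be sketched with the simplest possible interpolant. The abstract only says "mathematical interpolation", so linear interpolation between the two known endpoints is an assumption here, and `fill_gap` is a hypothetical helper name.

```python
def fill_gap(start, end, n_missing):
    """Linearly interpolate n_missing 3D points strictly between two endpoints."""
    filled = []
    for i in range(1, n_missing + 1):
        t = i / (n_missing + 1)
        filled.append(tuple(a + t * (b - a) for a, b in zip(start, end)))
    return filled


# Ridge line enters the hole at (0, 0, 100) and leaves it at (4, 0, 108).
gap = fill_gap((0.0, 0.0, 100.0), (4.0, 0.0, 108.0), n_missing=3)
```

The interpolated points would then be inserted into the original cloud so that the later surface repair is constrained by the terrain's skeleton lines.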
Funding: Supported by Taif University Researchers Supporting Project Number (TURSP-2020/215), Taif University, Taif, Saudi Arabia (www.tu.edu.sa).
Abstract: In this paper, a novel cancellable biometrics technique called Multi-Biometric-Feature-Hashing (MBFH) is proposed. The MBFH strategy is used to realize a one-way (non-invertible) biometric template. MBFH is a typical template security scheme, distinguished by applying this privacy-protection framework to several kinds of biometric features (retina, palm print, hand dorsum, fingerprint). A more robust and accurate multimodal biometric system for human identification requires several templates to be stored for the same user, one for each of that individual's biometric sources. This may raise concerns about their use and security when the stored templates are compromised, since no one can be issued a replacement biometric trait. The proposed framework comprises four parts: multi-biometric input acquisition, feature extraction, Multi-Exposure Fusion (MEF), and the Secure Hash Algorithm (SHA-3). Unlike passwords, biometric templates cannot be revoked and reissued for positive identification (ID) once compromised. Cancellable biometrics may be the specific technique used to address this issue.
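The final hashing stage can be sketched with Python's standard `hashlib`. This is a loose illustration and not the MBFH pipeline: plain concatenation stands in for the Multi-Exposure Fusion step, and the quantization step and the per-user `user_key` that makes the template revocable are both assumptions introduced here.

```python
import hashlib


def quantize(features, step=0.1):
    """Coarsen real-valued features so small sensor noise maps to the same bins."""
    return [round(v / step) for v in features]


def cancellable_template(feature_sets, user_key, step=0.1):
    """Fuse features by concatenation, then hash with SHA3-256 and a user key."""
    fused = [q for feats in feature_sets for q in quantize(feats, step)]
    payload = user_key + b"|" + ",".join(map(str, fused)).encode("ascii")
    return hashlib.sha3_256(payload).hexdigest()


palm = [0.31, 0.72, 0.15]          # hypothetical palm-print features
fingerprint = [0.55, 0.05]         # hypothetical fingerprint features
template_v1 = cancellable_template([palm, fingerprint], user_key=b"key-v1")
template_v2 = cancellable_template([palm, fingerprint], user_key=b"key-v2")
```

Issuing a new key yields an unrelated digest from the same biometric data, which is the revocability property that cancellable biometrics aims for.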
Funding: Supported in part by the Basic and Applied Basic Research Foundation of Guangdong Province [2025A1515011566]; in part by the State Key Laboratory for Novel Software Technology, Nanjing University [KFKT2024B08]; in part by Leading Talents in Gusu Innovation and Entrepreneurship [ZXL2023170]; and in part by the Basic Research Programs of Taicang 2024 [TC2024JC32].
Abstract: Deep convolutional neural networks (CNNs) have demonstrated remarkable performance in video super-resolution (VSR). However, the ability of most existing methods to recover fine details in complex scenes is often hindered by the loss of shallow texture information during feature extraction. To address this limitation, we propose a 3D Convolutional Enhanced Residual Video Super-Resolution Network (3D-ERVSNet). The network employs a forward and backward bidirectional propagation module (FBBPM) that aligns features across frames using explicit optical flow from the lightweight SPyNet. An enhanced residual structure (ERS) with skip connections integrates shallow and deep features, improving texture restoration. Furthermore, a 3D convolution module (3DCM) is applied after the backward propagation module to capture spatio-temporal dependencies implicitly. The components work together: FBBPM extracts aligned features, ERS fuses hierarchical representations, and 3DCM refines temporal coherence. Finally, a deep feature aggregation module (DFAM) fuses the processed features, and a pixel-upsampling module (PUM) reconstructs the high-resolution (HR) video frames. Comprehensive evaluations on the REDS, Vid4, UDM10, and Vim4 benchmarks demonstrate strong performance, including 30.95 dB PSNR/0.8822 SSIM on REDS and 32.78 dB/0.8987 on Vim4. 3D-ERVSNet achieves significant gains over baselines while remaining efficient, with only 6.3M parameters and a runtime of 77 ms/frame (i.e., 20× faster than RBPN). The network's effectiveness stems from its task-specific asymmetric design, which balances explicit alignment and implicit fusion.
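The PSNR figures quoted above follow the standard definition, PSNR = 10 log10(MAX^2 / MSE). A minimal sketch on flat lists of pixel values is given for reference. The 8-bit peak value of 255 is an assumption, since the abstract does not state the evaluation protocol (for example, whether PSNR is computed on RGB or on the luminance channel).

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [100, 100, 100, 100]
restored = [110, 90, 110, 90]      # every pixel is off by 10 grey levels
score = psnr(reference, restored)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to halving the mean squared error.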
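Abstract: When convolutional neural networks are used for hyperspectral image feature extraction and classification, spatial-spectral features are extracted insufficiently, and overly deep networks lead to large parameter counts and high computational complexity. To address these problems, a lightweight convolutional model that combines a fast three-dimensional convolutional neural network (3D-CNN) with depthwise separable convolution (DSC) is proposed. The method first applies incremental principal component analysis (IPCA) to reduce the dimensionality of the input data. Next, the pixels fed to the model are partitioned into small overlapping 3D patches, the ground-truth label of each patch is taken from its center pixel, and 3D kernels are convolved over the patches to form continuous 3D feature maps that preserve spatial-spectral features. The 3D-CNN extracts spatial and spectral features simultaneously, and depthwise separable convolution is then added to the 3D convolution to extract spatial features again. This enriches the spatial-spectral features while reducing the parameter count, which shortens computation time and also improves classification accuracy. The proposed model is validated on the public Indian Pines, Salinas Scene, and University of Pavia datasets and compared with other classical classification methods. Experimental results show that the method greatly reduces the number of learnable parameters and the model complexity while achieving good classification performance, with overall accuracy (OA), average accuracy (AA), and the Kappa coefficient all exceeding 99%.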
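The parameter saving from depthwise separable convolution can be checked with simple arithmetic. The sketch below counts weights only (biases omitted) and assumes a cubic k x k x k depthwise kernel followed by a 1 x 1 x 1 pointwise convolution. The abstract does not give the layer shapes, so the channel counts are hypothetical.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k x k x k kernel."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise (per-channel) plus pointwise (1x1x1) pair."""
    depthwise = k ** 3 * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = conv3d_params(c_in=32, c_out=64, k=3)
separable = separable_conv3d_params(c_in=32, c_out=64, k=3)
reduction = standard / separable
```

For these example sizes the separable layer needs roughly 19 times fewer weights, which is the kind of saving the abstract attributes to DSC.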
Funding: Supported by the Jiangsu Transportation Science and Technology Project under Grant 2020Y191(1) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant KYCX23_0294.
Abstract: The growing development of accurate and efficient three-dimensional (3D) road modeling presents great opportunities to improve the data exchange and integration of building information modeling (BIM) models. 3D modeling of road scenes is a crucial reference for asset management, construction, and maintenance. Light detection and ranging (LiDAR) technology is increasingly employed to generate high-quality point clouds for road inventory. In this paper, we specifically investigate the use of LiDAR data for 3D road modeling. The purpose of this review is to survey existing work on 3D road modeling based on LiDAR point clouds, discuss it critically, and identify challenges for further study. We introduce modeling standards for roads and discuss the components, types, and distinctions of various LiDAR measurement systems. We then review state-of-the-art methods and examine road segmentation and feature extraction in detail. Furthermore, we systematically introduce point cloud-based 3D modeling methods, namely parametric modeling and surface reconstruction. In parametric modeling, parameters and rules define model components based on geometric and non-geometric information, whereas surface modeling is carried out through the individual faces of the geometry. Finally, we discuss and summarize future research directions in this field. This review can assist researchers in enhancing existing approaches and developing new techniques for road modeling based on LiDAR point clouds.
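Abstract: To address missed detections of small road targets, low detection accuracy, and poor robustness in unmanned aerial vehicle (UAV) imagery, a UAV road disease detection algorithm based on global feature extraction, GFE-RDD (Global Feature Extraction-Road Disease Detection), is designed. A GFE-Transformer module that fuses a convolutional neural network (CNN) with a Transformer is embedded in the backbone network, improving the ability to capture long-range dependencies and thereby obtain global context. To better detect small road-disease targets, a small-target detection head incorporating an efficient dual-channel attention (EDA) mechanism is proposed. In addition, WIoUv3 (Wise-Intersection over Union version 3) is adopted as the network's loss function to handle the large variation in anchor-box quality in the training data and to improve detection accuracy. Experiments on a self-built multi-disease road dataset show that the proposed algorithm achieves an F1 score of 0.765 and an mAP50 of 0.796 on the road disease detection task, both higher than those of current mainstream algorithms such as DETR (DEtection TRansformer).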
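Both reported metrics rest on box overlap: mAP50 counts a detection as correct when its IoU with a ground-truth box is at least 0.5, and F1 combines the resulting precision and recall. The sketch below shows plain IoU and F1 only. It does not implement WIoUv3, which adds a focusing mechanism on top of IoU, and the `(x1, y1, x2, y2)` box format is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


overlap = iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
is_match_at_50 = overlap >= 0.5
f1 = f1_score(tp=8, fp=2, fn=2)
```

Small targets are penalised heavily by this criterion, since a shift of a few pixels can push IoU below 0.5, which is one reason a dedicated small-target head matters.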
Funding: Supported by the scientific research and technology development projects of CNPC, “Research on Key Technologies and Equipment for Drilling and Completion of 10000-m Ultra-deep Oil and Gas Resources” (No. 2022ZG06) and “Development of a Complete Set of 70 MPa Intelligent Managed Pressure Drilling Equipment” (No. 2024ZG35).
Abstract: Real-time monitoring of wellbore stability during drilling is crucial for the early detection of instability and timely intervention. The cause and type of wellbore instability can be identified by analyzing the dropped blocks brought to the surface by the drilling fluid, enabling preventive measures to be taken. In this study, an image capture system with fully automated sorting and 3D scanning was developed to obtain complete 3D point cloud data of dropped blocks. The raw data were preprocessed using format conversion, downsampling, coordinate transformation, statistical filtering, and clustering. Feature extraction algorithms, including the principal component analysis bounding box method, triangular meshing method, triaxial projection method, local curvature method, and model segmentation projection method, were employed to extract 32 feature parameters from the point cloud data. An optimal machine learning algorithm was selected by training 10 machine learning algorithms on the block data collected in the field. The XGBoost algorithm was then used to optimize the feature parameters and improve the classification model. An intelligent, fully automated feature parameter extraction and classification system was developed and applied to classify the types of dropped blocks in 12 sets of drilling field and laboratory experiments and to identify the causes of wellbore instability. An average accuracy of 93.9% was achieved. The system can thus enable timely diagnosis of wellbore instability in the field and the implementation of preventive and control measures.
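Of the preprocessing steps listed, statistical filtering is the easiest to show in isolation. The sketch below follows the common formulation (drop points whose mean distance to their k nearest neighbours exceeds the global mean plus a multiple of the standard deviation). The abstract does not give the authors' settings, so `k` and `std_ratio` are assumptions, and the brute-force neighbour search is only suitable for small clouds with more than `k` points.

```python
import math
import statistics


def statistical_filter(points, k=2, std_ratio=1.0):
    """Remove points whose mean k-nearest-neighbour distance is an outlier."""
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(sum(dists[:k]) / k)
    limit = statistics.mean(mean_dists) \
        + std_ratio * statistics.pstdev(mean_dists)
    return [p for p, d in zip(points, mean_dists) if d <= limit]


# A small planar cluster plus one stray scanner return far from the block.
cloud = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0),
    (0.5, 0.5, 0.0), (10.0, 10.0, 10.0),
]
cleaned = statistical_filter(cloud)
```

Removing such strays before computing bounding boxes matters because a single far-away point can inflate every size-related feature.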
Abstract: Car accidents cause fatalities, physical injuries, and significant economic losses. Among the leading causes of car accidents is drowsiness behind the wheel, which can affect any driver. Drowsiness and sleepiness often have associated indicators that researchers can use to identify and promptly warn drowsy drivers to avoid potential accidents. This paper proposes a spatiotemporal model for monitoring visual indicators of drowsiness in videos. The model integrates a 3D convolutional neural network (3D-CNN) and long short-term memory (LSTM). The 3DCNN-LSTM can analyze long sequences by applying the 3D-CNN to extract spatiotemporal features from adjacent frames. The learned features are then used as the input of the LSTM component, which models high-level temporal features. In addition, we investigate how training of the proposed model is affected by changing the position of the batch normalization (BN) layers in the 3D-CNN units. The BN layer is examined in two placements: before the non-linear activation function and after it. The study was conducted on two publicly available drowsy-driver datasets, 3MDAD and YawDD. 3MDAD is mainly composed of two synchronized datasets recorded from the frontal and side views of the drivers. We show that changing the position of the BN layers increases the convergence speed and reduces overfitting on one dataset but not the other. The resulting model achieves test detection accuracies of 96%, 93%, and 90% on YawDD, Side-3MDAD, and Front-3MDAD, respectively.
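The two BN placements compared above differ only in the order of two operations inside each unit. The toy sketch below applies them to a single batch of scalar pre-activations rather than to real 3D convolution outputs; ReLU stands in for the unspecified non-linear activation, the learnable scale and shift of BN are omitted, and all function names are hypothetical.

```python
from statistics import mean, pvariance


def batch_norm(batch, eps=1e-5):
    """Normalise a batch of scalars to zero mean and (almost) unit variance."""
    mu, var = mean(batch), pvariance(batch)
    return [(x - mu) / (var + eps) ** 0.5 for x in batch]


def relu(batch):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in batch]


def unit_bn_before_activation(pre_activations):
    """conv -> BN -> ReLU: the activation receives normalised inputs."""
    return relu(batch_norm(pre_activations))


def unit_bn_after_activation(pre_activations):
    """conv -> ReLU -> BN: the next layer receives normalised outputs."""
    return batch_norm(relu(pre_activations))
```

With BN first, the unit's output is non-negative and not zero-centred; with BN last, the output is zero-centred and can be negative. This is one way the placement can change what downstream layers see during training.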
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 51075083).
Abstract: To improve the accuracy and efficiency of 3D model retrieval, a method based on the affinity propagation clustering algorithm is proposed. First, a projection ray-based method is proposed to improve the efficiency of feature extraction from 3D models. Based on the relationship between a model and its projection, intersection tests in 3D space are transformed into intersection tests in 2D space, which reduces the number of intersection computations and improves the efficiency of the extraction algorithm. For feature extraction, the multi-layer spheres method is analyzed. The two-layer spheres method makes the feature vector more accurate and improves retrieval precision. Second, Semi-supervised Affinity Propagation (S-AP) clustering is used because it can be applied to different cluster structures. The S-AP algorithm is adopted to find the center models, from which the center model collection is built. During retrieval, the collection is used to classify the query model into the corresponding model base, and the most similar model is then retrieved from that model base. Finally, 75 sample models from the Princeton library are selected for the experiment, and 36 models are used for the retrieval test. The results show that the proposed method outperforms the original method and that the retrieval precision and recall ratios are effectively improved.
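Affinity propagation picks "center" items (exemplars) by passing responsibility and availability messages between all pairs of items. The sketch below implements the standard unsupervised algorithm only, as a rough illustration of how center models could be found; it is not the semi-supervised S-AP variant described above, and the function name, damping value, and iteration count are assumptions.

```python
def affinity_propagation(S, damping=0.7, iterations=200):
    """Standard affinity propagation on a similarity matrix S (list of lists).

    S[i][k] is the similarity of item i to candidate exemplar k, and the
    diagonal S[k][k] holds the preference of k to become an exemplar.
    Requires at least two items. Returns, for each item, its exemplar index.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(v for kk, v in enumerate(vals) if kk != k)
                new = S[i][k] - best_other
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]
```

A small usage example on one-dimensional stand-in features, with negative squared distance as the similarity and a hand-picked preference:

```python
xs = [0.0, 1.0, 2.0, 50.0, 51.0, 52.0]
S = [[-(a - b) ** 2 for b in xs] for a in xs]
for k in range(len(xs)):
    S[k][k] = -100.0  # preference: lower values yield fewer exemplars
labels = affinity_propagation(S)
```

In a retrieval setting, a query would first be compared against the exemplars only, and then against the members of the closest exemplar's group.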
Funding: This work is funded by the Training Plan for Young Backbone Teachers in Colleges and Universities in Henan Province under Grant No. 2021GGJS077.
Abstract: Three-dimensional (3D) reconstruction technology based on structured light is widely used in industrial measurement. To address the high mismatch rate and poor real-time performance caused by factors such as system jitter and noise, a lightweight stripe image feature extraction algorithm based on the You Only Look Once v4 (YOLOv4) network is proposed. First, MobileNetV3 is used as the backbone network to extract features efficiently. Then the Mish activation function and the Complete Intersection over Union (CIoU) loss function are used to calculate the improved target-frame regression loss, which improves the accuracy and real-time performance of feature detection. Simulation results show that the size of the improved model is only 52 MB, the mean average precision (mAP) of fringe image data reconstruction reaches 82.11%, and the 3D point cloud restoration rate reaches 90.1%. Compared with the existing model, the improved model has clear advantages and can satisfy the accuracy and real-time requirements of reconstruction tasks on resource-constrained equipment.
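The Mish activation and the CIoU loss mentioned above both have compact closed forms. The sketch below writes them out for scalar inputs and a single pair of axis-aligned boxes, using the commonly published definitions; it is not taken from the paper's implementation, and the `(x1, y1, x2, y2)` box format and function names are assumptions.

```python
import math


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    # For large x, softplus(x) is numerically equal to x; this avoids overflow.
    softplus = math.log1p(math.exp(x)) if x < 20 else x
    return x * math.tanh(softplus)


def ciou_loss(box, gt):
    """1 - CIoU for boxes (x1, y1, x2, y2) with positive width and height."""
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    inter_w = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    inter_h = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = inter_w * inter_h
    iou = inter / (w * h + wg * hg - inter)
    # Squared distance between the two box centres.
    dx = (box[0] + box[2]) - (gt[0] + gt[2])
    dy = (box[1] + box[3]) - (gt[1] + gt[3])
    rho2 = (dx ** 2 + dy ** 2) / 4
    # Squared diagonal of the smallest box enclosing both.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, the centre-distance term still gives a useful signal when the predicted and target boxes do not overlap at all.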