Journal Articles
972 articles found
1. Transforming Education with Photogrammetry: Creating Realistic 3D Objects for Augmented Reality Applications
Authors: Kaviyaraj Ravichandran, Uma Mohan. Computer Modeling in Engineering & Sciences (SCIE, EI), 2025(1): 185-208 (24 pages)
Augmented reality (AR) is an emerging, dynamic technology that effectively supports education across different levels, and the increased use of mobile devices amplifies its impact. As demand for AR applications in education continues to grow, educators actively seek innovative and immersive methods to engage students in learning. Exploring these possibilities, however, also entails identifying and overcoming barriers to optimal educational integration. One such barrier is three-dimensional (3D) modeling: creating 3D objects for AR education applications can be challenging and time-consuming for educators. To address this, we developed a pipeline that creates realistic 3D objects from two-dimensional (2D) photographs; augmented and virtual reality applications can then use these objects. We evaluated the proposed pipeline on the usability of the 3D objects and on performance metrics. With 117 respondents, the co-creation team was surveyed with open-ended questions to evaluate the precision of the 3D objects created by the proposed photogrammetry pipeline. We analyzed the survey data using descriptive-analytical methods and found that the pipeline produces 3D models that are judged accurate when compared to real-world objects, with an average mean score above 8. This study adds new knowledge on creating 3D objects for AR applications with the photogrammetry technique, and discusses open problems and future research directions for 3D objects in the education sector.
Keywords: augmented reality, education, immersive learning, 3D object creation, photogrammetry, structure from motion
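The entry above describes its photogrammetry pipeline only at a high level. The geometric core of structure-from-motion, recovering a 3D point from its 2D projections in two calibrated views, can be sketched as direct linear transform (DLT) triangulation; the cameras and coordinates below are illustrative, not from the paper:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: (u, v) observations of the same point in each image.
    Returns the 3D point in world coordinates.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Solve A X = 0 in the least-squares sense via SVD.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two simple cameras: one at the origin, one shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_hat = triangulate(P1, P2, x1, x2)
```

A full pipeline repeats this over many matched features across many photographs and then fits surfaces to the resulting point cloud.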
2. Study on Color Difference of Color Reproduction of 3D Objects
Authors: GU Chong, DENG Yi-qiang. 印刷与数字媒体技术研究 (北大核心), 2025(4): 33-38, 69 (7 pages)
To investigate the applicability of four color difference formulas commonly used in the printing field (CIELAB, CIE94, CMC(1:1), and CIEDE2000) to 3D objects, as well as the impact of four standard light sources (D65, D50, A, and TL84) on 3D color difference evaluation, 50 glossy spheres with a diameter of 2 cm were produced on a Sailner J400 3D color printing device, centered on the five colors recommended by the CIE (gray, red, yellow, green, and blue). Color differences were calculated with the four formulas, and 111 pairs of experimental samples meeting the CIELAB gray-scale color difference requirement (1.0-14.0) were selected. Ten observers aged 22 to 27 with normal color vision participated in the study, using the gray-scale method from psychophysical experiments to evaluate color differences under the four light sources, with repeated trials for each observer. The results indicated that the D65 light source had minimal overall effect on the color difference of 3D objects. In contrast, D50 and A had a significant impact within the small color difference range, while TL84 influenced both large and small color differences considerably. Among the four formulas, CIEDE2000 demonstrated the best predictive performance for color differences on 3D objects, followed by CMC(1:1), CIE94, and CIELAB.
Keywords: color difference formula, 3D objects, light source, gray scale, normalized residual sum of squares
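For readers comparing the formulas named above, the two simplest, CIE76 (plain CIELAB distance) and CIE94, are easy to state in code; CMC(1:1) and CIEDE2000 add further correction terms not shown here. This is a generic sketch, not the paper's implementation:

```python
import math

def delta_e_cie76(lab1, lab2):
    """CIE76: plain Euclidean distance between two CIELAB triples."""
    return math.dist(lab1, lab2)

def delta_e_cie94(lab_ref, lab_sample, kL=1.0, K1=0.045, K2=0.015):
    """CIE94 with graphic-arts weights; note the formula is asymmetric,
    so the reference color must come first."""
    L1, a1, b1 = lab_ref
    L2, a2, b2 = lab_sample
    dL = L1 - L2
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    dC = c1 - c2
    # Hue difference squared, clamped against rounding below zero.
    dH2 = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    sC = 1.0 + K1 * c1
    sH = 1.0 + K2 * c1
    return math.sqrt((dL / kL) ** 2 + (dC / sC) ** 2 + dH2 / sH ** 2)

e76 = delta_e_cie76((50.0, 0.0, 0.0), (50.0, 3.0, 4.0))
e94 = delta_e_cie94((50.0, 0.0, 0.0), (50.0, 3.0, 4.0))
```

For a neutral (gray) reference the chroma weights reduce to 1, so the two formulas agree; they diverge as reference chroma grows, which is part of why the study above finds different predictive performance across formulas.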
3. Syn-Aug: An Effective and General Synchronous Data Augmentation Framework for 3D Object Detection
Authors: Huaijin Liu, Jixiang Du, Yong Zhang, Hongbo Zhang, Jiandian Zeng. CAAI Transactions on Intelligence Technology, 2025(3): 912-928 (17 pages)
Data augmentation plays an important role in boosting the performance of 3D models, yet few studies apply it to 3D point cloud data. Global augmentation and cut-paste are the commonly used augmentation techniques for point clouds: global augmentation is applied to the entire point cloud of the scene, while cut-paste samples objects from other frames into the current frame. Both can improve performance, but cut-paste cannot effectively handle the occlusion relationship between foreground objects and the background scene, or the plausibility of object sampling, which can be counterproductive and hurt overall performance. In addition, LiDAR is susceptible to signal loss, external occlusion, extreme weather, and other factors that readily change object shapes, and neither global augmentation nor cut-paste effectively improves the model's robustness to such changes. To this end, we propose Syn-Aug, a synchronous data augmentation framework for LiDAR-based 3D object detection. Specifically, we first propose a novel rendering-based object augmentation technique (Ren-Aug) to enrich training data while enhancing scene realism. Second, we propose a local augmentation technique (Local-Aug) that generates local noise by rotating and scaling objects in the scene while avoiding collisions, which improves generalisation. Finally, we make full use of the structural information in 3D labels to make the model more robust by randomly changing the geometry of objects in the training frames. We verify the proposed framework with four different types of 3D object detectors. Experimental results show that Syn-Aug significantly improves the performance of various 3D object detectors on the KITTI and nuScenes datasets, proving its effectiveness and generality. On KITTI, the four baseline models improved mAP by 0.89%, 1.35%, 1.61%, and 1.14% respectively; on nuScenes, by 14.93%, 10.42%, 8.47%, and 6.81% respectively. The code is available at https://github.com/liuhuaijjin/Syn-Aug.
Keywords: 3D object detection, data augmentation, diversity, generalisation, point cloud, robustness
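The global-augmentation baseline the abstract contrasts with Syn-Aug is straightforward: the whole scene point cloud is rotated about the vertical axis, scaled, and possibly flipped, with boxes transformed identically. A minimal sketch with illustrative parameter ranges, not the paper's settings:

```python
import numpy as np

def global_augment(points, rng):
    """Global augmentation of a LiDAR scene: random rotation about the
    vertical (z) axis, uniform scaling, and a random flip over the x-axis.

    points: (N, 3) array of scene points; ground-truth boxes would be
    transformed with the same rotation, scale, and flip."""
    theta = rng.uniform(-np.pi / 4, np.pi / 4)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    scale = rng.uniform(0.95, 1.05)
    out = points @ R.T * scale
    if rng.random() < 0.5:
        out[:, 1] = -out[:, 1]  # flip across the x-z plane
    return out

rng = np.random.default_rng(0)
pts = rng.normal(size=(100, 3))
aug = global_augment(pts, rng)
```

Because every transform here is a similarity, distances within the scene change only by the single scale factor, which is exactly why this baseline cannot create the per-object shape variation that Local-Aug and the label-geometry perturbations above target.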
4. Algorithm and System for Scanning Color 3D Objects (Cited by 1)
Authors: 许智钦, 孙长库, 郑义忠. Transactions of Tianjin University (EI, CAS), 2002(2): 134-138 (5 pages)
This paper presents a complete system for scanning the geometry and texture of a large 3D object; automatic registration is then performed to obtain a complete, realistic 3D model. The system is composed of a line-strip laser and a color CCD camera. The scanned object is imaged twice by the camera: first the texture of the object is captured, then the 3D information of the object is obtained from the laser plane equations. The paper presents a practical way to implement this three-dimensional measuring method and the automatic registration of a large 3D object, and good results are obtained in experimental verification.
Keywords: 3D measurement, color 3D object, laser scanning, surface construction
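The step "the 3D information of the scanned object is obtained from laser plane equations" amounts to intersecting each back-projected pixel ray with the calibrated laser plane. A minimal sketch, assuming a pinhole camera at the origin and an illustrative plane:

```python
import numpy as np

def intersect_ray_plane(ray_dir, plane):
    """Intersect a camera ray through the origin with the laser plane.

    ray_dir: direction (x, y, 1) of the back-projected pixel ray.
    plane:   (a, b, c, d) with a*X + b*Y + c*Z + d = 0.
    Returns the 3D point where the ray meets the plane."""
    n = np.asarray(plane[:3], dtype=float)
    d = float(plane[3])
    denom = n @ ray_dir
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the laser plane")
    t = -d / denom
    return t * np.asarray(ray_dir, dtype=float)

# Laser plane Z = 2, i.e. 0*X + 0*Y + 1*Z - 2 = 0.
p = intersect_ray_plane(np.array([0.3, -0.1, 1.0]), (0.0, 0.0, 1.0, -2.0))
```

Sweeping the laser stripe over the object and repeating this intersection for every lit pixel yields the range image that is later textured with the color frame.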
5. General and Robust Voxel Feature Learning with Transformer for 3D Object Detection (Cited by 1)
Authors: LI Yang, GE Hongwei. Journal of Measurement Science and Instrumentation (CAS, CSCD), 2022(1): 51-60 (10 pages)
Self-attention networks and the Transformer have dominated machine translation and natural language processing, and have shown great potential in vision tasks such as image classification and object detection. Inspired by this progress, we propose a novel, general, and robust voxel feature encoder for 3D object detection based on the traditional Transformer. We first investigate the permutation invariance of self-attention on sequence data and apply it to point cloud processing. We then construct a voxel feature layer based on self-attention that adaptively learns a local, robust context for each voxel from the spatial relationships and context exchanged among all points within the voxel. Finally, we build a general voxel feature learning framework for 3D object detection with this layer at its core. The voxel feature with Transformer (VFT) can be plugged easily into any other voxel-based 3D object detection framework, serving as the backbone of the voxel feature extractor. Experimental results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
Keywords: 3D object detection, self-attention networks, voxel feature with Transformer (VFT), point cloud, encoder-decoder
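The permutation invariance this entry exploits can be checked directly: self-attention over the points of a voxel is permutation-equivariant, so a symmetric pooling of its output is order-independent. A single-head numpy sketch with random illustrative weights, not the paper's VFT layer:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def voxel_feature(points_feat, w_q, w_k, w_v):
    """Single-head self-attention over the points of one voxel, followed
    by max-pooling. The pooled vector does not depend on point order."""
    q, k, v = points_feat @ w_q, points_feat @ w_k, points_feat @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)
    return (attn @ v).max(axis=0)

rng = np.random.default_rng(1)
feats = rng.normal(size=(8, 4))                 # 8 points, 4-dim features
w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
v1 = voxel_feature(feats, w_q, w_k, w_v)
v2 = voxel_feature(feats[rng.permutation(8)], w_q, w_k, w_v)  # shuffled
```

Because attention weights are computed pairwise from the points themselves, shuffling the rows only shuffles the attended features, and the symmetric max-pool erases that ordering.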
6. Exploring Local Regularities for 3D Object Recognition
Authors: TIAN Huaiwen, QIN Shengfeng. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2016(6): 1104-1113 (10 pages)
In order to find better simplicity measurements for 3D object recognition, a new set of local regularities is developed and tested in a stepwise 3D reconstruction method: localized minimizing standard deviation of angles (L-MSDA), localized minimizing standard deviation of segment magnitudes (L-MSDSM), localized minimum standard deviation of areas of child faces (L-MSDAF), localized minimum sum of segment magnitudes of common edges (L-MSSM), and localized minimum sum of areas of child faces (L-MSAF). Based on their effectiveness measured in terms of form and size distortions, it is found that combining two local regularities, L-MSDA and L-MSDSM, produces better performance, and the best weighting for the combination is identified as 10% for L-MSDSM and 90% for L-MSDA. The test results show that the combined use of L-MSDA and L-MSDSM with these weightings has the potential to be applied in other optimization-based 3D recognition methods to improve their efficacy and robustness.
Keywords: stepwise 3D reconstruction, localized regularities, 3D object recognition, polyhedral objects, line drawing
7. MMDistill: Multi-Modal BEV Distillation Framework for Multi-View 3D Object Detection
Authors: Tianzhe Jiao, Yuming Chen, Zhe Zhang, Chaopeng Guo, Jie Song. Computers, Materials & Continua (SCIE, EI), 2024(12): 4307-4325 (19 pages)
Multi-modal 3D object detection has achieved remarkable progress, but its high cost and low efficiency often limit it in practical industrial deployment. Multi-view camera-based methods provide a feasible, low-cost alternative; however, camera data lack geometric depth, and achieving high accuracy from cameras alone is challenging. This paper proposes a multi-modal Bird's-Eye-View (BEV) distillation framework (MMDistill) to make a trade-off between the two. MMDistill is a carefully crafted two-stage distillation framework based on teacher and student models for learning cross-modal knowledge and generating multi-modal features. It improves the performance of unimodal detectors without introducing additional cost at inference, and effectively bridges the gap caused by the heterogeneity between modalities. Furthermore, we propose a Light Detection and Ranging (LiDAR)-guided geometric compensation module that helps the student model obtain effective geometric features and further reduces the gap between modalities. The proposed method generally requires fewer computational resources and runs faster than traditional multi-modal models, enabling multi-modal technology to be applied more widely in practical scenarios. Experiments validate the effectiveness and superiority of MMDistill on the nuScenes dataset, with improvements of 4.1% mean Average Precision (mAP) and 4.6% NuScenes Detection Score (NDS) over the baseline detector. Detailed ablation studies are also presented.
Keywords: 3D object detection, multi-modal, knowledge distillation, deep learning, remote sensing
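The distillation idea, a camera-only student imitating a LiDAR-equipped teacher's BEV features, is commonly realized as a masked feature-imitation loss. The sketch below is a generic simplification with a hypothetical foreground mask, not MMDistill's actual loss:

```python
import numpy as np

def bev_distill_loss(student_bev, teacher_bev, fg_mask=None):
    """Feature-imitation loss between aligned (C, H, W) BEV maps.

    fg_mask: optional (H, W) 0/1 mask that focuses the loss on foreground
    cells, since empty BEV cells carry little transferable knowledge."""
    diff = (student_bev - teacher_bev) ** 2
    if fg_mask is None:
        return diff.mean()
    diff = diff * fg_mask                         # broadcast over channels
    return diff.sum() / (fg_mask.sum() * student_bev.shape[0] + 1e-6)

teacher = np.ones((2, 4, 4))
loss_same = bev_distill_loss(teacher, teacher)
loss_off = bev_distill_loss(teacher + 1.0, teacher, np.ones((4, 4)))
```

At inference only the student runs, which is why such a framework adds no cost at deployment time.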
8. MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection
Authors: Peicheng Shi, Zhiqiang Liu, Heng Qi, Aixi Yang. Computers, Materials & Continua (SCIE, EI), 2023(6): 5615-5637 (23 pages)
In complex traffic environments, it is very important for autonomous vehicles to accurately perceive, in advance, the dynamic information of other vehicles around them. The accuracy of 3D object detection is affected by illumination changes, object occlusion, and detection distance. We face these challenges by proposing a multimodal feature fusion network for 3D object detection (MFF-Net). First, a spatial transformation projection algorithm maps image features into the feature space, so that the image features share the same spatial dimension as the point cloud features when fused. Then, feature channels are weighted with an adaptive expression-augmentation fusion network to enhance important network features, suppress useless features, and increase the network's directionality toward features. Finally, we reduce the probability of false and missed detections in the non-maximum suppression algorithm by raising a one-dimensional threshold. Together these steps form a complete 3D object detection network based on multimodal feature fusion. Experimental results show the proposed network achieves an average accuracy of 82.60% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, outperforming previous state-of-the-art multimodal fusion networks; on the Easy, Moderate, and Hard settings it reaches 90.96%, 81.46%, and 75.39% respectively. This shows that MFF-Net performs well in 3D object detection.
Keywords: 3D object detection, multimodal fusion, neural network, autonomous driving, attention mechanism
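The channel-weighting step ("enhance important network features, suppress useless features") is typically a squeeze-and-excitation style gate. A generic sketch with random stand-in weights, not MFF-Net's exact network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_reweight(feat, w_down, w_up):
    """Squeeze-and-excitation style channel gate: global-average-pool each
    channel of a (C, H, W) map, pass the vector through a small bottleneck,
    then scale the channels by a sigmoid weight so informative channels are
    amplified and weak ones suppressed."""
    c = feat.shape[0]
    z = feat.reshape(c, -1).mean(axis=1)              # squeeze: (C,)
    w = sigmoid(w_up @ np.maximum(w_down @ z, 0.0))   # excitation: (C,)
    return feat * w[:, None, None]

rng = np.random.default_rng(2)
feat = rng.normal(size=(8, 5, 5))
w_down = rng.normal(size=(4, 8))   # bottleneck down-projection
w_up = rng.normal(size=(8, 4))     # bottleneck up-projection
out = channel_reweight(feat, w_down, w_up)
```

Because the gate is a sigmoid, each channel is scaled by a factor in (0, 1): no channel is amplified beyond its input, only relatively emphasized or suppressed.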
9. Monocular 3D Object Detection with Pseudo-LiDAR Confidence Sampling and Hierarchical Geometric Feature Extraction in 6G Network
Authors: Jianlong Zhang, Guangzu Fang, Bin Wang, Xiaobo Zhou, Qingqi Pei, Chen Chen. Digital Communications and Networks (SCIE, CSCD), 2023(4): 827-835 (9 pages)
The high bandwidth and low latency of 6G network technology enable the successful application of monocular 3D object detection on vehicle platforms. Pseudo-LiDAR-based monocular 3D object detection is a low-cost, low-power alternative to LiDAR solutions in autonomous driving. However, this technique has two problems: (1) the poor quality of the generated Pseudo-LiDAR point clouds, caused by the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because LiDAR-based 3D detection networks neglect the global geometric structure of point clouds. We therefore propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a confidence sampling strategy based on a 3D Gaussian distribution that assigns low confidence to points with large depth-estimation error and filters them out accordingly. We then present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture the global geometric features of the point cloud. Our detection framework builds on Point-Voxel R-CNN (PV-RCNN), taking the high-quality Pseudo-LiDAR and enriched geometric features as input. Experimental results show that our method achieves satisfactory performance in monocular 3D object detection.
Keywords: monocular 3D object detection, Pseudo-LiDAR, confidence sampling, hierarchical geometric feature extraction
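The confidence sampling strategy can be sketched in its simplest one-dimensional form: weight each Pseudo-LiDAR point by a zero-mean Gaussian of its estimated depth error and drop low-confidence points. The parameter values below are illustrative, not the paper's:

```python
import numpy as np

def confidence_filter(points, depth_error, sigma=1.0, threshold=0.5):
    """Keep Pseudo-LiDAR points whose Gaussian confidence is high enough.

    points:      (N, 3) Pseudo-LiDAR points.
    depth_error: (N,) estimated depth-estimation error per point.
    A zero-mean Gaussian maps small errors to confidence near 1 and large
    errors to confidence near 0; points below the threshold are dropped."""
    conf = np.exp(-0.5 * (depth_error / sigma) ** 2)
    return points[conf >= threshold], conf

pts = np.arange(12, dtype=float).reshape(4, 3)
err = np.array([0.1, 3.0, 0.2, 5.0])   # large error -> low confidence
kept, conf = confidence_filter(pts, err)
```

In practice the error itself must be estimated (e.g. from depth-network uncertainty), and the paper's version uses a full 3D Gaussian rather than this scalar simplification.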
10. Depth-Guided Vision Transformer With Normalizing Flows for Monocular 3D Object Detection
Authors: Cong Pan, Junran Peng, Zhaoxiang Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024(3): 673-689 (17 pages)
Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and use them as additional input to augment the RGB images. Depth-based methods either convert the estimated depth maps to pseudo-LiDAR and apply LiDAR-based object detectors, or focus on fusion learning of image and depth; however, they show limited performance and efficiency as a result of depth inaccuracy and complex convolutional fusion. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps and achieve more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them with cross-attention to exchange information. Furthermore, with the help of pixel-wise relative depth values in the depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on KITTI and the challenging Waymo Open dataset show the effectiveness of the proposed method and its superior performance over previous counterparts.
Keywords: monocular 3D object detection, normalizing flows, Swin Transformer
11. 3D Object Detection with Attention: Shell-Based Modeling
Authors: Xiaorui Zhang, Ziquan Zhao, Wei Sun, Qi Cui. Computer Systems Science & Engineering (SCIE, EI), 2023(7): 537-550 (14 pages)
LiDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with bounding boxes (BBoxes). However, in the three-dimensional space of autonomous driving scenes, previous detection methods pre-process the original LiDAR point cloud into voxels or pillars, losing the coordinate information of the original points, slowing detection, and degrading bounding box localization. To address these issues, this study proposes a new two-stage network that extracts point cloud features directly with PointNet++, effectively preserving the original coordinate information. To improve detection accuracy, a shell-based modeling method is proposed: it first roughly determines which spherical shell a coordinate belongs to, then refines the result toward the ground truth, narrowing the localization range and improving accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module with a skip-connection structure that highlights some features by weighting them along the feature dimensions; after training, feature weights favorable to object detection grow larger, so the extracted features are better adapted to the detection task. Extensive comparison and ablation experiments on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
Keywords: 3D object detection, autonomous driving, point cloud, shell-based modeling, self-attention mechanism
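Shell-based modeling's first stage, deciding "which spherical shell the coordinates belong to", reduces to binning point radii around a candidate center before the finer regression stage. A minimal sketch with illustrative shell parameters:

```python
import numpy as np

def shell_index(points, center, shell_width=0.5, num_shells=8):
    """Coarse localization by spherical shells: each point is assigned to
    the radius bin it falls in around a candidate object center.

    Returns an (N,) array of shell indices in [0, num_shells - 1];
    radii past the outermost shell are clamped to the last bin."""
    r = np.linalg.norm(points - center, axis=1)
    return np.minimum((r / shell_width).astype(int), num_shells - 1)

center = np.zeros(3)
pts = np.array([[0.1, 0.0, 0.0],
                [0.0, 0.7, 0.0],
                [3.0, 4.0, 0.0]])
idx = shell_index(pts, center)
```

Classifying into coarse shells first and regressing a residual afterwards turns one hard regression into an easier classification plus a small refinement, which is the accuracy argument the abstract makes.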
12. 3D Object Detection Based on Vanishing Point and Prior Orientation (Cited by 2)
Authors: GAO Yongbin, ZHAO Huaqing, FANG Zhijun, HUANG Bo, ZHONG Cengsi. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2019(5): 369-375 (7 pages)
3D object detection is one of the most challenging research tasks in computer vision. To remove the dependence of 3D object proposals on template information in 2.5D-based 3D object detection methods, we propose a 3D object detector based on the fusion of vanishing points and prior orientation, which estimates an accurate 3D proposal from 2.5D data and provides an excellent starting point for 3D object classification and localization. The algorithm first calculates three mutually orthogonal vanishing points from the Euler angles and projects them into the pixel coordinate system. Then the top edge of the 2D proposal is sampled at a preset pitch and the first vertex is taken. Finally, the remaining seven vertices of the 3D proposal are calculated from the linear relationship between the three vanishing points and the vertices, giving the complete 3D proposal. Experimental results show that the proposed method improves the mean Average Precision score by 2.7% over the Amodal3Det method.
Keywords: image analysis, 3D object detection, prior orientation, vanishing point, Euler angle
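The projection of vanishing points "into the pixel coordinate system" follows from standard projective geometry: the vanishing point of a world direction d is the image of its point at infinity, v ~ K R d. A sketch with an illustrative intrinsic matrix, not the paper's calibration:

```python
import numpy as np

def vanishing_point(K, R, direction):
    """Vanishing point of a 3D world direction: the image of the point at
    infinity along that direction, v ~ K @ R @ d (homogeneous)."""
    v = K @ R @ np.asarray(direction, dtype=float)
    return v[:2] / v[2]

K = np.array([[800.0, 0.0, 320.0],    # illustrative pinhole intrinsics
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                          # camera aligned with the world axes
vp = vanishing_point(K, R, [0.0, 0.0, 1.0])   # forward axis
```

With the camera aligned to the world, the forward direction's vanishing point falls on the principal point (320, 240); applying the same map to the three orthogonal box axes (rotated by the Euler angles) gives the three vanishing points the algorithm above uses.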
13. 3D Object Recognition by Classification Using Neural Networks (Cited by 1)
Authors: Mostafa Elhachloufi, Ahmed El Oirrak, Aboutajdine Driss, M. Najib Kaddioui Mohamed. Journal of Software Engineering and Applications, 2011(5): 306-310 (5 pages)
In this paper, a classification method based on neural networks is presented for the recognition of 3D objects. The objective is to classify a query object against objects in a database, which leads to recognition of the query. The 3D objects of this database are transformations of other objects by elements of an overall transformation group; the set of transformations considered in this work is the general affine group.
Keywords: recognition, classification, 3D object, neural network, affine transformation
14. 3D Object Detection Incorporating Instance Segmentation and Image Restoration
Authors: HUANG Bo, HUANG Man, GAO Yongbin, YU Yuxin, JIANG Xiaoyan, ZHANG Juan. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2019(4): 360-368 (9 pages)
3D object detection, which uses color and depth information to localize objects in the 3D world and estimate their physical size and pose, is one of the most important 3D perception tasks in computer vision. To solve the problem of mixed segmentation results when multiple instances appear in one frustum in the F-PointNet method, and the loss of depth information caused by occlusion, a 3D object detection approach based on instance segmentation and image restoration is proposed in this paper. First, instance segmentation with Mask R-CNN on the RGB image avoids mixed segmentation results. Second, for detected occluded objects, the occluding object is removed from the depth map and the empty pixel region is restored with the Criminisi algorithm to recover the missing depth information of the object. Experimental results show that the proposed method improves the average precision score compared with the F-PointNet method.
Keywords: image processing, 3D object detection, instance segmentation, depth information, image restoration
15. ImVoxelENet: Image to Voxels Epipolar Transformer for Multi-View RGB-Based 3D Object Detection
Authors: Gang Xu, Haoyu Liu, Biao Leng, Zhang Xiong. Computational Visual Media, 2025(4): 871-888 (18 pages)
Detecting three-dimensional objects from RGB images alone is a considerable challenge in computer vision. The core issue lies in accurately performing epipolar geometry matching between multiple views to obtain latent geometric priors. Existing methods establish correspondences along epipolar line features in voxel space through multiple convolution layers, but this step typically occurs late in the network, limiting overall performance. To address this challenge, we introduce ImVoxelENet, a novel framework that integrates a geometric epipolar constraint. Starting from the back-projection of pixel-wise features, we design an attention mechanism that captures the relationship between forward and backward features along the ray across multiple views, enabling the early establishment of geometric correspondences and structural connections between epipolar lines. Using ScanNetV2 as a benchmark, extensive comparative and ablation experiments demonstrate that the proposed network achieves a 1.1% improvement in mAP, highlighting its effectiveness in enhancing 3D object detection performance. Our code is available at https://github.com/xug-coder/ImVoxelENet.
Keywords: 3D object detection, epipolar geometry, transformers, attention, deep learning
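The epipolar matching at the core of this method rests on the constraint x'ᵀ F x = 0: a pixel in one view can only correspond to points on its epipolar line in another. For identical, rectified cameras related by a pure translation, the fundamental matrix reduces to the skew matrix of the baseline; the toy setup below, in normalized coordinates, is illustrative:

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# For two identical, rectified cameras related by a pure translation t,
# the fundamental matrix reduces to F = [t]_x.
t = np.array([1.0, 0.0, 0.0])            # stereo baseline along x
F = skew(t)

x = np.array([0.30, 0.10, 1.0])          # point in the left view
line = F @ x                              # its epipolar line in the right view
x_match = np.array([0.45, 0.10, 1.0])     # candidate on the same row
x_off = np.array([0.45, 0.25, 1.0])       # candidate on a different row
```

Here the residual x'ᵀ (F x) is zero for the same-row candidate and nonzero off the line; networks like the one above exploit this by restricting feature matching to voxels that project onto corresponding epipolar lines.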
16. BRTPillar: Boosting Real-Time 3D Object Detection Based on Point Cloud and RGB Image Fusion in Autonomous Driving
Authors: Zhitian Zhang, Hongdong Zhao, Yazhou Zhao, Dan Chen, Ke Zhang, Yanqi Li. International Journal of Intelligent Computing and Cybernetics, 2025(1): 217-235 (19 pages)
Purpose: In autonomous driving, the inherent sparsity of point clouds often limits object detection performance, while existing multimodal architectures struggle to meet the real-time requirements of 3D object detection. The main purpose of this paper is therefore to significantly enhance detection performance, especially the recognition of small objects, and to address slow inference speed. This improves the safety of autonomous driving systems and makes autonomous driving feasible on devices with limited computing power. Design/methodology/approach: BRTPillar first adopts an element-based method to fuse image and point cloud features. Second, a local-global feature interaction method based on an efficient additive attention mechanism is designed to extract multi-scale contextual information. Finally, an enhanced multi-scale feature fusion method is proposed that introduces adaptive spatial and channel interaction attention mechanisms, improving the learning of fine-grained features. Findings: Extensive experiments were conducted on the KITTI dataset. Compared with the benchmark model, 3D box accuracy for cars, pedestrians, and cyclists improved by 3.05%, 9.01%, and 22.65% respectively, and bird's-eye-view accuracy by 2.98%, 10.77%, and 21.14% respectively. Meanwhile, BRTPillar runs at 40.27 Hz, meeting the real-time detection needs of autonomous driving. Originality/value: This paper proposes a boosted multimodal real-time 3D object detection method, BRTPillar, which localizes accurately in many scenarios, especially complex scenes with many small objects, while achieving real-time inference speed.
Keywords: autonomous driving, multimodal, 3D object detection, attention mechanism
17. A BEV Feature Fusion-Based 3D Object Detection Method
Authors: 曹江, 韩雨霖, 王大方, 赵文硕, 赵逸飞, 侯芹忠. 汽车工程 (北大核心), 2026(1): 80-90 (11 pages)
Autonomous vehicles have developed rapidly in recent years; driving safety is their core requirement, and that safety can only be guaranteed by strong perception algorithms. Existing techniques typically fuse features from different sensors in the BEV (bird's-eye view), but the fusion networks in current research are relatively simple. This paper therefore designs a feature fusion network that fuses cross-sensor, cross-modal BEV features, mitigating the spatial misalignment between BEV features, enhancing the BEV features, and improving 3D object detection accuracy. Considering the insufficient accuracy of depth prediction from image data, this paper also designs an image depth supervision network that uses the point cloud to generate Gaussian depth maps, which directly supervise the training of the depth prediction network. Experimental results show that the network achieves an mAP of 0.669 and an NDS of 0.698 on the nuScenes dataset; the image depth predicted by the proposed method is more continuous with fewer jumps, BEV feature edges are clearer, and features at potential object locations are more salient.
Keywords: autonomous driving perception, 3D object detection, multi-sensor fusion, BEV (bird's-eye view)
18. A Lightweight Water-Surface 3D Object Detection Method for Small Unmanned Surface Vehicles
Authors: 张凯, 余道洋, 胡敏, 赵君亮. 哈尔滨工程大学学报 (北大核心), 2026(1): 216-227 (12 pages)
To address the limited computing resources of small unmanned surface vehicle (USV) embedded platforms when deploying high-accuracy 3D object detection models, this paper proposes a lightweight, sensor-fusion-based 3D detection method for water-surface objects. For camera images, a lightweight water-surface object detection model, YOLO-LW, is constructed: a lightweight GCF structure with an embedded SimAM attention mechanism strengthens feature extraction in the backbone; a DySample dynamic upsampling module is adopted in the neck; a small-scale detection head is added to improve the detection accuracy of small surface objects; and a loss function, N-WIoU, combining the normalized Wasserstein distance with a dynamic focusing mechanism is designed to further optimize the model. For LiDAR point clouds, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is improved with a mechanism that adapts its parameters to distance and height, and clusters of surface targets are adjusted automatically according to point cloud reflectivity, accurately obtaining the targets' 3D coordinates. The 2D object positions extracted from the camera are deeply fused with the 3D coordinates obtained from point cloud clustering, realizing real-time detection of the class, 3D position, and range of water-surface objects. Comprehensive experiments show that the method meets the real-time requirements of small-USV embedded platforms while maintaining detection accuracy.
Keywords: unmanned surface vehicle, water-surface objects, 3D object detection, sensor fusion, deep learning, clustering algorithm
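The improved density-based clustering described above adapts its parameters to range, since LiDAR returns get sparser with distance. A minimal brute-force Euclidean clustering with a distance-adaptive radius illustrates the idea; the coefficients are illustrative, and the height and reflectivity adaptations are omitted:

```python
import numpy as np

def adaptive_eps(ranges, eps0=0.4, k=0.01):
    """Neighborhood radius that grows linearly with sensor range."""
    return eps0 + k * ranges

def cluster(points, eps0=0.4, k=0.01):
    """Brute-force Euclidean clustering: two points link when their gap is
    within the larger of their adaptive radii. Returns per-point labels."""
    n = len(points)
    eps = adaptive_eps(np.linalg.norm(points, axis=1), eps0, k)
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack, labels[i] = [i], cur      # grow a new cluster from point i
        while stack:
            j = stack.pop()
            d = np.linalg.norm(points - points[j], axis=1)
            near = (d <= np.maximum(eps, eps[j])) & (labels < 0)
            labels[near] = cur
            stack.extend(np.flatnonzero(near))
        cur += 1
    return labels

# Two point pairs: one near the sensor, one far away with a wider gap.
pts = np.array([[5.0, 0, 0], [5.2, 0, 0], [30.0, 0, 0], [30.5, 0, 0]])
labels = cluster(pts)
```

With a fixed radius the distant pair (0.5 m apart) would be split; the range-adaptive radius keeps it together while still separating the two groups, which is the behavior the abstract's parameter adaptation targets.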
19. Image Attention Transformer Network for Indoor 3D Object Detection (Cited by 3)
Authors: REN KeYan, YAN Tong, HU ZhaoXin, HAN HongGui, ZHANG YunLu. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2024(7): 2176-2190 (15 pages)
Point clouds and RGB images are both critical data for 3D object detection. While recent multi-modal methods combine them directly and show remarkable performance, they ignore the distinct forms of these two types of data. To mitigate the influence of this intrinsic difference on performance, we propose a novel yet effective fusion model named the LI-Attention model, which takes both RGB and point cloud features into consideration and assigns a weight to each RGB feature through an attention mechanism. Based on the LI-Attention model, we propose a 3D object detection method called the image attention transformer network (IAT-Net), specialized for indoor RGB-D scenes. Compared with previous work on multi-modal detection, IAT-Net fuses elaborate RGB features from 2D detection results with point cloud features in the attention mechanism, while generating and refining 3D detection results with a transformer model. Extensive experiments demonstrate that our approach outperforms state-of-the-art performance on two widely used indoor 3D object detection benchmarks, SUN RGB-D and NYU Depth V2, and ablation studies are provided to analyze the effect of each module. The source code for IAT-Net is publicly available at https://github.com/wisper181/IAT-Net.
Keywords: 3D object detection, transformer, attention mechanism
20. Visualizing Perceived Spatial Data Quality of 3D Objects within Virtual Globes (Cited by 1)
Authors: Krista Jones, Rodolphe Devillers, Yvan Bedard, Olaf Schroth. International Journal of Digital Earth (SCIE, EI), 2014(10): 771-788 (18 pages)
Virtual globes (VGs) allow Internet users to view geographic data of heterogeneous quality created by other users. This article presents a new approach for collecting and visualizing information about the perceived quality of 3D data in VGs, aiming to improve users' awareness of the quality of 3D objects. Instead of relying on existing metadata or on formal accuracy assessments that are often impossible in practice, we propose a crowd-sourced quality recommender system based on the five-star visualization method successful in other types of Web applications. Four alternative five-star visualizations were implemented in a Google Earth-based prototype and tested through a formal user evaluation, which helped identify the most effective method for a 3D environment. Results indicate that while most websites use a visualization showing a "number of stars", this method was the least preferred by participants. Instead, participants ranked the "number within a star" method highest, as it reduced visual clutter in urban settings, suggesting that 3D environments such as VGs require different design approaches than 2D or non-geographic applications. Results also confirmed that expert and non-expert users of geographic data share similar preferences for the most and least preferred visualization methods.
Keywords: virtual globes, spatial data quality, uncertainty, quality recommender system, five-star, 3D objects