DI-YOLOv5: An Improved Dual-Wavelet-Based YOLOv5 for Dense Small Object Detection
1
Authors: Zi-Xin Li, Yu-Long Wang, Fei Wang. IEEE/CAA Journal of Automatica Sinica, 2025, No. 2, pp. 457-459 (3 pages).
Dear Editor, this letter addresses the fact that, in object detection tasks, small objects spanning only a few pixels disappear from feature maps with large receptive fields as the network deepens. The detection of dense small objects is therefore challenging.
Keywords: small objects; receptive fields; feature maps; detection; dense small objects; object detection; dense objects
Aerial Object Tracking with Attention Mechanisms: Accurate Motion Path Estimation under Moving Camera Perspectives
2
Authors: Yu-Shiuan Tsai, Yuk-Hang Sit. Computer Modeling in Engineering & Sciences, 2025, No. 6, pp. 3065-3090 (26 pages).
To improve small object detection and trajectory estimation from an aerial moving perspective, we propose the Aerial View Attention-PRB (AVA-PRB) model. AVA-PRB integrates two attention mechanisms, Coordinate Attention (CA) and the Convolutional Block Attention Module (CBAM), to enhance detection accuracy. Additionally, Shape-IoU is employed as the loss function to refine localization precision. Our model further incorporates an adaptive feature fusion mechanism, which optimizes multi-scale object representation, ensuring robust tracking in complex aerial environments. We evaluate the performance of AVA-PRB on two benchmark datasets: Aerial Person Detection and VisDrone2019-Det. The model achieves 60.9% mAP@0.5 on the Aerial Person Detection dataset and 51.2% mAP@0.5 on VisDrone2019-Det, demonstrating its effectiveness in aerial object detection. Beyond detection, we propose a novel trajectory estimation method that improves movement path prediction under aerial motion. Experimental results indicate that our approach reduces path deviation by up to 64%, effectively mitigating errors caused by rapid camera movements and background variations. By optimizing feature extraction and enhancing spatial-temporal coherence, our method significantly improves object tracking under aerial moving perspectives. This research addresses the limitations of fixed-camera tracking, enhancing flexibility and accuracy in aerial tracking applications. The proposed approach has broad potential for real-world applications, including surveillance, traffic monitoring, and environmental observation.
Keywords: Aerial View Attention-PRB (AVA-PRB); aerial object tracking; small object detection; deep learning for aerial vision; attention mechanisms in object detection; Shape-IoU loss function; trajectory estimation; drone-based visual surveillance
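The mAP@0.5 figures quoted in the abstract above rest on an intersection-over-union (IoU) test between predicted and ground-truth boxes. The following is a minimal sketch of that test only, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form; the function names are illustrative, and the Shape-IoU loss used in the paper adds further terms that are not reproduced here.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match at mAP@0.5 when IoU >= 0.5."""
    return iou(pred_box, gt_box) >= threshold
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` is 1/7, so that pair would not count as a match at the 0.5 threshold.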
Transorbital craniocerebral injury caused by metallic foreign objects
3
Authors: Chongqing Yang, Hongguang Cui, Xiawei Wang, Chenying Yu, Yan Long. World Journal of Emergency Medicine, 2025, No. 3, pp. 277-279 (3 pages).
Transorbital craniocerebral injury is a relatively rare type of penetrating head injury that poses a significant threat to the ocular and cerebral structures.[1] The clinical prognosis of transorbital craniocerebral injury is closely related to the size, shape, speed, nature, and trajectory of the foreign object, as well as the incidence of central nervous system damage and secondary complications. The foreign objects reported to have caused these injuries are categorized into wooden items, metallic items,[2-8] and other materials, which penetrate the intracranial region via five major pathways, including the orbital roof (OR), superior orbital fissure (SOF), inferior orbital fissure (IOF), optic canal (OC), and sphenoid wing. Herein, we present eight cases of transorbital craniocerebral injury caused by an unusual metallic foreign body.
Keywords: transorbital craniocerebral injury; ocular cerebral structures; foreign object; central nervous system damage; penetrating head injury; foreign objects; metallic foreign objects; clinical prognosis
BOSD: Business Object Based Flexible Software Development for Enterprises (cited by: 1)
4
Authors: Jindan Feng, Dechen Zhan, Lanshun Nie, Xiaofei Xu. Journal of Software Engineering and Applications, 2010, No. 10, pp. 914-925 (12 pages).
Enterprise software needs to adapt to new requirements arising from continuous change management. Recent development methods have increased the flexibility of software. However, previous studies have ignored the stability of business objects and the particular business relationships that support software development. In this paper, a coarse-grained, business-object-based software development method, BOSD, is presented to resolve this problem. By analyzing the characteristics of variable requirements, business objects are abstracted from the business process as separately developed units and are assembled into a system through their relationships. The BOSD methodology is combined with MDA (Model Driven Architecture) and implemented on a semiautomatic platform.
Keywords: business requirement change; business object; relationship; business process; information systems; flexibility; MDA
Hybrid receptive field network for small object detection on drone view
5
Authors: Zhaodong CHEN, Hongbing JI, Yongquan ZHANG, Wenke LIU, Zhigang ZHU. Chinese Journal of Aeronautics, 2025, No. 2, pp. 322-338 (17 pages).
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, and transportation. However, the severe scale differences among objects captured by drones and the lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, with the aim of improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, a corresponding dual-scale loss function is designed to train HRFNet. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves an mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Keywords: drone remote sensing; object detection on drone view; small object detector; hybrid receptive field; feature pyramid network; feature augmentation; multi-scale object detection
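The abstract above describes parallel dilated convolutions that give shallow features different receptive fields. As a rough illustration of why dilation changes the receptive field, the sketch below computes the effective spatial extent of a single dilated kernel using the standard relation k + (k - 1)(d - 1). The function names and the example dilation rates are assumptions for illustration; the abstract does not state HRFNet's actual kernel sizes or rates.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel.

    A k x k kernel with dilation d samples k taps spaced d pixels apart,
    so it spans k + (k - 1) * (d - 1) pixels along each axis.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def parallel_branch_extents(kernel_size, dilations):
    """Extent of each branch in a hypothetical bank of parallel dilated convs."""
    return {d: effective_kernel_size(kernel_size, d) for d in dilations}
```

With a 3x3 kernel and hypothetical dilation rates 1, 2 and 3, the branches span 3, 5 and 7 pixels respectively while keeping the same number of weights per branch.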
Transforming Education with Photogrammetry: Creating Realistic 3D Objects for Augmented Reality Applications
6
Authors: Kaviyaraj Ravichandran, Uma Mohan. Computer Modeling in Engineering & Sciences (indexed in SCIE, EI), 2025, No. 1, pp. 185-208 (24 pages).
Augmented reality (AR) is an emerging dynamic technology that effectively supports education across different levels. The increased use of mobile devices has an even greater impact. As the demand for AR applications in education continues to increase, educators actively seek innovative and immersive methods to engage students in learning. However, exploring these possibilities also entails identifying and overcoming existing barriers to optimal educational integration. Concurrently, this surge in demand has prompted the identification of specific barriers, one of which is three-dimensional (3D) modeling. Creating 3D objects for augmented reality education applications can be challenging and time-consuming for educators. To address this, we have developed a pipeline that creates realistic 3D objects from two-dimensional (2D) photographs. Applications for augmented and virtual reality can then utilize these created 3D objects. We evaluated the proposed pipeline based on the usability of the 3D object and performance metrics. Quantitatively, the co-creation team, with 117 respondents, was surveyed with open-ended questions to evaluate the precision of the 3D object created by the proposed photogrammetry pipeline. We analyzed the survey data using descriptive-analytical methods and found that the proposed pipeline produces 3D models that are rated as accurate when compared to real-world objects, with an average mean score above 8. This study adds new knowledge on creating 3D objects for augmented reality applications using the photogrammetry technique; finally, it discusses potential problems and future research directions for 3D objects in the education sector.
Keywords: augmented reality; education; immersive learning; 3D object creation; photogrammetry; structure from motion
Study on Color Difference of Color Reproduction of 3D Objects
7
Authors: GU Chong, DENG Yi-qiang. 《印刷与数字媒体技术研究》 (PKU Core journal), 2025, No. 4, pp. 33-38, 69 (7 pages).
To investigate the applicability to 3D objects of four color difference formulas commonly used in the printing field (CIELAB, CIE94, CMC(1:1), and CIEDE2000), as well as the impact of four standard light sources (D65, D50, A, and TL84) on 3D color difference evaluations, 50 glossy spheres with a diameter of 2 cm were created with the Sailner J4003D color printing device. These spheres were centered around the five colors recommended by the CIE (gray, red, yellow, green, and blue). Color differences were calculated according to the four formulas, and 111 pairs of experimental samples meeting the CIELAB gray scale color difference requirements (1.0-14.0) were selected. Ten observers, aged between 22 and 27 with normal color vision, participated in this study, using the gray scale method from psychophysical experiments to conduct color difference evaluations under the four light sources, with repeated experiments for each observer. The results indicated that the overall effect of the D65 light source on the color difference of 3D objects was minimal. In contrast, the D50 and A light sources had a significant impact within the small color difference range, while the TL84 light source considerably influenced both large and small color differences. Among the four color difference formulas, CIEDE2000 demonstrated the best predictive performance for color difference in 3D objects, followed by CMC(1:1), CIE94, and CIELAB.
Keywords: color difference formula; 3D objects; light source; gray scale; normalized residual sum of squares
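Of the four formulas compared in the abstract above, the CIELAB color difference is the simplest: it is the Euclidean distance between two colors in L*a*b* space. The sketch below shows that formula only, assuming both colors are already given as `(L*, a*, b*)` tuples; the function name and the sample values are illustrative and are not taken from the study. CIE94, CMC(1:1) and CIEDE2000 add weighting terms that are not reproduced here.

```python
import math


def delta_e_cielab(lab1, lab2):
    """CIELAB color difference: Euclidean distance in (L*, a*, b*) space."""
    return math.sqrt(sum((c1 - c2) ** 2 for c1, c2 in zip(lab1, lab2)))


def within_selected_range(lab1, lab2, low=1.0, high=14.0):
    """Check a pair against the 1.0-14.0 range the abstract says was selected."""
    return low <= delta_e_cielab(lab1, lab2) <= high
```

For two hypothetical grays `(50, 0, 0)` and `(50, 3, 4)`, the difference is exactly 5.0, which falls inside the 1.0-14.0 range quoted in the abstract.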
PF-YOLO: An Improved YOLOv8 for Small Object Detection in Fisheye Images
8
Authors: Cheng Zhang, Cheng Xu, Hongzhe Liu. Journal of Beijing Institute of Technology, 2025, No. 1, pp. 57-70 (14 pages).
Top-view fisheye cameras are widely used in personnel surveillance for their broad field of view, but their unique imaging characteristics pose challenges such as distortion, complex scenes, scale variations, and small objects near image edges. To tackle these, we propose peripheral focus you only look once (PF-YOLO), an enhanced YOLOv8n-based method. Firstly, we introduce a cutting-patch data augmentation strategy to mitigate the problem of insufficient small-object samples in various scenes. Secondly, to enhance the model's focus on small objects near the edges, we design the peripheral focus loss, which uses dynamic focus coefficients to provide greater gradient gains for these objects, improving their regression accuracy. Finally, we design the three-dimensional (3D) spatial-channel coordinate attention C2f module, enhancing spatial and channel perception, suppressing noise, and improving personnel detection. Experimental results demonstrate that PF-YOLO achieves strong performance on the challenging events for person detection from overhead fisheye images (CEPDTOF) and in-the-wild events for people detection and tracking from overhead fisheye cameras (WEPDTOF) datasets. Compared to the original YOLOv8n model, PF-YOLO achieves improvements on CEPDTOF with increases of 2.1%, 1.7% and 2.9% in mean average precision 50 (mAP 50), mAP 50-95, and tively. On WEPDTOF, PF-YOLO achieves substantial improvements with increases of 31.4%, 14.9%, 61.1% and 21.0% in 91.2% and 57.2%, respectively.
Keywords: fisheye; object detection and recognition; small object detection; deep learning
Point-voxel dual transformer for LiDAR 3D object detection
9
Authors: TONG Jigang, YANG Fanhang, YANG Sen, DU Shengzhi. Optoelectronics Letters, 2025, No. 9, pp. 547-554 (8 pages).
In this paper, a two-stage light detection and ranging (LiDAR) three-dimensional (3D) object detection framework is presented, namely point-voxel dual transformer (PV-DT3D), which is a transformer-based method. In the proposed PV-DT3D, point-voxel fusion features are used for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. Subsequently, following the generation of proposals by the region proposal network (RPN), the internal encoded keypoints are fed into the dual transformer encoder-decoder architecture. In 3D object detection, the proposed PV-DT3D takes advantage of both the point-wise transformer and the channel-wise architecture to capture contextual information from the spatial and channel dimensions. Experiments conducted on the highly competitive KITTI 3D car detection leaderboard show that PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
Keywords: proposal refinement; encode representative scene features; point-voxel dual transformer; object detection; LiDAR 3D object detection; generation proposals
ASL-OOD: Hierarchical Contextual Feature Fusion with Angle-Sensitive Loss for Oriented Object Detection
10
Authors: Kexin Wang, Jiancheng Liu, Yuqing Lin, Tuo Wang, Zhipeng Zhang, Wanlong Qi, Xingye Han, Runyuan Wen. Computers, Materials & Continua, 2025, No. 2, pp. 1879-1899 (21 pages).
Detecting oriented targets in remote sensing images amidst complex and heterogeneous backgrounds remains a formidable challenge in the field of object detection. Current frameworks for oriented detection modules are constrained by intrinsic limitations, including excessive computational and memory overheads, discrepancies between predefined anchors and ground truth bounding boxes, intricate training processes, and feature alignment inconsistencies. To overcome these challenges, we present ASL-OOD (Angle-based SIOU Loss for Oriented Object Detection), a novel, efficient, and robust one-stage framework tailored for oriented object detection. The ASL-OOD framework comprises three core components: the Transformer-based Backbone (TB), the Transformer-based Neck (TN), and the Angle-SIOU (Scylla Intersection over Union) based Decoupled Head (ASDH). By leveraging the Swin Transformer, the TB and TN modules offer several key advantages, such as the capacity to model long-range dependencies, preserve high-resolution feature representations, seamlessly integrate multi-scale features, and enhance parameter efficiency. These improvements empower the model to accurately detect objects across varying scales. The ASDH module further enhances detection performance by incorporating angle-aware optimization based on SIOU, ensuring precise angular consistency and bounding box coherence. This approach effectively harmonizes shape loss and distance loss during the optimization process, thereby significantly boosting detection accuracy. Comprehensive evaluations and ablation studies on standard benchmark datasets, with an mAP (mean Average Precision) of 80.16 percent on DOTA, 91.07 percent on HRSC2016, 85.45 percent on MAR20, and 39.7 percent on UAVDT, demonstrate the clear superiority of ASL-OOD over state-of-the-art oriented object detection models. These findings underscore the model's efficacy as an advanced solution for challenging remote sensing object detection tasks.
Keywords: oriented object detection; transformer; deep learning
Infrared road object detection algorithm based on spatial depth channel attention network and improved YOLOv8
11
Authors: LI Song, SHI Tao, JING Fangke, CUI Jie. Optoelectronics Letters, 2025, No. 8, pp. 491-498 (8 pages).
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. It is combined with the channel attention mechanism so that the neural network focuses on the channels with large weight values to better extract low-resolution image feature information. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss function based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% enhancement in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and 38.1%, 37.3% and 16.9% reduction in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
Keywords: feature pyramid network; infrared road object detection; infrared images; F-YOLOv8; backbone networks; channel attention mechanism; spatial depth channel attention network; object detection; improved YOLOv8
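The abstract above replaces strided convolution or pooling with a spatial-to-depth step. As a sketch of the general idea, assuming it follows the usual space-to-depth rearrangement, the function below downsamples a single-channel feature map by moving each `block x block` neighbourhood into separate channels, so resolution drops without discarding any values. The function name and the nested-list representation are illustrative choices, not the paper's implementation.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W map (list of rows) into block*block smaller maps.

    Each output map has size (H / block) x (W / block) and collects the
    pixels at one fixed offset inside every block x block neighbourhood,
    so all input values are kept, unlike strided convolution or pooling.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    channels = []
    for dy in range(block):
        for dx in range(block):
            # Take every block-th row starting at dy and every block-th column starting at dx.
            channels.append([row[dx::block] for row in feature_map[dy::block]])
    return channels
```

Applied to a 4x4 map with `block=2`, this returns four 2x2 maps holding all sixteen original values, which a following convolution can then mix across channels.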
Salient Object Detection Based on Multi-Strategy Feature Optimization
12
Authors: Libo Han, Sha Tao, Wen Xia, Weixin Sun, Li Yan, Wanlin Gao. Computers, Materials & Continua, 2025, No. 2, pp. 2431-2449 (19 pages).
At present, salient object detection (SOD) has achieved considerable progress. However, the methods that perform well still face the issue of inadequate detection accuracy. For example, sometimes there are problems of missed and false detections. Effectively optimizing features to capture key information and better integrating different levels of features to enhance their complementarity are two significant challenges in the domain of SOD. In response to these challenges, this study proposes a novel SOD method based on multi-strategy feature optimization. We propose the multi-size feature extraction module (MSFEM), which uses the attention mechanism, multi-level feature fusion, and the residual block to obtain finer features. This module provides robust support for the subsequent accurate detection of the salient object. In addition, we use two rounds of feature fusion and the feedback mechanism to optimize the features obtained by the MSFEM to improve detection accuracy. The first round of feature fusion is applied to integrate the features extracted by the MSFEM to obtain more refined features. Subsequently, the feedback mechanism and the second round of feature fusion are applied to refine the features, thereby providing a stronger foundation for accurately detecting salient objects. To improve the fusion effect, we propose the feature enhancement module (FEM) and the feature optimization module (FOM). The FEM integrates the upper and lower features with the optimized features obtained by the FOM to enhance feature complementarity. The FOM uses different receptive fields, the attention mechanism, and the residual block to more effectively capture key information. Experimental results demonstrate that our method outperforms 10 state-of-the-art SOD methods.
Keywords: salient object detection; multi-strategy feature optimization; feedback mechanism
YOLOv8s-DroneNet: Small Object Detection Algorithm Based on Feature Selection and ISIoU
13
Authors: Jian Peng, Hui He, Dengyong Zhang. Computers, Materials & Continua, 2025, No. 9, pp. 5047-5061 (15 pages).
Object detection plays a critical role in drone imagery analysis, especially in remote sensing applications where accurate and efficient detection of small objects is essential. Despite significant advancements in drone imagery detection, most models still struggle with small object detection due to challenges such as object size and complex backgrounds. To address these issues, we propose a robust detection model based on You Only Look Once (YOLO) that balances accuracy and efficiency. The model contains several major innovations: a feature selection pyramid network, the Inner-Shape Intersection over Union (ISIoU) loss function and a small object detection head. To overcome the limitations of traditional fusion methods in handling multi-level features, we introduce a Feature Selection Pyramid Network integrated into the Neck component, which preserves shallow feature details critical for detecting small objects. Additionally, recognizing that deep network structures often neglect or degrade small object features, we design a specialized small object detection head in the shallow layers to enhance detection accuracy for these challenging targets. To effectively model both local and global dependencies, we introduce a Conv-Former module that simulates Transformer mechanisms using a convolutional structure, thereby improving feature enhancement. Furthermore, we employ ISIoU to address object imbalance and scale variation. This approach accelerates model convergence and improves regression accuracy. Experimental results show that, compared to the baseline model, the proposed method significantly improves small object detection performance on the VisDrone2019 dataset, with mAP@50 increasing by 4.9% and mAP@50-95 rising by 6.7%. This model also outperforms other state-of-the-art algorithms, demonstrating its reliability and effectiveness in both small object detection and remote sensing image fusion tasks.
Keywords: drone imagery; small object detection; feature selection; convolutional attention
Hypergraph-Based Asynchronous Event Processing for Moving Object Classification
14
Authors: YU Nannan, WANG Chaoyi, QIAO Yu, WANG Yuxin, ZHENG Chenglin, ZHANG Qiang, YANG Xin. Journal of Shanghai Jiaotong University (Science), 2025, No. 5, pp. 952-961 (10 pages).
Unlike traditional video cameras, event cameras capture asynchronous event streams in which each event encodes the pixel location, the trigger timestamp, and the polarity of the brightness change. In this paper, we introduce a novel hypergraph-based framework for moving object classification. Specifically, we capture moving objects with an event camera to perceive and collect asynchronous event streams at a high temporal resolution. Instead of stacking event frames, we encode asynchronous event data into a hypergraph, fully mining the high-order correlation of event data, and design a mixed convolutional hypergraph neural network for training to achieve more efficient and accurate motion target recognition. The experimental results show that our method performs well in moving object classification (e.g., gait identification).
Keywords: hypergraph learning; event stream; moving object classification
Efficient Spatiotemporal Information Utilization for Video Camouflaged Object Detection
15
Authors: Dongdong Zhang, Chunping Wang, Huiying Wang, Qiang Fu. Computers, Materials & Continua, 2025, No. 3, pp. 4319-4338 (20 pages).
Video camouflaged object detection (VCOD) has become a fundamental task in computer vision that has attracted significant attention in recent years. Unlike image camouflaged object detection (ICOD), VCOD requires not only spatial cues but also motion cues. Thus, effectively utilizing spatiotemporal information is crucial for generating accurate segmentation results. Current VCOD methods, which typically focus on exploring motion representation, often integrate spatial and motion features ineffectively, leading to poor performance in diverse scenarios. To address these issues, we design a novel spatiotemporal network with an encoder-decoder structure. During the encoding stage, an adjacent space-time memory module (ASTM) is employed to extract high-level temporal features (i.e., motion cues) from the current frame and its adjacent frames. In the decoding stage, a selective space-time aggregation module is introduced to efficiently integrate spatial and temporal features. Additionally, a multi-feature fusion module is developed to progressively refine the rough prediction by utilizing the information provided by multiple types of features. Furthermore, we incorporate multi-task learning into the proposed network to obtain more accurate predictions. Experimental results show that the proposed method outperforms existing cutting-edge baselines on VCOD benchmarks.
Keywords: video camouflaged object detection; spatiotemporal information; feature fusion; multi-task learning
A Category-Agnostic Hybrid Contrastive Learning Method for Few-Shot Point Cloud Object Detection
16
Authors: Xuejing Li. Computers, Materials & Continua, 2025, No. 5, pp. 1667-1681 (15 pages).
Few-shot point cloud 3D object detection (FS3D) aims to identify and locate objects of novel classes within point clouds using knowledge acquired from annotated base classes and a minimal number of samples from the novel classes. Due to imbalanced training data, existing FS3D methods based on fully supervised learning can lead to overfitting toward base classes, which impairs the network's ability to generalize knowledge learned from base classes to novel classes and also prevents the network from extracting distinctive foreground and background representations for novel class objects. To address these issues, this thesis proposes a category-agnostic contrastive learning approach, enhancing the generalization and identification abilities for almost unseen categories through the construction of pseudo-labels and positive-negative sample pairs unrelated to specific classes. Firstly, this thesis designs a proposal-wise context contrastive module (CCM). By reducing the distance between foreground point features and increasing the distance between foreground and background point features within a region proposal, CCM aids the network in extracting more discriminative foreground and background feature representations without reliance on categorical annotations. Secondly, this thesis utilizes a geometric contrastive module (GCM), which enhances the network's geometric perception capability by employing contrastive learning on the foreground point features associated with various basic geometric components, such as edges, corners, and surfaces, thereby enabling these geometric components to exhibit more distinguishable representations. This thesis also combines category-aware contrastive learning with the former modules to maintain categorical distinctiveness. Extensive experimental results on the FS-SUNRGBD and FS-ScanNet datasets demonstrate the effectiveness of this method, with average precision exceeding the baseline by up to 8%.
Keywords: contrastive learning; few-shot learning; point cloud object detection
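The context contrastive module in the abstract above pulls foreground point features together and pushes them away from background features. The sketch below illustrates that pull/push behaviour with a generic InfoNCE-style loss on cosine similarity. It is a stand-in chosen for illustration, not the paper's loss, which the abstract does not specify; the function names, the temperature value and the toy 2D features are all assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for one anchor feature.

    The loss is small when the anchor is close to the positive (e.g. another
    foreground point) and far from every negative (e.g. background points).
    """
    pos = math.exp(cosine_similarity(anchor, positive) / temperature)
    neg = sum(math.exp(cosine_similarity(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

With toy features, an anchor `(1, 0)` paired with a positive `(1, 0)` and a negative `(0, 1)` gives a loss near zero, while swapping the positive and negative gives a loss near 10, so minimizing it drives the separation the abstract describes.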
Exploration of the Application of Artificial Intelligence Technology in the Transformation of Old Objects
17
Authors: Tonghuan Zhang, Xinyu Yang, Ying Chen, Qiufan Xie. Journal of Electronic Research and Application, 2025, No. 2, pp. 51-57 (7 pages).
With the rapid development of technology, artificial intelligence (AI) is increasingly being applied in various fields. In today's context of resource scarcity and the pursuit of sustainable development and resource reuse, the transformation of old objects is particularly important. This article analyzes the current status of old object transformation and the opportunities brought by the internet to old objects, and delves into the application of artificial intelligence in old object transformation. The focus is on five aspects: intelligent identification and classification, intelligent evaluation and prediction, automation integration, intelligent design and optimization, and integration of 3D printing technology. Finally, the process of "redesigning a piece of old furniture, such as a wooden desk, through AI technology" is described, including the recycling, identification, detection, design, transformation, and final user feedback of the old wooden desk. This illustrates the potential of the "AI + old object transformation" approach, advocates for people to strengthen green environmental protection, and drives sustainable development.
Keywords: Artificial Intelligence (AI); old object transformation; environmental protection
Syn-Aug:An Effective and General Synchronous Data Augmentation Framework for 3D Object Detection
18
Authors: Huaijin Liu, Jixiang Du, Yong Zhang, Hongbo Zhang, Jiandian Zeng. CAAI Transactions on Intelligence Technology, 2025, No. 3, pp. 912-928 (17 pages)
Data augmentation plays an important role in boosting the performance of 3D models, yet very few studies apply this technique to 3D point cloud data. Global augmentation and cut-paste are commonly used augmentation techniques for point clouds, where global augmentation is applied to the entire point cloud of the scene, and cut-paste samples objects from other frames into the current frame. Both types of data augmentation can improve performance, but the cut-paste technique cannot effectively deal with the occlusion relationship between the foreground object and the background scene or with the rationality of object sampling, which can be counterproductive and hurt overall performance. In addition, LiDAR is susceptible to signal loss, external occlusion, extreme weather and other factors, which can easily cause object shape changes, while global augmentation and cut-paste cannot effectively enhance the robustness of the model. To this end, we propose Syn-Aug, a synchronous data augmentation framework for LiDAR-based 3D object detection. Specifically, we first propose a novel rendering-based object augmentation technique (Ren-Aug) to enrich training data while enhancing scene realism. Second, we propose a local augmentation technique (Local-Aug) to generate local noise by rotating and scaling objects in the scene while avoiding collisions, which can improve generalisation performance. Finally, we make full use of the structural information of 3D labels to make the model more robust by randomly changing the geometry of objects in the training frames. We verify the proposed framework with four different types of 3D object detectors. Experimental results show that our proposed Syn-Aug significantly improves the performance of various 3D object detectors on the KITTI and nuScenes datasets, proving the effectiveness and generality of Syn-Aug. On KITTI, four different types of baseline models using Syn-Aug improved mAP by 0.89%, 1.35%, 1.61% and 1.14%, respectively. On nuScenes, four different types of baseline models using Syn-Aug improved mAP by 14.93%, 10.42%, 8.47% and 6.81%, respectively. The code is available at https://github.com/liuhuaijjin/Syn-Aug.
Keywords: 3D object detection; data augmentation; diversity; generalization; point cloud; robustness
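The local augmentation idea in the abstract above (rotate and scale individual objects while avoiding collisions) can be sketched on bird's-eye-view boxes. This is an illustration under stated assumptions, not the Syn-Aug implementation from the linked repository: the box layout `(cx, cy, length, width, yaw)`, the bounding-circle collision test, the parameter ranges, and the names `local_augment` and `collides` are all hypothetical, and the points inside each box (which would need the same transform) are omitted.

```python
import math
import random


def radius(box):
    """Radius of the circle enclosing a (cx, cy, length, width, yaw) box."""
    _, _, length, width, _ = box
    return 0.5 * math.hypot(length, width)


def collides(box, others):
    """Conservative test: two boxes collide if their enclosing circles overlap."""
    return any(
        math.hypot(box[0] - o[0], box[1] - o[1]) < radius(box) + radius(o)
        for o in others
    )


def local_augment(boxes, rng, max_rot=math.pi / 8,
                  scale_range=(0.95, 1.05), tries=10):
    """Randomly rotate and scale each box about its own centre.

    A candidate is accepted only if it does not collide with any other
    box; after `tries` rejected candidates the original box is kept.
    """
    result = list(boxes)
    for i, (cx, cy, length, width, yaw) in enumerate(boxes):
        others = result[:i] + result[i + 1:]
        for _ in range(tries):
            scale = rng.uniform(*scale_range)
            candidate = (cx, cy, length * scale, width * scale,
                         yaw + rng.uniform(-max_rot, max_rot))
            if not collides(candidate, others):
                result[i] = candidate
                break
    return result


# Usage: two well-separated cars receive independent local noise.
scene = [(0.0, 0.0, 4.0, 2.0, 0.0), (10.0, 0.0, 4.0, 2.0, 0.0)]
print(local_augment(scene, random.Random(0)))
```

The enclosing-circle test is deliberately conservative: it can reject valid candidates in crowded scenes, but it never accepts overlapping boxes, and it avoids a rotated-rectangle intersection routine.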
Research Progress on Multi-Modal Fusion Object Detection Algorithms for Autonomous Driving:A Review
19
Authors: Peicheng Shi, Li Yang, Xinlong Dong, Heng Qi, Aixi Yang. Computers, Materials & Continua, 2025, No. 6, pp. 3877-3917 (41 pages)
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular detection and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to the technical pathway used to process point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.
Keywords: multi-modal fusion; 3D object detection; deep learning; autonomous driving
An Infrared-Visible Image Fusion Network with Channel-Switching for Low-Light Object Detection
20
Authors: Tianzhe Jiao, Yuming Chen, Xiaoyue Feng, Chaopeng Guo, Jie Song. Computers, Materials & Continua, 2025, No. 11, pp. 2681-2700 (20 pages)
Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Keywords: infrared-visible image fusion; channel switching; low-light object detection; cross-attention fusion