On grounds of the advent of real-time applications,like autonomous driving,visual surveillance,and sports analysis,there is an augmenting focus of attention towards Multiple-Object Tracking(MOT).The tracking-by-detect...On grounds of the advent of real-time applications,like autonomous driving,visual surveillance,and sports analysis,there is an augmenting focus of attention towards Multiple-Object Tracking(MOT).The tracking-by-detection paradigm,a commonly utilized approach,connects the existing recognition hypotheses to the formerly assessed object trajectories by comparing the simila-rities of the appearance or the motion between them.For an efficient detection and tracking of the numerous objects in a complex environment,a Pearson Simi-larity-centred Kuhn-Munkres(PS-KM)algorithm was proposed in the present study.In this light,the input videos were,initially,gathered from the MOT dataset and converted into frames.The background subtraction occurred whichfiltered the inappropriate data concerning the frames after the frame conversion stage.Then,the extraction of features from the frames was executed.Afterwards,the higher dimensional features were transformed into lower-dimensional features,and feature reduction process was performed with the aid of Information Gain-centred Singular Value Decomposition(IG-SVD).Next,using the Modified Recurrent Neural Network(MRNN)method,classification was executed which identified the categories of the objects additionally.The PS-KM algorithm identi-fied that the recognized objects were tracked.Finally,the experimental outcomes exhibited that numerous targets were precisely tracked by the proposed system with 97%accuracy with a low false positive rate(FPR)of 2.3%.It was also proved that the present techniques viz.RNN,CNN,and KNN,were effective with regard to the existing models.展开更多
Multi-license plate detection in complex scenes is still a challenging task because of multiple vehicle license plates with different sizes and classes in the images having complex background.The edge features of high...Multi-license plate detection in complex scenes is still a challenging task because of multiple vehicle license plates with different sizes and classes in the images having complex background.The edge features of high-density distribution and the high curvature features of stroke turning of Chinese character are important signs to distinguish Chinese license plate from other objects.To accurately detect multiple vehicle license plates with different sizes and classes in complex scenes,a multi-object detection of Chinese license plate method based on improved YOLOv3 network was proposed in this research.The improvements include replacing the residual block of the YOLOv3 backbone network with the Inception-ResNet-A block,imbedding the SPP block into the detection network,cutting the redundant Inception-ResNet-A block to suit for the multi-license plate detection task,and clustering the ground truth boxes of license plates to obtain a new set of anchor boxes.A Chinese vehicle license plate image dataset was built for training and testing the improved network,and the location and class of the license plates in each image were accurately labeled.The dataset has 62,153 pieces of images and 4 classes of China vehicle license plates,almost images have multiple license plates with different sizes.Experiments demonstrated that the multilicense plate detection method obtained 83.4%mAP,98.88%precision,98.17%recall,98.52 F1 score,89.196 BFLOPS and 22 FPS on the test dataset,and whole performance was better than the other five compared networks including YOLOv3,SSD,Faster-RCNN,EfficientDet and RetinaNet.展开更多
Environment perception is one of the most critical technology of intelligent transportation systems(ITS).Motion interaction between multiple vehicles in ITS makes it important to perform multi-object tracking(MOT).How...Environment perception is one of the most critical technology of intelligent transportation systems(ITS).Motion interaction between multiple vehicles in ITS makes it important to perform multi-object tracking(MOT).However,most existing MOT algorithms follow the tracking-by-detection framework,which separates detection and tracking into two independent segments and limit the global efciency.Recently,a few algorithms have combined feature extraction into one network;however,the tracking portion continues to rely on data association,and requires com‑plex post-processing for life cycle management.Those methods do not combine detection and tracking efciently.This paper presents a novel network to realize joint multi-object detection and tracking in an end-to-end manner for ITS,named as global correlation network(GCNet).Unlike most object detection methods,GCNet introduces a global correlation layer for regression of absolute size and coordinates of bounding boxes,instead of ofsetting predictions.The pipeline of detection and tracking in GCNet is conceptually simple,and does not require compli‑cated tracking strategies such as non-maximum suppression and data association.GCNet was evaluated on a multivehicle tracking dataset,UA-DETRAC,demonstrating promising performance compared to state-of-the-art detectors and trackers.展开更多
The increasing global prevalence of mild cognitive impairment(MCI)necessitates a paradigm shift in early detection strategies.Conventional neuropsychological assessment methods,predominantly paper-and-pencil tests suc...The increasing global prevalence of mild cognitive impairment(MCI)necessitates a paradigm shift in early detection strategies.Conventional neuropsychological assessment methods,predominantly paper-and-pencil tests such as the Mini-Mental State Examination and the Montreal Cognitive Assessment,exhibit inherent limitations with respect to accessibility,administration burden,and sensitivity to subtle cognitive decline,particularly among diverse populations.This commentary critically examines a recent study that champions a novel approach:The integration of gait and handwriting kinematic parameters analyzed via machine learning for MCI screening.The present study positions itself within the broader landscape of MCI detection,with a view to comparing its advantages against established neuropsychological batteries,advanced neuroimaging(e.g.,positron emission tomography,magnetic resonance imaging),and emerging fluid biomarkers(e.g.,cerebrospinal fluid,blood-based assays).While the study demonstrates promising accuracy(74.44%area under the curve 0.74 with gait and graphic handwriting)and addresses key unmet needs in accessibility and objectivity,we highlight its cross-sectional nature,limited sample diversity,and lack of dual-task assessment as areas for future refinement.This commentary posits that kinematic biomarkers offer a distinctive,scalable,and ecologically valid approach to widespread MCI screening,thereby complementing existing methods by providing real-world functional insights.Future research should prioritize longitudinal validation,expansion to diverse cohorts,integration with multimodal data including dual-tasking,and the development of highly portable,artificial intelligence-driven solutions to achieve the democratization of early MCI detection and enable timely interventions.展开更多
In recent years,the number of patientswith colon disease has increased significantly.Colon polyps are the precursor lesions of colon cancer.If not diagnosed in time,they can easily develop into colon cancer,posing a s...In recent years,the number of patientswith colon disease has increased significantly.Colon polyps are the precursor lesions of colon cancer.If not diagnosed in time,they can easily develop into colon cancer,posing a serious threat to patients’lives and health.A colonoscopy is an important means of detecting colon polyps.However,in polyp imaging,due to the large differences and diverse types of polyps in size,shape,color,etc.,traditional detection methods face the problem of high false positive rates,which creates problems for doctors during the diagnosis process.In order to improve the accuracy and efficiency of colon polyp detection,this question proposes a network model suitable for colon polyp detection(PD-YOLO).This method introduces the self-attention mechanism CBAM(Convolutional Block Attention Module)in the backbone layer based on YOLOv7,allowing themodel to adaptively focus on key information and ignore the unimportant parts.To help themodel do a better job of polyp localization and bounding box regression,add the SPD-Conv(Symmetric Positive Definite Convolution)module to the neck layer and use deconvolution instead of upsampling.Theexperimental results indicate that the PD-YOLO algorithm demonstrates strong robustness in colon polyp detection.Compared to the original YOLOv7,on the Kvasir-SEG dataset,PD-YOLO has shown an increase of 5.44 percentage points in AP@0.5,showcasing significant advantages over other mainstream methods.展开更多
The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles(UAV)has emerged as a prominent research focus.Due to the considerable distance between UAVs and the photograp...The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles(UAV)has emerged as a prominent research focus.Due to the considerable distance between UAVs and the photographed objects,coupled with complex shooting environments,existing models often struggle to achieve accurate real-time target detection.In this paper,a You Only Look Once v8(YOLOv8)model is modified from four aspects:the detection head,the up-sampling module,the feature extraction module,and the parameter optimization of positive sample screening,and the YOLO-S3DT model is proposed to improve the performance of the model for detecting small targets in aerial images.Experimental results show that all detection indexes of the proposed model are significantly improved without increasing the number of model parameters and with the limited growth of computation.Moreover,this model also has the best performance compared to other detecting models,demonstrating its advancement within this category of tasks.展开更多
The presence of aluminum(Al^(3+))and fluoride(F^(−))ions in the environment can be harmful to ecosystems and human health,highlighting the need for accurate and efficient monitoring.In this paper,an innovative approac...The presence of aluminum(Al^(3+))and fluoride(F^(−))ions in the environment can be harmful to ecosystems and human health,highlighting the need for accurate and efficient monitoring.In this paper,an innovative approach is presented that leverages the power of machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of aluminum(Al^(3+))and fluoride(F^(−))ions in aqueous solutions.The proposed method involves the synthesis of sulfur-functionalized carbon dots(C-dots)as fluorescence probes,with fluorescence enhancement upon interaction with Al^(3+)ions,achieving a detection limit of 4.2 nmol/L.Subsequently,in the presence of F^(−)ions,fluorescence is quenched,with a detection limit of 47.6 nmol/L.The fingerprints of fluorescence images are extracted using a cross-platform computer vision library in Python,followed by data preprocessing.Subsequently,the fingerprint data is subjected to cluster analysis using the K-means model from machine learning,and the average Silhouette Coefficient indicates excellent model performance.Finally,a regression analysis based on the principal component analysis method is employed to achieve more precise quantitative analysis of aluminum and fluoride ions.The results demonstrate that the developed model excels in terms of accuracy and sensitivity.This groundbreaking model not only showcases exceptional performance but also addresses the urgent need for effective environmental monitoring and risk assessment,making it a valuable tool for safeguarding our ecosystems and public health.展开更多
With the approval of more and more genetically modified(GM)crops in our country,GM safety management has become more important.Transgenic detection is a major approach for transgenic safety management.Nevertheless,a c...With the approval of more and more genetically modified(GM)crops in our country,GM safety management has become more important.Transgenic detection is a major approach for transgenic safety management.Nevertheless,a convenient and visual technique with low equipment requirements and high sensitivity for the field detection of GM plants is still lacking.On the basis of the existing recombinase polymerase amplification(RPA)technique,we developed a multiplex RPA(multi-RPA)method that can simultaneously detect three transgenic elements,including the cauliflower mosaic virus 35S gene(CaMV35S)promoter,neomycin phosphotransferaseⅡgene(NptⅡ)and hygromycin B phosphotransferase gene(Hyg),thus improving the detection rate.Moreover,we coupled this multi-RPA technique with the CRISPR/Cas12a reporter system,which enabled the detection results to be clearly observed by naked eyes under ultraviolet(UV)light(254 nm;which could be achieved by a portable UV flashlight),therefore establishing a multi-RPA visual detection technique.Compared with the traditional test strip detection method,this multi-RPA-CRISPR/Cas12a technique has the higher specificity,higher sensitivity,wider application range and lower cost.Compared with other polymerase chain reaction(PCR)techniques,it also has the advantages of low equipment requirements and visualization,making it a potentially feasible method for the field detection of GM plants.展开更多
Security and safety remain paramount concerns for both governments and individuals worldwide.In today’s context,the frequency of crimes and terrorist attacks is alarmingly increasing,becoming increasingly intolerable...Security and safety remain paramount concerns for both governments and individuals worldwide.In today’s context,the frequency of crimes and terrorist attacks is alarmingly increasing,becoming increasingly intolerable to society.Consequently,there is a pressing need for swift identification of potential threats to preemptively alert law enforcement and security forces,thereby preventing potential attacks or violent incidents.Recent advancements in big data analytics and deep learning have significantly enhanced the capabilities of computer vision in object detection,particularly in identifying firearms.This paper introduces a novel automatic firearm detection surveillance system,utilizing a one-stage detection approach named MARIE(Mechanism for Realtime Identification of Firearms).MARIE incorporates the Single Shot Multibox Detector(SSD)model,which has been specifically optimized to balance the speed-accuracy trade-off critical in firearm detection applications.The SSD model was further refined by integrating MobileNetV2 and InceptionV2 architectures for superior feature extraction capabilities.The experimental results demonstrate that this modified SSD configuration provides highly satisfactory performance,surpassing existing methods trained on the same dataset in terms of the critical speedaccuracy trade-off.Through these innovations,MARIE sets a new standard in surveillance technology,offering a robust solution to enhance public safety effectively.展开更多
In recent years,advancements in autonomous vehicle technology have accelerated,promising safer and more efficient transportation systems.However,achieving fully autonomous driving in challenging weather conditions,par...In recent years,advancements in autonomous vehicle technology have accelerated,promising safer and more efficient transportation systems.However,achieving fully autonomous driving in challenging weather conditions,particularly in snowy environments,remains a challenge.Snow-covered roads introduce unpredictable surface conditions,occlusions,and reduced visibility,that require robust and adaptive path detection algorithms.This paper presents an enhanced road detection framework for snowy environments,leveraging Simple Framework forContrastive Learning of Visual Representations(SimCLR)for Self-Supervised pretraining,hyperparameter optimization,and uncertainty-aware object detection to improve the performance of YouOnly Look Once version 8(YOLOv8).Themodel is trained and evaluated on a custom-built dataset collected from snowy roads in Tromsø,Norway,which covers a range of snow textures,illumination conditions,and road geometries.The proposed framework achieves scores in terms of mAP@50 equal to 99%and mAP@50–95 equal to 97%,demonstrating the effectiveness of YOLOv8 for real-time road detection in extreme winter conditions.The findings contribute to the safe and reliable deployment of autonomous vehicles in Arctic environments,enabling robust decision-making in hazardous weather conditions.This research lays the groundwork for more resilient perceptionmodels in self-driving systems,paving the way for the future development of intelligent and adaptive transportation networks.展开更多
The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence(AI)techniques(such as machine learning(ML)and deep learning(DL))to build more e...The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence(AI)techniques(such as machine learning(ML)and deep learning(DL))to build more efficient and reliable intrusion detection systems(IDSs).However,the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs.Many researchers used data preprocessing techniques such as feature selection and normalization to overcome such issues.While most of these researchers reported the success of these preprocessing techniques on a shallow level,very few studies have been performed on their effects on a wider scale.Furthermore,the performance of an IDS model is subject to not only the utilized preprocessing techniques but also the dataset and the ML/DL algorithm used,which most of the existing studies give little emphasis on.Thus,this study provides an in-depth analysis of feature selection and normalization effects on IDS models built using three IDS datasets:NSL-KDD,UNSW-NB15,and CSE–CIC–IDS2018,and various AI algorithms.A wrapper-based approach,which tends to give superior performance,and min-max normalization methods were used for feature selection and normalization,respectively.Numerous IDS models were implemented using the full and feature-selected copies of the datasets with and without normalization.The models were evaluated using popular evaluation metrics in IDS modeling,intra-and inter-model comparisons were performed between models and with state-of-the-art works.Random forest(RF)models performed better on NSL-KDD and UNSW-NB15 datasets with accuracies of 99.86%and 96.01%,respectively,whereas artificial neural network(ANN)achieved the best accuracy of 95.43%on the CSE–CIC–IDS2018 dataset.The RF models also achieved an excellent performance compared to recent works.The results show that normalization and feature selection positively affect IDS modeling.Furthermore,while feature selection benefits simpler algorithms(such as RF),normalization is more useful for complex algorithms like ANNs and deep neural networks(DNNs),and algorithms such as Naive Bayes are unsuitable for IDS modeling.The study also found that the UNSW-NB15 and CSE–CIC–IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDS than the NSL-KDD dataset.Our findings suggest that prioritizing robust algorithms like RF,alongside complex models such as ANN and DNN,can significantly enhance IDS performance.These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.展开更多
Conventional superconducting nanowire single-photon detectors(SNSPDs)have been typically limited in their applications due to their size,weight,and power consumption,which confine their use to laboratory settings.Howe...Conventional superconducting nanowire single-photon detectors(SNSPDs)have been typically limited in their applications due to their size,weight,and power consumption,which confine their use to laboratory settings.However,with the rapid development of remote imaging,sensing technologies,and long-range quantum communication with fewer topographical constraints,the demand for high-efficiency single-photon detectors integrated with avionic platforms is rapidly growing.We herein designed and manufactured the first drone-based SNSPD system with a system detection efficiency(SDE)as high as 91.8%.This drone-based system incorporates high-performance NbTiN SNSPDs,a self-developed miniature liquid helium dewar,and custom-built integrated electrical setups,making it capable of being launched in complex topographical conditions.Such a drone-based SNSPD system may open the use of SNSPDs for applications that demand high SDE in complex environments.展开更多
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones...Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones and lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, committed to improving the multi-scale adaptability of the network;HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.展开更多
Plants play a crucial role in maintaining ecological balance and biodiversity.However,plant health is easily affected by environmental stresses.Hence,the rapid and precise monitoring of plant health is crucial for glo...Plants play a crucial role in maintaining ecological balance and biodiversity.However,plant health is easily affected by environmental stresses.Hence,the rapid and precise monitoring of plant health is crucial for global food security and ecological balance.Currently,traditional detection strategies for monitoring plant health mainly rely on expensive equipment and complex operational procedures,which limit their widespread application.Fortunately,near-infrared(NIR)fluorescence and surface-enhanced Raman scattering(SERS)techniques have been recently highlighted in plants.NIR fluorescence imaging holds the advantages of being non-invasive,high-resolution and real-time,which is suitable for rapid screening in large-scale scenarios.While SERS enables highly sensitive and specific detection of trace chemical substances within plant tissues.Therefore,the complementarity of NIR fluorescence and SERS modalities can provide more comprehensive and accurate information for plant disease diagnosis and growth status monitoring.This article summarizes these two modalities in plant applications,and discusses the advantages of multimodal NIR fluorescence/SERS for a better understanding of a plant’s response to stress,thereby improving the accuracy and sensitivity of detection.展开更多
Deep learning-based object detection has revolutionized various fields,including agriculture.This paper presents a systematic review based on the PRISMA 2020 approach for object detection techniques in agriculture by ...Deep learning-based object detection has revolutionized various fields,including agriculture.This paper presents a systematic review based on the PRISMA 2020 approach for object detection techniques in agriculture by exploring the evolution of different methods and applications over the past three years,highlighting the shift from conventional computer vision to deep learning-based methodologies owing to their enhanced efficacy in real time.The review emphasizes the integration of advanced models,such as You Only Look Once(YOLO)v9,v10,EfficientDet,Transformer-based models,and hybrid frameworks that improve the precision,accuracy,and scalability for crop monitoring and disease detection.The review also highlights benchmark datasets and evaluation metrics.It addresses limitations,like domain adaptation challenges,dataset heterogeneity,and occlusion,while offering insights into prospective research avenues,such as multimodal learning,explainable AI,and federated learning.Furthermore,the main aim of this paper is to serve as a thorough resource guide for scientists,researchers,and stakeholders for implementing deep learning-based object detection methods for the development of intelligent,robust,and sustainable agricultural systems.展开更多
An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyram...An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.展开更多
Aiming at the problem that the existing algorithms for vehicle detection in smart factories are difficult to detect partial occlusion of vehicles,vulnerable to background interference,lack of global vision,and excessi...Aiming at the problem that the existing algorithms for vehicle detection in smart factories are difficult to detect partial occlusion of vehicles,vulnerable to background interference,lack of global vision,and excessive suppression of real targets,which ultimately cause accuracy degradation.At the same time,to facilitate the subsequent positioning of vehicles in the factory,this paper proposes an improved YOLOv8 algorithm.Firstly,the RFCAConv module is combined to improve the original YOLOv8 backbone.Pay attention to the different features in the receptive field,and give priority to the spatial features of the receptive field to capture more vehicle feature information and solve the problem that the vehicle is partially occluded and difficult to detect.Secondly,the SFE module is added to the neck of v8,which improves the saliency of the target in the reasoning process and reduces the influence of background interference on vehicle detection.Finally,the head of the RT-DETR algorithm is used to replace the head in the original YOLOv8 algorithm,which avoids the excessive suppression of the real target while combining the context information.The experimental results show that compared with the original YOLOv8 algorithm,the detection accuracy of the improved YOLOv8 algorithm is improved by 4.6%on the self-made smart factory data set,and the detection speed also meets the real-time requirements of smart factory vehicle detection and subsequent vehicle positioning.展开更多
Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially to areas such as aircraft manufacturing, which ...Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially to areas such as aircraft manufacturing, which require strict part qualification rates. Although being more efficient and practical, few-shot AD has not been well explored. The existing AD methods only extract features in a single frequency while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to build and select pattern descriptors in a series of frequencies adaptively. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns into support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.展开更多
Unmanned aerial vehicle(UAV)imagery poses significant challenges for object detection due to extreme scale variations,high-density small targets(68%in VisDrone dataset),and complex backgrounds.While YOLO-series models...Unmanned aerial vehicle(UAV)imagery poses significant challenges for object detection due to extreme scale variations,high-density small targets(68%in VisDrone dataset),and complex backgrounds.While YOLO-series models achieve speed-accuracy trade-offs via fixed convolution kernels and manual feature fusion,their rigid architectures struggle with multi-scale adaptability,as exemplified by YOLOv8n’s 36.4%mAP and 13.9%small-object AP on VisDrone2019.This paper presents YOLO-LE,a lightweight framework addressing these limitations through three novel designs:(1)We introduce the C2f-Dy and LDown modules to enhance the backbone’s sensitivity to small-object features while reducing backbone parameters,thereby improving model efficiency.(2)An adaptive feature fusion module is designed to dynamically integrate multi-scale feature maps,optimizing the neck structure,reducing neck complexity,and enhancing overall model performance.(3)We replace the original loss function with a distributed focal loss and incorporate a lightweight self-attention mechanism to improve small-object recognition and bounding box regression accuracy.Experimental results demonstrate that YOLO-LE achieves 39.9%mAP@0.5 on VisDrone2019,representing a 9.6%improvement over YOLOv8n,while maintaining 8.5 GFLOPs computational efficiency.This provides an efficient solution for UAV object detection in complex scenarios.展开更多
To map the rock joints in the underground rock mass,a method was proposed to semiautomatically detect the rock joints from borehole imaging logs using a deep learning algorithm.First,450 images containing rock joints ...To map the rock joints in the underground rock mass,a method was proposed to semiautomatically detect the rock joints from borehole imaging logs using a deep learning algorithm.First,450 images containing rock joints were selected from borehole ZKZ01 in the Rumei hydropower station.These images were labeled to establish ground truth which was subdivided into training,validation,and testing data.Second,the YOLO v2 model with optimal parameter settings was constructed.Third,the training and validation data were used for model training,while the test data was used to generate the precision-recall curve for prediction evaluation.Fourth,the trained model was applied to a new borehole ZKZ02 to verify the feasibility of the model.There were 12 rock joints detected from the selected images in borehole ZKZ02 and four geometric parameters for each rock joint were determined by sinusoidal curve fitting.The average precision of the trained model reached 0.87.展开更多
文摘On grounds of the advent of real-time applications,like autonomous driving,visual surveillance,and sports analysis,there is an augmenting focus of attention towards Multiple-Object Tracking(MOT).The tracking-by-detection paradigm,a commonly utilized approach,connects the existing recognition hypotheses to the formerly assessed object trajectories by comparing the simila-rities of the appearance or the motion between them.For an efficient detection and tracking of the numerous objects in a complex environment,a Pearson Simi-larity-centred Kuhn-Munkres(PS-KM)algorithm was proposed in the present study.In this light,the input videos were,initially,gathered from the MOT dataset and converted into frames.The background subtraction occurred whichfiltered the inappropriate data concerning the frames after the frame conversion stage.Then,the extraction of features from the frames was executed.Afterwards,the higher dimensional features were transformed into lower-dimensional features,and feature reduction process was performed with the aid of Information Gain-centred Singular Value Decomposition(IG-SVD).Next,using the Modified Recurrent Neural Network(MRNN)method,classification was executed which identified the categories of the objects additionally.The PS-KM algorithm identi-fied that the recognized objects were tracked.Finally,the experimental outcomes exhibited that numerous targets were precisely tracked by the proposed system with 97%accuracy with a low false positive rate(FPR)of 2.3%.It was also proved that the present techniques viz.RNN,CNN,and KNN,were effective with regard to the existing models.
基金supported by the China Sichuan Science and Technology Program under Grant 2019YFG0299the Fundamental Research Funds of China West Normal University under Grant 19B045the Research Foundation for Talents of China Normal University under Grant 17YC163。
文摘Multi-license plate detection in complex scenes is still a challenging task because of multiple vehicle license plates with different sizes and classes in the images having complex background.The edge features of high-density distribution and the high curvature features of stroke turning of Chinese character are important signs to distinguish Chinese license plate from other objects.To accurately detect multiple vehicle license plates with different sizes and classes in complex scenes,a multi-object detection of Chinese license plate method based on improved YOLOv3 network was proposed in this research.The improvements include replacing the residual block of the YOLOv3 backbone network with the Inception-ResNet-A block,imbedding the SPP block into the detection network,cutting the redundant Inception-ResNet-A block to suit for the multi-license plate detection task,and clustering the ground truth boxes of license plates to obtain a new set of anchor boxes.A Chinese vehicle license plate image dataset was built for training and testing the improved network,and the location and class of the license plates in each image were accurately labeled.The dataset has 62,153 pieces of images and 4 classes of China vehicle license plates,almost images have multiple license plates with different sizes.Experiments demonstrated that the multilicense plate detection method obtained 83.4%mAP,98.88%precision,98.17%recall,98.52 F1 score,89.196 BFLOPS and 22 FPS on the test dataset,and whole performance was better than the other five compared networks including YOLOv3,SSD,Faster-RCNN,EfficientDet and RetinaNet.
基金Supported by National Key Research and Development Program of China(Grant No.2021YFB1600402)National Natural Science Foundation of China(Grant No.52072212)+1 种基金Dongfeng USharing Technology Co.,Ltd.,China Intelli‑gent and Connected Vehicles(Beijing)Research Institute Co.,Ltd.“Shuimu Tsinghua Scholarship”of Tsinghua University of China.
文摘Environment perception is one of the most critical technology of intelligent transportation systems(ITS).Motion interaction between multiple vehicles in ITS makes it important to perform multi-object tracking(MOT).However,most existing MOT algorithms follow the tracking-by-detection framework,which separates detection and tracking into two independent segments and limit the global efciency.Recently,a few algorithms have combined feature extraction into one network;however,the tracking portion continues to rely on data association,and requires com‑plex post-processing for life cycle management.Those methods do not combine detection and tracking efciently.This paper presents a novel network to realize joint multi-object detection and tracking in an end-to-end manner for ITS,named as global correlation network(GCNet).Unlike most object detection methods,GCNet introduces a global correlation layer for regression of absolute size and coordinates of bounding boxes,instead of ofsetting predictions.The pipeline of detection and tracking in GCNet is conceptually simple,and does not require compli‑cated tracking strategies such as non-maximum suppression and data association.GCNet was evaluated on a multivehicle tracking dataset,UA-DETRAC,demonstrating promising performance compared to state-of-the-art detectors and trackers.
文摘The increasing global prevalence of mild cognitive impairment(MCI)necessitates a paradigm shift in early detection strategies.Conventional neuropsychological assessment methods,predominantly paper-and-pencil tests such as the Mini-Mental State Examination and the Montreal Cognitive Assessment,exhibit inherent limitations with respect to accessibility,administration burden,and sensitivity to subtle cognitive decline,particularly among diverse populations.This commentary critically examines a recent study that champions a novel approach:The integration of gait and handwriting kinematic parameters analyzed via machine learning for MCI screening.The present study positions itself within the broader landscape of MCI detection,with a view to comparing its advantages against established neuropsychological batteries,advanced neuroimaging(e.g.,positron emission tomography,magnetic resonance imaging),and emerging fluid biomarkers(e.g.,cerebrospinal fluid,blood-based assays).While the study demonstrates promising accuracy(74.44%area under the curve 0.74 with gait and graphic handwriting)and addresses key unmet needs in accessibility and objectivity,we highlight its cross-sectional nature,limited sample diversity,and lack of dual-task assessment as areas for future refinement.This commentary posits that kinematic biomarkers offer a distinctive,scalable,and ecologically valid approach to widespread MCI screening,thereby complementing existing methods by providing real-world functional insights.Future research should prioritize longitudinal validation,expansion to diverse cohorts,integration with multimodal data including dual-tasking,and the development of highly portable,artificial intelligence-driven solutions to achieve the democratization of early MCI detection and enable timely interventions.
基金funded by the Undergraduate Higher Education Teaching and Research Project(No.FBJY20230216)Research Projects of Putian University(No.2023043)the Education Department of the Fujian Province Project(No.JAT220300).
文摘In recent years,the number of patientswith colon disease has increased significantly.Colon polyps are the precursor lesions of colon cancer.If not diagnosed in time,they can easily develop into colon cancer,posing a serious threat to patients’lives and health.A colonoscopy is an important means of detecting colon polyps.However,in polyp imaging,due to the large differences and diverse types of polyps in size,shape,color,etc.,traditional detection methods face the problem of high false positive rates,which creates problems for doctors during the diagnosis process.In order to improve the accuracy and efficiency of colon polyp detection,this question proposes a network model suitable for colon polyp detection(PD-YOLO).This method introduces the self-attention mechanism CBAM(Convolutional Block Attention Module)in the backbone layer based on YOLOv7,allowing themodel to adaptively focus on key information and ignore the unimportant parts.To help themodel do a better job of polyp localization and bounding box regression,add the SPD-Conv(Symmetric Positive Definite Convolution)module to the neck layer and use deconvolution instead of upsampling.Theexperimental results indicate that the PD-YOLO algorithm demonstrates strong robustness in colon polyp detection.Compared to the original YOLOv7,on the Kvasir-SEG dataset,PD-YOLO has shown an increase of 5.44 percentage points in AP@0.5,showcasing significant advantages over other mainstream methods.
文摘The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles(UAV)has emerged as a prominent research focus.Due to the considerable distance between UAVs and the photographed objects,coupled with complex shooting environments,existing models often struggle to achieve accurate real-time target detection.In this paper,a You Only Look Once v8(YOLOv8)model is modified from four aspects:the detection head,the up-sampling module,the feature extraction module,and the parameter optimization of positive sample screening,and the YOLO-S3DT model is proposed to improve the performance of the model for detecting small targets in aerial images.Experimental results show that all detection indexes of the proposed model are significantly improved without increasing the number of model parameters and with the limited growth of computation.Moreover,this model also has the best performance compared to other detecting models,demonstrating its advancement within this category of tasks.
基金supported by the National Natural Science Foundation of China(No.U21A20290)Guangdong Basic and Applied Basic Research Foundation(No.2022A1515011656)+2 种基金the Projects of Talents Recruitment of GDUPT(No.2023rcyj1003)the 2022“Sail Plan”Project of Maoming Green Chemical Industry Research Institute(No.MMGCIRI2022YFJH-Y-024)Maoming Science and Technology Project(No.2023382).
文摘The presence of aluminum(Al^(3+))and fluoride(F^(−))ions in the environment can be harmful to ecosystems and human health,highlighting the need for accurate and efficient monitoring.In this paper,an innovative approach is presented that leverages the power of machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of aluminum(Al^(3+))and fluoride(F^(−))ions in aqueous solutions.The proposed method involves the synthesis of sulfur-functionalized carbon dots(C-dots)as fluorescence probes,with fluorescence enhancement upon interaction with Al^(3+)ions,achieving a detection limit of 4.2 nmol/L.Subsequently,in the presence of F^(−)ions,fluorescence is quenched,with a detection limit of 47.6 nmol/L.The fingerprints of fluorescence images are extracted using a cross-platform computer vision library in Python,followed by data preprocessing.Subsequently,the fingerprint data is subjected to cluster analysis using the K-means model from machine learning,and the average Silhouette Coefficient indicates excellent model performance.Finally,a regression analysis based on the principal component analysis method is employed to achieve more precise quantitative analysis of aluminum and fluoride ions.The results demonstrate that the developed model excels in terms of accuracy and sensitivity.This groundbreaking model not only showcases exceptional performance but also addresses the urgent need for effective environmental monitoring and risk assessment,making it a valuable tool for safeguarding our ecosystems and public health.
基金the Experimental Technology Research Project of Zhejiang University(SYB202138)National Natural Science Foundation of China(32000195)。
文摘With the approval of more and more genetically modified(GM)crops in our country,GM safety management has become more important.Transgenic detection is a major approach for transgenic safety management.Nevertheless,a convenient and visual technique with low equipment requirements and high sensitivity for the field detection of GM plants is still lacking.On the basis of the existing recombinase polymerase amplification(RPA)technique,we developed a multiplex RPA(multi-RPA)method that can simultaneously detect three transgenic elements,including the cauliflower mosaic virus 35S gene(CaMV35S)promoter,neomycin phosphotransferaseⅡgene(NptⅡ)and hygromycin B phosphotransferase gene(Hyg),thus improving the detection rate.Moreover,we coupled this multi-RPA technique with the CRISPR/Cas12a reporter system,which enabled the detection results to be clearly observed by naked eyes under ultraviolet(UV)light(254 nm;which could be achieved by a portable UV flashlight),therefore establishing a multi-RPA visual detection technique.Compared with the traditional test strip detection method,this multi-RPA-CRISPR/Cas12a technique has the higher specificity,higher sensitivity,wider application range and lower cost.Compared with other polymerase chain reaction(PCR)techniques,it also has the advantages of low equipment requirements and visualization,making it a potentially feasible method for the field detection of GM plants.
文摘Security and safety remain paramount concerns for both governments and individuals worldwide.In today’s context,the frequency of crimes and terrorist attacks is alarmingly increasing,becoming increasingly intolerable to society.Consequently,there is a pressing need for swift identification of potential threats to preemptively alert law enforcement and security forces,thereby preventing potential attacks or violent incidents.Recent advancements in big data analytics and deep learning have significantly enhanced the capabilities of computer vision in object detection,particularly in identifying firearms.This paper introduces a novel automatic firearm detection surveillance system,utilizing a one-stage detection approach named MARIE(Mechanism for Realtime Identification of Firearms).MARIE incorporates the Single Shot Multibox Detector(SSD)model,which has been specifically optimized to balance the speed-accuracy trade-off critical in firearm detection applications.The SSD model was further refined by integrating MobileNetV2 and InceptionV2 architectures for superior feature extraction capabilities.The experimental results demonstrate that this modified SSD configuration provides highly satisfactory performance,surpassing existing methods trained on the same dataset in terms of the critical speedaccuracy trade-off.Through these innovations,MARIE sets a new standard in surveillance technology,offering a robust solution to enhance public safety effectively.
文摘In recent years,advancements in autonomous vehicle technology have accelerated,promising safer and more efficient transportation systems.However,achieving fully autonomous driving in challenging weather conditions,particularly in snowy environments,remains a challenge.Snow-covered roads introduce unpredictable surface conditions,occlusions,and reduced visibility,that require robust and adaptive path detection algorithms.This paper presents an enhanced road detection framework for snowy environments,leveraging Simple Framework forContrastive Learning of Visual Representations(SimCLR)for Self-Supervised pretraining,hyperparameter optimization,and uncertainty-aware object detection to improve the performance of YouOnly Look Once version 8(YOLOv8).Themodel is trained and evaluated on a custom-built dataset collected from snowy roads in Tromsø,Norway,which covers a range of snow textures,illumination conditions,and road geometries.The proposed framework achieves scores in terms of mAP@50 equal to 99%and mAP@50–95 equal to 97%,demonstrating the effectiveness of YOLOv8 for real-time road detection in extreme winter conditions.The findings contribute to the safe and reliable deployment of autonomous vehicles in Arctic environments,enabling robust decision-making in hazardous weather conditions.This research lays the groundwork for more resilient perceptionmodels in self-driving systems,paving the way for the future development of intelligent and adaptive transportation networks.
文摘The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence(AI)techniques(such as machine learning(ML)and deep learning(DL))to build more efficient and reliable intrusion detection systems(IDSs).However,the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs.Many researchers used data preprocessing techniques such as feature selection and normalization to overcome such issues.While most of these researchers reported the success of these preprocessing techniques on a shallow level,very few studies have been performed on their effects on a wider scale.Furthermore,the performance of an IDS model is subject to not only the utilized preprocessing techniques but also the dataset and the ML/DL algorithm used,which most of the existing studies give little emphasis on.Thus,this study provides an in-depth analysis of feature selection and normalization effects on IDS models built using three IDS datasets:NSL-KDD,UNSW-NB15,and CSE–CIC–IDS2018,and various AI algorithms.A wrapper-based approach,which tends to give superior performance,and min-max normalization methods were used for feature selection and normalization,respectively.Numerous IDS models were implemented using the full and feature-selected copies of the datasets with and without normalization.The models were evaluated using popular evaluation metrics in IDS modeling,intra-and inter-model comparisons were performed between models and with state-of-the-art works.Random forest(RF)models performed better on NSL-KDD and UNSW-NB15 datasets with accuracies of 99.86%and 96.01%,respectively,whereas artificial neural network(ANN)achieved the best accuracy of 95.43%on the CSE–CIC–IDS2018 dataset.The RF models also achieved an excellent performance compared to recent works.The results show that normalization and feature selection positively affect IDS modeling.Furthermore,while feature selection benefits simpler algorithms(such as RF),normalization is more useful for complex algorithms like ANNs and deep neural networks(DNNs),and algorithms such as Naive Bayes are unsuitable for IDS modeling.The study also found that the UNSW-NB15 and CSE–CIC–IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDS than the NSL-KDD dataset.Our findings suggest that prioritizing robust algorithms like RF,alongside complex models such as ANN and DNN,can significantly enhance IDS performance.These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.
基金the Innovation Program for Quantum Science and Technology(Grant No.2023ZD0300100)the National Key Research and Development Program of China(Grant Nos.2023YFB3809600 and 2023YFC3007801)+1 种基金the National Natural Science Foundation of China(Grant Nos.62301543 and U24A20320)the Shanghai Sailing Program(Grant No.21YF1455700).
文摘Conventional superconducting nanowire single-photon detectors(SNSPDs)have been typically limited in their applications due to their size,weight,and power consumption,which confine their use to laboratory settings.However,with the rapid development of remote imaging,sensing technologies,and long-range quantum communication with fewer topographical constraints,the demand for high-efficiency single-photon detectors integrated with avionic platforms is rapidly growing.We herein designed and manufactured the first drone-based SNSPD system with a system detection efficiency(SDE)as high as 91.8%.This drone-based system incorporates high-performance NbTiN SNSPDs,a self-developed miniature liquid helium dewar,and custom-built integrated electrical setups,making it capable of being launched in complex topographical conditions.Such a drone-based SNSPD system may open the use of SNSPDs for applications that demand high SDE in complex environments.
基金supported by the National Natural Science Foundation of China(Nos.62276204 and 62203343)the Fundamental Research Funds for the Central Universities(No.YJSJ24011)+1 种基金the Natural Science Basic Research Program of Shanxi,China(Nos.2022JM-340 and 2023-JC-QN-0710)the China Postdoctoral Science Foundation(Nos.2020T130494 and 2018M633470).
文摘Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones and lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, committed to improving the multi-scale adaptability of the network;HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
基金funded by the National Natural Science Foundation of China(Nos.22374055,22022404,22074050,82172055)the National Natural Science Foundation of Hubei Province(No.22022CFA033)the Fundamental Research Funds for the Central Universities(Nos.CCNU24JCPT001,CCNU24JCPT020)。
文摘Plants play a crucial role in maintaining ecological balance and biodiversity.However,plant health is easily affected by environmental stresses.Hence,the rapid and precise monitoring of plant health is crucial for global food security and ecological balance.Currently,traditional detection strategies for monitoring plant health mainly rely on expensive equipment and complex operational procedures,which limit their widespread application.Fortunately,near-infrared(NIR)fluorescence and surface-enhanced Raman scattering(SERS)techniques have been recently highlighted in plants.NIR fluorescence imaging holds the advantages of being non-invasive,high-resolution and real-time,which is suitable for rapid screening in large-scale scenarios.While SERS enables highly sensitive and specific detection of trace chemical substances within plant tissues.Therefore,the complementarity of NIR fluorescence and SERS modalities can provide more comprehensive and accurate information for plant disease diagnosis and growth status monitoring.This article summarizes these two modalities in plant applications,and discusses the advantages of multimodal NIR fluorescence/SERS for a better understanding of a plant’s response to stress,thereby improving the accuracy and sensitivity of detection.
文摘Deep learning-based object detection has revolutionized various fields,including agriculture.This paper presents a systematic review based on the PRISMA 2020 approach for object detection techniques in agriculture by exploring the evolution of different methods and applications over the past three years,highlighting the shift from conventional computer vision to deep learning-based methodologies owing to their enhanced efficacy in real time.The review emphasizes the integration of advanced models,such as You Only Look Once(YOLO)v9,v10,EfficientDet,Transformer-based models,and hybrid frameworks that improve the precision,accuracy,and scalability for crop monitoring and disease detection.The review also highlights benchmark datasets and evaluation metrics.It addresses limitations,like domain adaptation challenges,dataset heterogeneity,and occlusion,while offering insights into prospective research avenues,such as multimodal learning,explainable AI,and federated learning.Furthermore,the main aim of this paper is to serve as a thorough resource guide for scientists,researchers,and stakeholders for implementing deep learning-based object detection methods for the development of intelligent,robust,and sustainable agricultural systems.
基金supported by the National Natural Science Foundation of China(No.62241109)the Tianjin Science and Technology Commissioner Project(No.20YDTPJC01110)。
文摘An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.
基金funded by Changzhou Science and Technology Project(No.CZ20230025)Postgraduate Research&Practice Innovation Program of Jiangsu Province(No.XSJCX23_36).
文摘Aiming at the problem that the existing algorithms for vehicle detection in smart factories are difficult to detect partial occlusion of vehicles,vulnerable to background interference,lack of global vision,and excessive suppression of real targets,which ultimately cause accuracy degradation.At the same time,to facilitate the subsequent positioning of vehicles in the factory,this paper proposes an improved YOLOv8 algorithm.Firstly,the RFCAConv module is combined to improve the original YOLOv8 backbone.Pay attention to the different features in the receptive field,and give priority to the spatial features of the receptive field to capture more vehicle feature information and solve the problem that the vehicle is partially occluded and difficult to detect.Secondly,the SFE module is added to the neck of v8,which improves the saliency of the target in the reasoning process and reduces the influence of background interference on vehicle detection.Finally,the head of the RT-DETR algorithm is used to replace the head in the original YOLOv8 algorithm,which avoids the excessive suppression of the real target while combining the context information.The experimental results show that compared with the original YOLOv8 algorithm,the detection accuracy of the improved YOLOv8 algorithm is improved by 4.6%on the self-made smart factory data set,and the detection speed also meets the real-time requirements of smart factory vehicle detection and subsequent vehicle positioning.
基金supported by the National Natural Science Foundation of China(No.52188102).
文摘Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially to areas such as aircraft manufacturing, which require strict part qualification rates. Although being more efficient and practical, few-shot AD has not been well explored. The existing AD methods only extract features in a single frequency while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to build and select pattern descriptors in a series of frequencies adaptively. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns into support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.
文摘Unmanned aerial vehicle(UAV)imagery poses significant challenges for object detection due to extreme scale variations,high-density small targets(68%in VisDrone dataset),and complex backgrounds.While YOLO-series models achieve speed-accuracy trade-offs via fixed convolution kernels and manual feature fusion,their rigid architectures struggle with multi-scale adaptability,as exemplified by YOLOv8n’s 36.4%mAP and 13.9%small-object AP on VisDrone2019.This paper presents YOLO-LE,a lightweight framework addressing these limitations through three novel designs:(1)We introduce the C2f-Dy and LDown modules to enhance the backbone’s sensitivity to small-object features while reducing backbone parameters,thereby improving model efficiency.(2)An adaptive feature fusion module is designed to dynamically integrate multi-scale feature maps,optimizing the neck structure,reducing neck complexity,and enhancing overall model performance.(3)We replace the original loss function with a distributed focal loss and incorporate a lightweight self-attention mechanism to improve small-object recognition and bounding box regression accuracy.Experimental results demonstrate that YOLO-LE achieves 39.9%mAP@0.5 on VisDrone2019,representing a 9.6%improvement over YOLOv8n,while maintaining 8.5 GFLOPs computational efficiency.This provides an efficient solution for UAV object detection in complex scenarios.
基金supported by the National Key R&D Program of China(No.2023YFC3081200)the National Natural Science Foundation of China(No.42077264)。
文摘To map the rock joints in the underground rock mass,a method was proposed to semiautomatically detect the rock joints from borehole imaging logs using a deep learning algorithm.First,450 images containing rock joints were selected from borehole ZKZ01 in the Rumei hydropower station.These images were labeled to establish ground truth which was subdivided into training,validation,and testing data.Second,the YOLO v2 model with optimal parameter settings was constructed.Third,the training and validation data were used for model training,while the test data was used to generate the precision-recall curve for prediction evaluation.Fourth,the trained model was applied to a new borehole ZKZ02 to verify the feasibility of the model.There were 12 rock joints detected from the selected images in borehole ZKZ02 and four geometric parameters for each rock joint were determined by sinusoidal curve fitting.The average precision of the trained model reached 0.87.