An iterative detection/decoding algorithm of correlated sources for LDPC-based relay systems is presented. The signal from the source-destination (S-D) link is formulated as a highly correlated counterpart of the signal from the relay-destination (R-D) link. A special XOR vector is defined using the correlated hard-decision information blocks from the two decoders, and the extrinsic information exchanged between the two decoders is derived from the log-likelihood ratio (LLR) associated with the XOR vector. This decoding scheme differs from the traditional turbo-like detection/decoding algorithm, where the extrinsic information is computed from the side information and the soft decoder outputs. Simulations show that the presented algorithm performs slightly better than the traditional turbo-like algorithm (taking the (255,175) EG-LDPC code as an example, it achieves a performance gain of about 0.1 dB around BLER = 10^(-4)). Furthermore, the presented algorithm requires fewer computing operations per iteration and converges faster. For example, when decoding the (961,721) QC-LDPC code, the average number of iterations of the presented algorithm is 33 at SNR = 1.8 dB, which is about twice as fast as the turbo-like algorithm. Therefore, the presented decoding algorithm of correlated sources provides an alternative decoding solution for LDPC-based relay systems.
A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder-decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Second, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhances accuracy in defect region identification and feature representation. Third, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict the action and locate the object efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features for subsequent spatio-temporal information modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A multi-constraint loss function composed of one-to-one, one-to-many, and contrastive denoising losses is designed to address the insufficient constraint that traditional methods place on predicted results. This loss function enhances the accuracy of classification predictions and brings regression position predictions closer to ground-truth objects. The proposed model is evaluated on the popular UCF101-24 and JHMDB-21 datasets. Experimental results demonstrate that the proposed method achieves 81.52% on the Frame-mAP metric, surpassing existing methods.
Reliable electricity infrastructure is critical for modern society, highlighting the importance of securing the stability of fundamental power electronic systems. However, as such systems frequently involve high-current and high-voltage conditions, failures are more likely. Consequently, anomaly detection in power electronic systems is of great significance, and it is a task that properly designed neural networks can handle well, as proven in various scenarios. Transformer-like networks are promising for such applications; however, because their structure was originally designed for different tasks, features extracted by the early layers are often lost, which degrades detection performance. In addition, such data-driven methods typically require sufficient anomalous data for training, which can be difficult to obtain in practice. Therefore, to improve feature utilization while achieving efficient unsupervised learning, a novel model, the Densely-connected Decoder Transformer (DDformer), is proposed in this paper for unsupervised anomaly detection in power electronic systems. First, efficient label-free training is achieved based on the concept of an autoencoder with recursive-free output. An encoder-decoder structure with a densely connected decoder is then adopted, merging features from all encoder layers to avoid possible loss of mined features while reducing training difficulty. Both simulation and real-world experiments are conducted to validate the capabilities of DDformer; the average FDR surpasses that of the baseline models, reaching 89.39%, 93.91%, and 95.98% in the respective experimental setups.
Security and safety remain paramount concerns for both governments and individuals worldwide. The frequency of crimes and terrorist attacks is increasing alarmingly and is becoming intolerable to society. Consequently, there is a pressing need for swift identification of potential threats so that law enforcement and security forces can be alerted in advance, thereby preventing potential attacks or violent incidents. Recent advancements in big data analytics and deep learning have significantly enhanced the capabilities of computer vision in object detection, particularly in identifying firearms. This paper introduces a novel automatic firearm detection surveillance system that uses a one-stage detection approach named MARIE (Mechanism for Realtime Identification of Firearms). MARIE incorporates the Single Shot Multibox Detector (SSD) model, which has been specifically optimized to balance the speed-accuracy trade-off critical in firearm detection applications. The SSD model was further refined by integrating the MobileNetV2 and InceptionV2 architectures for superior feature extraction. The experimental results demonstrate that this modified SSD configuration provides highly satisfactory performance, surpassing existing methods trained on the same dataset in terms of the critical speed-accuracy trade-off. Through these innovations, MARIE sets a new standard in surveillance technology, offering a robust solution to enhance public safety.
The presence of aluminum (Al^(3+)) and fluoride (F^(−)) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. In this paper, an approach is presented that uses machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of Al^(3+) and F^(−) ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, whose fluorescence is enhanced upon interaction with Al^(3+) ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F^(−) ions, the fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of the fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data are then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on principal component analysis is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. The model addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding ecosystems and public health.
In recent years, advancements in autonomous vehicle technology have accelerated, promising safer and more efficient transportation systems. However, achieving fully autonomous driving in challenging weather conditions, particularly in snowy environments, remains a challenge. Snow-covered roads introduce unpredictable surface conditions, occlusions, and reduced visibility, which require robust and adaptive path detection algorithms. This paper presents an enhanced road detection framework for snowy environments, leveraging the Simple Framework for Contrastive Learning of Visual Representations (SimCLR) for self-supervised pretraining, hyperparameter optimization, and uncertainty-aware object detection to improve the performance of You Only Look Once version 8 (YOLOv8). The model is trained and evaluated on a custom-built dataset collected from snowy roads in Tromsø, Norway, which covers a range of snow textures, illumination conditions, and road geometries. The proposed framework achieves an mAP@50 of 99% and an mAP@50–95 of 97%, demonstrating the effectiveness of YOLOv8 for real-time road detection in extreme winter conditions. The findings contribute to the safe and reliable deployment of autonomous vehicles in Arctic environments, enabling robust decision-making in hazardous weather conditions. This research lays the groundwork for more resilient perception models in self-driving systems, paving the way for the future development of intelligent and adaptive transportation networks.
By introducing a bit-level multi-stream coded Layered Space-Time (LST) transmitter along with a novel iterative MultiStage Decoding (MSD) scheme at the receiver, the paper shows how to achieve near-capacity performance in Multiple-Input Multiple-Output (MIMO) systems with square Quadrature Amplitude Modulation (QAM). In the proposed iterative MSD scheme, the detection at each stage is equivalent to multiuser detection in synchronous Code Division Multiple Access (CDMA) multiuser systems, with the aid of the binary representation of the transmitted symbols. Therefore, both optimal Soft-Input Soft-Output (SISO) multiuser detection and low-complexity SISO multiuser detection can be utilized. The proposed scheme with low-complexity SISO multiuser detection has polynomial complexity in the number of transmit antennas M, the number of receive antennas N, and the number of bits per constellation point Me. Simulation results demonstrate that the proposed scheme has Bit Error Rate (BER) performance similar to that of the known Iterative Tree Search (ITS) detection.
With the approval of more and more genetically modified (GM) crops in our country, GM safety management has become more important. Transgenic detection is a major approach to transgenic safety management. Nevertheless, a convenient and visual technique with low equipment requirements and high sensitivity for the field detection of GM plants is still lacking. On the basis of the existing recombinase polymerase amplification (RPA) technique, we developed a multiplex RPA (multi-RPA) method that can simultaneously detect three transgenic elements, namely the cauliflower mosaic virus 35S gene (CaMV35S) promoter, the neomycin phosphotransferase II gene (NptII), and the hygromycin B phosphotransferase gene (Hyg), thus improving the detection rate. Moreover, we coupled this multi-RPA technique with the CRISPR/Cas12a reporter system, which enables the detection results to be clearly observed with the naked eye under ultraviolet (UV) light (254 nm, which can be provided by a portable UV flashlight), thereby establishing a multi-RPA visual detection technique. Compared with the traditional test strip detection method, this multi-RPA-CRISPR/Cas12a technique has higher specificity, higher sensitivity, a wider application range, and lower cost. Compared with other polymerase chain reaction (PCR) techniques, it also has the advantages of low equipment requirements and visualization, making it a potentially feasible method for the field detection of GM plants.
Plants play a crucial role in maintaining ecological balance and biodiversity. However, plant health is easily affected by environmental stresses. Hence, the rapid and precise monitoring of plant health is crucial for global food security and ecological balance. Currently, traditional detection strategies for monitoring plant health rely mainly on expensive equipment and complex operational procedures, which limit their widespread application. Near-infrared (NIR) fluorescence and surface-enhanced Raman scattering (SERS) techniques have recently attracted attention in plant research. NIR fluorescence imaging is non-invasive, high-resolution, and real-time, which makes it suitable for rapid screening in large-scale scenarios, while SERS enables highly sensitive and specific detection of trace chemical substances within plant tissues. Therefore, the complementary NIR fluorescence and SERS modalities can provide more comprehensive and accurate information for plant disease diagnosis and growth status monitoring. This article summarizes the application of these two modalities in plants and discusses the advantages of multimodal NIR fluorescence/SERS for better understanding a plant's response to stress, thereby improving the accuracy and sensitivity of detection.
An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. Firstly, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information in the feature map and improving the target detection ability of the model. Secondly, a multi atrous spatial pyramid pooling (MASPP) module is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, so as to improve the model's ability to handle multi-scale objects. The experimental results show that the improved YOLOv8 model achieves a detection accuracy of 92% and a mean average precision (mAP) of 87.9% on the DIOR dataset, which are 3.5% and 1.7% higher, respectively, than those of the original model. This shows that the proposed model improves detection and classification of multi-dimensional optical remote sensing targets.
In recent years, the number of patients with colon disease has increased significantly. Colon polyps are the precursor lesions of colon cancer. If not diagnosed in time, they can easily develop into colon cancer, posing a serious threat to patients' lives and health. Colonoscopy is an important means of detecting colon polyps. However, because polyps vary widely in size, shape, color, and type, traditional detection methods suffer from high false positive rates in polyp imaging, which creates problems for doctors during diagnosis. To improve the accuracy and efficiency of colon polyp detection, this paper proposes a network model suited to colon polyp detection (PD-YOLO). Based on YOLOv7, the method introduces the attention mechanism CBAM (Convolutional Block Attention Module) into the backbone layer, allowing the model to adaptively focus on key information and ignore unimportant parts. To help the model with polyp localization and bounding box regression, the SPD-Conv (Symmetric Positive Definite Convolution) module is added to the neck layer and deconvolution is used instead of upsampling. The experimental results indicate that the PD-YOLO algorithm demonstrates strong robustness in colon polyp detection. Compared with the original YOLOv7 on the Kvasir-SEG dataset, PD-YOLO shows an increase of 5.44 percentage points in AP@0.5, a significant advantage over other mainstream methods.
Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially in areas such as aircraft manufacturing, which require strict part qualification rates. Although more efficient and practical, few-shot AD has not been well explored. Existing AD methods extract features at only a single frequency, while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to adaptively build and select pattern descriptors across a series of frequencies. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns in the support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.
Unmanned aerial vehicle (UAV) imagery poses significant challenges for object detection due to extreme scale variations, high-density small targets (68% in the VisDrone dataset), and complex backgrounds. While YOLO-series models achieve speed-accuracy trade-offs via fixed convolution kernels and manual feature fusion, their rigid architectures struggle with multi-scale adaptability, as exemplified by YOLOv8n's 36.4% mAP and 13.9% small-object AP on VisDrone2019. This paper presents YOLO-LE, a lightweight framework addressing these limitations through three novel designs: (1) We introduce the C2f-Dy and LDown modules to enhance the backbone's sensitivity to small-object features while reducing backbone parameters, thereby improving model efficiency. (2) An adaptive feature fusion module is designed to dynamically integrate multi-scale feature maps, optimizing the neck structure, reducing neck complexity, and enhancing overall model performance. (3) We replace the original loss function with a distributed focal loss and incorporate a lightweight self-attention mechanism to improve small-object recognition and bounding box regression accuracy. Experimental results demonstrate that YOLO-LE achieves 39.9% mAP@0.5 on VisDrone2019, representing a 9.6% improvement over YOLOv8n, while maintaining a computational cost of 8.5 GFLOPs. This provides an efficient solution for UAV object detection in complex scenarios.
To map the rock joints in an underground rock mass, a method was proposed to semiautomatically detect rock joints from borehole imaging logs using a deep learning algorithm. First, 450 images containing rock joints were selected from borehole ZKZ01 at the Rumei hydropower station. These images were labeled to establish the ground truth, which was subdivided into training, validation, and testing data. Second, the YOLO v2 model with optimal parameter settings was constructed. Third, the training and validation data were used for model training, while the test data were used to generate the precision-recall curve for prediction evaluation. Fourth, the trained model was applied to a new borehole, ZKZ02, to verify the feasibility of the model. Twelve rock joints were detected from the selected images in borehole ZKZ02, and four geometric parameters for each rock joint were determined by sinusoidal curve fitting. The average precision of the trained model reached 0.87.
The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches have led to the use of artificial intelligence (AI) techniques, such as machine learning (ML) and deep learning (DL), to build more efficient and reliable intrusion detection systems (IDSs). However, the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs. Many researchers have used data preprocessing techniques such as feature selection and normalization to overcome such issues. While most of these researchers reported the success of these preprocessing techniques at a shallow level, very few studies have examined their effects on a wider scale. Furthermore, the performance of an IDS model depends not only on the preprocessing techniques used but also on the dataset and the ML/DL algorithm, to which most existing studies give little emphasis. Thus, this study provides an in-depth analysis of the effects of feature selection and normalization on IDS models built using three IDS datasets (NSL-KDD, UNSW-NB15, and CSE-CIC-IDS2018) and various AI algorithms. A wrapper-based approach, which tends to give superior performance, and min-max normalization were used for feature selection and normalization, respectively. Numerous IDS models were implemented using the full and feature-selected copies of the datasets, with and without normalization. The models were evaluated using popular evaluation metrics in IDS modeling, and intra- and inter-model comparisons were performed between models and with state-of-the-art works. Random forest (RF) models performed better on the NSL-KDD and UNSW-NB15 datasets, with accuracies of 99.86% and 96.01%, respectively, whereas an artificial neural network (ANN) achieved the best accuracy of 95.43% on the CSE-CIC-IDS2018 dataset. The RF models also achieved excellent performance compared to recent works. The results show that normalization and feature selection positively affect IDS modeling. Furthermore, while feature selection benefits simpler algorithms (such as RF), normalization is more useful for complex algorithms like ANNs and deep neural networks (DNNs), and algorithms such as Naive Bayes are unsuitable for IDS modeling. The study also found that the UNSW-NB15 and CSE-CIC-IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDSs than the NSL-KDD dataset. Our findings suggest that prioritizing robust algorithms like RF, alongside complex models such as ANN and DNN, can significantly enhance IDS performance. These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.
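The min-max normalization named in the abstract above rescales each feature to the range [0, 1] using x' = (x − min) / (max − min). The following is a minimal sketch of that formula only, not the study's pipeline; the function names `fit_min_max` and `apply_min_max` and the toy data are hypothetical, and mapping constant columns to 0.0 is an assumption made here to avoid division by zero.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return the (min, max) of every feature column in the training rows."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows: Sequence[Sequence[float]],
                  params: Sequence[Tuple[float, float]]) -> List[List[float]]:
    """Scale each feature with x' = (x - min) / (max - min).

    Constant columns (max == min) are mapped to 0.0; this is an
    illustrative choice, not something stated in the source.
    """
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, params)
        ])
    return scaled


# Toy data (hypothetical): two features, the second one constant.
train = [[0.0, 10.0], [5.0, 10.0], [10.0, 10.0]]
params = fit_min_max(train)
normalized = apply_min_max(train, params)
```

Fitting the minimum and maximum on the training split and reusing them for validation and test data is the usual way to keep test statistics from leaking into training; whether the study did this is not stated in the abstract.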
Z-curve encoding and decoding algorithms are critically important in many Z-curve-based applications. The bit interleaving algorithm is the current state-of-the-art algorithm for encoding and decoding the Z-curve. Although simple, its efficiency is hindered by step-by-step coordinate shifting and bitwise operations. To tackle this problem, we first propose the efficient encoding algorithm LTFe and the corresponding decoding algorithm LTFd, which adopt two optimization methods to boost efficiency: 1) we design efficient lookup tables (LT) that convert encoding and decoding operations into table-lookup operations; 2) we design a bit detection mechanism that skips the partial order of a coordinate or a Z-value with consecutive leading 0s, avoiding unnecessary iterative computations. We then propose order-parallel and point-parallel OpenMP-based algorithms to exploit modern multi-core hardware. Experimental results on discrete, skewed, and real datasets indicate that our point-parallel algorithms can be up to 12.6× faster than the existing algorithms.
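The bit interleaving baseline described above, and the general idea of replacing per-bit work with table lookups, can be sketched in a few lines. This is an illustrative 2D sketch under stated assumptions, not the LTFe/LTFd algorithms from the paper: the function names, the 8-bit table size, and the convention that x occupies the even bit positions are all choices made here. The early loop exit in `encode_lut` only loosely mirrors the idea of skipping leading zeros.

```python
def encode_bitwise(x: int, y: int, bits: int = 16) -> int:
    """Baseline: interleave the bits of x and y one position at a time."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x bit i goes to even position
        z |= ((y >> i) & 1) << (2 * i + 1)   # y bit i goes to odd position
    return z


def decode_bitwise(z: int, bits: int = 16) -> tuple:
    """Inverse of encode_bitwise: split even and odd bits back apart."""
    x = y = 0
    for i in range(bits):
        x |= ((z >> (2 * i)) & 1) << i
        y |= ((z >> (2 * i + 1)) & 1) << i
    return x, y


# Lookup table (assumed 8-bit granularity): SPREAD[b] holds the bits of
# byte b moved to the even positions of a 16-bit word.
SPREAD = [encode_bitwise(b, 0, 8) for b in range(256)]


def encode_lut(x: int, y: int) -> int:
    """Table-driven encoding: one lookup per byte instead of one step per bit.

    The loop ends as soon as the remaining high bytes of both coordinates
    are zero, so small coordinates need fewer iterations.
    """
    z = 0
    shift = 0
    while x or y:
        z |= (SPREAD[x & 0xFF] | (SPREAD[y & 0xFF] << 1)) << shift
        x >>= 8
        y >>= 8
        shift += 16
    return z
```

For example, `encode_lut(3, 5)` and `encode_bitwise(3, 5)` both return 39 (binary 100111), and `decode_bitwise(39)` returns `(3, 5)`.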
To solve the problem of low detection accuracy for complex weld defects, the paper proposes a weld defect detection method based on an improved YOLOv5s. To enhance the ability to focus on key information in feature maps, the scSE attention mechanism is introduced into the backbone network of YOLOv5s. A Fusion-Block module and additional layers are added to the neck network of YOLOv5s to improve feature fusion and meet the needs of complex object detection. To reduce the computational complexity of the model, the C3Ghost module replaces the CSP2_1 module in the neck network of YOLOv5s. The scSE-ASFF module is constructed and inserted between the neck network and the prediction end to fuse features across different layers. To address imbalanced sample quality in the dataset and improve the regression speed and accuracy of the loss function, the CIoU loss function in the YOLOv5s model is replaced with the Focal-EIoU loss function. Finally, experiments are conducted on the collected weld defect dataset to verify the feasibility of the improved YOLOv5s for weld defect detection. The experimental results show that the precision and mAP of the improved YOLOv5s in detecting complex weld defects reach 83.4% and 76.1%, respectively, which are 2.5% and 7.6% higher than those of the traditional YOLOv5s model. The proposed method can therefore effectively improve weld defect detection accuracy.
To maintain the reliability of power systems, routine inspections using drones equipped with advanced object detection algorithms are essential for preempting power-related issues. The increasing resolution of drone-captured images poses a challenge for traditional target detection methods, especially in identifying small objects in high-resolution images. This study presents an enhanced object detection algorithm based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) framework, specifically tailored for detecting small-scale electrical components such as insulators, shock hammers, and screws on transmission lines. The algorithm features an improved backbone network for Faster R-CNN, which significantly boosts the feature extraction network's ability to detect fine details. The Region Proposal Network is optimized using guided feature refinement (GFR), which achieves a balance between accuracy and speed. The incorporation of Generalized Intersection over Union (GIOU) and Region of Interest (ROI) Align further refines the model's accuracy. Experimental results demonstrate a notable improvement in mean Average Precision, which reaches 89.3%, an 11.1% increase compared to the standard Faster R-CNN. This highlights the effectiveness of the proposed algorithm in identifying electrical components in high-resolution aerial images.
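The Generalized Intersection over Union mentioned above extends plain IoU with a penalty based on the smallest box C that encloses both boxes: GIoU = IoU − (|C| − |A ∪ B|) / |C|. The sketch below shows this standard definition for axis-aligned boxes; it is not taken from the paper's implementation, and the function name `giou` and the `(x1, y1, x2, y2)` box format are assumptions. Both boxes are assumed to have positive area.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def giou(a: Box, b: Box) -> float:
    """Generalized IoU of two axis-aligned boxes with positive area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both a and b.
    enclose_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose_area - union) / enclose_area
```

Unlike IoU, which is 0 for every pair of non-overlapping boxes, GIoU keeps decreasing toward −1 as the boxes move apart, so a loss of the form 1 − GIoU still provides a gradient when a predicted box misses its target entirely.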
With the rapid advancement of visual generative models such as Generative Adversarial Networks (GANs) and Stable Diffusion, the creation of highly realistic Deepfakes through automated forgery has progressed significantly. This paper examines the advancements in Deepfake detection and defense technologies, emphasizing the shift from passive detection methods to proactive digital watermarking techniques. Passive detection methods, which involve extracting features from images or videos to identify forgeries, encounter challenges such as poor performance against unknown manipulation techniques and susceptibility to counter-forensic tactics. In contrast, proactive digital watermarking techniques embed specific markers into images or videos, facilitating real-time detection and traceability, thereby providing a preemptive defense against Deepfake content. We offer a comprehensive analysis of digital watermarking-based forensic techniques, discussing their advantages over passive methods and highlighting four key benefits: real-time detection, embedded defense, resistance to tampering, and provision of legal evidence. Additionally, the paper identifies gaps in the literature concerning proactive forensic techniques and suggests future research directions, including cross-domain watermarking and adaptive watermarking strategies. By systematically classifying and comparing existing techniques, this review aims to contribute valuable insights for the development of more effective proactive defense strategies in Deepfake forensics.
Funding: Supported by the NSF of China (Nos. 61362010, 61661005) and the NSF of Guangxi (Nos. 2015GXNSFAA139290, 2014GXNSFBA118276, 2012GXNSFAA053217).
Abstract: An iterative detection/decoding algorithm for correlated sources in LDPC-based relay systems is presented. The signal from the source-destination (S-D) link is formulated as a highly correlated counterpart of the signal from the relay-destination (R-D) link. A special XOR vector is defined using the correlated hard-decision information blocks from the two decoders, and the extrinsic information exchanged between the two decoders is derived from the log-likelihood ratio (LLR) associated with the XOR vector. This decoding scheme differs from the traditional turbo-like detection/decoding algorithm, where the extrinsic information is computed from the side information and the soft decoder outputs. Simulations show that the presented algorithm performs slightly better than the traditional turbo-like algorithm (taking the (255,175) EG-LDPC code as an example, it achieves a gain of about 0.1 dB around BLER = 10^(-4)). Furthermore, the presented algorithm requires fewer computing operations per iteration and converges faster. For example, when decoding the (961,721) QC-LDPC code, the average number of iterations of the presented algorithm is 33 at SNR = 1.8 dB, which is about twice as fast as the turbo-like algorithm. Therefore, the presented decoding algorithm for correlated sources provides an alternative decoding solution for LDPC-based relay systems.
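The abstract does not give the exact expression for the extrinsic information, so the following is only a minimal sketch of one plausible reading of the XOR-vector idea. It assumes the LLR convention ln(P(bit=0)/P(bit=1)), estimates the disagreement probability empirically from the two hard-decision blocks, and uses hypothetical function names (`xor_llr`, `extrinsic_from_hard`) that do not come from the paper.

```python
import math

def xor_llr(hard_a, hard_b, eps=1e-6):
    """LLR associated with the XOR vector of two hard-decision blocks.

    Assumption: p is the empirical fraction of positions where the two
    blocks disagree, and the LLR is ln((1 - p) / p), clipped to stay finite.
    """
    if len(hard_a) != len(hard_b) or not hard_a:
        raise ValueError("blocks must be non-empty and of equal length")
    xor_vector = [a ^ b for a, b in zip(hard_a, hard_b)]
    p = sum(xor_vector) / len(xor_vector)
    p = min(max(p, eps), 1.0 - eps)
    return math.log((1.0 - p) / p)

def extrinsic_from_hard(hard_bits, llr_xor):
    """Extrinsic LLRs handed to the other decoder.

    Assumption: the sign comes from the hard decision (bit 0 -> positive)
    and the magnitude is the XOR-vector LLR.
    """
    return [(1 - 2 * bit) * llr_xor for bit in hard_bits]

# Two blocks that disagree in 1 of 10 positions.
block_sd = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
block_rd = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
llr = xor_llr(block_sd, block_rd)
extrinsic = extrinsic_from_hard(block_sd, llr)
```

In this sketch the exchanged quantity depends only on hard decisions and a single scalar LLR, which is consistent with the abstract's claim of fewer operations per iteration than a scheme built on soft decoder outputs.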
Funding: Supported in part by the National Key R&D Program of China (Grant No. 2023YFB3307604), the Shanxi Province Basic Research Program Youth Science Research Project (Grant Nos. 202303021212054 and 202303021212046), the Key Projects Supported by Hebei Natural Science Foundation (Grant No. E2024203125), the National Science Foundation of China (Grant No. 52105391), the Hebei Provincial Science and Technology Major Project (Grant No. 23280101Z), and the National Key Laboratory of Metal Forming Technology and Heavy Equipment Open Fund (Grant No. S2308100.W17).
Abstract: A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder–decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Then, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhances accuracy in defect region identification and feature representation. Additionally, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Funding: Supported by the Key Lab of Intelligent and Green Flexographic Printing under Grant ZBKT202301.
Abstract: Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict the action and locate the object efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features for subsequent spatio-temporal information modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A multi-constraint loss function composed of one-to-one, one-to-many, and contrastive denoising losses is designed to address the insufficient constraint that traditional methods place on predictions. This loss function improves the accuracy of classification predictions and brings regression position predictions closer to the ground-truth objects. The proposed method is evaluated on the popular UCF101-24 and JHMDB-21 datasets. Experimental results demonstrate that the proposed method achieves 81.52% on the Frame-mAP metric, surpassing existing methods.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62303090 and U2330206, in part by the Postdoctoral Science Foundation of China under Grant 2023M740516, in part by the Natural Science Foundation of Sichuan Province under Grant 2024NSFSC1480, and in part by the New Cornerstone Science Foundation through the XPLORER PRIZE.
Abstract: Reliable electricity infrastructure is critical for modern society, highlighting the importance of securing the stability of fundamental power electronic systems. However, as such systems frequently involve high-current and high-voltage conditions, there is a greater likelihood of failures. Consequently, anomaly detection for power electronic systems holds great significance, and it is a task that properly designed neural networks can handle well, as proven in various scenarios. Transformer-like networks are promising for this application, yet because their structure was initially designed for different tasks, features extracted by the early layers are often lost, decreasing detection performance. Also, such data-driven methods typically require sufficient anomalous data for training, which can be difficult to obtain in practice. Therefore, to improve feature utilization while achieving efficient unsupervised learning, a novel model, the Densely-connected Decoder Transformer (DDformer), is proposed in this paper for unsupervised anomaly detection of power electronic systems. First, efficient label-free training is achieved based on the concept of an autoencoder with recursive-free output. An encoder-decoder structure with a densely connected decoder is then adopted, merging features from all encoder layers to avoid possible loss of mined features while reducing training difficulty. Both simulation and real-world experiments are conducted to validate the capabilities of DDformer; the average FDR surpasses that of baseline models, reaching 89.39%, 93.91%, and 95.98% in the different experimental setups, respectively.
Abstract: Security and safety remain paramount concerns for both governments and individuals worldwide. In today's context, the frequency of crimes and terrorist attacks is increasing alarmingly and becoming intolerable to society. Consequently, there is a pressing need for swift identification of potential threats to preemptively alert law enforcement and security forces, thereby preventing potential attacks or violent incidents. Recent advancements in big data analytics and deep learning have significantly enhanced the capabilities of computer vision in object detection, particularly in identifying firearms. This paper introduces a novel automatic firearm detection surveillance system, utilizing a one-stage detection approach named MARIE (Mechanism for Realtime Identification of Firearms). MARIE incorporates the Single Shot Multibox Detector (SSD) model, which has been specifically optimized to balance the speed-accuracy trade-off critical in firearm detection applications. The SSD model was further refined by integrating MobileNetV2 and InceptionV2 architectures for superior feature extraction capabilities. The experimental results demonstrate that this modified SSD configuration provides highly satisfactory performance, surpassing existing methods trained on the same dataset in terms of the critical speed-accuracy trade-off. Through these innovations, MARIE sets a new standard in surveillance technology, offering a robust solution to enhance public safety effectively.
Funding: Supported by the National Natural Science Foundation of China (No. U21A20290), the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011656), the Projects of Talents Recruitment of GDUPT (No. 2023rcyj1003), the 2022 "Sail Plan" Project of Maoming Green Chemical Industry Research Institute (No. MMGCIRI2022YFJH-Y-024), and the Maoming Science and Technology Project (No. 2023382).
Abstract: The presence of aluminum (Al³⁺) and fluoride (F⁻) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. In this paper, an approach is presented that uses machine learning to enhance the accuracy and efficiency of fluorescence-based detection for the sequential quantitative analysis of Al³⁺ and F⁻ ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, whose fluorescence is enhanced upon interaction with Al³⁺ ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F⁻ ions, the fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of the fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data is then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on principal component analysis is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. The model also addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding ecosystems and public health.
Abstract: In recent years, advancements in autonomous vehicle technology have accelerated, promising safer and more efficient transportation systems. However, achieving fully autonomous driving in challenging weather conditions, particularly in snowy environments, remains a challenge. Snow-covered roads introduce unpredictable surface conditions, occlusions, and reduced visibility that require robust and adaptive path detection algorithms. This paper presents an enhanced road detection framework for snowy environments, leveraging the Simple Framework for Contrastive Learning of Visual Representations (SimCLR) for self-supervised pretraining, hyperparameter optimization, and uncertainty-aware object detection to improve the performance of You Only Look Once version 8 (YOLOv8). The model is trained and evaluated on a custom-built dataset collected from snowy roads in Tromsø, Norway, which covers a range of snow textures, illumination conditions, and road geometries. The proposed framework achieves an mAP@50 of 99% and an mAP@50-95 of 97%, demonstrating the effectiveness of YOLOv8 for real-time road detection in extreme winter conditions. The findings contribute to the safe and reliable deployment of autonomous vehicles in Arctic environments, enabling robust decision-making in hazardous weather conditions. This research lays the groundwork for more resilient perception models in self-driving systems, paving the way for the future development of intelligent and adaptive transportation networks.
Funding: Supported by the National Natural Science Foundation of China (Nos. 60472098 and 60502046).
Abstract: By introducing a bit-level multi-stream coded Layered Space-Time (LST) transmitter along with a novel iterative MultiStage Decoding (MSD) scheme at the receiver, the paper shows how to achieve near-capacity performance in Multiple-Input Multiple-Output (MIMO) systems with square Quadrature Amplitude Modulation (QAM). In the proposed iterative MSD scheme, the detection at each stage is equivalent to multiuser detection in synchronous Code Division Multiple Access (CDMA) multiuser systems, with the aid of the binary representation of the transmitted symbols. Therefore, both optimal Soft-Input Soft-Output (SISO) multiuser detection and low-complexity SISO multiuser detection can be utilized. The proposed scheme with low-complexity SISO multiuser detection has polynomial complexity in the number of transmit antennas M, the number of receive antennas N, and the number of bits per constellation point Me. Simulation results demonstrate that the proposed scheme has Bit Error Rate (BER) performance similar to that of the known Iterative Tree Search (ITS) detection.
Funding: Supported by the Experimental Technology Research Project of Zhejiang University (SYB202138) and the National Natural Science Foundation of China (32000195).
Abstract: With the approval of more and more genetically modified (GM) crops in our country, GM safety management has become more important. Transgenic detection is a major approach to transgenic safety management. Nevertheless, a convenient and visual technique with low equipment requirements and high sensitivity for the field detection of GM plants is still lacking. On the basis of the existing recombinase polymerase amplification (RPA) technique, we developed a multiplex RPA (multi-RPA) method that can simultaneously detect three transgenic elements, namely the cauliflower mosaic virus 35S gene (CaMV35S) promoter, the neomycin phosphotransferase II gene (NptII) and the hygromycin B phosphotransferase gene (Hyg), thus improving the detection rate. Moreover, we coupled this multi-RPA technique with the CRISPR/Cas12a reporter system, which enables the detection results to be clearly observed by the naked eye under ultraviolet (UV) light (254 nm, which can be provided by a portable UV flashlight), thereby establishing a multi-RPA visual detection technique. Compared with the traditional test strip detection method, this multi-RPA-CRISPR/Cas12a technique has higher specificity, higher sensitivity, a wider application range and lower cost. Compared with other polymerase chain reaction (PCR) techniques, it also has the advantages of low equipment requirements and visualization, making it a potentially feasible method for the field detection of GM plants.
Funding: Funded by the National Natural Science Foundation of China (Nos. 22374055, 22022404, 22074050, 82172055), the National Natural Science Foundation of Hubei Province (No. 22022CFA033), and the Fundamental Research Funds for the Central Universities (Nos. CCNU24JCPT001, CCNU24JCPT020).
Abstract: Plants play a crucial role in maintaining ecological balance and biodiversity. However, plant health is easily affected by environmental stresses. Hence, the rapid and precise monitoring of plant health is crucial for global food security and ecological balance. Currently, traditional detection strategies for monitoring plant health mainly rely on expensive equipment and complex operational procedures, which limit their widespread application. Fortunately, near-infrared (NIR) fluorescence and surface-enhanced Raman scattering (SERS) techniques have recently attracted attention in plant research. NIR fluorescence imaging is non-invasive, high-resolution and real-time, which makes it suitable for rapid screening in large-scale scenarios, while SERS enables highly sensitive and specific detection of trace chemical substances within plant tissues. Therefore, the complementarity of the NIR fluorescence and SERS modalities can provide more comprehensive and accurate information for plant disease diagnosis and growth status monitoring. This article summarizes these two modalities in plant applications and discusses the advantages of multimodal NIR fluorescence/SERS for a better understanding of a plant's response to stress, thereby improving the accuracy and sensitivity of detection.
Funding: Supported by the National Natural Science Foundation of China (No. 62241109) and the Tianjin Science and Technology Commissioner Project (No. 20YDTPJC01110).
Abstract: An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. Firstly, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information in the feature map and improving the target detection ability of the model. Secondly, a multi atrous spatial pyramid pooling (MASPP) module is designed, using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features and improve the model's handling of multi-scale objects. The experimental results show that the improved YOLOv8 model achieves a detection accuracy of 92% and a mean average precision (mAP) of 87.9% on the DIOR dataset, which are 3.5% and 1.7% higher than those of the original model, respectively. This shows that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has been improved.
Funding: Funded by the Undergraduate Higher Education Teaching and Research Project (No. FBJY20230216), the Research Projects of Putian University (No. 2023043), and the Education Department of the Fujian Province Project (No. JAT220300).
Abstract: In recent years, the number of patients with colon disease has increased significantly. Colon polyps are the precursor lesions of colon cancer. If not diagnosed in time, they can easily develop into colon cancer, posing a serious threat to patients' lives and health. Colonoscopy is an important means of detecting colon polyps. However, because polyps vary widely in size, shape, color and type, traditional detection methods suffer from high false positive rates in polyp imaging, which complicates diagnosis for doctors. To improve the accuracy and efficiency of colon polyp detection, this paper proposes a network model suited to colon polyp detection (PD-YOLO). Based on YOLOv7, the method introduces the attention mechanism CBAM (Convolutional Block Attention Module) in the backbone layer, allowing the model to adaptively focus on key information and ignore unimportant parts. To help the model with polyp localization and bounding box regression, the SPD-Conv (space-to-depth convolution) module is added to the neck layer and deconvolution is used instead of upsampling. The experimental results indicate that the PD-YOLO algorithm demonstrates strong robustness in colon polyp detection. Compared to the original YOLOv7, PD-YOLO shows an increase of 5.44 percentage points in AP@0.5 on the Kvasir-SEG dataset, a clear advantage over other mainstream methods.
Funding: Supported by the National Natural Science Foundation of China (No. 52188102).
Abstract: Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially in areas such as aircraft manufacturing, which require strict part qualification rates. Although it is more efficient and practical, few-shot AD has not been well explored. Existing AD methods extract features in only a single frequency, while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to build and select pattern descriptors in a series of frequencies adaptively. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns in the support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.
Abstract: Unmanned aerial vehicle (UAV) imagery poses significant challenges for object detection due to extreme scale variations, high-density small targets (68% in the VisDrone dataset), and complex backgrounds. While YOLO-series models achieve speed-accuracy trade-offs via fixed convolution kernels and manual feature fusion, their rigid architectures struggle with multi-scale adaptability, as exemplified by YOLOv8n's 36.4% mAP and 13.9% small-object AP on VisDrone2019. This paper presents YOLO-LE, a lightweight framework addressing these limitations through three novel designs: (1) We introduce the C2f-Dy and LDown modules to enhance the backbone's sensitivity to small-object features while reducing backbone parameters, thereby improving model efficiency. (2) An adaptive feature fusion module is designed to dynamically integrate multi-scale feature maps, optimizing the neck structure, reducing neck complexity, and enhancing overall model performance. (3) We replace the original loss function with a distributed focal loss and incorporate a lightweight self-attention mechanism to improve small-object recognition and bounding box regression accuracy. Experimental results demonstrate that YOLO-LE achieves 39.9% mAP@0.5 on VisDrone2019, a 9.6% relative improvement over YOLOv8n, while maintaining a computational cost of 8.5 GFLOPs. This provides an efficient solution for UAV object detection in complex scenarios.
Funding: Supported by the National Key R&D Program of China (No. 2023YFC3081200) and the National Natural Science Foundation of China (No. 42077264).
Abstract: To map the rock joints in an underground rock mass, a method was proposed to semi-automatically detect rock joints from borehole imaging logs using a deep learning algorithm. First, 450 images containing rock joints were selected from borehole ZKZ01 at the Rumei hydropower station. These images were labeled to establish the ground truth, which was subdivided into training, validation, and testing data. Second, the YOLO v2 model with optimal parameter settings was constructed. Third, the training and validation data were used for model training, while the test data was used to generate the precision-recall curve for prediction evaluation. Fourth, the trained model was applied to a new borehole, ZKZ02, to verify the feasibility of the model. Twelve rock joints were detected from the selected images in borehole ZKZ02, and four geometric parameters for each rock joint were determined by sinusoidal curve fitting. The average precision of the trained model reached 0.87.
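The sinusoidal curve fitting step can be illustrated with a small least-squares sketch. This is not the authors' implementation: it assumes the common model in which a planar joint crossing a cylindrical borehole appears on the unrolled image as depth = d0 + A·cos(θ − φ), and it assumes the dip angle follows from the peak-to-trough height 2A and the borehole diameter. The function names and the four returned quantities are hypothetical choices, and the abstract does not say which four parameters the paper reports.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_joint_sinusoid(thetas, depths, borehole_diameter):
    """Fit depth = d0 + a*cos(theta) + b*sin(theta) by linear least squares.

    thetas are azimuths in radians; depths and borehole_diameter share one
    length unit. Returns (d0, amplitude, phase, dip_degrees).
    """
    rows = [[1.0, math.cos(t), math.sin(t)] for t in thetas]
    # Normal equations (A^T A) x = A^T y for the three unknowns.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, depths)) for i in range(3)]
    d0, a, b = solve_linear(ata, aty)
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a)
    # Assumed relation: tan(dip) = peak-to-trough height / borehole diameter.
    dip = math.degrees(math.atan2(2.0 * amplitude, borehole_diameter))
    return d0, amplitude, phase, dip

# Synthetic trace: centre depth 10 m, amplitude 0.05 m, phase 1 rad, 0.1 m borehole.
thetas = [2.0 * math.pi * k / 36 for k in range(36)]
depths = [10.0 + 0.05 * math.cos(t - 1.0) for t in thetas]
d0, amplitude, phase, dip = fit_joint_sinusoid(thetas, depths, 0.1)
```

Because the model is linear in d0, a and b, no iterative optimizer is needed; a detected bounding box only has to supply enough edge points along the joint trace.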
Abstract: The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches have led to the use of artificial intelligence (AI) techniques, such as machine learning (ML) and deep learning (DL), to build more efficient and reliable intrusion detection systems (IDSs). However, the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs. Many researchers have used data preprocessing techniques such as feature selection and normalization to overcome such issues. While most of these researchers reported the success of these preprocessing techniques at a shallow level, very few studies have examined their effects on a wider scale. Furthermore, the performance of an IDS model depends not only on the preprocessing techniques used but also on the dataset and the ML/DL algorithm, which most existing studies give little emphasis to. Thus, this study provides an in-depth analysis of the effects of feature selection and normalization on IDS models built using three IDS datasets (NSL-KDD, UNSW-NB15, and CSE-CIC-IDS2018) and various AI algorithms. A wrapper-based approach, which tends to give superior performance, and min-max normalization were used for feature selection and normalization, respectively. Numerous IDS models were implemented using the full and feature-selected copies of the datasets, with and without normalization. The models were evaluated using popular evaluation metrics in IDS modeling, and intra- and inter-model comparisons were performed between models and with state-of-the-art works. Random forest (RF) models performed better on the NSL-KDD and UNSW-NB15 datasets, with accuracies of 99.86% and 96.01%, respectively, whereas an artificial neural network (ANN) achieved the best accuracy of 95.43% on the CSE-CIC-IDS2018 dataset. The RF models also achieved excellent performance compared to recent works. The results show that normalization and feature selection positively affect IDS modeling. Furthermore, while feature selection benefits simpler algorithms (such as RF), normalization is more useful for complex algorithms like ANNs and deep neural networks (DNNs), and algorithms such as Naive Bayes are unsuitable for IDS modeling. The study also found that the UNSW-NB15 and CSE-CIC-IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDSs than the NSL-KDD dataset. Our findings suggest that prioritizing robust algorithms like RF, alongside complex models such as ANNs and DNNs, can significantly enhance IDS performance. These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.
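Min-max normalization, the scaling method named in the abstract, maps each feature to [0, 1] using the minimum and maximum seen in the training data. The sketch below is a generic illustration, not the study's code; the function names and the choice to map constant columns to 0.0 are assumptions.

```python
def fit_min_max(rows):
    """Return the (min, max) of every feature column in the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_min_max(rows, ranges):
    """Scale each feature to [0, 1] with x' = (x - min) / (max - min).

    Assumption: a constant column (max == min) is mapped to 0.0 to avoid
    division by zero. Values outside the training range are not clipped.
    """
    scaled_rows = []
    for row in rows:
        scaled = []
        for value, (lo, hi) in zip(row, ranges):
            scaled.append(0.0 if hi == lo else (value - lo) / (hi - lo))
        scaled_rows.append(scaled)
    return scaled_rows

# Three toy records with three features; the last feature is constant.
train = [[0, 10, 5], [50, 20, 5], [100, 40, 5]]
ranges = fit_min_max(train)
normalized = apply_min_max(train, ranges)
```

Fitting the ranges on the training split only and reusing them for validation and test data avoids leaking test statistics into the model, which matters when comparing models with and without normalization.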
Funding: Funded by the Key Project of the Open Fund for Computer Technology Applications in Yunnan under Grant No. CB23031D025A.
Abstract: Z-curve encoding and decoding algorithms are of prime importance in many Z-curve-based applications. The bit interleaving algorithm is the current state-of-the-art algorithm for encoding and decoding the Z-curve. Although simple, its efficiency is hindered by step-by-step coordinate shifting and bitwise operations. To tackle this problem, we first propose the efficient encoding algorithm LTFe and the corresponding decoding algorithm LTFd, which adopt two optimization methods to boost efficiency: 1) we design efficient lookup tables (LT) that convert encoding and decoding operations into table-lookup operations; 2) we design a bit detection mechanism that skips the partial orders of a coordinate or a Z-value that has consecutive 0s at the front, avoiding unnecessary iterative computations. We also propose order-parallel and point-parallel OpenMP-based algorithms to exploit modern multi-core hardware. Experimental results on discrete, skewed, and real datasets indicate that our point-parallel algorithms can be up to 12.6× faster than the existing algorithms.
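To make the contrast concrete, here is a minimal 2-D sketch of bit interleaving next to a byte-wise lookup-table encoder. It is not LTFe or LTFd: the table layout, the function names, and the loop that stops once the remaining high bytes are zero (a loose analogue of skipping leading zeros) are all illustrative assumptions.

```python
def encode_interleave(x, y, bits=32):
    """Baseline Z-value: interleave one bit of x and y per iteration."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return z

def decode_interleave(z, bits=32):
    """Inverse of encode_interleave: split even and odd bits back into x and y."""
    x = y = 0
    for i in range(bits):
        x |= ((z >> (2 * i)) & 1) << i
        y |= ((z >> (2 * i + 1)) & 1) << i
    return x, y

# SPREAD[b] places the 8 bits of byte b on the even positions of a 16-bit word.
SPREAD = [sum(((b >> i) & 1) << (2 * i) for i in range(8)) for b in range(256)]

def encode_lookup(x, y):
    """Table-driven Z-value: handle one byte of each coordinate per step.

    The loop ends as soon as both remaining coordinates are zero, so high
    bytes that contain only zeros cost nothing.
    """
    z = 0
    shift = 0
    while x or y:
        z |= (SPREAD[x & 0xFF] | (SPREAD[y & 0xFF] << 1)) << shift
        x >>= 8
        y >>= 8
        shift += 16
    return z

z_small = encode_lookup(3, 5)  # 39, same as encode_interleave(3, 5)
```

The table version replaces eight shift-and-mask steps per coordinate byte with a single lookup, which is the kind of saving the abstract attributes to its lookup tables.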
Funding: Supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX24_4084).
Abstract: To solve the problem of low detection accuracy for complex weld defects, the paper proposes a weld defect detection method based on an improved YOLOv5s. To enhance the ability to focus on key information in feature maps, the scSE attention mechanism is introduced into the backbone network of YOLOv5s. A Fusion-Block module and additional layers are added to the neck network of YOLOv5s to improve feature fusion and meet the needs of complex object detection. To reduce the computational complexity of the model, the C3Ghost module is used to replace the CSP2_1 module in the neck network of YOLOv5s. The scSE-ASFF module is constructed and inserted between the neck network and the prediction end to fuse features between the different layers. To address the imbalanced sample quality in the dataset and improve the regression speed and accuracy of the loss function, the CIoU loss function in the YOLOv5s model is replaced with the Focal-EIoU loss function. Finally, experiments are conducted on the collected weld defect dataset to verify the feasibility of the improved YOLOv5s for weld defect detection. The experimental results show that the precision and mAP of the improved YOLOv5s in detecting complex weld defects reach 83.4% and 76.1%, respectively, which are 2.5% and 7.6% higher than those of the traditional YOLOv5s model. The proposed method can therefore effectively address the problem of low weld defect detection accuracy.
Funding: Supported by the 2022 Shanghai Science and Technology Innovation Action Plan High-Tech Field Project (Grant No. 22511100601) and the Technology Development Fund for People's Livelihood Research (Research on Transmission Line Deep Foundation Pit Environmental Situation Awareness System Based on Multi-Source Data).
Abstract: To maintain the reliability of power systems, routine inspections using drones equipped with advanced object detection algorithms are essential for preempting power-related issues. The increasing resolution of drone-captured images has posed a challenge for traditional target detection methods, especially in identifying small objects in high-resolution images. This study presents an enhanced object detection algorithm based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) framework, specifically tailored for detecting small-scale electrical components such as insulators, shock hammers, and screws on transmission lines. The algorithm features an improved backbone network for Faster R-CNN, which significantly boosts the feature extraction network's ability to detect fine details. The Region Proposal Network is optimized using guided feature refinement (GFR), which achieves a balance between accuracy and speed. The incorporation of Generalized Intersection over Union (GIOU) and Region of Interest (ROI) Align further refines the model's accuracy. Experimental results demonstrate a notable improvement in mean Average Precision, reaching 89.3%, an 11.1% increase compared to the standard Faster R-CNN. This highlights the effectiveness of the proposed algorithm in identifying electrical components in high-resolution aerial images.
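For readers unfamiliar with Generalized Intersection over Union, the standard definition is GIoU = IoU − (|C| − |A ∪ B|) / |C|, where C is the smallest axis-aligned box enclosing both boxes A and B. The sketch below computes it for two boxes; the (x1, y1, x2, y2) corner format and the function name are assumptions, and it shows only the metric, not how the paper integrates it into training.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x2 > x1 and y2 > y1 for both boxes, so all areas are positive.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest enclosing box C.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose

overlap_score = giou((0, 0, 2, 2), (1, 1, 3, 3))   # IoU 1/7, GIoU -5/63
disjoint_score = giou((0, 0, 1, 1), (2, 2, 3, 3))  # IoU 0, GIoU -7/9
```

Unlike plain IoU, the value stays informative for non-overlapping boxes (it moves toward −1 as they separate), which is why 1 − GIoU is usable as a regression loss for small objects that a prediction may miss entirely.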
Funding: Supported by the National Fund Cultivation Project of China People's Police University (Grant No. JJPY202402) and the National Natural Science Foundation of China (Grant No. 62172165).
Abstract: With the rapid advancement of visual generative models such as Generative Adversarial Networks (GANs) and Stable Diffusion, the creation of highly realistic Deepfakes through automated forgery has progressed significantly. This paper examines the advancements in Deepfake detection and defense technologies, emphasizing the shift from passive detection methods to proactive digital watermarking techniques. Passive detection methods, which extract features from images or videos to identify forgeries, encounter challenges such as poor performance against unknown manipulation techniques and susceptibility to counter-forensic tactics. In contrast, proactive digital watermarking techniques embed specific markers into images or videos, facilitating real-time detection and traceability and thereby providing a preemptive defense against Deepfake content. We offer a comprehensive analysis of digital watermarking-based forensic techniques, discussing their advantages over passive methods and highlighting four key benefits: real-time detection, embedded defense, resistance to tampering, and provision of legal evidence. Additionally, the paper identifies gaps in the literature concerning proactive forensic techniques and suggests future research directions, including cross-domain watermarking and adaptive watermarking strategies. By systematically classifying and comparing existing techniques, this review aims to contribute valuable insights for the development of more effective proactive defense strategies in Deepfake forensics.