A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder–decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Then, we propose an innovative dual-branch decoding fusion structure, comprising a refined defect representation branch and an enhanced defect representation branch, which improves accuracy in defect region identification and feature representation. Additionally, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Real-time detection of surface defects on cables is crucial for ensuring the safe operation of power systems. However, existing methods struggle with small target sizes, complex backgrounds, low-quality image acquisition, and interference from contamination. To address these challenges, this paper proposes the Real-time Cable Defect Detection Network (RC2DNet), which achieves an optimal balance between detection accuracy and computational efficiency. Unlike conventional approaches, RC2DNet introduces a small object feature extraction module that enhances the semantic representation of small targets through feature pyramids, multi-level feature fusion, and an adaptive weighting mechanism. Additionally, a boundary feature enhancement module is designed, incorporating boundary-aware convolution, a novel boundary attention mechanism, and an improved loss function to significantly enhance boundary localization accuracy. Experimental results demonstrate that RC2DNet outperforms state-of-the-art methods in precision, recall, F1-score, mean Intersection over Union (mIoU), and frame rate, enabling real-time and highly accurate cable defect detection in complex backgrounds.
In printed circuit board (PCB) manufacturing, surface defects can significantly affect product quality. To address the performance degradation, high false detection rates, and missed detections caused by complex backgrounds in current intelligent inspection algorithms, this paper proposes CG-YOLOv8, a lightweight and improved model based on YOLOv8n for PCB surface defect detection. The proposed method optimizes the network architecture and compresses parameters to reduce model complexity while maintaining high detection accuracy, thereby enhancing the capability of identifying diverse defects under complex conditions. Specifically, a cascaded multi-receptive field (CMRF) module is adopted to replace the SPPF module in the backbone to improve feature perception, and an inverted residual mobile block (IRMB) is integrated into the C2f module to further enhance performance. Additionally, conventional convolution layers are replaced with GSConv to reduce computational cost, and a lightweight Convolutional Block Attention Module based Convolution (CBAMConv) module is introduced after Grouped Spatial Convolution (GSConv) to preserve accuracy through attention mechanisms. The detection head is also optimized by removing the medium- and large-scale detection layers, thereby enhancing the model's ability to detect small-scale defects and further reducing complexity. Experimental results show that, compared to the original YOLOv8n, the proposed CG-YOLOv8 reduces parameter count by 53.9%, improves mAP@0.5 by 2.2%, and increases precision and recall by 2.0% and 1.8%, respectively. These improvements demonstrate that CG-YOLOv8 offers an efficient and lightweight solution for PCB surface defect detection.
To address the challenges of high-precision optical surface defect detection, we propose a novel design for a wide-field and broadband light field camera in this work. The proposed system can achieve a 50° field of view and operates at both visible and near-infrared wavelengths. Using the principles of light field imaging, the proposed design enables 3D reconstruction of optical surfaces, thus enabling vertical surface height measurements with enhanced accuracy. Using Zemax-based simulations, we evaluate the system's modulation transfer function, its optical aberrations, and its tolerance to shape variations through Zernike coefficient adjustments. The results demonstrate that this camera can achieve the required spatial resolution while also maintaining high imaging quality and thus offers a promising solution for advanced optical surface defect inspection.
The exponential expansion of the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Transportation Management of Things (TMoT) produces vast amounts of real-time streaming data. Ensuring system dependability, operational efficiency, and security depends on the identification of anomalies in these dynamic and resource-constrained systems. Due to their high computational requirements and inability to efficiently process continuous data streams, traditional anomaly detection techniques often fail in IoT systems. This work presents a resource-efficient adaptive anomaly detection model for real-time streaming data in IoT systems. Extensive experiments were carried out on multiple real-world datasets, achieving an average accuracy score of 96.06% with an execution time close to 7.5 milliseconds for each individual streaming data point, demonstrating its potential for real-time, resource-constrained applications. The model uses Principal Component Analysis (PCA) for dimensionality reduction and a Z-score technique for anomaly detection. It maintains a low computational footprint with a sliding window mechanism, enabling incremental data processing and identification of both transient and sustained anomalies without storing historical data. The system uses a Multivariate Linear Regression (MLR)-based imputation technique that estimates missing or corrupted sensor values, preserving data integrity prior to anomaly detection. The suggested solution is appropriate for many uses in smart cities, industrial automation, environmental monitoring, IoT security, and intelligent transportation systems, and is particularly well-suited for resource-constrained edge devices.
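As a rough, self-contained illustration of the pipeline summarized above (PCA for dimensionality reduction, Z-scores computed over a sliding window, no history kept beyond the window), a Python sketch might look as follows. The window size, component count, and threshold are illustrative assumptions rather than the paper's settings, and re-fitting PCA at every step is shown only for brevity; an efficient implementation would refit periodically or incrementally.

```python
# Minimal sketch of a sliding-window PCA + Z-score stream detector.
# Hyperparameters (window size, components, threshold) are illustrative
# assumptions, not the values used in the paper.
from collections import deque

import numpy as np
from sklearn.decomposition import PCA


class StreamingAnomalyDetector:
    def __init__(self, window=256, n_components=3, z_threshold=3.0):
        self.window = deque(maxlen=window)      # recent points only; no long-term history
        self.n_components = n_components
        self.z_threshold = z_threshold

    def update(self, x):
        """Add one multivariate sample and return True if it looks anomalous."""
        x = np.asarray(x, dtype=float)
        self.window.append(x)
        if len(self.window) < self.window.maxlen:
            return False                        # warm-up: not enough context yet

        data = np.asarray(self.window)
        pca = PCA(n_components=self.n_components).fit(data)
        scores = pca.transform(data)            # low-dimensional projection of the window
        mu, sigma = scores.mean(axis=0), scores.std(axis=0) + 1e-9
        z = np.abs((pca.transform(x[None, :])[0] - mu) / sigma)
        return bool(np.any(z > self.z_threshold))


# Toy usage: a spike in one sensor channel should be flagged.
rng = np.random.default_rng(0)
detector = StreamingAnomalyDetector()
for t in range(400):
    sample = rng.normal(size=5)
    if t == 350:
        sample[2] += 12.0                       # inject a transient anomaly
    if detector.update(sample):
        print(f"anomaly flagged at t={t}")
```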
Automatic analysis of student behavior in classrooms has gained importance with the rise of smart education and vision technologies. However, the limited real-time accuracy of existing methods severely constrains their practical classroom deployment. To address this issue of low accuracy, we propose an improved YOLOv11-based detector that integrates CARAFE upsampling, DySnakeConv, DyHead, and SMFA fusion modules. This new model for real-time classroom behavior detection captures fine-grained student behaviors with low latency. Additionally, we have developed a visualization system that presents data through intuitive dashboards. This system enables teachers to dynamically grasp classroom engagement by tracking student participation and involvement. The enhanced YOLOv11 model achieves an mAP@0.5 of 87.2% on the evaluated datasets, surpassing baseline models. The significance lies in two aspects. First, it provides a practical technical route for deployable live classroom behavior monitoring and engagement feedback systems. Second, by integrating this proposed system, educators could make data-informed and fine-grained teaching decisions, ultimately improving instructional quality and learning outcomes.
The various bioacoustic signals obtained with auscultation contain complex clinical information that has traditionally been used as biomarkers; however, they are not extensively used in clinical studies owing to their spatiotemporal limitations. In this study, we developed a wearable stethoscope for wireless, skin-attachable, low-power, continuous, real-time auscultation using a lung-sound-monitoring patch (LSMP). The LSMP can monitor respiratory function through a mobile app and classify normal and adventitious breathing by comparing their unique acoustic characteristics. Heart and breathing sounds can be distinguished from complex sound signals consisting of a mixture of bioacoustic signals and external noise. The performance of the LSMP sensor was further demonstrated in pediatric patients with asthma and in elderly patients with chronic obstructive pulmonary disease (COPD), in whom wheezing sounds were classified at specific frequencies. In addition, we developed a novel method for counting wheezing events based on a two-dimensional convolutional neural network deep-learning model constructed de novo and trained with our augmented fundamental lung-sound dataset. We implemented a counting algorithm to identify wheezing events in real time regardless of the respiratory cycle. The artificial intelligence-based adventitious breathing event counter distinguished >80% of the events (especially wheezing) in long-term clinical applications in patients with COPD.
Deep learning-based intelligent recognition algorithms are increasingly recognized for their potential to address the labor-intensive challenge of manual pest detection. However, their deployment on mobile devices has been constrained by high computational demands. Here, we developed GBiDC-PEST, a mobile application that incorporates an improved, lightweight detection algorithm based on the You Only Look Once (YOLO) series single-stage architecture, for real-time detection of four tiny pests (wheat mites, sugarcane aphids, wheat aphids, and rice planthoppers). GBiDC-PEST incorporates several innovative modules, including GhostNet for lightweight feature extraction and architecture optimization by reconstructing the backbone, the bi-directional feature pyramid network (BiFPN) for enhanced multiscale feature fusion, depthwise convolution (DWConv) layers to reduce computational load, and the convolutional block attention module (CBAM) to enable precise feature focus. The newly developed GBiDC-PEST was trained and validated using a multitarget agricultural tiny pest dataset (Tpest-3960) that covered various field environments. GBiDC-PEST (2.8 MB) significantly reduced the model size to only 20% of the original model size, offering a smaller size than the YOLO series (v5-v10), higher detection accuracy than YOLOv10n and v10s, and faster detection speed than v8s, v9c, v10m, and v10b. In Android deployment experiments, GBiDC-PEST demonstrated enhanced performance in detecting pests against complex backgrounds, and the accuracy for wheat mites and rice planthoppers was improved by 4.5-7.5% compared with the original model. The GBiDC-PEST optimization algorithm and its mobile deployment proposed in this study offer a robust technical framework for the rapid, onsite identification and localization of tiny pests. This advancement provides valuable insights for effective pest monitoring, counting, and control in various agricultural settings.
To address the challenges of low accuracy and insufficient real-time performance in dynamic object detection for UAV surveillance, this paper proposes a novel tracking framework that integrates a lightweight improved YOLOv5s model with adaptive motion compensation. A UAV-view dynamic feature enhancement strategy is innovatively introduced, and a lightweight detection network combining attention mechanisms and multi-scale fusion is constructed. The robustness of tracking under motion blur scenarios is also optimized. Experimental results demonstrate that the proposed method achieves an mAP@0.5 of 68.2% on the VisDrone dataset and reaches an inference speed of 32 FPS on the NVIDIA Jetson TX2 platform. This significantly improves the balance between accuracy and efficiency in complex scenes, offering reliable technical support for real-time applications such as emergency response.
People with visual impairments face substantial navigation difficulties in residential and unfamiliar indoor spaces. Neither canes nor verbal navigation systems possess adequate features to deliver real-time spatial awareness to users. This research work represents a feasibility study of a wearable IoT-based indoor object detection assistant system architecture that employs a real-time indoor object detection approach to help visually impaired users recognize indoor objects. The system architecture includes four main layers: the Wearable Internet of Things (IoT), Network, Cloud, and Indoor Object Detection layers. The wearable hardware prototype is assembled using a Raspberry Pi 4, while the indoor object detection approach exploits YOLOv11. YOLOv11 represents the cutting edge of deep learning models optimized for both speed and accuracy in recognizing objects and powers the research prototype. In this work, we used a prototype implementation, comparative experiments, and two datasets compiled from Furniture Detection (i.e., from Roboflow Universe) and Kaggle, which together comprise 3000 images evenly distributed across three object categories: bed, sofa, and table. In the evaluation process, the Raspberry Pi is used only for a feasibility demonstration of real-time inference performance (e.g., latency and memory consumption) on embedded hardware. We also evaluated YOLOv11 by comparing its performance with other current methodologies, including a Convolutional Neural Network (CNN) model (MobileNet-Single Shot MultiBox Detector (SSD)) and the RT-DETR Vision Transformer. The experimental results show that YOLOv11 stands out by reaching averages of 99.07%, 98.51%, 97.96%, and 98.22% for accuracy, precision, recall, and F1-score, respectively. This feasibility study highlights the effectiveness of the Raspberry Pi 4 and YOLOv11 in real-time indoor object detection, paving the way for future structured user studies with visually impaired people to evaluate real-world use and impact.
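For readers who want a concrete starting point, the snippet below sketches single-image YOLOv11 inference through the ultralytics package, together with a crude latency measurement of the kind reported in the feasibility evaluation. The checkpoint name, image path, and confidence threshold are placeholders, not the study's trained furniture model or settings.

```python
# Sketch of single-image inference with an off-the-shelf YOLOv11 model via the
# ultralytics package, plus a simple latency measurement. The checkpoint name
# and image path are placeholders.
import time

from ultralytics import YOLO

model = YOLO("yolo11n.pt")                      # nano variant; swap in the trained furniture model
start = time.perf_counter()
results = model.predict("living_room.jpg", imgsz=640, conf=0.5, verbose=False)
latency_ms = (time.perf_counter() - start) * 1000

for r in results:
    for box in r.boxes:
        label = model.names[int(box.cls)]       # e.g., "bed", "sofa", "table" for the custom classes
        print(f"{label}: confidence {float(box.conf):.2f}")
print(f"inference latency: {latency_ms:.1f} ms")
```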
BACKGROUND Early detection of precancerous lesions is of vital importance for reducing the incidence and mortality of upper gastrointestinal (UGI) tract cancer. However, traditional endoscopy has certain limitations in detecting precancerous lesions. In contrast, real-time computer-aided detection (CAD) systems enhanced by artificial intelligence (AI), although they may increase unnecessary medical procedures, can provide immediate feedback during examination, thereby improving the accuracy of lesion detection. This article aims to conduct a meta-analysis of the diagnostic performance of CAD systems in identifying precancerous lesions of UGI tract cancer during esophagogastroduodenoscopy (EGD), evaluate their potential clinical application value, and determine the direction for further research. AIM To investigate the improvement in the efficiency of EGD examination achieved by the real-time AI-enabled CAD (AI-CAD) system. METHODS PubMed, EMBASE, Web of Science, and Cochrane Library databases were searched by two independent reviewers to retrieve literature with per-patient analysis, with a deadline up to April 2025. A meta-analysis was performed with R Studio software (R 4.5.0). A random-effects model was used, and subgroup analysis was carried out to identify possible sources of heterogeneity. RESULTS The initial search identified 802 articles. According to the inclusion criteria, 2113 patients from 10 studies were included in this meta-analysis. For detecting precancerous lesions, the pooled accuracy difference was 0.16 (95%CI: 0.12-0.20) and the logarithmic difference of diagnostic odds ratios was -0.19 (95%CI: -0.75-0.37); sensitivity was 0.89 (95%CI: 0.85-0.92) for the AI group versus 0.67 (95%CI: 0.63-0.71) for the endoscopist group, specificity was 0.89 (95%CI: 0.84-0.93) versus 0.77 (95%CI: 0.70-0.83), and the area under the summary receiver operating characteristic curve (AUC) was 0.928 (95%CI: 0.841-0.948) versus 0.722 (95%CI: 0.677-0.821), respectively. CONCLUSION The present studies further provide evidence that AI-CAD is a reliable endoscopic diagnostic tool that can be used to assist endoscopists in the detection of precancerous lesions in the UGI tract. It may be introduced on a large scale for clinical application to enhance the accuracy of detecting precancerous lesions in the UGI tract.
Objective: To evaluate the clinical efficacy of high-throughput real-time mass spectrometry detection technology for exhaled breath in the rapid diagnosis of pulmonary tuberculosis (PTB), providing novel technological support for early screening and diagnosis of PTB. Methods: A total of 120 PTB patients admitted to a hospital from January 2023 to June 2024 were selected as the case group, and 150 healthy individuals and patients with non-tuberculous pulmonary diseases during the same period were selected as the control group. Exhaled breath samples were collected from all study subjects, and the types and concentrations of volatile organic compounds (VOCs) in the samples were detected using a high-throughput real-time mass spectrometer. A diagnostic model was constructed using machine learning algorithms, and core indicators such as diagnostic sensitivity, specificity, and area under the curve (AUC) of this technology were analyzed and compared with the efficacy of traditional sputum smear examination, sputum culture, and GeneXpert MTB/RIF detection. Results: The diagnostic sensitivity of the high-throughput real-time mass spectrometry diagnostic model for exhaled breath in diagnosing PTB was 92.5%, the specificity was 94.0%, and the AUC was 0.978, which were significantly higher than those of sputum smear examination (sensitivity 58.3%, specificity 90.0%, AUC 0.741). Compared with GeneXpert technology, its specificity was comparable (94.0% vs 93.3%), and the detection time was shortened to less than 15 minutes. The model achieved an accuracy of 91.3% in distinguishing PTB from other pulmonary diseases and was not affected by demographic factors such as age and gender. Conclusion: High-throughput real-time mass spectrometry detection technology for exhaled breath has the advantages of being non-invasive, rapid, highly sensitive, and highly specific, and holds significant clinical application value in the rapid diagnosis and large-scale screening of PTB, warranting further promotion.
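The core indicators mentioned above can be computed directly from per-subject labels and model scores; the short sketch below shows one conventional way to do so with scikit-learn, using toy placeholder data rather than the study's measurements.

```python
# Sketch of how the reported diagnostic indicators (sensitivity, specificity,
# AUC) can be computed from per-subject labels and model scores; the arrays
# below are placeholders, not the study's data.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score


def diagnostic_metrics(y_true, y_score, threshold=0.5):
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn)                # true positive rate among PTB cases
    specificity = tn / (tn + fp)                # true negative rate among controls
    auc = roc_auc_score(y_true, y_score)        # threshold-free ranking quality
    return sensitivity, specificity, auc


y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])     # 1 = PTB case, 0 = control (toy data)
y_score = np.array([0.92, 0.81, 0.40, 0.10, 0.35, 0.22, 0.77, 0.55])
print(diagnostic_metrics(y_true, y_score))
```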
Psychological distress detection plays a critical role in modern healthcare, especially in ambient environments where continuous monitoring is essential for timely intervention. Advances in sensor technology and artificial intelligence (AI) have enabled the development of systems capable of mental health monitoring using multimodal data. However, existing models often struggle with contextual adaptation and real-time decision-making in dynamic settings. This paper addresses these challenges by proposing TRANS-HEALTH, a hybrid framework that integrates transformer-based inference with Belief-Desire-Intention (BDI) reasoning for real-time psychological distress detection. The framework utilizes a multimodal dataset containing EEG, GSR, heart rate, and activity data to predict distress while adapting to individual contexts. The methodology combines deep learning for robust pattern recognition and symbolic BDI reasoning to enable adaptive decision-making. The novelty of the approach lies in its seamless integration of transformer models with BDI reasoning, providing both high accuracy and contextual relevance in real time. Performance metrics such as accuracy, precision, recall, and F1-score are employed to evaluate the system's performance. The results show that TRANS-HEALTH outperforms existing models, achieving 96.1% accuracy with 4.78 ms latency and significantly reducing false alerts, with an enhanced ability to engage users, making it suitable for deployment in wearable and remote healthcare environments.
Cardiovascular diseases (CVDs) continue to be a leading cause of mortality worldwide, emphasizing the importance of early and accurate prediction. Electrocardiogram (ECG) signals, central to cardiac monitoring, have increasingly been integrated with Deep Learning (DL) for real-time prediction of CVDs. However, DL models are prone to performance degradation due to concept drift and to catastrophic forgetting. To address this issue, we propose a real-time CVD prediction approach, referred to as ADWIN-GFR, that combines Convolutional Neural Network (CNN) layers for spatial feature extraction with Gated Recurrent Units (GRU) for temporal modeling, alongside adaptive drift detection and mitigation mechanisms. The proposed approach integrates Adaptive Windowing (ADWIN) for real-time concept drift detection, a fine-tuning strategy based on Generative Features Replay (GFR) to preserve previously acquired knowledge, and a dynamic replay buffer ensuring variance, diversity, and data distribution coverage. Extensive experiments conducted on the MIT-BIH arrhythmia dataset demonstrate that ADWIN-GFR outperforms standard fine-tuning techniques, achieving an average post-drift accuracy of 95.4%, a macro F1-score of 93.9%, and a remarkably low forgetting score of 0.9%. It also exhibits an average drift detection delay of 12 steps and achieves an adaptation gain of 17.2%. These findings underscore the potential of ADWIN-GFR for deployment in real-world cardiac monitoring systems, including wearable ECG devices and hospital-based patient monitoring platforms.
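To make the adaptation loop concrete, the sketch below illustrates the general "detect drift, then fine-tune with replayed samples" control flow. It is not the paper's ADWIN-GFR implementation: a simple windowed mean-shift test stands in for ADWIN, a plain sample buffer stands in for generative feature replay, and an incremental linear classifier stands in for the CNN-GRU network.

```python
# Control-flow sketch of "detect drift, then adapt with a replay buffer".
# All components are simplified stand-ins for the ones named in the paper.
from collections import deque
import random

import numpy as np
from sklearn.linear_model import SGDClassifier


class MeanShiftDetector:
    """Flags drift when the recent error rate departs from the reference window."""

    def __init__(self, window=300, recent=60, threshold=3.0):
        self.ref, self.recent = deque(maxlen=window), deque(maxlen=recent)
        self.threshold = threshold

    def update(self, error):
        self.recent.append(error)
        self.ref.append(error)
        if len(self.ref) < self.ref.maxlen:
            return False
        mu, sigma = np.mean(self.ref), np.std(self.ref) + 1e-8
        return abs(np.mean(self.recent) - mu) > self.threshold * sigma


def stream_learn(stream, n_classes=2, buffer_size=500):
    model = SGDClassifier(loss="log_loss")
    detector, replay = MeanShiftDetector(), deque(maxlen=buffer_size)
    classes = np.arange(n_classes)
    for i, (x, y) in enumerate(stream):
        if i == 0:
            model.partial_fit([x], [y], classes=classes)   # bootstrap the model
            replay.append((x, y))
            continue
        error = float(model.predict([x])[0] != y)
        if detector.update(error):
            # Adapt: fine-tune on a mix of old (replayed) and new samples so the
            # model follows the drift without forgetting earlier regimes.
            batch = random.sample(list(replay), k=min(64, len(replay))) + [(x, y)]
            xs, ys = zip(*batch)
            model.partial_fit(list(xs), list(ys))
            detector.ref.clear()                            # restart the reference window
        else:
            model.partial_fit([x], [y])
        replay.append((x, y))
    return model
```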
A fast and effective method is proposed to detect weld defects adaptively in various types of real-time X-ray images obtained under different conditions. After weld extraction and noise reduction, a properly sized median filter template is used to estimate the weld background. After the weld background is subtracted from the original image, an adaptive threshold segmentation algorithm is proposed to obtain the binary image; then morphological close and open operations, a labeling algorithm, and a false-alarm elimination algorithm are applied to process the binary image and obtain the defect detection result. Finally, a fast implementation procedure for the proposed method is developed. The proposed method is tested on real-time X-ray images obtained from different X-ray imaging systems. Experimental results show that the proposed method is effective in detecting low-contrast weld defects with few false alarms and is adaptive to various types of real-time X-ray imaging systems.
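A rough OpenCV sketch of this background-subtraction pipeline is given below. The kernel sizes, the global Otsu threshold (used here in place of the paper's adaptive segmentation), and the minimum blob area are illustrative stand-ins, and the file name is a placeholder.

```python
# Sketch of median-filter background estimation, subtraction, thresholding,
# morphology, and blob filtering for weld defect candidates.
import cv2

img = cv2.imread("weld_xray.png", cv2.IMREAD_GRAYSCALE)     # placeholder file name

denoised = cv2.medianBlur(img, 5)                            # noise reduction
background = cv2.medianBlur(denoised, 41)                    # large median approximates the slowly varying weld background
residual = cv2.absdiff(denoised, background)                 # defects remain after background removal

# Global Otsu threshold used here in place of the paper's adaptive segmentation.
_, binary = cv2.threshold(residual, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)  # close small gaps
cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_OPEN, kernel)  # remove isolated specks

# Label connected components and drop tiny blobs as likely false alarms.
num, labels, stats, _ = cv2.connectedComponentsWithStats(cleaned)
defects = [i for i in range(1, num) if stats[i, cv2.CC_STAT_AREA] >= 20]
print(f"{len(defects)} candidate defect regions kept")
```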
The quality of the exposed avionics solder joints has a significant impact on the stable operation of in-orbit spacecraft. Nevertheless, previously reported inspection methods for multi-scale solder joint defects generally suffer from low accuracy and slow detection speed. Herein, a novel real-time detector, VMMAO-YOLO, is demonstrated based on a variable multi-scale concurrency and multi-depth aggregation network (VMMANet) backbone and a "one-stop" global information gather-distribute (OS-GD) module. Combined with infrared thermography technology, it can achieve fast and high-precision detection of both internal and external solder joint defects. Specifically, VMMANet is designed for efficient multi-scale feature extraction and mainly comprises variable multi-scale feature concurrency (VMC) and multi-depth feature aggregation-alignment (MAA) modules. VMC can extract multi-scale features via multiple fixed-size and deformable convolutions, while MAA can aggregate and align multi-depth features of the same order for feature inference. This allows low-level features with more spatial details to be transmitted depth-wise, enabling the deeper network to selectively utilize the preceding inference information. VMMANet replaces inefficient high-density deep convolution by increasing the width of intermediate feature levels, leading to a salient decline in parameters. OS-GD is developed for efficacious feature extraction, aggregation, and distribution, further enhancing the global information gathering and deployment capability of the network. On a self-made solder joint image dataset, VMMAO-YOLO achieves a mean average precision (mAP@0.5) of 91.6%, surpassing all mainstream YOLO-series models. Moreover, VMMAO-YOLO has a body size of merely 19.3 MB and a detection speed of up to 119 frames per second, far superior to prevalent YOLO-series detectors.
To solve the problem of low detection accuracy for complex weld defects, this paper proposes a weld defect detection method based on an improved YOLOv5s. To enhance the ability to focus on key information in feature maps, the scSE attention mechanism is introduced into the backbone network of YOLOv5s. A Fusion-Block module and additional layers are added to the neck network of YOLOv5s to improve the effect of feature fusion and meet the needs of complex object detection. To reduce the computational complexity of the model, the C3Ghost module is used to replace the CSP2_1 module in the neck network of YOLOv5s. The scSE-ASFF module is constructed and inserted between the neck network and the prediction end to realize the fusion of features between different layers. To address the issue of imbalanced sample quality in the dataset and improve the regression speed and accuracy of the loss function, the CIoU loss function in the YOLOv5s model is replaced with the Focal-EIoU loss function. Finally, experiments are conducted on the collected weld defect dataset to verify the feasibility of the improved YOLOv5s for weld defect detection. The experimental results show that the precision and mAP of the improved YOLOv5s in detecting complex weld defects reach 83.4% and 76.1%, respectively, which are 2.5% and 7.6% higher than those of the traditional YOLOv5s model. The proposed weld defect detection method based on the improved YOLOv5s can effectively solve the problem of low weld defect detection accuracy.
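For reference, the sketch below implements a Focal-EIoU loss for axis-aligned boxes following the commonly published EIoU/Focal-EIoU formulation; the gamma value and numerical details are assumptions and may differ from the exact implementation used in this work.

```python
# Sketch of a Focal-EIoU loss for boxes in (x1, y1, x2, y2) form.
import torch


def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-7):
    # Intersection and union
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box
    cx1, cy1 = torch.min(pred[:, 0], target[:, 0]), torch.min(pred[:, 1], target[:, 1])
    cx2, cy2 = torch.max(pred[:, 2], target[:, 2]), torch.max(pred[:, 3], target[:, 3])
    cw, ch = (cx2 - cx1).clamp(min=eps), (cy2 - cy1).clamp(min=eps)

    # Center-distance, width and height penalty terms (the "E" in EIoU)
    pcx, pcy = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    tcx, tcy = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    rho_center = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    rho_w = ((pred[:, 2] - pred[:, 0]) - (target[:, 2] - target[:, 0])) ** 2
    rho_h = ((pred[:, 3] - pred[:, 1]) - (target[:, 3] - target[:, 1])) ** 2

    eiou = 1 - iou + rho_center / (cw ** 2 + ch ** 2) + rho_w / (cw ** 2) + rho_h / (ch ** 2)
    return (iou.detach().clamp(min=0) ** gamma * eiou).mean()   # focal weighting by IoU
```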
Real-time detection of object size has become a hot topic in the testing field, and image processing is its core algorithm. This paper focuses on the processing and display of collected dynamic images to achieve real-time image processing for moving objects. First, median filtering, gain calibration, image segmentation, image binarization, corner detection, and edge fitting are employed to process the images of the moving objects so that the image closely matches the real object. Then, the processed images are simultaneously displayed in real time to make them easier to analyze, understand, and identify, thus reducing the computational complexity. Finally, a human-computer interaction (HCI)-friendly interface based on VC++ is designed to accomplish the digital logic transform, image processing, and real-time display of the objects. The experiment shows that the proposed algorithm and software design have better real-time performance and accuracy, which can meet industrial needs.
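A condensed OpenCV version of this processing chain, applied to a single frame, might look like the following; the file name, threshold choice, and pixel-to-millimetre conversion are illustrative assumptions.

```python
# Sketch of median filtering, binarization, corner detection, and shape fitting
# to estimate the size of the dominant object in one frame.
import cv2

frame = cv2.imread("moving_object.png", cv2.IMREAD_GRAYSCALE)     # placeholder frame

smoothed = cv2.medianBlur(frame, 5)                                # median filtering
_, binary = cv2.threshold(smoothed, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)     # binarization

corners = cv2.goodFeaturesToTrack(smoothed, maxCorners=50,
                                  qualityLevel=0.05, minDistance=10)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    largest = max(contours, key=cv2.contourArea)
    (cx, cy), (w, h), angle = cv2.minAreaRect(largest)              # fitted bounding rectangle
    mm_per_pixel = 0.1                                              # assumed result of gain/scale calibration
    print(f"object size ~ {w * mm_per_pixel:.1f} mm x {h * mm_per_pixel:.1f} mm, angle {angle:.1f} deg")
    print(f"{0 if corners is None else len(corners)} corners detected")
```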
Current You Only Look Once (YOLO)-based algorithm models face the challenge of overwhelming parameters and calculation complexity in the printed circuit board (PCB) defect detection application scenario. In order to solve this problem, we propose a new method that combines the lightweight MobileViT (mobile vision transformer) network with the convolutional block attention module (CBAM) mechanism and a new regression loss function. This method needs fewer computational resources, making it more suitable for embedded edge detection devices. Meanwhile, the new loss function improves the positioning accuracy of the bounding box and enhances the robustness of the model. In addition, experiments on public datasets demonstrate that the improved model achieves an average accuracy of 87.9% across six typical defect detection tasks, while reducing computational costs by nearly 90%. It significantly reduces the model's computational requirements while maintaining accuracy, ensuring reliable performance for edge deployment.
This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token-to-Token ViT, ViT without memory, and Parallel ViT. Leveraging a widely used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and understanding. These techniques mitigated overfitting, stabilized training, and improved generalization capabilities. The LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as the Token-to-Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token-to-Token ViT, ViT without memory, and Parallel ViT, respectively. The findings highlight LMViT's ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature extraction. The additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial applications. For instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often misclassify. This study not only demonstrates LMViT's potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token-to-Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are key. Future research may focus on enhancing LMViT's computational efficiency for deployment in real-time quality control systems.
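As an illustration of the t-SNE step, the sketch below projects a feature matrix to two dimensions and plots it by class; the random matrix is a stand-in for the per-image embeddings that would, in practice, be extracted from LMViT or a CNN.

```python
# Sketch of a t-SNE projection of learned defect features; the data here are
# random placeholders for penultimate-layer embeddings and class labels.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = rng.normal(size=(600, 256))           # placeholder for per-image embeddings
labels = rng.integers(0, 6, size=600)            # placeholder for six defect classes

embedded = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=0).fit_transform(features)

plt.figure(figsize=(6, 5))
plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, cmap="tab10", s=8)
plt.title("t-SNE of defect feature embeddings")
plt.tight_layout()
plt.savefig("tsne_features.png", dpi=150)
```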
基金supported in part by the National Key R&D Program of China(Grant No.2023YFB3307604)the Shanxi Province Basic Research Program Youth Science Research Project(Grant Nos.202303021212054 and 202303021212046)+3 种基金the Key Projects Supported by Hebei Natural Science Foundation(Grant No.E2024203125)the National Science Foundation of China(Grant No.52105391)the Hebei Provincial Science and Technology Major Project(Grant No.23280101Z)the National Key Laboratory of Metal Forming Technology and Heavy Equipment Open Fund(Grant No.S2308100.W17).
文摘A novel dual-branch decoding fusion convolutional neural network model(DDFNet)specifically designed for real-time salient object detection(SOD)on steel surfaces is proposed.DDFNet is based on a standard encoder–decoder architecture.DDFNet integrates three key innovations:first,we introduce a novel,lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details,enabling efficient salient feature extraction.Then,we propose an innovative dual-branch decoding fusion structure,comprising the refined defect representation branch and the enhanced defect representation branch,which enhance accuracy in defect region identification and feature representation.Additionally,to further improve the detection of small and complex defects,we incorporate a multi-scale attention fusion module.Experimental results on the public ESDIs-SOD dataset show that DDFNet,with only 3.69 million parameters,achieves detection performance comparable to current state-of-the-art models,demonstrating its potential for real-time industrial applications.Furthermore,our DDFNet-L variant consistently outperforms leading methods in detection performance.The code is available at https://github.com/13140W/DDFNet.
基金supported by the National Natural Science Foundation of China under Grant 62306128the Basic Science Research Project of Jiangsu Provincial Department of Education under Grant 23KJD520003the Leading Innovation Project of Changzhou Science and Technology Bureau under Grant CQ20230072.
文摘Real-time detection of surface defects on cables is crucial for ensuring the safe operation of power systems.However,existing methods struggle with small target sizes,complex backgrounds,low-quality image acquisition,and interference from contamination.To address these challenges,this paper proposes the Real-time Cable Defect Detection Network(RC2DNet),which achieves an optimal balance between detection accuracy and computational efficiency.Unlike conventional approaches,RC2DNet introduces a small object feature extraction module that enhances the semantic representation of small targets through feature pyramids,multi-level feature fusion,and an adaptive weighting mechanism.Additionally,a boundary feature enhancement module is designed,incorporating boundary-aware convolution,a novel boundary attention mechanism,and an improved loss function to significantly enhance boundary localization accuracy.Experimental results demonstrate that RC2DNet outperforms state-of-the-art methods in precision,recall,F1-score,mean Intersection over Union(mIoU),and frame rate,enabling real-time and highly accurate cable defect detection in complex backgrounds.
基金funded by the Joint Funds of the National Natural Science Foundation of China(U2341223)the Beijing Municipal Natural Science Foundation(No.4232067).
文摘In printed circuit board(PCB)manufacturing,surface defects can significantly affect product quality.To address the performance degradation,high false detection rates,and missed detections caused by complex backgrounds in current intelligent inspection algorithms,this paper proposes CG-YOLOv8,a lightweight and improved model based on YOLOv8n for PCB surface defect detection.The proposed method optimizes the network architecture and compresses parameters to reduce model complexity while maintaining high detection accuracy,thereby enhancing the capability of identifying diverse defects under complex conditions.Specifically,a cascaded multi-receptive field(CMRF)module is adopted to replace the SPPF module in the backbone to improve feature perception,and an inverted residual mobile block(IRMB)is integrated into the C2f module to further enhance performance.Additionally,conventional convolution layers are replaced with GSConv to reduce computational cost,and a lightweight Convolutional Block Attention Module based Convolution(CBAMConv)module is introduced after Grouped Spatial Convolution(GSConv)to preserve accuracy through attention mechanisms.The detection head is also optimized by removing medium and large-scale detection layers,thereby enhancing the model’s ability to detect small-scale defects and further reducing complexity.Experimental results show that,compared to the original YOLOv8n,the proposed CG-YOLOv8 reduces parameter count by 53.9%,improves mAP@0.5 by 2.2%,and increases precision and recall by 2.0%and 1.8%,respectively.These improvements demonstrate that CG-YOLOv8 offers an efficient and lightweight solution for PCB surface defect detection.
基金supported by the Jilin Science and Technology Development Plan(20240101029JJ)the following study:synchronized high-speed detection of surface shape and defects in the grinding stage of complex surfaces(KLMSZZ202305)+3 种基金for the high-precision wide dynamic large aperture optical inspection system for fine astronomical observation by the National Major Research Instrument Development Project(62127901)for ultrasmooth manufacturing technology of large diameter complex curved surface by the National Key R&D Program(2022YFB3403405)for research on the key technology of rapid synchronous detection of surface shape and subsurface defects in the grinding stage of large diameter complex surfaces by the International Cooperation Project(2025010157)The Key Laboratory of Optical System Advanced Manufacturing Technology,Chinese Academy of Sciences(2022KLOMT02-04)also supported this study.
文摘To address the challenges of high-precision optical surface defect detection,we propose a novel design for a wide-field and broadband light field camera in this work.The proposed system can achieve a 50°field of view and operates at both visible and near-infrared wavelengths.Using the principles of light field imaging,the proposed design enables 3D reconstruction of optical surfaces,thus enabling vertical surface height measurements with enhanced accuracy.Using Zemax-based simulations,we evaluate the system’s modulation transfer function,its optical aberrations,and its tolerance to shape variations through Zernike coefficient adjustments.The results demonstrate that this camera can achieve the required spatial resolution while also maintaining high imaging quality and thus offers a promising solution for advanced optical surface defect inspection.
基金funded by the Ongoing Research Funding Program(ORF-2025-890)King Saud University,Riyadh,Saudi Arabia and was supported by the Competitive Research Fund of theUniversity of Aizu,Japan.
文摘The exponential expansion of the Internet of Things(IoT),Industrial Internet of Things(IIoT),and Transportation Management of Things(TMoT)produces vast amounts of real-time streaming data.Ensuring system dependability,operational efficiency,and security depends on the identification of anomalies in these dynamic and resource-constrained systems.Due to their high computational requirements and inability to efficiently process continuous data streams,traditional anomaly detection techniques often fail in IoT systems.This work presents a resource-efficient adaptive anomaly detection model for real-time streaming data in IoT systems.Extensive experiments were carried out on multiple real-world datasets,achieving an average accuracy score of 96.06%with an execution time close to 7.5 milliseconds for each individual streaming data point,demonstrating its potential for real-time,resourceconstrained applications.The model uses Principal Component Analysis(PCA)for dimensionality reduction and a Z-score technique for anomaly detection.It maintains a low computational footprint with a sliding window mechanism,enabling incremental data processing and identification of both transient and sustained anomalies without storing historical data.The system uses a Multivariate Linear Regression(MLR)based imputation technique that estimates missing or corrupted sensor values,preserving data integrity prior to anomaly detection.The suggested solution is appropriate for many uses in smart cities,industrial automation,environmental monitoring,IoT security,and intelligent transportation systems,and is particularly well-suited for resource-constrained edge devices.
文摘Automatic analysis of student behavior in classrooms has gained importance with the rise of smart education and vision technologies.However,the limited real-time accuracy of existing methods severely constrains their practical classroom deployment.To address this issue of low accuracy,we propose an improved YOLOv11-based detector that integrates CARAFE upsampling,DySnakeConv,DyHead,and SMFA fusion modules.This new model for real-time classroom behavior detection captures fine-grained student behaviors with low latency.Additionally,we have developed a visualization system that presents data through intuitive dashboards.This system enables teachers to dynamically grasp classroom engagement by tracking student participation and involvement.The enhanced YOLOv11 model achieves an mAP@0.5 of 87.2%on the evaluated datasets,surpassing baseline models.This significance lies in two aspects.First,it provides a practical technical route for deployable live classroom behavior monitoring and engagement feedback systems.Second,by integrating this proposed system,educators could make data-informed and fine-grained teaching decisions,ultimately improving instructional quality and learning outcomes.
基金supported by the Korea Environment Industry&Technology Institute(KEITI)through Digital Infrastructure Building Project for Monitoring,Surveying and Evaluating the Environmental Health program,funded by the Korea Ministry of Environment(MOE)(2021003330008)supported by the KIST Internal program(2E32851)+1 种基金supported by the Korea Health Technology Research and Development(R&D)Project through the Korea Health Industry Development Institute(KHIDI)and Korea Dementia Research Center(KDRC),funded by the Ministry of Health&Welfare and Ministry of Science and ICT,Republic of Korea(HU20C0164)the Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Education(2022R1A6A3A01087298)。
文摘The various bioacoustics signals obtained with auscultation contain complex clinical information that has been traditionally used as biomarkers,however,they are not extensively used in clinical studies owing to their spatiotemporal limitations.In this study,we developed a wearable stethoscope for wireless,skinattachable,low-power,continuous,real-time auscultation using a lung-sound-monitoring-patch(LSMP).LSMP can monitor respiratory function through a mobile app and classify normal and adventitious breathing by comparing their unique acoustic characteristics.The human heart and breathing sounds from humans can be distinguished from complex sound signals consisting of a mixture of bioacoustic signals and external noise.The performance of the LSMP sensor was further demonstrated in pediatric patients with asthma and elderly chronic obstructive pulmonary disease(COPD)patients where wheezing sounds were classified at specific frequencies.In addition,we developed a novel method for counting wheezing events based on a two-dimensional convolutional neural network deep-learning model constructed de novo and trained with our augmented fundamental lung-sound data set.We implemented a counting algorithm to identify wheezing events in real-time regardless of the respiratory cycle.The artificial intelligence-based adventitious breathing event counter distinguished>80%of the events(especially wheezing)in long-term clinical applications in patients with COPD.
基金support of the Natural Science Foundation of Jiangsu Province,China(BK20240977)the China Scholarship Council(201606850024)+1 种基金the National High Technology Research and Development Program of China(2016YFD0701003)the Postgraduate Research&Practice Innovation Program of Jiangsu Province,China(SJCX23_1488)。
文摘Deep learning-based intelligent recognition algorithms are increasingly recognized for their potential to address the labor-intensive challenge of manual pest detection.However,their deployment on mobile devices has been constrained by high computational demands.Here,we developed GBiDC-PEST,a mobile application that incorporates an improved,lightweight detection algorithm based on the You Only Look Once(YOLO)series singlestage architecture,for real-time detection of four tiny pests(wheat mites,sugarcane aphids,wheat aphids,and rice planthoppers).GBiDC-PEST incorporates several innovative modules,including GhostNet for lightweight feature extraction and architecture optimization by reconstructing the backbone,the bi-directional feature pyramid network(BiFPN)for enhanced multiscale feature fusion,depthwise convolution(DWConv)layers to reduce computational load,and the convolutional block attention module(CBAM)to enable precise feature focus.The newly developed GBiDC-PEST was trained and validated using a multitarget agricultural tiny pest dataset(Tpest-3960)that covered various field environments.GBiDC-PEST(2.8 MB)significantly reduced the model size to only 20%of the original model size,offering a smaller size than the YOLO series(v5-v10),higher detection accuracy than YOLOv10n and v10s,and faster detection speed than v8s,v9c,v10m and v10b.In Android deployment experiments,GBiDCPEST demonstrated enhanced performance in detecting pests against complex backgrounds,and the accuracy for wheat mites and rice planthoppers was improved by 4.5-7.5%compared with the original model.The GBiDC-PEST optimization algorithm and its mobile deployment proposed in this study offer a robust technical framework for the rapid,onsite identification and localization of tiny pests.This advancement provides valuable insights for effective pest monitoring,counting,and control in various agricultural settings.
文摘To address the challenges of low accuracy and insufficient real-time performance in dynamic object detection for UAV surveillance,this paper proposes a novel tracking framework that integrates a lightweight improved YOLOv5s model with adaptive motion compensation.A UAV-view dynamic feature enhancement strategy is innovatively introduced,and a lightweight detection network combining attention mechanisms and multi-scale fusion is constructed.The robustness of tracking under motion blur scenarios is also optimized.Experimental results demonstrate that the proposed method achieves a mAP@0.5 of 68.2%on the VisDrone dataset and reaches an inference speed of 32 FPS on the NVIDIA Jetson TX2 platform.This significantly improves the balance between accuracy and efficiency in complex scenes,offering reliable technical support for real-time applications such as emergency response.
基金funded by the King Salman Center for Disability Research through Research Group No.KSRG-2024-140.
文摘People with visual impairments face substantial navigation difficulties in residential and unfamiliar indoor spaces.Neither canes nor verbal navigation systems possess adequate features to deliver real-time spatial awareness to users.This research work represents a feasibility study for the wearable IoT-based indoor object detection assistant system architecture that employs a real-time indoor object detection approach to help visually impaired users recognize indoor objects.The system architecture includes four main layers:Wearable Internet of Things(IoT),Network,Cloud,and Indoor Object Detection Layers.The wearable hardware prototype is assembled using a Raspberry Pi 4,while the indoor object detection approach exploits YOLOv11.YOLOv11 represents the cutting edge of deep learning models optimized for both speed and accuracy in recognizing objects and powers the research prototype.In this work,we used a prototype implementation,comparative experiments,and two datasets compiled from Furniture Detection(i.e.,from Roboflow Universe)and Kaggle,which comprises 3000 images evenly distributed across three object categories,including bed,sofa,and table.In the evaluation process,the Raspberry Pi is only used for a feasibility demonstration of real-time inference performance(e.g.,latency and memory consumption)on embedded hardware.We also evaluated YOLOv11 by comparing its performance with other current methodologies,which involved a Convolutional Neural Network(CNN)(MobileNet-Single Shot MultiBox Detector(SSD))model together with the RTDETR Vision Transformer.The experimental results show that YOLOv11 stands out by reaching an average of 99.07%,98.51%,97.96%,and 98.22%for the accuracy,precision,recall,and F1-score,respectively.This feasibility study highlights the effectiveness of Raspberry Pi 4 and YOLOv11 in real-time indoor object detection,paving the way for structured user studies with visually impaired people in the future to evaluate their real-world use and impact.
文摘BACKGROUND Early detection of precancerous lesions is of vital importance for reducing the incidence and mortality of upper gastrointestinal(UGI)tract cancer.However,traditional endoscopy has certain limitations in detecting precancerous lesions.In contrast,real-time computer-aided detection(CAD)systems enhanced by artificial intelligence(AI)systems,although they may increase unnecessary medical procedures,can provide immediate feedback during examination,thereby improving the accuracy of lesion detection.This article aims to conduct a meta-analysis of the diagnostic performance of CAD systems in identifying precancerous lesions of UGI tract cancer during esophagogastroduodenoscopy(EGD),evaluate their potential clinical application value,and determine the direction for further research.AIM To investigate the improvement of the efficiency of EGD examination by the realtime AI-enabled real-time CAD system(AI-CAD)system.METHODS PubMed,EMBASE,Web of Science and Cochrane Library databases were searched by two independent reviewers to retrieve literature with per-patient analysis with a deadline up until April 2025.A meta-analysis was performed with R Studio software(R4.5.0).A random-effects model was used and subgroup analysis was carried out to identify possible sources of heterogeneity.RESULTS The initial search identified 802 articles.According to the inclusion criteria,2113 patients from 10 studies were included in this meta-analysis.The pooled accuracy difference,logarithmic difference of diagnostic odds ratios,sensitivity,specificity and the area under the summary receiver operating characteristic curve(area under the curve)of both AI group and endoscopist group for detecting precancerous lesion were 0.16(95%CI:0.12-0.20),-0.19(95%CI:-0.75-0.37),0.89(95%CI:0.85-0.92,AI group),0.67(95%CI:0.63-0.71,endoscopist group),0.89(95%CI:0.84-0.93,AI group),0.77(95%CI:0.70-0.83,endoscopist group),0.928(95%CI:0.841-0.948,AI group),0.722(95%CI:0.677-0.821,endoscopist group),respectively.CONCLUSION The present studies further provide evidence that the AI-CAD is a reliable endoscopic diagnostic tool that can be used to assist endoscopists in detection of precancerous lesions in the UGI tract.It may be introduced on a large scale for clinical application to enhance the accuracy of detecting precancerous lesions in the UGI tract.
基金Science and Technology Plan of Heilongjiang Provincial Health Commission,Study on the Efficacy of High-Throughput Real-Time Mass Spectrometry Detection of Exhaled Breath for Rapid Diagnosis of Pulmonary Tuberculosis(Project No.:20230303110014)。
文摘Objective:To evaluate the clinical efficacy of high-throughput real-time mass spectrometry detection technology for exhaled breath in the rapid diagnosis of pulmonary tuberculosis(PTB),providing a novel technological support for early screening and diagnosis of PTB.Methods:A total of 120 PTB patients admitted to a hospital from January 2023 to June 2024 were selected as the case group,and 150 healthy individuals and patients with non-tuberculous pulmonary diseases during the same period were selected as the control group.Exhaled breath samples were collected from all study subjects,and the types and concentrations of volatile organic compounds(VOCs)in the samples were detected using a high-throughput real-time mass spectrometer.A diagnostic model was constructed using machine learning algorithms,and core indicators such as diagnostic sensitivity,specificity,and area under the curve(AUC)of this technology were analyzed and compared with the efficacy of traditional sputum smear examination,sputum culture,and GeneXpert MTB/RIF detection.Results:The diagnostic sensitivity of the high-throughput real-time mass spectrometry diagnostic model for exhaled breath in diagnosing PTB was 92.5%,the specificity was 94.0%,and the AUC was 0.978,which were significantly higher than those of sputum smear examination(sensitivity 58.3%,specificity 90.0%,AUC 0.741).Compared with GeneXpert technology,its specificity was comparable(94.0%vs 93.3%),and the detection time was shortened to less than 15 minutes.The model achieved an accuracy of 91.3%in distinguishing PTB from other pulmonary diseases and was not affected by demographic factors such as age and gender.Conclusion:High-throughput real-time mass spectrometry detection technology for exhaled breath has the advantages of being non-invasive,rapid,highly sensitive,and highly specific,and holds significant clinical application value in the rapid diagnosis and large-scale screening of PTB,warranting further promotion.
基金funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2025R435),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Psychological distress detection plays a critical role in modern healthcare,especially in ambient environments where continuous monitoring is essential for timely intervention.Advances in sensor technology and artificial intelligence(AI)have enabled the development of systems capable of mental health monitoring using multimodal data.However,existing models often struggle with contextual adaptation and real-time decision-making in dynamic settings.This paper addresses these challenges by proposing TRANS-HEALTH,a hybrid framework that integrates transformer-based inference with Belief-Desire-Intention(BDI)reasoning for real-time psychological distress detection.The framework utilizes a multimodal dataset containing EEG,GSR,heart rate,and activity data to predict distress while adapting to individual contexts.The methodology combines deep learning for robust pattern recognition and symbolic BDI reasoning to enable adaptive decision-making.The novelty of the approach lies in its seamless integration of transformermodelswith BDI reasoning,providing both high accuracy and contextual relevance in real time.Performance metrics such as accuracy,precision,recall,and F1-score are employed to evaluate the system’s performance.The results show that TRANS-HEALTH outperforms existing models,achieving 96.1% accuracy with 4.78 ms latency and significantly reducing false alerts,with an enhanced ability to engage users,making it suitable for deployment in wearable and remote healthcare environments.
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R196), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Cardiovascular diseases (CVDs) remain a leading cause of mortality worldwide, emphasizing the importance of early and accurate prediction. Electrocardiogram (ECG) signals, central to cardiac monitoring, have increasingly been integrated with Deep Learning (DL) for real-time prediction of CVDs. However, DL models are prone to performance degradation due to concept drift and catastrophic forgetting. To address this issue, we propose a real-time CVDs prediction approach, referred to as ADWIN-GFR, that combines Convolutional Neural Network (CNN) layers for spatial feature extraction with Gated Recurrent Units (GRU) for temporal modeling, alongside adaptive drift detection and mitigation mechanisms. The proposed approach integrates Adaptive Windowing (ADWIN) for real-time concept drift detection, a fine-tuning strategy based on Generative Features Replay (GFR) to preserve previously acquired knowledge, and a dynamic replay buffer ensuring variance, diversity, and data distribution coverage. Extensive experiments conducted on the MIT-BIH arrhythmia dataset demonstrate that ADWIN-GFR outperforms standard fine-tuning techniques, achieving an average post-drift accuracy of 95.4%, a macro F1-score of 93.9%, and a remarkably low forgetting score of 0.9%. It also exhibits an average drift detection delay of 12 steps and achieves an adaptation gain of 17.2%. These findings underscore the potential of ADWIN-GFR for deployment in real-world cardiac monitoring systems, including wearable ECG devices and hospital-based patient monitoring platforms.
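A minimal sketch of the CNN + GRU backbone described above is shown below, assuming beats segmented to 180 samples and five arrhythmia classes (a common MIT-BIH setup); the ADWIN drift detector and GFR replay components are not included, and layer sizes are illustrative assumptions:

```python
# Sketch of a CNN + GRU backbone for single-lead ECG beat classification.
import torch
import torch.nn as nn

class CnnGru(nn.Module):
    def __init__(self, n_classes: int = 5):
        super().__init__()
        # 1-D convolutions capture the local (spatial) morphology of each beat
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        # The GRU models temporal dependencies across the downsampled sequence
        self.gru = nn.GRU(input_size=64, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                 # x: (batch, 1, 180)
        feats = self.cnn(x)               # (batch, 64, 45)
        feats = feats.transpose(1, 2)     # (batch, 45, 64) for the GRU
        _, h = self.gru(feats)            # final hidden state: (1, batch, 64)
        return self.head(h.squeeze(0))    # (batch, n_classes) logits

logits = CnnGru()(torch.randn(8, 1, 180))
print(logits.shape)  # torch.Size([8, 5])
```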
Abstract: An effective method is proposed to adaptively detect weld defects in various types of real-time X-ray images obtained under different conditions. After weld extraction and noise reduction, a properly sized median filter template is used to estimate the weld background. After the weld background is subtracted from the original image, an adaptive threshold segmentation algorithm is applied to obtain the binary image; then morphological close and open operations, a labeling algorithm, and a false alarm elimination algorithm are applied to process the binary image and obtain the defect detection result. Finally, a fast implementation of the proposed method is developed. The method is tested on real-time X-ray images obtained from different X-ray imaging systems. Experimental results show that it effectively detects low-contrast weld defects with few false alarms and is adaptive to various types of real-time X-ray imaging systems.
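The background-subtraction pipeline described above maps directly onto standard OpenCV operations. The sketch below is illustrative only; kernel sizes and the area threshold are placeholder choices, not the paper's values:

```python
# Illustrative OpenCV sketch: median-filter background estimate, subtraction,
# Otsu thresholding, morphology, and connected-component false-alarm filtering.
import cv2
import numpy as np

def detect_weld_defects(weld_roi: np.ndarray) -> np.ndarray:
    """weld_roi: 8-bit grayscale image of the extracted weld region."""
    denoised = cv2.GaussianBlur(weld_roi, (3, 3), 0)              # noise reduction
    background = cv2.medianBlur(denoised, 31)                     # large median filter ~= weld background
    residual = cv2.absdiff(denoised, background)                   # defects stand out in the residual
    _, binary = cv2.threshold(residual, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU) # adaptive (Otsu) threshold
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)     # close then open
    binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    # Connected-component labeling; drop tiny blobs as false alarms
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    mask = np.zeros_like(binary)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= 10:
            mask[labels == i] = 255
    return mask
```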
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52305623), the Natural Science Foundation of Hubei Province, China (Grant No. 2022CFB589), and the Natural Science Foundation of Chongqing, China (Grant No. CSTB2023NSCQ-MSX0636).
Abstract: The quality of exposed avionics solder joints has a significant impact on the stable operation of in-orbit spacecraft. Nevertheless, previously reported inspection methods for multi-scale solder joint defects generally suffer from low accuracy and slow detection speed. Herein, a novel real-time detector, VMMAO-YOLO, is demonstrated based on a variable multi-scale concurrency and multi-depth aggregation network (VMMANet) backbone and a "one-stop" global information gather-distribute (OS-GD) module. Combined with infrared thermography, it achieves fast and high-precision detection of both internal and external solder joint defects. Specifically, VMMANet is designed for efficient multi-scale feature extraction and mainly comprises variable multi-scale feature concurrency (VMC) and multi-depth feature aggregation-alignment (MAA) modules. VMC extracts multi-scale features via multiple fixed-size and deformable convolutions, while MAA aggregates and aligns multi-depth features of the same order for feature inference. This allows low-level features with more spatial detail to be transmitted depth-wise, enabling the deeper network to selectively utilize the preceding inference information. VMMANet replaces inefficient high-density deep convolutions by increasing the width of intermediate feature levels, leading to a salient decline in parameters. The OS-GD module is developed for effective feature gathering, aggregation, and distribution, further enhancing the network's ability to gather and deploy global information. On a self-made solder joint image dataset, VMMAO-YOLO achieves a mean average precision (mAP@0.5) of 91.6%, surpassing all mainstream YOLO-series models. Moreover, VMMAO-YOLO has a model size of merely 19.3 MB and a detection speed of up to 119 frames per second, far superior to the prevalent YOLO-series detectors.
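As a rough intuition for the multi-scale concurrency idea, the toy module below runs parallel fixed-size convolutions and concatenates their outputs. The paper's VMC additionally uses deformable convolutions, and its exact layout is not reproduced; everything here is an illustrative assumption:

```python
# Simplified sketch of multi-scale feature concurrency: parallel convolutions with
# different receptive fields, concatenated and fused. Not the paper's exact module.
import torch
import torch.nn as nn

class MultiScaleConcurrency(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        branch_ch = out_ch // 3
        # Three parallel branches with increasing receptive fields
        self.b3 = nn.Conv2d(in_ch, branch_ch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, 5, padding=2)
        self.b7 = nn.Conv2d(in_ch, out_ch - 2 * branch_ch, 7, padding=3)
        self.fuse = nn.Sequential(nn.BatchNorm2d(out_ch), nn.ReLU())

    def forward(self, x):
        y = torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1)
        return self.fuse(y)

print(MultiScaleConcurrency(32, 96)(torch.randn(1, 32, 64, 64)).shape)  # (1, 96, 64, 64)
```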
Funding: Supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX24_4084).
Abstract: To solve the problem of low detection accuracy for complex weld defects, this paper proposes a weld defect detection method based on an improved YOLOv5s. To enhance the ability to focus on key information in feature maps, the scSE attention mechanism is introduced into the backbone network of YOLOv5s. A Fusion-Block module and additional layers are added to the neck network of YOLOv5s to improve feature fusion and meet the needs of complex object detection. To reduce the computational complexity of the model, the C3Ghost module is used to replace the CSP2_1 module in the neck network of YOLOv5s. The scSE-ASFF module is constructed and inserted between the neck network and the prediction end to realize feature fusion across different layers. To address the issue of imbalanced sample quality in the dataset and improve the regression speed and accuracy of the loss function, the CIoU loss function in the YOLOv5s model is replaced with the Focal-EIoU loss function. Finally, experiments are conducted on the collected weld defect dataset to verify the feasibility of the improved YOLOv5s for weld defect detection. The results show that the precision and mAP of the improved YOLOv5s in detecting complex weld defects reach 83.4% and 76.1%, respectively, which are 2.5% and 7.6% higher than those of the original YOLOv5s model. The proposed method can therefore effectively address the problem of low weld defect detection accuracy.
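For reference, a standard scSE (concurrent spatial and channel squeeze-and-excitation) block, as commonly defined in the literature, can be sketched as follows; how it is wired into the paper's modified YOLOv5s backbone is not shown:

```python
# Sketch of a standard scSE block: channel recalibration (cSE) plus spatial
# recalibration (sSE), combined additively.
import torch
import torch.nn as nn

class SCSE(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel squeeze-and-excitation: global pooling -> bottleneck MLP -> gate
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        # Spatial squeeze-and-excitation: 1x1 conv produces a per-pixel gate
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.cse(x) + x * self.sse(x)  # combine both recalibrations

print(SCSE(64)(torch.randn(2, 64, 40, 40)).shape)  # torch.Size([2, 64, 40, 40])
```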
Funding: Supported by the National Natural Science Foundation of China (Nos. 61302159, 61227003, 61301259), the Natural Science Foundation of Shanxi Province (No. 2012021011-2), the Specialized Research Fund for the Doctoral Program of Higher Education, China (No. 20121420110006), the Top Science and Technology Innovation Teams of Higher Learning Institutions of Shanxi Province, China, and the Project Sponsored by Scientific Research for the Returned Overseas Chinese Scholars, Shanxi Province (No. 2013-083).
Abstract: Real-time detection of object size has become a hot topic in the testing field, and image processing is its core algorithm. This paper focuses on the processing and display of collected dynamic images to achieve real-time image processing for moving objects. Firstly, median filtering, gain calibration, image segmentation, image binarization, corner detection, and edge fitting are employed to process the images of the moving objects so that the processed image closely matches the real object. Then, the processed images are displayed in real time, making them easier to analyze, understand, and identify, which reduces the computational complexity. Finally, a human-computer interaction (HCI)-friendly interface based on VC++ is designed to accomplish the digital logic transform, image processing, and real-time display of the objects. Experiments show that the proposed algorithm and software design offer better real-time performance and accuracy, meeting industrial needs.
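The measurement steps above (filtering, binarization, and fitting a box to the object) can be illustrated with OpenCV as follows; the pixel-to-millimetre scale factor is a hypothetical calibration value and the pipeline is a simplified stand-in for the paper's VC++ implementation:

```python
# Illustrative sketch of object-size measurement: median filtering, Otsu
# binarization, contour extraction, and a rotated bounding box as the edge fit.
import cv2
import numpy as np

MM_PER_PIXEL = 0.05  # assumed calibration factor (hypothetical)

def measure_object_size(gray: np.ndarray):
    """gray: 8-bit grayscale frame containing one moving object."""
    smoothed = cv2.medianBlur(gray, 5)                                   # median filtering
    _, binary = cv2.threshold(smoothed, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)       # binarization
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)                         # assume the object is the largest blob
    (_, _), (w, h), _ = cv2.minAreaRect(largest)                         # fitted rotated rectangle
    return w * MM_PER_PIXEL, h * MM_PER_PIXEL                            # size in millimetres
```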
Funding: Supported by the National Natural Science Foundation of China (Nos. 62373215, 62373219 and 62073193), the Natural Science Foundation of Shandong Province (No. ZR2023MF100), the Key Projects of the Ministry of Industry and Information Technology (No. TC220H057-2022), and the Independently Developed Instrument Funds of Shandong University (No. zy20240201).
Abstract: Current you-only-look-once (YOLO)-based models face the challenge of excessive parameters and computational complexity in the printed circuit board (PCB) defect detection scenario. To solve this problem, we propose a new method that combines the lightweight mobile vision transformer (MobileViT) network with the convolutional block attention module (CBAM) mechanism and a new regression loss function. This method requires fewer computational resources, making it more suitable for embedded edge detection devices. Meanwhile, the new loss function improves the positioning accuracy of the bounding box and enhances the robustness of the model. In addition, experiments on public datasets demonstrate that the improved model achieves an average accuracy of 87.9% across six typical defect detection tasks, while reducing computational costs by nearly 90%. It significantly reduces the model's computational requirements while maintaining accuracy, ensuring reliable performance for edge deployment.
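For reference, a standard CBAM block (channel attention followed by spatial attention) can be sketched as below; this follows the commonly published formulation, and its placement inside the paper's MobileViT-based detector is an assumption:

```python
# Sketch of a standard CBAM: channel attention from pooled descriptors through a
# shared MLP, then spatial attention from channel-wise mean and max maps.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        # Shared MLP for channel attention
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        # Convolution over the 2-channel (mean, max) map for spatial attention
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)                          # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))                # spatial attention

print(CBAM(64)(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```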
Funding: Funded by Woosong University Academic Research 2024.
Abstract: This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token-to-Token ViT, ViT without memory, and Parallel ViT. Leveraging a widely used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and understanding. These techniques mitigated overfitting, stabilized training, and improved generalization capabilities. The LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as Token-to-Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token-to-Token ViT, ViT without memory, and Parallel ViT, respectively. The findings highlight LMViT's ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature extraction. The additional transformer-based models also demonstrate improved performance over CNNs in capturing complex features, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial applications. For instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often misclassify. This study not only demonstrates LMViT's potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token-to-Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are key. Future research may focus on enhancing LMViT's computational efficiency for deployment in real-time quality control systems.
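The t-SNE feature inspection mentioned above can be reproduced with scikit-learn as sketched below; the feature matrix here is random placeholder data standing in for embeddings extracted from a trained model, and the class names are hypothetical:

```python
# Minimal sketch of visualizing learned defect features with t-SNE (placeholder data).
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 256))          # e.g. penultimate-layer embeddings
labels = rng.integers(0, 3, size=300)           # e.g. scratch / inclusion / patch (hypothetical)

embedded = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=0).fit_transform(features)

plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, s=8, cmap="tab10")
plt.title("t-SNE of defect features (placeholder data)")
plt.savefig("tsne_defects.png", dpi=150)
```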