Manual inspection of bearing casting defects is impractical and unreliable, particularly for micro-level anomalies that can grow into major defects at scale. To address these challenges, we propose BearFusionNet, a multi-stream, attention-based deep learning architecture that combines DenseNet201 and MobileNetV2 for feature extraction with a VGG19-inspired classification head. This hybrid design extracts rich representations at multiple scales, supported by a preprocessing pipeline that enhances defect saliency through contrast adjustment, denoising, and edge detection. Multi-head self-attention strengthens feature fusion, enabling the model to capture both coarse and fine spatial features. BearFusionNet achieves an accuracy of 99.66% and a Cohen's kappa score of 0.9929 on Kaggle's Real-life Industrial Casting Defects dataset. McNemar's and Wilcoxon signed-rank statistical tests, together with fivefold cross-validation, are employed to assess the robustness of the proposed model. For interpretability, we adopt Grad-CAM visualizations, the current standard for visual explanation. Furthermore, we deploy BearFusionNet as a web-based system for near real-time inference (5-6 s per prediction), enabling fast yet accurate detection with visual explanations. Overall, BearFusionNet is an interpretable, accurate, and deployable solution for automatic casting defect detection, leading to significant advances in smart industrial environments.
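The abstract names contrast adjustment, denoising, and edge detection as the preprocessing steps but does not specify the operators. The following is a minimal sketch of such a saliency-enhancing pipeline in OpenCV; CLAHE, non-local-means denoising, and Canny are plausible stand-ins chosen here, not the paper's confirmed choices.

```python
import cv2
import numpy as np

def preprocess_casting_image(path: str) -> np.ndarray:
    """Hypothetical pipeline: contrast -> denoise -> edges.

    CLAHE, fastNlMeansDenoising, and Canny are assumptions; the paper
    does not state which operators BearFusionNet actually uses.
    """
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

    # Contrast adjustment: adaptive histogram equalization.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    contrasted = clahe.apply(gray)

    # Denoising: non-local means suppresses sensor noise, keeps edges.
    denoised = cv2.fastNlMeansDenoising(contrasted, h=10)

    # Edge detection: Canny highlights candidate defect boundaries.
    edges = cv2.Canny(denoised, threshold1=50, threshold2=150)

    # Stack the enhanced image, denoised image, and edge map as a
    # three-channel input so the network sees all three views at once.
    return np.dstack([contrasted, denoised, edges])
```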
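To make the multi-stream design concrete, here is a rough Keras sketch of a DenseNet201/MobileNetV2 dual-backbone with multi-head self-attention fusion and a VGG19-style dense head. The projection width, number of attention heads, and head sizes are illustrative assumptions, not the published configuration.

```python
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet201, MobileNetV2

def build_bearfusionnet_sketch(input_shape=(224, 224, 3), num_classes=2):
    """Illustrative dual-stream model; 512-d projection, 4 attention
    heads, and 4096-unit dense layers are assumptions for this sketch."""
    inputs = layers.Input(shape=input_shape)

    # Two frozen ImageNet backbones act as parallel feature streams.
    dense_stream = DenseNet201(include_top=False, weights="imagenet")
    mobile_stream = MobileNetV2(include_top=False, weights="imagenet")
    dense_stream.trainable = False
    mobile_stream.trainable = False

    f1 = layers.GlobalAveragePooling2D()(dense_stream(inputs))   # (B, 1920)
    f2 = layers.GlobalAveragePooling2D()(mobile_stream(inputs))  # (B, 1280)

    # Project both streams to a common width and treat them as a
    # two-token sequence so multi-head self-attention can fuse them.
    t1 = layers.Reshape((1, 512))(layers.Dense(512)(f1))
    t2 = layers.Reshape((1, 512))(layers.Dense(512)(f2))
    tokens = layers.Concatenate(axis=1)([t1, t2])                # (B, 2, 512)
    fused = layers.MultiHeadAttention(num_heads=4, key_dim=64)(tokens, tokens)
    fused = layers.Flatten()(fused)

    # VGG19-inspired classification head: stacked dense + dropout.
    x = layers.Dense(4096, activation="relu")(fused)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(4096, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)
```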
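The Grad-CAM visualizations the abstract mentions follow a standard recipe: weight the last convolutional layer's feature maps by the mean gradient of the target class score. A minimal generic implementation is sketched below; it assumes a Keras model whose last convolutional layer is reachable by name, which depends on how the backbones are nested in practice.

```python
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index):
    """Minimal Grad-CAM: gradient-weighted feature maps, ReLU'd and
    normalized to [0, 1]. Upsample the result to overlay on the input."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        score = preds[:, class_index]
    grads = tape.gradient(score, conv_out)        # (1, H, W, C)
    weights = tf.reduce_mean(grads, axis=(1, 2))  # (1, C): per-channel weight
    cam = tf.nn.relu(
        tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)
    )
    return (cam[0] / (tf.reduce_max(cam) + 1e-8)).numpy()
```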
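The robustness checks the abstract lists can be reproduced in outline with SciPy and statsmodels. The prediction arrays and per-fold accuracies below are placeholders, not the paper's data.

```python
import numpy as np
from scipy.stats import wilcoxon
from statsmodels.stats.contingency_tables import mcnemar

# Placeholder labels and predictions; real values would come from the
# held-out test set for the proposed model and a baseline.
y_true = np.random.randint(0, 2, size=200)
pred_a = np.random.randint(0, 2, size=200)  # e.g., BearFusionNet
pred_b = np.random.randint(0, 2, size=200)  # e.g., a baseline model

# McNemar's test compares the two classifiers' disagreements on the
# same test set via a 2x2 table of correct/incorrect outcomes.
a_ok, b_ok = pred_a == y_true, pred_b == y_true
table = [[np.sum(a_ok & b_ok), np.sum(a_ok & ~b_ok)],
         [np.sum(~a_ok & b_ok), np.sum(~a_ok & ~b_ok)]]
print("McNemar p-value:", mcnemar(table, exact=True).pvalue)

# Wilcoxon signed-rank test compares paired per-fold scores, e.g., from
# the fivefold cross-validation the abstract describes (values invented).
acc_a = [0.995, 0.997, 0.996, 0.998, 0.994]
acc_b = [0.981, 0.979, 0.985, 0.982, 0.980]
print("Wilcoxon p-value:", wilcoxon(acc_a, acc_b).pvalue)
```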
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, emphasizing the need for a reliable, automated inspection system. Leveraging both object detection and image segmentation, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset of 3300 annotated RGB-D images was generated. Candidate DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall balance, inference latency (target ≥ 30 FPS), and computational burden, favoring YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models because of the latter's higher latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced excellence with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, with an average inference time of 25 ms per frame (≈40 FPS), confirming real-time performance. Leveraging these results, a YOLOv11-based Windows application was deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. Beyond detection and segmentation, the application precisely measures socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line, improving real-time inspection capability. The implementation is a significant step toward intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
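The RGB-D capture setup described above can be sketched with the pyrealsense2 SDK. Stream resolutions and frame rate below are illustrative defaults, not the paper's recorded settings, and only one of the two cameras is shown.

```python
import numpy as np
import pyrealsense2 as rs

# Assumed single-camera capture; the study used two D455f cameras
# in a top-down configuration with unspecified stream settings.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
pipeline.start(config)

try:
    # Align depth to the color frame so RGB and depth pixels correspond.
    align = rs.align(rs.stream.color)
    frames = align.process(pipeline.wait_for_frames())
    color = np.asanyarray(frames.get_color_frame().get_data())
    depth = np.asanyarray(frames.get_depth_frame().get_data())
finally:
    pipeline.stop()
```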
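Running YOLOv11 detection and instance segmentation on a captured frame takes only a few lines with the Ultralytics API. The checkpoint name below is the stock Ultralytics segmentation model; the deployed application would load weights fine-tuned on the annotated toolkit dataset, and the confidence threshold is an assumption.

```python
from ultralytics import YOLO

# Stock segmentation checkpoint as a stand-in for the fine-tuned weights.
model = YOLO("yolo11n-seg.pt")

# Detect and segment tools in one frame (hypothetical file name).
results = model.predict("toolkit_frame.jpg", conf=0.5)
for r in results:
    for box, cls in zip(r.boxes.xyxy, r.boxes.cls):
        print(model.names[int(cls)], box.tolist())  # class + xyxy box
    # Per-instance binary masks as (N, H, W) tensors, if any were found.
    masks = r.masks.data if r.masks is not None else None
```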
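The socket-dimension measurement step can be illustrated as contour extraction on a binary segmentation mask followed by a rotated bounding rectangle; this is a minimal sketch of the general technique, and the calibration factor mm_per_px (e.g., from camera intrinsics or a reference target) is assumed rather than derived here.

```python
import cv2
import numpy as np

def socket_dimensions_mm(mask: np.ndarray, mm_per_px: float):
    """Estimate a socket's width/height from a binary segmentation mask.

    mm_per_px is an assumed pixel-to-millimeter calibration factor; its
    derivation (intrinsics, depth, or a reference object) is not shown.
    """
    mask_u8 = (mask > 0).astype(np.uint8) * 255

    # Contour extraction on the mask isolates the socket outline.
    contours, _ = cv2.findContours(mask_u8, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)

    # A rotated bounding rectangle gives orientation-independent extents.
    (_, _), (w_px, h_px), _ = cv2.minAreaRect(largest)
    return w_px * mm_per_px, h_px * mm_per_px
```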
Funding: funded by Multimedia University, Cyberjaya, Selangor, Malaysia (Grant Number: PostDoc(MMUI/240029)).