Abstract: A sheep's posture is closely related to its health and welfare. With the growing demand for intelligent livestock farming, automatic and accurate detection of sheep posture has become increasingly important. This study proposes RDS-Mask R-CNN, a novel sheep posture detection algorithm built on the Mask R-CNN baseline network. It adopts Res2Net101 as the feature extraction backbone, introduces deformable convolution networks (DCN) to capture the posture features of sheep at different positions more precisely, and applies soft non-maximum suppression (Soft NMS) to segment overlapping instance targets accurately. The results show that: 1) Object detection framework comparison: compared with YOLOv3 and Faster R-CNN, the most classic frameworks in this field, the improved algorithm raises mean average precision (mAP) by 16.68% and 8.64%, respectively. 2) Comparison of improvement strategies: relative to the baseline network, the improved algorithm increases bounding box mean average precision (Bbox mAP) by 6.21% and segmentation mean average precision (Segm mAP) by 6.61%, reaching 87.34% and 81.50%, respectively. 3) Compared with Mask R-CNN, the improved model raises bounding box average precision (Bbox AP) for the standing and lying postures by 6.84% and 5.58%, and segmentation average precision (Segm AP) by 7.25% and 5.17%, respectively. 4) Interpretability visualizations show that RDS-Mask R-CNN accurately captures deep features of the key body regions in standing and lying postures, indicating that automatic detection by the model is both feasible and interpretable. In summary, the proposed RDS-Mask R-CNN algorithm effectively improves the accuracy of sheep posture detection and provides technical support for smart farming.
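The abstract above relies on Soft NMS to keep overlapping sheep instances instead of discarding them. A minimal sketch of the Gaussian-decay variant follows; the function names, box layout ([x1, y1, x2, y2]), and the sigma=0.5 default are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one box against an (N, 4) array of boxes in [x1, y1, x2, y2] form."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: instead of deleting boxes that overlap the current
    top detection (as hard NMS does), decay their scores by exp(-IoU^2 / sigma),
    so a genuinely overlapping second instance can survive with a lower score."""
    scores = scores.astype(float).copy()
    keep = []
    idxs = np.arange(len(scores))
    while len(idxs) > 0:
        top = idxs[np.argmax(scores[idxs])]   # highest-scoring remaining box
        keep.append(top)
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break
        overlaps = iou(boxes[top], boxes[idxs])
        scores[idxs] *= np.exp(-(overlaps ** 2) / sigma)  # soft decay
        idxs = idxs[scores[idxs] > score_thresh]          # prune near-zero scores
    return keep, scores
```

With two heavily overlapping boxes (e.g. two sheep lying against each other), hard NMS would drop the second detection outright; the sketch above merely lowers its score, which is the behavior the abstract credits for accurate segmentation of overlapping instances.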
Abstract: Detecting pavement cracks is critical for road safety and infrastructure management. Traditional methods, relying on manual inspection and basic image processing, are time-consuming and prone to errors. Recent deep-learning (DL) methods automate crack detection, but many still struggle with variable crack patterns and environmental conditions. This study aims to address these limitations by introducing the Masker Transformer, a novel hybrid deep learning model that integrates the precise localization capabilities of Mask Region-based Convolutional Neural Network (Mask R-CNN) with the global contextual awareness of Vision Transformer (ViT). The research focuses on leveraging the strengths of both architectures to enhance segmentation accuracy and adaptability across different pavement conditions. We evaluated the performance of the Masker Transformer against other state-of-the-art models such as U-Net, Transformer U-Net (TransUNet), U-Net Transformer (UNETr), Swin U-Net Transformer (Swin-UNETr), You Only Look Once version 8 (YOLOv8), and Mask R-CNN using two benchmark datasets: Crack500 and DeepCrack. The findings reveal that the Masker Transformer significantly outperforms the existing models, achieving the highest Dice Similarity Coefficient (DSC), precision, recall, and F1-Score across both datasets. Specifically, the model attained a DSC of 80.04% on Crack500 and 91.37% on DeepCrack, demonstrating superior segmentation accuracy and reliability. The high precision and recall rates further substantiate its effectiveness in real-world applications, suggesting that the Masker Transformer can serve as a robust tool for automated pavement crack detection, potentially replacing more traditional methods.
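The headline metric in the abstract above is the Dice Similarity Coefficient (DSC), which measures overlap between a predicted crack mask and the ground-truth mask. A minimal sketch of how DSC is typically computed on binary masks; the function name and the epsilon smoothing term are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice Similarity Coefficient between two binary masks:
    DSC = 2 * |A ∩ B| / (|A| + |B|), in [0, 1], where 1 is a perfect match.
    eps guards against division by zero when both masks are empty."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

For thin, elongated structures like cracks, DSC is preferred over pixel accuracy because a model that predicts "no crack" everywhere can still score high accuracy on a mostly-background image, while its DSC stays near zero.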