Funding: Funded by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney; supported by the Research Funding Program, King Saud University, Riyadh, Saudi Arabia, under the Ongoing Research Funding program (ORF-2025-14).
Abstract: Diabetic Retinopathy (DR) is a critical disorder that affects the retina, driven by the constant rise in diabetes, and remains a major cause of blindness across the world. Early detection and timely treatment are essential to mitigate the effects of DR, such as retinal damage and vision impairment. Several conventional approaches have been proposed to detect DR early and accurately, but they are limited by data imbalance, poor interpretability, overfitting, long convergence time, and other issues. To address these drawbacks and improve DR detection accuracy, a distributed Explainable Convolutional Neural network-enabled Light Gradient Boosting Machine (DE-ExLNN) is proposed in this research. The model combines an explainable Convolutional Neural Network (CNN) with a Light Gradient Boosting Machine (LightGBM), achieving highly accurate outcomes in DR detection. LightGBM serves as the detection model, and the explainable CNN addresses issues that conventional CNN classifiers could not resolve. A custom dataset was created for this research, containing both fundus and OCTA images collected from a real-time environment, which yields more accurate results than standard conventional DR datasets. The custom dataset demonstrates notable accuracy, sensitivity, specificity, and Matthews Correlation Coefficient (MCC) scores, underscoring the effectiveness of this approach. Evaluations on other standard datasets achieved an accuracy of 93.94%, sensitivity of 93.90%, specificity of 93.99%, and MCC of 93.88% for fundus images. For OCTA images, the results were an accuracy of 95.30%, sensitivity of 95.50%, specificity of 95.09%, and MCC of 95%. The results show that the combination of explainable CNN and LightGBM outperforms other methods. The inclusion of distributed learning further improves the model's efficiency by reducing time consumption and complexity while facilitating feature extraction.
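The CNN-plus-LightGBM pipeline described above can be approximated with off-the-shelf components. The sketch below is only illustrative: the paper's explainable CNN, distributed training setup, and custom fundus/OCTA dataset are not publicly specified, so a frozen pretrained ResNet-18 backbone and synthetic batches stand in for them, with LightGBM acting as the final detector on the extracted deep features.

```python
# Minimal sketch of a CNN-feature -> LightGBM detection pipeline.
# The backbone, data, and label scheme are assumptions, not the paper's model.
import numpy as np
import torch
import torchvision.models as models
from lightgbm import LGBMClassifier
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Hypothetical stand-in for the explainable CNN: a frozen ResNet-18 whose
# final pooled activations serve as image features for the booster.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()          # expose the 512-d pooled features
backbone.eval()

def extract_features(images: torch.Tensor) -> np.ndarray:
    """Run a batch of (N, 3, 224, 224) images through the frozen CNN."""
    with torch.no_grad():
        return backbone(images).numpy()

# Synthetic fundus/OCTA-like batches; replace with a real DataLoader.
train_imgs, train_labels = torch.rand(32, 3, 224, 224), np.random.randint(0, 2, 32)
test_imgs, test_labels = torch.rand(8, 3, 224, 224), np.random.randint(0, 2, 8)

X_train, X_test = extract_features(train_imgs), extract_features(test_imgs)

# LightGBM acts as the final DR detector on top of the CNN features.
clf = LGBMClassifier(n_estimators=200, learning_rate=0.05)
clf.fit(X_train, train_labels)

pred = clf.predict(X_test)
print("accuracy:", accuracy_score(test_labels, pred))
print("MCC:", matthews_corrcoef(test_labels, pred))
```

In practice the booster would be trained on features from the real dataset, and the reported metrics (accuracy, sensitivity, specificity, MCC) computed on a held-out split.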
Abstract: Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying from visual information, primarily lip movements. In this study, we created a custom dataset for Indian English and organized the work into three main stages: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using mel-frequency cepstral coefficients, and classification was performed using a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was then classified using a long short-term memory (LSTM) recurrent neural network. Finally, integration was performed using a deep convolutional network. Audio speech in Indian English was recognized with training and testing accuracies of 93.67% and 91.53%, respectively, after 200 epochs. For visual speech recognition on the Indian English dataset, the training accuracy was 77.48% and the test accuracy was 76.19% after 60 epochs. After integration, the training and testing accuracies of audiovisual speech recognition on the Indian English dataset were 94.67% and 91.75%, respectively.
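As a rough illustration of the audio branch (MFCC features followed by a one-dimensional CNN), the sketch below uses librosa and PyTorch. The layer sizes, number of word classes, and the synthetic waveform are assumptions, since the paper's exact architecture is not given; the Dlib/LSTM visual branch and the fusion network are omitted.

```python
# Illustrative sketch of the audio branch: MFCC features fed to a small 1-D CNN.
# Architecture details and data are placeholders, not the paper's configuration.
import numpy as np
import librosa
import torch
import torch.nn as nn

def mfcc_features(waveform: np.ndarray, sr: int = 16000, n_mfcc: int = 13) -> torch.Tensor:
    """Return MFCCs shaped (1, n_mfcc, frames) for a single utterance."""
    mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=n_mfcc)
    return torch.from_numpy(mfcc).float().unsqueeze(0)

class Audio1DCNN(nn.Module):
    """Treats the MFCC coefficients as input channels and convolves over time."""
    def __init__(self, n_mfcc: int = 13, n_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mfcc, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# One second of synthetic audio stands in for a recorded Indian English utterance.
wave = np.random.randn(16000).astype(np.float32)
logits = Audio1DCNN()(mfcc_features(wave))
print(logits.shape)   # -> torch.Size([1, 10])
```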
Abstract: Traffic signs play a crucial role in managing traffic on the road and disciplining drivers, thereby preventing injury, property damage, and fatalities. Traffic sign management with automatic detection and recognition is an integral part of any Intelligent Transportation System (ITS), and in this era of self-driving vehicles the need for automatic detection and recognition of traffic signs cannot be overstated. This paper presents a deep-learning-based autonomous scheme for the recognition of traffic signs in India. Automatic traffic sign detection and recognition is performed through end-to-end learning with a Convolutional Neural Network (CNN)-based Refined Mask R-CNN (RM R-CNN). The proposed scheme was evaluated on a new dataset comprising 6480 images with 7056 instances of Indian traffic signs grouped into 87 categories. We present several refinements to the Mask R-CNN model in both architecture and data augmentation, and we consider highly challenging Indian traffic sign categories not reported in previous works. The dataset for training and testing the proposed model was obtained by capturing images in real time on Indian roads. The evaluation results indicate an error rate below 3%. Furthermore, RM R-CNN's performance was compared with conventional deep neural network architectures such as Fast R-CNN and Mask R-CNN. Our proposed model achieved a precision of 97.08%, which is higher than the precision obtained by the Mask R-CNN and Faster R-CNN models.
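For orientation, the sketch below shows how a standard torchvision Mask R-CNN can be re-headed for 87 traffic-sign categories. It reflects the common fine-tuning recipe for instance segmentation, not the paper's RM R-CNN refinements or data augmentation, which are not reproduced here.

```python
# Minimal sketch: adapt a stock Mask R-CNN to 87 traffic-sign classes.
# This is a generic baseline, not the refined RM R-CNN from the paper.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 87 + 1   # 87 sign categories plus background

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads so they predict the traffic-sign classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, NUM_CLASSES)

# Inference on a dummy road image; real use would fine-tune on the custom dataset first.
model.eval()
with torch.no_grad():
    detections = model([torch.rand(3, 480, 640)])
print(detections[0].keys())   # boxes, labels, scores, masks
```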