Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2504).
Abstract: Reliable human action recognition (HAR) in video sequences is critical for a wide range of applications, such as security surveillance, healthcare monitoring, and human-computer interaction. Several automated systems have been designed for this purpose; however, existing methods, such as two-stream networks and 3D convolutional neural networks (CNNs), often struggle to integrate spatial and temporal information from input samples effectively, which limits their accuracy in discriminating among numerous human actions. Therefore, this study introduces a novel deep learning framework called ARNet, designed for robust HAR. ARNet consists of two main modules: a refined InceptionResNet-V2-based CNN and a bidirectional long short-term memory (Bi-LSTM) network. The refined InceptionResNet-V2 employs the parametric rectified linear unit (PReLU) activation within its convolutional layers to enhance spatial feature extraction from individual video frames. PReLU improves the spatial information-capturing ability of the approach because it uses learnable parameters to adaptively control the slope of the negative part of the activation function, allowing richer gradient flow during backpropagation and resulting in robust information capture and stable model training. These spatial features, which hold essential pixel characteristics, are then processed by the Bi-LSTM module for temporal analysis, which helps ARNet model the dynamic behavior of actions over time. ARNet adds three dense layers after the Bi-LSTM module to ensure a comprehensive computation of both spatial and temporal patterns and to further strengthen the feature representation. The model is validated experimentally on three benchmark datasets, HMDB51, KTH, and UCF Sports, and reports accuracies of 93.82%, 99%, and 99.16%, respectively. The precision values on HMDB51, KTH, and UCF Sports are 97.41%, 99.54%, and 99.01%; the recall values are 98.87%, 98.60%, and 99.08%; and the F1-scores are 98.13%, 99.07%, and 99.04%, respectively. These results highlight the robustness of the ARNet approach and its potential as a versatile tool for accurate HAR across various real-world applications.
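The PReLU behavior described in the abstract can be illustrated with a minimal scalar sketch. This is not the ARNet implementation: the function names and the initial slope value of 0.25 are assumptions chosen for illustration, and a real network would apply the same rule element-wise to tensors with one learnable slope per channel or per layer.

```python
def prelu(x: float, a: float) -> float:
    """PReLU forward pass: identity for positive inputs, slope `a` otherwise."""
    return x if x > 0 else a * x


def prelu_grads(x: float, a: float) -> tuple[float, float]:
    """Return (d_out/d_x, d_out/d_a) for a single input value."""
    if x > 0:
        return 1.0, 0.0   # positive side: plain identity, slope is unused
    return a, x           # negative side: gradient flows through slope `a`


if __name__ == "__main__":
    a = 0.25  # hypothetical initial slope, not taken from the paper
    for x in (-2.0, 0.5):
        print(x, prelu(x, a), prelu_grads(x, a))
```

Because the gradient with respect to the input is `a` rather than zero on the negative side, negative pre-activations still contribute to backpropagation, and the gradient with respect to `a` lets training adjust the slope itself.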
Abstract: Diabetes is a metabolic disorder that can lead to a retinal complication called diabetic retinopathy (DR), which is one of the four leading causes of blindness worldwide. DR usually shows no clear symptoms before its onset, which makes identifying the disease a challenging task. The healthcare industry may face unfavorable consequences if the gap in identifying DR is not filled with effective automation. Thus, our objective is to develop an automatic and cost-effective method for classifying DR samples. In this work, we present a custom Faster-RCNN technique for the recognition and classification of DR lesions from retinal images. After pre-processing, we generate the dataset annotations required for model training. We then introduce DenseNet-65 at the feature extraction level of Faster-RCNN to compute a representative set of key points. Finally, the Faster-RCNN localizes the lesions and classifies the input sample into five classes. Rigorous experiments performed on a Kaggle dataset comprising 88,704 images show that the introduced methodology achieves an accuracy of 97.2%. We compared our technique with state-of-the-art approaches to show its robustness in terms of DR localization and classification. Additionally, we performed cross-dataset validation on the Kaggle and APTOS datasets and achieved remarkable results in both the training and testing phases.
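Localization quality for a box-based detector such as Faster-RCNN is commonly measured with intersection over union (IoU) between a predicted box and an annotated box. The abstract does not say which localization measure the authors used, so the sketch below is an assumption for illustration only; the `(x1, y1, x2, y2)` corner format and the function name are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)   # toy coordinates, not real lesion boxes
    annotated = (1, 1, 3, 3)
    print(iou(predicted, annotated))
```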
Abstract: COVID-19 has become a pandemic, with cases all over the world and widespread disruption in some countries, such as Italy, the US, India, South Korea, and Japan. Early and reliable detection of COVID-19 is mandatory to control the spread of infection. Moreover, predicting the spread of COVID-19 in the near future is crucial for better planning of disease control. For this purpose, we propose a robust framework for the analysis, prediction, and detection of COVID-19. We make reliable estimates of key pandemic parameters and predict the point of inflection and the possible washout time for various countries around the world. The estimates, analysis, and predictions are based on data gathered from the Johns Hopkins Center during the period from April 21 to June 27, 2020. We use the normal distribution for simple and quick predictions of the coronavirus pandemic model and estimate the parameters of the Gaussian curves using least-squares curve fitting for several countries on different continents. The predictions rely on the possible outcomes of Gaussian time evolution, with the central limit theorem of statistics allowing the predictions to be well justified. The parameters of the Gaussian distribution, i.e., maximum time and width, are determined through a statistical χ²-fit for the purpose of doubling times after April 21, 2020. For COVID-19 detection, we propose a novel method based on the Histogram of Oriented Gradients (HOG) and a CNN in a multi-class classification scenario, i.e., normal, COVID-19, viral pneumonia, etc. Experimental results show the effectiveness of our framework for reliable prediction and detection of COVID-19.
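The Gaussian curve fitting mentioned above can be sketched as follows. This is a simplified stand-in, not the authors' procedure: it assumes the daily counts follow A·exp(−(t−μ)²/(2σ²)) and fits a quadratic to the logarithm of the counts by ordinary least squares, whereas the abstract describes a χ²-fit. The function names and the synthetic data are hypothetical.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gaussian(days, counts):
    """Fit counts ~ A*exp(-(t-mu)^2 / (2*sigma^2)); return (A, mu, sigma)."""
    pts = [(t, math.log(y)) for t, y in zip(days, counts) if y > 0]
    t0 = sum(t for t, _ in pts) / len(pts)      # centre time for conditioning
    pts = [(t - t0, z) for t, z in pts]
    # Normal equations for log(count) = c0 + c1*t + c2*t^2.
    s = [sum(t ** k for t, _ in pts) for k in range(5)]
    r = [sum(z * t ** k for t, z in pts) for k in range(3)]
    a = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    c0, c1, c2 = solve3(a, r)
    if c2 >= 0:
        raise ValueError("data are not peaked; a Gaussian fit is not meaningful")
    sigma = math.sqrt(-1.0 / (2.0 * c2))
    mu = -c1 / (2.0 * c2) + t0
    amp = math.exp(c0 - c1 * c1 / (4.0 * c2))
    return amp, mu, sigma


if __name__ == "__main__":
    # Synthetic, noise-free curve used only to exercise the code.
    days = list(range(60))
    counts = [1000.0 * math.exp(-((t - 30) ** 2) / (2 * 8.0 ** 2)) for t in days]
    print(fit_gaussian(days, counts))
```

For a Gaussian curve, the inflection points lie at μ ± σ, so the fitted `mu` and `sigma` give the time of the maximum and a width from which an inflection point can be read off. Fitting in log space weights small counts heavily, so a weighted or nonlinear fit may be preferable on noisy real data.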
Funding: This work was supported and funded by the Directorate ASR&TD of UET-Taxila (UET/ASR&TD/RG-1002).
Abstract: 1 Introduction. Brain tumor is a lethal disease that affects millions of people around the globe and has a high mortality rate. Early identification and segmentation of brain tumors helps to increase the patient's chances of survival and also spares the patient complex surgical procedures. Moreover, precise segmentation of brain tumors helps the surgeon plan better clinical treatment and cure.
Funding: The German Federal Ministry of Education and Research (BMBF) within the H2Giga project DERIEL (grant number 03HY122C).
Abstract: Pre-treatment of proton exchange membrane (PEM) water electrolyzers is a crucial procedure performed prior to their regular operation. These procedures help in catalyst activation and membrane saturation, thereby ensuring optimal performance. In this study, we use machine learning to investigate the impact of three distinct activation procedures on cell performance and stability. The data set necessary to develop the surrogate models was obtained from a lab-scale PEM electrolyzer cell. After evaluating the performance of the three tested models and validating them with experimental data, extreme gradient boosting is selected to perform the parametric analysis. The modeling predictions reveal that the activation procedures mainly impact the ohmic resistance at the beginning of the cell life. These observations were further corroborated through a sensitivity analysis performed with an explainable artificial intelligence technique. Furthermore, a data-driven time-series forecasting analysis to predict cell stability for different activation procedures showed good agreement between experimental data and model predictions.
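The idea behind gradient boosting as a surrogate model can be sketched with a minimal single-feature regressor built from decision stumps. This is a plain gradient-boosting illustration under squared-error loss, not XGBoost itself, which additionally uses second-order gradient information and regularization. The function names, hyperparameters, and toy data are all hypothetical and unrelated to the electrolyzer measurements.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (squared error)."""
    best = None
    for thr in sorted(set(xs))[:-1]:            # every split leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]


def fit_boosting(xs, ys, rounds=30, lr=0.3):
    """Fit an additive model of stumps, each trained on the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + lr * (lm if x <= thr else rm) for x, p in zip(xs, preds)]
    return base, stumps, lr


def predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lm if x <= thr else rm) for thr, lm, rm in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    xs = [0.1 * i for i in range(20)]           # synthetic inputs
    ys = [x * x for x in xs]                    # synthetic targets
    model = fit_boosting(xs, ys)
    print(mse(model, xs, ys))
```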
Funding: Project supported by the Directorate of Advanced Studies, Research & Technological Development, University of Engineering and Technology Taxila (No. UET/ASRTD/RG-1002-3).
Abstract: Automated sports video summarization is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective video summarization framework based on shot classification and replay detection for field sports videos. Accurate shot classification is mandatory to better structure the input video for further processing, i.e., key event or replay detection. Therefore, we present a lightweight convolutional neural network-based method for shot classification. We then analyze each shot for replay detection and specifically detect the successive batches of logo transition frames that identify the replay segments in sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train an extreme learning machine to classify frames as replay or non-replay. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.
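Once each frame has been classified, finding the "successive batch" of logo transition frames amounts to grouping consecutive positive frames into runs. The sketch below assumes a per-frame boolean output from some classifier and a minimum run length to suppress isolated false positives; both the function name and the `min_len` parameter are hypothetical and not taken from the paper.

```python
def find_runs(flags, min_len=1):
    """Return (start, end) index pairs, end exclusive, for runs of truthy flags.

    Runs shorter than `min_len` frames are discarded.
    """
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                              # a new run begins
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None                           # the run has ended
    # Close a run that extends to the last frame.
    if start is not None and len(flags) - start >= min_len:
        runs.append((start, len(flags)))
    return runs


if __name__ == "__main__":
    # Toy per-frame labels: 1 = classified as a logo transition frame.
    frame_flags = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
    print(find_runs(frame_flags, min_len=2))
```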
Funding: This work was supported and funded by the Directorate ASR&TD of UET-Taxila (UET/ASR&TD/RG-1002).
Abstract: 1 Introduction. Advertisement detection and replacement with different ads based on user preferences is employed during sports rebroadcasts and offers more value to both the distributor and the viewer. Manual advertisement detection is a laborious activity, which creates an urgent need for automated advertisement detection techniques that save time, storage space, and transmission bandwidth.
Funding: This work was supported and funded by the Directorate ASR&TD of UET-Taxila.
Abstract: Detection and segmentation of defocus blur is a challenging task in digital imaging applications because blurry images comprise both blurred and sharp regions that carry significant information and require effective methods for information extraction. Existing defocus blur detection and segmentation methods have several limitations, i.e., difficulty discriminating sharp smooth regions from blurred smooth regions, a low recognition rate in noisy images, and a high computational cost when no prior knowledge of the images, i.e., blur degree and camera configuration, is available. Hence, there is a pressing need for an effective method of defocus blur detection and segmentation that is robust to the above-mentioned limitations. This paper presents a novel feature descriptor, local directional mean patterns (LDMP), for defocus blur detection and employs KNN matting over the detected LDMP-Trimap for robust segmentation of sharp and blurred regions. We hypothesize that most of the image fields located in blurry regions have significantly less specific local patterns than those in sharp regions; therefore, the proposed LDMP feature descriptor should reliably detect defocus-blurred regions. The fusion of LDMP features with KNN matting provides superior performance in terms of obtaining high-quality segmented regions in the image. Additionally, the proposed LDMP feature descriptor is robust to noise and successfully detects defocus blur in images with high-density noise. Experimental results on the Shi and Zhao datasets demonstrate the effectiveness of the proposed method in terms of defocus blur detection. Evaluation and comparative analysis indicate that our method achieves superior segmentation performance at a low computational cost of 15 seconds.