Classification through deep learning aims to build a model that separates a dataset of closely related images into distinct classes on the basis of small but continuous variations that occur in physical systems over time and have substantial effects. This study identifies ozone depletion through classification using the Faster Region-Based Convolutional Neural Network (F-RCNN). The main advantage of F-RCNN is that it places bounding boxes on images to differentiate depleted from non-depleted regions. The primary goal of image classification here is to accurately predict the target class of each minutely varied case in the dataset based on ozone saturation. Permanent changes in climate are of serious concern. The leading causes behind these destructive variations are ozone layer depletion, greenhouse gas release, deforestation, pollution, water resource contamination, and UV radiation. This research focuses on prediction by identifying ozone layer depletion because it causes many problems, e.g., skin cancer, damage to marine life, crop damage, and impacts on the immune systems of living beings. We classify the ozone image dataset into two major classes, depleted and non-depleted regions, and extract the required features through F-RCNN. In the existing literature, a CNN is used for feature extraction, and the extracted regions of interest (RoIs) are passed on to the CNN for grouping. It is difficult to manage and differentiate those RoIs after grouping, which negatively affects the results. The classification outcomes of the F-RCNN approach show that overall accuracy lies between 91% and 93% in identifying climate variation through ozone concentration classification, that is, in deciding whether the region in the image under consideration is depleted or non-depleted. Our proposed model achieved 93% accuracy and outperforms the prevailing techniques.
This paper proposes a solution for the localization and classification of rice grains in an image. All existing related works rely on conventional machine learning approaches. However, those techniques do not perform well on the problem addressed in this paper because of the high similarity between different types of rice grains. The proposed solution is therefore based on deep learning. It contains pre-processing steps of data annotation using the watershed algorithm, auto-alignment using the major axis orientation, and image enhancement using the contrast-limited adaptive histogram equalization (CLAHE) technique. Then, a mask region-based convolutional neural network (Mask R-CNN) is trained to localize and classify rice grains in an input image. Performance is enhanced by using transfer learning and dropout regularization to prevent overfitting. The proposed method is validated in many experimental scenarios, with results reported as mean average precision (mAP) and a confusion matrix. It achieves above 80% mAP for the main scenarios in the experiments. It is also shown to perform outstandingly when compared with human experts.
An otoscope is traditionally used to examine the eardrum and ear canal. A diagnosis of otitis media (OM) relies on the experience of clinicians. If an examiner lacks experience, the examination may be difficult and time-consuming. This paper presents an ear disease classification method using middle ear images based on a convolutional neural network (CNN). Specifically, segmentation and classification networks are used to classify an otoscopic image into six classes: normal, acute otitis media (AOM), otitis media with effusion (OME), chronic otitis media (COM), congenital cholesteatoma (CC), and traumatic perforations (TMPs). Mask R-CNN is used as the segmentation network to extract the region of interest (ROI) from otoscopic images. The extracted ROIs are used as guiding features for the classification. The classification is based on transfer learning with an ensemble of two CNN classifiers: EfficientNetB0 and Inception-V3. The proposed model was trained with 5-fold cross-validation. In evaluation, the proposed method achieved a classification accuracy of 97.29%.
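One common way to combine two classifiers is soft voting, in which the per-class probabilities of both networks are averaged before the top class is chosen. The sketch below illustrates that idea only; the abstract does not state how the EfficientNetB0 and Inception-V3 outputs are combined, so the averaging rule, the function names, and the probability values are all assumptions made for illustration.

```python
# Soft-voting ensemble sketch: average the class probabilities of two classifiers.
# The averaging rule and the probability values are illustrative assumptions;
# the abstract does not say how the two CNN outputs are combined.

CLASSES = ["normal", "AOM", "OME", "COM", "CC", "TMP"]


def soft_vote(probs_a, probs_b, weight_a=0.5):
    """Return the weighted average of two probability vectors."""
    if len(probs_a) != len(probs_b):
        raise ValueError("probability vectors must have the same length")
    weight_b = 1.0 - weight_a
    return [weight_a * pa + weight_b * pb for pa, pb in zip(probs_a, probs_b)]


def predict(probs_a, probs_b):
    """Pick the class with the highest averaged probability."""
    averaged = soft_vote(probs_a, probs_b)
    best_index = max(range(len(averaged)), key=averaged.__getitem__)
    return CLASSES[best_index]


# Hypothetical softmax outputs of the two networks for one otoscopic image.
efficientnet_probs = [0.10, 0.50, 0.30, 0.05, 0.03, 0.02]
inception_probs = [0.05, 0.35, 0.50, 0.05, 0.03, 0.02]
print(predict(efficientnet_probs, inception_probs))
```

Here the two networks disagree on the top class, and the averaged probabilities decide between them; a weighted vote (`weight_a` other than 0.5) would favor the stronger model.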
Alzheimer's disease (AD) is a neurological disorder that predominantly affects the brain. In the coming years, it is expected to spread rapidly, with limited progress in diagnostic techniques. Various machine learning (ML) and artificial intelligence (AI) algorithms have been employed to detect AD using single-modality data. However, recent developments in ML have enabled the application of these methods to multiple data sources and input modalities for AD prediction. In this study, we developed a framework that utilizes multimodal data (tabular data, magnetic resonance imaging (MRI) images, and genetic information) to classify AD. As part of the pre-processing phase, we generated a knowledge graph from the tabular data and MRI images. We employed graph neural networks for knowledge graph creation and a region-based convolutional neural network approach for image-to-knowledge-graph generation. Additionally, we integrated various explainable AI (XAI) techniques to interpret and elucidate the prediction outcomes derived from multimodal data. Layer-wise relevance propagation was used to explain the layer-wise outcomes in the MRI images. We also incorporated submodular pick local interpretable model-agnostic explanations to interpret the decision-making process based on the tabular data. Genetic expression values play a crucial role in AD analysis. We used a graphical gene tree to identify genes associated with the disease. Moreover, a dashboard was designed to display XAI outcomes, enabling experts and medical professionals to easily comprehend the prediction results.
Background: Early diagnosis and accurate staging are important for improving the cure rate and prognosis of pancreatic cancer. This study was performed to develop an automatic and accurate image processing system that reads computed tomography (CT) images correctly and makes the diagnosis of pancreatic cancer faster. Methods: The establishment of the artificial intelligence (AI) system for pancreatic cancer diagnosis based on sequential contrast-enhanced CT images was composed of two processes: training and verification. During the training process, our study used all 4385 CT images from 238 pancreatic cancer patients in the database as the training data set. Additionally, we used VGG16, which was pretrained on ImageNet and contains 13 convolutional layers and three fully connected layers, to initialize the feature extraction network. In the verification experiment, we used sequential clinical CT images from 238 pancreatic cancer patients as our experimental data and input these data into the trained faster region-based convolutional network (Faster R-CNN) model. In total, 1699 images from 100 pancreatic cancer patients were included for clinical verification. Results: A total of 338 patients with pancreatic cancer were included in the study. The differences in clinical characteristics (sex, age, tumor location, differentiation grade, and tumor-node-metastasis stage) between the training and verification groups were not significant. The mean average precision was 0.7664, indicating a good training effect of the Faster R-CNN. Sequential contrast-enhanced CT images of 100 pancreatic cancer patients were used for clinical verification. The area under the receiver operating characteristic curve, calculated according to the trapezoidal rule, was 0.9632. It took approximately 0.2 s for the Faster R-CNN AI to automatically process one CT image, which is much faster than the time required for diagnosis by an imaging specialist. Conclusions: Faster R-CNN AI is an effective and objective method with high accuracy for the diagnosis of pancreatic cancer.
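The area under the ROC curve by the trapezoidal rule can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the toy labels and scores are assumptions, and the threshold sweep simply uses every distinct score.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), sweeping the threshold over distinct scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Integrate TPR over FPR with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): 1 = cancer present, 0 = absent.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc_trapezoid(roc_points(labels, scores)))
```

For this toy input the result equals 8/9, the fraction of positive-negative pairs in which the positive case receives the higher score.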
Background: Distinguishing between primary clear cell carcinoma of the liver (PCCCL) and common hepatocellular carcinoma (CHCC) through traditional inspection methods before surgery is difficult. This study aimed to establish a Faster region-based convolutional neural network (RCNN) model for the accurate differential diagnosis of PCCCL and CHCC. Methods: In this study, we collected the data of 62 patients with PCCCL and 1079 patients with CHCC in Beijing YouAn Hospital from June 2012 to May 2020. A total of 109 patients with CHCC and 42 patients with PCCCL were randomly divided into the training validation set and the test set in a ratio of 4:1. The Faster RCNN was used for deep learning on the patients' data in the training validation set to establish a convolutional neural network model that distinguishes PCCCL from CHCC. The accuracy, average precision, and recall of the model for diagnosing PCCCL and CHCC were used to evaluate the detection performance of the Faster RCNN algorithm. Results: A total of 4392 images of 121 patients (1032 images of 33 patients with PCCCL and 3360 images of 88 patients with CHCC) were used in the training validation set for deep learning and establishing the model, and 1072 images of 30 patients (320 images of nine patients with PCCCL and 752 images of 21 patients with CHCC) were used to test the model. The accuracy of the model for diagnosing PCCCL and CHCC was 0.962 (95% confidence interval [CI]: 0.931-0.992). The average precision of the model for diagnosing PCCCL was 0.908 (95% CI: 0.823-0.993) and that for diagnosing CHCC was 0.907 (95% CI: 0.823-0.993). The recall of the model for diagnosing PCCCL was 0.951 (95% CI: 0.916-0.985) and that for diagnosing CHCC was 0.960 (95% CI: 0.854-0.962). Making a diagnosis with the model took an average of 4 s for each patient. Conclusion: The Faster RCNN model can accurately distinguish PCCCL from CHCC. This model could help clinicians make appropriate treatment plans for patients with PCCCL or CHCC.
This paper helps with leguminous seed detection and smart farming. There are hundreds of kinds of seeds, and it can be very difficult to distinguish between them. Botanists and those who study plants, however, can identify the type of seed at a glance. As far as we know, this is the first work to consider leguminous seed images with different backgrounds, sizes, and crowding. Machine learning is used to automatically classify and locate 11 different seed types. We chose leguminous seeds of 11 types as the objects of this study. These types differ in color, size, and shape, which adds variety and complexity to our research. The image dataset of the leguminous seeds was manually collected, annotated, and then split randomly into three sub-datasets, train, validation, and test (predictions), with a ratio of 80%, 10%, and 10%, respectively. The images account for the variability between different leguminous seed types. The images were captured on five different backgrounds: white A4 paper, black pad, dark blue pad, dark green pad, and green pad. Different heights and shooting angles were considered. The crowdedness of the seeds also varied randomly between 1 and 50 seeds per image. Different combinations and arrangements of the 11 types were considered. Two different image-capturing devices were used: a SAMSUNG smartphone camera and a Canon digital camera. A total of 828 images were obtained, including 9801 seed objects (labels). The dataset contained images of different backgrounds, heights, angles, crowdedness, arrangements, and combinations. The TensorFlow framework was used to construct the Faster Region-based Convolutional Neural Network (R-CNN) model, and CSPDarknet53, which is based on DenseNet and designed to connect layers in convolutional neural networks, was used as the backbone for YOLOv4. Using the transfer learning method, we optimized the seed detection models. The performances of the currently dominant object detection methods, Faster R-CNN and YOLOv4, were compared experimentally. The mAP (mean average precision) values of the Faster R-CNN and YOLOv4 models were 84.56% and 98.52%, respectively. YOLOv4 had a significant advantage in detection speed over Faster R-CNN, which makes it suitable for real-time identification where high accuracy and low false positives are also needed. The results showed that YOLOv4 had better accuracy and detection ability as well as faster detection speed, beating Faster R-CNN by a large margin. The model can be effectively applied under a variety of backgrounds, image sizes, seed sizes, shooting angles, and shooting heights, as well as different levels of seed crowding. It constitutes an effective and efficient method for detecting different leguminous seeds in complex scenarios. This study provides a reference for further seed testing and enumeration applications.
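A random 80%/10%/10% split of an image list can be sketched as below. The file names, the seed, and the rounding rule (train and validation sizes are truncated, and the test set takes the remainder) are assumptions for illustration; the abstract does not describe how the split was implemented.

```python
import random


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the items and cut them into train, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder, so no image is dropped
    return train, val, test


# Hypothetical file names standing in for the 828 collected images.
image_ids = [f"img_{i:04d}.jpg" for i in range(828)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))  # prints "662 82 84"
```

Because 828 is not divisible by 10, the three parts cannot match the ratios exactly; giving the remainder to the test set is one simple convention among several.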
Ocean internal waves appear as irregular bright and dark stripes on synthetic aperture radar (SAR) remote sensing images. Detecting ocean internal waves in SAR images is consequently a difficult and popular research topic. In this paper, ocean internal waves are detected in SAR images by employing the faster regions with convolutional neural network features (Faster R-CNN) framework; for this purpose, 888 internal wave samples are used to train the convolutional network and identify internal waves. The experimental results demonstrate a 94.78% recognition rate for internal waves, and the average detection speed is 0.22 s/image. In addition, the detection results for internal wave samples under different conditions are analyzed. This paper lays a foundation for detecting ocean internal waves using convolutional neural networks.
To solve the problem of small object detection in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on the Faster region-based convolutional neural network (Faster R-CNN) is proposed. Bird's nests on high-voltage towers are taken as the research object. First, we use the improved convolutional neural network ResNet101 to extract object features, and then use multi-scale sliding windows to obtain object region proposals on convolutional feature maps with different resolutions. Finally, a deconvolution operation is added to further enhance the selected higher-resolution feature map, which is then taken as the feature mapping layer of the region proposals passed to the object detection sub-network. The detection results for bird's nests in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
To improve the accuracy of threaded hole detection, we combine a dual-camera vision system with Hough transform circle detection and propose a method for detecting threaded holes in workpieces based on the Faster region-based convolutional neural network (Faster R-CNN). First, a dual-camera image acquisition system is established. One industrial camera placed at a high position collects the whole image of the workpiece, and the suspected screw hole positions on the workpiece are preliminarily selected by the Hough transform detection algorithm. Then, the other industrial camera collects, one by one, local images of the suspected screw holes detected by the Hough transform. After that, a ResNet50-based Faster R-CNN object detection model is trained on the self-built screw hole data set. Finally, the local image of the threaded hole is input into the trained Faster R-CNN object detection model for further identification and localization. The experimental results show that the proposed method effectively avoids the small-object detection problem for threaded holes, and that it has higher recognition and positioning accuracy than using the Hough transform or Faster R-CNN object detection alone.
To address the problems of poor illumination and inaccurate positioning, this paper proposes a pedestrian detection algorithm suitable for low-light environments. The algorithm first applies the multi-scale Retinex image enhancement algorithm to the sample pre-processing stage of deep learning to improve the image resolution. Then, the faster regional convolutional neural network is used to train the pedestrian detection model, extract pedestrian features, and obtain bounding boxes through classification and position regression. Finally, the Soft-NMS algorithm is introduced into the pedestrian detection process to eliminate redundant bounding boxes and obtain the best pedestrian detection position. The experimental results show that the proposed detection algorithm achieves an average accuracy of 89.74% on the low-light dataset, with a markedly improved pedestrian detection effect.
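Soft-NMS lowers the scores of boxes that overlap a higher-scoring box instead of discarding them outright. The sketch below shows the Gaussian-decay variant in plain Python; the choice of variant, the `sigma` and `score_threshold` values, and the example boxes are assumptions, since the abstract does not give these details.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of removing boxes."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the remaining box with the highest (possibly decayed) score.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            decayed = score * math.exp(-(iou(best_box, box) ** 2) / sigma)
            if decayed >= score_threshold:  # drop boxes whose score has vanished
                rescored.append((box, decayed))
        candidates = rescored
    return kept


# Hypothetical detections: two overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for box, score in soft_nms(boxes, scores):
    print(box, round(score, 3))
```

In this example the second box overlaps the first heavily, so its score is decayed and it falls behind the non-overlapping third box rather than being deleted, which is the behavior that helps when pedestrians stand close together.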
Rock classification plays a crucial role in various fields such as geology, engineering, and environmental studies. Deep learning AI (artificial intelligence) methods have high potential to significantly improve the accuracy and efficiency of this task. This paper explores two cutting-edge AI techniques, Mask DINO and Mask R-CNN (region-based convolutional neural network), as means of identifying rock weathering grades and rock types. The results demonstrate that Mask DINO, which is a Detection Transformer (DETR), outperforms Mask R-CNN for these purposes. Mask DINO achieved F1 scores of 91% and 86% in weathering grade detection and rock type detection, as opposed to Mask R-CNN's F1 scores of 84% and 75%, respectively. These findings underscore the substantial potential of DETR algorithms such as Mask DINO for the automatic classification of both rock type and weathering state. Although the study examines only two AI models, the data processing and other techniques developed in this study may serve as a foundation for future advancements in the field. With these advanced AI techniques, logging personnel can obtain valuable references to aid their work, ultimately contributing to the advancement of geology and related fields.
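The F1 score reported above is the harmonic mean of precision and recall. A minimal sketch of the computation from detection counts follows; the function name and the counts in the example are hypothetical and are not taken from the study.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 correct detections, 10 false alarms, 20 misses.
print(round(f1_score(80, 10, 20), 4))
```

With these counts, precision is 80/90 and recall is 80/100, so the score equals 160/190, about 0.84.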
Recent multimedia and computer vision research has focused on analyzing human behavior and activity using images. Skeleton estimation, known as pose estimation, has received significant attention. For human pose estimation, deep learning approaches primarily emphasize keypoint features. However, for occluded or incomplete poses, the keypoint feature is not sufficient, especially when there are multiple humans in a single frame. Other features, such as the body border and visibility conditions, can contribute to pose estimation in addition to the keypoint feature. Our model framework integrates multiple features via a mask region-based convolutional neural network (Mask-RCNN): the human body mask features, which can serve as a constraint on keypoint location estimation, the body keypoint features, and the keypoint visibility. A sequential multi-feature learning setup is formed to share multiple features across the structure, whereas in Mask-RCNN the only feature that can be shared through the system is the region of interest feature. By two-way up-scaling with a shared-weight process to produce the mask, we address the problems of improper segmentation, small intrusion, and object loss that arise when Mask-RCNN is used for instance segmentation. Accuracy is measured by the percentage of correct keypoints, and our model identifies 86.1% of the keypoints correctly.
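The percentage of correct keypoints (PCK) counts a predicted keypoint as correct when it lies within a fraction of a reference length from the ground truth. The sketch below is a generic version of that metric; the reference length, the `alpha` fraction, and the coordinates are assumptions, because the abstract does not say which PCK variant was used.

```python
import math


def pck(predicted, ground_truth, reference_length, alpha=0.5):
    """Fraction of keypoints within alpha * reference_length of the ground truth."""
    if not ground_truth or len(predicted) != len(ground_truth):
        raise ValueError("need equally sized, non-empty keypoint lists")
    threshold = alpha * reference_length
    correct = 0
    for (px, py), (gx, gy) in zip(predicted, ground_truth):
        if math.hypot(px - gx, py - gy) <= threshold:
            correct += 1
    return correct / len(ground_truth)


# Hypothetical keypoints in pixels; reference length 10 gives a 5-pixel tolerance.
ground_truth = [(10, 10), (20, 20), (30, 30), (40, 40)]
predicted = [(11, 10), (20, 26), (30, 31), (48, 40)]
print(pck(predicted, ground_truth, reference_length=10))  # prints 0.5
```

Two of the four predictions fall within the 5-pixel tolerance, so the score is 0.5. In practice the reference length is usually tied to the person's size, such as a head or torso measurement, so that the tolerance scales with the subject.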
Background: Artificial intelligence-assisted image recognition technology is currently able to detect the target area of an image and extract information to make classifications according to target features. This study aimed to use deep neural networks for computed tomography (CT) diagnosis of perigastric metastatic lymph nodes (PGMLNs) to simulate the recognition of lymph nodes by radiologists and to acquire more accurate identification results. Methods: A total of 1371 images of suspected lymph node metastasis from enhanced abdominal CT scans were identified and labeled by radiologists and were used with 18,780 original images for faster region-based convolutional neural network (FR-CNN) deep learning. The identification results of the FR-CNN for 6000 random CT images from 100 gastric cancer patients were compared with results obtained from radiologists in terms of identification accuracy. Similarly, 1004 CT images with metastatic lymph nodes that had been post-operatively confirmed by pathological examination and 11,340 original images were used in the identification and learning processes described above. The same 6000 gastric cancer CT images were used for the verification, according to which the diagnosis results were analyzed. Results: In the initial group, precision-recall curves were generated based on the precision and recall rates of the nodule classes in the training set and the validation set; the mean average precision (mAP) value was 0.5019. To verify the results of the initial learning group, the receiver operating characteristic curve was generated, and the corresponding area under the curve (AUC) value was calculated as 0.8995. After the second phase of precise learning, all the indicators improved, and the mAP and AUC values were 0.7801 and 0.9541, respectively. Conclusion: Through deep learning, FR-CNN achieved high judgment effectiveness and recognition accuracy for CT diagnosis of PGMLNs.
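Average precision (AP) summarizes a precision-recall curve as the area under it, and mAP is the mean of AP over classes. The sketch below uses all-point interpolation on a ranked list of detections; the interpolation scheme and the example flags are assumptions, since the abstract does not specify how its mAP was computed.

```python
def average_precision(is_true_positive, num_ground_truth):
    """AP for one class; flags must be sorted by descending detection confidence."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    true_positives = 0
    precisions, recalls = [], []
    for rank, flag in enumerate(is_true_positive, start=1):
        if flag:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Replace each precision with the best precision at any higher rank,
    # which makes the curve monotonically non-increasing in recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Hypothetical ranked detections for one class with two ground-truth nodules.
flags = [True, False, True, False]
print(round(average_precision(flags, num_ground_truth=2), 4))
```

For these flags the first hit contributes half of the recall at precision 1 and the second hit contributes the other half at precision 2/3, giving an AP of 5/6.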
Learning an effective object detector with little supervision is an essential but challenging problem in computer vision applications. In this paper, we consider the problem of learning a deep convolutional neural network (CNN) based object detector using weakly-supervised and semi-supervised information in the framework of the fast region-based CNN (Fast R-CNN). The target is to obtain an object detector as accurate as the fully-supervised Fast R-CNN while requiring less image annotation effort. To solve this problem, we use weakly-supervised training images (i.e., only the image-level annotation is given) and a small proportion of fully-supervised training images (i.e., the bounding-box-level annotation is given), which constitutes a weakly- and semi-supervised (WASS) object detection setting. The proposed solution is termed WASS R-CNN and has two main stages. First, a weakly-supervised R-CNN is trained; after that, semi-supervised data are used to fine-tune the weakly-supervised detector. We perform object detection experiments on the PASCAL VOC 2007 dataset. The proposed WASS R-CNN achieves more than 85% of a fully-supervised Fast R-CNN's performance (measured using mean average precision) with only 10% of fully-supervised annotations together with weak supervision for all training images. The results show that the proposed learning framework can significantly reduce the labeling effort needed to obtain reliable object detectors.
The increasing capabilities of Artificial Intelligence (AI) have led researchers and visionaries to consider the possibility of machines outperforming humans by gaining intelligence equal to or greater than that of humans, which may not always have a positive impact on society. AI gone rogue and Technological Singularity are major concerns in academia as well as in industry. It is necessary to identify the limitations of machines and analyze their shortcomings, which could draw a line between human and machine intelligence. Internet memes are an amalgam of pictures, videos, underlying messages, ideas, sentiments, humor, and experiences; hence, the way a human perceives an internet meme may not be entirely how a machine comprehends it. In this paper, we present experimental evidence on how comprehending internet memes is a challenge for AI. We use a combination of Optical Character Recognition techniques such as Tesseract, PixelLink, and the EAST detector to extract text from the memes, and machine learning algorithms such as Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (RCNN), and transfer learning with a pre-trained DenseNet to assess the textual and facial emotions combined. We evaluate the performance using sensitivity and specificity. Our results show that comprehending memes is indeed a challenging task and hence a major limitation of AI. This research should be of particular interest to researchers working in the areas of Artificial General Intelligence and Technological Singularity.
To pursue the ideal of a safe high-tech society at a time when traffic accidents are frequent, traffic sign detection has become a necessary research topic in recent years and will remain so in the future. The ultimate goal of this research is to identify and classify the types of traffic signs in a panoramic image. To accomplish this goal, the paper proposes a new model for traffic sign detection that uses a Convolutional Neural Network for comprehensive traffic sign classification and a Mask Region-based Convolutional Neural Network (Mask R-CNN) implementation for identifying and extracting signs in panoramic images. Data augmentation and normalization of the images are also applied to improve classification even when old traffic signs are degraded, and to considerably reduce the rate of detecting extra boxes. The proposed model is tested on both the testing dataset and actual images and achieves a correct sign recognition rate of 94.5%; the classification rate of the detected signs was 99.41%, and the rate of false signs was only around 0.11.
For traffic object detection in foggy environments based on a convolutional neural network (CNN), data sets collected in fog-free environments are generally used to train the network directly. As a result, the network cannot learn the object characteristics of the foggy environment from the training set, and the detection effect is poor. To improve traffic object detection in foggy environments, we propose a method of generating foggy images from fog-free images from the perspective of data set construction. First, taking images from the KITTI object detection data set as the original fog-free images, we generate the depth image of each original image by using an improved Monodepth unsupervised depth estimation method. Then, a geometric prior depth template is constructed and fused with the depth image, with the image entropy taken as the weight. After that, a foggy image is generated from the depth image based on the atmospheric scattering model. Finally, we take two typical object detection frameworks, namely the two-stage Faster region-based convolutional neural network (Faster-RCNN) and the one-stage network YOLOv4, and train them on the original data set, the foggy data set, and the mixed data set, respectively. According to the test results on the RESIDE-RTTS data set, which was collected in outdoor natural foggy environments, the model trained on the mixed data set performs best. The mean average precision (mAP) values increase by 5.6% and 5.0% for the YOLOv4 model and the Faster-RCNN network, respectively. This shows that the proposed method can effectively improve object identification ability in foggy environments.
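The atmospheric scattering model is commonly written as I(x) = J(x) t(x) + A (1 - t(x)) with transmission t(x) = exp(-beta d(x)), where J is the clear image, d the scene depth, A the atmospheric light, and beta the scattering coefficient. The sketch below applies that standard form to a small grayscale image stored as nested lists; the values of `beta` and `airlight` and the toy inputs are assumptions, and the depth-template fusion described in the abstract is not reproduced.

```python
import math


def add_fog(image, depth, beta=0.1, airlight=255.0):
    """Synthesize fog per pixel: I = J * t + A * (1 - t), with t = exp(-beta * d)."""
    foggy = []
    for image_row, depth_row in zip(image, depth):
        row = []
        for pixel, distance in zip(image_row, depth_row):
            transmission = math.exp(-beta * distance)
            row.append(pixel * transmission + airlight * (1.0 - transmission))
        foggy.append(row)
    return foggy


# Toy 2x2 grayscale image with hypothetical depths in meters.
clear_image = [[100.0, 100.0], [50.0, 200.0]]
depth_map = [[0.0, 10.0], [30.0, 1000.0]]
for row in add_fog(clear_image, depth_map):
    print([round(value, 1) for value in row])
```

A pixel at zero depth is unchanged, while distant pixels converge to the atmospheric light, which is why an accurate depth map matters for realistic synthetic fog.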
As a classic deep learning target detection algorithm, Faster R-CNN (region convolutional neural network) has been widely used in high-resolution synthetic aperture radar (SAR) and inverse SAR (ISAR) image detection. However, it is difficult to achieve good performance on the more common low-resolution radar plan position indicator (PPI) images. In this paper, taking navigation radar PPI images as an example, a marine target detection method based on the Marine-Faster R-CNN algorithm is proposed for cases with complex backgrounds (e.g., sea clutter) and target characteristics. The method performs feature extraction and target recognition with a convolutional neural network (CNN) on PPI images generated from radar echoes. First, to improve the accuracy of detecting marine targets and reduce the false alarm rate, Faster R-CNN was optimized as the Marine-Faster R-CNN in five respects: new backbone network, anchor size, dense target detection, data sample balance, and scale normalization. Then, a JRC (Japan Radio Co., Ltd.) navigation radar was used to collect echo data under different conditions to build a marine target dataset. Finally, comparisons with the classic Faster R-CNN method and the constant false alarm rate (CFAR) algorithm showed that the proposed method is more accurate and robust, has stronger generalization ability, and can be applied to the detection of marine targets with navigation radar. Its performance was tested with datasets from different observation conditions (sea states, radar parameters, and different targets).
Abstract: The concept of classification through deep learning is to build a model that skillfully separates a dataset of closely related images into different classes on the basis of small but continuous variations that take place in physical systems over time and have substantial effects. This study identifies ozone depletion through classification using a Faster Region-Based Convolutional Neural Network (F-RCNN). The main advantage of F-RCNN is that it places bounding boxes on images to differentiate depleted from non-depleted regions. Furthermore, the primary goal of image classification is to accurately predict the target class of each minutely varied case in the dataset based on ozone saturation. Permanent changes in climate are of serious concern. The leading causes behind these destructive variations are ozone layer depletion, greenhouse gas release, deforestation, pollution, water resource contamination, and UV radiation. This research focuses on prediction by identifying ozone layer depletion because it causes many health issues, e.g., skin cancer, damage to marine life, crop damage, and impacts on the immune systems of living beings. We classify the ozone image dataset into two major classes, depleted and non-depleted regions, to extract the required distinguishing features through F-RCNN. Furthermore, CNN has been used for feature extraction in the existing literature, and the extracted diverse RoIs are passed on to the CNN for grouping purposes. It is difficult to manage and differentiate those RoIs after grouping, which negatively affects the gathered results. The classification outcomes of the F-RCNN approach demonstrate that overall accuracy lies between 91% and 93% in identifying climate variation through ozone concentration classification, that is, whether the region in the image under consideration is depleted or non-depleted. Our proposed model achieved 93% accuracy and outperforms the prevailing techniques.
Abstract: This paper proposes a solution to the localization and classification of rice grains in an image. All existing related works rely on conventional machine learning approaches. However, those techniques do not perform well on the problem addressed in this paper because of the high similarity between different types of rice grains. The proposed solution is based on deep learning. It contains pre-processing steps of data annotation using the watershed algorithm, auto-alignment using the major axis orientation, and image enhancement using the contrast-limited adaptive histogram equalization (CLAHE) technique. Then, a mask region-based convolutional neural network (Mask R-CNN) is trained to localize and classify rice grains in an input image. Performance is enhanced by using transfer learning and dropout regularization to prevent overfitting. The proposed method is validated in many experimental scenarios, with results reported as mean average precision (mAP) and a confusion matrix. It achieves above 80% mAP for the main scenarios in the experiments. It is also shown to perform outstandingly when compared to human experts.
Funding: This study was supported by a Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2020R1A2C1014829) and the Soonchunhyang University Research Fund.
Abstract: An otoscope is traditionally used to examine the eardrum and ear canal. A diagnosis of otitis media (OM) relies on the experience of clinicians. If an examiner lacks experience, the examination may be difficult and time-consuming. This paper presents an ear disease classification method using middle ear images based on a convolutional neural network (CNN). Specifically, segmentation and classification networks are used to classify an otoscopic image into six classes: normal, acute otitis media (AOM), otitis media with effusion (OME), chronic otitis media (COM), congenital cholesteatoma (CC), and traumatic perforations (TMPs). Mask R-CNN is used as the segmentation network to extract the region of interest (ROI) from otoscopic images. The extracted ROIs are used as guiding features for the classification. The classification is based on transfer learning with an ensemble of two CNN classifiers: EfficientNetB0 and Inception-V3. The proposed model was trained with 5-fold cross-validation. The proposed method achieved a classification accuracy of 97.29%.
Abstract: Alzheimer’s disease (AD) is a neurological disorder that predominantly affects the brain. In the coming years, it is expected to spread rapidly, with limited progress in diagnostic techniques. Various machine learning (ML) and artificial intelligence (AI) algorithms have been employed to detect AD using single-modality data. However, recent developments in ML have enabled the application of these methods to multiple data sources and input modalities for AD prediction. In this study, we developed a framework that utilizes multimodal data (tabular data, magnetic resonance imaging (MRI) images, and genetic information) to classify AD. As part of the pre-processing phase, we generated a knowledge graph from the tabular data and MRI images. We employed graph neural networks for knowledge graph creation and a region-based convolutional neural network approach for image-to-knowledge-graph generation. Additionally, we integrated various explainable AI (XAI) techniques to interpret and elucidate the prediction outcomes derived from multimodal data. Layer-wise relevance propagation was used to explain the layer-wise outcomes in the MRI images. We also incorporated submodular pick local interpretable model-agnostic explanations to interpret the decision-making process based on the tabular data provided. Genetic expression values play a crucial role in AD analysis. We used a graphical gene tree to identify genes associated with the disease. Moreover, a dashboard was designed to display XAI outcomes, enabling experts and medical professionals to easily comprehend the prediction results.
Funding: This work was supported by grants from the National Natural Science Foundation of China (No. 81802888) and the Key Research and Development Project of Shandong Province (No. 2018GSF118206 and No. 2018GSF118088).
Abstract: Background: Early diagnosis and accurate staging are important to improve the cure rate and prognosis for pancreatic cancer. This study was performed to develop an automatic and accurate image processing system that reads computed tomography (CT) images correctly and makes the diagnosis of pancreatic cancer faster. Methods: The establishment of the artificial intelligence (AI) system for pancreatic cancer diagnosis based on sequential contrast-enhanced CT images was composed of two processes: training and verification. During the training process, our study used all 4385 CT images from 238 pancreatic cancer patients in the database as the training data set. Additionally, we used VGG16, which was pretrained on ImageNet and contains 13 convolutional layers and three fully connected layers, to initialize the feature extraction network. In the verification experiment, we used sequential clinical CT images from 238 pancreatic cancer patients as our experimental data and input these data into the faster region-based convolution network (Faster R-CNN) model that had completed training. In total, 1699 images from 100 pancreatic cancer patients were included for clinical verification. Results: A total of 338 patients with pancreatic cancer were included in the study. The differences in clinical characteristics (sex, age, tumor location, differentiation grade, and tumor-node-metastasis stage) between the training and verification groups were not significant. The mean average precision was 0.7664, indicating a good training effect of the Faster R-CNN. Sequential contrast-enhanced CT images of 100 pancreatic cancer patients were used for clinical verification. The area under the receiver operating characteristic curve, calculated according to the trapezoidal rule, was 0.9632. It took approximately 0.2 s for the Faster R-CNN AI to automatically process one CT image, which is much faster than the time required for diagnosis by an imaging specialist. Conclusions: Faster R-CNN AI is an effective and objective method with high accuracy for the diagnosis of pancreatic cancer.
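The abstract above reports an area under the receiver operating characteristic curve computed with the trapezoidal rule. As a rough illustration of that calculation only, the following Python sketch builds ROC points from binary labels and classifier scores and sums the trapezoids between them. The function name `roc_auc_trapezoid` and the toy inputs are assumptions made for this example; they are not taken from the study, whose data and code are not shown here.

```python
def roc_auc_trapezoid(labels, scores):
    """Area under the ROC curve via the trapezoidal rule.

    labels: iterable of 0/1 ground-truth values (1 = positive).
    scores: iterable of classifier scores, higher = more likely positive.
    Hypothetical helper for illustration; not the study's implementation.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(1 for _, label in pairs if label == 1)
    neg = len(pairs) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    tp = fp = 0
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))

    # Sum the trapezoid areas between consecutive ROC points.
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc
```

For example, `roc_auc_trapezoid([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates the four toy samples at each distinct score threshold and integrates the resulting curve.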
Abstract: Background: Distinguishing between primary clear cell carcinoma of the liver (PCCCL) and common hepatocellular carcinoma (CHCC) through traditional inspection methods before the operation is difficult. This study aimed to establish a Faster region-based convolutional neural network (RCNN) model for the accurate differential diagnosis of PCCCL and CHCC. Methods: In this study, we collected the data of 62 patients with PCCCL and 1079 patients with CHCC in Beijing YouAn Hospital from June 2012 to May 2020. A total of 109 patients with CHCC and 42 patients with PCCCL were randomly divided into the training validation set and the test set in a ratio of 4:1. The Faster RCNN was used for deep learning on the patients’ data in the training validation set to establish a convolutional neural network model that distinguishes PCCCL from CHCC. The accuracy, average precision, and recall of the model for diagnosing PCCCL and CHCC were used to evaluate the detection performance of the Faster RCNN algorithm. Results: A total of 4392 images of 121 patients (1032 images of 33 patients with PCCCL and 3360 images of 88 patients with CHCC) were used in the training validation set for deep learning and establishing the model, and 1072 images of 30 patients (320 images of nine patients with PCCCL and 752 images of 21 patients with CHCC) were used to test the model. The accuracy of the model for diagnosing PCCCL and CHCC was 0.962 (95% confidence interval [CI]: 0.931-0.992). The average precision of the model for diagnosing PCCCL was 0.908 (95% CI: 0.823-0.993) and that for diagnosing CHCC was 0.907 (95% CI: 0.823-0.993). The recall of the model for diagnosing PCCCL was 0.951 (95% CI: 0.916-0.985) and that for diagnosing CHCC was 0.960 (95% CI: 0.854-0.962). Making a diagnosis with the model took an average of 4 s per patient. Conclusion: The Faster RCNN model can accurately distinguish PCCCL from CHCC. This model could help clinicians make appropriate treatment plans for patients with PCCCL or CHCC.
Abstract: This paper helps with leguminous seed detection and smart farming. There are hundreds of kinds of seeds, and it can be very difficult to distinguish between them. Botanists and those who study plants, however, can identify the type of seed at a glance. As far as we know, this is the first work to consider leguminous seed images with different backgrounds, sizes, and crowding. Machine learning is used to automatically classify and locate 11 different seed types. We chose leguminous seeds of 11 types as the objects of this study. These types have different colors, sizes, and shapes, which adds variety and complexity to our research. The image dataset of the leguminous seeds was manually collected, annotated, and then split randomly into three sub-datasets, train, validation, and test (predictions), with a ratio of 80%, 10%, and 10%, respectively. The images captured the variability between different leguminous seed types. The images were taken on five different backgrounds: white A4 paper, black pad, dark blue pad, dark green pad, and green pad. Different heights and shooting angles were considered. The crowdedness of the seeds also varied randomly between 1 and 50 seeds per image. Different combinations and arrangements of the 11 types were considered. Two different image-capturing devices were used: a SAMSUNG smartphone camera and a Canon digital camera. A total of 828 images were obtained, including 9801 seed objects (labels). The dataset contained images of different backgrounds, heights, angles, crowdedness, arrangements, and combinations. The TensorFlow framework was used to construct the Faster Region-based Convolutional Neural Network (R-CNN) model, and CSPDarknet53, which is based on DenseNet and designed to connect layers in convolutional neural networks, was used as the backbone for YOLOv4. Using the transfer learning method, we optimized the seed detection models. The performances of the currently dominant object detection methods, Faster R-CNN and YOLOv4, were compared experimentally. The mAP (mean average precision) values of the Faster R-CNN and YOLOv4 models were 84.56% and 98.52%, respectively. YOLOv4 had a significant advantage in detection speed over Faster R-CNN, which makes it suitable for real-time identification as well as for cases where high accuracy and low false positives are needed. The results showed that YOLOv4 had better accuracy and detection ability, as well as faster detection speed, beating Faster R-CNN by a large margin. The model can be effectively applied under a variety of backgrounds, image sizes, seed sizes, shooting angles, and shooting heights, as well as different levels of seed crowding. It constitutes an effective and efficient method for detecting different leguminous seeds in complex scenarios. This study provides a reference for further seed testing and enumeration applications.
Funding: Supported by the National Natural Science Foundation of China (No. 61471136), the Special Project for Global Change and Air-sea Interaction of the Ministry of Natural Resources (No. GASI-02-SCS-YGST2-04), and the Chinese Association of Ocean Mineral Resources R&D (No. DY135-E2-4).
Abstract: Ocean internal waves appear as irregular bright and dark stripes on synthetic aperture radar (SAR) remote sensing images. The detection of ocean internal waves in SAR images has consequently become a difficult and popular research topic. In this paper, ocean internal waves are detected in SAR images by employing the faster regions with convolutional neural network features (Faster R-CNN) framework; for this purpose, 888 internal wave samples are used to train the convolutional network and identify internal waves. The experimental results demonstrate a 94.78% recognition rate for internal waves, and the average detection speed is 0.22 s/image. In addition, the detection results of internal wave samples under different conditions are analyzed. This paper lays a foundation for detecting ocean internal waves using convolutional neural networks.
Funding: National Defense Pre-research Fund Project (No. KMGY318002531).
Abstract: To solve the problem of small object detection in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on the Faster region-based convolutional neural network (Faster R-CNN) is proposed. The bird’s nest on a high-voltage tower is taken as the research object. First, we use the improved convolutional neural network ResNet101 to extract object features, and then use multi-scale sliding windows to obtain object region proposals on convolutional feature maps with different resolutions. Finally, a deconvolution operation is added to further enhance the selected feature map with higher resolution, which is then taken as a feature mapping layer of the region proposals passed to the object detection sub-network. The detection results for bird’s nests in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
Abstract: To improve the accuracy of threaded hole detection, we propose a method for detecting threaded holes in workpieces based on the Faster region-based convolutional neural network (Faster R-CNN), combining a dual-camera vision system with Hough transform circle detection. First, a dual-camera image acquisition system is established. One industrial camera placed at a high position collects the whole image of the workpiece, and the suspected screw hole positions on the workpiece are preliminarily selected by the Hough transform detection algorithm. Then, the other industrial camera collects, one by one, local images of the suspected screw holes detected by the Hough transform. After that, a ResNet50-based Faster R-CNN object detection model is trained on a self-built screw hole data set. Finally, the local image of the threaded hole is input into the trained Faster R-CNN object detection model for further identification and location. The experimental results show that the proposed method effectively avoids the small-object detection problem for threaded holes and, compared with using the Hough transform or Faster R-CNN object detection alone, achieves high recognition and positioning accuracy.
Abstract: To address poor illumination characteristics and inaccurate positioning, this paper proposes a pedestrian detection algorithm suitable for low-light environments. The algorithm first applies the multi-scale Retinex image enhancement algorithm in the sample pre-processing stage of deep learning to improve the image resolution. Then the faster regional convolutional neural network is used to train the pedestrian detection model, extract pedestrian characteristics, and obtain bounding boxes through classification and position regression. Finally, pedestrian detection is carried out with the Soft-NMS algorithm, which eliminates redundant bounding boxes to obtain the best pedestrian detection position. The experimental results show that the proposed detection algorithm achieves an average accuracy of 89.74% on the low-light dataset, with a more significant pedestrian detection effect.
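The abstract above uses Soft-NMS to prune redundant bounding boxes. The following Python sketch shows the Gaussian variant of Soft-NMS under stated assumptions: boxes are `[x1, y1, x2, y2]` with positive area, and the `sigma` and `score_thresh` defaults are illustrative choices, not values reported by the paper. The helper names are hypothetical. Instead of discarding every box that overlaps the current best detection, the scores of overlapping boxes are decayed according to their IoU.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch (sigma and score_thresh are assumed defaults).

    Returns the indices of kept boxes in selection order and the decayed scores.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = np.arange(len(scores))
    keep = []
    while remaining.size > 0:
        # Select the highest-scoring box among those still in play.
        best = remaining[np.argmax(scores[remaining])]
        keep.append(int(best))
        remaining = remaining[remaining != best]
        if remaining.size == 0:
            break
        # Decay the scores of the other boxes by their overlap with the best box.
        overlaps = iou_one_to_many(boxes[best], boxes[remaining])
        scores[remaining] *= np.exp(-(overlaps ** 2) / sigma)
        # Drop boxes whose score has fallen below the threshold.
        remaining = remaining[scores[remaining] > score_thresh]
    return keep, scores
```

With two heavily overlapping boxes and one distant box, the overlapping lower-scored box is kept but ranked last with a reduced score, whereas hard NMS with a typical IoU threshold would remove it outright.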
Funding: Supported by the Construction Industry Council (Grant No. CICR/01/22) and the General Research Fund (Grant No. 17206822) of the Research Grants Council (Hong Kong).
Abstract: Rock classification plays a crucial role in various fields such as geology, engineering, and environmental studies. Employing deep learning AI (artificial intelligence) methods has a high potential to significantly improve the accuracy and efficiency of this task. The paper explores two cutting-edge AI techniques, namely Mask DINO and Mask R-CNN (convolutional neural network), as means to identify rock weathering grades and rock types. The results demonstrate that Mask DINO, which is a Detection Transformer (DETR), outperforms Mask R-CNN for these purposes. Mask DINO achieved F1 scores of 91% and 86% in weathering grade detection and rock type detection, respectively, as opposed to Mask R-CNN's F1 scores of 84% and 75%. These findings underscore the substantial potential of employing DETR algorithms like Mask DINO for automatic classification of both rock type and weathering state. Although the study examines only two AI models, the data processing and other techniques developed in this study may serve as a foundation for future advancements in the field. By incorporating these advanced AI techniques, logging personnel can obtain valuable references to aid their work, ultimately contributing to the advancement of geological and related fields.
Funding: The Industry-University-Research Cooperation Fund Project of the Eighth Research Institute of China Aerospace Science and Technology Corporation (No. USCAST2021-5).
Abstract: Recent multimedia and computer vision research has focused on analyzing human behavior and activity using images. Skeleton estimation, known as pose estimation, has received significant attention. For human pose estimation, deep learning approaches primarily emphasize keypoint features. However, for occluded or incomplete poses, the keypoint feature is not sufficient, especially when there are multiple humans in a single frame. Other features, such as the body border and visibility conditions, can contribute to pose estimation in addition to the keypoint feature. Our model framework integrates multiple features via a mask region-based convolutional neural network (Mask-RCNN): the human body mask features, which can serve as a constraint on keypoint location estimation, the body keypoint features, and the keypoint visibility. A sequential multi-feature learning setup is formed to share multiple features across the structure, whereas in Mask-RCNN the only feature that can be shared through the system is the region-of-interest feature. By two-way up-scaling with a shared weight process to produce the mask, we address the problems of improper segmentation, small intrusion, and object loss that arise when Mask-RCNN is used for instance segmentation. Accuracy is measured by the percentage of correct keypoints, and our model identifies 86.1% of the keypoints correctly.
Abstract: Background: Artificial intelligence-assisted image recognition technology is currently able to detect the target area of an image and extract information to make classifications according to target features. This study aimed to use deep neural networks for computed tomography (CT) diagnosis of perigastric metastatic lymph nodes (PGMLNs) to simulate the recognition of lymph nodes by radiologists and to obtain more accurate identification results. Methods: A total of 1371 images of suspected lymph node metastasis from enhanced abdominal CT scans were identified and labeled by radiologists and were used with 18,780 original images for faster region-based convolutional neural network (FR-CNN) deep learning. The identification results of the FR-CNN on 6000 random CT images from 100 gastric cancer patients were compared with results obtained from radiologists in terms of identification accuracy. Similarly, 1004 CT images with metastatic lymph nodes that had been post-operatively confirmed by pathological examination and 11,340 original images were used in the identification and learning processes described above. The same 6000 gastric cancer CT images were used for verification, according to which the diagnosis results were analyzed. Results: In the initial group, precision-recall curves were generated based on the precision rates and recall rates of nodule classes in the training set and the validation set; the mean average precision (mAP) value was 0.5019. To verify the results of the initial learning group, the receiver operating characteristic curve was generated, and the corresponding area under the curve (AUC) value was calculated as 0.8995. After the second phase of precise learning, all the indicators improved, and the mAP and AUC values were 0.7801 and 0.9541, respectively. Conclusion: Through deep learning, FR-CNN achieved high judgment effectiveness and recognition accuracy for CT diagnosis of PGMLNs.
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61876212, 61733007, and 61572207, and the National Key Research and Development Program of China under Grant No. 2018YFB1402604.
Abstract: Learning an effective object detector with little supervision is an essential but challenging problem in computer vision applications. In this paper, we consider the problem of learning a deep convolutional neural network (CNN) based object detector using weakly-supervised and semi-supervised information in the framework of fast region-based CNN (Fast R-CNN). The goal is to obtain an object detector as accurate as the fully-supervised Fast R-CNN while requiring less image annotation effort. To solve this problem, we use weakly-supervised training images (i.e., only the image-level annotation is given) and a small proportion of fully-supervised training images (i.e., the bounding-box-level annotation is given), which constitutes a weakly- and semi-supervised (WASS) object detection setting. The proposed solution is termed WASS R-CNN and has two main components. First, a weakly-supervised R-CNN is trained; after that, semi-supervised data are used to fine-tune the weakly-supervised detector. We perform object detection experiments on the PASCAL VOC 2007 dataset. The proposed WASS R-CNN achieves more than 85% of a fully-supervised Fast R-CNN's performance (measured using mean average precision) with only 10% of fully-supervised annotations together with weak supervision for all training images. The results show that the proposed learning framework can significantly reduce the labeling effort for obtaining reliable object detectors.
Abstract: The increasing capabilities of Artificial Intelligence (AI) have led researchers and visionaries to consider machines outperforming humans by gaining intelligence equal to or greater than that of humans, which may not always have a positive impact on society. AI gone rogue and the Technological Singularity are major concerns in academia as well as industry. It is necessary to identify the limitations of machines and analyze their incompetence, which could draw a line between human and machine intelligence. Internet memes are an amalgam of pictures, videos, underlying messages, ideas, sentiments, humor, and experiences; hence the way an internet meme is perceived by a human may not be entirely how a machine comprehends it. In this paper, we present experimental evidence on how comprehending Internet memes is a challenge for AI. We use a combination of Optical Character Recognition techniques such as Tesseract, Pixel Link, and East Detector to extract text from the memes, and machine learning algorithms such as Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (RCNN), and transfer learning with pre-trained DenseNet to assess the textual and facial emotions combined. We evaluate the performance using sensitivity and specificity. Our results show that comprehending memes is indeed a challenging task, and hence a major limitation of AI. This research would be of utmost interest to researchers working in the areas of Artificial General Intelligence and Technological Singularity.
Abstract: In pursuit of a safe high-tech society at a time when traffic accidents are frequent, traffic sign detection has become a necessary research topic in recent years and will remain so in the future. The ultimate goal of this research is to identify and classify the types of traffic signs in a panoramic image. To accomplish this goal, the paper proposes a new model for traffic sign detection based on a Convolutional Neural Network for comprehensive traffic sign classification and a Mask Region-based Convolutional Neural Network (Mask R-CNN) implementation for identifying and extracting signs in panoramic images. Data augmentation and normalization of the images are also applied to improve classification even when old traffic signs are degraded, and to considerably reduce the rate of extra boxes being detected. The proposed model is tested on both the testing dataset and actual images and achieves a correct sign recognition rate of 94.5%; the classification rate for the detected signs was 99.41%, and the rate of false signs was only around 0.11.
Abstract: For traffic object detection in foggy environments based on convolutional neural networks (CNN), data sets from fog-free environments are generally used to train the network directly. As a result, the network cannot learn object characteristics in foggy environments from the training set, and the detection performance is poor. To improve traffic object detection in foggy environments, we propose a method of generating foggy images from fog-free images from the perspective of data set construction. First, taking the KITTI object detection data set as the original fog-free images, we generate the depth image of each original image using an improved Monodepth unsupervised depth estimation method. Then, a geometric prior depth template is constructed and fused with the depth image, with the image entropy taken as the weight. After that, a foggy image is obtained from the depth image based on the atmospheric scattering model. Finally, we take two typical object detection frameworks, the two-stage Faster region-based convolutional neural network (Faster-RCNN) and the one-stage network YOLOv4, and train them on the original data set, the foggy data set, and the mixed data set, respectively. According to the test results on the RESIDE-RTTS data set of outdoor natural foggy scenes, the model trained on the mixed data set performs best. The mean average precision (mAP) values increase by 5.6% and 5.0% for the YOLOv4 model and the Faster-RCNN network, respectively. This shows that the proposed method can effectively improve object identification in foggy environments.
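The fog synthesis step described above relies on the atmospheric scattering model, commonly written as I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)), where J is the clear image, d the scene depth, A the atmospheric light, and β the scattering coefficient. The Python sketch below applies that standard formula to an image and a depth map. The function name `add_fog`, the default values of `beta` and `airlight`, and the assumption that pixel values lie in [0, 1] are illustrative only; the paper's depth estimation and entropy-weighted depth fusion are not reproduced here.

```python
import numpy as np


def add_fog(image, depth, beta=1.0, airlight=0.8):
    """Render a foggy image with the atmospheric scattering model.

    image: H x W x 3 array with values in [0, 1] (assumed clear-scene radiance J).
    depth: H x W array of scene depth d (units must match 1/beta).
    beta: scattering coefficient (assumed value, controls fog density).
    airlight: global atmospheric light A (assumed value).
    """
    image = np.asarray(image, dtype=float)
    depth = np.asarray(depth, dtype=float)
    # Transmission t = exp(-beta * d), broadcast over the color channels.
    transmission = np.exp(-beta * depth)[..., np.newaxis]
    # I = J * t + A * (1 - t): distant pixels fade toward the airlight.
    return image * transmission + airlight * (1.0 - transmission)
```

Pixels at zero depth are returned unchanged, while pixels at very large depth approach the airlight value, which is the qualitative behavior expected of fog that thickens with distance.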
Funding: Supported by the Shandong Provincial Natural Science Foundation, China (No. ZR2021YQ43), the National Natural Science Foundation of China (Nos. U1933135 and 61931021), and the Major Science and Technology Project of Shandong Province, China (No. 2019JZZY010415).
Abstract: As a classic deep learning target detection algorithm, Faster R-CNN (region convolutional neural network) has been widely used in high-resolution synthetic aperture radar (SAR) and inverse SAR (ISAR) image detection. However, for the most common low-resolution radar plane position indicator (PPI) images, it is difficult to achieve good performance. In this paper, taking navigation radar PPI images as an example, a marine target detection method based on the Marine-Faster R-CNN algorithm is proposed for the case of complex backgrounds (e.g., sea clutter) and target characteristics. The method performs feature extraction and target recognition with a convolutional neural network (CNN) on PPI images generated from radar echoes. First, to improve the accuracy of detecting marine targets and reduce the false alarm rate, Faster R-CNN was optimized into the Marine-Faster R-CNN in five respects: new backbone network, anchor size, dense target detection, data sample balance, and scale normalization. Then, a JRC (Japan Radio Co., Ltd.) navigation radar was used to collect echo data under different conditions to build a marine target dataset. Finally, comparisons with the classic Faster R-CNN method and the constant false alarm rate (CFAR) algorithm showed that the proposed method is more accurate and robust, has stronger generalization ability, and can be applied to the detection of marine targets for navigation radar. Its performance was tested with datasets from different observation conditions (sea states, radar parameters, and different targets).