Funding: the National Natural Science Foundation of China (No. 81830052), the Shanghai Natural Science Foundation of China (No. 20ZR1438300), and the Shanghai Science and Technology Support Project (No. 18441900500), China.
Abstract: To overcome the computational burden of processing three-dimensional (3D) medical scans and the lack of spatial information in two-dimensional (2D) medical scans, a novel segmentation method was proposed that integrates the segmentation results of three densely connected 2D convolutional neural networks (2D-CNNs). To combine low-level and high-level features, we added densely connected blocks to the network design so that low-level features are not lost as the number of layers increases during learning. Further, to address the blurred boundary of the glioma edema area, we superimposed and fused the T2-weighted fluid-attenuated inversion recovery (FLAIR) modality image and the T2-weighted (T2) modality image to enhance the edema region. For the training loss, we improved the cross-entropy loss function to effectively avoid network over-fitting. On the Multimodal Brain Tumor Image Segmentation Challenge (BraTS) datasets, our method achieves Dice similarity coefficient values of 0.84, 0.82, and 0.83 on the BraTS2018 training set; 0.82, 0.85, and 0.83 on the BraTS2018 validation set; and 0.81, 0.78, and 0.83 on the BraTS2013 testing set for whole tumor, tumor core, and enhancing core, respectively. Experimental results showed that the proposed method achieves promising accuracy with fast processing, demonstrating good potential for clinical medicine.
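As a concrete illustration of the two ideas above, the sketch below shows a densely connected 2D convolutional block and a soft Dice similarity coefficient in PyTorch; the channel counts, growth rate, and two-channel FLAIR+T2 input are assumptions made for the example, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class DenseBlock2D(nn.Module):
    """Each layer sees the concatenation of all earlier feature maps, so
    low-level features are carried forward as the network deepens."""
    def __init__(self, in_channels, growth_rate=16, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            ch = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth_rate, kernel_size=3, padding=1, bias=False)))

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))  # dense connectivity
        return torch.cat(feats, dim=1)

def soft_dice(pred, target, eps=1e-6):
    """Soft Dice similarity coefficient between a probability map and a binary mask."""
    inter = (pred * target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

x = torch.randn(1, 2, 64, 64)           # fused FLAIR + T2 slices as two input channels
out = DenseBlock2D(in_channels=2)(x)    # -> (1, 2 + 4 * 16, 64, 64)
```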
Funding: This research was fully supported by Universiti Teknologi PETRONAS under the Yayasan Universiti Teknologi PETRONAS (YUTP) Fundamental Research Grant Scheme (015LC0-311).
Abstract: Diabetes mellitus is a long-term condition characterized by hyperglycemia, and it can lead to many complications. Given rising morbidity in recent years, the number of diabetic patients worldwide is expected to exceed 642 million by 2040, implying that one in every ten persons will be diabetic. There is no doubt that this startling figure requires immediate attention from industry and academia to promote innovation and growth in diabetes risk prediction to save individuals' lives. Owing to its rapid development, deep learning (DL) has been used to predict numerous diseases. However, DL methods still suffer from limited prediction performance because of hyperparameter selection and parameter optimization. Therefore, the selection of hyperparameters is critical to improving classification performance. This study presents a convolutional neural network (CNN), an architecture that has achieved remarkable results in many medical domains, in which the Bayesian optimization algorithm (BOA) is employed for hyperparameter selection and parameter optimization. Two issues were investigated and solved during the experiments to enhance the results. The first is the dataset class imbalance, which is addressed using the Synthetic Minority Oversampling Technique (SMOTE). The second is the model's poor performance, which is addressed using the Bayesian optimization algorithm. The findings indicate that the Bayesian-optimized CNN model surpasses all the state-of-the-art models in the literature with an accuracy of 89.36%, an F1-score of 0.886, and a Matthews correlation coefficient (MCC) of 0.886.
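A minimal sketch of the SMOTE rebalancing step is given below, assuming the imbalanced-learn (imblearn) implementation and a toy eight-feature tabular dataset; the paper does not state which SMOTE implementation was used, and the BOA-tuned CNN itself is omitted here.

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

# Toy imbalanced tabular dataset standing in for the diabetes data:
# 8 clinical features per record, minority (diabetic) class under-represented.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = np.array([1] * 100 + [0] * 400)

print("before:", Counter(y))                        # Counter({0: 400, 1: 100})
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after: ", Counter(y_res))                    # both classes at 400 samples
```

A Bayesian search over the CNN hyperparameters (for example with scikit-optimize or Optuna as stand-ins for the paper's BOA) would then be fit on the rebalanced data.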
Funding: Funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Research Funding Program, Grant No. FRP-1443-15.
Abstract: The analysis of Android malware shows that this threat is constantly increasing and poses a real risk to mobile devices, since traditional approaches such as signature-based detection are no longer effective against its continuously advancing sophistication. To resolve this problem, efficient and flexible malware detection tools are needed. This work examines the possibility of employing deep CNNs to detect Android malware by transforming network traffic into image data representations. The dataset used in this study is CIC-AndMal2017, which contains 20,000 instances of network traffic across five distinct malware categories: Trojan, Adware, Ransomware, Spyware, and Worm. These network traffic features are converted into image format for deep learning, which is applied in a CNN framework including the VGG16 pre-trained model. Our approach yielded high performance, with an accuracy of 99.1%, a precision of 98.2%, a recall of 99.5%, and an F1 score of 98.7%. Subsequent improvements to the classification model through changes within the VGG19 framework raised the classification rate to 99.25%. These results make clear that CNNs are a very effective way to classify Android malware, providing greater accuracy than conventional techniques. The success of this approach also shows the applicability of deep learning to mobile security and points toward future work on real-time detection systems and deeper learning techniques to counter the increasing number of emerging threats.
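The sketch below illustrates the two steps described above, converting a traffic feature vector into a grayscale image and adapting an ImageNet-pretrained VGG16 to the five malware classes; the 28x28 image size, padding scheme, and torchvision (>= 0.13) weights API are assumptions made for the example.

```python
import numpy as np
import torch
import torch.nn as nn
from torchvision import models

def features_to_image(vec, side=28):
    """Zero-pad a traffic feature vector to side*side values, scale it to [0, 1],
    and reshape it into a single-channel grayscale image."""
    v = np.asarray(vec, dtype=np.float32)
    v = np.pad(v, (0, side * side - v.size))
    v = (v - v.min()) / (v.max() - v.min() + 1e-9)
    return v.reshape(side, side)

img = features_to_image(np.random.rand(80))                  # (28, 28) array

# Adapt an ImageNet-pretrained VGG16 to the five malware categories.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
vgg.classifier[6] = nn.Linear(4096, 5)                        # Trojan/Adware/Ransomware/Spyware/Worm

x = torch.from_numpy(img).repeat(3, 1, 1).unsqueeze(0)        # replicate to 3 channels, add batch dim
x = nn.functional.interpolate(x, size=(224, 224))             # VGG16 expects 224x224 inputs
logits = vgg(x)                                                # shape (1, 5)
```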
Funding: Science and Technology Support Plan Project of the Tianjin Municipal Science and Technology Commission (No. 15ZCZDNC00130).
Abstract: Image-based individual dairy cattle recognition has gained much attention recently. To further improve the accuracy of individual dairy cattle recognition, an algorithm based on a deep convolutional neural network (DCNN) is proposed in this paper, which enables automatic feature extraction and classification that outperforms traditional hand-crafted features. Through multi-group comparison experiments covering different numbers of network layers, different convolution kernel sizes, and different feature dimensions in the fully connected layer, we demonstrate that the proposed method is suitable for dairy cattle classification. The experimental results show that its accuracy is significantly higher than that of two traditional image processing algorithms: the scale-invariant feature transform (SIFT) algorithm and the bag-of-features (BOF) model.
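A hedged sketch of the kind of comparison experiment described above is shown below: a small configurable DCNN whose depth, kernel size, and fully connected width can be varied; the specific layer widths and the 10-class output are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

def make_dcnn(num_classes, num_conv=3, kernel_size=3, fc_dim=256, in_ch=3):
    """Small DCNN whose depth, kernel size and fully connected width can be varied
    for multi-group comparison experiments."""
    layers, ch = [], in_ch
    for i in range(num_conv):
        out_ch = 32 * (2 ** i)
        layers += [nn.Conv2d(ch, out_ch, kernel_size, padding=kernel_size // 2),
                   nn.ReLU(inplace=True),
                   nn.MaxPool2d(2)]
        ch = out_ch
    return nn.Sequential(*layers,
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(ch, fc_dim), nn.ReLU(inplace=True),
                         nn.Linear(fc_dim, num_classes))

# Compare two kernel-size configurations on a dummy batch of cattle images.
for k in (3, 5):
    model = make_dcnn(num_classes=10, num_conv=4, kernel_size=k, fc_dim=512)
    out = model(torch.randn(2, 3, 128, 128))
    print(k, out.shape)   # torch.Size([2, 10]) for both kernel sizes
```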
Funding: Supported by the West Light Foundation of the Chinese Academy of Sciences (2019-XBQNXZ-A-007) and the National Natural Science Foundation of China (12071458, 71731009).
Abstract: In recent years, deep convolutional neural networks have exhibited excellent performance in computer vision and have had a far-reaching impact. Traditional plant taxonomic identification requires high expertise and is time-consuming. Most nature reserves have problems such as incomplete species surveys, inaccurate taxonomic identification, and untimely updating of status data. Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology and exploring the best network model. Taking 24 typical desert plant species that are widely distributed in the nature reserves of Xinjiang Uygur Autonomous Region of China as the research objects, this study established an image database and selected the optimal network model for the image recognition of desert plant species, using deep learning to provide decision support for fine management in the nature reserves in Xinjiang, such as species investigation and monitoring. Since desert plant species are not included in public datasets, the images used in this study were mainly obtained through field shooting and downloaded from the Plant Photo Bank of China (PPBC). After sorting and statistical analysis, a total of 2331 plant images were collected (2071 from field collection and 260 from the PPBC), covering 24 plant species belonging to 14 families and 22 genera. A large number of numerical experiments were carried out to compare, from different perspectives, a series of 37 well-performing convolutional neural network models and to find the network model most suitable for the image recognition of desert plant species in Xinjiang. The results revealed 24 models with a recognition accuracy greater than 70.000%. Among them, RegNetX_8GF (Residual Network X_8GF) performed the best, with accuracy, precision, recall, and F1 (the harmonic mean of precision and recall) values of 78.33%, 77.65%, 69.55%, and 71.26%, respectively. Considering hardware requirements and inference time, MobileNetV2 (Mobile Network V2) achieves the best balance among accuracy, the number of parameters, and the number of floating-point operations: its parameter count is 1/16 that of RegNetX_8GF, and its floating-point operation count is 1/24. Our findings can facilitate efficient decision-making for the management of species survey, cataloging, inspection, and monitoring in the nature reserves in Xinjiang, providing a scientific basis for the protection and utilization of natural plant resources.
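As an illustration of the parameter-count comparison, the sketch below instantiates RegNetX_8GF and MobileNetV2 from torchvision (assumed here as the model source; torchvision >= 0.13 provides these constructors and the weights argument) and counts their parameters; the 24-way classification heads are added only to mirror the 24 species.

```python
import torch
from torchvision import models

def count_params(model):
    return sum(p.numel() for p in model.parameters())

# Randomly initialised backbones are enough for a size comparison.
regnet = models.regnet_x_8gf(weights=None)
mobilenet = models.mobilenet_v2(weights=None)

n_reg, n_mob = count_params(regnet), count_params(mobilenet)
print(f"RegNetX_8GF : {n_reg / 1e6:.1f} M parameters")
print(f"MobileNetV2 : {n_mob / 1e6:.1f} M parameters (ratio {n_reg / n_mob:.1f}x)")

# Heads would be replaced with a 24-way classifier for the 24 desert species, e.g.:
regnet.fc = torch.nn.Linear(regnet.fc.in_features, 24)
mobilenet.classifier[1] = torch.nn.Linear(mobilenet.classifier[1].in_features, 24)
```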
Funding: This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2018R1A2B6007333) and by a 2018 Research Grant from Kangwon National University.
Abstract: In this study, we examined the efficacy of a deep convolutional neural network (DCNN) in recognizing concrete surface images and predicting the compressive strength of concrete. A digital single-lens reflex (DSLR) camera and a microscope were simultaneously used to obtain the concrete surface images used as input data for the DCNN. Thereafter, training, validation, and testing of the DCNNs were performed on the DSLR camera and microscope image data. The results indicated that the DCNN employing DSLR image data achieved relatively higher accuracy. This was attributed to the relatively wider range of the DSLR camera, which was beneficial for extracting a larger number of features. Moreover, the DSLR camera procured more realistic images than the microscope. Thus, when the compressive strength of concrete was evaluated using the DCNN with a DSLR camera, time and cost were reduced while usefulness increased. Furthermore, an indirect comparison of the accuracy of the DCNN with that of existing non-destructive methods for evaluating concrete strength supported the reliability of the DCNN-derived strength predictions. In addition, the DCNN used for concrete strength evaluation in this study can be further extended to detect and evaluate various deteriorative factors that affect the durability of structures, such as salt damage, carbonation, sulfation, corrosion, and freezing-thawing.
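A minimal regression sketch in the spirit of the study is given below: a toy DCNN mapping a surface image to a single compressive strength value, trained with a mean-squared-error loss; all layer sizes and the MPa target range are assumptions made for the example.

```python
import torch
import torch.nn as nn

class StrengthRegressor(nn.Module):
    """Toy DCNN that maps a concrete surface image to one compressive strength value."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 1)   # regression output, e.g. in MPa

    def forward(self, x):
        return self.head(self.features(x)).squeeze(1)

model = StrengthRegressor()
images = torch.randn(8, 3, 224, 224)        # a batch of DSLR surface patches
targets = torch.rand(8) * 40 + 20            # dummy strengths, 20-60 MPa
loss = nn.functional.mse_loss(model(images), targets)
loss.backward()
```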
Funding: National Natural Science Foundation of China under Grant No. 61973037 and China Postdoctoral Science Foundation under Grant No. 2022M720419.
Abstract: Automatic modulation recognition (AMR) of radiation source signals is a research focus in the field of cognitive radio. However, AMR of radiation source signals at low SNRs still faces a great challenge. Therefore, an AMR method for radiation source signals based on a two-dimensional data matrix and an improved residual neural network is proposed in this paper. First, the time series of the radiation source signals are reconstructed into a two-dimensional data matrix, which greatly simplifies signal preprocessing. Second, a residual neural network based on depthwise convolutions and large convolutional kernels (DLRNet) is proposed to improve the feature extraction capability of the AMR model. Finally, the model performs feature extraction and classification on the two-dimensional data matrix to obtain the recognition vector that represents the signal modulation type. Theoretical analysis and simulation results show that the proposed method significantly improves AMR accuracy, and its recognition accuracy remains above 90% even at an SNR of -14 dB.
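The sketch below illustrates the two ingredients named above: reshaping a sampled signal into a 2D data matrix, and a residual block built from a large-kernel depthwise convolution followed by a pointwise convolution; the 32-row matrix, 16-channel stem, and 7x7 kernel are assumptions, not the DLRNet specification.

```python
import torch
import torch.nn as nn

def to_matrix(signal, rows=32):
    """Reshape a 1-D sampled signal into a (rows, cols) matrix, truncating the tail."""
    cols = signal.numel() // rows
    return signal[: rows * cols].reshape(rows, cols)

class DepthwiseResBlock(nn.Module):
    """Residual block: large-kernel depthwise convolution + pointwise convolution."""
    def __init__(self, channels, kernel_size=7):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size,
                      padding=kernel_size // 2, groups=channels),  # depthwise
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),                      # pointwise
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

sig = torch.randn(4096)                              # sampled radiation source signal
mat = to_matrix(sig).reshape(1, 1, 32, -1)           # (1, 1, 32, 128) network input
stem = nn.Conv2d(1, 16, kernel_size=3, padding=1)
feat = DepthwiseResBlock(channels=16)(stem(mat))     # residual features, same spatial size
```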
Abstract: Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying from visual information, primarily lip movements. In this study, we created a custom dataset for Indian English linguistics and organized the work into three main parts: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using mel-frequency cepstral coefficients, and classification was performed using a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was classified using a long short-term memory (LSTM) recurrent neural network. Finally, integration was performed using a deep convolutional network. Audio speech in Indian English was successfully recognized with training and testing accuracies of 93.67% and 91.53%, respectively, after 200 epochs. For visual speech recognition on the Indian English dataset, the training accuracy was 77.48% and the test accuracy was 76.19% after 60 epochs. After integration, the training and testing accuracies of audiovisual speech recognition on the Indian English dataset were 94.67% and 91.75%, respectively.
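A hedged sketch of the audio branch is shown below, using librosa for MFCC extraction (an assumption; the paper only states that MFCCs were used) and a minimal 1D CNN classifier; the synthetic tone, 13 coefficients, and 10-way output are placeholders.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

# One second of a synthetic 440 Hz tone stands in for a spoken-word recording.
sr = 16000
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr).astype(np.float32)

# MFCC front end: 13 coefficients per frame.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)           # shape (13, frames)

# Minimal 1-D CNN over the time axis, one input channel per MFCC coefficient.
clf = nn.Sequential(
    nn.Conv1d(13, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, 10),                                        # 10 word classes, assumed
)
logits = clf(torch.from_numpy(mfcc).float().unsqueeze(0))     # shape (1, 10)
```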
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 61771223) and the Key Research and Development Program of Jiangsu Province (Grant No. SBE2018334).
Abstract: In this paper, an improved two-dimensional convolutional neural network (2DCNN) is proposed to monitor and analyze elevator health, based on the distribution characteristics of elevator time series data in two-dimensional images. The current and active power signals from an elevator traction machine are collected to generate grayscale binary images. The improved 2DCNN is used to extract deep features from these images for classification, so as to recognize the elevator working conditions. Furthermore, an oscillation criterion is proposed to describe and analyze the active power oscillations. The current and active power are used to synchronously describe the working condition of the elevator, which captures the co-occurrence state and potential relationships in the elevator data. Based on the improved integration of local features of the time series, the recognition accuracy of the proposed 2DCNN is 97.78%, which is better than that of a one-dimensional convolutional neural network. This research can improve real-time monitoring and visual analysis for elevator maintenance personnel, as well as their work efficiency.
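The conversion from a traction-machine signal to a grayscale image can be sketched as below; the 64x64 image size, min-max scaling, and synthetic current waveform are assumptions for illustration, not the paper's exact encoding.

```python
import numpy as np

def series_to_gray_image(series, size=64):
    """Map a 1-D signal (e.g. traction-machine current or active power) onto a
    size x size grayscale image by reshaping and scaling it to 0-255."""
    s = np.asarray(series, dtype=np.float64)
    s = np.resize(s, size * size)                        # wrap or truncate to fill the image
    s = (s - s.min()) / (s.max() - s.min() + 1e-12)      # normalise to [0, 1]
    return (s * 255).astype(np.uint8).reshape(size, size)

t = np.linspace(0, 10, 5000)
current = 5 + np.sin(2 * np.pi * 1.5 * t) + 0.1 * np.random.randn(t.size)
img = series_to_gray_image(current)                      # (64, 64) uint8 input for the 2DCNN
```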
Funding: Supported in part by the National Key Research and Development Program of China (Grant No. 2019YFB2204300), in part by the National Natural Science Foundation of China (Grant Nos. 62334008 and 62274154), and in part by the Key Program of the National Natural Science Foundation of China (Grant No. 62134004).
Abstract: This paper describes a 2D/3D vision chip with integrated sensing and processing capabilities. The chip architecture includes a 2D/3D image sensor and a programmable visual processor. In this architecture, we design a novel on-chip processing flow with die-to-die image transmission and low-latency fixed-point image processing. The vision chip achieves real-time end-to-end processing of convolutional neural networks (CNNs) and conventional image processing algorithms. Furthermore, an end-to-end 2D/3D vision system is built to demonstrate the capability of the vision chip. The system runs real-time applications in 2D and 3D scenes, such as human face detection (processing delay 10.2 ms) and depth map reconstruction (processing delay 4.1 ms). The frame rate of image acquisition, image processing, and result display exceeds 30 fps.
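As a rough software model of the low-latency fixed-point processing mentioned above, the sketch below quantizes an image patch and a convolution kernel to signed fixed-point and performs one multiply-accumulate; the 16-bit word length and 8 fractional bits are assumptions, not the chip's actual number format.

```python
import numpy as np

def to_fixed_point(x, frac_bits=8, total_bits=16):
    """Quantise a float array to signed fixed-point with frac_bits fractional bits."""
    scale = 1 << frac_bits
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return np.clip(np.round(np.asarray(x) * scale), lo, hi).astype(np.int64)

def from_fixed_point(q, frac_bits=8):
    return np.asarray(q, dtype=np.float64) / (1 << frac_bits)

kernel = np.array([[0., -1., 0.], [-1., 4., -1.], [0., -1., 0.]])   # Laplacian edge filter
patch = np.random.rand(3, 3)

q_patch, q_kernel = to_fixed_point(patch), to_fixed_point(kernel)
acc = int((q_patch * q_kernel).sum())           # multiply-accumulate keeps 2*frac_bits fraction bits
approx = from_fixed_point(acc, frac_bits=16)    # rescale back to a float result
print(approx, (patch * kernel).sum())           # the two values agree to ~1e-2
```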
Funding: Supported by the National Key Research and Development Program of China (2019YFC1510301), the Key Innovation Team Fund of the China Meteorological Administration (CMA2022ZD10), and the Basic Research Fund of the Chinese Academy of Meteorological Sciences (2021Y010).
Abstract: The airborne two-dimensional stereo (2D-S) optical array probe has been operating for more than 10 years, accumulating a large amount of cloud particle image data. However, due to the lack of reliable and unbiased classification tools, our ability to extract meaningful morphological information related to cloud microphysical processes is limited. To solve this issue, we propose a novel classification algorithm for 2D-S cloud particle images based on a convolutional neural network (CNN), named CNN-2DS. A 2D-S cloud particle shape dataset was established using the 2D-S cloud particle images observed during 13 aircraft detection flights in 6 regions of China (Northeast, Northwest, North, East, Central, and South China). This dataset contains 33,300 cloud particle images covering 8 types of cloud particle shape (linear, sphere, dendrite, aggregate, graupel, plate, donut, and irregular). The CNN-2DS model was trained and tested on the established 2D-S dataset. Experimental results show that the CNN-2DS model can accurately identify cloud particles, with an average classification accuracy of 97%. Compared with other common classification models [e.g., Vision Transformer (ViT) and Residual Neural Network (ResNet)], the CNN-2DS model is lightweight (few parameters), fast in computation, and achieves the highest classification accuracy. In summary, the proposed CNN-2DS model is effective and reliable for the classification of cloud particles detected by the 2D-S probe.
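A lightweight 8-class classifier in the spirit of CNN-2DS is sketched below; the layer sizes, 64x64 input, and single-channel particle silhouettes are assumptions for illustration and are not taken from the paper.

```python
import torch
import torch.nn as nn

SHAPES = ["linear", "sphere", "dendrite", "aggregate",
          "graupel", "plate", "donut", "irregular"]

# Lightweight classifier: a few convolution stages and a small linear head.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, len(SHAPES)),
)

probe_images = torch.rand(4, 1, 64, 64)            # binary 2D-S particle silhouettes
probs = model(probe_images).softmax(dim=1)          # per-class probabilities, shape (4, 8)
print(sum(p.numel() for p in model.parameters()), "parameters")   # a few tens of thousands
```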
Funding: This project was financially supported by the Dalian Science and Technology Innovation Fund of China (No. 2019J11CY011) and the Science Fund for Creative Research Groups of NSFC (No. 51621064).
Abstract: As the basis of machine vision, biomimetic image sensing devices are the eyes of artificial intelligence. In recent years, with the development of two-dimensional (2D) materials, many new optoelectronic devices have been developed owing to their outstanding performance. However, there are still few sensing arrays based on 2D materials that achieve high imaging quality, due to the poor pixel uniformity caused by material defects and fabrication techniques. Here, we propose a 2D MoS₂ sensing array based on artificial neural network (ANN) learning. By equipping the MoS₂ sensing array with a "brain" (the ANN), the imaging quality can be effectively improved. In the tests, the relative standard deviation (RSD) between pixels decreased from about 34.3% to 6.2% and 5.49% after adjustment by the back-propagation (BP) and Elman neural networks, respectively. The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the image are improved by about 2.5 times, enabling re-recognition of the distorted image. This provides a feasible approach for applying 2D sensing arrays with integrated ANNs to achieve high-quality imaging.
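The uniformity and image-quality metrics quoted above can be computed as sketched below; the per-pixel gain correction stands in for the learned BP/Elman mapping, and the synthetic pixel responses are assumptions for illustration only.

```python
import numpy as np

def relative_std(pixels):
    """Relative standard deviation (RSD) across pixel responses, in percent."""
    return 100.0 * pixels.std() / pixels.mean()

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio between a reference image and a corrected one."""
    mse = np.mean((reference - test) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(1)
response = rng.normal(1.0, 0.3, size=(16, 16)).clip(0.3, 1.7)   # fixed per-pixel non-uniformity

flat = np.full((16, 16), 0.5)                  # uniform calibration illumination
gain = 1.0 / response                          # per-pixel correction; in the paper this mapping
                                               # is learned by the BP / Elman networks instead
scene = rng.random((16, 16))                   # arbitrary test image
raw = scene * response + 0.01 * rng.normal(size=scene.shape)
corrected = raw * gain

print(f"RSD  : {relative_std(flat * response):.1f}% -> {relative_std(flat * response * gain):.1f}%")
print(f"PSNR : {psnr(scene, raw):.1f} dB -> {psnr(scene, corrected):.1f} dB")
```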