In recent years, deep learning has been introduced into the field of single-pixel imaging (SPI), garnering significant attention. However, conventional networks still exhibit limitations in preserving image details. To address this issue, we integrate large-kernel convolution (LKconv) into the U-Net framework, proposing an enhanced network structure named the U-LKconv network, which significantly improves the capability to recover image details even under low-sampling conditions.
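The detail-preserving property claimed above comes from the wider spatial context a large kernel aggregates per output pixel. As a minimal pure-Python sketch (not the paper's implementation; `conv2d` and the 5×5 averaging kernel are illustrative assumptions), a single "valid" 2D convolution with a large kernel looks like this:

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image.

    A larger kernel aggregates a wider neighbourhood per output pixel,
    which is the property large-kernel convolution exploits for context.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw)
            )
    return out

# A 5x5 averaging kernel sees a 5x5 neighbourhood per output pixel,
# versus 3x3 for a standard small kernel.
img = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
big = [[1 / 25.0] * 5 for _ in range(5)]
smoothed = conv2d(img, big)  # 2x2 output map
```

In practice the paper's LKconv would be a learned layer inside U-Net; this only illustrates the receptive-field effect.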
The plenoptic imaging technique provides a promising approach to non-invasive three-dimensional measurement, especially for high-temperature combustion diagnosis. We establish a light-field convolution imaging model for diffusion flames in this work, jointly considering the radiation transfer process inside the diffusion flame and the light transfer process inside the focused plenoptic camera. The radiation transfer process is described by the radiation transfer equation and solved by the generalized source multi-flux method. Wave optics theory is adopted to describe the light transfer process, combining Fresnel diffraction with the phase conversion of the lens. The flame light-field image is obtained from the light-field convolution imaging model and used as the measurement signal to reconstruct the three-dimensional temperature field. The inverse problem of temperature reconstruction is solved by the least-squares QR decomposition method. Simulated temperature reconstruction is conducted, including inverse analysis, uncertainty analysis, and an assessment of the influence of measurement noise. All the results show that the proposed measurement method can reconstruct three-dimensional temperature with satisfactory accuracy and acceptable uncertainty. Both symmetric and asymmetric temperature distributions are investigated, and the reconstructed results prove the validity and universality of the measurement method.
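The inverse problem above is solved by a least-squares QR decomposition method (LSQR). As a hedged illustration of the underlying idea only — solving an overdetermined linear system in the least-squares sense via QR factorization, not the iterative LSQR algorithm the paper uses — a tiny pure-Python version might look like:

```python
def qr_least_squares(A, b):
    """Solve min ||Ax - b|| for a small full-rank system via
    classical Gram-Schmidt QR and back substitution."""
    m, n = len(A), len(A[0])
    # Gram-Schmidt: build orthonormal columns Q and upper-triangular R.
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]
        for k in range(j):
            R[k][j] = sum(Q[i][k] * A[i][j] for i in range(m))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = sum(x * x for x in v) ** 0.5
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    # Solve R x = Q^T b by back substitution.
    qtb = [sum(Q[i][j] * b[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for j in reversed(range(n)):
        x[j] = (qtb[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))) / R[j][j]
    return x

# Fit y = a + b*t to three samples (3 equations, 2 unknowns).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 3.0]
coeffs = qr_least_squares(A, b)  # exact fit here: a = 1, b = 1
```

In the temperature reconstruction, `A` would be the (much larger, ill-conditioned) imaging operator and `b` the light-field measurement, which is why an iterative, regularized solver is used in practice.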
The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques such as Equalize significantly enhance the model's classification capabilities, achieving an F1-score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, improvements over the original baseline results. Moreover, the weighted average F1-score across all classes and techniques is 0.9886, indicating an overall enhancement. Conversely, methods such as Distort decrease accuracy and F1-score, yielding an F1-score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
The convolutional neural network (CNN) with an encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability, but it faces limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational complexity can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining a CNN and a transformer and promoting feature interactions across various semantic levels. The core components of the proposed method are the cross-window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing voxel attention between different branch features, and uses convolution to strengthen dense local information. Experiments on prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as the Dice similarity coefficient, intersection over union, 95% Hausdorff distance, and average symmetric surface distance.
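The window partitioning step the CWST module relies on can be sketched in a few lines. This is a minimal illustration (function name and window convention are assumptions, not the paper's code): a volume is tiled into non-overlapping windows so attention is computed within and sparsely across windows rather than over every voxel pair.

```python
def partition_windows(shape, window):
    """Split a 3D volume of the given shape into non-overlapping windows,
    returning the (z, y, x) origin of each window. Attention over windows
    instead of all voxel pairs is what keeps the cost tractable."""
    assert all(s % w == 0 for s, w in zip(shape, window)), "shape must tile"
    D, H, W = shape
    d, h, w = window
    return [(z, y, x)
            for z in range(0, D, d)
            for y in range(0, H, h)
            for x in range(0, W, w)]

origins = partition_windows((8, 8, 8), (4, 4, 4))  # 2*2*2 = 8 windows
```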
In pressure-sensitive paint (PSP) measurements, image deblurring is essential because factors such as prolonged camera exposure times and high model velocities can lead to significant image blurring. Conventional deblurring methods applied to PSP images often suffer from limited accuracy and require extensive computational resources. To address these issues, this study proposes a deep learning-based approach tailored to PSP image deblurring. Because PSP applications primarily involve accurate pressure measurements of complex geometries, the images captured under such conditions exhibit distinctive non-uniform motion blur, presenting challenges for standard deep learning models built on convolutional or attention-based techniques. In this paper, we introduce a novel deblurring architecture featuring multiple DAAM (Deformable Ack Attention Module) blocks. These modules provide enhanced flexibility for end-to-end deblurring, leveraging irregular convolution operations for efficient feature extraction while employing attention mechanisms interpreted as multiple 1×1 convolutions, subsequently reassembled to enhance performance. Furthermore, we incorporate an RSC (Residual Shortcut Convolution) module for initial feature processing, aimed at reducing redundant computations and improving the learning capacity for representative shallow features. To preserve critical spatial information during upsampling and downsampling, we replace conventional convolutions with wt (Haar wavelet downsampling) and dysample (upsampling by dynamic sampling). This modification significantly enhances high-precision image reconstruction. By integrating these advanced modules within an encoder-decoder framework, we present DFDNet (Deformable Fusion Deblurring Network) for image blur removal, providing robust technical support for subsequent PSP data analysis. Experimental evaluations on the FY dataset demonstrate the superior performance of our model, which also achieves competitive results on the GOPRO and HIDE datasets.
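The Haar wavelet downsampling mentioned above halves spatial resolution without discarding detail: each 2×2 block becomes one low-frequency average plus three detail coefficients kept as extra channels. A minimal single-level sketch (illustrative, not the paper's module):

```python
def haar_downsample(image):
    """One level of 2D Haar downsampling. Each non-overlapping 2x2 block
    [a b; c d] is reduced to four coefficients:
      LL = (a+b+c+d)/2   low-frequency average,
      LH, HL, HH         horizontal / vertical / diagonal detail.
    Spatial size halves, but detail survives in extra channels instead of
    being discarded as in strided convolution."""
    H, W = len(image), len(image[0])
    LL = [[0.0] * (W // 2) for _ in range(H // 2)]
    LH = [[0.0] * (W // 2) for _ in range(H // 2)]
    HL = [[0.0] * (W // 2) for _ in range(H // 2)]
    HH = [[0.0] * (W // 2) for _ in range(H // 2)]
    for i in range(0, H, 2):
        for j in range(0, W, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            LL[i // 2][j // 2] = (a + b + c + d) / 2.0
            LH[i // 2][j // 2] = (a - b + c - d) / 2.0
            HL[i // 2][j // 2] = (a + b - c - d) / 2.0
            HH[i // 2][j // 2] = (a - b - c + d) / 2.0
    return LL, LH, HL, HH

LL, LH, HL, HH = haar_downsample([[1.0, 2.0], [3.0, 4.0]])
```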
In existing image manipulation localization methods, the receptive field of standard convolution is limited, and high-frequency information about manipulation traces is easily lost during feature transfer. In addition, during feature fusion, the use of fixed sampling kernels makes it difficult to focus on local changes in features, limiting localization accuracy. This paper proposes an image manipulation localization method based on dual-branch hybrid convolution. First, a dual-branch hybrid convolution module is designed to expand the receptive field of the model, enhancing the extraction of contextual semantic information while enabling the model to attend to the high-frequency detail features of manipulation traces when localizing the manipulated area. Second, a multiscale content-aware feature fusion module dynamically generates adaptive sampling kernels for each position in the feature map, enabling the model to focus on the details of local features while locating the manipulated area. Experimental results on multiple datasets show that this method not only effectively improves the accuracy of image manipulation localization but also enhances the robustness of the model.
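The "limited receptive field" problem above is easy to quantify: for stride-1 convolution stacks, each layer grows the receptive field by (kernel − 1) × dilation pixels. A short sketch of that arithmetic (the helper and layer configurations are illustrative assumptions, not the paper's architecture):

```python
def receptive_field(layers):
    """Receptive field of a stack of stride-1 conv layers, each given as
    (kernel_size, dilation). Each layer adds (k - 1) * dilation pixels."""
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

plain = receptive_field([(3, 1)] * 3)                # three plain 3x3 convs
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])  # dilated stack, same depth
```

Same depth and parameter count, but the dilated stack covers more than twice the context, which is the kind of gain a hybrid-convolution branch targets.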
In the management of land resources and the protection of cultivated land, law enforcement based on land satellite images is often used as one of the main means. In recent years, the associated policies and regulations have become stricter and have been adjusted increasingly frequently, playing a decisive role in preventing excessive non-agricultural and non-grain conversion during urbanization. In the law enforcement process, the extraction of suspected illegal buildings is the most important and time-consuming task. Compared with traditional deep learning models, fully convolutional networks (FCN) have a great advantage in remote sensing image processing because their input images are not limited in size, and both convolution and deconvolution are independent of the overall image size. In this paper, an intelligent extraction model for suspected illegal buildings in land satellite images was built based on a deep learning FCN, taking Kaiyuan City, Yunnan Province as an example. The verification results show that the global accuracy of this model in building extraction was 86.6%, and the mean intersection over union (mIoU) was 73.6%. This study can provide a reference for the extraction of suspected illegal buildings in satellite-image-based land law enforcement and reduce tedious manual operation to a certain extent.
The analysis of Android malware shows that this threat is constantly increasing and poses a real danger to mobile devices, since traditional approaches such as signature-based detection are no longer effective against the continuously advancing level of sophistication. To resolve this problem, efficient and flexible malware detection tools are needed. This work examines the possibility of employing deep CNNs to detect Android malware by transforming network traffic into image data representations. The dataset used in this study is CIC-AndMal2017, which contains 20,000 instances of network traffic across five distinct malware categories: Trojan, adware, ransomware, spyware, and worm. The network traffic features are converted to image format for deep learning and fed to a CNN framework, including the pre-trained VGG16 model. Our approach yielded high performance, with an accuracy of 99.1%, precision of 98.2%, recall of 99.5%, and an F1 score of 98.7%. Subsequent improvements to the classification model through changes within the VGG19 framework raised the classification rate to 99.25%. These results make clear that CNNs are a very effective way to classify Android malware, providing greater accuracy than conventional techniques. The success of this approach also shows the applicability of deep learning to mobile security, along with directions for future work on real-time detection systems and deeper learning techniques to counter the growing number of emerging threats.
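For reference, the precision, recall, and F1 figures reported above are all derived from the same detection counts. A minimal sketch of those definitions (the counts below are made up for illustration, not taken from the paper):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from raw detection counts:
    precision = TP/(TP+FP), recall = TP/(TP+FN),
    F1 = harmonic mean of the two."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for one malware class.
p, r, f1 = prf1(tp=80, fp=20, fn=20)
```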
Neural network methods have recently emerged as a hot topic in computed tomography (CT) imaging owing to their powerful fitting ability; however, their potential applications still need to be carefully studied because their results are often difficult to interpret and ambiguous in generalizability. Thus, quality assessments of the results obtained from a neural network are necessary to evaluate it. Assessing the image quality of neural networks using traditional objective measurements is not appropriate because neural networks are nonstationary and nonlinear. In contrast, subjective assessments are trustworthy, although they are time- and energy-consuming for radiologists. Model observers that mimic subjective assessment require the mean and covariance of images, which are calculated from numerous image samples; however, this has not yet been applied to the evaluation of neural networks. In this study, we propose an analytical method for noise propagation from a single projection to efficiently evaluate convolutional neural networks (CNNs) in the CT imaging field. We propagate noise through the nonlinear layers of a CNN using the Taylor expansion. Nesting the linear- and nonlinear-layer noise propagation constitutes the covariance estimation of the CNN. A commonly used U-net structure is adopted for validation. The results reveal that the covariance estimation obtained from the proposed analytical method agrees well with that obtained from image samples for different phantoms, noise levels, and activation functions, demonstrating that propagating noise from only a single projection is feasible for CNN methods in CT reconstruction. In addition, we use the covariance estimation to provide three measurements for the qualitative and quantitative performance evaluation of U-net. The results indicate that the network cannot be applied to projections with high noise levels and has limited efficiency in processing low-noise projections. U-net is more effective in improving the image quality of smooth regions than of edges. LeakyReLU outperforms Swish in terms of noise reduction.
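The Taylor-expansion step used above for nonlinear layers reduces, at first order, to the delta method: the output variance of a pointwise nonlinearity is the input variance scaled by the squared derivative at the mean. A minimal scalar sketch (illustrative only; the paper nests this layer by layer to build a full covariance estimate):

```python
def propagate_variance(f_prime, x_mean, var_in):
    """First-order (delta-method) noise propagation through a pointwise
    nonlinearity f: Var[f(x)] ~= f'(mean)^2 * Var[x]."""
    return (f_prime(x_mean) ** 2) * var_in

# Example: LeakyReLU with slope 0.1 on the negative side.
def leaky_relu_prime(x, slope=0.1):
    return 1.0 if x > 0 else slope

var_pos = propagate_variance(leaky_relu_prime, 2.0, 0.04)   # noise passes through
var_neg = propagate_variance(leaky_relu_prime, -2.0, 0.04)  # noise attenuated 100x
```

This is also why the choice of activation (LeakyReLU vs. Swish) changes the noise behaviour the abstract reports: different derivatives scale the propagated variance differently.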
Background: The main cause of breast cancer is the deterioration of malignant tumor cells in breast tissue. Early diagnosis of tumors has become the most effective way to prevent breast cancer. Method: To distinguish between tumor and non-tumor regions in MRI, a new computer-aided detection (CAD) system for breast tumors is designed in this paper. The CAD system was constructed using three networks, namely VGG16, Inception V3, and ResNet50. The influence of secondary transfer learning of the convolutional neural network on the experimental results was then further explored in the VGG16 system. Result: The CAD systems built on VGG16, Inception V3, and ResNet50 outperform mainstream CAD systems; among them, the systems built on VGG16 and ResNet50 perform outstandingly. Exploring the impact of secondary transfer learning in the VGG16 system shows that it can improve the performance of the proposed framework. Conclusion: The accuracy of the CNNs, represented by VGG16, is as high as 91.25%, more accurate than traditional machine learning models. The F1 scores of the three base networks with secondary transfer learning are close to 1.0, and the performance of the VGG16-based breast tumor CAD system is higher than those based on Inception V3 and ResNet50.
Imaging plates are widely used to detect alpha particle track information, and the number of alpha particle tracks is affected by the overlapping and fading effects of the track information. In this study, an experiment and a simulation were used to calibrate the efficiency parameter of an imaging plate, which was used to calculate grayscale values. Images created from these grayscale values were used to train a convolutional neural network to count the alpha tracks. The results demonstrate that the trained convolutional neural network can evaluate alpha track counts from the source and background images over a wider linear range, unaffected by the overlapping effect. The alpha track counts were unaffected by the fading effect within 60 min, and the calibrated formula for the fading effect was analyzed out to 132.7 min. The detection efficiency of the trained convolutional neural network for inhomogeneous ^(241)Am sources (2π emission) was 0.6050 ± 0.0399, whereas the efficiency curve of the photo-stimulated luminescence method was lower than that of the trained convolutional neural network.
A brain tumor significantly impacts quality of life and changes everything for a patient and their loved ones. Diagnosing a brain tumor usually begins with magnetic resonance imaging (MRI). Manual brain tumor diagnosis from MRI images always requires an expert radiologist; however, this process is time-consuming and costly. Therefore, a computerized technique is required for brain tumor detection in MRI images. Using MRI, a novel three-dimensional (3D) Kronecker convolution feature pyramid (KCFP) mechanism is used to segment brain tumors, resolving pixel loss and the weak processing of multi-scale lesions. A single dilation rate was replaced with the 3D Kronecker convolution, while local feature learning was performed using 3D feature selection (3DFSC). A 3D KCFP was added at the end of 3DFSC to resolve the weak processing of multi-scale lesions, yielding efficient segmentation of brain tumors of different sizes. 3D connected component analysis with a global threshold was used as a post-processing technique. The standard Multimodal Brain Tumor Segmentation 2020 dataset was used for model validation. Our 3D KCFP model performed exceptionally well compared with other benchmark schemes, with Dice similarity coefficients of 0.90, 0.80, and 0.84 for the whole tumor, enhancing tumor, and tumor core, respectively. Overall, the proposed model was efficient in brain tumor segmentation, which may help medical practitioners reach an appropriate diagnosis for future treatment planning.
Gait is an essential biometric feature that distinguishes individuals by the way they walk, which naturally motivates remote human recognition in security-sensitive visual monitoring applications. However, gait recognition at night still lacks sufficient accuracy, and several critical factors affect the performance of recognition algorithms. Therefore, a novel approach is proposed to automatically identify individuals from thermal infrared (TIR) images according to their gaits captured at night. This approach uses a new night gait network (NGaitNet) based on a similarity deep convolutional neural network (CNN) method to enhance gait recognition at night. First, the TIR image is represented via personal movements and enhanced body skeleton segments. Then, the state-space method with a Hough transform is used to extract gait features and obtain skeletal joint angles. These features are trained to identify the most discriminating gait patterns that indicate a change in human identity. To verify the proposed method, experimental results are reported using learning and validation curves visualized via Visdom. The proposed thermal infrared imaging night gait recognition (TIRNGaitNet) approach achieved the highest gait recognition accuracy rates (99.5%, 97.0%) under normal walking conditions on the Chinese Academy of Sciences Institute of Automation infrared night gait dataset (CASIA C) and the Donghua University thermal infrared night gait database (DHU night gait dataset). On the same datasets, the TIRNGaitNet approach records scores of (98.0%, 87.0%) under the slow walking condition and (94.0%, 86.0%) under the quick walking condition.
Indoor Wi-Fi localization of mobile devices plays an increasingly important role with the rapid growth of location-based services and Wi-Fi mobile devices. In this paper, a new method of constructing the channel state information (CSI) image is proposed to improve localization accuracy. Compared with previous methods of constructing the CSI image, the proposed CSI image contains more channel information, such as the angle of arrival (AoA), the time of arrival (ToA), and the amplitude. We construct three gray images from the phase differences of different antennas and the amplitudes of different subcarriers of one antenna, and then merge them to form one RGB image. The localization method has an offline stage and an online stage. In the offline stage, the composed three-channel RGB images at training locations are used to train a convolutional neural network (CNN), which has proved efficient in image recognition. In the online stage, images at test locations are fed to the well-trained CNN model, and the localization result is the weighted mean of the locations with the highest output values. The performance of the proposed method is verified by extensive experiments in a representative indoor environment.
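The merging step described above — three gray maps stacked into one RGB image — is mechanically simple. A minimal sketch (function name and channel ordering are illustrative assumptions; in the paper the three maps come from antenna phase differences and subcarrier amplitudes):

```python
def merge_to_rgb(gray1, gray2, gray3):
    """Stack three equally sized grayscale maps into one H x W image of
    (R, G, B) tuples, one source map per channel, so a single CNN input
    carries all three kinds of channel information at once."""
    H, W = len(gray1), len(gray1[0])
    return [[(gray1[i][j], gray2[i][j], gray3[i][j])
             for j in range(W)] for i in range(H)]

rgb = merge_to_rgb([[0.1]], [[0.5]], [[0.9]])  # 1x1 image
```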
With the continuous progress of the times and the development of technology, the rise of online social media has brought explosive growth of image data. As one of the main ways people communicate daily, images are widely used as a carrier of communication because of their rich content, intuitiveness, and other advantages. Image recognition based on convolutional neural networks was the first application in the field of image recognition: a series of operations such as image feature extraction, recognition, and convolution are used to identify and analyze different images. The rapid development of artificial intelligence has made machine learning increasingly important as a research field: algorithms learn from each piece of data and predict outcomes, which has become a key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but how to associate low-level information in the image with high-level image semantics is the key problem of image recognition. Previous work has provided many model algorithms, laying a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on the VGG16 model is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect each layer of neurons but uses only some nodes for connection. Although this method reduces computation time, the convolutional neural network model loses some useful feature information in the process of propagation and calculation; this paper therefore improves the model into a multi-level information fusion convolution calculation method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the number of channels. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database, and the CASIA Face Image Database.
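The VGG design choice mentioned above — stacking 3×3 filters instead of using one large kernel — can be motivated with simple parameter arithmetic. A short sketch (channel counts are illustrative assumptions): two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution but with fewer weights.

```python
def conv_params(k, c_in, c_out):
    """Number of weights in one k x k conv layer (bias ignored)."""
    return k * k * c_in * c_out

# Two stacked 3x3 convs vs. one 5x5 conv, both 64 channels in and out.
stacked = conv_params(3, 64, 64) * 2  # same 5x5 receptive field
single = conv_params(5, 64, 64)
```

The stacked form is cheaper and inserts an extra nonlinearity between the two layers, which is part of why deeper-but-thinner DCNNs became the norm.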
Liver tumor segmentation from computed tomography (CT) images is an essential task for the diagnosis and treatment of liver cancer. However, it is difficult owing to the variability of appearances, fuzzy boundaries, heterogeneous densities, and the shapes and sizes of lesions. In this paper, an automatic method based on convolutional neural networks (CNNs) is presented to segment lesions from CT images. The CNN is a deep learning model whose convolutional filters can learn hierarchical features from data. We compared the CNN model to popular machine learning algorithms: AdaBoost, random forests (RF), and the support vector machine (SVM). These classifiers were trained on handcrafted features comprising mean, variance, and contextual features. Experimental evaluation was performed on 30 portal-phase enhanced CT images using leave-one-out cross validation. The average Dice similarity coefficient (DSC), precision, and recall achieved were 80.06% ± 1.63%, 82.67% ± 1.43%, and 84.34% ± 1.61%, respectively. The results show that the CNN method performs better than the other methods and is promising for liver tumor segmentation.
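The Dice similarity coefficient reported above measures overlap between the predicted and reference masks, DSC = 2|A ∩ B| / (|A| + |B|). A minimal sketch on flat binary masks (illustrative helper, not the paper's evaluation code):

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flat binary masks."""
    a = {i for i, v in enumerate(pred) if v}
    b = {i for i, v in enumerate(truth) if v}
    if not a and not b:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])  # one voxel overlaps
```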
AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. To address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. Because this network ignores the correlations of corresponding positions among adjacent images in space, we further proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors and can provide doctors with more accurate diagnostic data.
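The Kappa coefficient used above is Cohen's kappa: agreement between the algorithm's labels and the manual annotations, corrected for chance agreement. A minimal sketch from a confusion matrix (the matrix values are made up for illustration):

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: method A,
    columns: method B): (observed - chance) / (1 - chance) agreement."""
    n = sum(sum(row) for row in confusion)
    po = sum(confusion[i][i] for i in range(len(confusion))) / n
    pe = sum(
        (sum(confusion[i]) / n) * (sum(row[i] for row in confusion) / n)
        for i in range(len(confusion))
    )
    return (po - pe) / (1 - pe)

kappa = cohens_kappa([[45, 5], [5, 45]])  # 90% raw agreement, 2 classes
```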
Although the convolutional neural network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a multi-scale fully convolutional network (MSFCN) with a multi-scale convolutional kernel as well as a channel attention block (CAB) and a global pooling module (GPM) in this paper to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend the MSFCN to three dimensions using 3D CNN, capable of harnessing each land cover category's time-series interactions from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves 60.366% on the WHDLD dataset and 75.127% on the GID dataset in terms of the mIoU index, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
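The mIoU index cited above averages, over classes, the ratio of intersection to union between predicted and reference label maps. A minimal sketch on flat label arrays (illustrative helper, not the paper's evaluation code):

```python
def mean_iou(pred, truth, num_classes):
    """Mean Intersection over Union across classes for flat label arrays.
    Classes absent from both prediction and truth are skipped."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```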
How to recognize targets with similar appearances from remote sensing images (RSIs) effectively and efficiently has become a big challenge. Recently, the convolutional neural network (CNN) has been preferred for target classification due to its powerful feature representation ability and better performance. However, the training and testing of CNNs mainly rely on a single machine, which has natural limitations and bottlenecks in processing RSIs due to limited hardware resources and huge time consumption. Besides, overfitting is a challenge for the CNN model due to the imbalance between the RSI data and the model structure: when a model is complex or the training data are relatively small, overfitting occurs and leads to poor predictive performance. To address these problems, a distributed CNN architecture for RSI target classification is proposed, which dramatically increases the training speed of the CNN and the system scalability, and improves the storage ability and processing efficiency for RSIs. Furthermore, a Bayesian regularization approach is utilized to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model, helping prevent overfitting and avoid the local optima caused by limited RSI training images or an inappropriate CNN structure. In addition, considering the efficiency of the Naïve Bayes classifier, a distributed Naïve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform best and increase recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSI target classification tasks.
A method based on multiple images captured under different light sources at different incident angles was developed to recognize the coal density range in this study.The innovation is that two new images were construc...A method based on multiple images captured under different light sources at different incident angles was developed to recognize the coal density range in this study.The innovation is that two new images were constructed based on images captured under four single light sources.Reconstruction image 1 was constructed by fusing greyscale versions of the original images into one image,and Reconstruction image2 was constructed based on the differences between the images captured under the different light sources.Subsequently,the four original images and two reconstructed images were input into the convolutional neural network AlexNet to recognize the density range in three cases:-1.5(clean coal) and+1.5 g/cm^(3)(non-clean coal);-1.8(non-gangue) and+1.8 g/cm^(3)(gangue);-1.5(clean coal),1.5-1.8(middlings),and+1.8 g/cm^(3)(gangue).The results show the following:(1) The reconstructed images,especially Reconstruction image 2,can effectively improve the recognition accuracy for the coal density range compared with images captured under single light source.(2) The recognition accuracies for gangue and non-gangue,clean coal and non-clean coal,and clean coal,middlings,and gangue reached88.44%,86.72% and 77.08%,respectively.(3) The recognition accuracy increases as the density moves further away from the boundary density.展开更多
Abstract: In recent years, deep learning has been introduced into the field of single-pixel imaging (SPI), garnering significant attention. However, conventional networks still exhibit limitations in preserving image details. To address this issue, we integrate Large Kernel Convolution (LKconv) into the U-Net framework, proposing an enhanced network structure named the U-LKconv network, which significantly enhances the capability to recover image details even under low-sampling conditions.
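The detail-recovery argument rests on large kernels enlarging the receptive field. As a rough sketch (the abstract does not state the kernel sizes used in U-LKconv; the stride-1, undilated setting below is an assumption), the effective receptive field of stacked convolutions can be computed as:

```python
def receptive_field(kernel_sizes):
    """Effective receptive field of a stack of convolutions
    (stride 1, no dilation): rf = 1 + sum(k - 1)."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

# One large 31x31 kernel reaches as far as fifteen stacked 3x3 layers.
single_large = receptive_field([31])
stacked_small = receptive_field([3] * 15)
```

A single 31×31 kernel matches the reach of fifteen stacked 3×3 layers in one step, which is the usual motivation for large-kernel designs.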
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51976044), the National Science and Technology Major Project (Grant No. 2017-V-0016-0069), and the Foundation for the Heilongjiang Touyan Innovation Team Program.
Abstract: The plenoptic imaging technique provides a promising approach to non-invasive three-dimensional measurement, especially for high-temperature combustion diagnosis. We establish a light-field convolution imaging model for a diffusion flame in this work, considering together the radiation transfer process inside the diffusion flame and the light transfer process inside the focused plenoptic camera. The radiation transfer process is described by the radiative transfer equation and solved by the generalized source multi-flux method. Wave optics theory is adopted to describe the light transfer process, combining Fresnel diffraction and the phase conversion of the lens. The flame light-field image is obtained from the light-field convolution imaging model and adopted as the measurement signal for reconstructing the three-dimensional temperature field. The inverse problem of temperature reconstruction is solved by the least-squares QR decomposition (LSQR) method. Simulative temperature reconstruction is conducted, including inverse analysis, uncertainty analysis, and the influence of measurement noise. All the results show that the proposed method can reconstruct the three-dimensional temperature with satisfactory accuracy and acceptable uncertainty. Both symmetric and asymmetric temperature distributions are investigated, and the reconstructed results prove the validity and universality of the measurement method.
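The temperature reconstruction step solves a linear(ised) system with the least-squares QR (LSQR) method. A minimal NumPy sketch of the inverse problem, using an invented toy forward matrix and np.linalg.lstsq as a stand-in for LSQR (both return the least-squares solution for this dense, noise-free toy system; all sizes and values here are for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy forward model: measured light-field signal b = A @ t, where each row of A
# integrates emission along one ray and t holds unknown per-voxel temperatures.
# This is a linearized stand-in; the paper's model involves radiative transfer.
n_rays, n_voxels = 40, 10
A = rng.random((n_rays, n_voxels))
t_true = 1200.0 + 600.0 * rng.random(n_voxels)  # temperatures in K
b = A @ t_true

# Least-squares solution of the overdetermined system (LSQR would give the
# same answer here, iteratively and matrix-free).
t_rec, *_ = np.linalg.lstsq(A, b, rcond=None)
err = np.max(np.abs(t_rec - t_true))
```

With noise-free synthetic data the reconstruction error is essentially zero; the paper's uncertainty analysis then quantifies how measurement noise degrades this.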
Abstract: The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. The study focuses primarily on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques such as Equalize significantly enhance the model's classification capabilities, achieving an F1-score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, improvements over the original baseline results. Moreover, the weighted average F1-score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods such as Distort lead to decreased accuracy, with an F1-score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates the adoption of DL methods in this domain for automation and improved results. The findings can benefit fields such as remote sensing, mineral exploration, and environmental monitoring by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
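The Equalize augmentation referenced above is plain histogram equalization. A NumPy re-implementation for 8-bit grayscale images (real augmentation pipelines would typically call PIL.ImageOps.equalize; this sketch is for illustration only):

```python
import numpy as np

def equalize(img):
    """Histogram-equalize an 8-bit grayscale image: remap grey levels so the
    cumulative distribution of the output is approximately linear."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                      # first nonzero CDF value
    scale = max(cdf[-1] - cdf_min, 1)
    lut = np.round(np.clip(cdf - cdf_min, 0, None) / scale * 255)
    return lut.astype(np.uint8)[img]

# A low-contrast image using only grey levels 0..15 gets stretched to 0..255.
dark = (np.arange(64, dtype=np.uint8) // 4).reshape(8, 8)
out = equalize(dark)
```

Equalization stretches a low-contrast thin-section image across the full intensity range, which is plausibly why it helps the classifier.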
Funding: National Key Research and Development Program of China, Grant/Award Number: 2018YFE0206900; China Postdoctoral Science Foundation, Grant/Award Number: 2023M731204; Open Project of the Key Laboratory for Quality Evaluation of Ultrasound Surgical Equipment of the National Medical Products Administration, Grant/Award Number: SMDTKL-2023-1-01; Hubei Province Key Research and Development Project, Grant/Award Number: 2023BCB007; CAAI-Huawei MindSpore Open Fund.
Abstract: Convolutional neural networks (CNNs) with the encoder-decoder structure are popular in medical image segmentation due to their excellent local feature extraction ability, but they face limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational complexity can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels. The core components of the proposed method are the cross-window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing voxel attention between different branch features, and uses convolution to strengthen dense local information. Experiments on prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as Dice similarity coefficient, intersection over union, 95% Hausdorff distance, and average symmetric surface distance.
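The CWST module's window partitioning can be illustrated independently of the attention weights themselves. A NumPy sketch of splitting a 3D volume into non-overlapping w×w×w windows and inverting the split (the exact tensor layout is an assumption; the paper does not give it):

```python
import numpy as np

def window_partition(vol, w):
    """Split a (D, H, W) volume into non-overlapping w*w*w windows,
    returning (num_windows, w**3) groups for window-wise attention."""
    D, H, W = vol.shape
    assert D % w == 0 and H % w == 0 and W % w == 0
    v = vol.reshape(D // w, w, H // w, w, W // w, w)
    v = v.transpose(0, 2, 4, 1, 3, 5)   # window indices first, offsets last
    return v.reshape(-1, w ** 3)

def window_reverse(wins, w, shape):
    """Exact inverse of window_partition."""
    D, H, W = shape
    v = wins.reshape(D // w, H // w, W // w, w, w, w)
    v = v.transpose(0, 3, 1, 4, 2, 5)
    return v.reshape(D, H, W)

vol = np.arange(4 * 4 * 4).reshape(4, 4, 4)
wins = window_partition(vol, 2)
```

Attention computed within each row of `wins` (and sparsely between rows) is what keeps the cost below full global self-attention.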
Funding: Supported by the National Natural Science Foundation of China (No. 12202476).
Abstract: In PSP (Pressure-Sensitive Paint) measurements, image deblurring is essential because factors such as prolonged camera exposure times and high model velocities can lead to significant image blurring. Conventional deblurring methods applied to PSP images often suffer from limited accuracy and require extensive computational resources. To address these issues, this study proposes a deep learning-based approach tailored for PSP image deblurring. Because PSP applications primarily involve accurate pressure measurements of complex geometries, the images captured under such conditions exhibit distinctive non-uniform motion blur, presenting challenges for standard deep learning models using convolutional or attention-based techniques. In this paper, we introduce a novel deblurring architecture featuring multiple DAAM (Deformable Ack Attention Module) blocks. These modules provide enhanced flexibility for end-to-end deblurring, leveraging irregular convolution operations for efficient feature extraction while employing attention mechanisms interpreted as multiple 1×1 convolutions, subsequently reassembled to enhance performance. Furthermore, we incorporate an RSC (Residual Shortcut Convolution) module for initial feature processing, aimed at reducing redundant computations and improving the learning capacity for representative shallow features. To preserve critical spatial information during upsampling and downsampling, we replace conventional convolutions with wt (Haar wavelet downsampling) and dysample (upsampling by dynamic sampling). This modification significantly enhances high-precision image reconstruction. By integrating these advanced modules within an encoder-decoder framework, we present DFDNet (Deformable Fusion Deblurring Network) for image blur removal, providing robust technical support for subsequent PSP data analysis. Experimental evaluations on the FY dataset demonstrate the superior performance of our model, which also achieves competitive results on the GOPRO and HIDE datasets.
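The wt (Haar wavelet downsampling) replacement for strided convolution is the standard one-level 2D Haar transform, which halves spatial resolution without discarding information. A NumPy sketch under the usual orthonormal (factor 1/2) convention:

```python
import numpy as np

def haar_downsample(x):
    """One-level 2D Haar transform as lossless downsampling: a (H, W) map
    becomes four (H/2, W/2) sub-bands (LL, LH, HL, HH) on a channel axis."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    return np.stack([(a + b + c + d) / 2.0,   # LL: local average
                     (a - b + c - d) / 2.0,   # LH: horizontal detail
                     (a + b - c - d) / 2.0,   # HL: vertical detail
                     (a - b - c + d) / 2.0])  # HH: diagonal detail

def haar_upsample(bands):
    """Exact inverse of haar_downsample."""
    ll, lh, hl, hh = bands
    H, W = ll.shape
    x = np.empty((2 * H, 2 * W))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

x = np.arange(64, dtype=float).reshape(8, 8)
bands = haar_downsample(x)
```

Unlike a strided convolution, the transform is exactly invertible, which is why it preserves the spatial detail the deblurring network needs.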
Funding: National Natural Science Foundation of China (61703363); Shanxi Provincial Basic Research Program (202403021221206); Key Project of Shanxi Provincial Strategic Research on Science and Technology (202304031401011); Funding Project for the Scientific Research Innovation Team on Data Mining and Industrial Intelligence Applications (YCXYTD-202402); Yuncheng University Research Project (YQ-2020021).
Abstract: In existing image manipulation localization methods, the receptive field of standard convolution is limited, and during feature transfer it is easy to lose high-frequency information about manipulation traces. In addition, during feature fusion, the use of fixed sampling kernels makes it difficult to focus on local changes in features, leading to limited localization accuracy. This paper proposes an image manipulation localization method based on dual-branch hybrid convolution. First, a dual-branch hybrid convolution module is designed to expand the receptive field of the model, enhancing the extraction of contextual semantic information while enabling the model to focus on the high-frequency detail features of manipulation traces when localizing the manipulated area. Second, a multi-scale content-aware feature fusion module dynamically generates adaptive sampling kernels for each position in the feature map, enabling the model to attend to the details of local features while locating the manipulated area. Experimental results on multiple datasets show that this method not only effectively improves the accuracy of image manipulation localization but also enhances the robustness of the model.
Abstract: In the management of land resources and the protection of cultivated land, law enforcement based on land satellite images is often used as one of the main means. In recent years, the relevant policies and regulations have become stricter and been adjusted increasingly frequently, playing a decisive role in preventing excessive non-agricultural and non-food urbanization. In the law enforcement process, the extraction of suspected illegal buildings is the most important and time-consuming task. Compared with traditional deep learning models, fully convolutional networks (FCNs) have a great advantage in remote sensing image processing because their input images are not limited by size, and both convolution and deconvolution are independent of the overall image size. In this paper, an intelligent model for extracting suspected illegal buildings from land satellite images, based on a deep learning FCN, was built, taking Kaiyuan City, Yunnan Province as an example. The verification results show that the global accuracy of this model was 86.6% in building extraction, and the mean intersection over union (mIoU) was 73.6%. This study can provide a reference for the extraction of suspected illegal buildings in satellite-image-based law enforcement and reduce tedious manual operation to a certain extent.
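The reported mIoU of 73.6% is the per-class intersection-over-union averaged over classes. A minimal NumPy sketch of the metric on toy label maps:

```python
import numpy as np

def mean_iou(pred, gt, n_classes):
    """Mean intersection-over-union: per-class IoU averaged over the classes
    present in either map (classes absent from both are skipped)."""
    ious = []
    for c in range(n_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

gt   = np.array([[0, 0, 1, 1]])   # ground truth: background / building
pred = np.array([[0, 1, 1, 1]])   # one background pixel mislabelled
```

Here class 0 scores 1/2 and class 1 scores 2/3, so mIoU is 7/12: a single mislabelled pixel hits both classes, which is why mIoU is stricter than global accuracy.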
Funding: Funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Research Funding Program, Grant No. FRP-1443-15.
Abstract: The analysis of Android malware shows that this threat is constantly increasing and poses a real risk to mobile devices, since traditional approaches such as signature-based detection are no longer effective against the continuously advancing level of sophistication. To resolve this problem, efficient and flexible malware detection tools are needed. This work examines the possibility of employing deep CNNs to detect Android malware by transforming network traffic into image representations. The dataset used in this study is CIC-AndMal2017, which contains 20,000 instances of network traffic across five distinct malware categories: Trojan, Adware, Ransomware, Spyware, and Worm. The network traffic features are converted to image formats for deep learning, applied in a CNN framework including the pre-trained VGG16 model. Our approach yielded high performance, achieving an accuracy of 99.1%, precision of 98.2%, recall of 99.5%, and an F1 score of 98.7%. Subsequent improvements to the classification model through changes within the VGG19 framework raised the classification rate to 99.25%. These results make clear that CNNs are a very effective way to classify Android malware, providing greater accuracy than conventional techniques. The success of this approach also shows the applicability of deep learning to mobile security, along with directions for future work on real-time detection systems and deeper learning techniques to counter the growing number of emerging threats.
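The traffic-to-image conversion is described only at a high level. One plausible mapping (the layout, image size, and normalisation below are assumptions for illustration, not the paper's method) packs each flow's feature vector into a square grayscale image a CNN can consume:

```python
import numpy as np

def traffic_to_image(features, side=8):
    """Pack a per-flow feature vector into a side x side 8-bit grayscale
    'image': zero-pad to side*side values, min-max normalise, scale to 0..255.
    (Hypothetical layout; the paper does not specify its exact mapping.)"""
    v = np.asarray(features, dtype=float)
    v = np.pad(v, (0, side * side - v.size))        # zero-pad to fill the grid
    lo, hi = v.min(), v.max()
    v = (v - lo) / (hi - lo if hi > lo else 1.0)    # min-max normalise to [0, 1]
    return (v * 255).round().astype(np.uint8).reshape(side, side)

# Toy flow features (packet sizes, durations, ...); values are invented.
img = traffic_to_image([3, 14, 159, 26, 53], side=8)
```

A stack of such images can then be fed to VGG16/VGG19 like any other single-channel (or replicated three-channel) input.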
Funding: Supported by the National Natural Science Foundation of China (Nos. 62031020 and 61771279).
Abstract: Neural network methods have recently emerged as a hot topic in computed tomography (CT) imaging owing to their powerful fitting ability; however, their potential applications still need to be carefully studied because their results are often difficult to interpret and ambiguous in generalizability. Thus, quality assessments of the results obtained from a neural network are necessary to evaluate it. Assessing the image quality of neural networks using traditional objective measurements is not appropriate because neural networks are nonstationary and nonlinear. In contrast, subjective assessments are trustworthy, although they are time- and energy-consuming for radiologists. Model observers that mimic subjective assessment require the mean and covariance of images, which are calculated from numerous image samples; however, this has not yet been applied to the evaluation of neural networks. In this study, we propose an analytical method for noise propagation from a single projection to efficiently evaluate convolutional neural networks (CNNs) in CT imaging. We propagate noise through the nonlinear layers of a CNN using the Taylor expansion. Nesting the linear- and nonlinear-layer noise propagation constitutes the covariance estimation of the CNN. A commonly used U-net structure is adopted for validation. The results reveal that the covariance estimation obtained from the proposed analytical method agrees well with that obtained from image samples for different phantoms, noise levels, and activation functions, demonstrating that propagating noise from only a single projection is feasible for CNN methods in CT reconstruction. In addition, we use the covariance estimation to provide three measurements for the qualitative and quantitative performance evaluation of U-net. The results indicate that the network cannot be applied to projections with high noise levels and has limited efficiency for processing low-noise projections. U-net is more effective in improving the image quality of smooth regions than of edges. LeakyReLU outperforms Swish in terms of noise reduction.
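The Taylor-expansion step through a nonlinear layer reduces, to first order, to scaling the input variance by the squared local derivative of the activation. A NumPy sketch for the two activations compared above; the per-pixel independence assumed here is a simplification of the paper's full covariance propagation:

```python
import numpy as np

def propagate_var(mu, var, act="leaky_relu", alpha=0.01):
    """First-order Taylor propagation of variance through an activation:
    var_out ~= f'(mu)**2 * var_in, evaluated at the mean mu."""
    if act == "leaky_relu":
        grad = np.where(mu > 0, 1.0, alpha)
    elif act == "swish":                      # f(x) = x * sigmoid(x)
        s = 1.0 / (1.0 + np.exp(-mu))
        grad = s + mu * s * (1.0 - s)         # derivative of x * sigmoid(x)
    else:
        raise ValueError(act)
    return grad ** 2 * var

mu = np.array([-2.0, 0.5, 3.0])               # pre-activation means
var_in = np.full(3, 0.04)                     # input pixel variances
out = propagate_var(mu, var_in)
```

Nesting this rule with exact propagation through the linear (convolution) layers yields the network-wide covariance estimate without drawing noise samples.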
Abstract: Background: The main cause of breast cancer is the deterioration of malignant tumor cells in breast tissue. Early diagnosis of tumors has become the most effective way to prevent breast cancer. Method: To distinguish between tumor and non-tumor in MRI, a new type of computer-aided detection (CAD) system for breast tumors is designed in this paper. The CAD system was constructed using three networks: VGG16, Inception V3, and ResNet50. Then, the influence of the second migration of the convolutional neural network on the experimental results was further explored in the VGG16 system. Result: The CAD systems built on VGG16, Inception V3, and ResNet50 perform better than mainstream CAD systems; among them, the systems built on VGG16 and ResNet50 have outstanding performance. We further explored the impact of the secondary migration on the experimental results in the VGG16 system, and these results show that the migration can improve the performance of the proposed framework. Conclusion: The accuracy of the CNN represented by VGG16 is as high as 91.25%, more accurate than traditional machine learning models. The F1 score of the three base networks with secondary migration is close to 1.0, and the performance of the VGG16-based breast tumor CAD system is higher than that of Inception V3 and ResNet50.
Funding: Supported by the Hunan Provincial Innovation Foundation for Postgraduates (No. QL20210228) and the National Natural Science Foundation of China (Nos. 12075112 and 12175102).
Abstract: Imaging plates are widely used to detect alpha-particle track information, and the number of alpha-particle tracks is affected by the overlapping and fading effects of the track information. In this study, an experiment and a simulation were used to calibrate the efficiency parameter of an imaging plate, which was used to calculate the grayscale. Images were created using the grayscale and used to train a convolutional neural network to count the alpha tracks. The results demonstrated that the trained convolutional neural network can evaluate the alpha track counts from the source and background images with a wider linear range, unaffected by the overlapping effect. The alpha track counts were unaffected by the fading effect within 60 min, and the calibrated formula for the fading effect was analyzed for 132.7 min. The detection efficiency of the trained convolutional neural network for inhomogeneous ²⁴¹Am sources (2π emission) was 0.6050 ± 0.0399, whereas the efficiency curve of the photo-stimulated luminescence method was lower than that of the trained convolutional neural network.
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090). In addition, it was funded by the National Center of Artificial Intelligence (NCAI), Higher Education Commission, Pakistan, Grant/Award Number: Grant 2(1064).
Abstract: A brain tumor significantly impacts the quality of life and changes everything for a patient and their loved ones. Diagnosing a brain tumor usually begins with magnetic resonance imaging (MRI). Manual brain tumor diagnosis from MRI images always requires an expert radiologist; however, this process is time-consuming and costly. Therefore, a computerized technique is required for brain tumor detection in MRI images. Using the MRI, a novel mechanism of the three-dimensional (3D) Kronecker convolution feature pyramid (KCFP) is used to segment brain tumors, resolving pixel loss and the weak processing of multi-scale lesions. A single dilation rate was replaced with the 3D Kronecker convolution, while local feature learning was performed using 3D feature selection (3DFSC). A 3D KCFP was added at the end of 3DFSC to resolve the weak processing of multi-scale lesions, yielding efficient segmentation of brain tumors of different sizes. A 3D connected component analysis with a global threshold was used as a post-processing technique. The standard Multimodal Brain Tumor Segmentation 2020 dataset was used for model validation. Our 3D KCFP model performed exceptionally well compared with other benchmark schemes, with a Dice similarity coefficient of 0.90, 0.80, and 0.84 for the whole tumor, enhancing tumor, and tumor core, respectively. Overall, the proposed model was efficient in brain tumor segmentation, which may help medical practitioners make an appropriate diagnosis for future treatment planning.
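The Dice similarity coefficient used for validation compares two binary masks. A minimal NumPy version:

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks:
    2 * |intersection| / (|pred| + |gt|), in [0, 1]."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

a = np.array([1, 1, 0, 0])   # predicted tumor mask (flattened)
b = np.array([1, 0, 0, 0])   # ground-truth mask
```

Here dice(a, b) = 2·1/(2+1) = 2/3; the paper's 0.90/0.80/0.84 scores are this quantity computed on the whole-tumor, enhancing-tumor, and tumor-core masks.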
Abstract: Gait is an essential biometric feature that distinguishes individuals through walking, which naturally motivates remote human recognition in security-sensitive visual monitoring applications. However, gait recognition at night still lacks sufficient accuracy, and several critical factors affect the performance of recognition algorithms. Therefore, a novel approach is proposed to automatically identify individuals from thermal infrared (TIR) images according to their gaits captured at night. This approach uses a new night gait network (NGaitNet) based on a similarity deep convolutional neural network (CNN) method to enhance gait recognition at night. First, the TIR image is represented via personal movements and enhanced body skeleton segments. Then, the state-space method with a Hough transform is used to extract gait features and obtain skeletal joint angles. These features are trained to identify the most discriminating gait patterns that indicate a change in human identity. To verify the proposed method, experimental results are presented using learning and validation curves via the Visdom website. The proposed thermal infrared imaging night gait recognition (TIRNGaitNet) approach achieved the highest gait recognition accuracy rates (99.5%, 97.0%) under normal walking conditions on the Chinese Academy of Sciences Institute of Automation infrared night gait dataset (CASIA C) and the Donghua University thermal infrared night gait database (DHU night gait dataset). On the same datasets, the TIRNGaitNet approach achieved record scores of (98.0%, 87.0%) under the slow walking condition and (94.0%, 86.0%) under the quick walking condition.
Funding: Supported by the National Natural Science Foundation of China (No. 61631013), the National Key Basic Research Program of China (973 Program) (No. 2013CB329002), and the National Major Project (No. 2018ZX03001006003).
Abstract: Indoor Wi-Fi localization of mobile devices plays an increasingly important role with the rapid growth of location-based services and Wi-Fi mobile devices. In this paper, a new method of constructing the channel state information (CSI) image is proposed to improve localization accuracy. Compared with previous methods of constructing the CSI image, the new kind of CSI image can contain more channel information, such as the angle of arrival (AoA), the time of arrival (ToA), and the amplitude. We construct three gray images using the phase differences of different antennas and the amplitudes of different subcarriers of one antenna, and then merge them to form one RGB image. The localization method has an off-line stage and an on-line stage. In the off-line stage, the composed three-channel RGB images at training locations are used to train a convolutional neural network (CNN), which has been proved efficient in image recognition. In the on-line stage, images at test locations are fed to the well-trained CNN model, and the localization result is the weighted mean of the locations with the highest output values. The performance of the proposed method is verified with extensive experiments in a representative indoor environment.
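The three-channel CSI image construction can be sketched as follows: two gray channels from inter-antenna phase differences (which carry AoA information) and one from the subcarrier amplitudes of a single antenna. The single-packet, 1-D layout below is a simplification for illustration; real CSI images stack many packets into a 2-D grid:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy CSI: 3 antennas x 30 subcarriers of complex channel responses.
csi = rng.standard_normal((3, 30)) + 1j * rng.standard_normal((3, 30))

def to_gray(x):
    """Min-max normalise a real-valued map to the 0..255 grayscale range."""
    x = x - x.min()
    return np.round(255 * x / max(x.max(), 1e-12)).astype(np.uint8)

# R, G: phase differences of antennas 1 and 2 relative to antenna 0
# (angle of the conjugate product cancels the common phase offset).
ch_r = to_gray(np.angle(csi[1] * np.conj(csi[0])))
ch_g = to_gray(np.angle(csi[2] * np.conj(csi[0])))
# B: per-subcarrier amplitudes of antenna 0.
ch_b = to_gray(np.abs(csi[0]))
rgb = np.stack([ch_r, ch_g, ch_b], axis=-1)
```

Stacking such rows over consecutive packets yields the RGB image fed to the CNN for fingerprint-style localization.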
Abstract: With the continuous progress of the times and the development of technology, the rise of online social media has brought explosive growth of image data. As one of the main means of people's daily communication, images are widely used as a carrier of communication because of their rich content and intuitive nature. Image recognition based on convolutional neural networks was among the earliest applications in the field: a series of operations such as image feature extraction, recognition, and convolution are used to identify and analyze different images. The rapid development of artificial intelligence makes machine learning increasingly important in this research field, using algorithms to learn from each piece of data and predict the outcome; this has become an important key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but how to associate the low-level information in an image with high-level image semantics is the key problem of image recognition. Predecessors have provided many model algorithms, laying a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on VGG16 is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect each layer of neurons but connects only some nodes. Although this reduces computation time, the convolutional neural network model loses some useful feature information in the process of propagation and calculation; this paper therefore improves the model into a multi-level information fusion convolution calculation method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the channel number. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database, and the CASIA Face Image Database.
Abstract: Liver tumor segmentation from computed tomography (CT) images is an essential task for the diagnosis and treatment of liver cancer. However, it is difficult owing to the variability of appearances, fuzzy boundaries, heterogeneous densities, and the shapes and sizes of lesions. In this paper, an automatic method based on convolutional neural networks (CNNs) is presented to segment lesions from CT images. The CNN is a deep learning model whose convolutional filters can learn hierarchical features from data. We compared the CNN model with popular machine learning algorithms: AdaBoost, random forests (RF), and support vector machines (SVM). These classifiers were trained on handcrafted features containing mean, variance, and contextual features. Experimental evaluation was performed on 30 portal-phase enhanced CT images using leave-one-out cross validation. The average Dice similarity coefficient (DSC), precision, and recall achieved were 80.06% ± 1.63%, 82.67% ± 1.43%, and 84.34% ± 1.61%, respectively. The results show that the CNN method performs better than the other methods and is promising for liver tumor segmentation.
Funding: Supported by the National Science Foundation of China (No. 81800878); the Interdisciplinary Program of Shanghai Jiao Tong University (No. YG2017QN24); the Key Technological Research Projects of Songjiang District (No. 18sjkjgg24); and the Bethune Langmu Ophthalmological Research Fund for Young and Middle-aged People (No. BJ-LM2018002J).
Abstract: AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. In order to address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. For this network, the correlations of corresponding positions among adjacent images in space are ignored; thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors and can provide doctors with more accurate diagnostic data.
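One common way to modify the loss function for category imbalance, in the spirit of the approach above (the paper does not state its exact weighting scheme, so the inverse-frequency weights below are an assumption), is to weight each class inversely to its pixel frequency:

```python
import numpy as np

def class_weights(labels, n_classes):
    """Inverse-frequency class weights, normalised to mean 1, for a weighted
    cross-entropy loss: rare classes (e.g. fluid pixels) get larger weights."""
    counts = np.bincount(labels.ravel(), minlength=n_classes).astype(float)
    w = counts.sum() / (n_classes * np.maximum(counts, 1.0))
    return w / w.mean()

# Toy OCT label map: 90% background pixels, 10% fluid pixels.
labels = np.array([0] * 90 + [1] * 10)
w = class_weights(labels, 2)
```

Plugging such weights into the per-class cross-entropy terms keeps the abundant background class from dominating the gradient.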
Funding: Supported by the National Natural Science Foundation of China (grant number 41671452).
Abstract: Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) in this paper to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we expand our MSFCN to three dimensions using three-dimensional (3D) CNN, capable of harnessing each land cover category's time-series interactions from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves 60.366% on the WHDLD dataset and 75.127% on the GID dataset in terms of the mIoU index, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Funding: Supported by the National Natural Science Foundation of China (U1435220).
Abstract: How to recognize targets with similar appearances from remote sensing images (RSIs) effectively and efficiently has become a big challenge. Recently, the convolutional neural network (CNN) has been preferred for target classification due to its powerful feature representation ability and better performance. However, the training and testing of CNNs mainly rely on a single machine, which has natural limitations and bottlenecks in processing RSIs due to limited hardware resources and huge time consumption. Besides, overfitting is a challenge for the CNN model due to the imbalance between the RSI data and the model structure: when a model is complex or the training data are relatively small, overfitting occurs and leads to poor predictive performance. To address these problems, a distributed CNN architecture for RSI target classification is proposed, which dramatically increases the training speed of the CNN and the system scalability, and improves the storage ability and processing efficiency for RSIs. Furthermore, a Bayesian regularization approach is utilized to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model, helping to prevent overfitting and avoid the local optima caused by limited RSI training images or an inappropriate CNN structure. In addition, considering the efficiency of the Naïve Bayes classifier, a distributed Naïve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform the best and increase the recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSI target classification tasks.
Abstract: A method based on multiple images captured under different light sources at different incident angles was developed to recognize the coal density range in this study. The innovation is that two new images were constructed from images captured under four single light sources: Reconstruction image 1 was constructed by fusing grayscale versions of the original images into one image, and Reconstruction image 2 was constructed from the differences between the images captured under the different light sources. Subsequently, the four original images and two reconstructed images were input into the convolutional neural network AlexNet to recognize the density range in three cases: -1.5 (clean coal) and +1.5 g/cm³ (non-clean coal); -1.8 (non-gangue) and +1.8 g/cm³ (gangue); -1.5 (clean coal), 1.5-1.8 (middlings), and +1.8 g/cm³ (gangue). The results show the following: (1) The reconstructed images, especially Reconstruction image 2, can effectively improve the recognition accuracy for the coal density range compared with images captured under a single light source. (2) The recognition accuracies for gangue and non-gangue, clean coal and non-clean coal, and clean coal, middlings, and gangue reached 88.44%, 86.72%, and 77.08%, respectively. (3) The recognition accuracy increases as the density moves further away from the boundary density.
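The two reconstructed images can be sketched in NumPy. The fusion and difference operators below (mean fusion, mean absolute pairwise difference) are assumptions for illustration; the abstract specifies only that image 1 fuses the grayscale originals and image 2 is built from their differences:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins for grayscale images of the same coal sample captured under
# four light sources at different incident angles.
imgs = [rng.integers(0, 256, size=(16, 16)).astype(float) for _ in range(4)]

# Reconstruction image 1: fuse the four grayscale images into one
# (mean fusion is an assumed operator; the abstract does not specify it).
recon1 = np.mean(imgs, axis=0)

# Reconstruction image 2: built from differences between light sources,
# here the mean absolute difference over all 6 image pairs (also assumed).
pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
recon2 = np.mean([np.abs(imgs[i] - imgs[j]) for i, j in pairs], axis=0)
```

The four originals plus these two reconstructions would then form the six-image input set fed to AlexNet.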