Abstract: In low-light environments, face recognition faces challenges such as low image quality and blurred features, making it difficult for existing methods to extract robust, discriminative features and severely degrading recognition performance. To address this, a novel unpaired low-light face recognition model, LFSepNet (low-light face separation network), is proposed. Unlike conventional training approaches based on convolutional neural network (CNN) architectures, LFSepNet adopts a Transformer architecture, which captures long-range dependencies more effectively and thus overcomes the limitation of CNNs' local receptive fields. Because face images in low-light environments tend to be dark overall, with only a few regions containing rich illumination information, conventional CNNs are easily confined to local regions during feature extraction and struggle to exploit this key information. In contrast, the Transformer models global information through self-attention, allowing the network to integrate information from unevenly lit face images more comprehensively and thereby improving both feature disentanglement and low-light recognition accuracy. LFSepNet includes an adaptive brightness-separation module and an adaptive illumination-gap loss, which dynamically separate face and illumination features to reduce illumination interference and further refine the feature separation, enabling the model to extract more precise and robust features. Experimental results show that LFSepNet outperforms existing methods on multiple low-light face datasets, with particularly large accuracy gains under extreme low-light conditions. This work provides an effective unpaired solution for low-light face recognition and shows strong potential in practical applications.
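A simplified, Retinex-style sketch of the brightness-separation idea follows; this is a hand-crafted stand-in for LFSepNet's learned adaptive brightness-separation module, not the actual network. It splits a greyscale image into a smooth illumination map (local mean brightness) and an illumination-normalized log-reflectance component:

```python
import numpy as np

def separate_illumination(img, k=7):
    """Retinex-style split of a greyscale face image into a smooth
    illumination map and an illumination-normalized (log-reflectance)
    component: a hand-crafted stand-in for the learned module."""
    img = img.astype(np.float64) + 1e-6  # avoid log(0)
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    illum = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            illum += padded[dy:dy + h, dx:dx + w]
    illum /= k * k  # local mean brightness as the illumination estimate
    reflect = np.log(img) - np.log(illum)
    return illum, reflect
```

On a uniformly lit image the reflectance component is zero everywhere, i.e. all variation is attributed to illumination; the learned module in the paper plays an analogous role with far richer features.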
Abstract: A method based on multiple images captured under different light sources at different incident angles was developed in this study to recognize the coal density range. The innovation is that two new images were constructed from images captured under four single light sources. Reconstruction image 1 was constructed by fusing greyscale versions of the original images into one image, and Reconstruction image 2 was constructed from the differences between the images captured under the different light sources. Subsequently, the four original images and the two reconstructed images were input into the convolutional neural network AlexNet to recognize the density range in three cases: -1.5 (clean coal) and +1.5 g/cm³ (non-clean coal); -1.8 (non-gangue) and +1.8 g/cm³ (gangue); -1.5 (clean coal), 1.5-1.8 (middlings), and +1.8 g/cm³ (gangue). The results show the following: (1) The reconstructed images, especially Reconstruction image 2, effectively improve the recognition accuracy for the coal density range compared with images captured under a single light source. (2) The recognition accuracies for gangue vs. non-gangue, clean coal vs. non-clean coal, and clean coal vs. middlings vs. gangue reached 88.44%, 86.72%, and 77.08%, respectively. (3) The recognition accuracy increases as the density moves further away from the boundary density.
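The two reconstructions can be sketched in a few lines of numpy. The abstract does not specify the exact fusion and difference rules, so pixel-wise averaging of the greyscale images and the mean absolute difference over all image pairs are assumed here as plausible stand-ins:

```python
import numpy as np

def reconstruction_image_1(gray_images):
    """Fuse the greyscale single-light-source images into one image.
    Pixel-wise averaging is assumed; the paper's exact rule may differ."""
    return np.stack([g.astype(np.float64) for g in gray_images]).mean(axis=0)

def reconstruction_image_2(gray_images):
    """Difference-based reconstruction: mean absolute difference over all
    pairs of single-light-source images (again an assumed rule)."""
    imgs = [g.astype(np.float64) for g in gray_images]
    n = len(imgs)
    diffs = [np.abs(imgs[i] - imgs[j]) for i in range(n) for j in range(i + 1, n)]
    return np.mean(diffs, axis=0)
```

Both reconstructions keep the spatial layout of the originals, so they can be stacked with the four source images as extra input channels for AlexNet.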
Funding: Supported in part by the National Natural Science Foundation of China under Grant 41505017.
Abstract: Multi-source information can be obtained by fusing infrared and visible-light images, which carry complementary information. However, existing fusion methods suffer from blurred edges, low contrast, and loss of detail. Based on convolutional sparse representation and an improved pulse-coupled neural network, this paper proposes an image fusion algorithm that decomposes the source images into high-frequency and low-frequency subbands with the non-subsampled shearlet transform (NSST). The low-frequency subbands are then fused by convolutional sparse representation (CSR), and the high-frequency subbands by an improved pulse-coupled neural network (IPCNN) algorithm, which effectively sidesteps the difficulty of setting the parameters of the traditional PCNN while improving the performance of sparse representation with detail injection. The results show that the proposed method outperforms existing mainstream fusion algorithms in both visual quality and objective metrics.
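As an illustration of the two-scale fusion idea, here is a deliberately simplified numpy sketch: a box filter stands in for the NSST low-pass stage, averaging stands in for CSR fusion of the low-frequency subbands, and a max-absolute rule stands in for the IPCNN fusion of the high-frequency subbands. None of these is the paper's actual operator:

```python
import numpy as np

def box_blur(img, k=5):
    # Box filter: a crude stand-in for the NSST low-pass decomposition.
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def fuse(img_a, img_b, k=5):
    a, b = img_a.astype(np.float64), img_b.astype(np.float64)
    low_a, low_b = box_blur(a, k), box_blur(b, k)
    high_a, high_b = a - low_a, b - low_b
    low = 0.5 * (low_a + low_b)  # averaging stands in for CSR fusion
    # max-absolute selection stands in for the IPCNN fusion rule
    high = np.where(np.abs(high_a) >= np.abs(high_b), high_a, high_b)
    return low + high
```

The structure (decompose, fuse each scale with its own rule, recombine) is the part that carries over to the real NSST/CSR/IPCNN pipeline.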
Funding: Supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant Nos. 2023AH040149 and 2024AH051915), the Anhui Provincial Natural Science Foundation (Grant No. 2208085MF168), the Science and Technology Innovation Tackle Plan Project of Maanshan (Grant No. 2024RGZN001), and the Scientific Research Fund Project of Anhui Medical University (Grant No. 2023xkj122).
Abstract: Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they easily lose contours and textures in segmentation results. The transformer model, in contrast, effectively captures long-range dependencies in an image, and combining the CNN and the transformer can extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and to alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which adaptive average pooling and position encoding enhance contextual features, and multi-head attention is then introduced to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly with respect to preserving contours and textures.
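The multi-head attention used in the third component follows the standard scaled dot-product formulation; a minimal numpy sketch (with hypothetical projection matrices wq, wk, wv, which in the network would be learned) is:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, wq, wk, wv, n_heads):
    """tokens: (N, d); wq/wk/wv: (d, d) projection matrices.
    The feature dimension d must be divisible by n_heads."""
    n, d = tokens.shape
    dh = d // n_heads
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    heads = []
    for h in range(n_heads):
        sl = slice(h * dh, (h + 1) * dh)
        # scaled dot-product attention within one head
        attn = softmax(q[:, sl] @ k[:, sl].T / np.sqrt(dh), axis=-1)
        heads.append(attn @ v[:, sl])
    return np.concatenate(heads, axis=1)
```

Each token attends to every other token, which is exactly the global-context property the abstract contrasts with the CNN's local receptive field.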
Funding: Funded by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney; supported by the Research Funding Program, King Saud University, Riyadh, Saudi Arabia, under the Ongoing Research Funding program (ORF-2025-14).
Abstract: Diabetic retinopathy (DR) is a critical disorder of the retina and, with the constant rise in diabetes, remains a major cause of blindness across the world. Early detection and timely treatment are essential to mitigate the effects of DR, such as retinal damage and vision impairment. Several conventional approaches have been proposed to detect DR early and accurately, but they are limited by data imbalance, interpretability, overfitting, convergence time, and other issues. To address these drawbacks and improve DR detection accuracy, a distributed explainable convolutional neural network-enabled Light Gradient Boosting Machine (DE-ExLNN) is proposed in this research. The model combines an explainable convolutional neural network (CNN) and the Light Gradient Boosting Machine (LightGBM), achieving highly accurate outcomes in DR detection. LightGBM serves as the detection model, and the inclusion of an explainable CNN addresses issues that conventional CNN classifiers could not resolve. A custom dataset was created for this research, containing both fundus and OCTA images collected from a real-time environment, providing more accurate results than standard conventional DR datasets. The custom dataset demonstrates notable accuracy, sensitivity, specificity, and Matthews correlation coefficient (MCC) scores, underscoring the effectiveness of this approach. Evaluations against other standard datasets achieved an accuracy of 93.94%, sensitivity of 93.90%, specificity of 93.99%, and MCC of 93.88% for fundus images. For OCTA images, the results were an accuracy of 95.30%, sensitivity of 95.50%, specificity of 95.09%, and MCC of 95%. The results prove that the combination of explainable CNN and LightGBM outperforms other methods. The inclusion of distributed learning enhances the model's efficiency by reducing time consumption and complexity while facilitating feature extraction.
Funding: Supported by the Fundamental Public Welfare Research Program of Zhejiang Provincial Natural Science Foundation, China (LGN18C140007 and Y20C140024), the National High Technology Research and Development Program of China (863 Program, 2013AA102402), and the Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences.
Abstract: Monitoring pest populations in paddy fields is important for effectively implementing integrated pest management. Light traps are widely used to monitor field pests all over the world. Most conventional light traps still involve manual identification of target pests from the many trapped insects, which is time-consuming, labor-intensive, and error-prone, especially in pest peak periods. In this paper, we developed an automatic monitoring system for rice light-trap pests based on machine vision. The system is composed of an intelligent light trap, a computer or mobile-phone client platform, and a cloud server. The light trap first traps, kills, and disperses insects, then collects images of the trapped insects and sends each image to the cloud server. Five target pests in the images are automatically identified and counted by pest identification models loaded on the server. To keep trapped insects from piling up, a vibration plate and a moving rotation conveyor belt are adopted to disperse them. There was a close correlation (r = 0.92) between our automatic and manual identification methods based on the daily pest counts from one year of images from one light trap. Field experiments demonstrated the effectiveness and accuracy of our automatic light-trap monitoring system.
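The reported r = 0.92 agreement between automatic and manual counts is presumably a Pearson correlation over the paired daily pest counts; it can be computed directly:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series,
    e.g. daily automatic counts vs. daily manual counts."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

A value near 1 means the automatic counts rise and fall with the manual counts day by day, which is the property that matters for population monitoring.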
Funding: Supported by the University Synergy Innovation Program of Anhui Province (No. GXXT-2019-007), Cooperative Information Processing and Deep Mining for Intelligent Robot (No. JCYJ20170817155854115), the Major Project for New Generation of AI (No. 2018AAA0100400), and the Anhui Provincial Natural Science Foundation (No. 1908085MF206).
Abstract: An efficient convolutional neural network (CNN) plays a crucial role in various visual tasks such as object classification and detection. The most common way to construct a CNN is to stack the same convolution block or a complex connection pattern. These approaches may be effective, but the parameter size and computation grow explosively. We therefore present a novel architecture called "DLA+", which obtains features from different stages and, through a newly designed convolution block, achieves better accuracy while reducing computation six-fold compared with the baseline. We conducted experiments on classification and object detection. On the CIFAR10 and VOC datasets, we obtain better precision and faster speed than other architectures. The lightweight network even allows deployment on low-performance devices such as drones and laptops.
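The parameter and computation growth the abstract refers to follows directly from the standard cost formulas for a convolution layer:

```python
def conv2d_cost(c_in, c_out, k, h_out, w_out, groups=1):
    """Parameter count and multiply-accumulate (MAC) count of a k x k
    2-D convolution producing a (c_out, h_out, w_out) feature map."""
    params = (c_in // groups) * c_out * k * k
    macs = params * h_out * w_out   # every weight fires once per output pixel
    return params, macs
```

Grouped or depthwise variants cut both figures by the group count, which is the kind of lever a redesigned convolution block can pull to reduce computation several-fold without changing the output resolution.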
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61975072 and 12174173) and the Natural Science Foundation of Fujian Province, China (Grant Nos. 2022H0023, 2022J02047, ZZ2023J20, and 2022G02006).
Abstract: Real-time, contact-free temperature monitoring in the low-to-medium range (30-150 ℃) has been used extensively in industry and agriculture, and is usually realized by costly infrared temperature detection methods. This paper proposes an alternative approach that extracts temperature information in real time from visible-light images of the monitoring target using a convolutional neural network (CNN). A mean-square error of <1.119 ℃ was reached in temperature measurements over the low-to-medium range using the CNN and the visible-light images. Imaging angle and imaging distance do not affect the temperature detection. Moreover, the CNN has a certain illuminance generalization ability, capable of detecting temperature information from images collected under illuminances not used for training. Compared with the conventional machine learning algorithms reported in the recent literature, this real-time, contact-free temperature measurement approach, which requires no further image-processing operations, facilitates temperature-monitoring applications in industrial and civil fields.
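As a toy stand-in for the CNN regressor (not the paper's model), the mapping from an image statistic to temperature, and the mean-square error used for evaluation, can be sketched with a one-variable least-squares fit:

```python
import numpy as np

def fit_linear(x, y):
    """Least-squares fit y ~ a*x + b: a toy scalar regressor standing in
    for the paper's CNN, mapping an image statistic to temperature."""
    a, b = np.polyfit(np.asarray(x, float), np.asarray(y, float), 1)
    return float(a), float(b)

def mse(y_true, y_pred):
    """Mean-square error, the evaluation metric quoted in the abstract."""
    t, p = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((t - p) ** 2))
```

The CNN replaces the hand-picked scalar feature with learned features from the whole image, which is what makes the method robust to imaging angle and distance.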
Abstract: In this study, we propose a multi-source approach for mapping the local-scale population density of England. Specifically, we mapped both the working and daytime population densities by integrating multi-source data such as residential population density, point-of-interest density, point-of-interest category mix, and nighttime light intensity. Combining remote sensing and social sensing data is shown to provide a plausible way to map annual working or daytime population densities. We trained models with England-wide data and subsequently tested them with Wales-wide data; in addition, we tested the models with England-wide data at a higher level of spatial granularity. Random forest and convolutional neural network models were adopted to map population density. The estimated results and validation suggest that the three models built have high prediction accuracy at the local authority district level. The convolutional neural network models have the greatest prediction accuracy at this level, though they are the most time-consuming. Models trained with data at the local authority district level are less applicable to test data at a higher level of spatial granularity. The proposed multi-source approach performs well in mapping local-scale population density, indicating that combining remote sensing and social sensing data is advantageous for mapping socioeconomic variables.
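Of the four predictors, "point-of-interest category mix" is the least self-explanatory; one common way to quantify it is the Shannon entropy of the POI category counts in a grid cell (an assumption here, since the abstract does not give the exact definition):

```python
import math

def poi_category_mix(counts):
    """Shannon entropy of POI category counts: an assumed
    operationalization of 'point-of-interest category mix'."""
    total = sum(counts)
    if total == 0:
        return 0.0
    ps = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in ps)

def feature_vector(res_density, poi_density, poi_counts, ntl):
    """One grid cell's predictors: residential population density,
    POI density, POI category mix, night-time light intensity."""
    return [res_density, poi_density, poi_category_mix(poi_counts), ntl]
```

Rows of such feature vectors, one per areal unit, are what a random forest or CNN regressor would be trained on to predict the working or daytime population density.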