The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure p...Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.展开更多
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells an...Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.展开更多
To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network mo...To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.展开更多
With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of...With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.展开更多
Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE...Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.展开更多
Spartina alterniflora is now listed among the world’s 100 most dangerous invasive species,severely affecting the ecological balance of coastal wetlands.Remote sensing technologies based on deep learning enable large-...Spartina alterniflora is now listed among the world’s 100 most dangerous invasive species,severely affecting the ecological balance of coastal wetlands.Remote sensing technologies based on deep learning enable large-scale monitoring of Spartina alterniflora,but they require large datasets and have poor interpretability.A new method is proposed to detect Spartina alterniflora from Sentinel-2 imagery.Firstly,to get the high canopy cover and dense community characteristics of Spartina alterniflora,multi-dimensional shallow features are extracted from the imagery.Secondly,to detect different objects from satellite imagery,index features are extracted,and the statistical features of the Gray-Level Co-occurrence Matrix(GLCM)are derived using principal component analysis.Then,ensemble learning methods,including random forest,extreme gradient boosting,and light gradient boosting machine models,are employed for image classification.Meanwhile,Recursive Feature Elimination with Cross-Validation(RFECV)is used to select the best feature subset.Finally,to enhance the interpretability of the models,the best features are utilized to classify multi-temporal images and SHapley Additive exPlanations(SHAP)is combined with these classifications to explain the model prediction process.The method is validated by using Sentinel-2 imageries and previous observations of Spartina alterniflora in Chongming Island,it is found that the model combining image texture features such as GLCM covariance can significantly improve the detection accuracy of Spartina alterniflora by about 8%compared with the model without image texture features.Through multiple model comparisons and feature selection via RFECV,the selected model and eight features demonstrated good classification accuracy when applied to data from different time periods,proving that feature reduction can effectively enhance model generalization.Additionally,visualizing model decisions using SHAP revealed that the image texture feature component_1_GLCMVariance is particularly important for identifying each land cover type.展开更多
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportatio...Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.展开更多
With the rapid expansion of drone applications,accurate detection of objects in aerial imagery has become crucial for intelligent transportation,urban management,and emergency rescue missions.However,existing methods ...With the rapid expansion of drone applications,accurate detection of objects in aerial imagery has become crucial for intelligent transportation,urban management,and emergency rescue missions.However,existing methods face numerous challenges in practical deployment,including scale variation handling,feature degradation,and complex backgrounds.To address these issues,we propose Edge-enhanced and Detail-Capturing You Only Look Once(EHDC-YOLO),a novel framework for object detection in Unmanned Aerial Vehicle(UAV)imagery.Based on the You Only Look Once version 11 nano(YOLOv11n)baseline,EHDC-YOLO systematically introduces several architectural enhancements:(1)a Multi-Scale Edge Enhancement(MSEE)module that leverages multi-scale pooling and edge information to enhance boundary feature extraction;(2)an Enhanced Feature Pyramid Network(EFPN)that integrates P2-level features with Cross Stage Partial(CSP)structures and OmniKernel convolutions for better fine-grained representation;and(3)Dynamic Head(DyHead)with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability.Comprehensive experiments on the Vision meets Drones for Detection(VisDrone-DET)2019 dataset demonstrate that EHDC-YOLO achieves significant improvements,increasing mean Average Precision(mAP)@0.5 from 33.2%to 46.1%(an absolute improvement of 12.9 percentage points)and mAP@0.5:0.95 from 19.5%to 28.0%(an absolute improvement of 8.5 percentage points)compared with the YOLOv11n baseline,while maintaining a reasonable parameter count(2.81 M vs the baseline’s 2.58 M).Further ablation studies confirm the effectiveness of each proposed component,while visualization results highlight EHDC-YOLO’s superior performance in detecting objects and handling occlusions in complex drone scenarios.展开更多
Based on the stability and inequality of texture features between coal and rock,this study used the digital image analysis technique to propose a coal–rock interface detection method.By using gray level co-occurrence...Based on the stability and inequality of texture features between coal and rock,this study used the digital image analysis technique to propose a coal–rock interface detection method.By using gray level co-occurrence matrix,twenty-two texture features were extracted from the images of coal and rock.Data dimension of the feature space reduced to four by feature selection,which was according to a separability criterion based on inter-class mean difference and within-class scatter.The experimental results show that the optimized features were effective in improving the separability of the samples and reducing the time complexity of the algorithm.In the optimized low-dimensional feature space,the coal–rock classifer was set up using the fsher discriminant method.Using the 10-fold cross-validation technique,the performance of the classifer was evaluated,and an average recognition rate of 94.12%was obtained.The results of comparative experiments show that the identifcation performance of the proposed method was superior to the texture description method based on gray histogram and gradient histogram.展开更多
The strength of cement-based materials,such as mortar,concrete and cement paste backfill(CPB),depends on its microstructures(e.g.pore structure and arrangement of particles and skeleton).Numerous studies on the relati...The strength of cement-based materials,such as mortar,concrete and cement paste backfill(CPB),depends on its microstructures(e.g.pore structure and arrangement of particles and skeleton).Numerous studies on the relationship between strength and pore structure(e.g.,pore size and its distribution)were performed,but the micro-morphology characteristics have been rarely concerned.Texture describing the surface properties of the sample is a global feature,which is an effective way to quantify the micro-morphological properties.In statistical analysis,GLCM features and Tamura texture are the most representative methods for characterizing the texture features.The mechanical strength and section image of the backfill sample prepared from three different solid concentrations of paste were obtained by uniaxial compressive strength test and scanning electron microscope,respectively.The texture features of different SEM images were calculated based on image analysis technology,and then the correlation between these parameters and the strength was analyzed.It was proved that the method is effective in the quantitative analysis on the micro-morphology characteristics of CPB.There is a significant correlation between the texture features and the unconfined compressive strength,and the prediction of strength is feasible using texture parameters of the CPB microstructure.展开更多
Objective: To explore the role of the texture features of images in the diagnosis of solitary pulmonary nodules (SPNs) in different sizes. Materials and methods: A total of 379 patients with pathologically confirm...Objective: To explore the role of the texture features of images in the diagnosis of solitary pulmonary nodules (SPNs) in different sizes. Materials and methods: A total of 379 patients with pathologically confirmed SPNs were enrolled in this study. They were divided into three groups based on the SPN sizes: ≤10, 11-20, and 〉20 mm. Their texture features were segmented and extracted. The differences in the image features between benign and malignant SPNs were compared. The SPNs in these three groups were determined and analyzed with the texture features of images. Results: These 379 SPNs were successfully segmented using the 2D Otsu threshold method and the self-adaptive threshold segmentation method. The texture features of these SPNs were obtained using the method of grey level co-occurrence matrix (GLCM). Of these 379 patients, 120 had benign SPNs and 259 had malignant SPNs. The entropy, contrast, energy, homogeneity, and correlation were 3.5597±0.6470, 0.5384±0.2561, 0.1921±0.1256, 0.8281±0.0604, and 0.8748±0.0740 in the benign SPNs and 3.8007±0.6235, 0.6088±0.2961, 0.1673±0.1070, 0.7980±0.0555, and 0.8550±0.0869 in the malignant SPNs (all P〈0.05). The sensitivity, specificity, and accuracy of the texture features of images were 83.3%, 90.0%, and 86.8%, respectively, for SPNs sized 〈10 mm, and were 86.6%, 88.2%, and 87.1%, respectively, for SPNs sized 11-20 mm and 94.7%, 91.8%, and 93.9%, respectively, for SPNs sized 〉20 mm. Conclusions: The entropy and contrast of malignant pulmonary nodules have been demonstrated to be higher in comparison to those of benign pulmonary nodules, while the energy, homogeneity correlation of malignant pulmonary nodules are lower than those of benign pulmonary nodules. The texture features of images can reflect the tissue features and have high sensitivity, specificity, and accuracy in differentiating SPNs. The sensitivity and accuracy increase for larger SPNs.展开更多
According to the pulverized coal combustion flame image texture features of the rotary-kiln oxide pellets sintering process,a combustion working condition recognition method based on the generalized learning vector(GL...According to the pulverized coal combustion flame image texture features of the rotary-kiln oxide pellets sintering process,a combustion working condition recognition method based on the generalized learning vector(GLVQ) neural network is proposed.Firstly,the numerical flame image is analyzed to extract texture features,such as energy,entropy and inertia,based on grey-level co-occurrence matrix(GLCM) to provide qualitative information on the changes in the visual appearance of the flame.Then the kernel principal component analysis(KPCA) method is adopted to deduct the input vector with high dimensionality so as to reduce the GLVQ target dimension and network scale greatly.Finally,the GLVQ neural network is trained by using the normalized texture feature data.The test results show that the proposed KPCA-GLVQ classifer has an excellent performance on training speed and correct recognition rate,and it meets the requirement for real-time combustion working condition recognition for the rotary kiln process.展开更多
In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two SVs axe below a given threshold, t...In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two SVs axe below a given threshold, the block is considered informative. A total of 12 features including statistics of brightness, color components and texture measures are used to form intermediate vectors. Principal component analysis is then performed to reduce the dimension to 6 to give the final feature vectors. Relevance of the constructed feature vectors is demonstrated by experiments in which k-means clustering is used to group the vectors hence the blocks. Blocks falling into the same group show similar visual appearances.展开更多
This paper presents a novel approach to feature subset selection using genetic algorithms.This approach has the ability to accommodate multiple criteria such as the accuracy and cost of classification into the process...This paper presents a novel approach to feature subset selection using genetic algorithms.This approach has the ability to accommodate multiple criteria such as the accuracy and cost of classification into the process of feature selection and finds the effective feature subset for texture classification.On the basis of the effective feature subset selected,a method is described to extract the objects which are higher than their surroundings,such as trees or forest,in the color aerial images.The methodology presented in this paper is illustrated by its application to the problem of trees extraction from aerial images.展开更多
To accurately describe damage within coal, digital image processing technology was used to determine texture parameters and obtain quantitative information related to coal meso-cracks. The relationship between damage ...To accurately describe damage within coal, digital image processing technology was used to determine texture parameters and obtain quantitative information related to coal meso-cracks. The relationship between damage and mesoscopic information for coal under compression was then analysed. The shape and distribution of damage were comprehensively considered in a defined damage variable, which was based on the texture characteristic. An elastic-brittle damage model based on the mesostructure information of coal was established. As a result, the damage model can appropriately and reliably replicate the processes of initiation, expansion, cut-through and eventual destruction of microscopic damage to coal under compression. After comparison, it was proved that the predicted overall stress-strain response of the model was comparable to the experimental result.展开更多
Surgical excision is an effective treatment for oral squamous cell carcinoma(OSCC),but exact intraoperative differentiation OSCC from the normal tissue is the first premise.As a noninvasive imaging technique,optical c...Surgical excision is an effective treatment for oral squamous cell carcinoma(OSCC),but exact intraoperative differentiation OSCC from the normal tissue is the first premise.As a noninvasive imaging technique,optical coherence tomography(OCT)has the nearly same resolution as the histopathological examination,whose images contain rich information to assist surgeons to make clinical decisions.We extracted kinds of texture features from OCT images obtained by a home-made swept-source OCT system in this paper,and studied the identification of OSCC based on different combinations of texture features and machine learning classifiers.It was demonstrated that different combinations had different accuracies,among which the combination of texture features,gray level co-occurrence matrix(GLCM),Laws'texture measnres(LM),and center symmetric auto-correlation(CSAC),and SVM as the classifier,had the optimal comprehensive identification effect,whose accuracy was 94.1%.It was proven that it is feasible to distinguish OSCC based on texture features in OCT images,and it has great potential in helping surgeons make rapid and accurate decisions in oral clinical practice.展开更多
In modern electromagnetic environment, radar emitter signal recognition is an important research topic. On the basis of multi-resolution wavelet analysis, an adaptive radar emitter signal recognition method based on m...In modern electromagnetic environment, radar emitter signal recognition is an important research topic. On the basis of multi-resolution wavelet analysis, an adaptive radar emitter signal recognition method based on multi-scale wavelet entropy feature extraction and feature weighting was proposed. With the only priori knowledge of signal to noise ratio(SNR), the method of extracting multi-scale wavelet entropy features of wavelet coefficients from different received signals were combined with calculating uneven weight factor and stability weight factor of the extracted multi-dimensional characteristics. Radar emitter signals of different modulation types and different parameters modulated were recognized through feature weighting and feature fusion. Theoretical analysis and simulation results show that the presented algorithm has a high recognition rate. Additionally, when the SNR is greater than-4 d B, the correct recognition rate is higher than 93%. Hence, the proposed algorithm has great application value.展开更多
Designing detection algorithms with high efficiency for Synthetic Aperture Radar(SAR) imagery is essential for the operator SAR Automatic Target Recognition(ATR) system.This work abandons the detection strategy of vis...Designing detection algorithms with high efficiency for Synthetic Aperture Radar(SAR) imagery is essential for the operator SAR Automatic Target Recognition(ATR) system.This work abandons the detection strategy of visiting every pixel in SAR imagery as done in many traditional detection algorithms,and introduces the gridding and fusion idea of different texture fea-tures to realize fast target detection.It first grids the original SAR imagery,yielding a set of grids to be classified into clutter grids and target grids,and then calculates the texture features in each grid.By fusing the calculation results,the target grids containing potential maneuvering targets are determined.The dual threshold segmentation technique is imposed on target grids to obtain the regions of interest.The fused texture features,including local statistics features and Gray-Level Co-occurrence Matrix(GLCM),are investigated.The efficiency and superiority of our proposed algorithm were tested and verified by comparing with existing fast de-tection algorithms using real SAR data.The results obtained from the experiments indicate the promising practical application val-ue of our study.展开更多
Objective To investigate effect of MR field strength on texture features of cerebral T2 fluid attenuated inversion recovery(T2-FLAIR)images.Methods We acquired cerebral 3 D T2-FLAIR images of thirty patients who were ...Objective To investigate effect of MR field strength on texture features of cerebral T2 fluid attenuated inversion recovery(T2-FLAIR)images.Methods We acquired cerebral 3 D T2-FLAIR images of thirty patients who were diagnosed with ischemic white matter lesion(WML)with MR-1.5 T and MR-3.0 T scanners.Histogram texture features which included mean signal intensity(Mean),Skewness and Kurtosis,and gray level co-occurrence matrix(GLCM)texture features which included angular second moment(ASM),Contrast,Correlation,Inverse difference moment(IDM)and Entropy,of regions of interest located in the area of WML and normal white matter(NWM)were measured by ImageJ software.The texture parameters acquired with MR-1.5 T scanning were compared with MR-3.0 T scanning.Results The Mean of both WML and NWM obtained with MR-1.5 T scanning was significantly lower than that acquired with MR-3.0 T(P<0.001),while Skewness and Kurtosis between MR-1.5 T and MR-3.0 T scanning showed no significant difference(P>0.05).ASM,Correlation and IDM of both WML and NWM acquired with MR-1.5 T revealed significantly lower values than those with MR-3.0 T(P<0.001),while Contrast and Entropy acquired with MR-1.5 T showed significantly higher values than those with MR-3.0 T(P<0.001).Conclusion MR field strength showed no significant effect on histogram textures,while had significant effect on GLCM texture features of cerebral T2-FLAIR images,which indicated that it should be cautious to explain the texture results acquired based on the different MR field strength.展开更多
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金supported by the National Natural Science Foundation of China(No.62376287)the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Natural Science Foundation of Hunan Province(Nos.2022JJ30762,2023JJ70016).
文摘Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.
基金funded by the China Chongqing Municipal Science and Technology Bureau,grant numbers 2024TIAD-CYKJCXX0121,2024NSCQ-LZX0135Chongqing Municipal Commission of Housing and Urban-Rural Development,grant number CKZ2024-87+3 种基金the Chongqing University of Technology graduate education high-quality development project,grant number gzlsz202401the Chongqing University of Technology-Chongqing LINGLUE Technology Co.,Ltd.,Electronic Information(Artificial Intelligence)graduate joint training basethe Postgraduate Education and Teaching Reform Research Project in Chongqing,grant number yjg213116the Chongqing University of Technology-CISDI Chongqing Information Technology Co.,Ltd.,Computer Technology graduate joint training base.
文摘Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.
基金supported by National Natural Science Foundation of China(No.61862037)Lanzhou Jiaotong University Tianyou Innovation Team Project(No.TY202002)。
文摘To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.
基金supported by Communication University of China(HG23035)partly supported by the Fundamental Research Funds for the Central Universities(CUC230A013).
文摘With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
基金supported by the National Key Research and Development Program of China(2023YFB3307800)National Natural Science Foundation of China(62394343,62373155)+2 种基金Major Science and Technology Project of Xinjiang(No.2022A01006-4)State Key Laboratory of Industrial Control Technology,China(Grant No.ICT2024A26)Fundamental Research Funds for the Central Universities.
文摘Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.
基金The National Key Research and Development Program of China under contract No.2023YFC3008204the National Natural Science Foundation of China under contract Nos 41977302 and 42476217.
文摘Spartina alterniflora is now listed among the world’s 100 most dangerous invasive species,severely affecting the ecological balance of coastal wetlands.Remote sensing technologies based on deep learning enable large-scale monitoring of Spartina alterniflora,but they require large datasets and have poor interpretability.A new method is proposed to detect Spartina alterniflora from Sentinel-2 imagery.Firstly,to get the high canopy cover and dense community characteristics of Spartina alterniflora,multi-dimensional shallow features are extracted from the imagery.Secondly,to detect different objects from satellite imagery,index features are extracted,and the statistical features of the Gray-Level Co-occurrence Matrix(GLCM)are derived using principal component analysis.Then,ensemble learning methods,including random forest,extreme gradient boosting,and light gradient boosting machine models,are employed for image classification.Meanwhile,Recursive Feature Elimination with Cross-Validation(RFECV)is used to select the best feature subset.Finally,to enhance the interpretability of the models,the best features are utilized to classify multi-temporal images and SHapley Additive exPlanations(SHAP)is combined with these classifications to explain the model prediction process.The method is validated by using Sentinel-2 imageries and previous observations of Spartina alterniflora in Chongming Island,it is found that the model combining image texture features such as GLCM covariance can significantly improve the detection accuracy of Spartina alterniflora by about 8%compared with the model without image texture features.Through multiple model comparisons and feature selection via RFECV,the selected model and eight features demonstrated good classification accuracy when applied to data from different time periods,proving that feature reduction can effectively enhance model generalization.Additionally,visualizing model decisions using SHAP revealed that the image texture feature component_1_GLCMVariance is particularly important for identifying each land cover type.
基金funded by the Deanship of Scientific Research at Northern Border University,Arar,Saudi Arabia through research group No.(RG-NBU-2022-1234).
文摘Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
文摘With the rapid expansion of drone applications,accurate detection of objects in aerial imagery has become crucial for intelligent transportation,urban management,and emergency rescue missions.However,existing methods face numerous challenges in practical deployment,including scale variation handling,feature degradation,and complex backgrounds.To address these issues,we propose Edge-enhanced and Detail-Capturing You Only Look Once(EHDC-YOLO),a novel framework for object detection in Unmanned Aerial Vehicle(UAV)imagery.Based on the You Only Look Once version 11 nano(YOLOv11n)baseline,EHDC-YOLO systematically introduces several architectural enhancements:(1)a Multi-Scale Edge Enhancement(MSEE)module that leverages multi-scale pooling and edge information to enhance boundary feature extraction;(2)an Enhanced Feature Pyramid Network(EFPN)that integrates P2-level features with Cross Stage Partial(CSP)structures and OmniKernel convolutions for better fine-grained representation;and(3)Dynamic Head(DyHead)with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability.Comprehensive experiments on the Vision meets Drones for Detection(VisDrone-DET)2019 dataset demonstrate that EHDC-YOLO achieves significant improvements,increasing mean Average Precision(mAP)@0.5 from 33.2%to 46.1%(an absolute improvement of 12.9 percentage points)and mAP@0.5:0.95 from 19.5%to 28.0%(an absolute improvement of 8.5 percentage points)compared with the YOLOv11n baseline,while maintaining a reasonable parameter count(2.81 M vs the baseline’s 2.58 M).Further ablation studies confirm the effectiveness of each proposed component,while visualization results highlight EHDC-YOLO’s superior performance in detecting objects and handling occlusions in complex drone scenarios.
基金the National Natural Science Foundation of China(No.51134024/E0422)for the financial support
文摘Based on the stability and inequality of texture features between coal and rock,this study used the digital image analysis technique to propose a coal–rock interface detection method.By using gray level co-occurrence matrix,twenty-two texture features were extracted from the images of coal and rock.Data dimension of the feature space reduced to four by feature selection,which was according to a separability criterion based on inter-class mean difference and within-class scatter.The experimental results show that the optimized features were effective in improving the separability of the samples and reducing the time complexity of the algorithm.In the optimized low-dimensional feature space,the coal–rock classifer was set up using the fsher discriminant method.Using the 10-fold cross-validation technique,the performance of the classifer was evaluated,and an average recognition rate of 94.12%was obtained.The results of comparative experiments show that the identifcation performance of the proposed method was superior to the texture description method based on gray histogram and gradient histogram.
基金Project(51722401)supported by the National Natural Science Foundation for Excellent Young Scholars of ChinaProject(FRF-TP-18-003C1)supported by the Fundamental Research Funds for the Central Universities,ChinaProject(51734001)supported by the Key Program of National Natural Science Foundation of China
文摘The strength of cement-based materials,such as mortar,concrete and cement paste backfill(CPB),depends on its microstructures(e.g.pore structure and arrangement of particles and skeleton).Numerous studies on the relationship between strength and pore structure(e.g.,pore size and its distribution)were performed,but the micro-morphology characteristics have been rarely concerned.Texture describing the surface properties of the sample is a global feature,which is an effective way to quantify the micro-morphological properties.In statistical analysis,GLCM features and Tamura texture are the most representative methods for characterizing the texture features.The mechanical strength and section image of the backfill sample prepared from three different solid concentrations of paste were obtained by uniaxial compressive strength test and scanning electron microscope,respectively.The texture features of different SEM images were calculated based on image analysis technology,and then the correlation between these parameters and the strength was analyzed.It was proved that the method is effective in the quantitative analysis on the micro-morphology characteristics of CPB.There is a significant correlation between the texture features and the unconfined compressive strength,and the prediction of strength is feasible using texture parameters of the CPB microstructure.
基金supported by National Natural Science Fund project [81202284]Guangdong Provincial Natural Science Fund project [S2011040004735]+2 种基金Project for Outstanding Young Innovative Talents in Colleges and Universities of Guangdong Province [LYM11106]Special Research Fund for Basic Scientific Research Projects in Central Universities [21612305, 21612101]Guangzhou Municipal Science and Technology Fund project [2014J4100119]
文摘Objective: To explore the role of the texture features of images in the diagnosis of solitary pulmonary nodules (SPNs) in different sizes. Materials and methods: A total of 379 patients with pathologically confirmed SPNs were enrolled in this study. They were divided into three groups based on the SPN sizes: ≤10, 11-20, and 〉20 mm. Their texture features were segmented and extracted. The differences in the image features between benign and malignant SPNs were compared. The SPNs in these three groups were determined and analyzed with the texture features of images. Results: These 379 SPNs were successfully segmented using the 2D Otsu threshold method and the self-adaptive threshold segmentation method. The texture features of these SPNs were obtained using the method of grey level co-occurrence matrix (GLCM). Of these 379 patients, 120 had benign SPNs and 259 had malignant SPNs. The entropy, contrast, energy, homogeneity, and correlation were 3.5597±0.6470, 0.5384±0.2561, 0.1921±0.1256, 0.8281±0.0604, and 0.8748±0.0740 in the benign SPNs and 3.8007±0.6235, 0.6088±0.2961, 0.1673±0.1070, 0.7980±0.0555, and 0.8550±0.0869 in the malignant SPNs (all P〈0.05). The sensitivity, specificity, and accuracy of the texture features of images were 83.3%, 90.0%, and 86.8%, respectively, for SPNs sized 〈10 mm, and were 86.6%, 88.2%, and 87.1%, respectively, for SPNs sized 11-20 mm and 94.7%, 91.8%, and 93.9%, respectively, for SPNs sized 〉20 mm. Conclusions: The entropy and contrast of malignant pulmonary nodules have been demonstrated to be higher in comparison to those of benign pulmonary nodules, while the energy, homogeneity correlation of malignant pulmonary nodules are lower than those of benign pulmonary nodules. The texture features of images can reflect the tissue features and have high sensitivity, specificity, and accuracy in differentiating SPNs. The sensitivity and accuracy increase for larger SPNs.
基金supported by China Postdoctoral Science Foundation(No.20110491510)Program for Liaoning Excellent Talents in University(No.LJQ2011027)+1 种基金Anshan Science and Technology Project(No.2011MS11)Special Research Foundation of University of Science and Technology of Liaoning(No.2011zx10)
文摘According to the pulverized coal combustion flame image texture features of the rotary-kiln oxide pellets sintering process,a combustion working condition recognition method based on the generalized learning vector(GLVQ) neural network is proposed.Firstly,the numerical flame image is analyzed to extract texture features,such as energy,entropy and inertia,based on grey-level co-occurrence matrix(GLCM) to provide qualitative information on the changes in the visual appearance of the flame.Then the kernel principal component analysis(KPCA) method is adopted to deduct the input vector with high dimensionality so as to reduce the GLVQ target dimension and network scale greatly.Finally,the GLVQ neural network is trained by using the normalized texture feature data.The test results show that the proposed KPCA-GLVQ classifer has an excellent performance on training speed and correct recognition rate,and it meets the requirement for real-time combustion working condition recognition for the rotary kiln process.
基金Project supported by the National Natural Science Foundation of China (Grant No.60502039), the Shanghai Rising-Star Program (Grant No.06QA14022), and the Key Project of Shanghai Municipality for Basic Research (Grant No.04JC14037)
文摘In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two SVs axe below a given threshold, the block is considered informative. A total of 12 features including statistics of brightness, color components and texture measures are used to form intermediate vectors. Principal component analysis is then performed to reduce the dimension to 6 to give the final feature vectors. Relevance of the constructed feature vectors is demonstrated by experiments in which k-means clustering is used to group the vectors hence the blocks. Blocks falling into the same group show similar visual appearances.
文摘This paper presents a novel approach to feature subset selection using genetic algorithms.This approach has the ability to accommodate multiple criteria such as the accuracy and cost of classification into the process of feature selection and finds the effective feature subset for texture classification.On the basis of the effective feature subset selected,a method is described to extract the objects which are higher than their surroundings,such as trees or forest,in the color aerial images.The methodology presented in this paper is illustrated by its application to the problem of trees extraction from aerial images.
基金funding by the National Natural Science Foundation of China(Nos.51474039 and 51404046)the Project of Shanxi Provincial Federation of Coalbed Methane Research(No.2013012010)the Science Foundation of North University of China(No.XJJ2016033)
文摘To accurately describe damage within coal, digital image processing technology was used to determine texture parameters and obtain quantitative information related to coal meso-cracks. The relationship between damage and mesoscopic information for coal under compression was then analysed. The shape and distribution of damage were comprehensively considered in a defined damage variable, which was based on the texture characteristic. An elastic-brittle damage model based on the mesostructure information of coal was established. As a result, the damage model can appropriately and reliably replicate the processes of initiation, expansion, cut-through and eventual destruction of microscopic damage to coal under compression. After comparison, it was proved that the predicted overall stress-strain response of the model was comparable to the experimental result.
基金This study was supported by the National Natural Science Foundation of China(No.61875092)Science and Technology Support Program of Tianjin(17YFZCSY00740)+1 种基金the Beijing-Tianjin-Hebei Basic Research Cooperation Special Program(19JCZDJC65300)the Fundamental Research Funds for the Central Universities,Nankai University(63201178).
文摘Surgical excision is an effective treatment for oral squamous cell carcinoma(OSCC),but exact intraoperative differentiation OSCC from the normal tissue is the first premise.As a noninvasive imaging technique,optical coherence tomography(OCT)has the nearly same resolution as the histopathological examination,whose images contain rich information to assist surgeons to make clinical decisions.We extracted kinds of texture features from OCT images obtained by a home-made swept-source OCT system in this paper,and studied the identification of OSCC based on different combinations of texture features and machine learning classifiers.It was demonstrated that different combinations had different accuracies,among which the combination of texture features,gray level co-occurrence matrix(GLCM),Laws'texture measnres(LM),and center symmetric auto-correlation(CSAC),and SVM as the classifier,had the optimal comprehensive identification effect,whose accuracy was 94.1%.It was proven that it is feasible to distinguish OSCC based on texture features in OCT images,and it has great potential in helping surgeons make rapid and accurate decisions in oral clinical practice.
基金Project(61301095)supported by the National Natural Science Foundation of ChinaProject(QC2012C070)supported by Heilongjiang Provincial Natural Science Foundation for the Youth,ChinaProjects(HEUCF130807,HEUCFZ1129)supported by the Fundamental Research Funds for the Central Universities of China
文摘In modern electromagnetic environment, radar emitter signal recognition is an important research topic. On the basis of multi-resolution wavelet analysis, an adaptive radar emitter signal recognition method based on multi-scale wavelet entropy feature extraction and feature weighting was proposed. With the only priori knowledge of signal to noise ratio(SNR), the method of extracting multi-scale wavelet entropy features of wavelet coefficients from different received signals were combined with calculating uneven weight factor and stability weight factor of the extracted multi-dimensional characteristics. Radar emitter signals of different modulation types and different parameters modulated were recognized through feature weighting and feature fusion. Theoretical analysis and simulation results show that the presented algorithm has a high recognition rate. Additionally, when the SNR is greater than-4 d B, the correct recognition rate is higher than 93%. Hence, the proposed algorithm has great application value.
基金Supported by the National Natural Science Foundation of China (No. 61032001, No.61002045)
文摘Designing detection algorithms with high efficiency for Synthetic Aperture Radar(SAR) imagery is essential for the operator SAR Automatic Target Recognition(ATR) system.This work abandons the detection strategy of visiting every pixel in SAR imagery as done in many traditional detection algorithms,and introduces the gridding and fusion idea of different texture fea-tures to realize fast target detection.It first grids the original SAR imagery,yielding a set of grids to be classified into clutter grids and target grids,and then calculates the texture features in each grid.By fusing the calculation results,the target grids containing potential maneuvering targets are determined.The dual threshold segmentation technique is imposed on target grids to obtain the regions of interest.The fused texture features,including local statistics features and Gray-Level Co-occurrence Matrix(GLCM),are investigated.The efficiency and superiority of our proposed algorithm were tested and verified by comparing with existing fast de-tection algorithms using real SAR data.The results obtained from the experiments indicate the promising practical application val-ue of our study.
文摘Objective To investigate effect of MR field strength on texture features of cerebral T2 fluid attenuated inversion recovery(T2-FLAIR)images.Methods We acquired cerebral 3 D T2-FLAIR images of thirty patients who were diagnosed with ischemic white matter lesion(WML)with MR-1.5 T and MR-3.0 T scanners.Histogram texture features which included mean signal intensity(Mean),Skewness and Kurtosis,and gray level co-occurrence matrix(GLCM)texture features which included angular second moment(ASM),Contrast,Correlation,Inverse difference moment(IDM)and Entropy,of regions of interest located in the area of WML and normal white matter(NWM)were measured by ImageJ software.The texture parameters acquired with MR-1.5 T scanning were compared with MR-3.0 T scanning.Results The Mean of both WML and NWM obtained with MR-1.5 T scanning was significantly lower than that acquired with MR-3.0 T(P<0.001),while Skewness and Kurtosis between MR-1.5 T and MR-3.0 T scanning showed no significant difference(P>0.05).ASM,Correlation and IDM of both WML and NWM acquired with MR-1.5 T revealed significantly lower values than those with MR-3.0 T(P<0.001),while Contrast and Entropy acquired with MR-1.5 T showed significantly higher values than those with MR-3.0 T(P<0.001).Conclusion MR field strength showed no significant effect on histogram textures,while had significant effect on GLCM texture features of cerebral T2-FLAIR images,which indicated that it should be cautious to explain the texture results acquired based on the different MR field strength.