The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE...Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.展开更多
Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure p...Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.展开更多
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells an...Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.展开更多
With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of...With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.展开更多
To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network mo...To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.展开更多
Three-dimensional(3D)single molecule localization microscopy(SMLM)plays an important role in biomedical applications,but its data processing is very complicated.Deep learning is a potential tool to solve this problem....Three-dimensional(3D)single molecule localization microscopy(SMLM)plays an important role in biomedical applications,but its data processing is very complicated.Deep learning is a potential tool to solve this problem.As the state of art 3D super-resolution localization algorithm based on deep learning,FD-DeepLoc algorithm reported recently still has a gap with the expected goal of online image processing,even though it has greatly improved the data processing throughput.In this paper,a new algorithm Lite-FD-DeepLoc is developed on the basis of FD-DeepLoc algorithm to meet the online image processing requirements of 3D SMLM.This new algorithm uses the feature compression method to reduce the parameters of the model,and combines it with pipeline programming to accelerate the inference process of the deep learning model.The simulated data processing results show that the image processing speed of Lite-FD-DeepLoc is about twice as fast as that of FD-DeepLoc with a slight decrease in localization accuracy,which can realize real-time processing of 256×256 pixels size images.The results of biological experimental data processing imply that Lite-FD-DeepLoc can successfully analyze the data based on astigmatism and saddle point engineering,and the global resolution of the reconstructed image is equivalent to or even better than FD-DeepLoc algorithm.展开更多
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportatio...Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.展开更多
Data-driven process monitoring is an effective approach to assure safe operation of modern manufacturing and energy systems,such as thermal power plants being studied in this work.Industrial processes are inherently d...Data-driven process monitoring is an effective approach to assure safe operation of modern manufacturing and energy systems,such as thermal power plants being studied in this work.Industrial processes are inherently dynamic and need to be monitored using dynamic algorithms.Mainstream dynamic algorithms rely on concatenating current measurement with past data.This work proposes a new,alternative dynamic process monitoring algorithm,using dot product feature analysis(DPFA).DPFA computes the dot product of consecutive samples,thus naturally capturing the process dynamics through temporal correlation.At the same time,DPFA's online computational complexity is lower than not just existing dynamic algorithms,but also classical static algorithms(e.g.,principal component analysis and slow feature analysis).The detectability of the new algorithm is analyzed for three types of faults typically seen in process systems:sensor bias,process fault and gain change fault.Through experiments with a numerical example and real data from a thermal power plant,the DPFA algorithm is shown to be superior to the state-of-the-art methods,in terms of better monitoring performance(fault detection rate and false alarm rate)and lower computational complexity.展开更多
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware reso...Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.展开更多
Computer-aided diagnosis of pneumonia based on deep learning is a research hotspot.However,there are some problems that the features of different sizes and different directions are not sufficient when extracting the f...Computer-aided diagnosis of pneumonia based on deep learning is a research hotspot.However,there are some problems that the features of different sizes and different directions are not sufficient when extracting the features in lung X-ray images.A pneumonia classification model based on multi-scale directional feature enhancement MSD-Net is proposed in this paper.The main innovations are as follows:Firstly,the Multi-scale Residual Feature Extraction Module(MRFEM)is designed to effectively extract multi-scale features.The MRFEM uses dilated convolutions with different expansion rates to increase the receptive field and extract multi-scale features effectively.Secondly,the Multi-scale Directional Feature Perception Module(MDFPM)is designed,which uses a three-branch structure of different sizes convolution to transmit direction feature layer by layer,and focuses on the target region to enhance the feature information.Thirdly,the Axial Compression Former Module(ACFM)is designed to perform global calculations to enhance the perception ability of global features in different directions.To verify the effectiveness of the MSD-Net,comparative experiments and ablation experiments are carried out.In the COVID-19 RADIOGRAPHY DATABASE,the Accuracy,Recall,Precision,F1 Score,and Specificity of MSD-Net are 97.76%,95.57%,95.52%,95.52%,and 98.51%,respectively.In the chest X-ray dataset,the Accuracy,Recall,Precision,F1 Score and Specificity of MSD-Net are 97.78%,95.22%,96.49%,95.58%,and 98.11%,respectively.This model improves the accuracy of lung image recognition effectively and provides an important clinical reference to pneumonia Computer-Aided Diagnosis.展开更多
Aiming at the problems of inaccuracy in detecting theαphase contour of TB6 titanium alloy.By combining computer vision technology with human vision mechanisms,the spatial characteristics of theαphase can be simulate...Aiming at the problems of inaccuracy in detecting theαphase contour of TB6 titanium alloy.By combining computer vision technology with human vision mechanisms,the spatial characteristics of theαphase can be simulated to obtain the contour accurately.Therefore,an algorithm forαphase contour detection of TB6 titanium alloy fused with multi-scale fretting features is proposed.Firstly,through the response of the classical receptive field model based on fretting and the suppression of new non-classical receptive field model based on fretting,the information maps of theαphase contour of the TB6 titanium alloy at different scales are obtained;then the information map of the smallest scale contour is used as a benchmark,the neighborhood is constructed to judge the deviation of other scale contour information,and the corresponding weight value is calculated;finally,Gaussian function is used to weight and fuse the deviation information,and the contour detection result of TB6 titanium alloyαphase is obtained.In the Visual Studio 2013 environment,484 metallographic images with different temperatures,strain rates,and magnifications were tested.The results show that the performance evaluation F value of the proposed algorithm is 0.915,which can effectively improve the accuracy ofαphase contour detection of TB6 titanium alloy.展开更多
In order to improve the models capability in expressing features during few-shot learning,a multi-scale features prototypical network(MS-PN)algorithm is proposed.The metric learning algo-rithm is employed to extract i...In order to improve the models capability in expressing features during few-shot learning,a multi-scale features prototypical network(MS-PN)algorithm is proposed.The metric learning algo-rithm is employed to extract image features and project them into a feature space,thus evaluating the similarity between samples based on their relative distances within the metric space.To sufficiently extract feature information from limited sample data and mitigate the impact of constrained data vol-ume,a multi-scale feature extraction network is presented to capture data features at various scales during the process of image feature extraction.Additionally,the position of the prototype is fine-tuned by assigning weights to data points to mitigate the influence of outliers on the experiment.The loss function integrates contrastive loss and label-smoothing to bring similar data points closer and separate dissimilar data points within the metric space.Experimental evaluations are conducted on small-sample datasets mini-ImageNet and CUB200-2011.The method in this paper can achieve higher classification accuracy.Specifically,in the 5-way 1-shot experiment,classification accuracy reaches 50.13%and 66.79%respectively on these two datasets.Moreover,in the 5-way 5-shot ex-periment,accuracy of 66.79%and 85.91%are observed,respectively.展开更多
Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it cha...Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it challenging to collect defective samples.Additionally,the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions.This paper proposes a novel Lightweight Multiscale Feature Fusion network(LMFF)to address these challenges.The network comprises a feature extraction network,a multi-scale feature fusion module(MFF),and a segmentation network.Specifically,a feature extraction network is proposed to obtain multi-scale feature outputs,and a multi-scale feature fusion module(MFF)is used to fuse multi-scale feature information effectively.In order to capture finer-grained multi-scale information from the fusion features,we propose a multi-scale attention module(MSA)in the segmentation network to enhance the network’s ability for small target detection.Moreover,depthwise separable convolutions are introduced to construct depthwise separable residual blocks(DSR)to reduce the model’s parameter number.Finally,to validate the proposed method’s defect segmentation and localization performance,we constructed three solar cell defect detection datasets:SolarCells,SolarCells-S,and PVEL-S.SolarCells and SolarCells-S are monocrystalline silicon datasets,and PVEL-S is a polycrystalline silicon dataset.Experimental results show that the IOU of our method on these three datasets can reach 68.5%,51.0%,and 92.7%,respectively,and the F1-Score can reach 81.3%,67.5%,and 96.2%,respectively,which surpasses other commonly usedmethods and verifies the effectiveness of our LMFF network.展开更多
Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional a...Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional approaches often fail in the face of challenges such as low contrast, morphological variability, and densely packed structures. Recent advancements in deep learning have transformed segmentation capabilities through the integration of fine-scale detail preservation, coarse-scale contextual modeling, and multi-scale feature fusion. This work provides a comprehensive analysis of state-of-the-art deep learning models, including U-Net variants, attention-based frameworks, and Transformer-integrated networks, highlighting innovations that improve accuracy, generalizability, and computational efficiency. Key architectural components such as convolution operations, shallow and deep blocks, skip connections, and hybrid encoders are examined for their roles in enhancing spatial representation and semantic consistency. We further discuss the importance of hierarchical and instance-aware segmentation and annotation in interpreting complex biological scenes and multiplexed medical images. By bridging methodological developments with diverse application domains, this paper outlines current trends and future directions for semantic segmentation, emphasizing its critical role in facilitating annotation, diagnosis, and discovery in biomedical research.展开更多
Accurate and robust navigation in complex surgical environments is crucial for bronchoscopic surgeries.This study purposes a bronchoscopic lumen feature matching network(BLFM-Net)based on deep learning to address the ...Accurate and robust navigation in complex surgical environments is crucial for bronchoscopic surgeries.This study purposes a bronchoscopic lumen feature matching network(BLFM-Net)based on deep learning to address the challenges of image noise,anatomical complexity,and the stringent real-time requirements.The BLFM-Net enhances bronchoscopic image processing by integrating several functional modules.The FFA-Net preprocessing module mitigates image fogging and improves visual clarity for subsequent processing.The feature extraction module derives multi-dimensional features,such as centroids,area,and shape descriptors,from dehazed images.The Faster RCNN Object detection module detects bronchial regions of interest and generates bounding boxes to localize key areas.The feature matching module accelerates the process by combining detection boxes,extracted features,and a KD-Tree(K-Dimensional Tree)-based algorithm,ensuring efficient and accurate regional feature associations.The BLFM-Net was evaluated on 5212 bronchoscopic images,demonstrating superior performance compared to traditional and other deep learning-based image matching methods.It achieved real-time matching with an average frame time of 6 ms,with a matching accuracy of over 96%.The method remained robust under challenging conditions including frame dropping(0,5,10,20),shadowed regions,and variable lighting,maintaining accuracy of above 94%even with the frame dropping of 20.This study presents BLFM-Net,a deep learning-based matching network designed to enhance and match bronchial features in bronchoscopic images.The BLFM-Net shows improved accuracy,real-time performance,and reliability,making a valuable tool for bronchoscopic surgeries.展开更多
Convolutional neural networks(CNNs)-based medical image segmentation technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities.However,due to ...Convolutional neural networks(CNNs)-based medical image segmentation technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities.However,due to the inability to effectively capture global information from images,CNNs can easily lead to loss of contours and textures in segmentation results.Notice that the transformer model can effectively capture the properties of long-range dependencies in the image,and furthermore,combining the CNN and the transformer can effectively extract local details and global contextual features of the image.Motivated by this,we propose a multi-branch and multi-scale attention network(M2ANet)for medical image segmentation,whose architecture consists of three components.Specifically,in the first component,we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce information loss caused by downsampling.In the second component,we apply residual block to the well-known convolutional block attention module to enhance the network’s ability to recognize important features of images and alleviate the phenomenon of gradient vanishing.In the third component,we design a multi-scale feature fusion module,in which we adopt adaptive average pooling and position encoding to enhance contextual features,and then multi-head attention is introduced to further enrich feature representation.Finally,we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets,particularly in the context of preserving contours and textures.展开更多
The capacity to diagnose faults in rolling bearings is of significant practical importance to ensure the normal operation of the equipment.Frequency-domain features can effectively enhance the identification of fault ...The capacity to diagnose faults in rolling bearings is of significant practical importance to ensure the normal operation of the equipment.Frequency-domain features can effectively enhance the identification of fault modes.However,existing methods often suffer from insufficient frequency-domain representation in practical applications,which greatly affects diagnostic performance.Therefore,this paper proposes a rolling bearing fault diagnosismethod based on aMulti-Scale FusionNetwork(MSFN)using the Time-Division Fourier Transform(TDFT).The method constructs multi-scale channels to extract time-domain and frequency-domain features of the signal in parallel.A multi-level,multi-scale filter-based approach is designed to extract frequency-domain features in a segmented manner.A cross-attention mechanism is introduced to facilitate the fusion of the extracted time-frequency domain features.The performance of the proposed method is validated using the CWRU and Ottawa datasets.The results show that the average accuracy of MSFN under complex noisy signals is 97.75%and 94.41%.The average accuracy under variable load conditions is 98.68%.This demonstrates its significant application potential compared to existing methods.展开更多
Hematoxylin and Eosin(H&E)images,popularly used in the field of digital pathology,often pose challenges due to their limited color richness,hindering the differentiation of subtle cell features crucial for accurat...Hematoxylin and Eosin(H&E)images,popularly used in the field of digital pathology,often pose challenges due to their limited color richness,hindering the differentiation of subtle cell features crucial for accurate classification.Enhancing the visibility of these elusive cell features helps train robust deep-learning models.However,the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community.To address this challenge,we introduce Salient Features Guided Augmentation(SFGA),an approach that strategically integrates machine learning and image processing.SFGA utilizes machine learning algorithms to identify crucial features within cell images,subsequently mapping these features to appropriate image processing techniques to enhance training images.By emphasizing salient features and aligning them with corresponding image processing methods,SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks.Our research undertakes a series of experiments,each exploring the performance of different datasets and data enhancement techniques in classifying cell types,highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics.Specifically,SFGA focuses on identifying tumor cells from tissue for extranodal extension detection,with the SFGA-enhanced dataset showing notable advantages in accuracy.We conducted a preliminary study of five experiments,among which the accuracy of the pleomorphism experiment improved significantly from 50.81%to 95.15%.The accuracy of the other four experiments also increased,with improvements ranging from 3 to 43 percentage points.Our preliminary study shows the possibilities to enhance the diagnostic accuracy of deep learning models and proposes a systematic approach that could enhance cancer diagnosis,contributing as a first step in using SFGA in medical image enhancement.展开更多
Advanced Persistent Threats(APTs)represent one of the most complex and dangerous categories of cyber-attacks characterised by their stealthy behaviour,long-term persistence,and ability to bypass traditional detection ...Advanced Persistent Threats(APTs)represent one of the most complex and dangerous categories of cyber-attacks characterised by their stealthy behaviour,long-term persistence,and ability to bypass traditional detection systems.The complexity of real-world network data poses significant challenges in detection.Machine learning models have shown promise in detecting APTs;however,their performance often suffers when trained on large datasets with redundant or irrelevant features.This study presents a novel,hybrid feature selection method designed to improve APT detection by reducing dimensionality while preserving the informative characteristics of the data.It combines Mutual Information(MI),Symmetric Uncertainty(SU)and Minimum Redundancy Maximum Relevance(mRMR)to enhance feature selection.MI and SU assess feature relevance,while mRMR maximises relevance and minimises redundancy,ensuring that the most impactful features are prioritised.This method addresses redundancy among selected features,improving the overall efficiency and effectiveness of the detection model.Experiments on a real-world APT datasets were conducted to evaluate the proposed method.Multiple classifiers including,Random Forest,Support Vector Machine(SVM),Gradient Boosting,and Neural Networks were used to assess classification performance.The results demonstrate that the proposed feature selection method significantly enhances detection accuracy compared to baseline models trained on the full feature set.The Random Forest algorithm achieved the highest performance,with near-perfect accuracy,precision,recall,and F1 scores(99.97%).The proposed adaptive thresholding algorithm within the selection method allows each classifier to benefit from a reduced and optimised feature space,resulting in improved training and predictive performance.This research offers a scalable and classifier-agnostic solution for dimensionality reduction in cybersecurity applications.展开更多
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金supported by the National Key Research and Development Program of China(2023YFB3307800)National Natural Science Foundation of China(62394343,62373155)+2 种基金Major Science and Technology Project of Xinjiang(No.2022A01006-4)State Key Laboratory of Industrial Control Technology,China(Grant No.ICT2024A26)Fundamental Research Funds for the Central Universities.
文摘Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.
基金supported by the National Natural Science Foundation of China(No.62376287)the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Natural Science Foundation of Hunan Province(Nos.2022JJ30762,2023JJ70016).
文摘Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.
基金funded by the China Chongqing Municipal Science and Technology Bureau,grant numbers 2024TIAD-CYKJCXX0121,2024NSCQ-LZX0135Chongqing Municipal Commission of Housing and Urban-Rural Development,grant number CKZ2024-87+3 种基金the Chongqing University of Technology graduate education high-quality development project,grant number gzlsz202401the Chongqing University of Technology-Chongqing LINGLUE Technology Co.,Ltd.,Electronic Information(Artificial Intelligence)graduate joint training basethe Postgraduate Education and Teaching Reform Research Project in Chongqing,grant number yjg213116the Chongqing University of Technology-CISDI Chongqing Information Technology Co.,Ltd.,Computer Technology graduate joint training base.
文摘Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.
基金supported by Communication University of China(HG23035)partly supported by the Fundamental Research Funds for the Central Universities(CUC230A013).
文摘With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
基金supported by National Natural Science Foundation of China(No.61862037)Lanzhou Jiaotong University Tianyou Innovation Team Project(No.TY202002)。
文摘To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.
基金supported by the Start-up Fund from Hainan University(No.KYQD(ZR)-20077)。
文摘Three-dimensional(3D)single molecule localization microscopy(SMLM)plays an important role in biomedical applications,but its data processing is very complicated.Deep learning is a potential tool to solve this problem.As the state of art 3D super-resolution localization algorithm based on deep learning,FD-DeepLoc algorithm reported recently still has a gap with the expected goal of online image processing,even though it has greatly improved the data processing throughput.In this paper,a new algorithm Lite-FD-DeepLoc is developed on the basis of FD-DeepLoc algorithm to meet the online image processing requirements of 3D SMLM.This new algorithm uses the feature compression method to reduce the parameters of the model,and combines it with pipeline programming to accelerate the inference process of the deep learning model.The simulated data processing results show that the image processing speed of Lite-FD-DeepLoc is about twice as fast as that of FD-DeepLoc with a slight decrease in localization accuracy,which can realize real-time processing of 256×256 pixels size images.The results of biological experimental data processing imply that Lite-FD-DeepLoc can successfully analyze the data based on astigmatism and saddle point engineering,and the global resolution of the reconstructed image is equivalent to or even better than FD-DeepLoc algorithm.
基金funded by the Deanship of Scientific Research at Northern Border University,Arar,Saudi Arabia through research group No.(RG-NBU-2022-1234).
文摘Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
基金supported in part by the National Science Fund for Distinguished Young Scholars of China(62225303)the National Natural Science Fundation of China(62303039,62433004)+2 种基金the China Postdoctoral Science Foundation(BX20230034,2023M730190)the Fundamental Research Funds for the Central Universities(buctrc202201,QNTD2023-01)the High Performance Computing Platform,College of Information Science and Technology,Beijing University of Chemical Technology
文摘Data-driven process monitoring is an effective approach to assure safe operation of modern manufacturing and energy systems,such as thermal power plants being studied in this work.Industrial processes are inherently dynamic and need to be monitored using dynamic algorithms.Mainstream dynamic algorithms rely on concatenating current measurement with past data.This work proposes a new,alternative dynamic process monitoring algorithm,using dot product feature analysis(DPFA).DPFA computes the dot product of consecutive samples,thus naturally capturing the process dynamics through temporal correlation.At the same time,DPFA's online computational complexity is lower than not just existing dynamic algorithms,but also classical static algorithms(e.g.,principal component analysis and slow feature analysis).The detectability of the new algorithm is analyzed for three types of faults typically seen in process systems:sensor bias,process fault and gain change fault.Through experiments with a numerical example and real data from a thermal power plant,the DPFA algorithm is shown to be superior to the state-of-the-art methods,in terms of better monitoring performance(fault detection rate and false alarm rate)and lower computational complexity.
文摘Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
基金supported in part by the National Natural Science Foundation of China(Grant No.62062003)Natural Science Foundation of Ningxia(Grant No.2023AAC03293).
文摘Computer-aided diagnosis of pneumonia based on deep learning is a research hotspot.However,there are some problems that the features of different sizes and different directions are not sufficient when extracting the features in lung X-ray images.A pneumonia classification model based on multi-scale directional feature enhancement MSD-Net is proposed in this paper.The main innovations are as follows:Firstly,the Multi-scale Residual Feature Extraction Module(MRFEM)is designed to effectively extract multi-scale features.The MRFEM uses dilated convolutions with different expansion rates to increase the receptive field and extract multi-scale features effectively.Secondly,the Multi-scale Directional Feature Perception Module(MDFPM)is designed,which uses a three-branch structure of different sizes convolution to transmit direction feature layer by layer,and focuses on the target region to enhance the feature information.Thirdly,the Axial Compression Former Module(ACFM)is designed to perform global calculations to enhance the perception ability of global features in different directions.To verify the effectiveness of the MSD-Net,comparative experiments and ablation experiments are carried out.In the COVID-19 RADIOGRAPHY DATABASE,the Accuracy,Recall,Precision,F1 Score,and Specificity of MSD-Net are 97.76%,95.57%,95.52%,95.52%,and 98.51%,respectively.In the chest X-ray dataset,the Accuracy,Recall,Precision,F1 Score and Specificity of MSD-Net are 97.78%,95.22%,96.49%,95.58%,and 98.11%,respectively.This model improves the accuracy of lung image recognition effectively and provides an important clinical reference to pneumonia Computer-Aided Diagnosis.
基金Supported by Hebei Provincial Key Laboratory for Software Engineering(Grant No.22567637H)the"Rail Vehicle Application Engineering"National International Science and Technology Cooperation Base Open Project Fund(Grant No.BMRV21KF09).
文摘Aiming at the problems of inaccuracy in detecting theαphase contour of TB6 titanium alloy.By combining computer vision technology with human vision mechanisms,the spatial characteristics of theαphase can be simulated to obtain the contour accurately.Therefore,an algorithm forαphase contour detection of TB6 titanium alloy fused with multi-scale fretting features is proposed.Firstly,through the response of the classical receptive field model based on fretting and the suppression of new non-classical receptive field model based on fretting,the information maps of theαphase contour of the TB6 titanium alloy at different scales are obtained;then the information map of the smallest scale contour is used as a benchmark,the neighborhood is constructed to judge the deviation of other scale contour information,and the corresponding weight value is calculated;finally,Gaussian function is used to weight and fuse the deviation information,and the contour detection result of TB6 titanium alloyαphase is obtained.In the Visual Studio 2013 environment,484 metallographic images with different temperatures,strain rates,and magnifications were tested.The results show that the performance evaluation F value of the proposed algorithm is 0.915,which can effectively improve the accuracy ofαphase contour detection of TB6 titanium alloy.
基金the Scientific Research Foundation of Liaoning Provincial Department of Education(No.LJKZ0139)the Program for Liaoning Excellent Talents in University(No.LR15045).
文摘In order to improve the models capability in expressing features during few-shot learning,a multi-scale features prototypical network(MS-PN)algorithm is proposed.The metric learning algo-rithm is employed to extract image features and project them into a feature space,thus evaluating the similarity between samples based on their relative distances within the metric space.To sufficiently extract feature information from limited sample data and mitigate the impact of constrained data vol-ume,a multi-scale feature extraction network is presented to capture data features at various scales during the process of image feature extraction.Additionally,the position of the prototype is fine-tuned by assigning weights to data points to mitigate the influence of outliers on the experiment.The loss function integrates contrastive loss and label-smoothing to bring similar data points closer and separate dissimilar data points within the metric space.Experimental evaluations are conducted on small-sample datasets mini-ImageNet and CUB200-2011.The method in this paper can achieve higher classification accuracy.Specifically,in the 5-way 1-shot experiment,classification accuracy reaches 50.13%and 66.79%respectively on these two datasets.Moreover,in the 5-way 5-shot ex-periment,accuracy of 66.79%and 85.91%are observed,respectively.
基金supported in part by the National Natural Science Foundation of China under Grants 62463002,62062021 and 62473033in part by the Guiyang Scientific Plan Project[2023]48–11,in part by QKHZYD[2023]010 Guizhou Province Science and Technology Innovation Base Construction Project“Key Laboratory Construction of Intelligent Mountain Agricultural Equipment”.
文摘Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it challenging to collect defective samples.Additionally,the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions.This paper proposes a novel Lightweight Multiscale Feature Fusion network(LMFF)to address these challenges.The network comprises a feature extraction network,a multi-scale feature fusion module(MFF),and a segmentation network.Specifically,a feature extraction network is proposed to obtain multi-scale feature outputs,and a multi-scale feature fusion module(MFF)is used to fuse multi-scale feature information effectively.In order to capture finer-grained multi-scale information from the fusion features,we propose a multi-scale attention module(MSA)in the segmentation network to enhance the network’s ability for small target detection.Moreover,depthwise separable convolutions are introduced to construct depthwise separable residual blocks(DSR)to reduce the model’s parameter number.Finally,to validate the proposed method’s defect segmentation and localization performance,we constructed three solar cell defect detection datasets:SolarCells,SolarCells-S,and PVEL-S.SolarCells and SolarCells-S are monocrystalline silicon datasets,and PVEL-S is a polycrystalline silicon dataset.Experimental results show that the IOU of our method on these three datasets can reach 68.5%,51.0%,and 92.7%,respectively,and the F1-Score can reach 81.3%,67.5%,and 96.2%,respectively,which surpasses other commonly usedmethods and verifies the effectiveness of our LMFF network.
基金Open Access funding provided by the National Institutes of Health(NIH)The funding for this project was provided by NCATS Intramural Fund.
文摘Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional approaches often fail in the face of challenges such as low contrast, morphological variability, and densely packed structures. Recent advancements in deep learning have transformed segmentation capabilities through the integration of fine-scale detail preservation, coarse-scale contextual modeling, and multi-scale feature fusion. This work provides a comprehensive analysis of state-of-the-art deep learning models, including U-Net variants, attention-based frameworks, and Transformer-integrated networks, highlighting innovations that improve accuracy, generalizability, and computational efficiency. Key architectural components such as convolution operations, shallow and deep blocks, skip connections, and hybrid encoders are examined for their roles in enhancing spatial representation and semantic consistency. We further discuss the importance of hierarchical and instance-aware segmentation and annotation in interpreting complex biological scenes and multiplexed medical images. By bridging methodological developments with diverse application domains, this paper outlines current trends and future directions for semantic segmentation, emphasizing its critical role in facilitating annotation, diagnosis, and discovery in biomedical research.
基金funded by the National Natural Science Foundation of China(Grant No.52175028).
文摘Accurate and robust navigation in complex surgical environments is crucial for bronchoscopic surgeries.This study purposes a bronchoscopic lumen feature matching network(BLFM-Net)based on deep learning to address the challenges of image noise,anatomical complexity,and the stringent real-time requirements.The BLFM-Net enhances bronchoscopic image processing by integrating several functional modules.The FFA-Net preprocessing module mitigates image fogging and improves visual clarity for subsequent processing.The feature extraction module derives multi-dimensional features,such as centroids,area,and shape descriptors,from dehazed images.The Faster RCNN Object detection module detects bronchial regions of interest and generates bounding boxes to localize key areas.The feature matching module accelerates the process by combining detection boxes,extracted features,and a KD-Tree(K-Dimensional Tree)-based algorithm,ensuring efficient and accurate regional feature associations.The BLFM-Net was evaluated on 5212 bronchoscopic images,demonstrating superior performance compared to traditional and other deep learning-based image matching methods.It achieved real-time matching with an average frame time of 6 ms,with a matching accuracy of over 96%.The method remained robust under challenging conditions including frame dropping(0,5,10,20),shadowed regions,and variable lighting,maintaining accuracy of above 94%even with the frame dropping of 20.This study presents BLFM-Net,a deep learning-based matching network designed to enhance and match bronchial features in bronchoscopic images.The BLFM-Net shows improved accuracy,real-time performance,and reliability,making a valuable tool for bronchoscopic surgeries.
基金supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China(Grant Nos.2023AH040149 and 2024AH051915)the Anhui Provincial Natural Science Foundation(Grant No.2208085MF168)+1 种基金the Science and Technology Innovation Tackle Plan Project of Maanshan(Grant No.2024RGZN001)the Scientific Research Fund Project of Anhui Medical University(Grant No.2023xkj122).
文摘Convolutional neural networks(CNNs)-based medical image segmentation technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities.However,due to the inability to effectively capture global information from images,CNNs can easily lead to loss of contours and textures in segmentation results.Notice that the transformer model can effectively capture the properties of long-range dependencies in the image,and furthermore,combining the CNN and the transformer can effectively extract local details and global contextual features of the image.Motivated by this,we propose a multi-branch and multi-scale attention network(M2ANet)for medical image segmentation,whose architecture consists of three components.Specifically,in the first component,we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce information loss caused by downsampling.In the second component,we apply residual block to the well-known convolutional block attention module to enhance the network’s ability to recognize important features of images and alleviate the phenomenon of gradient vanishing.In the third component,we design a multi-scale feature fusion module,in which we adopt adaptive average pooling and position encoding to enhance contextual features,and then multi-head attention is introduced to further enrich feature representation.Finally,we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets,particularly in the context of preserving contours and textures.
基金fully supported by the Frontier Exploration Projects of Longmen Laboratory(No.LMQYTSKT034)Key Research and Development and Promotion of Special(Science and Technology)Project of Henan Province,China(No.252102210158)。
文摘The capacity to diagnose faults in rolling bearings is of significant practical importance to ensure the normal operation of the equipment.Frequency-domain features can effectively enhance the identification of fault modes.However,existing methods often suffer from insufficient frequency-domain representation in practical applications,which greatly affects diagnostic performance.Therefore,this paper proposes a rolling bearing fault diagnosismethod based on aMulti-Scale FusionNetwork(MSFN)using the Time-Division Fourier Transform(TDFT).The method constructs multi-scale channels to extract time-domain and frequency-domain features of the signal in parallel.A multi-level,multi-scale filter-based approach is designed to extract frequency-domain features in a segmented manner.A cross-attention mechanism is introduced to facilitate the fusion of the extracted time-frequency domain features.The performance of the proposed method is validated using the CWRU and Ottawa datasets.The results show that the average accuracy of MSFN under complex noisy signals is 97.75%and 94.41%.The average accuracy under variable load conditions is 98.68%.This demonstrates its significant application potential compared to existing methods.
基金supported by grants fromthe North China University of Technology Research Start-Up Fund(11005136024XN147-14)and(110051360024XN151-97)Guangzhou Development Zone Science and Technology Project(2023GH02)+4 种基金the National Key R&D Program of China(2021YFE0201100 and 2022YFA1103401 to Juntao Gao)National Natural Science Foundation of China(981890991 to Juntao Gao)Beijing Municipal Natural Science Foundation(Z200021 to Juntao Gao)CAS Interdisciplinary Innovation Team(JCTD-2020-04 to Juntao Gao)0032/2022/A,by Macao FDCT,and MYRG2022-00271-FST.
文摘Hematoxylin and Eosin(H&E)images,popularly used in the field of digital pathology,often pose challenges due to their limited color richness,hindering the differentiation of subtle cell features crucial for accurate classification.Enhancing the visibility of these elusive cell features helps train robust deep-learning models.However,the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community.To address this challenge,we introduce Salient Features Guided Augmentation(SFGA),an approach that strategically integrates machine learning and image processing.SFGA utilizes machine learning algorithms to identify crucial features within cell images,subsequently mapping these features to appropriate image processing techniques to enhance training images.By emphasizing salient features and aligning them with corresponding image processing methods,SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks.Our research undertakes a series of experiments,each exploring the performance of different datasets and data enhancement techniques in classifying cell types,highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics.Specifically,SFGA focuses on identifying tumor cells from tissue for extranodal extension detection,with the SFGA-enhanced dataset showing notable advantages in accuracy.We conducted a preliminary study of five experiments,among which the accuracy of the pleomorphism experiment improved significantly from 50.81%to 95.15%.The accuracy of the other four experiments also increased,with improvements ranging from 3 to 43 percentage points.Our preliminary study shows the possibilities to enhance the diagnostic accuracy of deep learning models and proposes a systematic approach that could enhance cancer diagnosis,contributing as a first step in using SFGA in medical image enhancement.
基金funded by Universiti Teknologi Malaysia under the UTM RA ICONIC Grant(Q.J130000.4351.09G61).
文摘Advanced Persistent Threats(APTs)represent one of the most complex and dangerous categories of cyber-attacks characterised by their stealthy behaviour,long-term persistence,and ability to bypass traditional detection systems.The complexity of real-world network data poses significant challenges in detection.Machine learning models have shown promise in detecting APTs;however,their performance often suffers when trained on large datasets with redundant or irrelevant features.This study presents a novel,hybrid feature selection method designed to improve APT detection by reducing dimensionality while preserving the informative characteristics of the data.It combines Mutual Information(MI),Symmetric Uncertainty(SU)and Minimum Redundancy Maximum Relevance(mRMR)to enhance feature selection.MI and SU assess feature relevance,while mRMR maximises relevance and minimises redundancy,ensuring that the most impactful features are prioritised.This method addresses redundancy among selected features,improving the overall efficiency and effectiveness of the detection model.Experiments on a real-world APT datasets were conducted to evaluate the proposed method.Multiple classifiers including,Random Forest,Support Vector Machine(SVM),Gradient Boosting,and Neural Networks were used to assess classification performance.The results demonstrate that the proposed feature selection method significantly enhances detection accuracy compared to baseline models trained on the full feature set.The Random Forest algorithm achieved the highest performance,with near-perfect accuracy,precision,recall,and F1 scores(99.97%).The proposed adaptive thresholding algorithm within the selection method allows each classifier to benefit from a reduced and optimised feature space,resulting in improved training and predictive performance.This research offers a scalable and classifier-agnostic solution for dimensionality reduction in cybersecurity applications.