False Data Injection Attacks(FDIAs)pose a critical security threat to modern power grids,corrupting state estimation and enabling malicious control actions that can lead to severe consequences,including cascading fail...False Data Injection Attacks(FDIAs)pose a critical security threat to modern power grids,corrupting state estimation and enabling malicious control actions that can lead to severe consequences,including cascading failures,large-scale blackouts,and significant economic losses.While detecting attacks is important,accurately localizing compromised nodes or measurements is even more critical,as it enables timely mitigation,targeted response,and enhanced system resilience beyond what detection alone can offer.Existing research typically models topological features using fixed structures,which can introduce irrelevant information and affect the effectiveness of feature extraction.To address this limitation,this paper proposes an FDIA localization model with adaptive neighborhood selection,which dynamically captures spatial dependencies of the power grid by adjusting node relationships based on data-driven similarities.The improved Transformer is employed to pre-fuse global spatial features of the graph,enriching the feature representation.To improve spatio-temporal correlation extraction for FDIA localization,the proposed model employs dilated causal convolution with a gating mechanism combined with graph convolution to capture and fuse long-range temporal features and adaptive topological features.This fully exploits the temporal dynamics and spatial dependencies inherent in the power grid.Finally,multi-source information is integrated to generate highly robust node embeddings,enhancing FDIA detection and localization.Experiments are conducted on IEEE 14,57,and 118-bus systems,and the results demonstrate that the proposed model substantially improves the accuracy of FDIA localization.Additional experiments are conducted to verify the effectiveness and robustness of the proposed model.展开更多
The fluidity of coal-water slurry(CWS)is crucial for various industrial applications such as long-distance transportation,gasification,and combustion.However,there is currently a lack of rapid and accurate detection m...The fluidity of coal-water slurry(CWS)is crucial for various industrial applications such as long-distance transportation,gasification,and combustion.However,there is currently a lack of rapid and accurate detection methods for assessing CWS fluidity.This paper proposed a method for analyzing the fluidity using videos of CWS dripping processes.By integrating the temporal and spatial features of each frame in the video,a multi-cascade classifier for CWS fluidity is established.The classifier distinguishes between four levels(A,B,C,and D)based on the quality of fluidity.The preliminary classification of A and D is achieved through feature engineering and the XGBoost algorithm.Subsequently,convolutional neural networks(CNN)and long short-term memory(LSTM)are utilized to further differentiate between the B and C categories which are prone to confusion.Finally,through detailed comparative experiments,the paper demonstrates the step-by-step design process of the proposed method and the superiority of the final solution.The proposed method achieves an accuracy rate of over 90%in determining the fluidity of CWS,serving as a technical reference for future industrial applications.展开更多
Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combin...Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal feature and global positional distribution information(PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points by using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform(3D SIFT) descriptors are extracted for every interest point. In order to obtain a compact description and efficient computation, the principal component analysis(PCA) method is utilized twice on the 3D SIFT descriptors of single frame and multiple frames. Simultaneously, the PDI of the interest points are computed and combined with the above features. The combined features are quantified and selected and finally tested by using the support vector machine(SVM) recognition algorithm on the public KTH dataset. The testing results have showed that the recognition rate has been significantly improved and the proposed features can more accurately describe human motion with high adaptability to scenarios.展开更多
The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which...The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service.Failure prediction is an important means of ensuring service availability.Predicting node failure in cloud-based data centers is challenging because the failure symptoms reflected have complex characteristics,and the distribution imbalance between the failure sample and the normal sample is widespread,resulting in inaccurate failure prediction.Targeting these challenges,this paper proposes a novel failure prediction method FP-STE(Failure Prediction based on Spatio-temporal Feature Extraction).Firstly,an improved recurrent neural network HW-GRU(Improved GRU based on HighWay network)and a convolutional neural network CNN are used to extract the temporal features and spatial features of multivariate data respectively to increase the discrimination of different types of failure symptoms which improves the accuracy of prediction.Then the intermediate results of the two models are added as features into SCSXGBoost to predict the possibility and the precise type of node failure in the future.SCS-XGBoost is an ensemble learning model that is improved by the integrated strategy of oversampling and cost-sensitive learning.Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.展开更多
Due to water conflicts and allocation in the Lancang-Mekong River Basin(LMRB),the spatio-temporal differentiation of total water resources and the natural-human influence need to be clarified.This work investigated LM...Due to water conflicts and allocation in the Lancang-Mekong River Basin(LMRB),the spatio-temporal differentiation of total water resources and the natural-human influence need to be clarified.This work investigated LMRB's terrestrial water storage anomaly(TWSA)and its spatio-temporal dynamics during 2002–2020.Considering the effects of natural factors and human activities,the respective contributions of climate variability and human activities to terrestrial water storage change(TWSC)were separated.Results showed that:(1)LMRB's TWSA decreased by 0.3158 cm/a.(2)TWSA showed a gradual increase in distribution from southwest of MRB to middle LMRB and from northeast of LRB to middle LMRB.TWSA positively changed in Myanmar while slightly changed in Laos and China.It negatively changed in Vietnam,Thailand and Cambodia.(3)TWSA components decreased in a descending order of soil moisture,groundwater and precipitation.(4)Natural factors had a substantial and spatial differentiated influence on TWSA over the LMRB.(5)Climate variability contributed 79%of TWSC in the LMRB while human activities contributed 21%with an increasing impact after 2008.The TWSC of upstream basin countries was found to be controlled by climate variability while Vietnam and Cambodia's TWSC has been controlled by human activities since 2012.展开更多
Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially to areas such as aircraft manufacturing, which ...Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially to areas such as aircraft manufacturing, which require strict part qualification rates. Although being more efficient and practical, few-shot AD has not been well explored. The existing AD methods only extract features in a single frequency while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to build and select pattern descriptors in a series of frequencies adaptively. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns into support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.展开更多
An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyram...An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.展开更多
The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to u...The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.展开更多
In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Ela...In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Elatty E.Abd Elgawad Computers,Materials&Continua,2022,Vol.70,No.1,pp.1617–1630.DOI:10.32604/cmc.2022.018621,URL:https://www.techscience.com/cmc/v70n1/44361,there was an error regarding the affiliation for the author Hafiz Tayyab Rauf.Instead of“Centre for Smart Systems,AI and Cybersecurity,Staffordshire University,Stoke-on-Trent,UK”,the affiliation should be“Independent Researcher,Bradford,BD80HS,UK”.展开更多
The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence(AI)techniques(such as machine learning(ML)and deep learning(DL))to build more e...The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence(AI)techniques(such as machine learning(ML)and deep learning(DL))to build more efficient and reliable intrusion detection systems(IDSs).However,the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs.Many researchers used data preprocessing techniques such as feature selection and normalization to overcome such issues.While most of these researchers reported the success of these preprocessing techniques on a shallow level,very few studies have been performed on their effects on a wider scale.Furthermore,the performance of an IDS model is subject to not only the utilized preprocessing techniques but also the dataset and the ML/DL algorithm used,which most of the existing studies give little emphasis on.Thus,this study provides an in-depth analysis of feature selection and normalization effects on IDS models built using three IDS datasets:NSL-KDD,UNSW-NB15,and CSE–CIC–IDS2018,and various AI algorithms.A wrapper-based approach,which tends to give superior performance,and min-max normalization methods were used for feature selection and normalization,respectively.Numerous IDS models were implemented using the full and feature-selected copies of the datasets with and without normalization.The models were evaluated using popular evaluation metrics in IDS modeling,intra-and inter-model comparisons were performed between models and with state-of-the-art works.Random forest(RF)models performed better on NSL-KDD and UNSW-NB15 datasets with accuracies of 99.86%and 96.01%,respectively,whereas artificial neural network(ANN)achieved the best accuracy of 95.43%on the CSE–CIC–IDS2018 dataset.The RF models also achieved an excellent performance compared to recent works.The results show that normalization and feature selection positively affect IDS modeling.Furthermore,while feature selection benefits simpler algorithms(such as RF),normalization is more useful for complex algorithms like ANNs and deep neural networks(DNNs),and algorithms such as Naive Bayes are unsuitable for IDS modeling.The study also found that the UNSW-NB15 and CSE–CIC–IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDS than the NSL-KDD dataset.Our findings suggest that prioritizing robust algorithms like RF,alongside complex models such as ANN and DNN,can significantly enhance IDS performance.These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.展开更多
Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it cha...Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it challenging to collect defective samples.Additionally,the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions.This paper proposes a novel Lightweight Multiscale Feature Fusion network(LMFF)to address these challenges.The network comprises a feature extraction network,a multi-scale feature fusion module(MFF),and a segmentation network.Specifically,a feature extraction network is proposed to obtain multi-scale feature outputs,and a multi-scale feature fusion module(MFF)is used to fuse multi-scale feature information effectively.In order to capture finer-grained multi-scale information from the fusion features,we propose a multi-scale attention module(MSA)in the segmentation network to enhance the network’s ability for small target detection.Moreover,depthwise separable convolutions are introduced to construct depthwise separable residual blocks(DSR)to reduce the model’s parameter number.Finally,to validate the proposed method’s defect segmentation and localization performance,we constructed three solar cell defect detection datasets:SolarCells,SolarCells-S,and PVEL-S.SolarCells and SolarCells-S are monocrystalline silicon datasets,and PVEL-S is a polycrystalline silicon dataset.Experimental results show that the IOU of our method on these three datasets can reach 68.5%,51.0%,and 92.7%,respectively,and the F1-Score can reach 81.3%,67.5%,and 96.2%,respectively,which surpasses other commonly usedmethods and verifies the effectiveness of our LMFF network.展开更多
BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide,with a poor prognosis often attributed to late diagnosis.Understanding the correlation between pathological type and imaging features ...BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide,with a poor prognosis often attributed to late diagnosis.Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning.AIM To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features.METHODS We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution.Pathological types were determined by histopathological examination of the surgical spe-cimens or biopsy samples.The imaging features were assessed using computed tomography,magnetic resonance imaging,and endoscopic ultrasound.Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics.RESULTS There were 320(64%)cases of pancreatic ductal adenocarcinoma,75(15%)of intraductal papillary mucinous neoplasms,50(10%)of neuroendocrine tumors,and 55(11%)of other rare types.Distinct imaging features were identified in each pathological type.Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography,whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules.Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging.Statistical analysis revealed significant correlations between specific imaging features and pathological types(P<0.001).CONCLUSION This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features.These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.展开更多
During Donald Trump’s first term,the“Trump Shock”brought world politics into an era of uncertainties and pulled the transatlantic alliance down to its lowest point in history.The Trump 2.0 tsunami brewed by the 202...During Donald Trump’s first term,the“Trump Shock”brought world politics into an era of uncertainties and pulled the transatlantic alliance down to its lowest point in history.The Trump 2.0 tsunami brewed by the 2024 presidential election of the United States has plunged the U.S.-Europe relations into more gloomy waters,ushering in a more complex and turbulent period of adjustment.展开更多
Machine learning(ML)is increasingly applied for medical image processing with appropriate learning paradigms.These applications include analyzing images of various organs,such as the brain,lung,eye,etc.,to identify sp...Machine learning(ML)is increasingly applied for medical image processing with appropriate learning paradigms.These applications include analyzing images of various organs,such as the brain,lung,eye,etc.,to identify specific flaws/diseases for diagnosis.The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification.Most of the extracted image features are irrelevant and lead to an increase in computation time.Therefore,this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features.This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions.The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis.Later,the correlation based on intensity and distribution is analyzed to improve the feature selection congruency.Therefore,the more congruent pixels are sorted in the descending order of the selection,which identifies better regions than the distribution.Now,the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection.Therefore,the probability of feature selection,regardless of the textures and medical image patterns,is improved.This process enhances the performance of ML applications for different medical image processing.The proposed method improves the accuracy,precision,and training rate by 13.19%,10.69%,and 11.06%,respectively,compared to other models for the selected dataset.The mean error and selection time is also reduced by 12.56%and 13.56%,respectively,compared to the same models and dataset.展开更多
With the rapid expansion of social media,analyzing emotions and their causes in texts has gained significant importance.Emotion-cause pair extraction enables the identification of causal relationships between emotions...With the rapid expansion of social media,analyzing emotions and their causes in texts has gained significant importance.Emotion-cause pair extraction enables the identification of causal relationships between emotions and their triggers within a text,facilitating a deeper understanding of expressed sentiments and their underlying reasons.This comprehension is crucial for making informed strategic decisions in various business and societal contexts.However,recent research approaches employing multi-task learning frameworks for modeling often face challenges such as the inability to simultaneouslymodel extracted features and their interactions,or inconsistencies in label prediction between emotion-cause pair extraction and independent assistant tasks like emotion and cause extraction.To address these issues,this study proposes an emotion-cause pair extraction methodology that incorporates joint feature encoding and task alignment mechanisms.The model consists of two primary components:First,joint feature encoding simultaneously generates features for emotion-cause pairs and clauses,enhancing feature interactions between emotion clauses,cause clauses,and emotion-cause pairs.Second,the task alignment technique is applied to reduce the labeling distance between emotion-cause pair extraction and the two assistant tasks,capturing deep semantic information interactions among tasks.The proposed method is evaluated on a Chinese benchmark corpus using 10-fold cross-validation,assessing key performance metrics such as precision,recall,and F1 score.Experimental results demonstrate that the model achieves an F1 score of 76.05%,surpassing the state-of-the-art by 1.03%.The proposed model exhibits significant improvements in emotion-cause pair extraction(ECPE)and cause extraction(CE)compared to existing methods,validating its effectiveness.This research introduces a novel approach based on joint feature encoding and task alignment mechanisms,contributing to advancements in emotion-cause pair extraction.However,the study’s limitation lies in the data sources,potentially restricting the generalizability of the findings.展开更多
Joint Multimodal Aspect-based Sentiment Analysis(JMASA)is a significant task in the research of multimodal fine-grained sentiment analysis,which combines two subtasks:Multimodal Aspect Term Extraction(MATE)and Multimo...Joint Multimodal Aspect-based Sentiment Analysis(JMASA)is a significant task in the research of multimodal fine-grained sentiment analysis,which combines two subtasks:Multimodal Aspect Term Extraction(MATE)and Multimodal Aspect-oriented Sentiment Classification(MASC).Currently,most existing models for JMASA only perform text and image feature encoding from a basic level,but often neglect the in-depth analysis of unimodal intrinsic features,which may lead to the low accuracy of aspect term extraction and the poor ability of sentiment prediction due to the insufficient learning of intra-modal features.Given this problem,we propose a Text-Image Feature Fine-grained Learning(TIFFL)model for JMASA.First,we construct an enhanced adjacency matrix of word dependencies and adopt graph convolutional network to learn the syntactic structure features for text,which addresses the context interference problem of identifying different aspect terms.Then,the adjective-noun pairs extracted from image are introduced to enable the semantic representation of visual features more intuitive,which addresses the ambiguous semantic extraction problem during image feature learning.Thereby,the model performance of aspect term extraction and sentiment polarity prediction can be further optimized and enhanced.Experiments on two Twitter benchmark datasets demonstrate that TIFFL achieves competitive results for JMASA,MATE and MASC,thus validating the effectiveness of our proposed methods.展开更多
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Objective To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis(TB)in the Guangdong Province between 2010 and 2019.Method Spatial and temporal variations in TB ...Objective To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis(TB)in the Guangdong Province between 2010 and 2019.Method Spatial and temporal variations in TB incidence were mapped using heat maps and hierarchical clustering.Socioenvironmental influencing factors were evaluated using a Bayesian spatiotemporal conditional autoregressive(ST-CAR)model.Results Annual incidence of TB in Guangdong decreased from 91.85/100,000 in 2010 to 53.06/100,000in 2019.Spatial hotspots were found in northeastern Guangdong,particularly in Heyuan,Shanwei,and Shantou,while Shenzhen,Dongguan,and Foshan had the lowest rates in the Pearl River Delta.The STCAR model showed that the TB risk was lower with higher per capita Gross Domestic Product(GDP)[Relative Risk(RR),0.91;95%Confidence Interval(CI):0.86–0.98],more the ratio of licensed physicians and physician(RR,0.94;95%CI:0.90-0.98),and higher per capita public expenditure(RR,0.94;95%CI:0.90–0.97),with a marginal effect of population density(RR,0.86;95%CI:0.86–1.00).Conclusion The incidence of TB in Guangdong varies spatially and temporally.Areas with poor economic conditions and insufficient healthcare resources are at an increased risk of TB infection.Strategies focusing on equitable health resource distribution and economic development are the key to TB control.展开更多
The self-attention mechanism of Transformers,which captures long-range contextual information,has demonstrated significant potential in image segmentation.However,their ability to learn local,contextual relationships ...The self-attention mechanism of Transformers,which captures long-range contextual information,has demonstrated significant potential in image segmentation.However,their ability to learn local,contextual relationships between pixels requires further improvement.Previous methods face challenges in efficiently managing multi-scale fea-tures of different granularities from the encoder backbone,leaving room for improvement in their global representation and feature extraction capabilities.To address these challenges,we propose a novel Decoder with Multi-Head Feature Receptors(DMHFR),which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities:coarse,fine-grained,and full set.These groups are subsequently processed by Multi-Head Feature Receptors(MHFRs)after feature capture and modeling operations.MHFRs include two Three-Head Feature Receptors(THFRs)and one Four-Head Feature Receptor(FHFR).Each group of features is passed through these MHFRs and then fed into axial transformers,which help the model capture long-range dependencies within the features.The three MHFRs produce three distinct feature outputs.The output from the FHFR serves as auxiliary auxiliary features in the prediction head,and the prediction output and their losses will eventually be aggregated.Experimental results show that the Transformer using DMHFR outperforms 15 state of the arts(SOTA)methods on five public datasets.Specifically,it achieved significant improvements in mean DICE scores over the classic Parallel Reverse Attention Network(PraNet)method,with gains of 4.1%,2.2%,1.4%,8.9%,and 16.3%on the CVC-ClinicDB,Kvasir-SEG,CVC-T,CVC-ColonDB,and ETIS-LaribPolypDB datasets,respectively.展开更多
Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure p...Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.展开更多
基金supported by National Key Research and Development Plan of China(No.2022YFB3103304).
文摘False Data Injection Attacks(FDIAs)pose a critical security threat to modern power grids,corrupting state estimation and enabling malicious control actions that can lead to severe consequences,including cascading failures,large-scale blackouts,and significant economic losses.While detecting attacks is important,accurately localizing compromised nodes or measurements is even more critical,as it enables timely mitigation,targeted response,and enhanced system resilience beyond what detection alone can offer.Existing research typically models topological features using fixed structures,which can introduce irrelevant information and affect the effectiveness of feature extraction.To address this limitation,this paper proposes an FDIA localization model with adaptive neighborhood selection,which dynamically captures spatial dependencies of the power grid by adjusting node relationships based on data-driven similarities.The improved Transformer is employed to pre-fuse global spatial features of the graph,enriching the feature representation.To improve spatio-temporal correlation extraction for FDIA localization,the proposed model employs dilated causal convolution with a gating mechanism combined with graph convolution to capture and fuse long-range temporal features and adaptive topological features.This fully exploits the temporal dynamics and spatial dependencies inherent in the power grid.Finally,multi-source information is integrated to generate highly robust node embeddings,enhancing FDIA detection and localization.Experiments are conducted on IEEE 14,57,and 118-bus systems,and the results demonstrate that the proposed model substantially improves the accuracy of FDIA localization.Additional experiments are conducted to verify the effectiveness and robustness of the proposed model.
基金supported by the Youth Fund of the National Natural Science Foundation of China(No.52304311)the National Natural Science Foundation of China(No.52274282)the Postdoctoral Fellowship Program of CPSF(No.GZC20233016)。
文摘The fluidity of coal-water slurry(CWS)is crucial for various industrial applications such as long-distance transportation,gasification,and combustion.However,there is currently a lack of rapid and accurate detection methods for assessing CWS fluidity.This paper proposed a method for analyzing the fluidity using videos of CWS dripping processes.By integrating the temporal and spatial features of each frame in the video,a multi-cascade classifier for CWS fluidity is established.The classifier distinguishes between four levels(A,B,C,and D)based on the quality of fluidity.The preliminary classification of A and D is achieved through feature engineering and the XGBoost algorithm.Subsequently,convolutional neural networks(CNN)and long short-term memory(LSTM)are utilized to further differentiate between the B and C categories which are prone to confusion.Finally,through detailed comparative experiments,the paper demonstrates the step-by-step design process of the proposed method and the superiority of the final solution.The proposed method achieves an accuracy rate of over 90%in determining the fluidity of CWS,serving as a technical reference for future industrial applications.
基金supported by National Natural Science Foundation of China(No.61103123)Scientific Research Foundation for the Returned Overseas Chinese Scholars,State Education Ministry
文摘Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal feature and global positional distribution information(PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points by using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform(3D SIFT) descriptors are extracted for every interest point. In order to obtain a compact description and efficient computation, the principal component analysis(PCA) method is utilized twice on the 3D SIFT descriptors of single frame and multiple frames. Simultaneously, the PDI of the interest points are computed and combined with the above features. The combined features are quantified and selected and finally tested by using the support vector machine(SVM) recognition algorithm on the public KTH dataset. The testing results have showed that the recognition rate has been significantly improved and the proposed features can more accurately describe human motion with high adaptability to scenarios.
基金supported in part by National Key Research and Development Program of China(2019YFB2103200)NSFC(61672108),Open Subject Funds of Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory(SKX182010049)+1 种基金the Fundamental Research Funds for the Central Universities(5004193192019PTB-019)the Industrial Internet Innovation and Development Project 2018 of China.
文摘The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service.Failure prediction is an important means of ensuring service availability.Predicting node failure in cloud-based data centers is challenging because the failure symptoms reflected have complex characteristics,and the distribution imbalance between the failure sample and the normal sample is widespread,resulting in inaccurate failure prediction.Targeting these challenges,this paper proposes a novel failure prediction method FP-STE(Failure Prediction based on Spatio-temporal Feature Extraction).Firstly,an improved recurrent neural network HW-GRU(Improved GRU based on HighWay network)and a convolutional neural network CNN are used to extract the temporal features and spatial features of multivariate data respectively to increase the discrimination of different types of failure symptoms which improves the accuracy of prediction.Then the intermediate results of the two models are added as features into SCSXGBoost to predict the possibility and the precise type of node failure in the future.SCS-XGBoost is an ensemble learning model that is improved by the integrated strategy of oversampling and cost-sensitive learning.Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.
基金National Natural Science Foundation of China,No.42161006Yunnan Fundamental Research Projects No.202201AT070094,No.202301BF070001-004+1 种基金Special Project for High-level Talents of Yunnan Province for Young Top Talents,No.C6213001159European Research Council(ERC)Starting-Grant STORIES,No.101040939。
文摘Due to water conflicts and allocation in the Lancang-Mekong River Basin(LMRB),the spatio-temporal differentiation of total water resources and the natural-human influence need to be clarified.This work investigated LMRB's terrestrial water storage anomaly(TWSA)and its spatio-temporal dynamics during 2002–2020.Considering the effects of natural factors and human activities,the respective contributions of climate variability and human activities to terrestrial water storage change(TWSC)were separated.Results showed that:(1)LMRB's TWSA decreased by 0.3158 cm/a.(2)TWSA showed a gradual increase in distribution from southwest of MRB to middle LMRB and from northeast of LRB to middle LMRB.TWSA positively changed in Myanmar while slightly changed in Laos and China.It negatively changed in Vietnam,Thailand and Cambodia.(3)TWSA components decreased in a descending order of soil moisture,groundwater and precipitation.(4)Natural factors had a substantial and spatial differentiated influence on TWSA over the LMRB.(5)Climate variability contributed 79%of TWSC in the LMRB while human activities contributed 21%with an increasing impact after 2008.The TWSC of upstream basin countries was found to be controlled by climate variability while Vietnam and Cambodia's TWSC has been controlled by human activities since 2012.
基金supported by the National Natural Science Foundation of China(No.52188102).
文摘Anomaly Detection (AD) has been extensively adopted in industrial settings to facilitate quality control of products. It is critical to industrial production, especially to areas such as aircraft manufacturing, which require strict part qualification rates. Although being more efficient and practical, few-shot AD has not been well explored. The existing AD methods only extract features in a single frequency while defects exist in multiple frequency domains. Moreover, current methods have not fully leveraged the few-shot support samples to extract input-related normal patterns. To address these issues, we propose an industrial few-shot AD method, Feature Extender for Anomaly Detection (FEAD), which extracts normal patterns in multiple frequency domains from few-shot samples under the guidance of the input sample. Firstly, to achieve better coverage of normal patterns in the input sample, we introduce a Sample-Conditioned Transformation Module (SCTM), which transforms support features under the guidance of the input sample to obtain extra normal patterns. Secondly, to effectively distinguish and localize anomaly patterns in multiple frequency domains, we devise an Adaptive Descriptor Construction Module (ADCM) to build and select pattern descriptors in a series of frequencies adaptively. Finally, an auxiliary task for SCTM is designed to ensure the diversity of transformations and include more normal patterns into support features. Extensive experiments on two widely used industrial AD datasets (MVTec-AD and VisA) demonstrate the effectiveness of the proposed FEAD.
基金supported by the National Natural Science Foundation of China(No.62241109)the Tianjin Science and Technology Commissioner Project(No.20YDTPJC01110)。
文摘An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.
文摘The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
文摘In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Elatty E.Abd Elgawad Computers,Materials&Continua,2022,Vol.70,No.1,pp.1617–1630.DOI:10.32604/cmc.2022.018621,URL:https://www.techscience.com/cmc/v70n1/44361,there was an error regarding the affiliation for the author Hafiz Tayyab Rauf.Instead of“Centre for Smart Systems,AI and Cybersecurity,Staffordshire University,Stoke-on-Trent,UK”,the affiliation should be“Independent Researcher,Bradford,BD80HS,UK”.
文摘The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence(AI)techniques(such as machine learning(ML)and deep learning(DL))to build more efficient and reliable intrusion detection systems(IDSs).However,the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs.Many researchers used data preprocessing techniques such as feature selection and normalization to overcome such issues.While most of these researchers reported the success of these preprocessing techniques on a shallow level,very few studies have been performed on their effects on a wider scale.Furthermore,the performance of an IDS model is subject to not only the utilized preprocessing techniques but also the dataset and the ML/DL algorithm used,which most of the existing studies give little emphasis on.Thus,this study provides an in-depth analysis of feature selection and normalization effects on IDS models built using three IDS datasets:NSL-KDD,UNSW-NB15,and CSE–CIC–IDS2018,and various AI algorithms.A wrapper-based approach,which tends to give superior performance,and min-max normalization methods were used for feature selection and normalization,respectively.Numerous IDS models were implemented using the full and feature-selected copies of the datasets with and without normalization.The models were evaluated using popular evaluation metrics in IDS modeling,intra-and inter-model comparisons were performed between models and with state-of-the-art works.Random forest(RF)models performed better on NSL-KDD and UNSW-NB15 datasets with accuracies of 99.86%and 96.01%,respectively,whereas artificial neural network(ANN)achieved the best accuracy of 95.43%on the CSE–CIC–IDS2018 dataset.The RF models also achieved an excellent performance compared to recent works.The results show that normalization and feature selection positively affect IDS modeling.Furthermore,while feature selection benefits simpler algorithms(such as RF),normalization is more useful for complex algorithms like ANNs and deep neural networks(DNNs),and algorithms such as Naive Bayes are unsuitable for IDS modeling.The study also found that the UNSW-NB15 and CSE–CIC–IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDS than the NSL-KDD dataset.Our findings suggest that prioritizing robust algorithms like RF,alongside complex models such as ANN and DNN,can significantly enhance IDS performance.These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.
基金supported in part by the National Natural Science Foundation of China under Grants 62463002,62062021 and 62473033in part by the Guiyang Scientific Plan Project[2023]48–11,in part by QKHZYD[2023]010 Guizhou Province Science and Technology Innovation Base Construction Project“Key Laboratory Construction of Intelligent Mountain Agricultural Equipment”.
文摘Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it challenging to collect defective samples.Additionally,the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions.This paper proposes a novel Lightweight Multiscale Feature Fusion network(LMFF)to address these challenges.The network comprises a feature extraction network,a multi-scale feature fusion module(MFF),and a segmentation network.Specifically,a feature extraction network is proposed to obtain multi-scale feature outputs,and a multi-scale feature fusion module(MFF)is used to fuse multi-scale feature information effectively.In order to capture finer-grained multi-scale information from the fusion features,we propose a multi-scale attention module(MSA)in the segmentation network to enhance the network’s ability for small target detection.Moreover,depthwise separable convolutions are introduced to construct depthwise separable residual blocks(DSR)to reduce the model’s parameter number.Finally,to validate the proposed method’s defect segmentation and localization performance,we constructed three solar cell defect detection datasets:SolarCells,SolarCells-S,and PVEL-S.SolarCells and SolarCells-S are monocrystalline silicon datasets,and PVEL-S is a polycrystalline silicon dataset.Experimental results show that the IOU of our method on these three datasets can reach 68.5%,51.0%,and 92.7%,respectively,and the F1-Score can reach 81.3%,67.5%,and 96.2%,respectively,which surpasses other commonly usedmethods and verifies the effectiveness of our LMFF network.
文摘BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide,with a poor prognosis often attributed to late diagnosis.Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning.AIM To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features.METHODS We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution.Pathological types were determined by histopathological examination of the surgical spe-cimens or biopsy samples.The imaging features were assessed using computed tomography,magnetic resonance imaging,and endoscopic ultrasound.Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics.RESULTS There were 320(64%)cases of pancreatic ductal adenocarcinoma,75(15%)of intraductal papillary mucinous neoplasms,50(10%)of neuroendocrine tumors,and 55(11%)of other rare types.Distinct imaging features were identified in each pathological type.Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography,whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules.Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging.Statistical analysis revealed significant correlations between specific imaging features and pathological types(P<0.001).CONCLUSION This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features.These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.
文摘During Donald Trump’s first term,the“Trump Shock”brought world politics into an era of uncertainties and pulled the transatlantic alliance down to its lowest point in history.The Trump 2.0 tsunami brewed by the 2024 presidential election of the United States has plunged the U.S.-Europe relations into more gloomy waters,ushering in a more complex and turbulent period of adjustment.
基金the Deanship of Scientifc Research at King Khalid University for funding this work through large group Research Project under grant number RGP2/421/45supported via funding from Prince Sattam bin Abdulaziz University project number(PSAU/2024/R/1446)+1 种基金supported by theResearchers Supporting Project Number(UM-DSR-IG-2023-07)Almaarefa University,Riyadh,Saudi Arabia.supported by the Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Education(No.2021R1F1A1055408).
文摘Machine learning(ML)is increasingly applied for medical image processing with appropriate learning paradigms.These applications include analyzing images of various organs,such as the brain,lung,eye,etc.,to identify specific flaws/diseases for diagnosis.The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification.Most of the extracted image features are irrelevant and lead to an increase in computation time.Therefore,this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features.This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions.The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis.Later,the correlation based on intensity and distribution is analyzed to improve the feature selection congruency.Therefore,the more congruent pixels are sorted in the descending order of the selection,which identifies better regions than the distribution.Now,the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection.Therefore,the probability of feature selection,regardless of the textures and medical image patterns,is improved.This process enhances the performance of ML applications for different medical image processing.The proposed method improves the accuracy,precision,and training rate by 13.19%,10.69%,and 11.06%,respectively,compared to other models for the selected dataset.The mean error and selection time is also reduced by 12.56%and 13.56%,respectively,compared to the same models and dataset.
文摘With the rapid expansion of social media,analyzing emotions and their causes in texts has gained significant importance.Emotion-cause pair extraction enables the identification of causal relationships between emotions and their triggers within a text,facilitating a deeper understanding of expressed sentiments and their underlying reasons.This comprehension is crucial for making informed strategic decisions in various business and societal contexts.However,recent research approaches employing multi-task learning frameworks for modeling often face challenges such as the inability to simultaneouslymodel extracted features and their interactions,or inconsistencies in label prediction between emotion-cause pair extraction and independent assistant tasks like emotion and cause extraction.To address these issues,this study proposes an emotion-cause pair extraction methodology that incorporates joint feature encoding and task alignment mechanisms.The model consists of two primary components:First,joint feature encoding simultaneously generates features for emotion-cause pairs and clauses,enhancing feature interactions between emotion clauses,cause clauses,and emotion-cause pairs.Second,the task alignment technique is applied to reduce the labeling distance between emotion-cause pair extraction and the two assistant tasks,capturing deep semantic information interactions among tasks.The proposed method is evaluated on a Chinese benchmark corpus using 10-fold cross-validation,assessing key performance metrics such as precision,recall,and F1 score.Experimental results demonstrate that the model achieves an F1 score of 76.05%,surpassing the state-of-the-art by 1.03%.The proposed model exhibits significant improvements in emotion-cause pair extraction(ECPE)and cause extraction(CE)compared to existing methods,validating its effectiveness.This research introduces a novel approach based on joint feature encoding and task alignment mechanisms,contributing to advancements in emotion-cause pair extraction.However,the study’s limitation lies in the data sources,potentially restricting the generalizability of the findings.
基金supported by the Science and Technology Project of Henan Province(No.222102210081).
文摘Joint Multimodal Aspect-based Sentiment Analysis(JMASA)is a significant task in the research of multimodal fine-grained sentiment analysis,which combines two subtasks:Multimodal Aspect Term Extraction(MATE)and Multimodal Aspect-oriented Sentiment Classification(MASC).Currently,most existing models for JMASA only perform text and image feature encoding from a basic level,but often neglect the in-depth analysis of unimodal intrinsic features,which may lead to the low accuracy of aspect term extraction and the poor ability of sentiment prediction due to the insufficient learning of intra-modal features.Given this problem,we propose a Text-Image Feature Fine-grained Learning(TIFFL)model for JMASA.First,we construct an enhanced adjacency matrix of word dependencies and adopt graph convolutional network to learn the syntactic structure features for text,which addresses the context interference problem of identifying different aspect terms.Then,the adjective-noun pairs extracted from image are introduced to enable the semantic representation of visual features more intuitive,which addresses the ambiguous semantic extraction problem during image feature learning.Thereby,the model performance of aspect term extraction and sentiment polarity prediction can be further optimized and enhanced.Experiments on two Twitter benchmark datasets demonstrate that TIFFL achieves competitive results for JMASA,MATE and MASC,thus validating the effectiveness of our proposed methods.
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金supported by the Guangdong Provincial Clinical Research Center for Tuberculosis(No.2020B1111170014)。
文摘Objective To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis(TB)in the Guangdong Province between 2010 and 2019.Method Spatial and temporal variations in TB incidence were mapped using heat maps and hierarchical clustering.Socioenvironmental influencing factors were evaluated using a Bayesian spatiotemporal conditional autoregressive(ST-CAR)model.Results Annual incidence of TB in Guangdong decreased from 91.85/100,000 in 2010 to 53.06/100,000in 2019.Spatial hotspots were found in northeastern Guangdong,particularly in Heyuan,Shanwei,and Shantou,while Shenzhen,Dongguan,and Foshan had the lowest rates in the Pearl River Delta.The STCAR model showed that the TB risk was lower with higher per capita Gross Domestic Product(GDP)[Relative Risk(RR),0.91;95%Confidence Interval(CI):0.86–0.98],more the ratio of licensed physicians and physician(RR,0.94;95%CI:0.90-0.98),and higher per capita public expenditure(RR,0.94;95%CI:0.90–0.97),with a marginal effect of population density(RR,0.86;95%CI:0.86–1.00).Conclusion The incidence of TB in Guangdong varies spatially and temporally.Areas with poor economic conditions and insufficient healthcare resources are at an increased risk of TB infection.Strategies focusing on equitable health resource distribution and economic development are the key to TB control.
基金supported by Xiamen Medical and Health Guidance Project in 2021(No.3502Z20214ZD1070)supported by a grant from Guangxi Key Laboratory of Machine Vision and Intelligent Control,China(No.2023B02).
文摘The self-attention mechanism of Transformers,which captures long-range contextual information,has demonstrated significant potential in image segmentation.However,their ability to learn local,contextual relationships between pixels requires further improvement.Previous methods face challenges in efficiently managing multi-scale fea-tures of different granularities from the encoder backbone,leaving room for improvement in their global representation and feature extraction capabilities.To address these challenges,we propose a novel Decoder with Multi-Head Feature Receptors(DMHFR),which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities:coarse,fine-grained,and full set.These groups are subsequently processed by Multi-Head Feature Receptors(MHFRs)after feature capture and modeling operations.MHFRs include two Three-Head Feature Receptors(THFRs)and one Four-Head Feature Receptor(FHFR).Each group of features is passed through these MHFRs and then fed into axial transformers,which help the model capture long-range dependencies within the features.The three MHFRs produce three distinct feature outputs.The output from the FHFR serves as auxiliary auxiliary features in the prediction head,and the prediction output and their losses will eventually be aggregated.Experimental results show that the Transformer using DMHFR outperforms 15 state of the arts(SOTA)methods on five public datasets.Specifically,it achieved significant improvements in mean DICE scores over the classic Parallel Reverse Attention Network(PraNet)method,with gains of 4.1%,2.2%,1.4%,8.9%,and 16.3%on the CVC-ClinicDB,Kvasir-SEG,CVC-T,CVC-ColonDB,and ETIS-LaribPolypDB datasets,respectively.
基金supported by the National Natural Science Foundation of China(No.62376287)the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Natural Science Foundation of Hunan Province(Nos.2022JJ30762,2023JJ70016).
文摘Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.