Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methods and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and improve brain tumor segmentation performance, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), into a SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieves better segmentation accuracy than baseline models on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET). Ablation studies verify the effectiveness of the proposed HCAD decoder block and clarify the rationale and contribution of the model design. The results are expected to improve the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
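For readers unfamiliar with the Dice score named in the preceding abstract, a minimal sketch follows. It assumes binary masks flattened to Python lists; the function name `dice_score` and the `eps` smoothing term are illustrative choices, not taken from the paper.

```python
def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled 1 in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps keeps the ratio defined (and equal to 1) when both masks are empty.
    return (2.0 * intersection + eps) / (total + eps)


# Hypothetical six-voxel masks, for illustration only.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(dice_score(pred, target), 4))
```

In a BraTS-style evaluation the same ratio would be computed separately for each subregion mask (WT, TC, ET).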
Reliable traffic flow prediction is crucial for mitigating urban congestion. This paper proposes the Attention-based spatiotemporal Interactive Dynamic Graph Convolutional Network (AIDGCN), a novel architecture integrating an Interactive Dynamic Graph Convolution Network (IDGCN) with Temporal Multi-Head Trend-Aware Attention. Its core innovation lies in IDGCN, which splits sequences into symmetric intervals for interactive feature sharing via dynamic graphs, and in a novel attention mechanism that incorporates convolutional operations to capture essential local traffic trends, addressing a critical gap in standard attention for continuous data. For 15- and 60-min forecasting on METR-LA, AIDGCN achieves MAEs of 0.75% and 0.39%, and RMSEs of 1.32% and 0.14%, respectively. In 60-min long-term forecasting on the PEMS-BAY dataset, AIDGCN outperforms the MRA-BGCN method by 6.28%, 4.93%, and 7.17% in terms of MAE, RMSE, and MAPE, respectively. Experimental results demonstrate the superiority of the proposed model over state-of-the-art methods.
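The three error metrics quoted in the preceding abstract (MAE, RMSE, MAPE) and a relative-improvement figure can be sketched as below. This is a plain-Python illustration under stated assumptions: the function names and the sample values are hypothetical, and skipping zero-valued targets in MAPE is one common convention rather than the paper's documented choice.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; zero targets are skipped."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


def relative_improvement(baseline_error, proposed_error):
    """Percentage by which the proposed model lowers a baseline error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical speed readings, for illustration only.
y_true = [60.0, 50.0, 40.0]
y_pred = [57.0, 52.0, 44.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```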
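As the share of photovoltaic (PV) power generation in the global energy system continues to rise, ultra-short-term PV power generation forecasting is critical to power system dispatch and safe operation. However, PV output is affected by many factors and exhibits marked randomness and volatility. To address this, an ultra-short-term PV power generation forecasting method based on a TCN-BiLSTM-Attention model is proposed. First, key features are selected through Pearson correlation analysis, outliers are detected with the isolation forest algorithm, and data preprocessing is completed with linear interpolation and standardization. Next, a Temporal Convolutional Network (TCN) extracts temporal features, a Bidirectional Long Short-Term Memory (BiLSTM) network captures forward and backward temporal dependencies, and an attention mechanism at the output focuses on the features of key time steps. Finally, comparative experiments on the Desert Knowledge Australia Solar Centre (DKASC) dataset show that, compared with conventional LSTM and BiLSTM models, the proposed TCN-BiLSTM-Attention model offers some advantages in prediction accuracy and stability.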
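Three of the preprocessing steps listed in the preceding abstract (Pearson correlation screening, linear interpolation, standardization) can be sketched in plain Python as follows. The function names and sample values are hypothetical, the isolation forest step is omitted because it needs an external library, and the paper's actual thresholds and feature set are not given in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; assumes neither series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def interpolate_gaps(series):
    """Fill interior None entries linearly between the nearest known values."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def standardize(series):
    """Z-score standardization using the population standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series]


# Hypothetical irradiance and power readings, for illustration only.
irradiance = [100.0, 200.0, 300.0, 400.0]
power = interpolate_gaps([1.0, None, None, 4.0])
print(power, pearson(irradiance, power), standardize(power))
```

A feature would be kept when the absolute value of its correlation with the target exceeds a chosen threshold; that threshold is a modelling decision left open here.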
Graph Federated Learning (GFL) has shown great potential in privacy protection and distributed intelligence through distributed collaborative training on graph-structured data without sharing raw information. However, existing GFL approaches often lack the capability for comprehensive feature extraction and adaptive optimization, particularly in non-independent and identically distributed (non-IID) scenarios where balancing global structural understanding and local node-level detail remains a challenge. To this end, this paper proposes a novel framework called GFL-SAR (Graph Federated Collaborative Learning Framework Based on Structural Amplification and Attention Refinement), which enhances the representation learning capability of graph data through a dual-branch collaborative design. Specifically, we propose the Structural Insight Amplifier (SIA), which uses an improved Graph Convolutional Network (GCN) to strengthen structural awareness and improve modeling of topological patterns. In parallel, we propose the Attentive Relational Refiner (ARR), which employs an enhanced Graph Attention Network (GAT) to perform fine-grained modeling of node relationships and neighborhood features, thereby improving the expressiveness of local interactions and preserving critical contextual information. GFL-SAR integrates multi-scale features from both branches via feature fusion and federated optimization, thereby addressing existing GFL limitations in structural modeling and feature representation. Experiments on standard benchmark datasets including Cora, Citeseer, Polblogs, and Cora_ML demonstrate that GFL-SAR achieves superior performance in classification accuracy, convergence speed, and robustness compared to existing methods, confirming its effectiveness and generalizability in GFL tasks.
Clock synchronization has important applications in multi-agent collaboration (such as drone light shows, intelligent transportation systems, and game AI), group decision-making, and emergency rescue operations. Synchronization methods based on pulse-coupled oscillators (PCOs) provide an effective solution for clock synchronization in wireless networks. However, existing clock synchronization algorithms for multi-agent ad hoc networks struggle to meet the high precision and high stability that group cooperation demands of a synchronized clock. Hence, this paper constructs a network model, named DAUNet (unsupervised neural network based on dual attention), to enhance clock synchronization accuracy in multi-agent wireless ad hoc networks. Specifically, we design an unsupervised distributed neural network framework as the backbone, building upon classical PCO-based synchronization methods. This framework resolves issues such as prolonged exchange of time synchronization messages between nodes, difficulties in centralized node coordination, and challenges in distributed training. Furthermore, we introduce a dual-attention mechanism as the core module of DAUNet. By integrating a Multi-Head Attention module and a Gated Attention module, the model significantly improves information extraction while reducing computational complexity, mitigating synchronization inaccuracy and instability in multi-agent ad hoc networks. To evaluate the proposed model, comparative experiments and ablation studies were conducted against classical methods and existing deep learning models. The results show that, compared with deep learning networks based on DASA and LSTM, DAUNet reduces the mean normalized phase difference (NPD) by 1 to 2 orders of magnitude. Compared with attention models based on additive attention and self-attention mechanisms, DAUNet improves performance by more than a factor of ten. This study demonstrates DAUNet's potential in advancing multi-agent ad hoc networking technologies.
In modern software development, characterized by increasing complexity and compressed development cycles, traditional static vulnerability detection methods face prominent challenges, including high false positive rates and missed detections of complex logic, because they rely too heavily on rule templates. This paper proposes a Syntax-Aware Hierarchical Attention Network (SAHAN) model, which achieves high-precision vulnerability detection through grammar-rule-driven multi-granularity code slicing and hierarchical semantic fusion mechanisms. The SAHAN model first generates Syntax Independent Units (SIUs) by slicing the code based on the Abstract Syntax Tree (AST) and predefined grammar rules, retaining vulnerability-sensitive contexts. Then, through a hierarchical attention mechanism, the local syntax-aware layer encodes fine-grained patterns within SIUs, while the global semantic correlation layer captures vulnerability chains across SIUs, achieving synergistic modeling of syntax and semantics. Experiments show that on benchmark datasets such as QEMU, SAHAN improves detection performance by 4.8% to 13.1% on average compared to baseline models such as Devign and VulDeePecker.
Audio-visual scene classification (AVSC) is challenging because of the intricate spatial-temporal relationships in audio-visual signals and the complex spatial patterns of objects and textures in visual images. Recent studies have focused mainly on extracting features with diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Dear Editor, This letter proposes the graph tensor alliance attention network (GT-A²T) to represent a dynamic graph (DG) precisely. Its main idea includes 1) establishing a unified spatio-temporal message propagation framework on a DG via the tensor product to capture the complex cohesive spatio-temporal interdependencies precisely, and 2) acquiring the alliance attention scores from node features and favorable high-order structural correlations.
Objective: To report the development, validation, and findings of the Multi-dimensional Attention Rating Scale (MARS), a self-report tool designed to evaluate attention levels along six dimensions. Methods: The MARS was developed based on Classical Test Theory (CTT). A total of 202 highly educated healthy adult participants were recruited for reliability and validity tests. Reliability was measured using Cronbach's alpha and test-retest reliability. Structural validity was explored using principal component analysis. Criterion validity was analyzed by correlating MARS scores with the Toronto Hospital Alertness Test (THAT), the Attentional Control Scale (ACS), and the Attention Network Test (ANT). Results: The MARS comprises 12 items spanning six distinct dimensions of attention: focused attention, sustained attention, shifting attention, selective attention, divided attention, and response inhibition. As assessed by six experts, the content validation index (CVI) was 0.95; the Cronbach's alpha for the MARS was 0.78, and the test-retest reliability was 0.81. Four factors were identified (cumulative variance contribution rate 68.79%). The total MARS score correlated positively with THAT (r=0.60, P<0.01) and ACS (r=0.78, P<0.01) and negatively with ANT's reaction time for alerting (r=−0.31, P=0.049). Conclusion: The MARS can reliably and validly assess six-dimension attention levels in real-world settings and is expected to be a new tool for assessing multi-dimensional attention impairments in different mental disorders.
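The Cronbach's alpha reported in the preceding abstract follows the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). A minimal sketch, assuming one row of item scores per respondent; the function name and the toy responses are hypothetical and are not MARS data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two perfectly consistent items.
responses = [[1, 2], [2, 3], [3, 4]]
print(cronbach_alpha(responses))
```

Items that rise and fall together across respondents push alpha toward 1, while unrelated items push it toward 0.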
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved You Only Look Once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a space-to-depth network replaces the strided convolution or pooling layers of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values and better extracts feature information from low-resolution images. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed to efficiently fuse features of different scales. In addition, a loss function based on intersection over union with minimum point distance (MPDIoU) is introduced for bounding box regression, yielding faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with gains of 3% and 2.2% in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and reductions of 38.1%, 37.3%, and 16.9% in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public PASCAL VOC dataset, and the results show that F-YOLOv8 generalizes well.
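The MPDIoU loss mentioned in the preceding abstract is commonly stated as 1 − (IoU − d1²/(w² + h²) − d2²/(w² + h²)), where d1 and d2 are the distances between the two boxes' top-left and bottom-right corners and w, h are the input image dimensions. The sketch below follows that formulation as an assumption; the function name, box format `(x1, y1, x2, y2)`, and sample boxes are illustrative, and the paper's exact implementation is not given in the abstract.

```python
def mpdiou_loss(box_a, box_b, img_w, img_h):
    """MPDIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap area (zero when the boxes do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared corner distances, normalized by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    return 1.0 - (iou - d1_sq / norm - d2_sq / norm)


# Hypothetical predicted and ground-truth boxes in a 100x100 image.
print(mpdiou_loss((0, 0, 10, 10), (5, 5, 15, 15), 100, 100))
```

Unlike plain IoU, the corner-distance terms still give a nonzero gradient signal when the boxes do not overlap at all.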
We propose a model based on a hierarchical multi-scale attention mechanism in response to the low accuracy of existing oceanic biological image classification methods and the inefficiency of manual classification. First, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Second, an improved EfficientNetV2 block is used to better integrate information from different scales and enhance inter-layer message passing. Furthermore, the convolutional block attention module (CBAM) is introduced to enhance the model's perception of critical features and improve its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of hard samples and address the class imbalance in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating strong generalization ability that meets the demands of oceanic biological image classification.
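The Focal Loss mentioned in the preceding abstract is usually written as FL(p_t) = −alpha × (1 − p_t)^gamma × log(p_t), where p_t is the predicted probability of the true class. A minimal per-sample sketch follows; the function name and default values are illustrative, and the abstract does not state which gamma or alpha the authors used.

```python
import math


def focal_loss(prob_true_class, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class."""
    # (1 - p_t)^gamma shrinks the loss of well-classified samples,
    # so hard (low-probability) samples dominate the total.
    return -alpha * (1.0 - prob_true_class) ** gamma * math.log(prob_true_class)


# An easy sample (p_t = 0.9) versus a hard one (p_t = 0.3).
print(focal_loss(0.9), focal_loss(0.3))
```

Setting `gamma=0` and `alpha=1` recovers ordinary cross-entropy, which makes the down-weighting effect easy to compare.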
Precise traffic flow forecasting is essential for mitigating urban traffic congestion. However, existing methods have difficulty capturing the dynamic spatio-temporal characteristics and multiscale temporal dependencies of traffic flow. A traffic flow prediction model with multiscale temporal awareness and graph diffusion attention networks (MT-GDAN) is proposed to address these issues. First, a graph diffusion attention module is constructed, which dynamically adjusts and calculates the weights of neighboring nodes in the graph structure using a random graph attention network (GAT) and captures the spatial characteristics of hidden nodes through an adaptive adjacency matrix, thus better exploiting the dynamic spatio-temporal properties of traffic flow. Second, a multiscale isometric convolutional network and bi-level routing attention are used to construct a multiscale temporal awareness module. The former extracts local information from traffic flow segments by convolution with kernels of different sizes and then introduces isometric convolution to obtain the global temporal relationship between local features of traffic flow segments; the latter filters irrelevant spatio-temporal features at a coarse regional level and focuses locally on key points to capture the multiscale temporal dependencies of traffic flows more accurately. Experimental results show that the MT-GDAN model surpasses mainstream baseline models in forecasting accuracy.
Background: Raising a child with attention deficit hyperactivity disorder (ADHD) is a key challenge for the primary caregiver. This systematic review aims to identify major burdens facing the primary caregiver of a child with ADHD. Methods: The electronic databases CINAHL, PubMed, and Google Scholar were searched for studies published in English from 2017 to 2022 assessing the challenges facing caregivers of a child with ADHD. The Johns Hopkins Nursing Evidence-Based Practice Model was used to assess the quality and risk of bias of studies identified for inclusion. Articles were synthesized by evaluating principal themes of burden to caregivers, stress of caregivers, and effectiveness of intervention programs. Results: Eleven articles, with a total of 2426 participants, were included in this review. Findings revealed that caregivers of children with ADHD have a poor quality of life and high stress levels. Supportive parenting programs can be effective in improving coping and adaptation mechanisms with children with ADHD. However, few interventional studies were identified, increasing the potential for bias. No meta-analysis was conducted. Conclusion: Caregivers of children with ADHD can benefit from strategies to improve their quality of life and reduce their stress levels. Targeted parenting programs can make a positive difference in the well-being of caregivers and children with ADHD. Additional research is needed to address the evidence-based effectiveness of parenting support programs.
In remote sensing imagery, approximately 67% of the data are affected by cloud cover, significantly increasing the difficulty of image classification, recognition, and other downstream interpretation tasks. To address the randomness of cloud distribution and the non-uniformity of cloud thickness, we propose a coarse-to-fine thin cloud removal architecture. In the coarse-level declouding network, we introduce a multi-scale attention mechanism, pyramid nonlocal attention (PNA). By integrating global context with local detail information, it addresses the image quality degradation caused by uncertainty in cloud distribution. In the fine-level declouding stage, we focus on the impact of cloud thickness on declouding results, which mainly manifests as insufficient detail information. A residual dense module enhances the extraction and utilization of feature details. Our approach thus restores lost local texture features on top of the coarse-level results, substantially improving declouding quality. To evaluate the effectiveness of our cloud removal method and attention mechanism, we conducted comprehensive analyses on publicly available datasets. Results demonstrate that our method achieves state-of-the-art performance compared with a wide range of techniques.
The Informer model uses its ProbSparse self-attention mechanism to achieve significant performance advantages in long-sequence time-series forecasting tasks. However, when confronted with time-series data exhibiting multi-scale characteristics and substantial noise, the model's attention mechanism reveals inherent limitations. Specifically, the model is susceptible to interference from local noise or irrelevant patterns, which diminishes its focus on globally critical information and impairs forecasting accuracy. To address this challenge, this study proposes an enhanced architecture that integrates a Gated Attention mechanism into the original Informer framework. This mechanism employs learnable gating functions to dynamically and selectively weight crucial temporal segments and discriminative feature dimensions within the input sequence. This adaptive weighting strategy is designed to suppress noise interference while amplifying the capture of core dynamic patterns. Consequently, it strengthens the model's capability to represent complex temporal dynamics and improves its predictive performance.
To overcome poor feature extraction and the scarcity of prior information on the appearance of infrared dim small targets, we propose a multi-domain attention-guided pyramid network (MAGPNet). Specifically, we design three modules to ensure that salient features of small targets can be acquired and retained in the multi-scale feature maps. To improve the adaptability of the network to targets of different sizes, we design a kernel aggregation attention block with a receptive field attention branch and weight the feature maps under different receptive fields with an attention mechanism. Drawing on research on the human vision system, we further propose an adaptive local contrast measure module to enhance the local features of infrared small targets. With this parameterized component, we can aggregate the information of multi-scale contrast saliency maps. Finally, to fully use the information within the spatial and channel domains in feature maps of different scales, we propose a mixed spatial-channel attention-guided fusion module that achieves high-quality fusion while ensuring that small target features are preserved at deep layers. Experiments on public datasets demonstrate that MAGPNet performs better than other state-of-the-art methods in terms of intersection over union, Precision, Recall, and F-measure. In addition, we conduct detailed ablation studies to verify the effectiveness of each component in our network.
Dynamic sign language recognition is an important task, and deep learning is widely applied to address its complexity. However, existing methods face several challenges. First, recognizing dynamic sign language requires identifying the keyframes that best represent the signs, and missing these keyframes reduces accuracy. Second, some methods do not focus enough on hand regions, which are small within the overall frame, leading to information loss. To address these challenges, we propose a novel Video Transformer Attention-based Network (VTAN) for dynamic sign language recognition. Our approach prioritizes informative frames and hand regions. To tackle the first issue, we designed a keyframe extraction module enhanced by a convolutional autoencoder, which selects information-rich frames and eliminates redundant ones from the video sequences. For the second issue, we developed a soft attention-based transformer module that emphasizes extracting features from hand regions, ensuring that the network pays more attention to hand information within sequences. This dual-focus approach improves dynamic sign language recognition by addressing the key challenges of identifying critical frames and emphasizing hand regions. Experimental results on two public benchmark datasets demonstrate the effectiveness of our network, which outperforms most typical methods in sign language recognition tasks.
Hyperspectral image (HSI) classification is crucial for numerous remote sensing applications. Traditional deep learning methods may miss pixel relationships and context, leading to inefficiencies. This paper introduces the spectral band graph convolutional and attention-enhanced CNN joint network (SGCCN), a novel approach that uses spectral band graph convolutions to capture long-range relationships, uses the local perception of attention-enhanced multi-level convolutions to extract local spatial features, and employs a dynamic attention mechanism to enhance feature extraction. The SGCCN integrates spectral and spatial features through a self-attention fusion network, significantly improving classification accuracy and efficiency. The proposed method outperforms existing techniques, demonstrating its effectiveness in handling the challenges associated with HSI data.
The Fine-grained Image Recognition (FGIR) task is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species and car types. To highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information. They neglect the fact that subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationships among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly with convolutional neural networks and is referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic Multi-Attention Network (MA-Net) as our baseline and propose a boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based Global Attention Network (GA-Net), which combines the local attention maps extracted by MA-Net with non-local techniques to explore the spatial relationships among sub-regions. Finally, we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes, so as to enhance the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars, and FGVC Aircraft datasets illustrate the promising performance of our framework.
Funding (SwinHCAD brain tumor segmentation study): Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Metaverse Support Program to Nurture the Best Talents (IITP-2024-RS-2023-00254529), a grant funded by the Korea government (MSIT).
文摘Brain tumors require precise segmentation for diagnosis and treatment plans due to their complex morphology and heterogeneous characteristics.While MRI-based automatic brain tumor segmentation technology reduces the burden on medical staff and provides quantitative information,existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors.In order to address these challenges and maximize the performance of brain tumor segmentation,this research introduces a novel SwinUNETR-based model by integrating a new decoder block,the Hierarchical Channel-wise Attention Decoder(HCAD),into a powerful SwinUNETR encoder.The HCAD decoder block utilizes hierarchical features and channelspecific attention mechanisms to further fuse information at different scales transmitted from the encoder and preserve spatial details throughout the reconstruction phase.Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved superior and improved segmentation accuracy on both the Dice score and HD95 metrics across all tumor subregions(WT,TC,and ET)compared to baseline models.In particular,the rationale and contribution of the model design were clarified through ablation studies to verify the effectiveness of the proposed HCAD decoder block.The results of this study are expected to greatly contribute to enhancing the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
文摘Reliable traffic flow prediction is crucial for mitigating urban congestion.This paper proposes Attentionbased spatiotemporal Interactive Dynamic Graph Convolutional Network(AIDGCN),a novel architecture integrating Interactive Dynamic Graph Convolution Network(IDGCN)with Temporal Multi-Head Trend-Aware Attention.Its core innovation lies in IDGCN,which uniquely splits sequences into symmetric intervals for interactive feature sharing via dynamic graphs,and a novel attention mechanism incorporating convolutional operations to capture essential local traffic trends—addressing a critical gap in standard attention for continuous data.For 15-and 60-min forecasting on METR-LA,AIDGCN achieves MAEs of 0.75%and 0.39%,and RMSEs of 1.32%and 0.14%,respectively.In the 60-min long-term forecasting of the PEMS-BAY dataset,the AIDGCN out-performs the MRA-BGCN method by 6.28%,4.93%,and 7.17%in terms of MAE,RMSE,and MAPE,respectively.Experimental results demonstrate the superiority of our pro-posed model over state-of-the-art methods.
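Abstract: As the share of photovoltaic (PV) power generation in the global energy system continues to rise, ultra-short-term PV power generation forecasting is critical to power system dispatch and safe operation. However, PV output is affected by many factors and exhibits significant randomness and volatility. To this end, an ultra-short-term PV power generation forecasting method based on a TCN-BiLSTM-Attention model is proposed. First, key features are selected through Pearson correlation analysis, outliers are detected with the isolation forest algorithm, and data preprocessing is completed with linear interpolation and standardization. Next, a Temporal Convolutional Network (TCN) extracts temporal features, a Bidirectional Long Short-Term Memory (BiLSTM) network captures forward and backward temporal dependencies, and an attention mechanism is introduced at the output to focus on the features of key time steps. Finally, comparative experiments on the Desert Knowledge Australia Solar Centre (DKASC) dataset show that, compared with conventional LSTM and BiLSTM models, the proposed TCN-BiLSTM-Attention model shows some advantages in prediction accuracy and stability.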
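Three of the preprocessing steps listed in the abstract above (Pearson correlation screening, linear interpolation, standardization) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the function names and the toy series are hypothetical, and the isolation forest step is not reproduced; values it would flag are simply represented as `None` before interpolation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def interpolate_missing(series):
    """Fill None entries linearly between the nearest valid neighbours."""
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        raise ValueError("series has no valid values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:        # gap at the start: copy the first valid value
            filled[i] = series[right]
        elif right is None:     # gap at the end: copy the last valid value
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


def standardize(series):
    """Z-score standardization (population standard deviation)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series] if std else [0.0] * n


# Hypothetical readings: irradiance tracks PV power closely, humidity does not.
power = [0.0, 1.0, 2.0, 3.0]
irradiance = [0.0, 200.0, 400.0, 600.0]
print(round(pearson(irradiance, power), 3))
print(interpolate_missing([1.0, None, 3.0]))
```

A feature would then be kept when the absolute value of its correlation with PV power exceeds a chosen threshold; the abstract does not state which threshold was used.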
Funding: Supported by the National Natural Science Foundation of China (62466045), the Inner Mongolia Natural Science Foundation Project (2021LHMS06003), and the Inner Mongolia University Basic Research Business Fee Project (114).
Abstract: Graph Federated Learning (GFL) has shown great potential in privacy protection and distributed intelligence through distributed collaborative training on graph-structured data without sharing raw information. However, existing GFL approaches often lack the capability for comprehensive feature extraction and adaptive optimization, particularly in non-independent and identically distributed (non-IID) scenarios, where balancing global structural understanding and local node-level detail remains a challenge. To this end, this paper proposes a novel framework called GFL-SAR (Graph Federated Collaborative Learning Framework Based on Structural Amplification and Attention Refinement), which enhances the representation learning capability of graph data through a dual-branch collaborative design. Specifically, we propose the Structural Insight Amplifier (SIA), which uses an improved Graph Convolutional Network (GCN) to strengthen structural awareness and improve modeling of topological patterns. In parallel, we propose the Attentive Relational Refiner (ARR), which employs an enhanced Graph Attention Network (GAT) to perform fine-grained modeling of node relationships and neighborhood features, thereby improving the expressiveness of local interactions and preserving critical contextual information. GFL-SAR integrates multi-scale features from each branch via feature fusion and federated optimization, thereby addressing existing GFL limitations in structural modeling and feature representation. Experiments on standard benchmark datasets including Cora, Citeseer, Polblogs, and Cora_ML demonstrate that GFL-SAR achieves superior performance in classification accuracy, convergence speed, and robustness compared with existing methods, confirming its effectiveness and generalizability in GFL tasks.
Abstract: Clock synchronization has important applications in multi-agent collaboration (such as drone light shows, intelligent transportation systems, and game AI), group decision-making, and emergency rescue operations. Synchronization methods based on pulse-coupled oscillators (PCOs) provide an effective solution for clock synchronization in wireless networks. However, existing clock synchronization algorithms for multi-agent ad hoc networks struggle to meet the high-precision and high-stability requirements that group cooperation places on the synchronized clock. Hence, this paper constructs a network model, named DAUNet (unsupervised neural network based on dual attention), to enhance clock synchronization accuracy in multi-agent wireless ad hoc networks. Specifically, we design an unsupervised distributed neural network framework as the backbone, building upon classical PCO-based synchronization methods. This framework resolves issues such as prolonged time synchronization message exchange between nodes, difficulties in centralized node coordination, and challenges in distributed training. Furthermore, we introduce a dual-attention mechanism as the core module of DAUNet. By integrating a Multi-Head Attention module and a Gated Attention module, the model significantly improves information extraction capabilities while reducing computational complexity, effectively mitigating synchronization inaccuracies and instability in multi-agent ad hoc networks. To evaluate the effectiveness of the proposed model, comparative experiments and ablation studies were conducted against classical methods and existing deep learning models. The results show that, compared with deep learning networks based on DASA and LSTM, DAUNet can reduce the mean normalized phase difference (NPD) by 1 to 2 orders of magnitude. Compared with attention models based on additive attention and self-attention mechanisms, DAUNet improves performance by more than a factor of ten. This study demonstrates DAUNet’s potential in advancing multi-agent ad hoc networking technologies.
Funding: Supported by the research start-up funds for invited doctors of Lanzhou University of Technology under Grant 14/062402.
Abstract: In modern software development, characterized by increasing complexity and compressed development cycles, traditional static vulnerability detection methods face prominent challenges, including high false positive rates and missed detections of complex logic, because they rely too heavily on rule templates. This paper proposes a Syntax-Aware Hierarchical Attention Network (SAHAN) model, which achieves high-precision vulnerability detection through grammar-rule-driven multi-granularity code slicing and hierarchical semantic fusion mechanisms. The SAHAN model first generates Syntax Independent Units (SIUs) by slicing the code based on the Abstract Syntax Tree (AST) and predefined grammar rules, retaining vulnerability-sensitive contexts. Then, through a hierarchical attention mechanism, the local syntax-aware layer encodes fine-grained patterns within SIUs, while the global semantic correlation layer captures vulnerability chains across SIUs, achieving synergistic modeling of syntax and semantics. Experiments show that on benchmark datasets such as QEMU, SAHAN improves detection performance by 4.8% to 13.1% on average compared with baseline models such as Devign and VulDeePecker.
Funding: Shenzhen Institute of Artificial Intelligence and Robotics for Society, Grant/Award Number: AC01202201003-02; GuangDong Basic and Applied Basic Research Foundation, Grant/Award Number: 2024A1515010252; Longgang District Shenzhen's "Ten Action Plan" for Supporting Innovation Projects, Grant/Award Number: LGKCSDPT2024002.
Abstract: Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors’ approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Funding: Supported in part by the National Natural Science Foundation of China (62372385).
Abstract: Dear Editor, This letter proposes the graph tensor alliance attention network (GT-A²T) to represent a dynamic graph (DG) precisely. Its main ideas are 1) establishing a unified spatio-temporal message propagation framework on a DG via the tensor product to capture the complex cohesive spatio-temporal interdependencies precisely, and 2) acquiring the alliance attention scores from node features and favorable high-order structural correlations.
Abstract: Objective: To report the development, validation, and findings of the Multi-dimensional Attention Rating Scale (MARS), a self-report tool designed to evaluate attention levels along six dimensions. Methods: The MARS was developed based on Classical Test Theory (CTT). A total of 202 highly educated healthy adult participants were recruited for reliability and validity tests. Reliability was measured using Cronbach's alpha and test-retest reliability. Structural validity was explored using principal component analysis. Criterion validity was analyzed by correlating MARS scores with the Toronto Hospital Alertness Test (THAT), the Attentional Control Scale (ACS), and the Attention Network Test (ANT). Results: The MARS comprises 12 items spanning six distinct dimensions of attention: focused attention, sustained attention, shifting attention, selective attention, divided attention, and response inhibition. As assessed by six experts, the content validation index (CVI) was 0.95; the Cronbach's alpha for the MARS was 0.78, and the test-retest reliability was 0.81. Four factors were identified (cumulative variance contribution rate 68.79%). The total MARS score was correlated positively with THAT (r=0.60, P<0.01) and ACS (r=0.78, P<0.01) and negatively with ANT's reaction time for alerting (r=−0.31, P=0.049). Conclusion: The MARS can reliably and validly assess six-dimension attention levels in real-world settings and is expected to be a new tool for assessing multi-dimensional attention impairments in different mental disorders.
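Cronbach's alpha, the internal-consistency statistic reported in the abstract above, is defined as α = k/(k−1) · (1 − Σ s_i² / s_total²), where k is the number of items, s_i² the variance of item i, and s_total² the variance of the summed score. The sketch below implements that textbook formula with sample variances; the function name and the tiny response matrix are illustrative assumptions and have nothing to do with the MARS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 3 respondents x 2 items.
answers = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(answers), 4))
```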
Funding: Supported by the National Natural Science Foundation of China (No. 62103298).
Abstract: To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a space-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values and extracts feature information from low-resolution images more effectively. Then an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% improvements in mean average precision at 50% IoU (mAP50) and mean average precision at 50% to 95% IoU (mAP50-95), respectively, and 38.1%, 37.3%, and 16.9% reductions in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
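The MPDIoU loss mentioned in the abstract above is usually stated as 1 − MPDIoU, where MPDIoU = IoU − d₁²/(w² + h²) − d₂²/(w² + h²); d₁ and d₂ are the distances between the top-left and bottom-right corners of the predicted and reference boxes, and w, h are the input image width and height. The sketch below follows that commonly cited formulation; whether F-YOLOv8 uses exactly this form is an assumption, and the function names and boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou_loss(pred, target, img_w, img_h):
    """1 - MPDIoU: IoU penalised by normalised squared corner distances."""
    norm = img_w ** 2 + img_h ** 2
    # Squared distance between the two top-left corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the two bottom-right corners.
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    return 1.0 - (iou(pred, target) - d1_sq / norm - d2_sq / norm)


# Hypothetical boxes in a 10 x 10 image.
predicted = (0.0, 0.0, 2.0, 2.0)
reference = (1.0, 1.0, 3.0, 3.0)
print(round(mpdiou_loss(predicted, reference, 10.0, 10.0), 6))
```

Unlike plain IoU loss, the corner-distance terms still give a useful gradient signal when the two boxes do not overlap at all.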
Funding: Supported by the National Natural Science Foundation of China (Nos. 61806107 and 61702135).
Abstract: We propose a model based on a hierarchical multi-scale attention mechanism in response to the low accuracy of existing oceanic biological image classification methods and the inefficiency of manual classification. Firstly, the hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to better integrate information from different scales and enhance inter-layer message passing. Furthermore, the convolutional block attention module (CBAM) is introduced to enhance the model's perception of critical features, improving its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of hard-to-classify samples and address the imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating generalization ability strong enough to meet the demands of oceanic biological image classification.
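Focal Loss, as used in the abstract above, is commonly written FL(p_t) = −α (1 − p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The factor (1 − p_t)^γ shrinks the loss of well-classified samples so that hard samples dominate training. The sketch below shows this standard form; the default values γ = 2 and α = 1 are assumptions for illustration, since the abstract does not state which settings were used.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0, eps=1e-12):
    """Focal loss for one sample, given the predicted probability of its true class."""
    p = min(max(p_true, eps), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def cross_entropy(p_true, eps=1e-12):
    """Plain cross-entropy, i.e. focal loss with gamma = 0 and alpha = 1."""
    return focal_loss(p_true, gamma=0.0, alpha=1.0, eps=eps)


# An easy sample (p_t = 0.9) is down-weighted far more than a hard one (p_t = 0.1).
easy_ratio = focal_loss(0.9) / cross_entropy(0.9)
hard_ratio = focal_loss(0.1) / cross_entropy(0.1)
print(round(easy_ratio, 4), round(hard_ratio, 4))
```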
Funding: Supported by the Key R&D Program of Gansu Province (No. 23YFGA0063), the Key Talent Project of Gansu Province (Nos. 2024RCXM57 and 2024RCXM22), and the Major Science and Technology Special Program of Gansu Province (No. 25ZYJA037).
Abstract: Precise traffic flow forecasting is essential for mitigating urban traffic congestion. However, existing methods struggle to capture the dynamic spatio-temporal characteristics and multiscale temporal dependencies of traffic flow adequately. A traffic flow prediction model with multiscale temporal awareness and graph diffusion attention networks (MT-GDAN) is proposed to address these issues. Specifically, a graph diffusion attention module is constructed, which dynamically adjusts and calculates the weights of neighboring nodes in the graph structure using a random graph attention network (GAT) and captures the spatial characteristics of hidden nodes through an adaptive adjacency matrix, thus better exploiting the dynamic spatio-temporal properties of traffic flow. Secondly, a multiscale isometric convolutional network and bi-level routing attention are used to construct a multiscale temporal awareness module. The former extracts local information from traffic flow segments using convolution kernels of different sizes and then introduces isometric convolution to obtain the global temporal relationship between the local features of those segments; the latter filters irrelevant spatio-temporal features at a coarse regional level and focuses locally on key points to capture the multiscale temporal dependencies of traffic flows more accurately. Experimental results reveal that the MT-GDAN model surpasses the mainstream baseline models in forecasting accuracy and exhibits good prediction performance.
Abstract: Background: Raising a child with attention deficit hyperactivity disorder (ADHD) is a key challenge for the primary caregiver. This systematic review aims to identify major burdens facing the primary caregiver of a child with ADHD. Methods: The electronic databases CINAHL, PubMed, and Google Scholar were searched for studies published in English from 2017 to 2022 assessing the challenges facing caregivers of a child with ADHD. The Johns Hopkins Nursing Evidence-Based Practice Model was used to assess the quality and risk of bias of studies identified for inclusion. Articles were synthesized by evaluating principal themes of burden to caregivers, stress of caregivers, and effectiveness of intervention programs. Results: Eleven articles with a total of 2426 participants were included in this review. Findings revealed that caregivers of children with ADHD have a poor quality of life and high stress levels. Supportive parenting programs can be effective for improving coping and adaptation mechanisms with children with ADHD. However, few interventional studies were identified, increasing the potential for bias. No meta-analysis was conducted. Conclusion: Caregivers of children with ADHD can benefit from strategies to improve their quality of life and reduce their stress levels. Targeted parenting programs can make a positive difference in the well-being of caregivers and children with ADHD. Additional research is needed to address the evidence-based effectiveness of parenting support programs.
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2572025BR14) and the China Energy Digital Intelligence Technology Development (Beijing) Co., Ltd. Science and Technology Innovation Project (No. YA2024001500).
Abstract: In remote sensing imagery, approximately 67% of the data are affected by cloud cover, significantly increasing the difficulty of image classification, recognition, and other downstream interpretation tasks. To address the randomness of cloud distribution and the non-uniformity of cloud thickness, we propose a coarse-to-fine thin cloud removal architecture. In the coarse-level declouding network, we introduce a multi-scale attention mechanism, pyramid nonlocal attention (PNA). By integrating global context with local detail information, it specifically addresses the image quality degradation caused by the uncertainty in cloud distribution. During the fine-level declouding stage, we focus on the impact of cloud thickness on declouding results, which manifests primarily as insufficient detail information. A residual dense module enhances the extraction and utilization of feature details, so that the approach restores lost local texture features on top of the coarse-level results and substantially improves declouding quality. To evaluate the effectiveness of our cloud removal technology and attention mechanism, we conducted comprehensive analyses on publicly available datasets. Results demonstrate that our method achieves state-of-the-art performance compared with a wide range of techniques.
Abstract: The Informer model leverages its ProbSparse self-attention mechanism to demonstrate significant performance advantages in long-sequence time-series forecasting tasks. However, when confronted with time-series data exhibiting multi-scale characteristics and substantial noise, the model’s attention mechanism reveals inherent limitations. Specifically, the model is susceptible to interference from local noise or irrelevant patterns, leading to diminished focus on globally critical information and consequently impairing forecasting accuracy. To address this challenge, this study proposes an enhanced architecture that integrates a Gated Attention mechanism into the original Informer framework. This mechanism employs learnable gating functions to dynamically and selectively weight crucial temporal segments and discriminative feature dimensions within the input sequence. This adaptive weighting strategy is designed to suppress noise interference while amplifying the capture of core dynamic patterns. Consequently, it strengthens the model’s capability to represent complex temporal dynamics and improves its predictive performance.
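The abstract above does not spell out the form of its gating functions. One common reading is a sigmoid gate per time step, g_t = σ(w · x_t + b), that rescales each step before attention is applied. The sketch below illustrates only that generic idea; the function names, the fixed weights standing in for learned parameters, and the toy sequence are all assumptions, and Informer's ProbSparse attention is not reproduced.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_weighting(sequence, weights, bias=0.0):
    """Scale every time step by a sigmoid gate computed from its own features.

    sequence: list of time steps, each a list of feature values.
    weights, bias: gate parameters (learned in practice, fixed here).
    Returns (gates, gated_sequence).
    """
    gates = [
        sigmoid(sum(w * x for w, x in zip(weights, step)) + bias)
        for step in sequence
    ]
    gated = [[g * x for x in step] for g, step in zip(gates, sequence)]
    return gates, gated


# Two hypothetical time steps: the gate keeps the first and suppresses the second.
steps = [[4.0, 1.0], [-4.0, 1.0]]
gates, gated_steps = gated_weighting(steps, weights=[1.0, 0.0])
print([round(g, 3) for g in gates])
```

A gate over feature dimensions, which the abstract also mentions, would follow the same pattern with one gate value per feature instead of one per time step.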
Funding: The Industry-University-Research Cooperation Fund Project of the Eighth Research Institute of China Aerospace Science and Technology Corporation (No. USCAST2021-5).
Abstract: To overcome the obstacles of poor feature extraction and scarce prior information on the appearance of infrared dim small targets, we propose a multi-domain attention-guided pyramid network (MAGPNet). Specifically, we design three modules to ensure that salient features of small targets can be acquired and retained in the multi-scale feature maps. To improve the adaptability of the network to targets of different sizes, we design a kernel aggregation attention block with a receptive field attention branch and weight the feature maps under different receptive fields with an attention mechanism. Based on research on the human vision system, we further propose an adaptive local contrast measure module to enhance the local features of infrared small targets. With this parameterized component, we can aggregate the information of multi-scale contrast saliency maps. Finally, to fully utilize the information within the spatial and channel domains in feature maps of different scales, we propose the mixed spatial-channel attention-guided fusion module to achieve high-quality fusion while ensuring that the small target features are preserved at deep layers. Experiments on public datasets demonstrate that MAGPNet achieves better performance than other state-of-the-art methods in terms of intersection over union, Precision, Recall, and F-measure. In addition, we conduct detailed ablation studies to verify the effectiveness of each component in our network.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62076117 and 62166026 and the Jiangxi Provincial Key Laboratory of Virtual Reality under Grant No. 2024SSY03151.
Abstract: Dynamic sign language recognition holds significant importance, particularly with the application of deep learning to address its complexity. However, existing methods face several challenges. Firstly, recognizing dynamic sign language requires identifying keyframes that best represent the signs, and missing these keyframes reduces accuracy. Secondly, some methods do not focus enough on hand regions, which are small within the overall frame, leading to information loss. To address these challenges, we propose a novel Video Transformer Attention-based Network (VTAN) for dynamic sign language recognition. Our approach prioritizes informative frames and hand regions. To tackle the first issue, we designed a keyframe extraction module enhanced by a convolutional autoencoder, which selects information-rich frames and eliminates redundant ones from the video sequences. For the second issue, we developed a soft attention-based transformer module that emphasizes extracting features from hand regions, ensuring that the network pays more attention to hand information within sequences. This dual-focus approach improves dynamic sign language recognition by addressing the key challenges of identifying critical frames and emphasizing hand regions. Experimental results on two public benchmark datasets demonstrate the effectiveness of our network, which outperforms most typical methods in sign language recognition tasks.
Funding: Supported in part by the National Natural Science Foundation of China (No. 61801214) and the Postgraduate Research Practice Innovation Program of NUAA (No. xcxjh20231504).
Abstract: Hyperspectral image (HSI) classification is crucial for numerous remote sensing applications. Traditional deep learning methods may miss pixel relationships and context, leading to inefficiencies. This paper introduces the spectral band graph convolutional and attention-enhanced CNN joint network (SGCCN), a novel approach that uses spectral band graph convolutions to capture long-range relationships, uses the local perception of attention-enhanced multi-level convolutions to extract local spatial features, and employs a dynamic attention mechanism to enhance feature extraction. The SGCCN integrates spectral and spatial features through a self-attention fusion network, significantly improving classification accuracy and efficiency. The proposed method outperforms existing techniques, demonstrating its effectiveness in handling the challenges associated with HSI data.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62171232) and the Priority Academic Program Development of Jiangsu Higher Education Institutions, China.
Abstract: The Fine-grained Image Recognition (FGIR) task is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species and car types. To highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information. They neglect the fact that the subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationships among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly with convolutional neural networks and is referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic Multi-Attention Network (MA-Net) as our baseline and propose a boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based Global Attention Network (GA-Net), which combines the local attention maps extracted by MA-Net with non-local techniques to explore the spatial relationships among sub-regions. Finally, we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes, so as to enhance the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars, and FGVC Aircraft datasets illustrate the promising performance of our framework.