Avian incubation is characterised by contact between the eggs and the bird's skin, which transfers heat and raises egg temperature above ambient conditions. Birds can be attentive to the clutch all of the time or, particularly if only one parent incubates, attentiveness may be quite low. Attentiveness is related to egg size: large eggs have high attentiveness, whereas small eggs (<10 g) can have attentiveness ranging from 50% to 100%. Previous studies have suggested that incubation duration is a function of attentiveness, albeit for small birds. This study tested the hypothesis that, after controlling for egg size and phylogeny, incubation duration would be a function of attentiveness. Data for 444 bird species representing 24 orders were analysed. Whilst egg mass had a significant relationship with incubation duration, there was no relationship with attentiveness for all of the species or for a subset of the passerines. Although egg temperature drops during an incubation recess, average day-time and night-time temperatures are similar in a range of species. Re-examination of previously reported temperature profiles recorded by dummy eggs over a 24-h period shows that after an incubation recess there seems to be an additional heat flux that raises egg temperature above that seen during night-time periods of constant incubation. The reasons why eggs under intermittent incubation are not considerably cooler than eggs during constant incubation are discussed.
Background: While nest attentiveness plays a critical role in the reproductive success of avian species, nest attentiveness data with high temporal resolution are not available for many species. However, improvements in both video monitoring and temperature logging devices present an opportunity to increase our understanding of this aspect of avian behavior. Methods: To investigate nest attentiveness behaviors and evaluate these technologies, we monitored 13 nests across two Common Tern (Sterna hirundo) breeding colonies with a paired video camera-temperature logger approach, while monitoring 63 additional nests with temperature loggers alone. Observations occurred from May to August of 2017 on Poplar Island (Chesapeake Bay, Maryland, USA) and Skimmer Island (Isle of Wight Bay, Maryland, USA). We examined data with respect to four times of day: Morning (civil dawn–11:59), Peak (12:00–16:00), Cooling (16:01–civil dusk), and Night (civil dusk–civil dawn). Results: While successful nests had mostly short-duration off-bouts and maintained consistent nest attentiveness throughout the day, failed nests had dramatic reductions in nest attentiveness during the Cooling and Night periods (p < 0.05), with one colony experiencing repeated nocturnal abandonment due to predation pressure from a Great Horned Owl (Bubo virginianus). Incubation appeared to ameliorate ambient temperatures during Night, as nests were significantly warmer during Night when birds were on versus off the nest (p < 0.05). Meanwhile, off-bouts during the Peak period occurred during higher ambient temperatures, perhaps because adults left the nest during the hottest periods to perform belly soaking. Unfortunately, temperature logger data alone had limited ability to predict nest attentiveness status during shorter bouts, with results highly dependent on time of day and bout duration. While our methods did not affect hatching success (p > 0.05), video-monitored nests did have significantly lower clutch sizes (p < 0.05). Conclusions: The paired use of iButton temperature loggers and video cameras enabled a detailed description of the incubation behavior of Common Terns. However, while these methods are promising for future research, the logistical and potential biological complications involved in their use suggest that careful planning is needed before the devices are deployed, so that data are collected in a safe and successful manner.
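The four time-of-day periods and the notion of nest attentiveness described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the treatment of civil dusk as the start of Night, and the dawn and dusk values in the usage note are all assumptions made for illustration.

```python
from datetime import time

def classify_period(t, civil_dawn, civil_dusk):
    """Assign a clock time to one of the four periods named in the abstract.

    Assumption: the instant of civil dusk is counted as Night, and any time
    after 16:00:00 is counted as Cooling.
    """
    if t < civil_dawn or t >= civil_dusk:
        return "Night"
    if t < time(12, 0):
        return "Morning"
    if t <= time(16, 0):
        return "Peak"
    return "Cooling"

def attentiveness(on_nest_flags):
    """Percentage of observations during which a bird was on the nest."""
    if not on_nest_flags:
        raise ValueError("no observations")
    return 100.0 * sum(on_nest_flags) / len(on_nest_flags)
```

With hypothetical civil dawn and dusk times of 05:30 and 20:45, `classify_period(time(9, 0), time(5, 30), time(20, 45))` falls in Morning, and `attentiveness([True, True, False, True])` gives the share of on-nest observations as a percentage.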
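As the share of photovoltaic (PV) generation in the global energy system continues to rise, ultra-short-term forecasting of PV power generation is critical to power system dispatch and secure operation. However, PV generation is affected by many factors and is markedly random and volatile. To address this, an ultra-short-term PV power generation forecasting method based on a TCN-BiLSTM-Attention model is proposed. First, key features are selected by Pearson correlation analysis, outliers are detected with the isolation forest algorithm, and data preprocessing is completed with linear interpolation and standardization. Next, a temporal convolutional network (TCN) extracts temporal features, a bidirectional long short-term memory (BiLSTM) network captures forward and backward temporal dependencies, and an attention mechanism at the output focuses on the features of key time steps. Finally, comparative experiments on the Desert Knowledge Australia Solar Centre (DKASC) dataset show that, compared with conventional LSTM and BiLSTM models, the proposed TCN-BiLSTM-Attention model has some advantage in prediction accuracy and stability.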
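The preprocessing steps named in the abstract above (Pearson correlation screening, linear interpolation, standardization) can be sketched in plain Python. This is an illustrative sketch, not the paper's pipeline: the function names are hypothetical, outlier detection with an isolation forest is not shown, and values flagged as outliers are assumed to have already been replaced by `None`.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def interpolate_gaps(values):
    """Fill interior None entries linearly between the nearest known values.

    Leading and trailing None entries are left untouched.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

def standardize(values):
    """Z-score standardization using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]
```

A feature would typically be kept when the absolute value of `pearson(feature, power)` exceeds a chosen threshold; the threshold itself is not given in the abstract.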
Successful drought planning depends on the generation of timely and accurate early warning information. Yet there is little evidence to explain the extent to which crop farmers pay attention to and assimilate early warning drought information, which would aid policy formulation in support of drought risk reduction. A socioecological survey, using a structured questionnaire administered to 426 crop farming households, was carried out in the Talensi District of the Upper East Region, Ghana. The data were analysed with frequency tables, the relative importance index, and multinomial logistic regression in SPSS v.20. The results show that crop farmers predominantly rely on agricultural extension officers for early warning drought information, with an estimated 78% of them paying little to very much attention to the information. The likelihood ratio chi-square test showed a significant improvement in fit, χ²(20) = 96.792, p < 0.001. Household status, average monthly income, and age were the significant predictors for crop farmers paying no attention at all to early warning drought information, while household status was the only significant factor among those paying a little attention. The drive to build a climate-resilient society with effective early warning centers across Ghana will receive 60% lower support from crop farmers paying no to a little attention than from farmers paying very much attention to early warning drought information. Broader stakeholder engagements should be carried out to harness inclusive support from crop farmers to build a climate-resilient society in Ghana.
Behavior recognition of Hu sheep contributes to their intensive and intelligent farming. Because Hu sheep are generally farmed at high density, severe occlusion occurs among different behaviors and even among sheep performing the same behavior, leading to missed and false detections in existing behavior recognition methods. A high-low frequency aggregated attention and negative sample comprehensive score loss and comprehensive score soft non-maximum suppression-YOLO (HLNC-YOLO) was proposed for identifying the behavior of Hu sheep, addressing the missed and erroneous detections caused by occlusion between Hu sheep in intensive farming. Firstly, images of four typical behaviors (standing, lying, eating, and drinking) were collected from the sheep farm to construct the Hu sheep behavior dataset (HSBD). Next, to address occlusion, the C2F-HLAtt module, which incorporates high-low frequency aggregation attention, was integrated into the YOLO v8 backbone during the training phase to perceive occluded objects, and an auxiliary reversible branch was introduced to retain more effective features. Comprehensive score regression loss (CSLoss) was used to reduce the scores of suboptimal boxes and enhance the comprehensive scores of occluded object boxes. Finally, the soft comprehensive score non-maximum suppression (Soft-CS-NMS) algorithm filtered prediction boxes during inference. Tested on the HSBD, HLNC-YOLO achieved a mean average precision (mAP@50) of 87.8%, with a memory footprint of 17.4 MB. This represented an improvement of 7.1, 2.2, 4.6, and 11 percentage points over YOLO v8, YOLO v9, YOLO v10, and Faster R-CNN, respectively. The results indicated that HLNC-YOLO accurately identified the behavior of Hu sheep in intensive farming and possessed generalization capabilities, providing technical support for smart farming.
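To show why soft suppression helps with occluded animals, the sketch below implements the standard Gaussian Soft-NMS, in which overlapping boxes have their scores decayed rather than being discarded. It is not the paper's Soft-CS-NMS, whose comprehensive score is not specified in the abstract; the function names, `sigma`, and `score_threshold` are illustrative assumptions.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes.

    Returns (box, score) pairs in the order in which they were selected.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        # Decay every remaining score according to its overlap with that box.
        rescored = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:
                rescored.append((box, score))
        candidates = rescored
    return kept
```

For two heavily overlapping boxes, hard NMS with a typical IoU threshold would drop the lower-scoring one outright, whereas this variant keeps it with a reduced score, so a partially hidden sheep can still be reported.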
Lip synchronization serves as a core technology for enabling natural interactions in digital virtual humans. However, it faces challenges such as insufficient dynamic correspondence between speech and lip movements and inadequate modeling of image details. To address these limitations, a comprehensively optimized lip synchronization framework extending the Wav2Lip architecture was proposed in this study. Firstly, based on the Wav2Lip model, a facial region extraction strategy using facial keypoints was designed, which effectively enhances the robustness of facial alignment during lip synchronization for digital virtual humans. Then, a cross-modal attention fusion module between visual and speech features was introduced to improve cross-modal information fusion, and a dynamic receptive field convolution module was developed in the generation branch to enhance the modeling performance of the lip region. Finally, experiments were conducted on the VFHQ dataset. The proposed method was compared with the Wav2Lip, VideoRetalking, and DI-Net models, and its performance was evaluated using three metrics: LSE-C, CSIM, and FID. Experimental results showed that the proposed method achieves significant improvements in synchronization accuracy and image fidelity, providing an efficient and feasible solution for lip-synthesis tasks of digital virtual humans.
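As a rough illustration of cross-modal attention fusion, the sketch below applies generic scaled dot-product attention, with queries taken from one modality and keys and values from the other. The abstract does not describe the module's internals, so the choice of scaled dot-product attention, the function names, and the assignment of visual features to queries and speech features to keys and values are assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of vectors from one modality (assumed: visual features)
    keys, values: lists of vectors from the other (assumed: speech features)
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

Each output vector is a weighted mix of the value vectors, with the weights reflecting how well the query matches each key.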
To enhance speech emotion recognition capability, this study constructs a speech emotion recognition model integrating the adaptive acoustic mixup (AAM) and improved coordinate and shuffle attention (ICASA) methods. The AAM method optimizes data augmentation by combining a sample selection strategy and dynamic interpolation coefficients, thus enabling information fusion of speech data with different emotions at the acoustic level. The ICASA method enhances feature extraction capability through dynamic fusion of the improved coordinate attention (ICA) and shuffle attention (SA) techniques. The ICA technique reduces computational overhead by employing depthwise separable convolution and an h-swish activation function, and it captures long-range dependencies of multi-scale time-frequency features using the attention weights. The SA technique promotes feature interaction through channel shuffling, which helps the model learn richer and more discriminative emotional features. Experimental results demonstrate that, compared to the baseline model, the proposed model improves the weighted accuracy by 5.42% and 4.54%, and the unweighted accuracy by 3.37% and 3.85%, on the IEMOCAP and RAVDESS datasets, respectively. These improvements were confirmed to be statistically significant by independent samples t-tests, further supporting the practical reliability and applicability of the proposed model in real-world emotion-aware speech systems.
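The two reported metrics can be made concrete with a short Python sketch. It assumes the definitions conventional in speech emotion recognition, where weighted accuracy is the overall fraction of correct predictions and unweighted accuracy is the mean of per-class recalls; the abstract itself does not define them, and the function names are illustrative.

```python
from collections import defaultdict

def weighted_accuracy(y_true, y_pred):
    """Overall fraction of correctly classified utterances."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    totals, hits = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)
```

On an imbalanced set the two diverge: a model that labels every utterance with the majority emotion can score a high weighted accuracy while its unweighted accuracy stays low.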
Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and maximize the performance of brain tumor segmentation, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), with a powerful SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved improved segmentation accuracy on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET) compared to baseline models. Ablation studies clarified the rationale for the model design and verified the effectiveness of the proposed HCAD decoder block. The results of this study are expected to contribute to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
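The Dice score used in the evaluation measures the overlap between a predicted mask and a reference mask. A minimal Python sketch follows, with masks assumed to be flattened to sequences of 0 and 1; returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    size = sum(pred) + sum(target)
    # Convention (assumption): two empty masks count as a perfect match.
    return 1.0 if size == 0 else 2.0 * intersection / size
```

A score of 1.0 indicates identical masks and 0.0 indicates no overlap. HD95, the other reported metric, is a boundary-distance measure and is not sketched here.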
Modern business information systems face significant challenges in managing heterogeneous data sources, integrating disparate systems, and providing real-time decision support in complex enterprise environments. Contemporary enterprises typically operate 200+ interconnected systems, with research indicating that 52% of organizations manage three or more enterprise content management systems, creating information silos that reduce operational efficiency by up to 35%. While attention mechanisms have demonstrated remarkable success in natural language processing and computer vision, their systematic application to business information systems remains largely unexplored. This paper presents the theoretical foundation for a Hierarchical Attention-Based Business Information System (HABIS) framework that applies multi-level attention mechanisms to enterprise environments. We provide a comprehensive mathematical formulation of the framework, analyze its computational complexity, and present a proof-of-concept implementation with simulation-based validation that demonstrates a 42% reduction in cross-system query latency compared to legacy ERP modules and a 70% improvement in prediction accuracy over baseline methods. The theoretical framework introduces four hierarchical attention levels: system-level attention for dynamic weighting of business systems, process-level attention for business process prioritization, data-level attention for critical information selection, and temporal attention for time-sensitive pattern recognition. Our complexity analysis demonstrates that the framework achieves O(n log n) computational complexity for attention computation, making it scalable to large enterprise environments, including retail supply chains with deployments on the scale of 200+ systems. The proof-of-concept implementation validates the theoretical framework's feasibility with an MSE loss of 0.439 and response times of 0.000120 s per query, demonstrating its potential for addressing key challenges in business information systems. This work establishes a foundation for future empirical research and practical implementation of attention-driven enterprise systems.
Neuromodulation techniques effectively intervene in cognitive function, holding considerable scientific and practical value in fields such as aerospace, medicine, life sciences, and brain research. These techniques use electrical stimulation to directly or indirectly target specific brain regions, modulating neural activity and influencing broader brain networks, thereby regulating cognitive function. Regulating cognitive function involves an understanding of aspects such as perception, learning and memory, attention, spatial cognition, and physical function. To enhance the application of cognitive regulation in the general population, this paper reviews recent publications from the Web of Science to assess the advancements and challenges of invasive and non-invasive stimulation methods in modulating cognitive functions. This review covers various neuromodulation techniques for cognitive intervention, including deep brain stimulation, vagus nerve stimulation, and invasive methods using microelectrode arrays. The non-invasive techniques discussed include transcranial magnetic stimulation, transcranial direct current stimulation, transcranial alternating current stimulation, transcutaneous electrical acupoint stimulation, and temporal interference stimulation for activating deep targets. Invasive stimulation methods, which are ideal for studying the pathogenesis of neurological diseases, tend to cause greater trauma and have been less researched in the context of cognitive function regulation. Non-invasive methods, particularly newer transcranial stimulation techniques, are gentler and more appropriate for regulating cognitive functions in the general population. These include transcutaneous electrical acupoint stimulation and temporal interference methods for activating deep targets. This paper also discusses current technical challenges and potential future breakthroughs in neuromodulation technology. It is recommended that neuromodulation techniques be combined with neural detection methods to better assess their effects and improve the accuracy of non-invasive neuromodulation. Additionally, research on closed-loop feedback neuromodulation methods is identified as a promising direction for future development.
Microseismic (MS) monitoring is an effective technique to detect mining-induced rock fractures. However, recognizing grouting-induced signals is challenging due to complex geological conditions in deep rock plates. Therefore, a hybrid model (WM-ResNet50) integrating data enhancement, a deep convolutional neural network (CNN), and convolutional block attention modules (CBAM) was proposed. Firstly, an MS system was established at the Xieqiao coal mine in Anhui Province, China. MS waveforms and injection parameters were acquired during grouting. Secondly, signals were categorized based on time-frequency characteristics to build a dataset, which was divided into training, validation, and test sets at a ratio of 4:1:1. Subsequently, the performance of WM-ResNet50 was evaluated based on indices such as individual precision, total accuracy, recall, and loss function. The results indicated that WM-ResNet50 achieved an average recognition accuracy of 94.38%, surpassing that of a simple CNN (90.04%), ResNet18 (91.72%), and ResNet50 (92.48%). Finally, WM-ResNet50 was applied to monitor the whole process in laboratory tests and field cases. Both results affirmed the feasibility and effectiveness of MS inversion in predicting actual slurry diffusion ranges within deep rock layers. Comparison showed that the MS sources classified by WM-ResNet50 matched grouting records well. A solution to address insufficient diffusion under long-borehole grouting has been proposed. WM-ResNet50's accuracy was validated through in-situ coring and XRD analysis for cement-based hydration products. This study provides a beneficial reference for similar rock signal processing and in-field grouting practices.
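The 4:1:1 division into training, validation, and test sets can be sketched as follows. The abstract gives only the ratio, so the random shuffling, the fixed seed, the absence of per-class stratification, and the function name are assumptions of this sketch.

```python
import random

def split_dataset(samples, ratios=(4, 1, 1), seed=0):
    """Shuffle samples and split them into train/validation/test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

For 600 labelled waveforms this yields 400, 100, and 100 samples; when the total is not divisible by six, the few leftover samples land in the test set.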
Audio-visual speech recognition (AVSR), which integrates audio and visual modalities to improve recognition performance and robustness in noisy or adverse acoustic conditions, has attracted significant research interest. However, Conformer-based architectures remain computationally expensive because the space and time complexity of their softmax-based attention mechanisms grows quadratically with sequence length. In addition, Conformer-based architectures may not provide sufficient flexibility for modeling local dependencies at different granularities. To mitigate these limitations, this study introduces a novel AVSR framework based on a ReLU-based Sparse and Grouped Conformer (RSG-Conformer) architecture. Specifically, we propose a Global-enhanced Sparse Attention (GSA) module incorporating an efficient context restoration block to recover lost contextual cues. Concurrently, a Grouped-scale Convolution (GSC) module replaces the standard Conformer convolution module, providing adaptive local modeling across varying temporal resolutions. Furthermore, we integrate a Refined Intermediate Contextual CTC (RIC-CTC) supervision strategy. This approach applies progressively increasing loss weights combined with convolution-based context aggregation, thereby further relaxing the conditional independence constraint inherent in standard CTC frameworks. Evaluations on the LRS2 and LRS3 benchmarks validate the efficacy of our approach, with word error rates (WERs) reduced to 1.8% and 1.5%, respectively. These results demonstrate its state-of-the-art performance in AVSR tasks.
Detecting fake news in multimodal and multilingual social media environments is challenging due to inherent noise, inter-modal imbalance, computational bottlenecks, and semantic ambiguity. To address these issues, we propose SparseMoE-MFN, a novel unified framework that integrates sparse attention with a sparsely activated Mixture-of-Experts (MoE) architecture. This framework aims to enhance the efficiency, inferential depth, and interpretability of multimodal fake news detection. SparseMoE-MFN leverages LLaVA-v1.6-Mistral-7B-HF for efficient visual encoding and Qwen/Qwen2-7B for text processing. The sparse attention module adaptively filters irrelevant tokens and focuses on key regions, reducing computational costs and noise. The sparse MoE module dynamically routes inputs to specialized experts (visual, language, cross-modal alignment) based on content heterogeneity. This expert specialization design boosts computational efficiency and semantic adaptability, enabling precise processing of complex content and improving performance on ambiguous categories. Evaluated on the large-scale, multilingual MR2 dataset, SparseMoE-MFN achieves state-of-the-art performance. It obtains an accuracy of 86.7% and a macro-averaged F1 score of 0.859, outperforming strong baselines like MiniGPT-4 by 3.4% and 3.2%, respectively. Notably, it shows significant advantages in the "unverified" category. Furthermore, SparseMoE-MFN demonstrates superior computational efficiency, with an average inference latency of 89.1 ms and 95.4 GFLOPs, substantially lower than existing models. Ablation studies and visualization analyses confirm the effectiveness of both sparse attention and sparse MoE components in improving accuracy, generalization, and efficiency.
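The idea of routing each input to only a few specialized experts can be sketched with generic top-k gating. The abstract does not state how SparseMoE-MFN's router works, so the top-k selection, the softmax renormalization, the value of `k`, and the function names below are assumptions.

```python
import math

def top_k_gate(logits, k=2):
    """Keep the k largest gate logits and renormalise them with a softmax.

    Returns a dict mapping expert index to routing weight.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    weights = top_k_gate(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())
```

Because experts outside the top k are never evaluated, compute scales with `k` rather than with the total number of experts, which is the usual source of the efficiency gain in sparse MoE designs.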
Graph Federated Learning (GFL) has shown great potential in privacy protection and distributed intelligence through distributed collaborative training on graph-structured data without sharing raw information. However, existing GFL approaches often lack the capability for comprehensive feature extraction and adaptive optimization, particularly in non-independent and identically distributed (non-IID) scenarios, where balancing global structural understanding and local node-level detail remains a challenge. To this end, this paper proposes a novel framework called GFL-SAR (Graph Federated Collaborative Learning Framework Based on Structural Amplification and Attention Refinement), which enhances the representation learning capability of graph data through a dual-branch collaborative design. Specifically, we propose the Structural Insight Amplifier (SIA), which uses an improved Graph Convolutional Network (GCN) to strengthen structural awareness and improve modeling of topological patterns. In parallel, we propose the Attentive Relational Refiner (ARR), which employs an enhanced Graph Attention Network (GAT) to perform fine-grained modeling of node relationships and neighborhood features, thereby improving the expressiveness of local interactions and preserving critical contextual information. GFL-SAR integrates multi-scale features from both branches via feature fusion and federated optimization, thereby addressing existing GFL limitations in structural modeling and feature representation. Experiments on standard benchmark datasets including Cora, Citeseer, Polblogs, and Cora_ML demonstrate that GFL-SAR achieves superior performance in classification accuracy, convergence speed, and robustness compared to existing methods, confirming its effectiveness and generalizability in GFL tasks.
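In federated training, clients share model parameters rather than raw graph data, and a server combines them. The sketch below shows a FedAvg-style weighted average as one common aggregation rule; the abstract does not say which rule GFL-SAR uses, so the rule, the function name, and the flat-list parameter representation are assumptions.

```python
def federated_average(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample count.

    client_params: list of equal-length lists of floats, one per client
    client_sizes: number of local training samples held by each client
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[j] * size for params, size in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]
```

Clients with more local data pull the global model further toward their parameters. Under non-IID data this simple weighting can bias the global model, which is one reason frameworks in this area add further mechanisms.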
Funding (Common Tern nest attentiveness study): This work was supported by the U.S. Army Corps of Engineers (Baltimore District), the U.S. Geological Survey (Patuxent Wildlife Research Center), the University of Maryland, the Maryland Department of Natural Resources (Wildlife and Heritage Program), the Maryland Environmental Service, and the Maryland Coastal Bays Program.
文摘Background:While nest attentiveness plays a critical role in the reproductive success of avian species,nest attentiveness data with high temporal resolution is not available for many species.However,improvements in both video monitoring and temperature logging devices present an opportunity to increase our understanding of this aspect of avian behavior.Methods:To investigate nest attentiveness behaviors and evaluate these technologies,we monitored 13 nests across two Common Tern(Sterna hirundo)breeding colonies with a paired video camera-temperature logger approach,while monitoring 63 additional nests with temperature loggers alone.Observations occurred from May to August of 2017 on Poplar(Chesapeake Bay,Maryland,USA)and Skimmer Islands(Isle of Wight Bay,Maryland,USA).We examined data respective to four times of day:Morning(civil dawn‒11:59),Peak(12:00‒16:00),Cooling(16:01‒civil dusk),and Night(civil dusk‒civil dawn).Results:While successful nests had mostly short duration off-bouts and maintained consistent nest attentiveness throughout the day,failed nests had dramatic reductions in nest attentiveness during the Cooling and Night periods(p<0.05)with one colony experiencing repeated nocturnal abandonment due to predation pressure from a Great Horned Owl(Bubo virginianus).Incubation appeared to ameliorate ambient temperatures during Night,as nests were significantly warmer during Night when birds were on versus off the nest(p<0.05).Meanwhile,off-bouts during the Peak period occurred during higher ambient temperatures,perhaps due to adults leaving the nest during the hottest periods to perform belly soaking.Unfortunately,temperature logger data alone had limited ability to predict nest attentiveness status during shorter bouts,with results highly dependent on time of day and bout duration.While our methods did not affect hatching success(p>0.05),video-monitored nests did have significantly lower clutch sizes(p<0.05).Conclusions:The paired use of iButtons and video 
cameras enabled a detailed description of the incubation behavior of COTE.However,while promising for future research,the logistical and potential biological complications involved in the use of these methods suggest that careful planning is needed before these devices are utilized to ensure data is collected in a safe and successful manner.
文摘随着光伏发电在全球能源体系中占比不断提升,超短期光伏发电量预测对电力系统调度与安全运行至关重要。然而,光伏发电量受多因素影响,具有显著随机性与波动性。为此,提出了一种基于TCN-BiLSTM-Attention模型的超短期光伏发电量预测方法。首先通过皮尔逊相关分析筛选关键特征,并利用孤立森林算法检测异常值,结合线性插值法和标准化完成数据预处理。随后,通过时间卷积网络(Temporal Convolutional Network,TCN)提取时序特征,再利用双向长短期记忆网络(Bidirectional Long Short-Term Memory,BiLSTM)网络捕获前后向时间依赖关系,并在输出端引入注意力机制聚焦关键时间步特征。最后,在Desert Knowledge Australia Solar Centre(DKASC)数据集上的对比实验表明,与传统LSTM、BiLSTM模型相比,提出的TCN-BiLSTM-Attention模型在预测精度、稳定性等方面均表现出一定优势。
文摘Successful drought planning is dependent on the generation of timely and accurate early warning information.Yet there is little evidence to explain the extent to which crop farmers pay attention to and assimilate early warning drought information that aids in the policy formulation in support of drought risk reduction.A socioecological survey,using a structured questionnaire administered to 426 crop farming households,was carried out in the Talensi District of the Upper East Region,Ghana.The data analytic techniques used were frequency tables,relative importance index,and multinomial logistics embedded in SPSS v.20 software.The results show that crop farmers predominantly rely on agricultural extension officers for early warning drought information,with an estimated 78% of them paying little to very much attention to the information.The likelihood ratio Chi-square test showed that there is a significant improvement in fit as X^(2)(20)=96.792,p<0.000.Household status,average monthly income,and age were the significant predictors for crop farmers paying no attention at all to early warning drought information,while household status was the only significant factor among those paying a little attention.The drive to build a climate-resilient society with effective early warning centers across Ghana will receive 60% lower support from crop farmers paying no to a little attention as compared to farmers paying very much attention to early warning drought information.Broader stakeholder engagements should be carried out to harness inclusive support from crop farmers to build a climate-resilient society in Ghana.
Abstract: Behavior recognition of Hu sheep contributes to their intensive and intelligent farming. Because Hu sheep are generally farmed at high density, severe occlusion occurs among different behaviors and even among sheep performing the same behavior, leading to missed and false detections in existing behavior recognition methods. To address these occlusion-induced errors in intensive farming, a high-low frequency aggregated attention and negative sample comprehensive score loss and comprehensive score soft non-maximum suppression-YOLO (HLNC-YOLO) was proposed for identifying the behavior of Hu sheep. Firstly, images of four typical behaviors (standing, lying, eating, and drinking) were collected from the sheep farm to construct the Hu sheep behavior dataset (HSBD). Next, to handle occlusion during the training phase, the C2F-HLAtt module, which incorporates high-low frequency aggregation attention, was integrated into the YOLO v8 backbone to perceive occluded objects, and an auxiliary reversible branch was introduced to retain more effective features. Comprehensive score regression loss (CSLoss) was used to reduce the scores of suboptimal boxes and enhance the comprehensive scores of occluded object boxes. Finally, the soft comprehensive score non-maximal suppression (Soft-CS-NMS) algorithm filtered prediction boxes during inference. In testing on the HSBD, HLNC-YOLO achieved a mean average precision (mAP@50) of 87.8%, with a memory footprint of 17.4 MB. This represented an improvement of 7.1, 2.2, 4.6, and 11 percentage points over YOLO v8, YOLO v9, YOLO v10, and Faster R-CNN, respectively. The results indicated that HLNC-YOLO accurately identified the behavior of Hu sheep in intensive farming and possessed generalization capabilities, providing technical support for smart farming.
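For readers unfamiliar with soft non-maximum suppression, the sketch below shows the standard Gaussian Soft-NMS that score-decay methods of this kind build on. It is not the Soft-CS-NMS of the abstract: the comprehensive score is not defined in the text, so plain detection confidence is used instead, and `sigma` and `score_threshold` are illustrative assumptions.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of deleting boxes.

    Returns a list of (box_index, final_score) in selection order.
    """
    remaining = [(i, s) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Pick the highest-scoring remaining box.
        best = max(remaining, key=lambda item: item[1])
        remaining.remove(best)
        kept.append(best)
        # Decay the others according to their overlap with the picked box.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best[0]], boxes[idx])
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((idx, new_score))
        remaining = decayed
    return kept

# Two heavily overlapping boxes and one isolated box (hypothetical values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

Because an occluded animal's box overlaps its neighbour's, hard NMS would discard it outright, whereas the decay above keeps it with a reduced score. That is the property that makes soft variants attractive in crowded scenes.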
Abstract: Lip synchronization serves as a core technology for enabling natural interactions in digital virtual humans. However, it faces challenges such as insufficient dynamic correspondence between speech and lip movements and inadequate modeling of image details. To address these limitations, a comprehensively optimized lip synchronization framework extending the Wav2Lip architecture was proposed in this study. Firstly, based on the Wav2Lip model, a facial region extraction strategy using facial keypoints was designed, which effectively enhances the robustness of facial alignment during lip synchronization for digital virtual humans. Then, a cross-modal attention fusion module between visual and speech features was introduced to improve cross-modal information fusion, and a dynamic receptive field convolution module was developed in the generation branch to enhance the modeling performance of the lip region. Finally, experiments were conducted on the VFHQ dataset. The proposed method was compared with the Wav2Lip, VideoRetalking, and DI-Net models, and its performance was evaluated using three metrics: LSE-C, CSIM, and FID. Experimental results showed that the proposed method achieves significant improvements in synchronization accuracy and image fidelity, providing an efficient and feasible solution for lip-synthesis tasks of digital virtual humans.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12204062 and the Natural Science Foundation of Shandong Province under Grant No. ZR2022MF330.
Abstract: To enhance speech emotion recognition capability, this study constructs a speech emotion recognition model integrating the adaptive acoustic mixup (AAM) and improved coordinate and shuffle attention (ICASA) methods. The AAM method optimizes data augmentation by combining a sample selection strategy and dynamic interpolation coefficients, thus enabling information fusion of speech data with different emotions at the acoustic level. The ICASA method enhances feature extraction capability through dynamic fusion of the improved coordinate attention (ICA) and shuffle attention (SA) techniques. The ICA technique reduces computational overhead by employing depth-separable convolution and an h-swish activation function and captures long-range dependencies of multi-scale time-frequency features using the attention weights. The SA technique promotes feature interaction through channel shuffling, which helps the model learn richer and more discriminative emotional features. Experimental results demonstrate that, compared to the baseline model, the proposed model improves the weighted accuracy by 5.42% and 4.54%, and the unweighted accuracy by 3.37% and 3.85%, on the IEMOCAP and RAVDESS datasets, respectively. These improvements were confirmed to be statistically significant by independent samples t-tests, further supporting the practical reliability and applicability of the proposed model in real-world emotion-aware speech systems.
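The interpolation at the core of mixup-style augmentation can be sketched as below. This is standard mixup, shown only as background: the AAM sample selection strategy and its dynamic interpolation coefficients are not specified in the abstract and are not reproduced here. The function names and the `alpha` default are assumptions.

```python
import random

def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two feature vectors and their one-hot labels with weight lam."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y

def sample_lambda(alpha=0.4, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)

# Hypothetical two-dimensional acoustic features with one-hot emotion labels.
happy_x, happy_y = [1.0, 0.0], [1.0, 0.0]
sad_x, sad_y = [0.0, 1.0], [0.0, 1.0]
mixed_x, mixed_y = mixup(happy_x, happy_y, sad_x, sad_y, lam=0.25)
lam = sample_lambda(rng=random.Random(0))
```

An adaptive variant would replace the fixed Beta draw with a coefficient chosen per sample pair; how AAM makes that choice is left to the paper itself.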
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Metaverse Support Program to Nurture the Best Talents (IITP-2024-RS-2023-00254529), a grant funded by the Korea government (MSIT).
Abstract: Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and improve brain tumor segmentation performance, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), into a powerful SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved higher segmentation accuracy than baseline models on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET). In particular, ablation studies clarified the rationale and contribution of the model design and verified the effectiveness of the proposed HCAD decoder block. The results of this study are expected to contribute to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
Abstract: Modern business information systems face significant challenges in managing heterogeneous data sources, integrating disparate systems, and providing real-time decision support in complex enterprise environments. Contemporary enterprises typically operate 200+ interconnected systems, with research indicating that 52% of organizations manage three or more enterprise content management systems, creating information silos that reduce operational efficiency by up to 35%. While attention mechanisms have demonstrated remarkable success in natural language processing and computer vision, their systematic application to business information systems remains largely unexplored. This paper presents the theoretical foundation for a Hierarchical Attention-Based Business Information System (HABIS) framework that applies multi-level attention mechanisms to enterprise environments. We provide a comprehensive mathematical formulation of the framework, analyze its computational complexity, and present a proof-of-concept implementation with simulation-based validation that demonstrates a 42% reduction in cross-system query latency compared to legacy ERP modules and a 70% improvement in prediction accuracy over baseline methods. The theoretical framework introduces four hierarchical attention levels: system-level attention for dynamic weighting of business systems, process-level attention for business process prioritization, data-level attention for critical information selection, and temporal attention for time-sensitive pattern recognition. Our complexity analysis demonstrates that the framework achieves O(n log n) computational complexity for attention computation, making it scalable to large enterprise environments, including retail supply chains with 200+ system-scale deployments. The proof-of-concept implementation validates the theoretical framework's feasibility with an MSE loss of 0.439 and response times of 0.000120 s per query, demonstrating its potential for addressing key challenges in business information systems. This work establishes a foundation for future empirical research and practical implementation of attention-driven enterprise systems.
Funding: Supported by STI 2030-Major Projects, No. 2021ZD0201603 (to JL); the Joint Foundation Program of the Chinese Academy of Sciences, No. 8091A170201 (to JL); the National Natural Science Foundation of China, Nos. T2293730 (to XC), T2293731 (to XC), T2293734 (to XC), 62471291 (to YW), 62121003 (to XC), 61960206012 (to XC), 62333020 (to XC), and 62171434 (to XC); and the National Key Research and Development Program of China, Nos. 2022YFC2402501 (to XC) and 2022YFB3205602 (to XC).
Abstract: Neuromodulation techniques effectively intervene in cognitive function, holding considerable scientific and practical value in fields such as aerospace, medicine, life sciences, and brain research. These techniques use electrical stimulation to directly or indirectly target specific brain regions, modulating neural activity and influencing broader brain networks, thereby regulating cognitive function. Regulating cognitive function involves an understanding of aspects such as perception, learning and memory, attention, spatial cognition, and physical function. To enhance the application of cognitive regulation in the general population, this paper reviews recent publications from the Web of Science to assess the advancements and challenges of invasive and non-invasive stimulation methods in modulating cognitive functions. This review covers various invasive neuromodulation techniques for cognitive intervention, including deep brain stimulation, vagus nerve stimulation, and methods using microelectrode arrays. The non-invasive techniques discussed include transcranial magnetic stimulation, transcranial direct current stimulation, transcranial alternating current stimulation, transcutaneous electrical acupoint stimulation, and time interference stimulation for activating deep targets. Invasive stimulation methods, which are ideal for studying the pathogenesis of neurological diseases, tend to cause greater trauma and have been less researched in the context of cognitive function regulation. Non-invasive methods, particularly newer transcranial stimulation techniques, are gentler and more appropriate for regulating cognitive functions in the general population. These include transcutaneous electrical acupoint stimulation and time interference methods for activating deep targets. This paper also discusses current technical challenges and potential future breakthroughs in neuromodulation technology. It is recommended that neuromodulation techniques be combined with neural detection methods to better assess their effects and improve the accuracy of non-invasive neuromodulation. Additionally, research on closed-loop feedback neuromodulation methods is identified as a promising direction for future development.
Funding: Financial support from the National Natural Science Foundation of China (Nos. 52204089, 52374082) and the Young Elite Scientists Sponsorship Program (No. 2023QNRC001) of the China Association for Science and Technology (CAST).
Abstract: Microseismic (MS) monitoring is an effective technique to detect mining-induced rock fractures. However, recognizing grouting-induced signals is challenging due to complex geological conditions in deep rock plates. Therefore, a hybrid model (WM-ResNet50) integrating data enhancement, a deep convolutional neural network (CNN), and convolutional block attention modules (CBAM) was proposed. Firstly, an MS system was established at the Xieqiao coal mine in Anhui Province, China. MS waveforms and injection parameters were acquired during grouting. Secondly, signals were categorized based on time-frequency characteristics to build a dataset, which was divided into training, validation, and test sets at a ratio of 4:1:1. Subsequently, the performance of WM-ResNet50 was evaluated based on indices such as individual precision, total accuracy, recall, and loss function. The results indicated that WM-ResNet50 achieved an average recognition accuracy of 94.38%, surpassing that of a simple CNN (90.04%), ResNet18 (91.72%), and ResNet50 (92.48%). Finally, WM-ResNet50 was applied to monitor the whole process in laboratory tests and field cases. Both results affirmed the feasibility and effectiveness of MS inversion in predicting actual slurry diffusion ranges within deep rock layers. By comparison, it was revealed that the MS sources classified by WM-ResNet50 matched grouting records well. A solution to address insufficient diffusion under long-borehole grouting has been proposed. WM-ResNet50's accuracy was validated through in-situ coring and XRD analysis for cement-based hydration products. This study provides a beneficial reference for similar rock signal processing and in-field grouting practices.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 61773330).
Abstract: Audio-visual speech recognition (AVSR), which integrates audio and visual modalities to improve recognition performance and robustness in noisy or adverse acoustic conditions, has attracted significant research interest. However, Conformer-based architectures remain computationally expensive because the space and time complexity of their softmax-based attention mechanisms grows quadratically with sequence length. In addition, Conformer-based architectures may not provide sufficient flexibility for modeling local dependencies at different granularities. To mitigate these limitations, this study introduces a novel AVSR framework based on a ReLU-based Sparse and Grouped Conformer (RSG-Conformer) architecture. Specifically, we propose a Global-enhanced Sparse Attention (GSA) module incorporating an efficient context restoration block to recover lost contextual cues. Concurrently, a Grouped-scale Convolution (GSC) module replaces the standard Conformer convolution module, providing adaptive local modeling across varying temporal resolutions. Furthermore, we integrate a Refined Intermediate Contextual CTC (RIC-CTC) supervision strategy. This approach applies progressively increasing loss weights combined with convolution-based context aggregation, thereby further relaxing the constraint of conditional independence inherent in standard CTC frameworks. Evaluations on the LRS2 and LRS3 benchmarks validate the efficacy of our approach, with word error rates (WERs) reduced to 1.8% and 1.5%, respectively. These results indicate state-of-the-art performance in AVSR tasks.
Funding: Supported by the National Social Science Fund of China (20BXW101).
Abstract: Detecting fake news in multimodal and multilingual social media environments is challenging due to inherent noise, inter-modal imbalance, computational bottlenecks, and semantic ambiguity. To address these issues, we propose SparseMoE-MFN, a novel unified framework that integrates sparse attention with a sparse-activated Mixture-of-Experts (MoE) architecture. This framework aims to enhance the efficiency, inferential depth, and interpretability of multimodal fake news detection. SparseMoE-MFN leverages LLaVA-v1.6-Mistral-7B-HF for efficient visual encoding and Qwen/Qwen2-7B for text processing. The sparse attention module adaptively filters irrelevant tokens and focuses on key regions, reducing computational costs and noise. The sparse MoE module dynamically routes inputs to specialized experts (visual, language, cross-modal alignment) based on content heterogeneity. This expert specialization design boosts computational efficiency and semantic adaptability, enabling precise processing of complex content and improving performance on ambiguous categories. Evaluated on the large-scale, multilingual MR2 dataset, SparseMoE-MFN achieves state-of-the-art performance. It obtains an accuracy of 86.7% and a macro-averaged F1 score of 0.859, outperforming strong baselines like MiniGPT-4 by 3.4% and 3.2%, respectively. Notably, it shows significant advantages in the "unverified" category. Furthermore, SparseMoE-MFN demonstrates superior computational efficiency, with an average inference latency of 89.1 ms and 95.4 GFLOPs, substantially lower than existing models. Ablation studies and visualization analyses confirm the effectiveness of both sparse attention and sparse MoE components in improving accuracy, generalization, and efficiency.
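The dynamic routing described in this abstract is commonly realised with top-k gating, sketched below. This is a generic illustration rather than the SparseMoE-MFN router: the abstract does not state the gating rule, so the choice of top-k with a softmax over the selected logits, the value `k=2`, and the toy experts are all assumptions.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return dict(zip(top, weights))

def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum of the outputs of the selected experts only.

    Unselected experts are never called, which is where the
    computational saving of sparse activation comes from.
    Each expert is assumed to return a vector as long as x.
    """
    routing = route_top_k(gate_logits, k)
    out = [0.0] * len(x)
    for idx, weight in routing.items():
        y = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out

# Three toy experts standing in for visual, language, and alignment experts.
experts = [
    lambda x: [2.0 * v for v in x],
    lambda x: [v + 1.0 for v in x],
    lambda x: [0.0 for _ in x],
]
gate_logits = [1.0, 1.0, -5.0]  # hypothetical gate scores for one input
routing = route_top_k(gate_logits, k=2)
output = moe_forward([1.0, 2.0], experts, gate_logits, k=2)
```

In a trained model the gate logits come from a small learned network applied to the input, and an auxiliary load-balancing term is often added so that no single expert receives most of the traffic.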
Funding: Supported by the National Natural Science Foundation of China (62466045), the Inner Mongolia Natural Science Foundation Project (2021LHMS06003), and the Inner Mongolia University Basic Research Business Fee Project (114).
Abstract: Graph Federated Learning (GFL) has shown great potential in privacy protection and distributed intelligence through distributed collaborative training of graph-structured data without sharing raw information. However, existing GFL approaches often lack the capability for comprehensive feature extraction and adaptive optimization, particularly in non-independent and identically distributed (non-IID) scenarios where balancing global structural understanding and local node-level detail remains a challenge. To this end, this paper proposes a novel framework called GFL-SAR (Graph Federated Collaborative Learning Framework Based on Structural Amplification and Attention Refinement), which enhances the representation learning capability of graph data through a dual-branch collaborative design. Specifically, we propose the Structural Insight Amplifier (SIA), which uses an improved Graph Convolutional Network (GCN) to strengthen structural awareness and improve modeling of topological patterns. In parallel, we propose the Attentive Relational Refiner (ARR), which employs an enhanced Graph Attention Network (GAT) to perform fine-grained modeling of node relationships and neighborhood features, thereby improving the expressiveness of local interactions and preserving critical contextual information. GFL-SAR integrates multi-scale features from both branches via feature fusion and federated optimization, thereby addressing existing GFL limitations in structural modeling and feature representation. Experiments on standard benchmark datasets including Cora, Citeseer, Polblogs, and Cora_ML demonstrate that GFL-SAR achieves superior performance in classification accuracy, convergence speed, and robustness compared to existing methods, confirming its effectiveness and generalizability in GFL tasks.