An accurate model of the ferrite transformation start temperature is crucial for designing a reasonable controlled rolling process and ensuring a uniform microstructure in aluminum-bearing dual-phase steel. Expansion-temperature curves of aluminum-bearing dual-phase steel were measured under continuous cooling and isothermal conditions using a dynamic transformation dilatometer. From these curves, the start temperature and incubation time of ferrite transformation were determined, and the influence of process parameters on both quantities was clarified. By combining metallurgical principles with the measured incubation times, and considering the effects of temperature and strain, a fitted model for the change in volume free energy during ferrite nucleation was derived. On this basis, a high-precision mathematical model of the ferrite transformation incubation time was established for the experimental steel. To calculate the ferrite transformation start temperature under continuous cooling more accurately, Scheil's additivity rule was modified to account for the effects of deformation and cooling rate. The results indicate that the modification coefficient decreases with increasing cooling rate and strain, and that the modified additivity rule significantly improves the accuracy of the calculated ferrite transformation start temperature.
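As a rough illustration of the additivity idea mentioned in this abstract, the sketch below accumulates the fractions dt / tau(T) along a constant-rate cooling path and reports the temperature at which the sum reaches a threshold. The abstract does not give the form of the modification, so treating the modification coefficient as a threshold that replaces 1 is an assumption made here, and the incubation-time curve `incubation_time` is a hypothetical placeholder, not the fitted model from the paper.

```python
def incubation_time(temp_c):
    """Hypothetical incubation time (s) at an isothermal temperature (deg C).

    Placeholder C-curve with a nose near 650 deg C; not fitted to measured data.
    """
    return 2.0 + 0.002 * (temp_c - 650.0) ** 2


def transformation_start(start_temp, cooling_rate, threshold=1.0, dt=0.01):
    """Return the temperature at which the additivity sum reaches `threshold`.

    The classical Scheil rule uses threshold = 1. Using a different threshold
    as the "modification coefficient" is an assumption of this sketch.
    """
    total = 0.0   # running sum of dt / tau(T)
    elapsed = 0.0  # time since cooling began (s)
    temp = start_temp
    while total < threshold:
        total += dt / incubation_time(temp)
        elapsed += dt
        temp = start_temp - cooling_rate * elapsed
        if temp < 0.0:
            raise ValueError("additivity sum never reached the threshold")
    return temp


if __name__ == "__main__":
    for rate in (5.0, 20.0):
        print(rate, round(transformation_start(850.0, rate), 1))
```

With this placeholder curve, faster cooling pushes the computed start temperature down, and a threshold below 1 raises it, which is the qualitative behaviour such a coefficient would need to capture.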
Photoacoustic computed tomography is a novel imaging technique that combines high absorption contrast with deep tissue penetration, enabling comprehensive three-dimensional imaging of biological targets. However, the growing demand for higher resolution and real-time imaging produces large data volumes, which limits the storage, transmission, and processing efficiency of the system. An effective method is therefore urgently needed to compress the raw data without compromising image quality. This paper presents a 3D data compression method and system for photoacoustic computed tomography based on a Wavelet-Transformer. The method uses a cooperative compression framework that integrates wavelet hard coding with deep learning-based soft decoding. It combines the multiscale analysis capability of wavelet transforms with the global feature modeling advantage of Transformers, achieving high-quality data compression and reconstruction. Experimental results from k-Wave simulation suggest that the proposed compression system has advantages under extreme compression conditions, achieving a raw data compression ratio of up to 1:40. Furthermore, a three-dimensional data compression experiment on an in vivo mouse demonstrated that the maximum peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) of the reconstructed images reached 38.60 and 0.9583, effectively overcoming the detail loss and artifacts introduced by raw data compression. The results suggest that the proposed system can significantly reduce storage requirements and hardware cost while enhancing computational efficiency and image quality. These advantages support the development of photoacoustic computed tomography toward higher efficiency, real-time performance, and intelligent functionality.
Chrysocolla is difficult to recover by sulfidation flotation, a difficulty closely related to the mineral surface composition. In this study, the effects of fluoride roasting on the surface composition of chrysocolla were investigated, its impact on sulfidation flotation was explored, and the mechanisms involved in both fluoride roasting and sulfidation flotation were discussed. With CaF₂ as the roasting reagent, Na₂S·9H₂O as the sulfidation reagent, and sodium butyl xanthate (NaBX) as the collector, flotation experiments showed that fluoride roasting improved the floatability of chrysocolla, and the recovery rate increased from 16.87% to 82.74%. X-ray diffraction analysis revealed that after fluoride roasting, nearly all the Cu on the chrysocolla surface was exposed in the form of CuO, which could provide a basis for subsequent sulfidation flotation. Microscopy and elemental analyses revealed large quantities of "pagoda-like" grains on the sulfidized surface of the fluoride-roasted chrysocolla, indicating highly crystalline copper sulfide particles. This suggests that sulfide formation on the chrysocolla surface was more pronounced. X-ray photoelectron spectroscopy revealed that fluoride roasting increased the relative contents of sulfur and copper on the surface and that both the Cu⁺ and polysulfide fractions on the mineral surface increased. This enhances sulfidation, which is conducive to flotation recovery. Therefore, fluoride roasting improved copper species transformation and sulfidation on the chrysocolla surface, promoted the adsorption of collectors, and improved the recovery of chrysocolla by sulfidation flotation.
Traffic flow prediction is a fundamental component of Intelligent Transportation Systems (ITS), playing a pivotal role in mitigating congestion, enhancing route optimization, and improving the utilization efficiency of roadway infrastructure. However, existing methods struggle in complex traffic scenarios due to static spatio-temporal embedding, restricted multi-scale temporal modeling, and weak representation of local spatial interactions. This study proposes Bi-STAT+, an enhanced bidirectional spatio-temporal attention framework that addresses these limitations through three principal contributions: (1) an adaptive spatio-temporal embedding module that dynamically adjusts embeddings to capture complex traffic variations; (2) frequency-domain analysis in the temporal dimension for simultaneous extraction of high-frequency details and low-frequency trends; and (3) an agent attention mechanism in the spatial dimension that enhances local feature extraction through dynamic weight allocation. Extensive experiments were performed on four datasets: two public benchmark datasets (PEMS04 and PEMS08) and two private datasets collected from Baotou and Chengdu, China. The results demonstrate that Bi-STAT+ consistently outperforms existing methods in terms of MAE, RMSE, and MAPE, while maintaining strong robustness against missing data and noise. Furthermore, the results show that prediction accuracy improves significantly with higher sampling rates, providing crucial insights for optimizing real-world deployment scenarios.
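The three error metrics named in this abstract have standard definitions, which the following minimal sketch spells out. The sample values are made up for illustration, and skipping zero-valued targets in MAPE is one common convention assumed here, not something the abstract specifies.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Zero-valued targets are skipped to avoid division by zero (an assumed convention).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


if __name__ == "__main__":
    flows_true = [100.0, 200.0, 400.0]  # illustrative vehicle counts
    flows_pred = [110.0, 180.0, 400.0]
    print(mae(flows_true, flows_pred), rmse(flows_true, flows_pred), mape(flows_true, flows_pred))
```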
Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. In the proposed approach, a "split-and-enhance" principle, introduced to our knowledge for the first time in the SOD literature, hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy at lower computational cost than recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches an MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methods and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and maximize segmentation performance, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), with a powerful SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved higher segmentation accuracy than baseline models on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET). In addition, ablation studies clarified the rationale and contribution of the model design and verified the effectiveness of the proposed HCAD decoder block. The results of this study are expected to contribute greatly to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
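For readers unfamiliar with the Dice score cited in this abstract, a minimal sketch of its standard definition on binary masks follows. The flat-list mask representation and the value returned for two empty masks are assumptions of this example, not details from the paper.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A and B| / (|A| + |B|) for flat binary masks (0/1 values)."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_score(predicted, reference))
```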
In Human–Robot Interaction (HRI), generating robot trajectories that accurately reflect user intentions while ensuring physical realism remains challenging, especially in unstructured environments. In this study, we develop a multimodal framework that integrates symbolic task reasoning with continuous trajectory generation. The approach employs transformer models and adversarial training to map high-level intent to robotic motion. Information from multiple data sources, such as voice traits, hand and body keypoints, visual observations, and recorded paths, is integrated simultaneously. These signals are mapped into a shared representation that supports interpretable reasoning while enabling smooth and realistic motion generation. Based on this design, two learning strategies are investigated. In the first strategy, grammar-constrained Linear Temporal Logic (LTL) expressions are created from multimodal human inputs and subsequently decoded into robot trajectories. The second strategy generates trajectories directly from symbolic intent and linguistic data, bypassing an intermediate logical representation. Transformer encoders combine multiple types of information, and autoregressive transformer decoders generate motion sequences. Adding smoothness and speed limits during training increases the likelihood of physical feasibility. To improve the realism and stability of the generated trajectories, an adversarial discriminator is also included during training to guide them toward the distribution of actual robot motion. Tests on the NATSGLD dataset indicate that the complete system exhibits stable training behaviour and performance. In normalised coordinates, the logic-based pipeline has an Average Displacement Error (ADE) of 0.040 and a Final Displacement Error (FDE) of 0.036. The adversarial generator performs substantially better, reducing ADE to 0.021 and FDE to 0.018. Visual examination confirms that the generated trajectories closely align with observed motion patterns while preserving smooth temporal dynamics.
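The two trajectory metrics reported here have simple standard definitions: ADE averages the Euclidean distance between predicted and reference points over all time steps, and FDE is that distance at the final step. The sketch below assumes trajectories are equal-length lists of coordinate tuples; the sample points are made up.

```python
import math


def ade_fde(predicted, reference):
    """Return (ADE, FDE) for two equal-length trajectories of coordinate tuples."""
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances), distances[-1]


if __name__ == "__main__":
    predicted_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    reference_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(ade_fde(predicted_path, reference_path))
```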
Multimodal spatiotemporal data from smart city consumer electronics present critical challenges, including cross-modal temporal misalignment, unreliable data quality, limited joint modeling of spatial and temporal dependencies, and weak resilience to adversarial updates. To address these limitations, EdgeST-Fusion is introduced as a cross-modal federated graph transformer framework for context-aware smart city analytics. The architecture integrates cross-modal embedding networks for modality alignment, graph transformer encoders for spatial dependency modeling, temporal self-attention for dynamic pattern learning, and adaptive anomaly detection to ensure data quality and security during aggregation. A privacy-preserving federated learning protocol with differential privacy guarantees enables collaborative model training without centralizing sensitive data. The framework employs data-quality-aware weighted aggregation to enhance robustness against noisy and malicious client updates. Experimental evaluation on the GeoLife, PeMS-Bay, and SmartHome+ datasets demonstrates that EdgeST-Fusion achieves a 21.8% improvement in prediction accuracy, a 35.7% reduction in communication overhead, and a 29.4% enhancement in security resilience compared to recent baselines. Real-world deployment across three smart city testbeds validates practical viability, with 90.0% average accuracy and sub-250 ms inference latency. The proposed framework remains feasible for deployment on heterogeneous and resource-constrained consumer electronics devices while maintaining strong privacy guarantees and scalability for large-scale urban environments.
The increasing integration of cyber-physical components in Industry 4.0 water infrastructures has heightened the risk of false data injection (FDI) attacks, posing critical threats to operational integrity, resource management, and public safety. Traditional detection mechanisms often struggle to generalize across heterogeneous environments or adapt to sophisticated, stealthy threats. To address these challenges, we propose a novel evolutionary optimized transformer-based deep reinforcement learning framework (Evo-Transformer-DRL) designed for robust and adaptive FDI detection in smart water infrastructures. The proposed architecture integrates three paradigms: a transformer encoder for modeling complex temporal dependencies in multivariate time series, a DRL agent for learning optimal decision policies in dynamic environments, and an evolutionary optimizer to fine-tune model hyper-parameters. This synergy enhances detection performance while maintaining adaptability across varying data distributions. Specifically, the hyper-parameters of both the transformer and DRL modules are optimized using an improved grey wolf optimizer (IGWO), ensuring a balanced trade-off between detection accuracy and computational efficiency. The model is trained and evaluated on three realistic Industry 4.0 water datasets: secure water treatment (SWaT), water distribution (WADI), and battle of the attack detection algorithms (BATADAL), which capture diverse attack scenarios in smart treatment and distribution systems. Comparative analysis against state-of-the-art baselines, including Transformer, DRL, bidirectional encoder representations from transformers (BERT), convolutional neural network (CNN), long short-term memory (LSTM), and support vector machines (SVM), demonstrates that the proposed Evo-Transformer-DRL framework consistently outperforms the others in key metrics such as accuracy, recall, area under the curve (AUC), and execution time. Notably, it achieves a maximum detection accuracy of 99.19%, highlighting its strong generalization capability across different testbeds. These results confirm the suitability of the hybrid framework for real-world Industry 4.0 deployment, where rapid adaptation, scalability, and reliability are paramount for securing critical infrastructure systems.
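Building-polygon simplification methods in map generalization rely on hand-crafted rules, offer a low degree of automation, and make little use of existing simplification results. To address these problems, this paper proposes a Transformer-based building-polygon simplification model. The model first maps each building polygon into a grid space of a given extent and expresses the polygon's coordinate string as a grid sequence, thereby obtaining token sequences of the polygon before and after simplification and constructing paired simplification samples. A Transformer architecture is then used to build the model; trained on the sample data, its masked self-attention mechanism learns the dependencies among the points in the sequence, and the model finally generates the new simplified polygon point by point. During training, the model uses structured sample data, and a cross-entropy loss function that ignores specific indices was designed to improve simplification quality. The experimental design comprises a main experiment and a generalization validation. The main experiment is based on a 1:2000 building dataset of Los Angeles; polygons were encoded with three grid sizes (0.2, 0.3, and 0.5 mm) to perform simplification to target scales of 1:5000 and 1:10000. The results show that the model performs best at a grid size of 0.3 mm, with the simplification results on the validation set agreeing with manual annotations at a rate above 92.0%, and a generalization experiment on building-polygon data from part of Beijing verified the model's transferability. A comparison with an LSTM model shows that, with a similar parameter count, the LSTM model could not converge effectively or generate usable results. This paper confirms the potential of the Transformer for spatial geometric sequence tasks and shows that it can effectively reuse existing simplification samples, providing a route of practical engineering value for intelligent building-polygon simplification.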
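Two steps in this abstract lend themselves to a short sketch: turning polygon vertices into grid-cell tokens, and a cross-entropy loss that skips positions carrying a reserved index. The abstract does not give the actual encoding or loss details, so the row-major token numbering, the `IGNORE_INDEX` value, and all function names below are hypothetical choices made for illustration.

```python
import math

IGNORE_INDEX = -100  # hypothetical marker for positions excluded from the loss


def polygon_to_tokens(coords, origin, cell_size, n_cols):
    """Map (x, y) vertices to grid-cell tokens using row-major numbering (assumed scheme)."""
    origin_x, origin_y = origin
    tokens = []
    for x, y in coords:
        col = int((x - origin_x) // cell_size)
        row = int((y - origin_y) // cell_size)
        tokens.append(row * n_cols + col)
    return tokens


def masked_cross_entropy(logits, targets, ignore_index=IGNORE_INDEX):
    """Average cross-entropy over positions whose target is not `ignore_index`."""
    total, count = 0.0, 0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # e.g. padding positions contribute nothing
        peak = max(row)  # subtract the maximum for a numerically stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_sum - row[target]
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.7), (0.0, 0.7)]
    print(polygon_to_tokens(square, origin=(0.0, 0.0), cell_size=0.5, n_cols=4))
    print(masked_cross_entropy([[0.0, 0.0], [5.0, 1.0]], [0, IGNORE_INDEX]))
```

A coarser grid shortens the vocabulary but merges nearby vertices into the same token, which is one plausible reason the choice of grid size affects simplification quality.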
Funding (ferrite transformation start temperature study): Supported by the National Science and Technology Major Project: Intelligent Manufacturing Systems and Robots (2025ZD1602200) and the National Key Research and Development Program of China (Grant No. 2022YFB3304800).
Funding (photoacoustic computed tomography data compression study): Supported by the National Key R&D Program of China [Grant No. 2023YFF0713600]; the National Natural Science Foundation of China [Grant No. 62275062]; the Project of Shandong Innovation and Startup Community of High-end Medical Apparatus and Instruments [Grant No. 2023-SGTTXM-002 and 2024-SGTTXM-005]; the Shandong Province Technology Innovation Guidance Plan (Central Leading Local Science and Technology Development Fund) [Grant No. YDZX2023115]; the Taishan Scholar Special Funding Project of Shandong Province; and the Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai [Grant No. ZL202402].
Funding (chrysocolla fluoride roasting study): Financially supported by the National Natural Science Foundation of China (No. 52374259), the Open Fund of the State Key Laboratory of Mineral Processing Science and Technology, China (No. BGRIMM-KJSKL-2023-11), and the Major Science and Technology Projects in Yunnan Province, China (No. 202302 AF080004).
Funding (Bi-STAT+ traffic flow prediction study): Partly supported by the Youth Foundation of the Inner Mongolia Natural Science Foundation [grant numbers 2024QN06017 and 2025MS06022], the Basic Scientific Research Business Fee Project for Universities in Inner Mongolia [grant numbers 2023XKJX019 and 2023XKJX024], and the Central Guidance on Local Science and Technology Development Fund [grant number 2024ZY0084].
Funding (SwinHCAD brain tumor segmentation study): Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Metaverse Support Program to Nurture the Best Talents (IITP-2024-RS-2023-00254529) grant funded by the Korea government (MSIT).
Funding (HRI trajectory generation study): The authors extend their appreciation to Prince Sattam bin Abdulaziz University for funding this research work through project number PSAU/2024/01/32082.
Abstract: In Human–Robot Interaction (HRI), generating robot trajectories that accurately reflect user intentions while ensuring physical realism remains challenging, especially in unstructured environments. In this study, we develop a multimodal framework that integrates symbolic task reasoning with continuous trajectory generation. The approach employs transformer models and adversarial training to map high-level intent to robotic motion. Information from multiple data sources, such as voice traits, hand and body keypoints, visual observations, and recorded paths, is integrated simultaneously. These signals are mapped into a shared representation that supports interpretable reasoning while enabling smooth and realistic motion generation. Based on this design, two different learning strategies are investigated. In the first method, grammar-constrained Linear Temporal Logic (LTL) expressions are created from multimodal human inputs and subsequently decoded into robot trajectories. The second method generates trajectories directly from symbolic intent and linguistic data, bypassing an intermediate logical representation. Transformer encoders combine multiple types of information, and autoregressive transformer decoders generate motion sequences. Adding smoothness and speed limits during training increases the likelihood of physical feasibility. To improve the realism and stability of the generated trajectories, an adversarial discriminator is also included during training to guide them toward the distribution of actual robot motion. Tests on the NATSGLD dataset indicate that the complete system exhibits stable training behaviour and performance. In normalised coordinates, the logic-based pipeline has an Average Displacement Error (ADE) of 0.040 and a Final Displacement Error (FDE) of 0.036. The adversarial generator performs substantially better, reducing ADE to 0.021 and FDE to 0.018. Visual examination confirms that the generated trajectories closely align with observed motion patterns while preserving smooth temporal dynamics.
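For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how ADE and FDE are commonly computed for a single trajectory. The function name `ade_fde` and the sample points are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
import math

def ade_fde(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length sequences of (x, y) points in the same
    (e.g. normalised) coordinate frame. Hypothetical helper, not taken
    from the paper.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and equal length")
    # Euclidean distance between matching time steps
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde

pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
```

Over a dataset, both values would typically be averaged across all evaluated trajectories.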
Funding: Supported by the University of Tabuk, Saudi Arabia.
Abstract: Multimodal spatiotemporal data from smart city consumer electronics present critical challenges, including cross-modal temporal misalignment, unreliable data quality, limited joint modeling of spatial and temporal dependencies, and weak resilience to adversarial updates. To address these limitations, EdgeST-Fusion is introduced as a cross-modal federated graph transformer framework for context-aware smart city analytics. The architecture integrates cross-modal embedding networks for modality alignment, graph transformer encoders for spatial dependency modeling, temporal self-attention for dynamic pattern learning, and adaptive anomaly detection to ensure data quality and security during aggregation. A privacy-preserving federated learning protocol with differential privacy guarantees enables collaborative model training without centralizing sensitive data. The framework employs data-quality-aware weighted aggregation to enhance robustness against noisy and malicious client updates. Experimental evaluation on the GeoLife, PeMS-Bay, and SmartHome+ datasets demonstrates that EdgeST-Fusion achieves a 21.8% improvement in prediction accuracy, a 35.7% reduction in communication overhead, and a 29.4% enhancement in security resilience compared to recent baselines. Real-world deployment across three smart city testbeds validates practical viability with 90.0% average accuracy and sub-250 ms inference latency. The proposed framework remains feasible for deployment on heterogeneous and resource-constrained consumer electronics devices while maintaining strong privacy guarantees and scalability for large-scale urban environments.
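The abstract does not give the aggregation formula, so the sketch below only illustrates the general idea of quality-weighted federated averaging: each client update is weighted in proportion to a data-quality score. The function name `quality_weighted_average`, the flat-list representation of updates, and the scores are assumptions made for illustration, not the EdgeST-Fusion method itself.

```python
def quality_weighted_average(updates, quality):
    """Combine client updates, weighting each by its quality score.

    updates: list of equal-length lists of floats (one per client).
    quality: list of non-negative scores, one per client (assumed to be
             supplied by some upstream quality or anomaly check).
    """
    if not updates or len(updates) != len(quality):
        raise ValueError("need one quality score per client update")
    total = sum(quality)
    if total <= 0:
        raise ValueError("at least one client must have positive quality")
    dim = len(updates[0])
    # Weighted mean per parameter; a score of 0 drops that client entirely
    return [
        sum(q * u[i] for u, q in zip(updates, quality)) / total
        for i in range(dim)
    ]

aggregated = quality_weighted_average(
    [[1.0, 2.0], [3.0, 4.0]],  # two client updates
    [1.0, 3.0],                # second client judged more reliable
)
```

With equal scores this reduces to a plain average of the client updates.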
Abstract: The increasing integration of cyber-physical components in Industry 4.0 water infrastructures has heightened the risk of false data injection (FDI) attacks, posing critical threats to operational integrity, resource management, and public safety. Traditional detection mechanisms often struggle to generalize across heterogeneous environments or adapt to sophisticated, stealthy threats. To address these challenges, we propose a novel evolutionary optimized transformer-based deep reinforcement learning framework (Evo-Transformer-DRL) designed for robust and adaptive FDI detection in smart water infrastructures. The proposed architecture integrates three powerful paradigms: a transformer encoder for modeling complex temporal dependencies in multivariate time series, a DRL agent for learning optimal decision policies in dynamic environments, and an evolutionary optimizer to fine-tune model hyper-parameters. This synergy enhances detection performance while maintaining adaptability across varying data distributions. Specifically, hyper-parameters of both the transformer and DRL modules are optimized using an improved grey wolf optimizer (IGWO), ensuring a balanced trade-off between detection accuracy and computational efficiency. The model is trained and evaluated on three realistic Industry 4.0 water datasets: secure water treatment (SWaT), water distribution (WADI), and battle of the attack detection algorithms (BATADAL), which capture diverse attack scenarios in smart treatment and distribution systems. Comparative analysis against state-of-the-art baselines including Transformer, DRL, bidirectional encoder representations from transformers (BERT), convolutional neural network (CNN), long short-term memory (LSTM), and support vector machines (SVM) demonstrates that our proposed Evo-Transformer-DRL framework consistently outperforms others in key metrics such as accuracy, recall, area under the curve (AUC), and execution time. Notably, it achieves a maximum detection accuracy of 99.19%, highlighting its strong generalization capability across different testbeds. These results confirm the suitability of our hybrid framework for real-world Industry 4.0 deployment, where rapid adaptation, scalability, and reliability are paramount for securing critical infrastructure systems.
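Abstract: To address the problems that building polygon simplification methods in map generalization rely on manual rules, have a low degree of automation, and struggle to reuse existing simplification results, this paper proposes a building polygon simplification model based on the Transformer mechanism. The model first maps building polygons into a grid space of a given extent and expresses each polygon's coordinate string as a grid sequence, thereby obtaining token sequences for the polygon before and after simplification and constructing paired simplification samples. A model is then built on the Transformer architecture; trained on the sample data, it uses masked self-attention to learn dependencies between point sequences and finally generates the new simplified polygon point by point, thereby achieving building polygon simplification. During training, the model uses structured sample data, and a cross-entropy loss function that ignores specific indices is designed to improve simplification quality. The experimental design comprises a main experiment and a generalization validation. The main experiment uses a 1:2000 building dataset of Los Angeles, encodes polygons with three grid sizes (0.2, 0.3, and 0.5 mm), and performs simplification to target scales of 1:5000 and 1:10000. The results show that the model performs best at a grid size of 0.3 mm, where the agreement between simplification results on the validation set and manual annotations exceeds 92.0%; a generalization experiment on building polygon data from parts of Beijing verifies the model's transferability. A comparison with an LSTM model shows that, at a similar parameter scale, the LSTM model fails to converge effectively or produce usable results. This work confirms the potential of the Transformer for spatial geometric sequence tasks and shows that it can effectively reuse existing simplification samples, offering a route of practical engineering value for intelligent building polygon simplification.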
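The grid encoding described in this abstract can be pictured with a small sketch that snaps polygon vertices to grid cells and emits one integer token per vertex. The row-major token layout, the function name `polygon_to_tokens`, and the parameters `cell` and `ncols` are assumptions for illustration; the abstract does not specify the actual encoding scheme.

```python
import math

def polygon_to_tokens(points, cell, ncols, origin=(0.0, 0.0)):
    """Encode polygon vertices as grid-cell tokens.

    points: list of (x, y) vertices in map units (e.g. mm at map scale).
    cell:   grid cell size in the same units (e.g. 0.3).
    ncols:  number of columns (and rows) in the assumed square grid.
    Token = row * ncols + col, a hypothetical row-major layout.
    """
    tokens = []
    for x, y in points:
        col = math.floor((x - origin[0]) / cell)
        row = math.floor((y - origin[1]) / cell)
        if not (0 <= col < ncols and 0 <= row < ncols):
            raise ValueError(f"vertex {(x, y)} falls outside the grid")
        tokens.append(row * ncols + col)
    return tokens

tokens = polygon_to_tokens(
    [(0.0, 0.0), (0.35, 0.0), (0.35, 0.65)], cell=0.3, ncols=10
)
```

A coarser `cell` merges nearby vertices into the same token, which is one plausible reason the choice of grid size affects the reported accuracy.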