Funding: Supported by the National Science and Technology Major Project - Intelligent Manufacturing Systems and Robots (2025ZD1602200) and the National Key Research and Development Program of China (Grant No. 2022YFB3304800).
Abstract: An accurate model of the ferrite transformation start temperature is crucial for designing a reasonable controlled rolling process and ensuring a uniform microstructure in aluminum-bearing dual-phase steel. Expansion-temperature curves of aluminum-bearing dual-phase steel were measured under continuous cooling and isothermal conditions using a dynamic transformation dilatometer. From these curves, the start temperature and incubation time of ferrite transformation were determined, elucidating the influence of process parameters on both quantities. By integrating metallurgical principles with the measured incubation times, and considering the effects of temperature and strain, a fitting model for the change in volume free energy during ferrite nucleation was derived. On this basis, a high-precision mathematical model of the ferrite transformation incubation time was established for the experimental steel. To calculate the ferrite transformation start temperature under continuous cooling more accurately, Scheil’s additivity rule was modified to account for the effects of deformation and cooling rate. The results indicate that the modification coefficient decreases with increasing cooling rate and strain, and that the modified additivity rule significantly improves the accuracy of the calculated ferrite transformation start temperature.
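Scheil's additivity rule treats continuous cooling as a chain of short isothermal holds and predicts the transformation start once the summed fractions dt/tau(T) of consumed incubation time reach 1. The sketch below illustrates that accumulation in Python. The incubation-time function `tau` and the threshold `k` that stands in for a modification coefficient are illustrative assumptions; the abstract does not give the fitted model or the exact form of the modification.

```python
def tau(temp_c):
    """Hypothetical C-shaped incubation time in seconds (not the paper's fitted model)."""
    return 1.0 + 0.002 * (temp_c - 650.0) ** 2


def start_temperature(t0_c, cooling_rate, k=1.0, dt=0.01, t_min_c=400.0):
    """Return the temperature (deg C) at which sum(dt / tau(T)) first reaches k.

    k = 1 is the classical Scheil rule. Using another value of k is an assumed
    way of applying a modification coefficient, chosen only for illustration.
    cooling_rate is in deg C per second; dt is the time step in seconds.
    """
    temp, total = t0_c, 0.0
    while temp > t_min_c:
        total += dt / tau(temp)          # fraction of incubation consumed in this step
        if total >= k:
            return temp
        temp -= cooling_rate * dt        # linear cooling
    return None                          # no start predicted above t_min_c
```

With this toy `tau`, faster cooling lowers the predicted start temperature, and a smaller threshold `k` raises it, which is the qualitative lever a modification coefficient provides.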
Funding: Supported by the Major Science and Technology Project of Yunnan Province, China (Grant No. 202502AD080007) and the National Natural Science Foundation of China (Grant No. 52378288).
Abstract: Vehicle-induced response separation is a crucial issue in structural health monitoring (SHM). This paper proposes a block-wise sliding recursive wavelet transform algorithm to meet the real-time processing requirements of monitoring data. To extend the separation target from a fixed dataset to a continuously updating data stream, a block-wise sliding framework is first developed. This framework is further optimized for the characteristics of real-time data streams, and its advantage in computational efficiency is demonstrated theoretically. During decomposition and reconstruction, information from neighboring data blocks is reused to reduce algorithmic complexity. In addition, a delay-setting strategy is introduced for each processing window to mitigate boundary effects, thereby balancing accuracy and efficiency. Simulated-signal experiments are conducted to determine the optimal delay configuration and to verify the algorithm’s performance: it achieves a lower root mean square error (RMSE) than the original algorithm, with an average computational time only 0.0249 times as long. Furthermore, strain signals from the Lieshi River Bridge are used to validate the method. The proposed algorithm successfully separates the static trend from vehicle-induced responses in real time across different sampling frequencies, demonstrating its effectiveness and applicability in real-time bridge monitoring.
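The role of the per-window delay can be illustrated with a much simpler filter than the recursive wavelet transform. The sketch below is an assumption-laden stand-in, not the paper's algorithm: it uses a centered moving average as the trend extractor, processes the stream block by block while reusing a little left context from earlier blocks, and holds back the last `delay` samples of each window so that right-edge boundary effects never reach the output. All function and parameter names are hypothetical.

```python
import numpy as np


def trend(x, half):
    """Centered moving average with edge padding (stand-in for a wavelet-based trend)."""
    padded = np.pad(x, half, mode="edge")
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    return np.convolve(padded, kernel, mode="valid")


def stream_trend(signal, block, half, delay):
    """Process `signal` in blocks of `block` samples, emitting results `delay` samples late."""
    out, emitted = [], 0
    for end in range(block, len(signal) + 1, block):
        start = max(0, emitted - half)       # left context reused from earlier blocks
        est = trend(signal[start:end], half)
        stop = end - delay                   # hold back samples near the right edge
        if stop > emitted:
            out.append(est[emitted - start:stop - start])
            emitted = stop
    return np.concatenate(out) if out else np.array([])
```

For this stand-in filter, a delay of at least `half` samples makes the streamed output identical to the offline result on the emitted samples, while a delay of zero lets edge padding distort the samples at each block boundary. That is the accuracy-versus-latency trade the delay setting controls.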
Abstract: Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. The approach follows a “split-and-enhance” principle, introduced to our knowledge for the first time in the SOD literature, which hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and the distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Channel-aware fusion blocks then adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy at lower computational cost than recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches an MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
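The split-and-enhance routing can be sketched as follows, assuming the macro stage has already produced a superpixel id map and one class per superpixel. The class codes, the `refine` callable (standing in for the U-Net-based micro stage), and the function name are hypothetical; only the idea of refining transition regions alone comes from the abstract.

```python
import numpy as np

BACKGROUND, INTERIOR, TRANSITION = 0, 1, 2   # assumed class codes


def compose_saliency(sp_map, sp_class, refine):
    """Build a saliency map, refining only transition superpixels.

    sp_map:   (H, W) integer superpixel id per pixel
    sp_class: (N,) class code per superpixel
    refine:   callable taking the (H, W) boolean transition mask and returning
              an (H, W) saliency map in [0, 1] (stand-in for the micro stage)
    """
    pixel_class = sp_class[sp_map]                    # broadcast region labels to pixels
    saliency = (pixel_class == INTERIOR).astype(float)
    mask = pixel_class == TRANSITION
    if mask.any():                                    # spend refinement only where needed
        saliency[mask] = refine(mask)[mask]
    return saliency, mask
```

Background and interior pixels take their value directly from the macro-stage decision, so the cost of the refinement step scales with the size of the transition mask rather than with the whole image.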
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Metaverse Support Program to Nurture the Best Talents (IITP-2024-RS-2023-00254529), a grant funded by the Korea government (MSIT).
Abstract: Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methods and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges, this research introduces a SwinUNETR-based model, SwinHCAD, which integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), with the SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to fuse the multi-scale information transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieves improved segmentation accuracy over baseline models on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET). Ablation studies verify the effectiveness of the proposed HCAD decoder block and clarify the rationale for the model design. The results are expected to contribute to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
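For reference, the Dice score used above is 2|P ∩ G| / (|P| + |G|) for a predicted mask P and a ground-truth mask G. A minimal NumPy sketch for one binary subregion mask follows; the function name and the `eps` smoothing term are illustrative choices, and how WT, TC, and ET masks are derived from BraTS labels is not covered here.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined (and equal to 1) when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

For example, a prediction of two voxels that overlaps a one-voxel ground truth in one voxel scores 2·1 / (2 + 1) ≈ 0.667.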
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024JKF13) and the Beijing Municipal Education Commission General Program of Science and Technology (No. KM202414019003).
Abstract: With the widespread use of SMS (Short Message Service), the proliferation of malicious SMS has emerged as a pressing societal issue. While deep learning-based text classifiers offer promise, they often perform suboptimally in fine-grained detection tasks, primarily because of imbalanced datasets and insufficient model representation capabilities. To address this challenge, this paper proposes an LLM-enhanced graph fusion dual-stream Transformer model for fine-grained Chinese malicious SMS detection. During the data processing stage, Large Language Models (LLMs) are employed for data augmentation, mitigating dataset imbalance. In the data input stage, both word-level and character-level features are used as model inputs, enriching the features and preventing information loss. A dual-stream Transformer serves as the backbone network in the representation learning stage, complemented by a graph-based feature fusion mechanism. At the output stage, a supervised classification cross-entropy loss and a supervised contrastive learning loss are used together as multi-task optimization objectives, further enhancing the model’s feature representation. Experimental results demonstrate that the proposed method significantly outperforms baselines on a publicly available Chinese malicious SMS dataset.
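The two output-stage objectives can be sketched in NumPy as a cross-entropy term plus a supervised contrastive term that pulls same-class embeddings together. This is a generic formulation under stated assumptions: the `temperature`, the `weight` that combines the two terms, and all function names are hypothetical, and the abstract does not say how the paper balances the losses.

```python
import numpy as np


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy; logits (N, C), integer labels (N,)."""
    z = logits - logits.max(axis=1, keepdims=True)        # stabilise the softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def sup_con_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings (N, D).

    Assumes at least one sample in the batch shares its label with another.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    sim = sim - sim.max(axis=1, keepdims=True)
    denom = (np.exp(sim) * not_self).sum(axis=1, keepdims=True)
    log_prob = sim - np.log(denom)
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0                                     # anchors with a same-class partner
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / n_pos[valid]
    return per_anchor.mean()


def joint_loss(logits, features, labels, weight=0.5):
    """Assumed weighted sum of the two objectives."""
    return cross_entropy(logits, labels) + weight * sup_con_loss(features, labels)
```

Embeddings that cluster by class drive the contrastive term toward zero, whereas embeddings that cluster across classes are penalised heavily, which is why the term can help separate rare fine-grained categories.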
Funding: Supported by the State Grid Shandong Electric Power Company Project (Grant Number SGSDJX00BDJS2400388).
Abstract: Transformers are a core component of power systems, and their operational status directly affects grid stability. To address the problem of “domain shift” in cross-domain fault diagnosis, this paper proposes a memory-enhanced dual-stream network (MemFuse-DSN). The method reconstructs the feature space by selecting and enhancing multi-source domain samples based on similarity metrics. An adaptive weighted dual-stream architecture is designed, integrating gradient reversal and orthogonality constraints to achieve efficient feature alignment. In addition, a novel dual dynamic memory module is introduced: the task memory bank stores high-confidence class prototype information and adopts an exponential moving average (EMA) strategy to ensure the smooth evolution of prototypes over time; the domain memory bank is periodically updated and clusters potentially noisy features, dynamically tracking domain shift trends and thereby optimizing the decoupled feature learning process. Experimental validation was conducted on a ±110 kV transformer vibration testing platform using typical fault types including winding looseness, core looseness, and compound faults. The results show that the proposed method achieves a fault diagnosis accuracy of 99.2%, providing a highly generalizable solution for the intelligent operation and maintenance of power equipment.
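The EMA update of class prototypes in a task memory bank can be sketched as below. The confidence `threshold`, the `momentum` value, and the function name are illustrative assumptions; the abstract states only that high-confidence prototype information is stored and smoothed with an exponential moving average.

```python
import numpy as np


def update_prototypes(prototypes, features, probs, momentum=0.9, threshold=0.95):
    """EMA update of per-class prototypes from confidently classified samples.

    prototypes: (C, D) current class prototypes
    features:   (N, D) batch feature vectors
    probs:      (N, C) softmax outputs for the batch
    """
    updated = prototypes.copy()
    pred = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= threshold
    for c in range(prototypes.shape[0]):
        sel = confident & (pred == c)
        if sel.any():                                   # classes with no confident sample keep their prototype
            batch_mean = features[sel].mean(axis=0)
            updated[c] = momentum * prototypes[c] + (1.0 - momentum) * batch_mean
    return updated
```

A momentum close to 1 makes each prototype drift slowly, so a single noisy batch cannot move it far.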
Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 62031013) and the Guangdong Province Key Construction Discipline Scientific Research Capacity Improvement Project (Grant No. 2022ZDJS117).
Abstract: Nonlinear transforms, particularly those built from residual blocks, have significantly advanced learned image compression (LIC). Such transforms enhance nonlinear expressive ability and obtain compact feature representations by enlarging the receptive field, which indicates how the convolution process extracts features in a high-dimensional feature space. However, their functionality is restricted to the spatial dimension and network depth, and insufficient information interaction and representation limit further improvements in network performance. Crucially, the potential of the high-dimensional feature space in the channel dimension and the exploration of network width and resolution remain largely untapped. In this paper, we consider nonlinear transforms from the perspective of feature space, defining high-dimensional feature spaces in different dimensions and investigating their specific effects. First, we introduce dimension-increasing and dimension-decreasing transforms in both the channel and spatial dimensions to obtain a high-dimensional feature space and achieve better feature extraction. Second, we design a channel-spatial fusion residual transform (CSR), which incorporates multi-dimensional transforms for a more effective representation. Furthermore, we simplify the proposed fusion transform to obtain a slim architecture (CSR-sm), balancing network complexity and compression performance. Finally, we build the overall network from stacked CSR transforms to achieve better compression and reconstruction. Experimental results demonstrate that the proposed method achieves superior rate-distortion performance compared with existing LIC methods and traditional codecs. Specifically, it achieves a 9.38% BD-rate reduction over VVC on the Kodak dataset.
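One common way to raise and then lower the channel dimension inside a residual block is a pair of 1x1 convolutions around a nonlinearity. The NumPy sketch below shows that generic pattern only; it is an assumption for illustration and not the CSR transform itself, whose internal layout the abstract does not specify. The function name and weight shapes are hypothetical.

```python
import numpy as np


def residual_channel_transform(x, w_up, w_down):
    """Residual block that expands and then reduces the channel dimension.

    x:      (C, H, W) feature map
    w_up:   (R, C) weights of a 1x1 convolution raising C channels to R
    w_down: (C, R) weights of a 1x1 convolution projecting back to C
    """
    hidden = np.einsum("oc,chw->ohw", w_up, x)      # channel increase
    hidden = np.maximum(hidden, 0.0)                # ReLU in the higher-dimensional space
    return x + np.einsum("co,ohw->chw", w_down, hidden)   # channel decrease plus skip
```

The nonlinearity acts in the wider R-channel space while the skip connection keeps the input and output shapes identical, so blocks of this kind can be stacked freely.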
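Abstract: Building polygon simplification methods in map generalization rely on manually defined rules, have a low degree of automation, and make little use of existing simplification results. To address these problems, this paper proposes a Transformer-based building polygon simplification model. The model first maps each building polygon into a grid space of a given extent and expresses its coordinate string as a grid sequence, yielding Token sequences of the polygon before and after simplification and thereby constructing paired simplification samples. A Transformer architecture is then adopted: trained on the sample data, the model uses its masked self-attention mechanism to learn the dependencies between point sequences and finally generates the new simplified polygon point by point. During training, the model uses structured sample data, and a cross-entropy loss function that ignores a specific index is designed to improve simplification quality. The experiments comprise a main experiment and a generalization validation. The main experiment is based on a 1:2000 building dataset of Los Angeles; polygons are encoded with three grid sizes of 0.2, 0.3, and 0.5 mm, and simplification to target scales of 1:5000 and 1:10000 is achieved. The results show that the model performs best with a grid size of 0.3 mm, where the agreement rate between the simplification results and manual annotations on the validation set exceeds 92.0%, and a generalization experiment on building polygon data from parts of Beijing verifies the model's transferability. A comparison with an LSTM model shows that, with a similar number of parameters, the LSTM model fails to converge effectively or generate usable results. This paper confirms the potential of the Transformer for spatial geometric sequence tasks and shows that existing simplification samples can be effectively reused, providing a route of practical engineering value for intelligent building polygon simplification.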
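The grid encoding and the index-ignoring loss can be sketched as follows. This is a minimal illustration under assumptions: the row-major token numbering, the cell-centre decoding, the use of an ignored index for padding positions, and all function names are hypothetical, since the abstract does not give the exact encoding.

```python
import numpy as np


def polygon_to_tokens(coords, origin, cell, n_cols, n_rows):
    """Map (N, 2) x/y vertices to one integer grid token per vertex."""
    xy = np.asarray(coords, dtype=float) - np.asarray(origin, dtype=float)
    col = np.clip(np.floor(xy[:, 0] / cell).astype(int), 0, n_cols - 1)
    row = np.clip(np.floor(xy[:, 1] / cell).astype(int), 0, n_rows - 1)
    return row * n_cols + col                     # assumed row-major cell numbering


def tokens_to_polygon(tokens, origin, cell, n_cols):
    """Decode tokens back to the centres of their grid cells."""
    row, col = np.divmod(np.asarray(tokens), n_cols)
    centres = np.stack([col + 0.5, row + 0.5], axis=1) * cell
    return centres + np.asarray(origin, dtype=float)


def masked_cross_entropy(logits, targets, ignore_index):
    """Mean cross-entropy over positions whose target is not `ignore_index`."""
    keep = targets != ignore_index                # e.g. skip padding positions
    z = logits[keep] - logits[keep].max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(int(keep.sum())), targets[keep]].mean()
```

Decoding to cell centres bounds the per-axis quantization error by half the cell size, which is one reason the choice of grid size trades sequence vocabulary against geometric fidelity.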