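Abstract: In low-light environments, face recognition faces challenges such as low image quality and blurred features, which make it difficult for existing methods to extract robust, highly discriminative features and severely degrade recognition performance. To address this problem, a novel unpaired low-light face recognition model, LFSepNet (low-light face separation network), is proposed. Unlike conventional training methods built on convolutional neural network (CNN) architectures, LFSepNet adopts a Transformer architecture that captures long-range dependencies more effectively, thereby overcoming the limited local receptive field of CNNs. Because face images captured in low light tend to be dark overall, with only a few regions containing richer illumination information, conventional CNNs are easily confined to local regions during feature extraction and struggle to make full use of this key information. In contrast, the Transformer models global information through its self-attention mechanism, allowing the network to integrate information from unevenly lit face images more comprehensively, which improves feature disentanglement and the accuracy of low-light face recognition. LFSepNet includes an adaptive brightness separation module and an adaptive illumination gap loss, which dynamically separate face features from illumination features, reduce illumination interference, and further optimize the separation so that the model extracts more precise and robust features. Experimental results show that LFSepNet outperforms existing methods on multiple low-light face datasets, with a significant improvement in recognition accuracy under extreme low-light conditions in particular. This study provides an effective solution for low-light face recognition in the unpaired setting and shows good potential for practical applications.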
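The contrast drawn above between local receptive fields and global self-attention can be illustrated with a minimal sketch. It is not LFSepNet: there are no learned query, key, or value projections (the tokens are used directly for all three roles, which is an assumption made for brevity), and the three toy tokens stand in for image patches.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Every output token is a weighted average of ALL input tokens, so a
    well-lit patch can contribute to the representation of a dark patch
    regardless of how far apart they are in the image.
    """
    dim = len(tokens[0])
    weights = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical patch embeddings (illustrative values only).
toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(toy_tokens)
```

Each row of `weights` sums to one and has no zero entries, which is the sense in which the mechanism is global: no patch is excluded by distance.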
Funding: Funded by the Ministry of Higher Education (MoHE) Malaysia through the Fundamental Research Grant Scheme - Early Career Researcher (FRGS-EC), grant number FRGSEC/1/2024/ICT02/UNIMAP/02/8.
Abstract: ...critical for guiding treatment and improving patient outcomes. Traditional molecular subtyping via immunohistochemistry (IHC) testing is invasive and time-consuming, and may not fully represent tumor heterogeneity. This study proposes a non-invasive approach that uses digital mammography images and deep learning to classify breast cancer molecular subtypes. Four pretrained models, two Convolutional Neural Networks (MobileNet_V3_Large and VGG-16) and two Vision Transformers (ViT_B_16 and ViT_Base_Patch16_Clip_224), were fine-tuned to classify images into HER2-enriched, Luminal, Normal-like, and Triple Negative subtypes. Hyperparameter tuning, including learning rate adjustment and layer freezing strategies, was applied to optimize performance. Among the evaluated models, ViT_Base_Patch16_Clip_224 achieved the highest test accuracy (94.44%), with precision, recall, and F1-score all at 0.94, demonstrating excellent generalization. MobileNet_V3_Large achieved the same accuracy but showed less training stability. In contrast, VGG-16 recorded the lowest performance, indicating limited generalizability for this classification task. The study also highlighted the superior performance of the Vision Transformer models over CNNs, particularly due to their ability to capture global contextual features and the benefit of CLIP-based pretraining in ViT_Base_Patch16_Clip_224. To enhance clinical applicability, a graphical user interface (GUI) named "BCMS Dx" was developed for streamlined subtype prediction. Deep learning applied to mammography has proven effective for accurate and non-invasive molecular subtyping. The proposed Vision Transformer-based model and supporting GUI offer a promising direction for augmenting diagnostic workflows, minimizing the need for invasive procedures, and advancing personalized breast cancer management.
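The abstract reports accuracy alongside precision, recall, and F1-score for a four-class problem. A minimal sketch of how such figures are commonly derived from a confusion matrix follows. The macro averaging and the toy matrix are assumptions for illustration; the abstract does not state which averaging was used, and the numbers below are not the study's data.

```python
def classification_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, F1.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        true_pos = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = true_pos / predicted_k if predicted_k else 0.0
        r = true_pos / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical counts; rows and columns follow the order
# HER2-enriched, Luminal, Normal-like, Triple Negative.
toy_confusion = [
    [9, 1, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 8, 2],
    [0, 1, 0, 9],
]
metrics = classification_metrics(toy_confusion)
```

With balanced classes, macro recall equals accuracy, which is one reason the four reported figures can land so close together.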
Funding: The authors extend their appreciation to Prince Sattam bin Abdulaziz University for funding this research work through project number (PSAU/2024/01/32082).
Abstract: In Human–Robot Interaction (HRI), generating robot trajectories that accurately reflect user intentions while ensuring physical realism remains challenging, especially in unstructured environments. In this study, we develop a multimodal framework that integrates symbolic task reasoning with continuous trajectory generation. The approach employs transformer models and adversarial training to map high-level intent to robotic motion. Information from multiple data sources, such as voice traits, hand and body keypoints, visual observations, and recorded paths, is integrated simultaneously. These signals are mapped into a shared representation that supports interpretable reasoning while enabling smooth and realistic motion generation. Based on this design, two different learning strategies are investigated. In the first method, grammar-constrained Linear Temporal Logic (LTL) expressions are created from multimodal human inputs. These expressions are subsequently decoded into robot trajectories. The second method generates trajectories directly from symbolic intent and linguistic data, bypassing an intermediate logical representation. Transformer encoders combine multiple types of information, and autoregressive transformer decoders generate motion sequences. Adding smoothness and speed limits during training increases the likelihood of physical feasibility. To improve the realism and stability of the generated trajectories during training, an adversarial discriminator is also included to guide them toward the distribution of actual robot motion. Tests on the NATSGLD dataset indicate that the complete system exhibits stable training behaviour and performance. In normalised coordinates, the logic-based pipeline has an Average Displacement Error (ADE) of 0.040 and a Final Displacement Error (FDE) of 0.036. The adversarial generator improves substantially on this, reducing ADE to 0.021 and FDE to 0.018. Visual examination confirms that the generated trajectories closely align with observed motion patterns while preserving smooth temporal dynamics.
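The two error measures quoted above have standard definitions, sketched below under the assumption that trajectories are equal-length sequences of points in normalised coordinates. The function name and the toy trajectories are illustrative and are not taken from the paper.

```python
import math


def displacement_errors(predicted, reference):
    """Return (ADE, FDE) for two equal-length trajectories.

    ADE is the mean Euclidean distance over all time steps.
    FDE is the Euclidean distance at the final time step only.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("trajectories must be non-empty and equally long")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


# Hypothetical 2-D waypoints; the prediction drifts slowly off the reference.
predicted_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
reference_path = [(0.0, 0.0), (1.0, 0.03), (2.0, 0.06)]
ade, fde = displacement_errors(predicted_path, reference_path)
```

Because FDE looks only at the endpoint, a trajectory that drifts steadily has FDE above ADE, while one that wanders mid-path but lands correctly has the opposite pattern.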
Abstract: Weakly supervised semantic segmentation (WSSS) is a challenging task in which only category information is available for segmentation prediction. Thus, the key stage of WSSS is generating pseudo labels. Convolutional neural network (CNN) based methods obtain pseudo labels with class activation mapping (CAM), which concentrates only on the most discriminative parts. More recently, transformer-based methods use attention maps from the multi-headed self-attention (MHSA) module to predict pseudo labels, which usually contain obvious background noise and incoherent object areas. To solve these problems, we use the Conformer as our backbone, a parallel network based on a CNN and a Transformer. The two branches generate and refine pseudo labels independently, and can effectively combine the advantages of CNN and Transformer. However, the information exchange between the parallel branches is not tight enough, so the pseudo labels can lack detail and background noise remains. To alleviate this problem, we propose the enhancing convolution CAM (ECCAM) model, which has three improved modules based on enhancing convolution: a deeper stem (DStem), a convolutional feed-forward network (CFFN), and a feature coupling unit with convolution (FCUConv). ECCAM gives the Conformer tighter interaction between the CNN and Transformer branches. Experiments verify that the proposed modules help the network perceive more local information from images, making the final segmentation results more refined. Compared with similar architectures, our modules greatly improve semantic segmentation performance and achieve 70.2% mean intersection over union (mIoU) on the PASCAL VOC 2012 dataset.
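The headline number above is a mean intersection over union. A minimal sketch of the metric on flattened label maps follows. It assumes per-pixel integer class indices and skips classes absent from both maps. Benchmark evaluation usually accumulates intersections and unions over the whole dataset before averaging, so this per-image version is a simplification.

```python
def mean_iou(predicted, target, num_classes):
    """Mean intersection over union for two flat lists of class indices."""
    if len(predicted) != len(target):
        raise ValueError("label maps must have the same number of pixels")
    ious = []
    for cls in range(num_classes):
        intersection = sum(
            1 for p, t in zip(predicted, target) if p == cls and t == cls
        )
        union = sum(
            1 for p, t in zip(predicted, target) if p == cls or t == cls
        )
        if union:  # skip classes that appear in neither map
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class in range(num_classes) is present")
    return sum(ious) / len(ious)


# Hypothetical 4-pixel example with classes 0 (background) and 1 (object).
toy_prediction = [0, 0, 1, 1]
toy_target = [0, 1, 1, 1]
toy_miou = mean_iou(toy_prediction, toy_target, num_classes=2)
```

Here class 0 scores 1/2 and class 1 scores 2/3, so a single mislabelled boundary pixel lowers both class scores at once. This is why coarse pseudo labels are penalised heavily by this metric.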
Funding: Funded by the Ministry of Science and Higher Education of the Russian Federation, State assignments for research, registration No. 1024032600084-8-1.3.2. The study of grain growth and the formation of polycrystalline structure as a result of the phase transition (Section 6) was funded by the Russian Science Foundation, Project No. 24-71-00078, https://rscf.ru/en/project/24-71-00078/ (accessed on 01 December 2025). The study of the orientation dependence of the phase transition of aluminum in Section 3 was funded by the Russian Science Foundation, Project No. 24-19-00684, https://rscf.ru/en/project/24-19-00684/ (accessed on 01 December 2025).
Abstract: It is well known that aluminum and copper exhibit structural phase transformations in quasi-static and dynamic measurements, including shock wave loading. However, the dependence of these phase transformations on the crystallographic direction of shock loading has not been established over a wide range of directions. In this work, we calculated the shock Hugoniot for aluminum and copper in different crystallographic directions ([100], [110], [111], [112], [102], [114], [123], [134], [221] and [401]) of shock compression using molecular dynamics (MD) simulations. The results showed a high FCC-to-BCC transition pressure (>160 GPa for Cu and >40 GPa for Al). In copper, different characteristics of the phase transition are observed depending on the loading direction, with the [100] compression direction being the weakest. The FCC-to-BCC transition for copper is in the range of 150–220 GPa, which is consistent with the existing experimental data. Due to the high transition pressure, the BCC phase transition in copper competes with melting. In aluminum, the FCC-to-BCC transition is observed for all studied directions at pressures between 40 and 50 GPa, far beyond melting. In all considered cases we observe the coexistence of HCP and BCC phases during the FCC-to-BCC transition, which is consistent with the experimental data and atomistic calculations; this HCP phase forms in the course of accompanying plastic deformation with dislocation activity in the parent FCC phase. The onset of plasticity is also anisotropic in both metals, owing to the difference in the projections of stress on the slip plane for different orientations of the FCC crystal. MD modeling results demonstrate a strong dependence of the FCC-to-BCC transition on the crystallographic direction along which the copper crystals are loaded. However, MD simulation data can only be obtained for specific points in the stereographic direction space; therefore, for a more comprehensive understanding of the phase transition process, a feed-forward neural network was trained using MD modeling data. The trained machine learning model allowed us to construct continuous stereographic maps of phase transitions as a function of stress in the shock-compressed state of the metal. Due to the appearance and growth of multiple centers of the new phase, the FCC-to-BCC transition leads to the formation of a polycrystalline structure from the parent single crystal.
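The remark about projections of stress on the slip plane can be made concrete with the textbook Schmid factor for the twelve FCC {111}<110> slip systems. This is only a geometric sketch under an assumed uniaxial-stress state. Shock compression is closer to uniaxial strain, and the paper's own analysis is not reproduced here.

```python
import math

# FCC slip: four {111} plane normals and six <110> directions.
PLANE_NORMALS = [(1, 1, 1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]
SLIP_DIRECTIONS = [
    (1, -1, 0), (1, 0, -1), (0, 1, -1),
    (1, 1, 0), (1, 0, 1), (0, 1, 1),
]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cosine(a, b):
    return _dot(a, b) / math.sqrt(_dot(a, a) * _dot(b, b))


# A slip direction must lie in its slip plane (normal . direction == 0),
# which leaves 3 directions per plane and 12 systems in total.
SLIP_SYSTEMS = [
    (normal, direction)
    for normal in PLANE_NORMALS
    for direction in SLIP_DIRECTIONS
    if _dot(normal, direction) == 0
]


def max_schmid_factor(loading_direction):
    """Largest |cos(phi) * cos(lambda)| over the 12 FCC slip systems."""
    return max(
        abs(_cosine(loading_direction, normal) * _cosine(loading_direction, direction))
        for normal, direction in SLIP_SYSTEMS
    )


schmid_by_direction = {
    axis: max_schmid_factor(axis) for axis in [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
}
```

Under this assumption the resolved shear stress for [111] loading is lower than for [100] or [110], which gives a purely geometric reason to expect an orientation-dependent onset of plasticity.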
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2023-00249743).
Abstract: This study proposes a novel forecasting framework that simultaneously captures the strong periodicity and irregular meteorological fluctuations inherent in solar radiation time series. Existing approaches typically define inter-regional correlations using either simple correlation coefficients or distance-based measures when applying spatio-temporal graph neural networks (STGNNs). However, such definitions are prone to generating spurious correlations due to the dominance of periodic structures. To address this limitation, we adopt the Elastic-Band Transform (EBT) to decompose solar radiation into periodic and amplitude-modulated components, which are then modeled independently with separate graph neural networks. The periodic component, characterized by strong nationwide correlations, is learned with a relatively simple architecture, whereas the amplitude-modulated component is modeled with more complex STGNNs that capture climatological similarities between regions. The predictions from the two components are subsequently recombined to yield final forecasts that integrate both periodic patterns and aperiodic variability. The proposed framework is validated with multiple STGNN architectures, and experimental results demonstrate improved predictive accuracy and interpretability compared to conventional methods.
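The claim that a dominant periodic structure produces spurious inter-regional correlations can be illustrated with synthetic data. The sketch below does not implement the Elastic-Band Transform. As a stand-in it removes a per-phase mean, and the two "sites" share nothing except a common daily cycle. The period, noise level, and seed are arbitrary assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def remove_periodic_mean(series, period):
    """Subtract the average value observed at each phase of the cycle."""
    phase_means = [
        sum(series[phase::period]) / len(series[phase::period])
        for phase in range(period)
    ]
    return [value - phase_means[i % period] for i, value in enumerate(series)]


def make_site(rng, length, period, noise_sd):
    """Shared sinusoidal cycle plus site-specific independent noise."""
    return [
        math.sin(2 * math.pi * i / period) + rng.gauss(0.0, noise_sd)
        for i in range(length)
    ]


rng = random.Random(0)
PERIOD, LENGTH = 24, 24 * 60  # hourly samples over 60 synthetic days
site_a = make_site(rng, LENGTH, PERIOD, noise_sd=0.3)
site_b = make_site(rng, LENGTH, PERIOD, noise_sd=0.3)

raw_corr = pearson(site_a, site_b)
residual_corr = pearson(
    remove_periodic_mean(site_a, PERIOD),
    remove_periodic_mean(site_b, PERIOD),
)
```

The raw correlation is high even though the sites' fluctuations are independent by construction, and it collapses toward zero once the cycle is removed. A graph built from the raw coefficient would therefore link every pair of sites, which is the failure mode the abstract describes.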