Funding: Funded by the National Natural Science Foundation of China, grant numbers 52374156 and 62476005.
Abstract: Images taken in dim environments frequently exhibit insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement. Current approaches, while effective at global illumination modeling, often struggle to suppress noise and preserve structural details at the same time, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels makes accurate enhancement harder. In response to these difficulties, we introduce a single-stage framework, M2ATNet, built on multi-scale multi-attention and the Transformer architecture. First, to address texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance structural and texture modeling. Second, to solve the misalignment between the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions into a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluation in terms of numerical metrics and visual quality demonstrates that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Funding: Financially supported by the National Natural Science Foundation of China (No. 52374259), the Open Fund of the State Key Laboratory of Mineral Processing Science and Technology, China (No. BGRIMM-KJSKL-2023-11), and the Major Science and Technology Projects in Yunnan Province, China (No. 202302AF080004).
Abstract: Chrysocolla is difficult to recover by sulfidation flotation, a difficulty closely related to the mineral surface composition. In this study, the effects of fluoride roasting on the surface composition of chrysocolla were investigated, its impact on sulfidation flotation was explored, and the mechanisms involved in both fluoride roasting and sulfidation flotation were discussed. With CaF₂ as the roasting reagent, Na₂S·9H₂O as the sulfidation reagent, and sodium butyl xanthate (NaBX) as the collector, the flotation experiments showed that fluoride roasting improved the floatability of chrysocolla, and the recovery rate increased from 16.87% to 82.74%. X-ray diffraction analysis revealed that after fluoride roasting, nearly all the Cu on the chrysocolla surface was exposed in the form of CuO, which could provide a basis for subsequent sulfidation flotation. Microscopy and elemental analyses revealed large quantities of "pagoda-like" grains on the sulfidized surface of the fluoride-roasted chrysocolla, indicating highly crystalline particles of copper sulfide. This suggests that sulfide formation on the chrysocolla surface was more pronounced. X-ray photoelectron spectroscopy revealed that fluoride roasting increased the relative contents of sulfur and copper on the surface and that both the Cu⁺ and polysulfide fractions on the mineral surface increased. This enhances the effect of sulfidation, which is conducive to flotation recovery. Therefore, fluoride roasting improved copper species transformation and sulfidation on the surface of chrysocolla, promoted the adsorption of collectors, and improved the recovery of chrysocolla by sulfidation flotation.
Funding: Supported in part by the National Natural Science Foundation of China [Grant number 62471075] and the Major Science and Technology Project Grant of the Chongqing Municipal Education Commission [Grant number KJZD-M202301901].
Abstract: Low-light image enhancement aims to improve the visibility of severely degraded images captured under insufficient illumination, alleviating the adverse effects of illumination degradation on image quality. Traditional Retinex-based approaches, inspired by human visual perception of brightness and color, decompose an image into illumination and reflectance components to restore fine details. However, their limited capacity for handling noise and complex lighting conditions often leads to distortions and artifacts in the enhanced results, particularly under extreme low-light scenarios. Although deep learning methods built upon Retinex theory have recently advanced the field, most still suffer from insufficient interpretability and sub-optimal enhancement performance. This paper presents RetinexWT, a novel framework that tightly integrates classical Retinex theory with modern deep learning. Following Retinex principles, RetinexWT employs wavelet transforms to estimate illumination maps for brightness adjustment. A detail-recovery module that combines Vision Transformer (ViT) and wavelet transforms is then introduced to guide the restoration of lost details, thereby improving overall image quality. Within the framework, wavelet decomposition splits input features into high-frequency and low-frequency components, enabling scale-specific processing of global illumination/color cues and fine textures. Furthermore, a gating mechanism selectively fuses down-sampled and up-sampled features, while an attention-based fusion strategy enhances model interpretability. Extensive experiments on the LOL dataset demonstrate that RetinexWT surpasses existing Retinex-oriented deep learning methods, achieving an average Peak Signal-to-Noise Ratio (PSNR) improvement of 0.22 dB over the current state of the art (SOTA), thereby confirming its superiority in low-light image enhancement. Code is available at https://github.com/CHEN-hJ516/RetinexWT (accessed on 14 October 2025).
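The split into low-frequency and high-frequency components described in the abstract above can be illustrated with a one-level 1D Haar transform. This is only a minimal sketch: the abstract does not say which wavelet RetinexWT uses, so the Haar basis, the function names `haar_step` and `haar_inverse`, and the sample row of pixel values are all assumptions made for illustration.

```python
import math


def haar_step(signal):
    """One level of the orthonormal 1D Haar transform.

    Returns (low, high): pairwise scaled sums (smooth, illumination-like
    content) and pairwise scaled differences (edges and fine texture).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Rebuild the original signal from its low and high bands."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


# Hypothetical row of pixel intensities with one sharp spike (50.0).
row = [10.0, 10.0, 12.0, 12.0, 50.0, 10.0, 11.0, 11.0]
low, high = haar_step(row)
restored = haar_inverse(low, high)
```

In this toy row the high band is zero wherever neighboring pixels agree and non-zero only at the spike, which is why the two bands can be processed separately and then recombined without loss.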
Funding: Project supported by the National Key R&D Program of China (2022YFC2905800), the National Natural Science Foundation of China (52174242), and the National Youth Talent Support Program (QNBJ-2023-03).
Abstract: The Bayan Obo rare earth mine, the largest light rare earth resource worldwide, primarily extracts rare earth elements (REEs) from mixed RE concentrates containing bastnaesite and monazite. However, the concentrated sulfuric acid roasting process used there has damaged the environment. This paper therefore adopts selective mineral phase transformation (MPT) followed by enhanced micro-flotation. By determining the optimal MPT conditions, the flotation recovery of bastnaesite-roasted products with the collector phthalic acid (PA) is improved, and enhanced separation of bastnaesite from monazite is realized. The results show that as roasting temperature and time increase, bastnaesite decomposes to CeOF while monazite does not change significantly. Subsequent micro-flotation shows a gradual decline in the PA consumption of bastnaesite-roasted products, while the flotation recovery of monazite-roasted products remains poor. The artificial mixed ore experiments yield a CeOF foam product with a content of 94.14% and a recovery of 85.80%, and a monazite tank product with a content of 73.53% and a recovery of 87.87%. Compared with the pre-roasting ore, the surface and interior of bastnaesite-roasted products develop numerous cracks and pores, whereas no obvious structural damage is observed in monazite-roasted particles. As the roasting temperature increases, the mineral particles undergo recrystallization or closure, reducing the specific surface area of bastnaesite-roasted products and enhancing hydrophobicity, which leads to diminished PA consumption. Fourier transform infrared and other flotation-related tests show that PA is chemisorbed on the surface of CeOF. The MPT conditions optimized in this study provide a reference for further advancing the efficient separation of bastnaesite and monazite.
Funding: Supported by the National Defense Basic Scientific Research Program of China.
Abstract: Amphibious vehicles are more prone to attitude instability than ships, making it crucial to develop effective methods for monitoring instability risks. However, large inclination events, which can lead to instability, occur infrequently in both experimental and operational data. This infrequency causes such events to be overlooked by existing prediction models, which lack the precision to accurately predict inclination attitudes in amphibious vehicles. To address this gap in predicting attitudes near extreme inclination points, this study introduces a novel loss function, termed generalized extreme value loss. Subsequently, a deep learning model for improved waterborne attitude prediction, termed iInformer, was developed using a Transformer-based approach. During the embedding phase, a text prototype is constructed from the vehicle's operation log data to help the model better understand the vehicle's operating environment. Data segmentation techniques are used to highlight local data variation features. Furthermore, to mitigate the poor convergence and slow training caused by the extreme value loss function, a teacher forcing mechanism is integrated into the model, enhancing its convergence. Experimental results validate the effectiveness of the proposed method, demonstrating its ability to handle data imbalance. Specifically, the model achieves over a 60% improvement in root mean square error under extreme value conditions, with significant improvements observed across additional metrics.
Funding: Supported by State Grid Shanghai Economic Research Institute under Grant No. SGTYHT/23-JS-004.
Abstract: Intelligent power grid engineering is challenging because of its complex structural hierarchy, with deeply nested associative relations between entities such as equipment, specifications, and business processes. Meanwhile, limited by fragmented data and the loss of contextual information, generated reports are prone to problems such as content redundancy and omission of critical information, failing to meet the demands of efficient decision-making and accurate management in modern power systems. To address these issues, this paper proposes a knowledge graph (KG)-enhanced framework to automatically generate electric power engineering reports. In the KG construction phase, a feature-fused entity recognition model named BERT-BiLSTM-CRF is adopted to improve the accuracy of entity recognition in scenarios involving professional power engineering terminology, thereby solving the problem of ambiguous entity boundaries in traditional models; then a BERT-attention relation extraction model is proposed to extract complex hierarchical and implicit relations in power grid data more completely. In the report generation phase, an improved Transformer architecture is adopted to transform structured knowledge into natural language reports that comply with engineering specifications, addressing the semantic inconsistency caused by the loss of structural information in existing models. Validation on real-world projects shows that the proposed framework significantly outperforms existing baseline models in entity recognition, confirming its superiority and applicability in practical engineering.
Abstract: Recently, a multitude of techniques that fuse deep learning with Retinex theory have been applied to low-light image enhancement, yielding remarkable outcomes. However, because imaging scenarios are intricate, with fluctuating noise levels and unpredictable environmental elements, these techniques do not fully resolve the associated challenges. We introduce an innovative strategy that builds upon Retinex theory and integrates a novel deep network architecture, merging the Convolutional Block Attention Module (CBAM) with the Transformer. Our model is capable of detecting more prominent features across both channel and spatial domains. We have conducted extensive experiments across several datasets, namely LOLv1, LOLv2-real, and LOLv2-sync. The results show that our approach surpasses other methods when evaluated on critical metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Moreover, we have visually assessed images enhanced by various techniques and used perceptual metrics such as LPIPS for comparison, and the experimental data clearly demonstrate that our approach also excels visually over other methods.
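Several of the abstracts in this listing report results as PSNR. As a reminder of what that number measures, here is a minimal sketch of the standard definition, PSNR = 10·log10(MAX² / MSE). The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions chosen for illustration; none of them come from the papers above.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images.

    Images are given as flat lists of pixel intensities; max_value is the
    largest possible intensity (255 for 8-bit data, assumed here).
    """
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no error at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 patch where every pixel is off by exactly one level.
reference = [100.0, 120.0, 140.0, 160.0]
enhanced = [101.0, 119.0, 141.0, 159.0]
score = psnr(reference, enhanced)
```

Because the scale is logarithmic, a fixed gain in dB corresponds to a fixed ratio of mean squared errors, so small differences in dB between methods still reflect a measurable change in reconstruction error.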
Funding: Supported by the Key Laboratory of Forensic Science and Technology at College of Sichuan Province (2023YB04).
Abstract: Enhancing low-light images with color distortion and uneven multi-light source distribution presents challenges. Most advanced methods for low-light image enhancement are based on the Retinex model using deep learning. Retinexformer introduces channel self-attention mechanisms in the IG-MSA. However, it fails to effectively capture long-range spatial dependencies, leaving room for improvement. Based on the Retinexformer deep learning framework, we designed the Retinexformer+ network. The "+" signifies our advancements in extracting long-range spatial dependencies. We introduced multi-scale dilated convolutions in illumination estimation to expand the receptive field. These convolutions effectively capture the weakening semantic dependency between pixels as distance increases. In illumination restoration, we used Unet++ with multi-level skip connections to better integrate semantic information at different scales. The designed Illumination Fusion Dual Self-Attention (IF-DSA) module embeds multi-scale dilated convolutions to achieve spatial self-attention. This module captures long-range spatial semantic relationships within acceptable computational complexity. Experimental results on the Low-Light (LOL) dataset show that Retinexformer+ outperforms other State-Of-The-Art (SOTA) methods in both quantitative and qualitative evaluations, with the computational complexity increased to an acceptable 51.63 GFLOPs. On the LOL_v1 dataset, Retinexformer+ shows an increase of 1.15 in Peak Signal-to-Noise Ratio (PSNR) and a decrease of 0.39 in Root Mean Square Error (RMSE). On the LOL_v2_real dataset, the PSNR increases by 0.42 and the RMSE decreases by 0.18. Experimental results on the ExDark dataset show that Retinexformer+ can effectively enhance real-scene images and maintain their semantic information.
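The claim that dilated convolutions expand the receptive field can be checked with simple arithmetic. A kernel of size k with dilation d covers k + (k - 1)(d - 1) input positions, and a stack of stride-1 layers has a receptive field of 1 plus the sum of (k - 1)·d over its layers. The sketch below assumes 3-tap kernels with dilation rates 1, 2, and 4 purely for illustration; the abstract does not state the kernel sizes or rates used in Retinexformer+, nor whether its branches are stacked or run in parallel.

```python
def effective_kernel(kernel_size, dilation):
    """Span of input positions covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 layers applied one after another.

    layers is a list of (kernel_size, dilation) pairs.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += (kernel_size - 1) * dilation
    return field


# Hypothetical configuration: three 3-tap kernels with growing dilation.
rates = [1, 2, 4]
spans = [effective_kernel(3, d) for d in rates]
stacked = stacked_receptive_field([(3, d) for d in rates])
plain = stacked_receptive_field([(3, 1)] * 3)
```

Under these assumed rates the stacked dilated layers see roughly twice as many input positions as three ordinary 3-tap layers, with the same number of weights per layer.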
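Abstract: Existing deep learning algorithms for mural restoration suffer from insufficient global semantic consistency constraints and inadequate local feature extraction, so restored murals tend to show boundary artifacts and blurred details. To address these problems, a mural restoration method enhanced by a bidirectional autoregressive Transformer and fast Fourier convolution is proposed. First, a global semantic feature restoration module based on the Transformer structure is designed; using a bidirectional autoregressive mechanism and masked language modeling (MLM), an improved multi-head attention module for global semantic mural restoration is proposed to strengthen the restoration of global semantic features. Then, a global semantic enhancement module composed of gated convolutions and residual modules is constructed to strengthen the consistency constraint on global semantic features. Finally, a local detail restoration module is designed, which uses large kernel attention (LKA) and fast Fourier convolution to improve the capture of detail features while reducing the loss of local detail information, improving the consistency between the local and overall features of the restored mural. Digital restoration experiments on Dunhuang murals show that the proposed algorithm achieves better restoration performance, with all objective evaluation metrics superior to those of the compared algorithms.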