Funding: the National Natural Science Foundation of China (Grant No. 52508343); the Fundamental Research Funds for the Central Universities (Grant No. B250201004).
Abstract: Crack detection accuracy in computer vision is often constrained by limited annotated datasets. Although Generative Adversarial Networks (GANs) have been applied for data augmentation, they frequently introduce blurs and artifacts. To address this challenge, this study leverages Denoising Diffusion Probabilistic Models (DDPMs) to generate high-quality synthetic crack images, enriching the training set with diverse and structurally consistent samples that enhance crack segmentation. The proposed framework involves a two-stage pipeline: first, DDPMs are used to synthesize high-fidelity crack images that capture fine structural details; second, these generated samples are combined with real data to train segmentation networks, thereby improving accuracy and robustness in crack detection. Compared with GAN-based approaches, DDPM achieved the best fidelity, with the highest Structural Similarity Index (SSIM) (0.302) and lowest Learned Perceptual Image Patch Similarity (LPIPS) (0.461), producing artifact-free images that preserve fine crack details. To validate its effectiveness, six segmentation models were tested, among which LinkNet consistently achieved the best performance, excelling in both region-level accuracy and structural continuity. Incorporating DDPM-augmented data further enhanced segmentation outcomes, increasing F1 scores by up to 1.1% and IoU by 1.7%, while also improving boundary alignment and skeleton continuity compared with models trained on real images alone. Experiments with varying augmentation ratios showed consistent improvements, with F1 rising from 0.946 (no augmentation) to 0.957 and IoU from 0.897 to 0.913 at the highest ratio. These findings demonstrate the effectiveness of diffusion-based augmentation for complex crack detection in structural health monitoring.
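The region-level metrics quoted in this abstract, F1 and IoU, can be illustrated on a pair of binary masks. The following is a minimal sketch, not the authors' evaluation code: the function names, the toy masks, and the convention for two empty masks are all assumptions made for illustration.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives
    over two binary masks given as equally sized lists of rows."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def iou_and_f1(pred, truth):
    """Return (IoU, F1) for the positive (crack) class."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty; treating this as perfect agreement
        # is a convention assumed here, not taken from the abstract.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# Toy 3x4 masks (hypothetical data): 1 marks a crack pixel.
truth_mask = [[0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 0]]
pred_mask = [[0, 1, 1, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 0]]

iou, f1 = iou_and_f1(pred_mask, truth_mask)
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")  # prints "IoU = 0.667, F1 = 0.800"
```

For a single pair of masks the two metrics are tied by F1 = 2·IoU / (1 + IoU), which is why they tend to move together; dataset-level averages need not satisfy this identity exactly.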
Funding: co-supported by the National Natural Science Foundation of China (Nos. 61806219, 61876189 and 61703426); the Young Talent Fund of University Association for Science and Technology in Shaanxi, China (Nos. 20190108 and 20220106); the Innovation Talent Supporting Project of Shaanxi, China (No. 2020KJXX-065).
Abstract: Air target intent recognition holds significant importance in aiding commanders to assess battlefield situations and secure a competitive edge in decision-making. Progress in this domain has been hindered by challenges posed by imbalanced battlefield data and the limited robustness of traditional recognition models. Inspired by the success of diffusion models in addressing visual domain sample imbalances, this paper introduces a new approach that utilizes the Markov Transfer Field (MTF) method for time series data visualization. This visualization, when combined with the Denoising Diffusion Probabilistic Model (DDPM), effectively enhances sample data and mitigates noise within the original dataset. Additionally, a transformer-based model tailored for time series visualization and air target intent recognition is developed. Comprehensive experimental results, encompassing comparative, ablation, and denoising validations, reveal that the proposed method achieves a notable 98.86% accuracy in air target intent recognition while demonstrating exceptional robustness and generalization capabilities. This approach represents a promising avenue for advancing air target intent recognition.
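The abstract does not spell out how the MTF image is built. The sketch below assumes the construction usually called a Markov transition field: the series is quantile-binned, a first-order transition matrix between bins is estimated, and pixel (i, j) stores the transition probability from the bin of sample i to the bin of sample j. The function names, the bin count and the toy series are hypothetical, and the paper's exact variant may differ.

```python
def quantile_bins(series, n_bins):
    """Assign each sample a bin index 0..n_bins-1 by rank, so bins are
    roughly equal in size. Tied values are split by position (a simplification)."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    bins = [0] * len(series)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(series)
    return bins


def transition_matrix(bins, n_bins):
    """Row-normalised counts of transitions between consecutive samples."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for current, following in zip(bins, bins[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def markov_transition_field(series, n_bins=4):
    """Return an len(series) x len(series) image of transition probabilities."""
    bins = quantile_bins(series, n_bins)
    w = transition_matrix(bins, n_bins)
    return [[w[bi][bj] for bj in bins] for bi in bins]


# Hypothetical one-dimensional track feature (for example, normalised altitude).
series = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.6, 0.7]
mtf = markov_transition_field(series, n_bins=4)
for row in mtf:
    print(" ".join(f"{value:.2f}" for value in row))
```

The resulting square array can be treated as a single-channel image, which is what makes an image-domain generator such as a DDPM applicable to time series samples.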
Funding: supported by the National Natural Science Foundation of China (No. 12472265).
Abstract: High-Resolution (HR) data on flow fields are critical for accurately evaluating the aerodynamic performance of aircraft. However, acquiring such data through large-scale numerical simulations or wind tunnel experiments is highly resource intensive. This paper proposes a FlowViT-Diff framework that integrates a Vision Transformer (ViT) with an enhanced denoising diffusion probabilistic model for the Super-Resolution (SR) reconstruction of HR flow fields based on low-resolution inputs. It provides a quick initial prediction of the HR flow field by optimizing the ViT architecture, and incorporates this preliminary output as guidance within an enhanced diffusion model. The latter captures the Gaussian noise distribution during forward diffusion and progressively removes it during backward diffusion to generate the flow field. Experiments on various supercritical airfoils under different flow conditions show that FlowViT-Diff can robustly reconstruct the flow field across multiple levels of downsampling. It obtains more consistent global and local features than traditional SR methods, and yields a 3.6-fold increase in its training speed via transfer learning. Its flow field reconstruction accuracy is 99.7% under ultra-low downsampling. The results demonstrate that FlowViT-Diff not only exhibits effective flow field reconstruction capabilities, but also provides two reconstruction strategies, both of which show effective transferability.
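The forward diffusion step mentioned in this abstract has a well-known closed form in the standard DDPM formulation: x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, with ε drawn from a standard Gaussian and ᾱ_t the running product of (1 − β_s). The sketch below illustrates only that generic step. The linear β schedule, the number of steps and the toy data are assumptions; the enhanced model and ViT guidance described in the abstract are not reproduced.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0; returns the noisy sample and the noise used."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [0.5, -1.0, 0.25]  # hypothetical flattened flow-field values
xt, eps = forward_noise(x0, 500, alpha_bars, rng)

# If the noise is known exactly, x_0 can be recovered; a denoising
# network is trained to predict eps so that this inversion becomes possible.
a = alpha_bars[500]
recovered = [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps)]
print(recovered)
```

Because ᾱ_t shrinks towards zero as t grows, late steps are almost pure noise, and the backward process has to rebuild the field step by step.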
Abstract: Inverse design of advanced materials represents a pivotal challenge in materials science. Leveraging the latent space of Variational Autoencoders (VAEs) for material optimization has emerged as a significant advancement in the field of material inverse design. However, VAEs are inherently prone to generating blurred images, posing challenges for precise inverse design and microstructure manufacturing. While increasing the dimensionality of the VAE latent space can mitigate reconstruction blurriness to some extent, it simultaneously imposes a substantial burden on target optimization due to an excessively large search space. To address these limitations, this study adopts a Variational Autoencoder guided Conditional Diffusion Generative Model (VAE-CDGM) framework integrated with Bayesian optimization to achieve the inverse design of composite materials with targeted mechanical properties. The VAE-CDGM model synergizes the strengths of VAEs and Denoising Diffusion Probabilistic Models (DDPM), enabling the generation of high-quality, sharp images while preserving a manipulable latent space. To accommodate varying dimensional requirements of the latent space, two optimization strategies are proposed. When the latent space dimensionality is excessively high, SHapley Additive exPlanations (SHAP) sensitivity analysis is employed to identify critical latent features for optimization within a reduced subspace. Conversely, direct optimization is performed in the low-dimensional latent space of VAE-CDGM when dimensionality is modest. The results demonstrate that both strategies accurately achieve the targeted design of composite materials while circumventing the blurred reconstruction flaws of VAEs, which offers a novel pathway for the precise design of advanced materials.
Abstract: Multi-target digital material design has been challenging due to the expansive design space and the instability of traditional methods in satisfying multiple objectives. This work proposes and demonstrates a customizer based on a classifier-free, conditional denoising diffusion probability model (cDDPM) to efficiently create layouts of digital materials that meet design goals for multiple mechanical properties simultaneously. A case study has been conducted based on a micro mechanical resonator with four pre-assigned resonant frequencies. Using 29,430 samples generated via finite element analysis (FEA), the cDDPM is trained to simultaneously customize up to four vibrational modes, achieving over 95% prediction accuracy. Furthermore, the cDDPM approach also outperforms traditional conditional generative adversarial networks (cGANs) in single-target customization, reaching up to 99% prediction accuracy. As such, the proposed design framework provides a highly customizable and robust methodology for the design of complicated digital materials.
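"Classifier-free" conditioning usually means that one network is trained both with and without the condition, and the two noise predictions are blended at sampling time. The sketch below shows that generic blending rule and the training-time condition dropout. It is an illustration under stated assumptions, not the paper's implementation: the function names, the guidance scale, the drop probability and the example frequencies are all hypothetical.

```python
import random


def guided_noise(eps_cond, eps_uncond, guidance_scale):
    """Blend conditional and unconditional noise predictions.

    guidance_scale = 0 gives the unconditional prediction, 1 gives the
    conditional one, and values above 1 push further towards the condition.
    """
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


def maybe_drop_condition(condition, drop_prob, rng):
    """During training, replace the condition with None (a 'null' condition)
    with probability drop_prob so the same network also learns the
    unconditional prediction."""
    return None if rng.random() < drop_prob else condition


# Hypothetical target: four resonant frequencies in kHz.
target_frequencies = (12.0, 31.5, 58.0, 90.2)
rng = random.Random(0)
seen = [maybe_drop_condition(target_frequencies, 0.1, rng) for _ in range(1000)]
print("fraction of unconditional training steps:", seen.count(None) / len(seen))

# Stand-in predictions from the two passes of a hypothetical network.
eps_cond = [1.0, 2.0]
eps_uncond = [0.0, 1.0]
print(guided_noise(eps_cond, eps_uncond, guidance_scale=3.0))  # prints [3.0, 4.0]
```

The design choice is that no separate classifier of the mechanical properties is needed at sampling time; the strength of conditioning is controlled by a single scalar.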
Abstract: Renewable energy production and the balance between production and demand have become increasingly crucial in modern power systems, necessitating accurate forecasting. Traditional deterministic methods fail to capture the inherent uncertainties associated with intermittent renewable sources and fluctuating demand patterns. This paper proposes a novel denoising diffusion method for multivariate time series probabilistic forecasting that explicitly models the interdependencies between variables through graph modeling. Our framework employs a parallel feature extraction module that simultaneously captures temporal dynamics and spatial correlations, enabling improved forecasting accuracy. Through extensive evaluation on two real-world datasets focused on renewable energy and electricity demand, we demonstrate that our approach achieves state-of-the-art performance in probabilistic energy time series forecasting tasks. By explicitly modeling variable interdependencies and incorporating temporal information, our method provides reliable probabilistic forecasts, crucial for effective decision-making and resource allocation in the energy sector. Extensive experiments validate that our proposed method reduces the Continuous Ranked Probability Score (CRPS) by 2.1%-70.9%, Mean Absolute Error (MAE) by 4.4%-52.2%, and Root Mean Squared Error (RMSE) by 7.9%-53.4% over existing methods on two real-world datasets.
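CRPS is the headline probabilistic metric in this abstract. When a forecast is available only as samples, as is typical for diffusion models, a common estimator is CRPS ≈ E|X − y| − ½·E|X − X′|, where X and X′ are independent forecast draws and y is the observation. The sketch below implements that estimator for a single scalar target; the function name and the toy numbers are assumptions, and the paper may use a different estimator or aggregation.

```python
def crps_from_samples(samples, observation):
    """Sample-based CRPS estimate for one scalar observation.

    The first term rewards forecasts close to the observation; the second
    term credits forecast spread, so overconfident but wrong ensembles
    score worse than well-calibrated ones.
    """
    n = len(samples)
    accuracy_term = sum(abs(s - observation) for s in samples) / n
    spread_term = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return accuracy_term - 0.5 * spread_term


# With a single sample the score reduces to the absolute error.
print(crps_from_samples([2.0], 3.0))       # prints 1.0

# Two samples straddling the observation (hypothetical load forecast).
print(crps_from_samples([1.0, 3.0], 2.0))  # prints 0.5
```

Lower is better, and for a point forecast CRPS coincides with the absolute error, which makes it directly comparable with MAE-style numbers.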
Abstract: Diffusion models, a family of generative models based on deep learning, have become increasingly prominent in cutting-edge machine learning research. With distinguished performance in generating samples that resemble the observed data, diffusion models are now widely used in image, video, and text synthesis. In recent years, the concept of diffusion has been extended to time-series applications, and many powerful models have been developed. Considering the lack of a methodical summary and discussion of these models, we provide this survey as an elementary resource for new researchers in this area and as inspiration for future research. For better understanding, we include an introduction to the basics of diffusion models. Beyond this, we primarily focus on diffusion-based methods for time-series forecasting, imputation, and generation, and present them separately in three individual sections. We also compare different methods for the same application and highlight their connections where applicable. Finally, we conclude with the common limitations of diffusion-based methods and highlight potential future research directions.
Funding: supported in part by National Natural Science Foundation of China (U21B2023); DEGP Innovation Team (2022KCXTD025); Shenzhen Science and Technology Program (KQTD20210811090044003, RCJC20200714114435012); Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ).
Abstract: Exemplar-based image translation involves converting semantic masks into photorealistic images that adopt the style of a given exemplar. However, most existing GAN-based translation methods fail to produce photorealistic results. In this study, we propose a new diffusion model-based approach for generating high-quality images that are semantically aligned with the input mask and resemble an exemplar in style. The proposed method trains a conditional denoising diffusion probabilistic model (DDPM) with a SPADE module to integrate the semantic map. We then used a novel contextual loss and auxiliary color loss to guide the optimization process, resulting in images that were visually pleasing and semantically accurate. Experiments demonstrate that our method outperforms state-of-the-art approaches in terms of both visual quality and quantitative metrics.
Funding: supported by General Project of Guangxi Science and Technology Major Project (AA19254016); Beihai City Science and Technology Planning Project (202082033); Beihai City Science and Technology Planning Project (202082023); Guangxi Graduate Student Innovation Project (YCSW2021174).
Abstract: Single-image super-resolution (SISR) typically focuses on restoring various degraded low-resolution (LR) images to a single high-resolution (HR) image. However, during SISR tasks, it is often challenging for models to simultaneously maintain high quality and rapid sampling while preserving diversity in details and texture features. This challenge can lead to issues such as model collapse, lack of rich details and texture features in the reconstructed HR images, and excessive time consumption for model sampling. To address these problems, this paper proposes a Latent Feature-oriented Diffusion Probability Model (LDDPM). First, we designed a conditional encoder capable of effectively encoding LR images, reducing the solution space for model image reconstruction and thereby improving the quality of the reconstructed images. We then employed a normalized flow and multimodal adversarial training, learning from complex multimodal distributions, to model the denoising distribution. Doing so boosts the generative modeling capabilities within a minimal number of sampling steps. Experimental comparisons of our proposed model with existing SISR methods on mainstream datasets demonstrate that our model reconstructs more realistic HR images and achieves better performance on multiple evaluation metrics, providing a fresh perspective for tackling SISR tasks.