The use of deep generative models (DGMs) such as variational autoencoders, autoregressive models, flow-based models, energy-based models, generative adversarial networks, and diffusion models has been advantageous in various disciplines due to their strong data-generation capabilities. Using DGMs has become one of the most popular research topics in artificial intelligence in recent years. At the same time, research and development in civil structural health monitoring (SHM) has progressed rapidly owing to the increasing use of machine learning techniques, and some DGMs have recently been applied in the civil SHM field. This short review aims to help researchers in civil SHM understand the fundamentals of DGMs and thereby initiate their use in current and future engineering applications. On this basis, the study briefly introduces the concept and mechanism of each DGM in a comparative fashion. While preparing this review, it was observed that some DGMs have not yet been fully exploited in the SHM area. Accordingly, representative civil SHM studies that use DGMs are briefly surveyed. The study closes with a short comparative discussion of DGMs, their link to SHM, and future research directions.
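To make the mechanism comparison concrete for readers new to DGMs, the closed-form forward (noising) process that diffusion models learn to invert can be sketched in a few lines of NumPy. This is a generic illustration under a standard linear noise schedule, not code from any of the reviewed studies; the toy sine "signal" stands in for, e.g., an SHM sensor reading.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form, DDPM-style.

    x0    : clean signal (here a toy 1-D waveform)
    t     : integer timestep index
    betas : per-step noise schedule
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]           # cumulative product \bar{alpha}_t
    eps = rng.standard_normal(x0.shape)         # Gaussian noise
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = np.sin(np.linspace(0, 2 * np.pi, 64))      # toy 1-D "signal"
betas = np.linspace(1e-4, 0.02, 1000)           # standard linear schedule
xt, eps = forward_diffuse(x0, 999, betas, rng)
# At the final step alpha_bar is nearly 0, so x_t is almost pure noise;
# the generative model is trained to reverse this corruption step by step.
```

A reverse (denoising) network would then be trained to predict `eps` from `xt` and `t`, which is the learning problem shared by all diffusion-based DGMs.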
Two-dimensional (2D) materials based on boron and carbon have attracted wide attention due to their unique properties. BC compounds offer rich active sites and diverse chemical coordination, showing great potential in optoelectronic applications. However, owing to the limitations of computational and experimental conditions, predicting new 2D BC monolayer materials remains challenging. In this work, we utilized the Crystal Diffusion Variational Autoencoder (CDVAE) and the pre-trained Materials Graph Neural Network with 3-Body Interactions (M3GNet) model to generate novel, stable BCP materials. Each crystal structure was treated as a high-dimensional vector: the encoder extracted lattice information and element coordinates, mapping the high-dimensional data into a low-dimensional latent space, and the decoder reconstructed the latent representation back into the original data space. Additionally, our attribute-predictor network combined dilated convolutions with residual connections, effectively increasing the model's receptive field and learning capacity while keeping the parameter count and computational complexity relatively low. By progressively increasing the dilation rate, the model can capture features at different scales. We trained the diffusion model on a DFT dataset of about 1600 BCP monolayer materials and used the pre-trained M3GNet model to screen the best candidate structures. Finally, DFT calculations confirmed the stability of the candidate structures. The results show that combining a generative deep learning model with an attribute-prediction model can accelerate the discovery of new 2D materials and provides an effective route to their inverse design.
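The receptive-field growth from progressively larger dilation rates can be illustrated with a minimal, framework-free sketch of 1-D dilated convolution. This is a generic illustration of the mechanism, not the authors' attribute-predictor network; the 3-tap all-ones kernel is a toy choice.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """'Valid' 1-D convolution with a dilated kernel."""
    k = len(w)
    span = (k - 1) * dilation + 1               # receptive field of this layer
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * dilation] for j in range(k))
    return out

x = np.arange(32, dtype=float)
w = np.array([1.0, 1.0, 1.0])                   # toy 3-tap kernel
y = x
for d in (1, 2, 4):                             # progressively larger dilation
    y = dilated_conv1d(y, w, d)
# Three stacked 3-tap layers with dilations 1, 2, 4 see
# 1 + 2*(1 + 2 + 4) = 15 input samples per output, versus only 7
# for the same stack without dilation -- at identical parameter count.
```

This is why increasing the dilation rate layer by layer lets a compact network capture features at different scales.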
Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure of a scaffold to improve both its biological activities and its absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. Deep molecular generative models that preserve the scaffold while generating drug-like molecules can accelerate this optimization process. Diffusion-based molecular generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, existing models lack direct interatomic-constraint features and struggle to capture long-range dependencies in macromolecules, which makes scaffold-based structural modification difficult and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. A dual strong- and weak-atomic-interaction-based long-range-dependency-capturing equivariant encoder (dual-SWLEE) is introduced to encode both bonding and non-bonding information. Additionally, a gated multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. Experimental results show that 3D-EDiffMG generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and for accelerating drug discovery.
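The gMLP block mentioned above builds on a spatial gating unit. The sketch below is a generic NumPy rendering of that unit as described in the original gMLP work, not the authors' exact implementation, and it omits the tiny-attention branch; the token dimension stands in for, e.g., atoms in a molecule.

```python
import numpy as np

def spatial_gating_unit(z, W_s, b_s, eps=1e-6):
    """Core of a gMLP block: split the channels, layer-normalize one half,
    mix it along the *token* axis with a learned projection, and use the
    result to gate the other half elementwise."""
    u, v = np.split(z, 2, axis=-1)                  # (n_tokens, d/2) each
    v = (v - v.mean(-1, keepdims=True)) / (v.std(-1, keepdims=True) + eps)
    v = W_s @ v + b_s                               # token (spatial) mixing
    return u * v                                    # elementwise gating

rng = np.random.default_rng(1)
n_tokens, d = 8, 16                                 # e.g., atoms x features
z = rng.standard_normal((n_tokens, d))
W_s = np.zeros((n_tokens, n_tokens))                # near-zero init ...
b_s = np.ones((n_tokens, d // 2))                   # ... with unit bias
out = spatial_gating_unit(z, W_s, b_s)
# With W_s = 0 and b_s = 1 the unit passes u through unchanged, the
# near-identity initialization recommended for stable training.
```

Because `W_s` mixes information across all tokens in one step, such blocks can model the long-range dependencies that plain local encoders miss.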
Deep generative models allow the synthesis of realistic human faces from freehand sketches or semantic maps. However, although sketches and semantic maps are flexible, they provide too much freedom for manipulation and are therefore difficult for novice users to control. In this study, we present DeepFaceReshaping, a novel landmark-based deep generative framework for interactive face reshaping. To edit the shape of a face realistically by manipulating a small number of face landmarks, we employ neural shape deformation to reshape individual face components. Furthermore, we propose a novel Transformer-based partial refinement network to synthesize the reshaped face components conditioned on the edited landmarks, and fuse the components to generate the entire face using a local-to-global approach. In this manner, we limit possible reshaping effects to a feasible component-based face space. Thus, our interface is intuitive even for novice users, as confirmed by a user study. Our experiments demonstrate that our method outperforms traditional warping-based approaches and recent deep generative techniques.
In many applications of computer graphics, art, and design, it is desirable for a user to provide intuitive non-image input, such as text, sketches, strokes, graphs, or layouts, and have a computer system automatically generate photo-realistic images according to that input. While such automatic image content generation has classically followed a framework of image retrieval and composition, recent advances in deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and flow-based methods have enabled more powerful and versatile image generation approaches. This paper reviews recent work on image synthesis from intuitive user input, covering advances in input versatility, image generation methodology, benchmark datasets, and evaluation metrics. It motivates new perspectives on input representation and interactivity, cross-fertilization between major image generation paradigms, and the evaluation and comparison of generation methods.
The vast potential of medical big data to enhance healthcare outcomes remains underutilized due to privacy concerns, which restrict cross-center data sharing and the construction of diverse, large-scale datasets. To address this challenge, we developed a deep generative model for synthesizing medical data, focusing on breast ultrasound (US) image synthesis, to overcome data-sharing barriers. Specifically, we introduce CoLDiT, a conditional latent diffusion model with a transformer backbone, to generate US images of breast lesions across various Breast Imaging Reporting and Data System (BI-RADS) categories. Using a training dataset of 9,705 US images from 5,243 patients across 202 hospitals with diverse US systems, CoLDiT generated breast US images without duplicating private information, as confirmed through nearest-neighbor analysis. Blinded reader studies further validated the realism of these images, with areas under the receiver operating characteristic curve (AUC) ranging from 0.53 to 0.77. Additionally, synthetic breast US images effectively augmented the training dataset for BI-RADS classification, achieving performance comparable to that of an equal-sized training set comprising solely real images (P = 0.81 for AUC). Our findings suggest that synthetic data, such as CoLDiT-generated images, offer a viable, privacy-preserving solution for secure medical data sharing and for advancing the use of medical big data.
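A nearest-neighbor analysis of the kind used above to rule out memorization can be sketched generically: for each synthetic sample, find its closest training sample in some feature space and flag suspiciously small distances. The embeddings, threshold, and planted near-duplicate below are hypothetical stand-ins, not the paper's pipeline.

```python
import numpy as np

def nearest_neighbor_check(synthetic, training, threshold):
    """For each synthetic sample, compute the L2 distance to its nearest
    training sample and flag it if that distance falls below `threshold`,
    i.e., if the generator may have reproduced a training image."""
    d2 = ((synthetic[:, None, :] - training[None, :, :]) ** 2).sum(-1)
    nn_dist = np.sqrt(d2.min(axis=1))           # distance to nearest neighbor
    return nn_dist, nn_dist < threshold

rng = np.random.default_rng(0)
training = rng.standard_normal((100, 32))       # stand-in image embeddings
synthetic = rng.standard_normal((10, 32))
synthetic[0] = training[42] + 1e-4              # plant one near-duplicate
dist, flagged = nearest_neighbor_check(synthetic, training, threshold=0.5)
# Only the planted near-duplicate is flagged; genuinely novel samples
# sit far from every training point in this feature space.
```

In practice the distances would be computed on perceptual or encoder features rather than raw pixels, and the threshold calibrated against distances between real training images.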
In graphic design, a layout results from the interaction between the design elements in the foreground and the background images. However, prevalent research focuses on enhancing the quality of layout generation algorithms, overlooking the interaction and controllability that are essential for designers applying these methods in real-world situations. This paper proposes a user-centered layout design system, Iris, which provides designers with an interactive environment to expedite their workflow, encompassing user-constraint specification, layout generation, custom editing, and final rendering. To satisfy the multiple constraints specified by designers, we introduce a novel generation model, multi-constraint LayoutVQ-VAE, for advancing layout generation under intra- and inter-domain constraints. Qualitative and quantitative experiments indicate that our model outperforms or is comparable to prevalent state-of-the-art models in multiple respects. User studies on Iris further demonstrate that the system significantly enhances design efficiency while achieving human-like layout designs.
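The defining step of any VQ-VAE, including layout-oriented variants like the one above, is the quantization bottleneck: each continuous encoder output is snapped to its nearest entry in a learned codebook, yielding discrete tokens. The sketch below illustrates that generic step only; the codebook size and dimensions are arbitrary, and it is not the paper's model.

```python
import numpy as np

def vector_quantize(z_e, codebook):
    """VQ-VAE bottleneck: replace each encoder output z_e[i] with its
    nearest codebook entry, returning discrete indices and the
    quantized vectors fed to the decoder."""
    d2 = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)                     # discrete "tokens"
    return idx, codebook[idx]

rng = np.random.default_rng(0)
codebook = rng.standard_normal((64, 8))         # 64 learnable code vectors
z_e = codebook[[3, 3, 17]] + 0.01 * rng.standard_normal((3, 8))
idx, z_q = vector_quantize(z_e, codebook)
# -> idx is [3, 3, 17]: near-identical encoder outputs collapse to the
#    same discrete code, which is what makes the latent space controllable.
```

Constraints (intra- or inter-domain) can then be imposed on these discrete codes, which is far easier than constraining a continuous latent space.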
Generating realistic building layouts for automatic building design has been studied in both the computer vision and architectural domains. Traditional approaches in the latter, based on optimization techniques or heuristic design guidelines, can synthesize desirable layouts but usually require post-processing and involve human interaction in the design pipeline, making them costly and time-consuming. The advent of deep generative models has significantly improved the fidelity and diversity of generated architectural layouts, reducing designers' workload and making the process far more efficient. This paper presents a comprehensive review of three major research topics in architectural layout design and generation: floorplan layout generation, scene layout synthesis, and the generation of other formats of building layouts. For each topic, we overview the leading paradigms, categorized either by research domain (architecture or machine learning) or by user input conditions and constraints. We then introduce commonly adopted benchmark datasets used to verify the effectiveness of these methods, along with the corresponding evaluation metrics. Finally, we identify well-solved problems and the limitations of existing approaches, and propose promising directions for future research. This survey has an associated project that maintains these resources at https://github.com/jcliu0428/awesome-building-layout-generation.
Funding: supported by National Aeronautics and Space Administration (NASA) Award No. 80NSSC20K0326 for the research activities and particularly for this paper.
Funding: supported by the National Natural Science Foundation of China (Nos. 61671362 and 62071366).
Funding: supported by the National Key R&D Program of China (Grant No. 2023YFF1205102), the National Natural Science Foundation of China (Grant Nos. 82273856, 22077143, and 21977127), and the Science Foundation of Guangzhou, China (Grant No. 2024A04J2172).
Funding: supported by grants from the Open Research Projects of Zhejiang Lab (No. 2021KE0AB06), the National Natural Science Foundation of China (Nos. 62061136007 and 62102403), the Beijing Municipal Natural Science Foundation for Distinguished Young Scholars (No. JQ21013), and the Open Project Program of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. VRLAB2022C07).
Funding: supported by the National Natural Science Foundation of China (Project Nos. 61521002 and 61772298).
Funding: supported by the National Natural Science Foundation of China (Grant No. 82071928) and the Program of Shanghai Academic/Technology Research Leader (Grant No. 23XD1401300).
Funding: supported by the Alibaba–Zhejiang University Joint Research Institute of Frontier Technologies, China, and the Zhejiang–Singapore Innovation and AI Joint Research Lab, China.