The growing demand for wireless connectivity has made massive multiple-input multiple-output (MIMO) a cornerstone of modern communication systems. To optimize network performance and resource allocation, an efficient and robust approach is joint device activity detection and channel estimation. In this paper, we present a data-driven approach that uses score-based generative models to address the underdetermined nature of channel estimation and is well suited to the complex and dynamic environment of massive MIMO systems. Our experimental results, based on a comprehensive dataset generated through Monte-Carlo sampling, demonstrate the high precision of our channel estimation approach, with errors reduced to as low as -45 dB, and exceptional accuracy in detecting active devices.
Inverse design has long been an efficient and powerful design tool in the aircraft industry. In this paper, a novel inverse design method for supercritical airfoils is proposed based on generative models in deep learning. A Conditional Variational Autoencoder (CVAE) and an integrated generative network, CVAE-GAN, which combines the CVAE with the Wasserstein Generative Adversarial Network (WGAN), are employed as generative models. They are used to generate target wall Mach number distributions for the inverse design that match specified features, such as the locations of the suction peak, the shock, and the aft loading. Qualitative and quantitative results show that both generative models can generate diverse and realistic wall Mach number distributions satisfying the given features. The CVAE-GAN model outperforms the CVAE model and achieves better reconstruction accuracy for all the samples in the dataset. Furthermore, a deep neural network for nonlinear mapping is adopted to obtain the airfoil shape corresponding to the target wall Mach number distribution. The performance of the designed deep neural network is fully demonstrated, and a smoothness measurement is proposed to quantify small oscillations in the airfoil surface, supporting the authenticity and accuracy of the generated airfoil shapes.
Deep learning (DL) has proven to be important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, due to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitude larger than for single-step samplers. In this paper, we present a novel image denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM)++. By hijacking and regularizing the sampling process we obtain a single-step sampler, that is, NFE = 1. Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate competitive performance compared to popular supervised (including state-of-the-art diffusion-style models with NFE = 1, i.e., consistency models), unsupervised, and non-DL-based image denoising techniques, on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.
Natural products (NPs) have long been recognized as a valuable resource for drug discovery, and bringing NP-related features to virtual libraries is believed to be an effective way to increase the coverage of druggable chemical space. Here, a deep learning-based molecule generative model, a recent technique in de novo molecule design, was applied to generate virtual libraries with NP-like properties. Results demonstrated that the model was effective in generating molecules that highly resemble NPs. Moreover, the model was found to be capable of generating NP-like molecules that were also easy to synthesize, significantly increasing the practical value of the compound library.
This paper systematically reviews the latest advances in deep learning-based path planning for autonomous mobile robots, addressing the limitations of traditional methods (e.g., A*, Rapidly-exploring Random Tree (RRT)) in dynamic, high-dimensional, and unstructured environments. We comprehensively analyze five major deep learning model categories: Convolutional Neural Networks (CNNs) for spatial feature extraction, Graph Neural Networks (GNNs) for multi-agent collaboration, Recurrent Neural Networks (RNNs) for temporal modeling, Transformers for long-range dependency and complex instruction understanding, and generative models (e.g., GANs, Diffusion Models) for creative path generation. Our analysis covers technical principles, advantages, limitations, application scenarios, and development trends of these methods. The review reveals that deep learning has fundamentally transformed path planning from perception enhancement to decision substitution, from isolated agents to multi-agent collaboration, and from search-based to generative paradigms. Key findings indicate significant performance improvements: GNN-based distributed planning triples multi-robot collaboration efficiency, and generative models increase complex instruction planning success rates to 78.1%. Future directions include cross-modal integration, lightweight deployment, simulation-to-reality transfer, and verifiable safety assurance, which will be crucial for advancing next-generation intelligent mobile robot navigation systems.
Natural products (NPs) are invaluable resources for drug discovery, characterized by their intricate scaffolds and diverse bioactivities. AI drug discovery & design (AIDD) has emerged as a transformative approach for the rational structural modification of NPs. This review examines a variety of molecular generation models published since 2020, focusing on their potential applications in two primary scenarios of NP structure modification: modification when the target is identified and modification when it remains unidentified. Most of the molecular generative models discussed herein are open-source, and their applicability across different domains and their technical feasibility have been evaluated. This evaluation was accomplished by integrating a limited number of research cases and successful practices observed in the molecular optimization of synthetic compounds. Furthermore, the challenges and prospects of employing molecular generation modeling for the structural modification of NPs are discussed.
The rapid acceleration of big data and artificial intelligence has spurred the application of advanced machine learning methods to address multifaceted transportation challenges. Among these, generative models (GMs) have garnered significant attention, demonstrating great potential for advancing intelligent transportation systems. This paper provides a comprehensive investigation into the applications and potential of GMs within this domain. First, the paper systematically reviews the fundamental principles, architectures, and comparative characteristics of mainstream generative models. The primary contribution is an in-depth review of GM applications across three core areas: trajectory generation, traffic flow prediction, and autonomous driving. In trajectory generation, we examine how GMs synthesize realistic data to address data scarcity and privacy preservation. For traffic flow, the review covers GM-based approaches for prediction and critical data imputation tasks. In autonomous driving, the analysis details GM applications in sensor data restoration, perception enhancement, realistic scenario simulation, and behavior prediction. Although GMs have shown significant value, their full potential remains underexplored. Therefore, this paper identifies and discusses promising avenues for future research, including the integration of diffusion models in autonomous driving, the use of GMs for infrastructure planning, and their application in enhancing traffic safety. This paper is anticipated to serve as a comprehensive reference for researchers exploring generative models in transportation.
With the rapid development of quantum devices across various platforms [1–4], reconstructing quantum many-body states from experimentally measured data poses a crucial challenge. Straightforward quantum state tomography (QST) is only applicable to small systems [5], since the required resources, such as the number of measurements and the classical memory size, grow exponentially as the system size increases.
The rational design of catalyst structures tailored to target performance is an ambitious and profoundly impactful goal. Key challenges include achieving refined representations of the three-dimensional structure of active sites and imbuing models with robust physical interpretability. Herein, we developed a topology-based variational autoencoder framework (PGH-VAEs) to enable the interpretable inverse design of catalytic active sites. Using high-entropy alloys as a case study, we demonstrate that persistent GLMY homology, an advanced topological algebraic analysis tool, enables the quantification of three-dimensional structural sensitivity and establishes correlations with adsorption properties. The multi-channel PGH-VAEs illustrate how coordination and ligand effects shape the latent space and influence the adsorption energies. Building on the inverse design results from PGH-VAEs, strategies are proposed to optimize the composition and facet structures so as to maximize the proportion of optimal active sites. This interpretable inverse design framework can be extended to diverse systems, paving the way for AI-driven catalyst design.
Recent advances in generative models have significantly facilitated the development of personalized content creation. Given a small set of images containing a user-specific concept, personalized image generation allows the user to create images that incorporate that concept while adhering to provided text descriptions. The technologies used for personalization have evolved alongside the development of generative models, with their distinct and interrelated components. In this survey, we present a comprehensive review of generalized personalized image generation across various generative models, including traditional GANs, contemporary text-to-image diffusion models, and emerging multi-modal autoregressive (AR) models. We first define a unified framework that standardizes the personalization process across different generative models, encompassing three key components: inversion spaces, inversion methods, and personalization schemes. This unified framework offers a structured approach to dissecting and comparing personalization techniques across different generative architectures. Building upon our framework, we provide an in-depth analysis of personalization techniques within each generative model, highlighting their unique contributions and innovations. Through comparative analysis, we elucidate the current landscape of personalized image generation, identifying commonalities and distinguishing features of existing methods. Finally, we discuss open challenges in the field and propose potential directions for future research. We keep a bibliography of related works at https://github.com/csyxwei/Awesome-Personalized-Image-Generation.
The use of deep generative models (DGMs) such as variational autoencoders, autoregressive models, flow-based models, energy-based models, generative adversarial networks, and diffusion models has been advantageous in various disciplines due to their strong data generation capabilities. DGMs have become one of the most actively pursued research topics in Artificial Intelligence in recent years. Meanwhile, research and development in the civil structural health monitoring (SHM) area has also progressed rapidly owing to the increasing use of Machine Learning techniques. As such, some DGMs have lately been used in the civil SHM field as well. This short review communication aims to assist researchers in the civil SHM field in understanding the fundamentals of DGMs and, consequently, to help initiate their use for current and possible future engineering applications. On this basis, the study briefly introduces the concepts and mechanisms of different DGMs in a comparative fashion. While preparing this short review communication, it was observed that some DGMs had not been utilized or exploited fully in the SHM area. Accordingly, some representative studies in the civil SHM field that use DGMs are briefly overviewed. The study also presents a short comparative discussion on DGMs, their link to SHM, and research directions.
AlphaPanda (AlphaFold2 [1] inspired protein-specific antibody design in a diffusional manner) is an advanced algorithm for designing the complementarity-determining regions (CDRs) of an antibody targeting a specific epitope, combining transformer [2] models, 3DCNN [3], and diffusion [4] generative models.
Over the past century, advancements in chemistry have significantly propelled human innovation, enhancing both industrial and consumer products. However, this rapid progression has resulted in chemical pollution increasingly surpassing planetary boundaries, as production and release rates have outpaced our monitoring capabilities. To catalyze more impactful efforts, this study transitions from traditional chemical assessment to inverse chemical design, introducing a generative graph latent diffusion model aimed at discovering safer alternatives. In a case study on the design of green solvents for cyclohexane/benzene extractive distillation, we constructed a design database encompassing functional, environmental hazard, and process constraints. Virtual screening of the previous design dataset revealed distinct trade-off trends between these design requirements. Based on the screening outcomes, an unconstrained generative model was developed, which covered a broader chemical space and demonstrated superior capabilities for structural interpolation and extrapolation. To further optimize molecular generation towards desired properties, a multi-objective latent diffusion method was applied, yielding 19 candidate molecules. Of these, 7 were identified in PubChem as the most viable green solvent candidates, while the remaining 12 are potential novel candidates. Overall, this study designed green solvent candidates for safer and more sustainable industrial production, setting a promising precedent for the development of environmentally friendly alternatives in other areas of chemical research.
Recently, diffusion models have emerged as a promising paradigm for molecular design and optimization. However, most diffusion-based molecular generative models focus on modeling 2D graphs or 3D geometries, with limited research on molecular sequence diffusion models. The International Union of Pure and Applied Chemistry (IUPAC) names are more akin to chemical natural language than the simplified molecular input line entry system (SMILES) for organic compounds. In this work, we apply an IUPAC-guided conditional diffusion model to facilitate molecular editing from chemical natural language to chemical language (SMILES) and explore whether the pre-trained generative performance of diffusion models can be transferred to chemical natural language. We propose DiffIUPAC, a controllable molecular editing diffusion model that converts IUPAC names to SMILES strings. Evaluation results demonstrate that our model outperforms existing methods and successfully captures the semantic rules of both chemical languages. Chemical space and scaffold analysis show that the model can generate similar compounds with diverse scaffolds within the specified constraints. Additionally, to illustrate the model's applicability in drug design, we conducted case studies in functional group editing, analogue design, and linker design.
The exponential growth of over-the-top (OTT) entertainment has fueled a surge in content consumption across diverse formats, especially in regional Indian languages. With the Indian film industry producing over 1500 films annually in more than 20 languages, personalized recommendations are essential to highlight relevant content. To overcome the limitations of traditional recommender systems (such as static latent vectors, poor handling of cold-start scenarios, and the absence of uncertainty modeling), we propose a deep Collaborative Neural Generative Embedding (C-NGE) model. C-NGE dynamically learns user and item representations by integrating rating information and metadata features in a unified neural framework. It uses metadata as sampled noise and applies the reparameterization trick to better capture latent patterns and to support predictions for new users or items without retraining. We evaluate C-NGE on the Indian Regional Movies (IRM) dataset, along with MovieLens 100K and 1M. Results show that our model consistently outperforms several existing methods, and its extensibility allows for incorporating additional signals such as user reviews and multimodal data to enhance recommendation quality.
Solar forecasting using ground-based sky images offers a promising approach to reduce uncertainty in photovoltaic (PV) power generation. However, existing methods often rely on deterministic predictions that lack diversity, making it difficult to capture the inherently stochastic nature of cloud movement. To address this limitation, we propose a new two-stage probabilistic forecasting framework. In the first stage, we introduce I-GPT, a multiscale physics-constrained generative model for stochastic sky image prediction. Given a sequence of past sky images, I-GPT uses a Transformer-based VQ-VAE. It also incorporates multi-scale physics-informed recurrent units (Multi-scale PhyCell) and fuses physical and appearance features with dynamic weighting. This approach enables the generation of multiple plausible future sky images with realistic and coherent cloud motion. In the second stage, these predicted sky images are fed into an Image-to-Power U-Net (IP-U-Net) to produce 15-min-ahead probabilistic PV power forecasts. In experiments using our dataset, the proposed approach significantly outperforms deterministic, other stochastic, multimodal, and smart persistence baseline models, achieving a superior reliability–sharpness trade-off. It attains a Continuous Ranked Probability Score (CRPS) of 2.912 kW and a Winkler Score (WS) of 33.103 kW on the test set, and a CRPS of 2.073 kW and a WS of 22.202 kW on the validation set, translating to 35.9% and 42.78% improvements in predictive skill over the smart persistence model. Notably, our method excels during rapidly changing cloud-cover conditions. By enhancing both the accuracy and robustness of short-term PV forecasting, the framework provides tangible benefits for Virtual Power Plant (VPP) operation, supporting more reliable scheduling, grid stability, and risk-aware energy management.
Robust stereo disparity estimation plays a critical role in minimally invasive surgery, where dynamic soft tissues, specular reflections, and data scarcity pose major challenges to traditional end-to-end deep learning and deformable model-based methods. In this paper, we propose a novel disparity estimation framework that leverages a pretrained StyleGAN generator to represent the disparity manifold of Minimally Invasive Surgery (MIS) scenes and reformulates the stereo matching task as a latent-space optimization problem. Specifically, given a stereo pair, we search for the optimal latent vector in the intermediate latent space of StyleGAN, such that the photometric reconstruction loss between the stereo images is minimized while regularizing the latent code to remain within the generator's high-confidence region. Unlike existing encoder-based embedding methods, our approach directly exploits the geometry of the learned latent space and enforces both photometric consistency and the manifold prior during inference, without the need for additional training or supervision. Extensive experiments on stereo-endoscopic videos demonstrate that our method achieves high-fidelity and robust disparity estimation across varying lighting, occlusion, and tissue dynamics, outperforming Thin Plate Spline (TPS)-based and linear representation baselines. This work bridges generative modeling and 3D perception by enabling efficient, training-free disparity recovery from pre-trained generative models with reduced inference latency.
The growth of Sakhalin fir (Abies sachalinensis) seedlings, an important forest tree species in northern Hokkaido, Japan, is significantly affected by competition from surrounding vegetation, especially evergreen dwarf bamboo. In this study, we investigated the height and root collar diameter (RCD) growth of Sakhalin fir seedlings under various degrees of cover by deciduous vegetation and evergreen dwarf bamboo. Generalized additive models were used to quantify the effects of canopy cover and forest floor cover on the relative growth rates of these two parameters. The canopy cover of Sakhalin fir seedlings had a nonlinear negative effect on both the height growth of seedlings in the subsequent year and the RCD growth in the current year, given the general growth pattern in this species, where height growth ceases in early summer and RCD growth continues until autumn. Height growth declined sharply after the canopy cover rate exceeded 50%, while RCD growth declined rapidly between 0 and 50% canopy cover rate. The forest floor cover had a greater negative impact on RCD growth than on height growth. These results suggested that Sakhalin fir seedlings respond to vegetative competition by prioritizing height growth for light acquisition at the expense of diameter growth and possibly root growth for below-ground competition. The cover of evergreen dwarf bamboo reduced the height growth of fir seedlings significantly more than the cover of deciduous vegetation. This difference is likely due to the timing of light availability. When competing with deciduous vegetation, Sakhalin fir seedlings exposed to light during the post-snowmelt period and early spring, before the development of the deciduous vegetation canopy, can photosynthesize more effectively, leading to greater height growth. The results of this study highlighted the importance of vegetation control that considers the type of vegetation for successful Sakhalin fir reforestation. Adjusting the intensity and timing of weeding based on the presence and abundance of dwarf bamboo and other competing vegetation could potentially reduce weeding costs and increase biodiversity in reforested areas.
Time series anomaly detection is critical in domains such as manufacturing, finance, and cybersecurity. Recent generative AI models, particularly Transformer- and Autoencoder-based architectures, show strong accuracy, but their robustness under noisy conditions is less understood. This study evaluates three representative models (AnomalyTransformer, TranAD, and USAD) on the Server Machine Dataset (SMD) and cross-domain benchmarks including the Soil Moisture Active Passive (SMAP) dataset, the Mars Science Laboratory (MSL) dataset, and the Secure Water Treatment (SWaT) testbed. Seven noise settings (five canonical, two mixed) at multiple intensities are tested under fixed clean-data training, with variations in window, stride, and thresholding. Results reveal distinct robustness profiles: AnomalyTransformer maintains recall but loses precision under abrupt noise, TranAD balances sensitivity yet is vulnerable to structured anomalies, and USAD resists Gaussian perturbations but collapses under block anomalies. Quantitatively, F1 drops 60%–70% on noisy SMD, with severe collapse on SWaT (F1 ≤ 0.10, drop up to 84%) but relative stability on SMAP/MSL (drop within ±10%). Overall, generative models exhibit complementary robustness patterns, highlighting noise-type-dependent vulnerabilities and providing practical guidance for robust deployment.
With the miniaturization of devices and the development of modern heating technologies, the generalization of heat conduction and thermoelastic coupling has become crucial for effectively emulating the thermodynamic behavior of materials on ultrashort time scales. Theoretically, generalized heat conduction models are considered in this work. By analogy with mechanical viscoelastic models, this paper further enriches the heat conduction models and gives their one-dimensional physical expressions. Numerically, the transient thermoelastic response of a slim strip material under thermal shock is investigated by applying the proposed models. First, the analytical solution in the Laplace domain is obtained by the Laplace transform. Then, the numerical results of the transient responses are obtained by the numerical inverse Laplace transform. Finally, the transient responses of the different models are analyzed and compared, and the effects of material parameters are discussed. This work not only opens up new research perspectives on generalized heat conduction and thermoelastic coupling theories, but is also expected to benefit a deeper understanding of heat wave theory.
Funding (paper on inverse design of supercritical airfoils): co-supported by the National Key Project of China (No. GJXM92579) and the National Natural Science Foundation of China (Nos. 92052203, 61903178 and 61906081).
Funding (paper on photon-counting CT image denoising): supported by MedTechLabs, GE HealthCare, the Swedish Research Council (No. 2021-05103), and the Göran Gustafsson Foundation (No. 2114).
Abstract: Deep learning (DL) has proven to be important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, due to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitude larger than for single-step samplers. In this paper, we present a novel image denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM)++. By hijacking and regularizing the sampling process we obtain a single-step sampler, that is, NFE = 1. Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate competitive performance compared to popular supervised (including state-of-the-art diffusion-style models with NFE = 1, i.e., consistency models), unsupervised, and non-DL-based image denoising techniques, on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.
Funding: The National Natural Science Foundation of China (Grant Nos. 81573273, 81673279, 21572010 and 21772005) and the National Major Scientific and Technological Special Project for "Significant New Drugs Development" (Grant No. 2018ZX09735001-003).
Abstract: Natural products (NPs) have long been recognized as a valuable resource for drug discovery, and bringing NP-related features to virtual libraries is believed to be an effective way to increase the coverage of druggable chemical space. Here, a deep learning-based molecular generative model, a recent technique in de novo molecule design, was applied to generate virtual libraries with NP-like properties. Results demonstrated that the model was effective in generating molecules that highly resemble NPs. Moreover, the model was also found to be capable of generating NP-like molecules that were easy to synthesize, significantly increasing the practical value of the compound library.
Abstract: This paper systematically reviews the latest advances in deep learning-based path planning for autonomous mobile robots, addressing the limitations of traditional methods (e.g., A*, Rapidly-exploring Random Tree (RRT)) in dynamic, high-dimensional, and unstructured environments. We comprehensively analyze five major deep learning model categories: Convolutional Neural Networks (CNNs) for spatial feature extraction, Graph Neural Networks (GNNs) for multi-agent collaboration, Recurrent Neural Networks (RNNs) for temporal modeling, Transformers for long-range dependency and complex instruction understanding, and generative models (e.g., GANs, Diffusion Models) for creative path generation. Our analysis covers technical principles, advantages, limitations, application scenarios, and development trends of these methods. The review reveals that deep learning has fundamentally transformed path planning from perception enhancement to decision substitution, from isolated agents to multi-agent collaboration, and from search-based to generative paradigms. Key findings indicate significant performance improvements: GNN-based distributed planning triples multi-robot collaboration efficiency, and generative models increase complex instruction planning success rates to 78.1%. Future directions include cross-modal integration, lightweight deployment, simulation-to-reality transfer, and verifiable safety assurance, which will be crucial for advancing next-generation intelligent mobile robot navigation systems.
Funding: Financially supported by the National Science Fund for Distinguished Young Scholars (82325047), the Regional Innovation and Development Joint Fund of NSFC (U24A20807), the Youth Innovation Promotion Association CAS (2023411), the National Natural Science Foundation of China (22477123), the Major Projects for Fundamental Research of Yunnan Province (202201BC070002), the CAS “Light of West China” Program and CAS Interdisciplinary Innovation Team (xbzg-zdsys-202303), and the Yunnan Revitalization Talent Support Program: Yunling Scholar Project, Yunnan Province Science and Technology Department (202305AH340005).
Abstract: Natural products (NPs) are invaluable resources for drug discovery, characterized by their intricate scaffolds and diverse bioactivities. AI drug discovery & design (AIDD) has emerged as a transformative approach for the rational structural modification of NPs. This review examines a variety of molecular generation models published since 2020, focusing on their potential applications in two primary scenarios of NP structure modification: modifications when the target is identified and when it remains unidentified. Most of the molecular generative models discussed herein are open-source, and their applicability across different domains and technical feasibility have been evaluated. This evaluation was accomplished by integrating a limited number of research cases and successful practices observed in the molecular optimization of synthetic compounds. Furthermore, the challenges and prospects of employing molecular generation modeling for the structural modification of NPs are discussed.
Funding: Funded by the National Natural Science Foundation of China (Nos. T2588101, 52572334, 52221005, 52472446, and 52220105001), the Independent Research Project of the State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University (No. ZZ-GG20250403), the National Key Research and Development Program of China (No. 2021YFB2501205), the Tsinghua University (State Key Laboratory of Intelligent Green Vehicle and Mobility)–Hangzhou Airport Economic Demonstration Zone Joint Research Center for Integrated Transportation, and the Shaanxi Province Merit-based Funding Project for Science and Technology Activities of Overseas Educated Personnel (No. 2023001).
Abstract: The rapid acceleration of big data and artificial intelligence has spurred the application of advanced machine learning methods to address multifaceted transportation challenges. Among these, generative models (GMs) have garnered significant attention, demonstrating great potential for advancing intelligent transportation systems. This paper provides a comprehensive investigation into the applications and potential of GMs within this domain. First, the paper systematically reviews the fundamental principles, architectures, and comparative characteristics of mainstream generative models. The primary contribution is an in-depth review of GM applications across three core areas: trajectory generation, traffic flow prediction, and autonomous driving. In trajectory generation, we examine how GMs synthesize realistic data to address data scarcity and privacy preservation. For traffic flow, the review covers GM-based approaches for prediction and critical data imputation tasks. In autonomous driving, the analysis details GM applications in sensor data restoration, perception enhancement, realistic scenario simulation, and behavior prediction. Although GMs have shown significant value, their full potential remains underexplored. Therefore, this paper identifies and discusses promising avenues for future research, including the integration of diffusion models in autonomous driving, the use of GMs for infrastructure planning, and their application in enhancing traffic safety. This paper is anticipated to serve as a comprehensive reference for researchers exploring generative models in transportation.
Funding: Supported by the National Natural Science Foundation of China (11925404, 92165209, 92265210, 92365301, T2225008, 12075128, and 62173201), the Innovation Program for Quantum Science and Technology (2021ZD0300203, 2021ZD0302203, and 2021ZD0300201), the National Key Research and Development Program of China (2017YFA0304303), the Tsinghua University Dushi Program, and the Shanghai Qi Zhi Institute Innovation Program (SQZ202318).
Abstract: With the rapid development of quantum devices across various platforms [1–4], reconstructing quantum many-body states from experimentally measured data poses a crucial challenge. Straightforward quantum state tomography (QST) is only applicable to small systems [5], since the required resources, such as the number of measurements and the classical memory size, grow exponentially as the system size increases.
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (2020A1515110843), the Young S&T Talent Training Program of Guangdong Provincial Association for S&T (SKXRC202211), the National Natural Science Foundation of China (22402163, 22109003), the Major Science and Technology Infrastructure Project of Material Genome Big-science Facilities Platform supported by the Municipal Development and Reform Commission of Shenzhen, the Soft Science Research Project of Guangdong Province (No. 2017B030301013), the Natural Science Foundation of Xiamen, China (3502Z202472001), the High-level Scientific Research Foundation of Hebei Province, and the Fundamental Research Funds for the Central Universities (20720240054).
Abstract: The rational design of catalyst structures tailored to target performance is an ambitious and profoundly impactful goal. Key challenges include achieving refined representations of the three-dimensional structure of active sites and imbuing models with robust physical interpretability. Herein, we developed a topology-based variational autoencoder framework (PGH-VAEs) to enable the interpretable inverse design of catalytic active sites. Using high-entropy alloys as a case study, we demonstrate that persistent GLMY homology, an advanced topological algebraic analysis tool, enables the quantification of three-dimensional structural sensitivity and establishes correlations with adsorption properties. The multi-channel PGH-VAEs illustrate how coordination and ligand effects shape the latent space and influence the adsorption energies. Building on the inverse design results from PGH-VAEs, strategies are proposed for optimizing the composition and facet structures to maximize the proportion of optimal active sites. This interpretable inverse design framework can be extended to diverse systems, paving the way for AI-driven catalyst design.
Funding: Supported by the National Key R&D Program of China (2022YFA1004100).
Abstract: Recent advances in generative models have significantly facilitated the development of personalized content creation. Given a small set of images containing a user-specific concept, personalized image generation allows the user to create images that incorporate that concept while adhering to provided text descriptions. The technologies used for personalization have evolved alongside the development of generative models, with their distinct and interrelated components. In this survey, we present a comprehensive review of generalized personalized image generation across various generative models, including traditional GANs, contemporary text-to-image diffusion models, and emerging multi-modal autoregressive (AR) models. We first define a unified framework that standardizes the personalization process across different generative models, encompassing three key components: inversion spaces, inversion methods, and personalization schemes. This unified framework offers a structured approach to dissecting and comparing personalization techniques across different generative architectures. Building upon our framework, we provide an in-depth analysis of personalization techniques within each generative model, highlighting their unique contributions and innovations. Through comparative analysis, we elucidate the current landscape of personalized image generation, identifying commonalities and distinguishing features of existing methods. Finally, we discuss open challenges in the field and propose potential directions for future research. We keep a bibliography of related works at https://github.com/csyxwei/Awesome-Personalized-Image-Generation.
Funding: Supported by the National Aeronautics and Space Administration (NASA) under Award No. 80NSSC20K0326 for the research activities and particularly for this paper.
Abstract: The use of deep generative models (DGMs) such as variational autoencoders, autoregressive models, flow-based models, energy-based models, generative adversarial networks, and diffusion models has been advantageous in various disciplines due to their strong data-generation capabilities. Using DGMs has become one of the most prominent research topics in Artificial Intelligence in recent years. Meanwhile, research and development in the civil structural health monitoring (SHM) area have also progressed rapidly owing to the increasing use of Machine Learning techniques. As such, some DGMs have also been used in the civil SHM field lately. This short review communication paper aims to assist researchers in the civil SHM field in understanding the fundamentals of DGMs and, consequently, to help initiate their use for current and possible future engineering applications. On this basis, this study briefly introduces the concept and mechanism of different DGMs in a comparative fashion. While preparing this short review communication, it was observed that some DGMs had not been utilized or exploited fully in the SHM area. Accordingly, some representative studies presented in the civil SHM field that use DGMs are briefly overviewed. The study also presents a short comparative discussion on DGMs, their link to SHM, and research directions.
Funding: Supported by the Key Project of International Cooperation of Qilu University of Technology (Grant No.: QLUTGJHZ2018008), the Shandong Provincial Natural Science Foundation Committee, China (Grant No.: ZR2016HB54), and the Shandong Provincial Key Laboratory of Microbial Engineering (SME).
Abstract: AlphaPanda (AlphaFold2 [1] inspired protein-specific antibody design in a diffusional manner) is an advanced algorithm for designing the complementarity-determining regions (CDRs) of antibodies that target a specific epitope, combining transformer [2] models, 3DCNN [3], and diffusion [4] generative models.
Funding: Supported by the Shanghai Science and Technology Commission Project (No. 21DZ1201502), the Shanghai Municipal Bureau of Ecology and Environment (Shanghai Environmental Science [2023] No. 40), the Interdisciplinary Joint Research Project of Tongji University (No. 2022-4-YB-12), and the Shanghai Science and Technology Commission Project (No. 22DZ2200200).
Abstract: Over the past century, advancements in chemistry have significantly propelled human innovation, enhancing both industrial and consumer products. However, this rapid progression has resulted in chemical pollution increasingly surpassing planetary boundaries, as production and release rates have outpaced our monitoring capabilities. To catalyze more impactful efforts, this study transitions from traditional chemical assessment to inverse chemical design, introducing a generative graph latent diffusion model aimed at discovering safer alternatives. In a case study on the design of green solvents for cyclohexane/benzene extractive distillation, we constructed a design database encompassing functional, environmental-hazard, and process constraints. Virtual screening of the previous design dataset revealed distinct trade-off trends between these design requirements. Based on the screening outcomes, an unconstrained generative model was developed, which covered a broader chemical space and demonstrated superior capabilities for structural interpolation and extrapolation. To further optimize molecular generation towards desired properties, a multi-objective latent diffusion method was applied, yielding 19 candidate molecules. Of these, 7 were identified in PubChem as the most viable green solvent candidates, while the remaining 12 were potential novel candidates. Overall, this study effectively designed green solvent candidates for safer and more sustainable industrial production, setting a promising precedent for the development of environmentally friendly alternatives in other areas of chemical research.
Funding: Supported by the Yonsei University Graduate School, Department of Integrative Biotechnology.
Abstract: Recently, diffusion models have emerged as a promising paradigm for molecular design and optimization. However, most diffusion-based molecular generative models focus on modeling 2D graphs or 3D geometries, with limited research on molecular sequence diffusion models. The International Union of Pure and Applied Chemistry (IUPAC) names are more akin to chemical natural language than the simplified molecular input line entry system (SMILES) for organic compounds. In this work, we apply an IUPAC-guided conditional diffusion model to facilitate molecular editing from chemical natural language to chemical language (SMILES) and explore whether the pre-trained generative performance of diffusion models can be transferred to chemical natural language. We propose DiffIUPAC, a controllable molecular editing diffusion model that converts IUPAC names to SMILES strings. Evaluation results demonstrate that our model outperforms existing methods and successfully captures the semantic rules of both chemical languages. Chemical space and scaffold analysis show that the model can generate similar compounds with diverse scaffolds within the specified constraints. Additionally, to illustrate the model’s applicability in drug design, we conducted case studies in functional group editing, analogue design and linker design.
Abstract: The exponential growth of over-the-top (OTT) entertainment has fueled a surge in content consumption across diverse formats, especially in regional Indian languages. With the Indian film industry producing over 1500 films annually in more than 20 languages, personalized recommendations are essential to highlight relevant content. To overcome the limitations of traditional recommender systems (such as static latent vectors, poor handling of cold-start scenarios, and the absence of uncertainty modeling), we propose a deep Collaborative Neural Generative Embedding (C-NGE) model. C-NGE dynamically learns user and item representations by integrating rating information and metadata features in a unified neural framework. It uses metadata as sampled noise and applies the reparameterization trick to better capture latent patterns and to support predictions for new users or items without retraining. We evaluate C-NGE on the Indian Regional Movies (IRM) dataset, along with MovieLens 100K and 1M. Results show that our model consistently outperforms several existing methods, and its extensibility allows for incorporating additional signals like user reviews and multimodal data to enhance recommendation quality.
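For readers unfamiliar with the reparameterization trick mentioned in the abstract above, the following is a minimal sketch of the general technique, not the C-NGE implementation. It assumes a diagonal Gaussian latent whose mean and log-variance would come from an encoder; the function name `reparameterize` and the way noise is supplied are hypothetical choices made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, eps=None, rng=random):
    """Return z = mu + sigma * eps for a diagonal Gaussian latent.

    mu, log_var: lists of floats (assumed encoder outputs).
    eps: optional noise vector; if omitted, standard normal noise is drawn.
    Writing the sample this way keeps z a deterministic function of
    (mu, log_var) given eps, which is what lets gradients flow through it.
    """
    if eps is None:
        eps = [rng.gauss(0.0, 1.0) for _ in mu]
    # sigma = exp(0.5 * log_var) converts log-variance to standard deviation.
    return [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, log_var, eps)]


# Hypothetical two-dimensional latent with unit variance.
z = reparameterize([0.5, -1.0], [0.0, 0.0])
```

Passing `eps` explicitly mirrors the abstract's idea of supplying the noise from an external source, although how C-NGE derives that noise from metadata is not described here.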
Funding: Supported by the “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (2021RIS002) and the Technology Development Program (RS-2025-02312851) funded by the Ministry of SMEs and Startups (MSS, Republic of Korea).
Abstract: Solar forecasting using ground-based sky images offers a promising approach to reduce uncertainty in photovoltaic (PV) power generation. However, existing methods often rely on deterministic predictions that lack diversity, making it difficult to capture the inherently stochastic nature of cloud movement. To address this limitation, we propose a new two-stage probabilistic forecasting framework. In the first stage, we introduce I-GPT, a multiscale physics-constrained generative model for stochastic sky image prediction. Given a sequence of past sky images, I-GPT uses a Transformer-based VQ-VAE. It also incorporates multi-scale physics-informed recurrent units (Multi-scale PhyCell) and fuses physical and appearance features with dynamic weights. This approach enables the generation of multiple plausible future sky images with realistic and coherent cloud motion. In the second stage, these predicted sky images are fed into an Image-to-Power U-Net (IP-U-Net) to produce 15-min-ahead probabilistic PV power forecasts. In experiments using our dataset, the proposed approach significantly outperforms deterministic, other stochastic, multimodal, and smart persistence baseline models, achieving a superior reliability–sharpness trade-off. It attains a Continuous Ranked Probability Score (CRPS) of 2.912 kW and a Winkler Score (WS) of 33.103 kW on the test set, and a CRPS of 2.073 kW and a WS of 22.202 kW on the validation set. This translates to 35.9% and 42.78% improvements in predictive skill over the smart persistence model. Notably, our method excels during rapidly changing cloud-cover conditions. By enhancing both the accuracy and robustness of short-term PV forecasting, the framework provides tangible benefits for Virtual Power Plant (VPP) operation, supporting more reliable scheduling, grid stability, and risk-aware energy management.
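The two probabilistic scores reported in the abstract above can be illustrated with a short sketch. It uses the standard ensemble estimator of the CRPS and the standard Winkler score for a central prediction interval; the ensemble size, the nominal coverage level, and the function names are assumptions for illustration, since the abstract does not state how the scores were computed.

```python
def crps_ensemble(samples, observation):
    """Ensemble CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    samples: forecast ensemble members (e.g. PV power in kW).
    observation: the realized value in the same unit.
    """
    n = len(samples)
    term_obs = sum(abs(x - observation) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def winkler_score(lower, upper, observation, alpha):
    """Winkler score of a (1 - alpha) prediction interval [lower, upper].

    The score is the interval width plus a penalty of 2/alpha times the
    distance by which the observation falls outside the interval.
    """
    width = upper - lower
    if observation < lower:
        return width + (2.0 / alpha) * (lower - observation)
    if observation > upper:
        return width + (2.0 / alpha) * (observation - upper)
    return width


# Hypothetical three-member ensemble and a 90% interval (alpha = 0.1).
example_crps = crps_ensemble([1.0, 2.0, 3.0], 2.0)
example_ws = winkler_score(1.0, 3.0, 4.0, alpha=0.1)
```

Both scores are negatively oriented (lower is better) and share the unit of the forecast variable, which is why the abstract reports them in kW.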
Funding: Supported by the Sichuan Science and Technology Program [2023YFSY0026, 2023YFH0004] and Guangzhou Huashang University [2024HSZD01].
Abstract: Robust stereo disparity estimation plays a critical role in minimally invasive surgery, where dynamic soft tissues, specular reflections, and data scarcity pose major challenges to traditional end-to-end deep learning and deformable model-based methods. In this paper, we propose a novel disparity estimation framework that leverages a pretrained StyleGAN generator to represent the disparity manifold of Minimally Invasive Surgery (MIS) scenes and reformulates the stereo matching task as a latent-space optimization problem. Specifically, given a stereo pair, we search for the optimal latent vector in the intermediate latent space of StyleGAN, such that the photometric reconstruction loss between the stereo images is minimized while regularizing the latent code to remain within the generator’s high-confidence region. Unlike existing encoder-based embedding methods, our approach directly exploits the geometry of the learned latent space and enforces both photometric consistency and a manifold prior during inference, without the need for additional training or supervision. Extensive experiments on stereo-endoscopic videos demonstrate that our method achieves high-fidelity and robust disparity estimation across varying lighting, occlusion, and tissue dynamics, outperforming Thin Plate Spline (TPS)-based and linear representation baselines. This work bridges generative modeling and 3D perception by enabling efficient, training-free disparity recovery from pre-trained generative models with reduced inference latency.
Funding: Supported by the Ministry of Agriculture, Forestry, and Fisheries of Japan (25093 C) and JSPS KAKENHI (JP23H02262).
Abstract: The growth of Sakhalin fir (Abies sachalinensis) seedlings, an important forest tree species in northern Hokkaido, Japan, is significantly affected by competition from surrounding vegetation, especially evergreen dwarf bamboo. In this study, we investigated the height and root collar diameter (RCD) growth of Sakhalin fir seedlings under various degrees of cover by deciduous vegetation and evergreen dwarf bamboo. Generalized additive models were used to quantify the effects of canopy cover and forest floor cover on the relative growth rates of these two parameters. The canopy cover of Sakhalin fir seedlings had a nonlinear negative effect on both the height growth of seedlings in the subsequent year and the RCD growth in the current year, given the general growth pattern in this species, where height growth ceases in early summer and RCD growth continues until autumn. Height growth declined sharply after the canopy cover rate exceeded 50%, while RCD growth declined rapidly between 0 and 50% canopy cover rate. The forest floor cover had a greater negative impact on RCD growth than on height growth. These results suggested that Sakhalin fir seedlings respond to vegetative competition by prioritizing height growth for light acquisition at the expense of diameter growth and possibly root growth for below-ground competition. The cover of evergreen dwarf bamboo reduced the height growth of fir seedlings significantly more than the cover of deciduous vegetation. This difference is likely due to the timing of light availability. When competing with deciduous vegetation, Sakhalin fir seedlings exposed to light during the post-snowmelt period and early spring, before the development of the deciduous vegetation canopy, can photosynthesize more effectively, leading to greater height growth. The results of this study highlighted the importance of vegetation control that considers the type of vegetation for successful Sakhalin fir reforestation. Adjusting the intensity and timing of weeding based on the presence and abundance of dwarf bamboo and other competing vegetation could potentially reduce weeding costs and increase biodiversity in reforested areas.
Funding: Supported by the “Regional Innovation System & Education (RISE)” through the Seoul RISE Center, funded by the Ministry of Education (MOE) and the Seoul Metropolitan Government (2025-RISE-01-018-04), and supported by the Korea Digital Forensic Center.
Abstract: Time series anomaly detection is critical in domains such as manufacturing, finance, and cybersecurity. Recent generative AI models, particularly Transformer- and Autoencoder-based architectures, show strong accuracy, but their robustness under noisy conditions is less understood. This study evaluates three representative models (AnomalyTransformer, TranAD, and USAD) on the Server Machine Dataset (SMD) and cross-domain benchmarks including the Soil Moisture Active Passive (SMAP) dataset, the Mars Science Laboratory (MSL) dataset, and the Secure Water Treatment (SWaT) testbed. Seven noise settings (five canonical, two mixed) at multiple intensities are tested under fixed clean-data training, with variations in window, stride, and thresholding. Results reveal distinct robustness profiles: AnomalyTransformer maintains recall but loses precision under abrupt noise, TranAD balances sensitivity yet is vulnerable to structured anomalies, and USAD resists Gaussian perturbations but collapses under block anomalies. Quantitatively, F1 drops 60%–70% on noisy SMD, with severe collapse in SWaT (F1 ≤ 0.10, drop up to 84%) but relative stability on SMAP/MSL (drop within ±10%). Overall, generative models exhibit complementary robustness patterns, highlighting noise-type dependent vulnerabilities and providing practical guidance for robust deployment.
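To make the F1 and drop figures in the abstract above concrete, here is a minimal sketch of how such numbers are typically derived. The definition of the drop as the relative F1 loss with respect to the clean-data score is an assumption, as are the function names and the counts in the usage lines; the abstract does not spell out its exact formula.

```python
def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def relative_drop(f1_clean, f1_noisy):
    """Percentage F1 loss relative to the clean score (assumed definition)."""
    return 100.0 * (f1_clean - f1_noisy) / f1_clean


# Hypothetical counts for one detector on clean and on noisy test data.
f1_clean = f1_score(tp=80, fp=20, fn=20)   # precision = recall = 0.8
f1_noisy = f1_score(tp=30, fp=90, fn=70)   # precision 0.25, recall 0.30
drop_pct = relative_drop(f1_clean, f1_noisy)
```

Under this definition a negative drop means the noisy score is higher than the clean one, which is consistent with the abstract reporting changes "within ±10%" on SMAP/MSL.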
Funding: Project supported by the Guangdong Basic and Applied Basic Research Foundation of China (No. 2023A1515012809), the Natural Science Foundation of Shaanxi Province of China (No. 2023-JC-YB-073), and the Fundamental Research Funds for the Central Universities of China (No. D5000230066).
Abstract: With the miniaturization of devices and the development of modern heating technologies, the generalization of heat conduction and thermoelastic coupling has become crucial for effectively describing the thermodynamic behavior of materials at ultrashort time scales. Theoretically, generalized heat conduction models are considered in this work. By analogy with mechanical viscoelastic models, this paper further enriches the heat conduction models and gives their one-dimensional physical expressions. Numerically, the transient thermoelastic response of a slim strip material under thermal shock is investigated by applying the proposed models. First, the analytical solution in the Laplace domain is obtained by the Laplace transform. Then, the numerical results of the transient responses are obtained by the numerical inverse Laplace transform. Finally, the transient responses of different models are analyzed and compared, and the effects of material parameters are discussed. This work not only opens up new research perspectives on generalized heat conduction and thermoelastic coupling theories, but is also expected to be beneficial for a deeper understanding of heat wave theory.
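The last numerical step described above, recovering a time-domain response from a Laplace-domain solution, can be sketched as follows. The abstract does not say which inversion algorithm was used; the Gaver-Stehfest method below is one common choice, swapped in purely for illustration, and the test function F(s) = 1/(s + 1) stands in for the paper's thermoelastic solution, which is not given here.

```python
import math
from fractions import Fraction


def stehfest_coefficients(n):
    """Gaver-Stehfest weights V_1..V_n for an even number of terms n."""
    if n < 2 or n % 2:
        raise ValueError("n must be a positive even integer")
    half = n // 2
    coeffs = []
    for k in range(1, n + 1):
        total = Fraction(0)
        for j in range((k + 1) // 2, min(k, half) + 1):
            # Exact rational arithmetic avoids rounding in the factorials.
            total += Fraction(
                j ** half * math.factorial(2 * j),
                math.factorial(half - j) * math.factorial(j)
                * math.factorial(j - 1) * math.factorial(k - j)
                * math.factorial(2 * j - k),
            )
        coeffs.append(float((-1) ** (k + half) * total))
    return coeffs


def invert_laplace(F, t, n=12):
    """Approximate f(t) from its Laplace transform F(s) for t > 0."""
    ln2_over_t = math.log(2.0) / t
    coeffs = stehfest_coefficients(n)
    return ln2_over_t * sum(
        v * F(k * ln2_over_t) for k, v in enumerate(coeffs, start=1)
    )


# Stand-in example: F(s) = 1 / (s + 1) has the exact inverse exp(-t).
approx = invert_laplace(lambda s: 1.0 / (s + 1.0), 1.0)
exact = math.exp(-1.0)
```

The method only needs F evaluated at real s, which makes it convenient for smooth responses, but it is known to struggle with oscillatory or discontinuous signals such as sharp thermal wave fronts, so other inversion schemes may be preferable there.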