Abstract: This paper focuses on cross-layer design between the video application layer and the MIMO physical layer. MIMO physical-layer research has promised an enormous increase in the capacity of wireless communication systems. However, MIMO wireless systems operate under fading conditions in which the channel fluctuates arbitrarily. Because the wireless channel changes over each coherence period, its capacity under the given power constraints changes as well. Hence, to make efficient use of the available capacity, one needs to adapt the video bit rate. However, it is impossible to adapt at the application layer, because changing the video parameters takes longer than the coherence period of the channel. In this paper, we address this problem through a novel solution and investigate its performance through a simulation study.
Abstract: The Locally Self-consistent Multiple Scattering (LSMS) code solves the first-principles density functional theory Kohn-Sham equation for a wide range of materials, with a special focus on metals, alloys, and metallic nanostructures. It has traditionally exhibited near-perfect scalability on massively parallel high-performance computer architectures. We present our efforts to exploit GPUs to accelerate the LSMS code to enable first-principles calculations of O(100,000) atoms and statistical physics sampling of finite-temperature properties. Using the Cray XK7 system Titan at the Oak Ridge Leadership Computing Facility, we achieve a sustained performance of 14.5 PFlop/s and a speedup of 8.6 compared to the CPU-only code.
Funding: Supported by the Culture, Sports and Tourism R&D Program through the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports and Tourism in 2022 (4D Content Generation and Copyright Protection with Artificial Intelligence, R2022020068, 30%; Research on neural watermark technology for copyright protection of generative AI 3D content, RS-2024-00348469, 40%; International Collaborative Research and Global Talent Development for the Development of Copyright Management and Protection Technologies for Generative AI, RS-2024-00345025, 10%) and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2019-II190079, 10%; No. 2017-0-00417, 10%).
Abstract: We present a novel framework for audio-guided localized image stylization. Sound often provides information about the specific context of a scene and is closely related to a certain part of the scene or object. However, existing image stylization works have focused on stylizing the entire image using an image or text input. Stylizing a particular part of the image based on audio input is natural but challenging. This work proposes a framework in which a user provides one audio input to localize the target in the input image and another to locally stylize the target object or scene. We first produce a fine localization map using an audio-visual localization network that leverages the CLIP embedding space. We then use an implicit neural representation (INR), along with the predicted localization map, to stylize the target based on sound information. The INR manipulates local pixel values to be semantically consistent with the provided audio input. Our experiments show that the proposed framework outperforms other audio-guided stylization methods. Moreover, we observe that our method constructs concise localization maps and naturally manipulates the target object or scene in accordance with the given audio input.
Funding: Supported by the United States Department of Energy (US DOE), Office of Science, Basic Energy Sciences, Chemical Sciences, Geosciences, and Biosciences Division under Triad National Security, LLC ('Triad') contract grant no. 89233218CNA000001 (FWP: LANLE3F2). A. E. A. Allen and S. Matin also acknowledge the Center for Nonlinear Studies. Computer time was provided by the CCS-7 Darwin cluster at LANL. LAUR-23-27568.
Abstract: The development of machine learning models has led to an abundance of datasets containing quantum mechanical (QM) calculations for molecular and material systems. However, traditional training methods for machine learning models are unable to leverage the plethora of data available, as they require that each dataset be generated using the same QM method. Taking machine learning interatomic potentials (MLIPs) as an example, we show that meta-learning techniques, a recent advancement from the machine learning community, can be used to fit multiple levels of QM theory in the same training process. Meta-learning changes the training procedure to learn a representation that can be easily retrained on new tasks with small amounts of data. We then demonstrate that meta-learning enables simultaneous training on multiple large organic molecule datasets. As a proof of concept, we examine the performance of an MLIP refit to a small drug-like molecule and show that pretraining potentials on multiple levels of theory with meta-learning improves performance. This difference in performance can be seen both in the reduced error and in the improved smoothness of the potential energy surface produced. We therefore show that meta-learning can utilize existing datasets with inconsistent QM levels of theory to produce models that are better at specializing to new datasets. This opens new routes for creating pre-trained foundation models for interatomic potentials.
Funding: Supported in part by the National Science Foundation under Grant CCF-1908045 and Grant CCF-2309822; in part by the Semiconductor Research Corporation (SRC) under Contract 2470; in part by Intel Corporation; in part by the DARPA ERI 3DSOC program under Award HR001118C0096; in part by CHIMES, one of the seven centers in JUMP 2.0; and in part by DARPA through the SRC Program.
Abstract: As Moore’s Law approaches its limits, 3-D integrated circuits (ICs) have emerged as promising alternatives to conventional scaling methodologies. However, the benefits of 3-D integration in terms of lower power consumption, higher performance, and reduced area are accompanied by testing challenges. The unique vertical stacking of components in 3-D ICs introduces concerns related to the robustness of bonding surfaces. Moreover, immature manufacturing processes during 3-D fabrication can lead to high defect rates in different tiers. Therefore, there is a need for design-for-test solutions to ensure the reliability and performance of 3-D-integrated architectures. In this paper, we provide a comprehensive survey of existing testing strategies for 3-D ICs. We describe recent advances, including research efforts and industry practice, that address concerns related to bonding defects, elevated power supply noise, fault diagnosis, and fault localization specific to the unique characteristics of 3-D ICs.
Abstract: Limited main memory bandwidth is becoming a fundamental performance bottleneck in chip multiprocessor (CMP) design. Yet directly increasing the peak memory bandwidth can incur high cost and power consumption. In this paper, we address this problem by proposing BACH, a bandwidth-aware reconfigurable cache hierarchy with hybrid memory technologies. Components of our BACH design include a hybrid cache hierarchy, a reconfiguration mechanism, and a statistical prediction engine. Our hybrid cache hierarchy chooses among memory technologies with different bandwidth characteristics, such as spin-transfer torque memory (STT-MRAM), resistive memory (ReRAM), and embedded DRAM (eDRAM), to configure each level so that the peak bandwidth of the overall cache hierarchy is optimized. Our reconfiguration mechanism can dynamically adjust the cache capacity of each level based on the predicted bandwidth demands of running workloads. The bandwidth prediction is performed by our prediction engine. We evaluate the system performance gain obtained by the BACH design with a set of multithreaded and multiprogrammed workloads, with and without a limit on the system power budget. Compared with a traditional SRAM-based cache design, BACH improves system throughput by 58% and 14% with multithreaded and multiprogrammed workloads, respectively.