Funding: Supported by the Important National Science and Technology Specific Projects under Grant Nos. 2009ZX01028-002-003, 2009ZX01029-001-003, and 2010ZX01036-001-002, and the National Natural Science Foundation of China under Grant Nos. 61050002, 60736012, 60921002, and 61003064.
Abstract: The Godson-3B processor is designed for high-performance servers, including Dawning servers. It offers significantly improved performance over previous Godson-3 series CPUs by incorporating eight CPU cores and vector computing units. It contains 582.6 M transistors within a 300 mm² area in 65 nm technology and is implemented in parallel with fully hierarchical design flows. In Godson-3B, advanced clock distribution mechanisms, including GALS (Globally Asynchronous Locally Synchronous) and clock mesh, are adopted to obtain a clock network that tolerates on-chip variation (OCV). Custom-designed de-skew modules are also implemented to allow further latency balancing after fabrication. Power is reduced through MLMM (Multi Level Multi Mode) clock gating and multi-threshold-voltage cell substitution schemes. The highest frequency of Godson-3B is 1.05 GHz, and the peak performance is 128 GFlops (double-precision) or 256 GFlops (single-precision) with 40 W power consumption.
Funding: Supported by the National Basic Research 973 Program of China under Grant No. 2005CB321600, the National High Technology Research & Development 863 Program of China under Grant Nos. 2008AA110901, 2009AA01Z125, and 2007AA01Z114, and the National Natural Science Foundation of China under Grant Nos. 60803029, 60673146, and 60736012.
Abstract: The Godson-3A microprocessor is a quad-core version of the scalable Godson-3 multi-core series. It is physically implemented in a 65 nm CMOS process. The 174 mm² chip consists of 425 million transistors. The maximum frequency is 1 GHz, with a maximum power consumption of 15 W. The main challenges of the Godson-3A physical implementation include its very large scale, the high frequency requirement, sub-micron technology effects, and an aggressive schedule. This paper describes the design methodology of the physical implementation of Godson-3A, with particular emphasis on design methods for high frequency, clock tree design, power management, and on-chip variation (OCV) issues.
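The figures quoted in the two Godson abstracts can be turned into a few derived metrics for a rough side-by-side comparison. The following is a minimal sketch using only the numbers stated above; the `Chip` class, its field names, and the `gflops_per_watt` helper are hypothetical names introduced for illustration, and the two power figures may not have been measured under the same conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Headline figures as quoted in the abstracts (hypothetical container)."""
    name: str
    transistors_m: float  # millions of transistors
    area_mm2: float       # die area in mm^2
    power_w: float        # quoted power consumption in watts

    @property
    def density_m_per_mm2(self) -> float:
        """Transistor density in millions of transistors per mm^2."""
        return self.transistors_m / self.area_mm2

    @property
    def power_density_w_per_mm2(self) -> float:
        """Quoted power divided by die area, in W per mm^2."""
        return self.power_w / self.area_mm2


# Values copied from the abstracts; both chips use 65 nm technology.
GODSON_3A = Chip("Godson-3A", transistors_m=425.0, area_mm2=174.0, power_w=15.0)
GODSON_3B = Chip("Godson-3B", transistors_m=582.6, area_mm2=300.0, power_w=40.0)


def gflops_per_watt(peak_gflops: float, power_w: float) -> float:
    """Peak floating-point throughput per watt."""
    return peak_gflops / power_w


if __name__ == "__main__":
    for chip in (GODSON_3A, GODSON_3B):
        print(f"{chip.name}: {chip.density_m_per_mm2:.3f} M transistors/mm^2, "
              f"{chip.power_density_w_per_mm2:.3f} W/mm^2")
    # Only the Godson-3B abstract quotes a peak GFlops figure.
    print("Godson-3B double-precision:",
          gflops_per_watt(128.0, GODSON_3B.power_w), "GFlops/W")
```

These ratios are derived from peak and headline values only, so they indicate scale rather than measured efficiency.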
Funding: Supported by Swiss National Science Foundation (SNF) projects LION, ERC SMARTIES, and Institut Universitaire de France. H.W. acknowledges China Scholarship Council and National Natural Science Foundation of China (623B2064 and 62275137); J.H. acknowledges SNF fellowship (P2ELP2_199825); Y.B. acknowledges the support from Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2022R1A6A3A03072108); European Union’s Horizon Europe research and innovation program (N.101105899); Q.L. acknowledges National Natural Science Foundation of China (62275137); the Tsinghua University (Department of Precision Instrument)-North Laser Research Institute Co., Ltd Joint Research Center for Advanced Laser Technology (20244910194).
Abstract: Artificial neural networks with internal dynamics exhibit remarkable capability in processing information. Reservoir computing (RC) is a canonical example that features rich computing expressivity and compatibility with physical implementations for enhanced efficiency. Recently, a new RC paradigm known as next generation reservoir computing (NGRC) further improves expressivity but compromises its physical openness, posing challenges for realizations in physical systems. Here we demonstrate optical NGRC with computations performed by light scattering through disordered media. In contrast to conventional optical RC implementations, we directly and solely drive our optical reservoir with time-delayed inputs. Much like digital NGRC, which relies on polynomial features of delayed inputs, our optical reservoir also implicitly generates these polynomial features for desired functionalities. By leveraging the domain knowledge of the reservoir inputs, we show that the optical NGRC not only predicts the short-term dynamics of the low-dimensional Lorenz63 and large-scale Kuramoto-Sivashinsky chaotic time series, but also replicates their long-term ergodic properties. Optical NGRC requires a shorter training length and fewer hyperparameters than conventional optical RC based on scattering media, while achieving better forecasting performance. Our optical NGRC framework may inspire the realization of NGRC in other physical RC systems, new applications beyond time-series processing, and the development of deep and parallel architectures broadly.
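To make the phrase "polynomial features of delayed inputs" concrete, the sketch below builds a digital NGRC-style feature vector explicitly. It illustrates the general digital technique only, not the authors' optical setup, where such features arise implicitly from light scattering. The function name `ngrc_features`, the parameters `delays` and `stride`, and the choice of degree-2 monomials are assumptions made for illustration; training a linear readout on these features (for example by ridge regression) is omitted.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence


def ngrc_features(series: Sequence[Sequence[float]], t: int,
                  delays: int = 2, stride: int = 1) -> List[float]:
    """Build a feature vector at time step t from time-delayed inputs.

    The result is [1] + linear part + quadratic part, where the linear part
    concatenates series[t], series[t - stride], ..., and the quadratic part
    holds every unique pairwise product of the linear entries.
    """
    oldest = t - (delays - 1) * stride
    if oldest < 0 or t >= len(series):
        raise ValueError("not enough history for the requested delays")

    # Linear part: current input followed by its delayed copies.
    linear: List[float] = []
    for j in range(delays):
        linear.extend(series[t - j * stride])

    # Quadratic part: unique degree-2 monomials of the linear entries.
    quadratic = [a * b for a, b in combinations_with_replacement(linear, 2)]

    # A constant term lets a linear readout learn a bias.
    return [1.0] + linear + quadratic


if __name__ == "__main__":
    toy_series = [[1.0, 2.0], [3.0, 4.0]]  # two time steps of a 2-D signal
    features = ngrc_features(toy_series, t=1, delays=2)
    print(len(features), features)
```

For a d-dimensional input and k delays, the linear part has d·k entries and the quadratic part has d·k(d·k + 1)/2, so the feature count grows quickly with the number of delays.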