Milling Process Simulation is one of the important re search areas in manufacturing science. For the purpose of improving the prec ision of simulation and extending its usability, numerical algorithm is more and more ...Milling Process Simulation is one of the important re search areas in manufacturing science. For the purpose of improving the prec ision of simulation and extending its usability, numerical algorithm is more and more used in the milling modeling areas. But simulative efficiency is decreasin g with increase of its complexity. As a result, application of the method is lim ited. Aimed at above question, high-efficient algorithm for milling process sim ulation is studied. It is important for milling process simulation’s applicatio n. Parallel computing is widely used to solve the large-scale computation question s. Its advantages include system flexibility, robust, high-efficient computing capability and high ratio of performance to price. With the development of compu ter network, utilizing the computing resource in the Internet, a virtual computi ng environment with powerful computing capability can be consisted by microc omputers, and the difficulty of building hardware environment which is used to s upport parallel computing is reduced. How to use network technology and parallel algorithm to improve simulative effic iency for milling forces simulation is investigated in the paper. In order to pr edict milling forces, a simplified local milling forces model is used in the pap er. End milling cutter is assumed to be divided by r number of differential elem ents along the axial direction of the cutter. For a given time, the total cuttin g forces can be obtained by summarizing the resultant cutting force produced by each differential cutter disc. Divide the whole simulative time into some segmen ts, send these program’s segments to microcomputers in the Internet and obtain the result of the program’s segments, all of the result of program’s segments a re composed the final result. For implementing the algorithm, a distributed Parallel computing framework is de signed in the paper. In the framework, web server plays a role of controller. Us ing Java RMI(remote method interface), the computing processes in computing serv er are called by web server. There are lots of control processes in web server a nd control the computing servers. The codes of simulative algorithm can be dynam ic sent to the computing servers, and milling forces at the different time are c omputed through utilizing the local computer’s resource. The results that are ca lculated by every computing servers are sent to the web server, and composed the final result. The framework can be used by different simulative algorithm. Comp ared with the algorithm running single machine, the efficiency of provided algor ithm is higher than that of single machine.展开更多
Parallel computing assigns the computing model to different processors on different devices and implements it simultaneously.Accordingly,it has broad applications in the numerical simulation of geotechnical engineerin...Parallel computing assigns the computing model to different processors on different devices and implements it simultaneously.Accordingly,it has broad applications in the numerical simulation of geotechnical engineering and underground engineering,of which models are always large-scale.With parallel computing,the computing time or the memory requirements will be reduced by splitting the original domain of the numerical model into many subdomains,which is thus named as the domain decomposition method.In this study,a cubic and equal volume domain decomposition strategy was utilized to realize the parallel computing on the distributed memory system of four-dimensional lattice spring model(4D-LSM)based on the message passing interface.With a more efficient communication strategy introduced,this study aimed at operating an one-billion-particle model on a supercomputer platform.The preprocessing procedure of the parallelized 4D-LSM was restructured and the particle generation strategy suitable for the supercomputer platform was employed to minimize the time consumption in preprocessing and calculation.On this basis,numerical calculations were performed on TianHe-3 prototype E class supercomputer at the National Supercomputer Center in Tianjin.Two fieldscale three-dimensional blasting wave propagation models were carried out,of which the numerical results verify the computing power and the advantage of the parallelized 4D-LSM in the simulation of large-scale three-dimension models.Subsequently,the time complexity and spatial complexity of 4D-LSM and other particle discrete element methods were analyzed.展开更多
High-performance computing(HPC)refers to the ability to process data and perform complex calculations at high speeds.It is one of the most essential tools fueling the advancement of science and technology.
Ambient noise tomography is an established technique in seismology,where calculating single-or ninecomponent noise cross-correlation functions(NCFs)is a fundamental first step.In this study,we introduced a novel CPU-G...Ambient noise tomography is an established technique in seismology,where calculating single-or ninecomponent noise cross-correlation functions(NCFs)is a fundamental first step.In this study,we introduced a novel CPU-GPU heterogeneous computing framework designed to significantly enhance the efficiency of computing 9-component NCFs from seismic ambient noise data.This framework not only accelerated the computational process by leveraging the Compute Unified Device Architecture(CUDA)but also improved the signal-to-noise ratio(SNR)through innovative stacking techniques,such as time-frequency domain phaseweighted stacking(tf-PWS).We validated the program using multiple datasets,confirming its superior computation speed,improved reliability,and higher signal-to-noise ratios for NCFs.Our comprehensive study provides detailed insights into optimizing the computational processes for noise cross-correlation functions,thereby enhancing the precision and efficiency of ambient noise imaging.展开更多
Wide-area high-performance computing is widely used for large-scale parallel computing applications owing to its high computing and storage resources.However,the geographical distribution of computing and storage reso...Wide-area high-performance computing is widely used for large-scale parallel computing applications owing to its high computing and storage resources.However,the geographical distribution of computing and storage resources makes efficient task distribution and data placement more challenging.To achieve a higher system performance,this study proposes a two-level global collaborative scheduling strategy for wide-area high-performance computing environments.The collaborative scheduling strategy integrates lightweight solution selection,redundant data placement and task stealing mechanisms,optimizing task distribution and data placement to achieve efficient computing in wide-area environments.The experimental results indicate that compared with the state-of-the-art collaborative scheduling algorithm HPS+,the proposed scheduling strategy reduces the makespan by 23.24%,improves computing and storage resource utilization by 8.28%and 21.73%respectively,and achieves similar global data migration costs.展开更多
Brain science accelerates the study of intelligence and behavior,contributes fundamental insights into human cognition,and offers prospective treatments for brain disease.Faced with the challenges posed by imaging tec...Brain science accelerates the study of intelligence and behavior,contributes fundamental insights into human cognition,and offers prospective treatments for brain disease.Faced with the challenges posed by imaging technologies and deep learning computational models,big data and high-performance computing(HPC)play essential roles in studying brain function,brain diseases,and large-scale brain models or connectomes.We review the driving forces behind big data and HPC methods applied to brain science,including deep learning,powerful data analysis capabilities,and computational performance solutions,each of which can be used to improve diagnostic accuracy and research output.This work reinforces predictions that big data and HPC will continue to improve brain science by making ultrahigh-performance analysis possible,by improving data standardization and sharing,and by providing new neuromorphic insights.展开更多
The publication of Tsinghua Science and Technology was started in 1996.Since then,it has been an international academic journal sponsored by Tsinghua University and published bimonthly.This journal aims at presenting ...The publication of Tsinghua Science and Technology was started in 1996.Since then,it has been an international academic journal sponsored by Tsinghua University and published bimonthly.This journal aims at presenting the state-of-the-art scientific achievements in computer science and other IT fields.展开更多
This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models,pivotal for advancing high-performance computing(HPC).Emphasizing the transition of GPUs from g...This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models,pivotal for advancing high-performance computing(HPC).Emphasizing the transition of GPUs from graphic-centric processors to versatile computing units,it delves into the nuanced optimization of memory access,thread management,algorithmic design,and data structures.These optimizations are critical for exploiting the parallel processing capabilities of GPUs,addressingboth the theoretical frameworks and practical implementations.By integrating advanced strategies such as memory coalescing,dynamic scheduling,and parallel algorithmic transformations,this research aims to significantly elevate computational efficiency and throughput.The findings underscore the potential of optimized GPU programming to revolutionize computational tasks across various domains,highlighting a pathway towards achieving unparalleled processing power and efficiency in HPC environments.The paper not only contributes to the academic discourse on GPU optimization but also provides actionable insights for developers,fostering advancements in computational sciences and technology.展开更多
The increasing popularity of quantum computing has resulted in a considerable rise in demand for cloud quantum computing usage in recent years.Nevertheless,the rapid surge in demand for cloud-based quantum computing r...The increasing popularity of quantum computing has resulted in a considerable rise in demand for cloud quantum computing usage in recent years.Nevertheless,the rapid surge in demand for cloud-based quantum computing resources has led to a scarcity.In order to meet the needs of an increasing number of researchers,it is imperative to facilitate efficient and flexible access to computing resources in a cloud environment.In this paper,we propose a novel quantum computing paradigm,Virtual QPU(VQPU),which addresses this issue and enhances quantum cloud throughput with guaranteed circuit fidelity.The proposal introduces three innovative concepts:(1)The integration of virtualization technology into the field of quantum computing to enhance quantum cloud throughput.(2)The introduction of an asynchronous execution of circuits methodology to improve quantum computing flexibility.(3)The development of a virtual QPU allocation scheme for quantum tasks in a cloud environment to improve circuit fidelity.The concepts have been validated through the utilization of a self-built simulated quantum cloud platform.展开更多
Organic electrochemical transistor(OECT)devices demonstrate great promising potential for reservoir computing(RC)systems,but their lack of tunable dynamic characteristics limits their application in multi-temporal sca...Organic electrochemical transistor(OECT)devices demonstrate great promising potential for reservoir computing(RC)systems,but their lack of tunable dynamic characteristics limits their application in multi-temporal scale tasks.In this study,we report an OECT-based neuromorphic device with tunable relaxation time(τ)by introducing an additional vertical back-gate electrode into a planar structure.The dual-gate design enablesτreconfiguration from 93 to 541 ms.The tunable relaxation behaviors can be attributed to the combined effects of planar-gate induced electrochemical doping and back-gateinduced electrostatic coupling,as verified by electrochemical impedance spectroscopy analysis.Furthermore,we used theτ-tunable OECT devices as physical reservoirs in the RC system for intelligent driving trajectory prediction,achieving a significant improvement in prediction accuracy from below 69%to 99%.The results demonstrate that theτ-tunable OECT shows a promising candidate for multi-temporal scale neuromorphic computing applications.展开更多
The cloud-fog computing paradigm has emerged as a novel hybrid computing model that integrates computational resources at both fog nodes and cloud servers to address the challenges posed by dynamic and heterogeneous c...The cloud-fog computing paradigm has emerged as a novel hybrid computing model that integrates computational resources at both fog nodes and cloud servers to address the challenges posed by dynamic and heterogeneous computing networks.Finding an optimal computational resource for task offloading and then executing efficiently is a critical issue to achieve a trade-off between energy consumption and transmission delay.In this network,the task processed at fog nodes reduces transmission delay.Still,it increases energy consumption,while routing tasks to the cloud server saves energy at the cost of higher communication delay.Moreover,the order in which offloaded tasks are executed affects the system’s efficiency.For instance,executing lower-priority tasks before higher-priority jobs can disturb the reliability and stability of the system.Therefore,an efficient strategy of optimal computation offloading and task scheduling is required for operational efficacy.In this paper,we introduced a multi-objective and enhanced version of Cheeta Optimizer(CO),namely(MoECO),to jointly optimize the computation offloading and task scheduling in cloud-fog networks to minimize two competing objectives,i.e.,energy consumption and communication delay.MoECO first assigns tasks to the optimal computational nodes and then the allocated tasks are scheduled for processing based on the task priority.The mathematical modelling of CO needs improvement in computation time and convergence speed.Therefore,MoECO is proposed to increase the search capability of agents by controlling the search strategy based on a leader’s location.The adaptive step length operator is adjusted to diversify the solution and thus improves the exploration phase,i.e.,global search strategy.Consequently,this prevents the algorithm from getting trapped in the local optimal solution.Moreover,the interaction factor during the exploitation phase is also adjusted based on the location of the prey instead of the adjacent Cheetah.This increases the exploitation capability of agents,i.e.,local search capability.Furthermore,MoECO employs a multi-objective Pareto-optimal front to simultaneously minimize designated objectives.Comprehensive simulations in MATLAB demonstrate that the proposed algorithm obtains multiple solutions via a Pareto-optimal front and achieves an efficient trade-off between optimization objectives compared to baseline methods.展开更多
Developing a chiral material as versatile and universal chiral stationary phase(CSP) for chiral separation in diverse chromatographic techniques simultaneously is of great significance.In this study,we demonstrated fo...Developing a chiral material as versatile and universal chiral stationary phase(CSP) for chiral separation in diverse chromatographic techniques simultaneously is of great significance.In this study,we demonstrated for the first time that a chiral metal-organic cage(MOC),[Zn_(6)M_(4)],as a universal chiral recognition material for both multi-mode high-performance liquid chromatography(HPLC) and capillary gas chromatography(GC) enantioseparation.Two novel HPLC CSPs with different bonding arms(CSP-A with a cationic imidazolium bonding arm and CSP-B with an alkyl chain bonding arm) were prepared by clicking of functionalized chiral MOC [Zn_(6)M_(4)] onto thiolated silica via thiol-ene click chemistry.Meanwhile,a capillary GC column statically coated with the chiral MOC [Zn_(6)M_(4)] was also fabricated.The results showed that the chiral MOC exhibits excellent enantioselectivity not only in normal phase HPLC(NP-HPLC) and reversed phase(RP-HPLC) but also in GC,and various racemates were well separated,including alcohols,diols,esters,ketones,ethers,amines,and epoxides.Importantly,CSP-A and CSP-B are complementary to commercially available Chiralcel OD-H and Chiralpak AD-H columns in enantioseparation,which can separate some racemates that could not be or could not well be separated by the two widely used commercial columns,suggesting the great potential of the two prepared CSPs in enantioseparation.This work reveals that the chiral MOC is potential versatile chiral recognition materials for both HPLC and GC,and also paves the way to expand the potential applications of MOCs.展开更多
In recent years,fog computing has become an important environment for dealing with the Internet of Things.Fog computing was developed to handle large-scale big data by scheduling tasks via cloud computing.Task schedul...In recent years,fog computing has become an important environment for dealing with the Internet of Things.Fog computing was developed to handle large-scale big data by scheduling tasks via cloud computing.Task scheduling is crucial for efficiently handling IoT user requests,thereby improving system performance,cost,and energy consumption across nodes in cloud computing.With the large amount of data and user requests,achieving the optimal solution to the task scheduling problem is challenging,particularly in terms of cost and energy efficiency.In this paper,we develop novel strategies to save energy consumption across nodes in fog computing when users execute tasks through the least-cost paths.Task scheduling is developed using modified artificial ecosystem optimization(AEO),combined with negative swarm operators,Salp Swarm Algorithm(SSA),in order to competitively optimize their capabilities during the exploitation phase of the optimal search process.In addition,the proposed strategy,Enhancement Artificial Ecosystem Optimization Salp Swarm Algorithm(EAEOSSA),attempts to find the most suitable solution.The optimization that combines cost and energy for multi-objective task scheduling optimization problems.The backpack problem is also added to improve both cost and energy in the iFogSim implementation as well.A comparison was made between the proposed strategy and other strategies in terms of time,cost,energy,and productivity.Experimental results showed that the proposed strategy improved energy consumption,cost,and time over other algorithms.Simulation results demonstrate that the proposed algorithm increases the average cost,average energy consumption,and mean service time in most scenarios,with average reductions of up to 21.15%in cost and 25.8%in energy consumption.展开更多
As emerging two-dimensional(2D)materials,carbides and nitrides(MXenes)could be solid solutions or organized structures made up of multi-atomic layers.With remarkable and adjustable electrical,optical,mechanical,and el...As emerging two-dimensional(2D)materials,carbides and nitrides(MXenes)could be solid solutions or organized structures made up of multi-atomic layers.With remarkable and adjustable electrical,optical,mechanical,and electrochemical characteristics,MXenes have shown great potential in brain-inspired neuromorphic computing electronics,including neuromorphic gas sensors,pressure sensors and photodetectors.This paper provides a forward-looking review of the research progress regarding MXenes in the neuromorphic sensing domain and discussed the critical challenges that need to be resolved.Key bottlenecks such as insufficient long-term stability under environmental exposure,high costs,scalability limitations in large-scale production,and mechanical mismatch in wearable integration hinder their practical deployment.Furthermore,unresolved issues like interfacial compatibility in heterostructures and energy inefficiency in neu-romorphic signal conversion demand urgent attention.The review offers insights into future research directions enhance the fundamental understanding of MXene properties and promote further integration into neuromorphic computing applications through the convergence with various emerging technologies.展开更多
The advancement of flexible memristors has significantly promoted the development of wearable electronic for emerging neuromorphic computing applications.Inspired by in-memory computing architecture of human brain,fle...The advancement of flexible memristors has significantly promoted the development of wearable electronic for emerging neuromorphic computing applications.Inspired by in-memory computing architecture of human brain,flexible memristors exhibit great application potential in emulating artificial synapses for highefficiency and low power consumption neuromorphic computing.This paper provides comprehensive overview of flexible memristors from perspectives of development history,material system,device structure,mechanical deformation method,device performance analysis,stress simulation during deformation,and neuromorphic computing applications.The recent advances in flexible electronics are summarized,including single device,device array and integration.The challenges and future perspectives of flexible memristor for neuromorphic computing are discussed deeply,paving the way for constructing wearable smart electronics and applications in large-scale neuromorphic computing and high-order intelligent robotics.展开更多
High-entropy oxides(HEOs)have emerged as a promising class of memristive materials,characterized by entropy-stabilized crystal structures,multivalent cation coordination,and tunable defect landscapes.These intrinsic f...High-entropy oxides(HEOs)have emerged as a promising class of memristive materials,characterized by entropy-stabilized crystal structures,multivalent cation coordination,and tunable defect landscapes.These intrinsic features enable forming-free resistive switching,multilevel conductance modulation,and synaptic plasticity,making HEOs attractive for neuromorphic computing.This review outlines recent progress in HEO-based memristors across materials engineering,switching mechanisms,and synaptic emulation.Particular attention is given to vacancy migration,phase transitions,and valence-state dynamics—mechanisms that underlie the switching behaviors observed in both amorphous and crystalline systems.Their relevance to neuromorphic functions such as short-term plasticity and spike-timing-dependent learning is also examined.While encouraging results have been achieved at the device level,challenges remain in conductance precision,variability control,and scalable integration.Addressing these demands a concerted effort across materials design,interface optimization,and task-aware modeling.With such integration,HEO memristors offer a compelling pathway toward energy-efficient and adaptable brain-inspired electronics.展开更多
Neuromorphic devices have garnered significant attention as potential building blocks for energy-efficient hardware systems owing to their capacity to emulate the computational efficiency of the brain.In this regard,r...Neuromorphic devices have garnered significant attention as potential building blocks for energy-efficient hardware systems owing to their capacity to emulate the computational efficiency of the brain.In this regard,reservoir computing(RC)framework,which leverages straightforward training methods and efficient temporal signal processing,has emerged as a promising scheme.While various physical reservoir devices,including ferroelectric,optoelectronic,and memristor-based systems,have been demonstrated,many still face challenges related to compatibility with mainstream complementary metal oxide semiconductor(CMOS)integration processes.This study introduced a silicon-based schottky barrier metal-oxide-semiconductor field effect transistor(SB-MOSFET),which was fabricated under low thermal budget and compatible with back-end-of-line(BEOL).The device demonstrated short-term memory characteristics,facilitated by the modulation of schottky barriers and charge trapping.Utilizing these characteristics,a RC system for temporal data processing was constructed,and its performance was validated in a 5×4 digital classification task,achieving an accuracy exceeding 98%after 50 training epochs.Furthermore,the system successfully processed temporal signal in waveform classification and prediction tasks using time-division multiplexing.Overall,the SB-MOSFET's high compatibility with CMOS technology provides substantial advantages for large-scale integration,enabling the development of energy-efficient reservoir computing hardware.展开更多
Nowadays,advances in communication technology and cloud computing have spawned a variety of smart mobile devices,which will generate a great amount of computing-intensive businesses,and require corresponding resources...Nowadays,advances in communication technology and cloud computing have spawned a variety of smart mobile devices,which will generate a great amount of computing-intensive businesses,and require corresponding resources of computation and communication.Multiaccess edge computing(MEC)can offload computing-intensive tasks to the nearby edge servers,which alleviates the pressure of devices.Ultra-dense network(UDN)can provide effective spectrum resources by deploying a large number of micro base stations.Furthermore,network slicing can support various applications in different communication scenarios.Therefore,this paper integrates the ultra-dense network slicing and the MEC technology,and introduces a hybrid computing offloading strategy in order to satisfy various quality of service(QoS)of edge devices.In order to dynamically allocate limited resources,the above problem is formulated as multiagent distributed deep reinforcement learning(DRL),which will achieve low overhead computation offloading strategy and real-time resource allocation decisions.In this context,federated learning is added to train DRL agents in a distributed manner,where each agent is dedicated to exploring actions composed of offloading decisions and allocating resources,so as to jointly optimize system delay and energy consumption.Simulation results show that the proposed learning algorithm has better performance compared with other strategies in literature.展开更多
This study proposes a lightweight rice disease detection model optimized for edge computing environments.The goal is to enhance the You Only Look Once(YOLO)v5 architecture to achieve a balance between real-time diagno...This study proposes a lightweight rice disease detection model optimized for edge computing environments.The goal is to enhance the You Only Look Once(YOLO)v5 architecture to achieve a balance between real-time diagnostic performance and computational efficiency.To this end,a total of 3234 high-resolution images(2400×1080)were collected from three major rice diseases Rice Blast,Bacterial Blight,and Brown Spot—frequently found in actual rice cultivation fields.These images served as the training dataset.The proposed YOLOv5-V2 model removes the Focus layer from the original YOLOv5s and integrates ShuffleNet V2 into the backbone,thereby resulting in both model compression and improved inference speed.Additionally,YOLOv5-P,based on PP-PicoDet,was configured as a comparative model to quantitatively evaluate performance.Experimental results demonstrated that YOLOv5-V2 achieved excellent detection performance,with an mAP 0.5 of 89.6%,mAP 0.5–0.95 of 66.7%,precision of 91.3%,and recall of 85.6%,while maintaining a lightweight model size of 6.45 MB.In contrast,YOLOv5-P exhibited a smaller model size of 4.03 MB,but showed lower performance with an mAP 0.5 of 70.3%,mAP 0.5–0.95 of 35.2%,precision of 62.3%,and recall of 74.1%.This study lays a technical foundation for the implementation of smart agriculture and real-time disease diagnosis systems by proposing a model that satisfies both accuracy and lightweight requirements.展开更多
文摘Milling Process Simulation is one of the important re search areas in manufacturing science. For the purpose of improving the prec ision of simulation and extending its usability, numerical algorithm is more and more used in the milling modeling areas. But simulative efficiency is decreasin g with increase of its complexity. As a result, application of the method is lim ited. Aimed at above question, high-efficient algorithm for milling process sim ulation is studied. It is important for milling process simulation’s applicatio n. Parallel computing is widely used to solve the large-scale computation question s. Its advantages include system flexibility, robust, high-efficient computing capability and high ratio of performance to price. With the development of compu ter network, utilizing the computing resource in the Internet, a virtual computi ng environment with powerful computing capability can be consisted by microc omputers, and the difficulty of building hardware environment which is used to s upport parallel computing is reduced. How to use network technology and parallel algorithm to improve simulative effic iency for milling forces simulation is investigated in the paper. In order to pr edict milling forces, a simplified local milling forces model is used in the pap er. End milling cutter is assumed to be divided by r number of differential elem ents along the axial direction of the cutter. For a given time, the total cuttin g forces can be obtained by summarizing the resultant cutting force produced by each differential cutter disc. Divide the whole simulative time into some segmen ts, send these program’s segments to microcomputers in the Internet and obtain the result of the program’s segments, all of the result of program’s segments a re composed the final result. For implementing the algorithm, a distributed Parallel computing framework is de signed in the paper. In the framework, web server plays a role of controller. Us ing Java RMI(remote method interface), the computing processes in computing serv er are called by web server. There are lots of control processes in web server a nd control the computing servers. The codes of simulative algorithm can be dynam ic sent to the computing servers, and milling forces at the different time are c omputed through utilizing the local computer’s resource. The results that are ca lculated by every computing servers are sent to the web server, and composed the final result. The framework can be used by different simulative algorithm. Comp ared with the algorithm running single machine, the efficiency of provided algor ithm is higher than that of single machine.
基金National Natural Science Foundation of China,Grant/Award Number:51979187。
文摘Parallel computing assigns the computing model to different processors on different devices and implements it simultaneously.Accordingly,it has broad applications in the numerical simulation of geotechnical engineering and underground engineering,of which models are always large-scale.With parallel computing,the computing time or the memory requirements will be reduced by splitting the original domain of the numerical model into many subdomains,which is thus named as the domain decomposition method.In this study,a cubic and equal volume domain decomposition strategy was utilized to realize the parallel computing on the distributed memory system of four-dimensional lattice spring model(4D-LSM)based on the message passing interface.With a more efficient communication strategy introduced,this study aimed at operating an one-billion-particle model on a supercomputer platform.The preprocessing procedure of the parallelized 4D-LSM was restructured and the particle generation strategy suitable for the supercomputer platform was employed to minimize the time consumption in preprocessing and calculation.On this basis,numerical calculations were performed on TianHe-3 prototype E class supercomputer at the National Supercomputer Center in Tianjin.Two fieldscale three-dimensional blasting wave propagation models were carried out,of which the numerical results verify the computing power and the advantage of the parallelized 4D-LSM in the simulation of large-scale three-dimension models.Subsequently,the time complexity and spatial complexity of 4D-LSM and other particle discrete element methods were analyzed.
文摘High-performance computing(HPC)refers to the ability to process data and perform complex calculations at high speeds.It is one of the most essential tools fueling the advancement of science and technology.
基金supported by the Key Research and Development Program of China(2021YFC3000704)Institute of Geophysics,China Earthquake Administration Grant DQJB23R18+1 种基金the USTC Research Funds of the Double First-Class Initiative(YD2080002012)NSFC Grant(U2239206)。
文摘Ambient noise tomography is an established technique in seismology,where calculating single-or ninecomponent noise cross-correlation functions(NCFs)is a fundamental first step.In this study,we introduced a novel CPU-GPU heterogeneous computing framework designed to significantly enhance the efficiency of computing 9-component NCFs from seismic ambient noise data.This framework not only accelerated the computational process by leveraging the Compute Unified Device Architecture(CUDA)but also improved the signal-to-noise ratio(SNR)through innovative stacking techniques,such as time-frequency domain phaseweighted stacking(tf-PWS).We validated the program using multiple datasets,confirming its superior computation speed,improved reliability,and higher signal-to-noise ratios for NCFs.Our comprehensive study provides detailed insights into optimizing the computational processes for noise cross-correlation functions,thereby enhancing the precision and efficiency of ambient noise imaging.
基金This work was supported by the National key R&D Program of China(2018YFB0203901)the National Natural Science Foundation of China under(Grant No.61772053)the fund of the State Key Laboratory of Software Development Environment(SKLSDE-2020ZX15).
文摘Wide-area high-performance computing is widely used for large-scale parallel computing applications owing to its high computing and storage resources.However,the geographical distribution of computing and storage resources makes efficient task distribution and data placement more challenging.To achieve a higher system performance,this study proposes a two-level global collaborative scheduling strategy for wide-area high-performance computing environments.The collaborative scheduling strategy integrates lightweight solution selection,redundant data placement and task stealing mechanisms,optimizing task distribution and data placement to achieve efficient computing in wide-area environments.The experimental results indicate that compared with the state-of-the-art collaborative scheduling algorithm HPS+,the proposed scheduling strategy reduces the makespan by 23.24%,improves computing and storage resource utilization by 8.28%and 21.73%respectively,and achieves similar global data migration costs.
基金supported by the National Natural Science Foundation of China(Grant No.31771466)the National Key R&D Program of China(Grant Nos.2018YFB0203903,2016YFC0503607,and 2016YFB0200300)+3 种基金the Transformation Project in Scientific and Technological Achievements of Qinghai,China(Grant No.2016-SF-127)the Special Project of Informatization of Chinese Academy of Sciences,China(Grant No.XXH13504-08)the Strategic Pilot Science and Technology Project of Chinese Academy of Sciences,China(Grant No.XDA12010000)the 100-Talents Program of Chinese Academy of Sciences,China(awarded to BN)
文摘Brain science accelerates the study of intelligence and behavior,contributes fundamental insights into human cognition,and offers prospective treatments for brain disease.Faced with the challenges posed by imaging technologies and deep learning computational models,big data and high-performance computing(HPC)play essential roles in studying brain function,brain diseases,and large-scale brain models or connectomes.We review the driving forces behind big data and HPC methods applied to brain science,including deep learning,powerful data analysis capabilities,and computational performance solutions,each of which can be used to improve diagnostic accuracy and research output.This work reinforces predictions that big data and HPC will continue to improve brain science by making ultrahigh-performance analysis possible,by improving data standardization and sharing,and by providing new neuromorphic insights.
文摘The publication of Tsinghua Science and Technology was started in 1996.Since then,it has been an international academic journal sponsored by Tsinghua University and published bimonthly.This journal aims at presenting the state-of-the-art scientific achievements in computer science and other IT fields.
文摘This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models,pivotal for advancing high-performance computing(HPC).Emphasizing the transition of GPUs from graphic-centric processors to versatile computing units,it delves into the nuanced optimization of memory access,thread management,algorithmic design,and data structures.These optimizations are critical for exploiting the parallel processing capabilities of GPUs,addressingboth the theoretical frameworks and practical implementations.By integrating advanced strategies such as memory coalescing,dynamic scheduling,and parallel algorithmic transformations,this research aims to significantly elevate computational efficiency and throughput.The findings underscore the potential of optimized GPU programming to revolutionize computational tasks across various domains,highlighting a pathway towards achieving unparalleled processing power and efficiency in HPC environments.The paper not only contributes to the academic discourse on GPU optimization but also provides actionable insights for developers,fostering advancements in computational sciences and technology.
文摘The increasing popularity of quantum computing has resulted in a considerable rise in demand for cloud quantum computing usage in recent years.Nevertheless,the rapid surge in demand for cloud-based quantum computing resources has led to a scarcity.In order to meet the needs of an increasing number of researchers,it is imperative to facilitate efficient and flexible access to computing resources in a cloud environment.In this paper,we propose a novel quantum computing paradigm,Virtual QPU(VQPU),which addresses this issue and enhances quantum cloud throughput with guaranteed circuit fidelity.The proposal introduces three innovative concepts:(1)The integration of virtualization technology into the field of quantum computing to enhance quantum cloud throughput.(2)The introduction of an asynchronous execution of circuits methodology to improve quantum computing flexibility.(3)The development of a virtual QPU allocation scheme for quantum tasks in a cloud environment to improve circuit fidelity.The concepts have been validated through the utilization of a self-built simulated quantum cloud platform.
基金supported by the National Key Research and Development Program of China under Grant 2022YFB3608300in part by the National Nature Science Foundation of China(NSFC)under Grants 62404050,U2341218,62574056,62204052。
文摘Organic electrochemical transistor(OECT)devices demonstrate great promising potential for reservoir computing(RC)systems,but their lack of tunable dynamic characteristics limits their application in multi-temporal scale tasks.In this study,we report an OECT-based neuromorphic device with tunable relaxation time(τ)by introducing an additional vertical back-gate electrode into a planar structure.The dual-gate design enablesτreconfiguration from 93 to 541 ms.The tunable relaxation behaviors can be attributed to the combined effects of planar-gate induced electrochemical doping and back-gateinduced electrostatic coupling,as verified by electrochemical impedance spectroscopy analysis.Furthermore,we used theτ-tunable OECT devices as physical reservoirs in the RC system for intelligent driving trajectory prediction,achieving a significant improvement in prediction accuracy from below 69%to 99%.The results demonstrate that theτ-tunable OECT shows a promising candidate for multi-temporal scale neuromorphic computing applications.
基金appreciation to the Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2025R384)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘The cloud-fog computing paradigm has emerged as a novel hybrid computing model that integrates computational resources at both fog nodes and cloud servers to address the challenges posed by dynamic and heterogeneous computing networks.Finding an optimal computational resource for task offloading and then executing efficiently is a critical issue to achieve a trade-off between energy consumption and transmission delay.In this network,the task processed at fog nodes reduces transmission delay.Still,it increases energy consumption,while routing tasks to the cloud server saves energy at the cost of higher communication delay.Moreover,the order in which offloaded tasks are executed affects the system’s efficiency.For instance,executing lower-priority tasks before higher-priority jobs can disturb the reliability and stability of the system.Therefore,an efficient strategy of optimal computation offloading and task scheduling is required for operational efficacy.In this paper,we introduced a multi-objective and enhanced version of Cheeta Optimizer(CO),namely(MoECO),to jointly optimize the computation offloading and task scheduling in cloud-fog networks to minimize two competing objectives,i.e.,energy consumption and communication delay.MoECO first assigns tasks to the optimal computational nodes and then the allocated tasks are scheduled for processing based on the task priority.The mathematical modelling of CO needs improvement in computation time and convergence speed.Therefore,MoECO is proposed to increase the search capability of agents by controlling the search strategy based on a leader’s location.The adaptive step length operator is adjusted to diversify the solution and thus improves the exploration phase,i.e.,global search strategy.Consequently,this prevents the algorithm from getting trapped in the local optimal solution.Moreover,the interaction factor during the exploitation phase is also adjusted based on the location of the prey instead of the adjacent Cheetah.This increases the exploitation capability of agents,i.e.,local search capability.Furthermore,MoECO employs a multi-objective Pareto-optimal front to simultaneously minimize designated objectives.Comprehensive simulations in MATLAB demonstrate that the proposed algorithm obtains multiple solutions via a Pareto-optimal front and achieves an efficient trade-off between optimization objectives compared to baseline methods.
基金supported by the National Natural Science Foundation of China (Nos.22064020,22364022,and 22174125)the Applied Basic Research Foundation of Yunnan Province (Nos.202101AT070101 and 202201AT070029)。
文摘Developing a chiral material as versatile and universal chiral stationary phase(CSP) for chiral separation in diverse chromatographic techniques simultaneously is of great significance.In this study,we demonstrated for the first time that a chiral metal-organic cage(MOC),[Zn_(6)M_(4)],as a universal chiral recognition material for both multi-mode high-performance liquid chromatography(HPLC) and capillary gas chromatography(GC) enantioseparation.Two novel HPLC CSPs with different bonding arms(CSP-A with a cationic imidazolium bonding arm and CSP-B with an alkyl chain bonding arm) were prepared by clicking of functionalized chiral MOC [Zn_(6)M_(4)] onto thiolated silica via thiol-ene click chemistry.Meanwhile,a capillary GC column statically coated with the chiral MOC [Zn_(6)M_(4)] was also fabricated.The results showed that the chiral MOC exhibits excellent enantioselectivity not only in normal phase HPLC(NP-HPLC) and reversed phase(RP-HPLC) but also in GC,and various racemates were well separated,including alcohols,diols,esters,ketones,ethers,amines,and epoxides.Importantly,CSP-A and CSP-B are complementary to commercially available Chiralcel OD-H and Chiralpak AD-H columns in enantioseparation,which can separate some racemates that could not be or could not well be separated by the two widely used commercial columns,suggesting the great potential of the two prepared CSPs in enantioseparation.This work reveals that the chiral MOC is potential versatile chiral recognition materials for both HPLC and GC,and also paves the way to expand the potential applications of MOCs.
基金supported and funded by theDeanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University(IMSIU)(grant number IMSIU-DDRSP2503).
文摘In recent years,fog computing has become an important environment for dealing with the Internet of Things.Fog computing was developed to handle large-scale big data by scheduling tasks via cloud computing.Task scheduling is crucial for efficiently handling IoT user requests,thereby improving system performance,cost,and energy consumption across nodes in cloud computing.With the large amount of data and user requests,achieving the optimal solution to the task scheduling problem is challenging,particularly in terms of cost and energy efficiency.In this paper,we develop novel strategies to save energy consumption across nodes in fog computing when users execute tasks through the least-cost paths.Task scheduling is developed using modified artificial ecosystem optimization(AEO),combined with negative swarm operators,Salp Swarm Algorithm(SSA),in order to competitively optimize their capabilities during the exploitation phase of the optimal search process.In addition,the proposed strategy,Enhancement Artificial Ecosystem Optimization Salp Swarm Algorithm(EAEOSSA),attempts to find the most suitable solution.The optimization that combines cost and energy for multi-objective task scheduling optimization problems.The backpack problem is also added to improve both cost and energy in the iFogSim implementation as well.A comparison was made between the proposed strategy and other strategies in terms of time,cost,energy,and productivity.Experimental results showed that the proposed strategy improved energy consumption,cost,and time over other algorithms.Simulation results demonstrate that the proposed algorithm increases the average cost,average energy consumption,and mean service time in most scenarios,with average reductions of up to 21.15%in cost and 25.8%in energy consumption.
基金supported by the NSFC(12474071)Natural Science Foundation of Shandong Province(ZR2024YQ051,ZR2025QB50)+6 种基金Guangdong Basic and Applied Basic Research Foundation(2025A1515011191)the Shanghai Sailing Program(23YF1402200,23YF1402400)funded by Basic Research Program of Jiangsu(BK20240424)Open Research Fund of State Key Laboratory of Crystal Materials(KF2406)Taishan Scholar Foundation of Shandong Province(tsqn202408006,tsqn202507058)Young Talent of Lifting engineering for Science and Technology in Shandong,China(SDAST2024QTB002)the Qilu Young Scholar Program of Shandong University。
文摘As emerging two-dimensional(2D)materials,carbides and nitrides(MXenes)could be solid solutions or organized structures made up of multi-atomic layers.With remarkable and adjustable electrical,optical,mechanical,and electrochemical characteristics,MXenes have shown great potential in brain-inspired neuromorphic computing electronics,including neuromorphic gas sensors,pressure sensors and photodetectors.This paper provides a forward-looking review of the research progress regarding MXenes in the neuromorphic sensing domain and discussed the critical challenges that need to be resolved.Key bottlenecks such as insufficient long-term stability under environmental exposure,high costs,scalability limitations in large-scale production,and mechanical mismatch in wearable integration hinder their practical deployment.Furthermore,unresolved issues like interfacial compatibility in heterostructures and energy inefficiency in neu-romorphic signal conversion demand urgent attention.The review offers insights into future research directions enhance the fundamental understanding of MXene properties and promote further integration into neuromorphic computing applications through the convergence with various emerging technologies.
基金supported by the NSFC(12474071)Natural Science Foundation of Shandong Province(ZR2024YQ051)+5 种基金Open Research Fund of State Key Laboratory of Materials for Integrated Circuits(SKLJC-K2024-12)the Shanghai Sailing Program(23YF1402200,23YF1402400)Natural Science Foundation of Jiangsu Province(BK20240424)Taishan Scholar Foundation of Shandong Province(tsqn202408006)Young Talent of Lifting engineering for Science and Technology in Shandong,China(SDAST2024QTB002)the Qilu Young Scholar Program of Shandong University.
文摘The advancement of flexible memristors has significantly promoted the development of wearable electronic for emerging neuromorphic computing applications.Inspired by in-memory computing architecture of human brain,flexible memristors exhibit great application potential in emulating artificial synapses for highefficiency and low power consumption neuromorphic computing.This paper provides comprehensive overview of flexible memristors from perspectives of development history,material system,device structure,mechanical deformation method,device performance analysis,stress simulation during deformation,and neuromorphic computing applications.The recent advances in flexible electronics are summarized,including single device,device array and integration.The challenges and future perspectives of flexible memristor for neuromorphic computing are discussed deeply,paving the way for constructing wearable smart electronics and applications in large-scale neuromorphic computing and high-order intelligent robotics.
基金financially supported by the National Natural Science Foundation of China(Grant No.12172093)the Guangdong Basic and Applied Basic Research Foundation(Grant No.2021A1515012607)。
文摘High-entropy oxides(HEOs)have emerged as a promising class of memristive materials,characterized by entropy-stabilized crystal structures,multivalent cation coordination,and tunable defect landscapes.These intrinsic features enable forming-free resistive switching,multilevel conductance modulation,and synaptic plasticity,making HEOs attractive for neuromorphic computing.This review outlines recent progress in HEO-based memristors across materials engineering,switching mechanisms,and synaptic emulation.Particular attention is given to vacancy migration,phase transitions,and valence-state dynamics—mechanisms that underlie the switching behaviors observed in both amorphous and crystalline systems.Their relevance to neuromorphic functions such as short-term plasticity and spike-timing-dependent learning is also examined.While encouraging results have been achieved at the device level,challenges remain in conductance precision,variability control,and scalable integration.Addressing these demands a concerted effort across materials design,interface optimization,and task-aware modeling.With such integration,HEO memristors offer a compelling pathway toward energy-efficient and adaptable brain-inspired electronics.
基金supported in part by the Chinese Academy of Sciences(No.XDA0330302)NSFC program(No.22127901)。
文摘Neuromorphic devices have garnered significant attention as potential building blocks for energy-efficient hardware systems owing to their capacity to emulate the computational efficiency of the brain.In this regard,reservoir computing(RC)framework,which leverages straightforward training methods and efficient temporal signal processing,has emerged as a promising scheme.While various physical reservoir devices,including ferroelectric,optoelectronic,and memristor-based systems,have been demonstrated,many still face challenges related to compatibility with mainstream complementary metal oxide semiconductor(CMOS)integration processes.This study introduced a silicon-based schottky barrier metal-oxide-semiconductor field effect transistor(SB-MOSFET),which was fabricated under low thermal budget and compatible with back-end-of-line(BEOL).The device demonstrated short-term memory characteristics,facilitated by the modulation of schottky barriers and charge trapping.Utilizing these characteristics,a RC system for temporal data processing was constructed,and its performance was validated in a 5×4 digital classification task,achieving an accuracy exceeding 98%after 50 training epochs.Furthermore,the system successfully processed temporal signal in waveform classification and prediction tasks using time-division multiplexing.Overall,the SB-MOSFET's high compatibility with CMOS technology provides substantial advantages for large-scale integration,enabling the development of energy-efficient reservoir computing hardware.
文摘Nowadays,advances in communication technology and cloud computing have spawned a variety of smart mobile devices,which will generate a great amount of computing-intensive businesses,and require corresponding resources of computation and communication.Multiaccess edge computing(MEC)can offload computing-intensive tasks to the nearby edge servers,which alleviates the pressure of devices.Ultra-dense network(UDN)can provide effective spectrum resources by deploying a large number of micro base stations.Furthermore,network slicing can support various applications in different communication scenarios.Therefore,this paper integrates the ultra-dense network slicing and the MEC technology,and introduces a hybrid computing offloading strategy in order to satisfy various quality of service(QoS)of edge devices.In order to dynamically allocate limited resources,the above problem is formulated as multiagent distributed deep reinforcement learning(DRL),which will achieve low overhead computation offloading strategy and real-time resource allocation decisions.In this context,federated learning is added to train DRL agents in a distributed manner,where each agent is dedicated to exploring actions composed of offloading decisions and allocating resources,so as to jointly optimize system delay and energy consumption.Simulation results show that the proposed learning algorithm has better performance compared with other strategies in literature.
文摘This study proposes a lightweight rice disease detection model optimized for edge computing environments.The goal is to enhance the You Only Look Once(YOLO)v5 architecture to achieve a balance between real-time diagnostic performance and computational efficiency.To this end,a total of 3234 high-resolution images(2400×1080)were collected from three major rice diseases Rice Blast,Bacterial Blight,and Brown Spot—frequently found in actual rice cultivation fields.These images served as the training dataset.The proposed YOLOv5-V2 model removes the Focus layer from the original YOLOv5s and integrates ShuffleNet V2 into the backbone,thereby resulting in both model compression and improved inference speed.Additionally,YOLOv5-P,based on PP-PicoDet,was configured as a comparative model to quantitatively evaluate performance.Experimental results demonstrated that YOLOv5-V2 achieved excellent detection performance,with an mAP 0.5 of 89.6%,mAP 0.5–0.95 of 66.7%,precision of 91.3%,and recall of 85.6%,while maintaining a lightweight model size of 6.45 MB.In contrast,YOLOv5-P exhibited a smaller model size of 4.03 MB,but showed lower performance with an mAP 0.5 of 70.3%,mAP 0.5–0.95 of 35.2%,precision of 62.3%,and recall of 74.1%.This study lays a technical foundation for the implementation of smart agriculture and real-time disease diagnosis systems by proposing a model that satisfies both accuracy and lightweight requirements.