633 articles found
Efficient rock joint detection from large-scale 3D point clouds using vectorization and parallel computing approaches
1
Authors: Yunfeng Ge, Zihao Li, Huiming Tang, Qian Chen, Zhongxu Wen. Geoscience Frontiers, 2025, No. 5, pp. 1-15 (15 pages)
The application of three-dimensional (3D) point cloud parametric analyses on exposed rock surfaces, enabled by Light Detection and Ranging (LiDAR) technology, has gained significant popularity due to its efficiency and the high quality of data it provides. However, as research extends to address more regional and complex geological challenges, the demand for algorithms that are both robust and highly efficient in processing large datasets continues to grow. This study proposes an advanced rock joint identification algorithm leveraging artificial neural networks (ANNs), incorporating parallel computing and vectorization of high-performance computing. The algorithm utilizes point cloud attributes, specifically point normals and point curvatures, as input parameters for ANNs, which classify data into rock joints and non-rock joints. Subsequently, individual rock joints are extracted using the density-based spatial clustering of applications with noise (DBSCAN) technique. Principal component analysis (PCA) is then employed to calculate their orientations. By fully utilizing the computational power of parallel computing and vectorization, the algorithm increases the running speed by 3 to 4 times, enabling the processing of large-scale datasets within seconds. This breakthrough maximizes computational efficiency while maintaining high accuracy (compared with manual measurement, the deviation of the automatic measurement is within 2°), making it an effective solution for large-scale rock joint detection challenges. ©2025 China University of Geosciences (Beijing) and Peking University.
Keywords: rock joints; point clouds; artificial neural network; high-performance computing; parallel computing; vectorization
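The PCA step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the power-iteration shortcut, the starting vector, and the coordinate convention (x = east, y = north, z = up) are all assumptions made for illustration. The idea is that the eigenvector of the covariance matrix with the smallest eigenvalue approximates the normal of a roughly planar joint, from which dip and dip direction follow.

```python
import math

def plane_normal(points, iters=200):
    """Estimate the unit normal of a roughly planar 3D point set via PCA."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    # 3x3 covariance matrix of the centered points
    cov = [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in points) / n
            for j in range(3)] for i in range(3)]
    tr = cov[0][0] + cov[1][1] + cov[2][2]
    # Power iteration on (tr*I - cov): its dominant eigenvector is the
    # eigenvector of cov with the smallest eigenvalue, i.e. the plane normal.
    m = [[(tr if i == j else 0.0) - cov[i][j] for j in range(3)]
         for i in range(3)]
    v = [0.3, 0.5, 0.8]  # arbitrary start vector (assumption)
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    if v[2] < 0:  # orient the normal upward
        v = [-x for x in v]
    return v

def dip_and_direction(normal):
    """Dip and dip direction in degrees, assuming x = east, y = north, z = up."""
    nx, ny, nz = normal
    dip = math.degrees(math.acos(max(-1.0, min(1.0, nz))))
    dip_dir = math.degrees(math.atan2(nx, ny)) % 360.0  # clockwise from north
    return dip, dip_dir

# Synthetic joint patch: the plane z = -x, which dips 45 degrees toward +x (east).
patch = [(float(x), float(y), -float(x)) for x in range(5) for y in range(5)]
dip, dip_dir = dip_and_direction(plane_normal(patch))
print(round(dip, 3), round(dip_dir, 3))
```

For the synthetic patch the result should come out near a 45° dip toward 90° (east). The sketch fails on degenerate input (all points identical) and can stall if the start vector happens to be orthogonal to the true normal; a production version would use a library eigen-solver instead.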
Analog Optical Computing for Artificial Intelligence (cited by: 13)
2
Authors: Jiamin Wu, Xing Lin, Yuchen Guo, Junwei Liu, Lu Fang, Shuming Jiao, Qionghai Dai. Engineering (SCIE, EI), 2022, No. 3, pp. 133-145 (13 pages)
The rapid development of artificial intelligence (AI) facilitates various applications from all areas but also poses great challenges in its hardware implementation in terms of speed and energy because of the explosive growth of data. Optical computing provides a distinctive perspective to address this bottleneck by harnessing the unique properties of photons, including broad bandwidth, low latency, and high energy efficiency. In this review, we introduce the latest developments of optical computing for different AI models, including feedforward neural networks, reservoir computing, and spiking neural networks (SNNs). Recent progress in integrated photonic devices, combined with the rise of AI, provides a great opportunity for the renaissance of optical computing in practical applications. This effort requires multidisciplinary contributions from a broad community. This review provides an overview of the state-of-the-art accomplishments in recent years, discusses the availability of current technologies, and points out various remaining challenges in different aspects to push the frontier. We anticipate that the era of large-scale integrated photonics processors will soon arrive for practical AI applications in the form of hybrid optoelectronic frameworks.
Keywords: artificial intelligence; optical computing; opto-electronic framework; neural network; neuromorphic computing; reservoir computing; photonics processor
MatDEM-fast matrix computing of the discrete element method (cited by: 8)
3
Authors: Chun Liu, Hui Liu, Hongyong Zhang. Earthquake Research Advances (CSCD), 2021, No. 3, pp. 1-7 (7 pages)
The discrete element method can effectively simulate the discontinuity, inhomogeneity, large deformation, and failure of rock and soil. Based on innovative matrix computing of the discrete element method, the high-performance discrete element software MatDEM can handle millions of elements on one computer and enables discrete element simulation at the engineering scale. It supports heat calculation, multi-field, and fluid-solid coupling numerical simulations. Furthermore, the software integrates pre-processing, a solver, post-processing, and powerful secondary development, allowing new discrete element software to be recompiled. The basic principles of the DEM, the implementation and development of the MatDEM software, and its applications are introduced in this paper. The software and sample source code are available online (http://matdem.com).
Keywords: discrete element method; high-performance; MatDEM; matrix computing
Optimization Task Scheduling Using Cooperation Search Algorithm for Heterogeneous Cloud Computing Systems (cited by: 2)
4
Authors: Ahmed Y. Hamed, M. Kh. Elnahary, Faisal S. Alsubaei, Hamdy H. El-Sayed. Computers, Materials & Continua (SCIE, EI), 2023, No. 1, pp. 2133-2148 (16 pages)
Cloud computing has taken over the high-performance distributed computing area, and it currently provides on-demand services and resource pooling over the web. As a result of constantly changing user service demand, the task scheduling problem has emerged as a critical analytical topic in cloud computing. The primary goal of scheduling tasks is to distribute tasks to available processors to construct the shortest possible schedule without breaching precedence restrictions. Assignments and schedules of tasks substantially influence system operation in a heterogeneous multiprocessor system. The diverse processes inside the heuristic-based task scheduling method will result in varying makespan in the heterogeneous computing system. As a result, an intelligent scheduling algorithm should efficiently determine the priority of every subtask based on the resources necessary to lower the makespan. This research introduces a novel, efficient task scheduling method for cloud computing systems, based on the cooperation search algorithm, to tackle an essential task scheduling problem in heterogeneous cloud computing. The basic idea of this method is to use the advantages of meta-heuristic algorithms to get the optimal solution. We assess our algorithm's performance by running it through three scenarios with varying numbers of tasks. The findings demonstrate that the suggested technique beats the existing methods New Genetic Algorithm (NGA), Genetic Algorithm (GA), Whale Optimization Algorithm (WOA), Gravitational Search Algorithm (GSA), and Hybrid Heuristic and Genetic (HHG) by 7.9%, 2.1%, 8.8%, 7.7%, and 3.4%, respectively, in terms of makespan.
Keywords: heterogeneous processors; cooperation search algorithm; task scheduling; cloud computing
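The objective named in this abstract, makespan under precedence restrictions on heterogeneous processors, can be made concrete with a small sketch. It only evaluates a given schedule and is not the cooperation search algorithm itself. The function name, the data layout, the example task graph, and the choice to ignore communication costs are all assumptions for illustration.

```python
from collections import defaultdict

def makespan(order, assignment, exec_time, preds):
    """Schedule length for tasks executed in `order`.

    order      -- task ids in a sequence that respects precedence
    assignment -- task id -> processor id
    exec_time  -- (task id, processor id) -> run time on that processor
    preds      -- task id -> list of predecessor task ids
    Communication costs between processors are ignored (assumption).
    """
    proc_free = defaultdict(float)  # time at which each processor becomes idle
    finish = {}                     # finish time of each scheduled task
    for task in order:
        proc = assignment[task]
        # A task may start only after all of its predecessors have finished.
        ready = max((finish[p] for p in preds.get(task, [])), default=0.0)
        start = max(ready, proc_free[proc])
        finish[task] = start + exec_time[(task, proc)]
        proc_free[proc] = finish[task]
    return max(finish.values())

# Hypothetical 4-task graph: A precedes B and C; B and C precede D.
preds = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
exec_time = {("A", 0): 2, ("A", 1): 3, ("B", 0): 3, ("B", 1): 2,
             ("C", 0): 4, ("C", 1): 4, ("D", 0): 1, ("D", 1): 2}
order = ["A", "B", "C", "D"]
two_procs = makespan(order, {"A": 0, "B": 1, "C": 0, "D": 0}, exec_time, preds)
one_proc = makespan(order, {t: 0 for t in order}, exec_time, preds)
print(two_procs, one_proc)
```

A metaheuristic scheduler would search over `order` and `assignment` and use a function like this as its fitness measure; here the two-processor assignment finishes earlier than running everything on processor 0.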
Design and implementation of near-memory computing array architecture based on shared buffer (cited by: 2)
5
Authors: SHAN Rui, GAO Xu, FENG Yani, HUI Chao, CUI Xinyue, CHAI Miaomiao. High Technology Letters (EI, CAS), 2022, No. 4, pp. 345-353 (9 pages)
Deep learning algorithms have been widely used in computer vision, natural language processing, and other fields. However, due to the ever-increasing scale of deep learning models, the requirements for storage and computing performance are getting higher and higher, and processors based on the von Neumann architecture have gradually exposed significant shortcomings such as consumption and long latency. In order to alleviate this problem, large-scale processing systems are shifting from a traditional computing-centric model to a data-centric model. A near-memory computing array architecture based on the shared buffer is proposed in this paper to improve system performance. It supports instructions with the characteristics of store-calculation integration, reducing the data movement between the processor and main memory. Through data reuse, the processing speed of the algorithm is further improved. The proposed architecture is verified and tested through the parallel realization of the convolutional neural network (CNN) algorithm. The experimental results show that at a frequency of 110 MHz, the calculation speed of a single convolution operation is increased by 66.64% on average compared with the CNN architecture that performs parallel calculations on a field programmable gate array (FPGA). The processing speed of the whole convolution layer is improved by 8.81% compared with the reconfigurable array processor that does not support near-memory computing.
Keywords: near-memory computing; shared buffer; reconfigurable array processor; convolutional neural network (CNN)
Challenges and reflections on exascale computing
6
Author: Yang Xuejun. Engineering Sciences (EI), 2014, No. 3, pp. 17-22 (6 pages)
This paper introduces the development of exascale (10^18) computing. Though exascale computing is a hot research direction worldwide, we are facing many challenges in the areas of the memory wall, communication wall, reliability wall, power wall, and scalability of parallel computing. In response to these challenges, some thoughts and strategies are proposed.
Keywords: exascale computing; central processing unit (CPU); storage wall; heterogeneous processor
Task Scheduling Optimization in Cloud Computing by Rao Algorithm
7
Authors: A. Younes, M. Kh. Elnahary, Monagi H. Alkinani, Hamdy H. El-Sayed. Computers, Materials & Continua (SCIE, EI), 2022, No. 9, pp. 4339-4356 (18 pages)
Cloud computing currently dominates the space of high-performance distributed computing, and it provides resource pooling and on-demand services through the web. The task scheduling problem has therefore become a very important area of analysis in cloud computing environments, because users' service demands change dynamically. The main purpose of task scheduling is to assign tasks to available processors to produce the minimum schedule length without violating precedence restrictions. In heterogeneous multiprocessor systems, task assignments and schedules have a significant impact on system operation. Within a heuristic-based task scheduling algorithm, different processes lead to a different task execution time (makespan) on a heterogeneous computing system. Thus, a good scheduling algorithm should be able to set precedence efficiently for every subtask, depending on the resources required, to reduce the makespan. In this paper, we propose a new efficient task scheduling algorithm for cloud computing systems, based on the RAO algorithm, to solve an important task scheduling problem in heterogeneous multiprocessing. The basic idea of this process is to exploit the advantages of heuristic-based algorithms to reduce the search space and the time needed to get the best solution. We evaluate our algorithm's performance by applying it to three examples with different numbers of tasks and processors. The experimental results show that the proposed approach was significantly more successful than the others at finding optimal solutions in terms of task execution time.
Keywords: heterogeneous processors; RAO algorithm; heuristic algorithms; task scheduling; multiprocessing; cloud computing
Evaluation of the Application Benefit of Meteorological High Performance Computing Resources
8
Authors: Min Wei, Bin Wang. Journal of Geoscience and Environment Protection, 2017, No. 7, pp. 153-160 (8 pages)
The meteorological high-performance computing resource is the support platform for running numerical models for weather forecasting and climate prediction. A scientific and objective method for evaluating the application of meteorological high-performance computing resources can not only provide a reference for the optimization of active resources, but also provide a quantitative basis for future resource construction and planning. In this paper, the concepts of the utility value B and the index compliance rate E of the meteorological high-performance computing system are presented. The evaluation process, evaluation index, and calculation method of the high-performance computing resource application benefits are introduced.
Keywords: high-performance computing resources; resource application benefit; evaluation; benefit value
API Development Increases Access to Shared Computing Resources at Boston University
9
Authors: George Jones, Amanda E. Wakefield, Jeff Triplett, Kojo Idrissa, James Goebel, Dima Kozakov, Sandor Vajda. Journal of Software Engineering and Applications, 2022, No. 6, pp. 197-207 (11 pages)
Within the last few decades, increases in computational resources have contributed enormously to the progress of science and engineering (S & E). To continue making rapid advancements, the S & E community must be able to access computing resources. One way to provide such resources is through High-Performance Computing (HPC) centers. Many academic research institutions offer their own HPC centers but struggle to make the computing resources easily accessible and user-friendly. Here we present SHABU, a RESTful Web API framework that enables S & E communities to access resources from Boston University's Shared Computing Center (SCC). The SHABU requirements are derived from the use cases described in this work.
Keywords: API framework; open source; high-performance computing; software architecture; science and engineering
Zuchongzhi-3 Sets New Benchmark with 105-Qubit Superconducting Quantum Processor
10
Authors: LIU Danxu, GE Shuyun, WU Yuyang. Bulletin of the Chinese Academy of Sciences, 2025, No. 1, pp. 55-56 (2 pages)
A team of researchers from the University of Science and Technology of China (USTC) of the Chinese Academy of Sciences (CAS) and its partners have made significant advancements in random quantum circuit sampling with Zuchongzhi-3, a superconducting quantum computing prototype featuring 105 qubits and 182 couplers.
Keywords: quantum circuit sampling; superconducting quantum computing prototype; Zuchongzhi; superconducting quantum processor; qubits; couplers
SW-DDFT: Parallel Optimization of the Dynamical Density Functional Theory Algorithm Based on Sunway Bluelight II Supercomputer
11
Authors: Xiaoguang Lv, Tao Liu, Han Qin, Ying Guo, Jingshan Pan, Dawei Zhao, Xiaoming Wu, Meihong Yang. Computers, Materials & Continua, 2025, No. 7, pp. 1417-1436 (20 pages)
The Dynamical Density Functional Theory (DDFT) algorithm, derived by associating classical Density Functional Theory (DFT) with the fundamental Smoluchowski dynamical equation, describes the evolution of inhomogeneous fluid density distributions over time. It plays a significant role in studying the evolution of density distributions over time in inhomogeneous systems. The Sunway Bluelight II supercomputer, a new-generation supercomputer developed in China, possesses powerful computational capabilities. Porting and optimizing industrial software on this platform holds significant importance. For the optimization of the DDFT algorithm, based on the Sunway Bluelight II supercomputer and the unique hardware architecture of the SW39000 processor, this work proposes three acceleration strategies to enhance computational efficiency and performance: direct parallel optimization, local-memory constrained optimization for CPEs, and multi-core-group collaboration and communication optimization. This method combines the characteristics of the program's algorithm with the unique hardware architecture of the Sunway Bluelight II supercomputer, optimizing the storage and transmission structures to achieve a closer integration of software and hardware. For the first time, this paper presents Sunway-Dynamical Density Functional Theory (SW-DDFT). Experimental results show that SW-DDFT achieves a speedup of 6.67 times within a single core group compared to the original DDFT implementation. With six core groups (a total of 384 CPEs), the maximum speedup reaches 28.64 times and parallel efficiency reaches 71%, demonstrating excellent acceleration performance.
Keywords: Sunway supercomputer; high-performance computing; dynamical density functional theory; parallel optimization
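The abstract does not say how its parallel efficiency is defined. One reading that comes close to the reported numbers, offered here only as an assumption, divides the six-group speedup by ideal linear scaling of the single-group speedup. The helper name below is hypothetical.

```python
def relative_efficiency(speedup_n, speedup_1, n_groups):
    """Speedup on n core groups divided by ideal linear scaling of one group."""
    return speedup_n / (speedup_1 * n_groups)

# Figures quoted in the abstract: 6.67x on one core group, 28.64x on six.
eff = relative_efficiency(28.64, 6.67, 6)
cpes_per_group = 384 // 6  # 384 CPEs spread over six core groups

print(f"{eff:.3f}", cpes_per_group)
```

Under this assumed definition the efficiency is about 0.716, in line with the roughly 71% stated in the abstract, and each core group holds 64 CPEs. The paper itself may define efficiency differently.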
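A Computing Method for RISC-V Vector Processors Based on Serial Instruction Fusion
12
Authors: Li Kaige (李凯歌), Gao Xin (高鑫), Yang Mengfei (杨孟飞). Microelectronics & Computer (《微电子学与计算机》), 2026, No. 3, pp. 155-163 (9 pages)
In the traditional von Neumann computer architecture, algorithms such as convolutional neural networks, matrix computation, and the fast Fourier transform reuse data frequently. This produces a large number of read-after-write instructions in the vector processor pipeline and readily causes data hazards. At the same time, the repeated transfer of data between the vector registers and the compute units incurs significant power overhead. To address these problems, a data-hazard resolution mechanism for vector computing is proposed. It exploits data reuse to reduce data movement, thereby lowering the power consumption of the computing chip. Simulation experiments applying the method to a RISC-V vector processor show that overall chip power consumption falls by about 5.8% for a 128×128 matrix multiplication and by about 6.2% for a convolutional neural network algorithm. The method is lightweight, and the area overhead it introduces is negligible.
Keywords: RISC-V; vector processor; matrix computation; energy efficiency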
New prospects for computational hydraulics by leveraging high-performance heterogeneous computing techniques (cited by: 4)
13
Authors: Qiuhua LIANG, Luke SMITH, Xilin XIA. Journal of Hydrodynamics (SCIE, EI, CSCD), 2016, No. 6, pp. 977-985 (9 pages)
In the last two decades, computational hydraulics has undergone rapid development following the advancement of data acquisition and computing technologies. Using a finite-volume Godunov-type hydrodynamic model, this work demonstrates the promise of modern high-performance computing technology to achieve real-time flood modeling at a regional scale. The software is implemented for high-performance heterogeneous computing using the OpenCL programming framework, and developed to support simulations across multiple GPUs using a domain decomposition technique and across multiple systems through an efficient implementation of the Message Passing Interface (MPI) standard. The software is applied to a convective-storm-induced flood event in Newcastle upon Tyne, demonstrating high computational performance across a GPU cluster and good agreement with crowd-sourced observations. Issues relating to data availability, complex urban topography, and differences in drainage capacity affect results for a small number of areas.
Keywords: computational hydraulics; high-performance computing; flood modeling; shallow water equations; shock-capturing hydrodynamic model
Image processing algorithm acceleration using reconfigurable macro processor model (cited by: 2)
14
Authors: Sun GuanKfu, Chen Huaming, Lu Huanzhang. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2004, No. 2, pp. 110-114 (5 pages)
The concept and advantages of reconfigurable technology are introduced. A processor architecture based on a reconfigurable macro processor (RMP) model, built from an FPGA array and a DSP, is put forward and has been implemented. Two image algorithms are developed: template-based automatic target recognition and zone labeling. The first estimates the motion direction in the infrared image background; the second is a line picking-up algorithm based on image zone labeling and a phase grouping technique. Each acts as a kind of 'hardware' function that can be called by the DSP from a high-level algorithm, and thus as a hardware algorithm of the DSP. The experimental results show that reconfigurable computing technology based on the RMP is an ideal means of accelerating high-speed image processing tasks. High real-time performance is obtained in our two applications on the RMP.
Keywords: real-time image processing; reconfigurable computing technology; reconfigurable macro processor model; template matching; image zone labeling
The use of high-performance and high-throughput computing for the fertilization of digital earth and global change studies (cited by: 2)
15
Authors: Yong Xue, Dominic Palmer-Brown, Huadong Guo. International Journal of Digital Earth (SCIE), 2011, No. 3, pp. 185-210 (26 pages)
The study of global climate change seeks to understand: (1) the components of the Earth's varying environmental system, with a particular focus on climate; (2) how these components interact to determine present conditions; (3) the factors driving these components; (4) the history of global change and the projection of future change; and (5) how knowledge about global environmental variability and change can be applied to present-day and future decision-making. This paper addresses the use of high-performance computing and high-throughput computing for a global change study on the Digital Earth (DE) platform. Two aspects of the use of high-performance computing (HPC)/high-throughput computing (HTC) on the DE platform are the processing of data from all sources, especially Earth observation data, and the simulation of global change models. HPC/HTC is an essential and efficient tool for the processing of vast amounts of global data, especially Earth observation data. The current trend involves running complex global climate models using potentially millions of personal computers to achieve better climate change predictions than would ever be possible using the supercomputers currently available to scientists.
Keywords: high-performance computing (HPC); high-throughput computing (HTC); digital earth; global change; climate change; Earth observation; grid computing
Ad Hoc File Systems for High-Performance Computing (cited by: 1)
16
Authors: André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2020, No. 1, pp. 4-26 (23 pages)
Storage backends of parallel compute clusters are still based mostly on magnetic disks, while newer and faster storage technologies such as flash-based SSDs or non-volatile random access memory (NVRAM) are deployed within compute nodes. Including these new storage technologies in scientific workflows is unfortunately today a mostly manual task, and most scientists therefore do not take advantage of the faster storage media. One approach to systematically include node-local SSDs or NVRAMs in scientific workflows is to deploy ad hoc file systems over a set of compute nodes, which serve as temporary storage systems for single applications or longer-running campaigns. This paper presents results from the Dagstuhl Seminar 17202 "Challenges and Opportunities of User-Level File Systems for HPC" and discusses application scenarios as well as design strategies for ad hoc file systems using node-local storage media. The discussion includes open research questions, such as how to couple ad hoc file systems with the batch scheduling environment and how to schedule stage-in and stage-out processes of data between the storage backend and the ad hoc file systems. Also presented are strategies to build ad hoc file systems by using reusable components for networking and how to improve storage device compatibility. Various interfaces and semantics are presented, for example those used by the three ad hoc file systems BeeOND, GekkoFS, and BurstFS. Their presentation covers a range from file systems running in production to cutting-edge research focusing on reaching the performance limits of the underlying devices.
Keywords: parallel architectures; distributed file system; high-performance computing; burst buffer; POSIX (portable operating system interface)
GCSS: a global collaborative scheduling strategy for wide-area high-performance computing (cited by: 1)
17
Authors: Yao SONG, Limin XIAO, Liang WANG, Guangjun QIN, Bing WEI, Baicheng YAN, Chenhao ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, No. 5, pp. 1-15 (15 pages)
Wide-area high-performance computing is widely used for large-scale parallel computing applications owing to its high computing and storage resources. However, the geographical distribution of computing and storage resources makes efficient task distribution and data placement more challenging. To achieve a higher system performance, this study proposes a two-level global collaborative scheduling strategy for wide-area high-performance computing environments. The collaborative scheduling strategy integrates lightweight solution selection, redundant data placement, and task stealing mechanisms, optimizing task distribution and data placement to achieve efficient computing in wide-area environments. The experimental results indicate that, compared with the state-of-the-art collaborative scheduling algorithm HPS+, the proposed scheduling strategy reduces the makespan by 23.24%, improves computing and storage resource utilization by 8.28% and 21.73%, respectively, and achieves similar global data migration costs.
Keywords: high-performance computing; scheduling strategy; task scheduling; data placement
Overall plan and design of the task management system of ternary optical computer (cited by: 3)
18
Authors: Song Kai (宋凯), Jin Yi (金翊). Journal of Shanghai University (English Edition) (CAS), 2011, No. 5, pp. 467-472 (6 pages)
In this paper, an overall scheme of the task management system of the ternary optical computer (TOC) is proposed, and the software architecture chart is given. The function and implementation of each module in the system are described in general terms. In addition, a prototype of the TOC task management system is implemented according to this scheme, and the feasibility, rationality, and completeness of the scheme are verified by running and testing the prototype.
Keywords: ternary optical computer (TOC); task management system; overall plan; task scheduling; processor resource allocation
Mochi: Composing Data Services for High-Performance Computing Environments
19
Authors: Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2020, No. 1, pp. 121-144 (24 pages)
Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.
Keywords: storage and I/O; data-intensive computing; distributed services; high-performance computing
FTRP: a new fault tolerance framework using process replication and prefetching for high-performance computing
20
Authors: Wei HU, Guang-ming LIU, Yan-huang JIANG. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2018, No. 10, pp. 1273-1290 (18 pages)
As the scale of supercomputers rapidly grows, the reliability problem dominates system availability. Existing fault tolerance mechanisms, such as periodic checkpointing and process redundancy, cannot effectively fix this problem. To address this issue, we present a new fault tolerance framework using process replication and prefetching (FTRP), combining the benefits of proactive and reactive mechanisms. FTRP incorporates a novel cost model and a new proactive fault tolerance mechanism to improve application execution efficiency. The novel cost model, called the 'work-most' (WM) model, makes runtime decisions to adaptively choose an action from a set of fault tolerance mechanisms based on failure prediction results and application status. We observe, for the first time, a failure locality phenomenon in supercomputers that is similar to program locality. In the new proactive fault tolerance mechanism, process replication with process prefetching is proposed based on failure locality, significantly reducing losses caused by failures regardless of whether they have been predicted. Simulations with real failure traces demonstrate that the FTRP framework outperforms existing fault tolerance mechanisms, with up to 10% improvement in application efficiency for common failure prediction accuracy, and is effective for petascale systems and beyond.
Keywords: high-performance computing; proactive fault tolerance; failure locality; process replication; process prefetching