11 articles found
Adapting Backward Error Recovery to Parallel Real Time Systems
1
Author: 周笛. Journal of Computer Science & Technology (SCIE, EI, CSCD), 1992, No. 3, pp. 257-267 (11 pages)
The problem of adapting backward error recovery to parallel real time systems is discussed in this paper. Because of error propagation among different cooperating processes, an error occurring in one process may influence some important outputs in other processes. Therefore, a local output has to be delayed until its validity is confirmed globally. Since backward error recovery adopts redundancy of computing time instead of processing equipment, the variation of the actual execution time of a cooperating process may be very large if it works in an unreliable environment. These problems are the primary obstacles to be removed. Previous studies focus their attention on how to eliminate the domino effect dynamically. But backward error recovery cannot be applied directly in parallel real time systems even under the condition that no domino effect exists. How to reduce output delays efficiently if no domino effect remains? How to estimate this delay time? How to calculate the actual execution time of every process and how to schedule these processes under an unstable condition? Unfortunately, these problems have been omitted in the literature. The interest of this paper is to provide satisfactory solutions to these problems to make it possible to adopt backward error recovery efficiently in parallel real time systems.
Keywords: very TIME Adapting Backward error recovery to Parallel Real Time Systems REAL
A Distributed Error Recovery Technique and Its Implementation and Application on UNIX
2
Authors: 周笛, 徐向文. Journal of Computer Science & Technology (SCIE, EI, CSCD), 1990, No. 2, pp. 127-138 (12 pages)
This paper presents a checkpoint setting technique to eliminate the domino effect in backward recovery in distributed systems, which is very efficient, powerful, widely applicable and easy to implement. Besides theoretical analysis, an implementation on the UNIX system and a package for software fault-tolerance are introduced. Then the problems of checkpoint management and process termination are discussed.
Keywords: Pro very A Distributed error recovery Technique and Its Implementation and Application on UNIX
Information loss recovery for JPEG2000 image transmission in an error-prone environment
3
Authors: 刘洁瑜, 张德运. Journal of Harbin Institute of Technology (New Series) (EI, CAS), 2008, No. 3, pp. 430-435 (6 pages)
Information loss recovery techniques are important for transmitting images over error-prone channels at the decoder. A novel error recovery scheme for JPEG2000 images is presented in this paper, which adopts different techniques for the lowest frequency coefficients and the high frequency coefficients in the wavelet domain. The low-frequency recovery algorithm was implemented by adopting the watermarking technique and the packet structure of JPEG2000. The low-frequency coefficients, taken as the hidden data, were extracted from the compressed bit stream and then embedded back into the bit stream itself prior to transmission. The embedded data were used to recover the information loss. High-frequency reconstruction was performed on a bitplane basis. The damaged bitplanes were recovered according to the correlation in the wavelet subband structure and by using an algorithm based on horizontal and vertical edge detection. Experiments verified the effectiveness of these algorithms.
Keywords: JPEG2000; error recovery; image transmission; bitplane
Error Recovery in a Real-Time Multiprocessor System
4
Authors: 李卫华, 袁由光. Journal of Computer Science & Technology (SCIE, EI, CSCD), 1992, No. 1, pp. 83-87 (5 pages)
In this paper, a new scheme for recovering errors due to transient faults in a real-time multiprocessor system is presented. The scheme, called dynamic redundancy at the task level, is implemented in a real-time multitasking environment. Utilizing the facilities in the operating system, the scheme makes backup tasks for the primary tasks as redundancy. The paper introduces an algorithm to generate a fault tolerant schedule for the tasks so that they recover errors as retry or checkpointing does. A reliability model is proposed to evaluate the effectiveness of the scheme.
Keywords: error recovery in a Real-Time Multiprocessor System
Moving Least Squares Interpolation Based A-Posteriori Error Technique in Finite Element Elastic Analysis (cited by: 1)
5
Authors: Mohd Ahmed, Devender Singh, Saeed Al Qadhi, Nguyen Viet Thanh. Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 10, pp. 167-189 (23 pages)
The performance of an a-posteriori error methodology based on moving least squares (MLS) interpolation is explored in this paper by varying the finite element error recovery parameters, namely recovery points and field variable derivatives recovery. The MLS interpolation based recovery technique uses the weighted least squares method on top of the finite element method's field variable derivatives solution to build a continuous field variable derivatives approximation. The boundary of the node support (mesh free patch of influenced nodes within a determined distance) is taken as circular, i.e., a circular support domain constructed using radial weights is considered. The field variable derivatives (stress and strains) are recovered at two kinds of points in the support domain, i.e., Gauss points (super-convergent stress locations) and nodal points. The errors are computed as the difference between the stress from the finite element results and the projected stress from the post-processed energy norm at both elemental and global levels. The benchmark numerical tests using quadrilateral and triangular meshes measure the finite element errors in strain and stress fields. The numerical examples showed the support domain-based recovery technique's capabilities for effective and efficient error estimation in the finite element analysis of elastic problems. The MLS interpolation based recovery technique performs better for stress extraction at Gauss points with the quadrilateral discretization of the problem domain. It is also shown that the behavior of the MLS interpolation based a-posteriori error technique in stress extraction is comparable to the classical Zienkiewicz-Zhu (ZZ) a-posteriori error technique.
Keywords: recovery points; field variable derivatives; effectivity; error recovery; support domain; error convergence
Single Epoch GPS Deformation Signals Extraction and Gross Error Detection Technique Based on Wavelet Transform (cited by: 1)
6
Authors: WANG Jian, GAO Jingxiang, XU Changhui. Geo-Spatial Information Science, 2006, No. 3, pp. 187-190 (4 pages)
Wavelet theory is an efficient and adequate tool for analyzing single epoch GPS deformation signals. A wavelet analysis technique for gross error detection and recovery is advanced. Criteria for choosing the wavelet function and deciding the Mallat decomposition levels are discussed. An effective deformation signal extraction method is proposed, namely a wavelet noise reduction technique that considers gross error recovery and combines wavelet multi-resolution gross error detection results. Time position recognition of gross errors and their repair are realized. In the experiment, a compactly supported orthogonal wavelet with a short support block is more efficient than a longer one when discerning gross errors, and yields finer analyses. The shape of a discerned gross error is simpler with the short support wavelet than with the longer one. Meanwhile, the time scale is easier to identify.
Keywords: noise; single epoch GPS deformation signal; Mallat algorithm; gross error detection; gross error recovery
Superconvergence and recovery type a posteriori error estimation for hybrid stress finite element method (cited by: 1)
7
Authors: BAI YanHong, WU YongKe, XIE XiaoPing. Science China Mathematics (SCIE, CSCD), 2016, No. 9, pp. 1835-1850 (16 pages)
Superconvergence and recovery type a posteriori error estimators are analyzed for Pian and Sumihara's 4-node hybrid stress quadrilateral finite element method for linear elasticity problems. Superconvergence of order $O(h^{1+\min\{\alpha,1\}})$ is established for both the displacement approximation in the $H^1$-norm and the stress approximation in the $L^2$-norm under a mesh assumption, where $\alpha > 0$ is a parameter characterizing the distortion of meshes from parallelograms to quadrilaterals. Recovery type approximations for the displacement gradients and the stress tensor are constructed, and a posteriori error estimators based on the recovered quantities are shown to be asymptotically exact. Numerical experiments confirm the theoretical results.
Keywords: linear elasticity; hybrid stress finite element; superconvergence; recovery; a posteriori error estimator
Adaptive phase field modelling of crack propagation in orthotropic functionally graded materials
8
Authors: Hirshikesh, Emilio Martínez-Paneda, Sundararajan Natarajan. Defence Technology(防务技术) (SCIE, EI, CAS, CSCD), 2021, No. 1, pp. 185-195 (11 pages)
In this work, we extend the recently proposed adaptive phase field method to model fracture in orthotropic functionally graded materials (FGMs). A recovery type error indicator combined with quadtree decomposition is employed for adaptive mesh refinement. The proposed approach is capable of capturing the fracture process with a localized mesh refinement that provides notable gains in computational efficiency. The implementation is validated against experimental data and other numerical experiments on orthotropic materials with different material orientations. The results reveal an increase in the stiffness and the maximum force with increasing material orientation angle. The study is then extended to the analysis of orthotropic FGMs. It is observed that, if the gradation in fracture properties is neglected, the material gradient plays a secondary role, with the fracture behaviour being dominated by the orthotropy of the material. However, when the toughness increases along the crack propagation path, a substantial gain in fracture resistance is observed.
Keywords: Functionally graded materials; Phase field fracture; Polygonal finite element method; Orthotropic materials; recovery based error indicator
An Error Recoverable Structure Based on Complementary Logic and Alternating-Retry
9
Author: 江建慧. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2005, No. 6, pp. 885-894 (10 pages)
Modern VLSI circuits provide adequate on-chip resources, so online testing and retry integrated into a chip are absolutely necessary for system-on-a-chip technology. This paper first proposes a general online testing plus retrying structure. Although retry can mask transient or intermittent faults, it is generally useless for handling permanent faults. To solve this problem, this paper presents a novel dual modular redundancy (DMR) structure using complementary logic--alternating-complementary logic (CL-ACL) switching mode. During error-free operation, the CL-ACL structure operates in complementary logic mode. After an error is detected, it retries in alternating logic mode. If all errors are single or multiple temporary 0/1-errors or stuck-at-errors produced by one module, then these errors can be corrected effectively. The results obtained from the simulation validate the correctness of the CL-ACL structure. Analytic results show that the delay of the CL-ACL structure is dramatically less than that of a DMR structure using alternating-complementary logic mode.
Keywords: error recovery; fault tolerance; complementary logic; alternating-retry; temporary error; stuck-at-error
BAFT:bubble-aware fault-tolerant framework for distributed DNN training with hybrid parallelism
10
Authors: Runzhe CHEN, Guandong LU, Yakai WANG, Rui ZHANG, Zheng HU, Yanming MIAO, Zhifang CAI, Jingwen LENG, Minyi GUO. Frontiers of Computer Science, 2025, No. 1, pp. 29-39 (11 pages)
As deep neural networks (DNNs) have been successfully adopted in various domains, the training of these large-scale models becomes increasingly difficult and is often deployed on compute clusters composed of many devices like GPUs. However, as the size of the cluster increases, so does the possibility of failures during training. Currently, faults are mainly handled by recording checkpoints and recovering, but this approach causes large overhead and affects the training efficiency even when no error occurs. A low checkpointing frequency leads to a large loss of training time, while a high recording frequency affects the training efficiency. To solve this contradiction, we propose BAFT, a bubble-aware fault tolerant framework for hybrid parallel distributed training. BAFT can automatically analyze parallel strategies, profile the runtime information, and schedule checkpointing tasks at the granularity of pipeline stage depending on the bubble distribution in the training. It supports higher checkpoint efficiency and only introduces less than 1% time overhead, which allows us to record checkpoints at high frequency, thereby reducing the time loss in error recovery and avoiding the impact of fault tolerance on training.
Keywords: distributed training; fault tolerance; CHECKPOINT; pipeline parallelism; error recovery
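The trade-off this abstract describes between low and high checkpointing frequency can be illustrated with Young's classical first-order approximation for the checkpoint interval. This is a minimal sketch under stated assumptions and is not BAFT's scheduling method: the function names and parameters (`interval_s`, `checkpoint_cost_s`, `mtbf_s`) are hypothetical, and the model assumes a fixed checkpoint cost and that a failure loses about half an interval of work on average.

```python
import math


def expected_overhead(interval_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    """First-order fraction of time lost to checkpointing plus rework.

    All parameters are illustrative assumptions, not values from the paper:
    interval_s        -- compute time between two checkpoints
    checkpoint_cost_s -- time needed to record one checkpoint
    mtbf_s            -- mean time between failures of the whole cluster
    """
    # Cost of writing checkpoints, amortized over one interval.
    checkpoint_fraction = checkpoint_cost_s / interval_s
    # After a failure, about half an interval of work is redone on average.
    rework_fraction = interval_s / (2.0 * mtbf_s)
    return checkpoint_fraction + rework_fraction


def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Interval that minimizes expected_overhead: sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)
```

Under this simple model, lowering the cost of a checkpoint shortens the best interval, which is consistent with the abstract's claim that cheaper checkpoints make high-frequency recording practical.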
Adaptive efficient video transmission over the Internet based on congestion control and RS coding
11
Authors: 黄伟红, 张福炎, 孙正兴. Science in China (Series F), 2002, No. 2, pp. 121-129 (9 pages)
An approach based on adaptive congestion control and adaptive error recovery with the RS (Reed-Solomon) coding method is presented for efficient video transmission over the Internet. Featuring weighted moving average rate control and TCP-friendliness, AVSP, a novel adaptive video streaming protocol, is designed with adjustable rate control parameters so as to respond quickly to QoS status fluctuations during video transmission over the Internet. Combined with the congestion control policy, an adaptive RS coding error recovery scheme with variable parameters is presented to enhance the robustness of MPEG video transmission over the Internet under a restriction on the total system bandwidth.
Keywords: video transmission; congestion control; error recovery; Reed-Solomon coding
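The abstract does not give AVSP's weighting scheme or parameters, so the following is only a rough sketch of one common form of weighted moving average, the exponentially weighted moving average, applied to throughput samples. The function name `smooth_rate` and the `weight` parameter are hypothetical and chosen for illustration.

```python
from typing import Iterable, List


def smooth_rate(samples: Iterable[float], weight: float = 0.25) -> List[float]:
    """Exponentially weighted moving average of throughput samples.

    weight is the share given to the newest sample (an assumed,
    adjustable parameter): larger values react faster to fluctuation,
    smaller values give a steadier sending rate.
    """
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must be in (0, 1]")
    smoothed: List[float] = []
    estimate = None
    for sample in samples:
        if estimate is None:
            # The first sample initializes the estimate.
            estimate = sample
        else:
            estimate = (1.0 - weight) * estimate + weight * sample
        smoothed.append(estimate)
    return smoothed
```

For example, `smooth_rate([100.0, 200.0], weight=0.5)` yields `[100.0, 150.0]`: the estimate moves halfway toward the new sample.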