An in-memory storage system provides submillisecond latency and improves the concurrency of user applications by caching data into memory from external storage. Fault tolerance of in-memory storage systems is essential, as the loss of cached data requires access to data from external storage, which evidently increases the response latency. Typically, replication and erasure code (EC) are two fault-tolerant schemes that pose different trade-offs between access performance and storage usage. To help make the best performance and space trade-off, we design ElasticMem, a hybrid fault-tolerant distributed in-memory storage system that supports elastic redundancy transition to dynamically change the fault-tolerant scheme. ElasticMem exploits a novel EC-oriented replication (EOR) that carefully designs the data placement of replication according to the future data layout of EC to enhance the I/O efficiency of redundancy transition. ElasticMem solves the consistency problem caused by concurrent data accesses via a lightweight table-based scheme combined with data bypassing. It detects correlated read and write requests and serves subsequent read requests with local data. We implement a prototype that realizes ElasticMem based on Memcached. Experiments show that ElasticMem remarkably reduces the time of redundancy transition, the overall latency of correlated concurrent data accesses, and the latency of single data access among them.
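The replication versus erasure-code trade-off mentioned above can be illustrated with a small sketch. It uses a single XOR parity chunk over k data chunks, which is a deliberately simple stand-in: the abstract does not say which erasure code ElasticMem uses, and all function names here are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(chunks: List[bytes]) -> bytes:
    """Compute one XOR parity chunk over k equal-length data chunks."""
    return reduce(xor_bytes, chunks)


def recover(chunks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing chunk (marked None) from the survivors."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("a single parity chunk tolerates only one lost chunk")
    result = list(chunks)
    if missing:
        survivors = [c for c in chunks if c is not None]
        # parity XOR all surviving chunks equals the lost chunk
        result[missing[0]] = reduce(xor_bytes, survivors, parity)
    return result


def storage_overhead(k: int, redundancy: int) -> float:
    """Stored bytes per byte of user data: k data chunks plus redundant chunks."""
    return (k + redundancy) / k


data = [b"abcd", b"efgh", b"ijkl"]
parity = encode_parity(data)
restored = recover([data[0], None, data[2]], parity)

ec_overhead = storage_overhead(k=3, redundancy=1)           # 3 data + 1 parity
replication_overhead = storage_overhead(k=1, redundancy=1)  # 2 full copies
```

Both layouts survive one lost chunk, but the parity layout stores about 1.33 bytes per user byte against 2.0 for two-way replication, at the cost of reading the surviving chunks to rebuild the lost one. That is the kind of access-performance versus storage-usage tension the abstract refers to.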
We study a novel replication mechanism to ensure service continuity against multiple simultaneous server failures. In this mechanism, each item represents a computing task and is replicated onto ξ+1 servers for some integer ξ≥1, with workloads specified by the amount of required resources. If one or more servers fail, the affected workloads can be redirected to other servers that host replicas associated with the same item, such that the service is not interrupted by the failure of up to ξ servers. This requires that any feasible assignment algorithm reserve some capacity in each server to accommodate the workload redirected from potentially failed servers without overloading, and determining the optimal method for reserving capacity becomes a key issue. Unlike existing algorithms, which assume that no two servers share replicas of more than one item, we first formulate capacity reservation for a general arbitrary scenario. Due to the combinatorial nature of this problem, finding the optimal solution is difficult. To this end, we propose a Generalized and Simple Calculating Reserved Capacity (GSCRC) algorithm, with a time complexity related only to the number of items packed in the server. In conjunction with GSCRC, we propose a robust replica packing algorithm with capacity optimization (RobustPack), which aims to minimize the number of servers hosting replicas while tolerating multiple server failures. Through theoretical analysis and experimental evaluations, we show that the RobustPack algorithm can achieve better performance.
This paper presents 3RVAV (Three-Round Voting with Advanced Validation), a novel Byzantine Fault Tolerant consensus protocol combining Proof-of-Stake with a multi-phase voting mechanism. The protocol introduces three layers of randomized committee voting with distinct participant roles (Validators, Delegators, and Users), achieving (4/5)-threshold approval per round through a verifiable random function (VRF)-based selection process. Our security analysis demonstrates 3RVAV provides 1−(1−s/n)^(3k) resistance to Sybil attacks with n participants and stake s, while maintaining O(kn log n) communication complexity. Experimental simulations show 3247 TPS throughput with 4-s finality, representing a 5.8× improvement over Algorand’s committee-based approach. The proposed protocol achieves approximately 4.2-s finality, demonstrating low latency while maintaining strong consistency and resilience. The protocol introduces a novel punishment matrix incorporating both stake slashing and probabilistic blacklisting, proving a Nash equilibrium for honest participation under rational actor assumptions.
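The two quantitative ingredients of this abstract, the expression 1−(1−s/n)^(3k) and the (4/5)-threshold approval, can be evaluated with a short sketch. The abstract does not define k or state whether the threshold is inclusive, so the code simply evaluates the expression as written and assumes "at least 4/5"; both function names are hypothetical.

```python
def sybil_resistance(s: float, n: float, k: int) -> float:
    """Evaluate 1 - (1 - s/n)**(3k) exactly as written in the abstract.

    s and n follow the abstract (stake and participants); k is left
    undefined there and is treated here as a non-negative integer.
    """
    if n <= 0 or not 0 <= s <= n or k < 0:
        raise ValueError("expected n > 0, 0 <= s <= n and k >= 0")
    return 1.0 - (1.0 - s / n) ** (3 * k)


def round_approved(yes_votes: int, committee_size: int) -> bool:
    """Assumed rule: a round passes when at least 4/5 of the committee approves.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return 5 * yes_votes >= 4 * committee_size


example = sybil_resistance(s=10, n=100, k=1)  # 1 - 0.9**3
```

With s/n = 0.1 and k = 1 the expression evaluates to about 0.271, and it approaches 1 as k grows, since the exponent 3k compounds across the three voting rounds.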
The deployment of the Internet of Things (IoT) with smart sensors has facilitated the emergence of fog computing as an important technology for delivering services to smart environments such as campuses, smart cities, and smart transportation systems. Fog computing tackles a range of challenges, including processing, storage, bandwidth, latency, and reliability, by locally distributing secure information through end nodes. Consisting of endpoints, fog nodes, and back-end cloud infrastructure, it provides advanced capabilities beyond traditional cloud computing. In smart environments, particularly within smart city transportation systems, the abundance of devices and nodes poses significant challenges related to power consumption and system reliability. To address the challenges of latency, energy consumption, and fault tolerance in these environments, this paper proposes a latency-aware, fault-tolerant framework for resource scheduling and data management, referred to as the FORD framework, for smart cities in fog environments. This framework is designed to meet the demands of time-sensitive applications, such as those in smart transportation systems. The FORD framework incorporates latency-aware resource scheduling to optimize task execution in smart city environments, leveraging resources from both fog and cloud environments. Through simulation-based executions, tasks are allocated to the nearest available nodes with minimum latency. In the event of execution failure, a fault-tolerant mechanism is employed to ensure the successful completion of tasks. Upon successful execution, data is efficiently stored in the cloud data center, ensuring data integrity and reliability within the smart city ecosystem.
Dear Editor, This letter studies the bipartite consensus tracking problem for heterogeneous multi-agent systems with actuator faults and a leader's unknown time-varying control input. To handle such a problem, a continuous fault-tolerant control protocol via observer design is developed. In addition, it is strictly proved that the multi-agent system driven by the designed controllers can still achieve bipartite consensus tracking after faults occur.
In distributed fusion, when one or more sensors are disturbed by faults, a common problem is that their local estimations are inconsistent with those of other fault-free sensors. Most of the existing fault-tolerant distributed fusion algorithms, such as the Covariance Union (CU) and Fault-tolerant Generalized Convex Combination (FGCC), are only used for the point estimation case where local estimates and their associated error covariances are provided. A treatment with focus on the fault-tolerant distributed fusion of arbitrary local Probability Density Functions (PDFs) is lacking. For this problem, we first propose Kullback–Leibler Divergence (KLD) and reversed KLD induced functional Fuzzy c-Means (FCM) clustering algorithms to soft cluster all local PDFs, respectively. On this basis, two fault-tolerant distributed fusion algorithms of arbitrary local PDFs are then developed. They select the representing PDF of the cluster with the largest sum of memberships as the fused PDF. Numerical examples verify the better fault tolerance of the two developed distributed fusion algorithms.
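Because the clustering above is driven by both the KLD and the reversed KLD, it helps to see that the two differ. The following minimal sketch computes both for discrete distributions; discretisation is an assumption made here for illustration (the abstract concerns arbitrary PDFs), and `kl_divergence` is a hypothetical helper name.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(p || q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue          # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf   # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)    # KLD
reverse = kl_divergence(q, p)    # reversed KLD
```

Here the forward value is about 0.511 nats and the reversed value about 0.368 nats, so the two divergences can induce different cluster assignments for the same set of local PDFs.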
This paper is concerned with an effective fault diagnosis and fault-tolerant control method for aeronautical electromechanical actuators. By combining the advantages of model-driven and data-driven methods, a fault-tolerant nonsingular terminal sliding mode control method based on a support vector machine (SVM) is proposed. An SVM is designed to estimate the fault by off-line learning from small-sample data, solved as a convex quadratic programming problem, and is introduced into a high-gain observer to improve the accuracy of state estimation and fault detection when a fault occurs. The state estimate of the observer is used for state reconfiguration. A novel nonsingular terminal sliding mode surface is designed, and Lyapunov's theorem is used to derive a parameter adaptation law and a control law. The proposed controller is guaranteed to achieve asymptotic stability, which is superior to many advanced fault-tolerant controllers. In addition, the parameter estimates can help diagnose system faults, because faults are reflected in parameter variations. Extensive comparative simulation and experimental results illustrate the effectiveness and advantages of the proposed controller compared with several other mainstream controllers.
For permanent faults (PF) in the power communication network (PCN), such as link interruptions, the time-sensitive networking (TSN) that the PCN relies on typically employs spatial-redundancy fault-tolerance methods to maintain service stability and reliability, which often limits TSN scheduling performance in the fault-free ideal state. This paper therefore proposes a graph attention residual network-based routing and fault-tolerant scheduling mechanism (GRFS) for data flows in the PCN. First, it includes a communication system architecture for integrated terminals based on a cyclic queuing and forwarding (CQF) model, together with a fault recovery method that reduces the impact of faults through the simplified scheduling configuration of CQF and by prioritizing the rerouting of faulty time-sensitive (TS) flows. Second, considering that topology changes caused by PF are better handled by making routing and time-slot injection decisions hop by hop, and that a reasonable network load can reduce the damage caused by PF and reserve resources for rerouting faulty TS flows, an optimization model for joint routing and scheduling is constructed with the scheduling success rate as the objective and with traffic latency and network load as constraints. Third, to capture changes in TSN topology and traffic load, a D3QN algorithm based on a multi-head graph attention residual network (MGAR) is designed to solve the model: the MGAR-based encoder reconstructs the TSN status into feature embedding vectors, and a dueling network decoder decodes these vectors. Simulation results show that GRFS outperforms heuristic fault-tolerance algorithms and other benchmark schemes by approximately 10% in routing and scheduling success rate in the ideal state and 5% in rerouting and rescheduling success rate in fault states.
Blockchain, with its decentralized structure, transparency and credibility, time-series data, and immutability, has been considered a promising technology. The consensus algorithm, as one of the core techniques of blockchain, directly affects the scalability of blockchain systems. Existing probabilistic-finality consensus algorithms such as PoW and PoS suffer from high power consumption and low efficiency, while absolute-finality consensus algorithms such as PBFT and HoneyBadgerBFT cannot meet the scalability requirement in a large-scale network. In this paper, we propose a novel optimized practical Byzantine fault tolerance consensus algorithm based on the EigenTrust model, namely T-PBFT, which is a multi-stage consensus algorithm. It evaluates node trust from the transactions between nodes, so that high-quality nodes in the network are selected to construct a consensus group. To reduce the probability of view change, we propose replacing the single primary node with a primary group. Group signatures and mutual supervision further enhance the robustness of the primary group. Finally, we analyze T-PBFT and compare it with other Byzantine fault-tolerant consensus algorithms. Theoretical analysis shows that T-PBFT can improve the Byzantine fault-tolerance rate and reduce both the probability of view change and the communication complexity.
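The trust-based group selection described above can be sketched with the basic EigenTrust iteration: normalise each node's local trust scores, then repeatedly propagate global trust through the normalised matrix. This is the textbook form without pre-trusted peers; how T-PBFT actually derives local trust from transactions is not given in the abstract, so the input matrix and both function names are illustrative assumptions.

```python
from typing import List


def eigentrust(local: List[List[float]], iterations: int = 50) -> List[float]:
    """Basic EigenTrust: iterate t <- C^T t over row-normalised local trust."""
    n = len(local)
    normalised = []
    for row in local:
        total = sum(row)
        # A node with no opinions is assumed to trust everyone equally.
        normalised.append([v / total for v in row] if total > 0 else [1.0 / n] * n)
    trust = [1.0 / n] * n
    for _ in range(iterations):
        trust = [
            sum(normalised[i][j] * trust[i] for i in range(n)) for j in range(n)
        ]
    return trust


def select_consensus_group(trust: List[float], size: int) -> List[int]:
    """Pick the indices of the `size` most trusted nodes."""
    ranked = sorted(range(len(trust)), key=lambda i: trust[i], reverse=True)
    return ranked[:size]


# local[i][j]: how much node i trusts node j, e.g. a count of good transactions
local_trust = [
    [0.0, 4.0, 1.0],
    [5.0, 0.0, 1.0],
    [3.0, 3.0, 0.0],
]
global_trust = eigentrust(local_trust)
group = select_consensus_group(global_trust, size=2)
```

In this toy matrix node 2 receives little trust from the others, so it ends with the lowest global score and the two-member consensus group consists of nodes 0 and 1.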
Environmental perception is one of the key technologies for realizing autonomous vehicles. Autonomous vehicles are often equipped with multiple sensors to form a multi-source environmental perception system. Those sensors are very sensitive to light or background conditions, which introduces a variety of global and local fault signals that bring great safety risks to the autonomous driving system during long-term running. In this paper, a real-time data fusion network with a fault diagnosis and fault tolerance mechanism is designed. By introducing prior features to realize a lightweight network, the features of the input data can be extracted in real time. A new sensor reliability evaluation method is proposed that calculates the global and local confidence of sensors. Through the temporal and spatial correlation between sensor data, the sensor redundancy is utilized to diagnose the local and global confidence level of sensor data in real time, eliminate the fault data, and ensure the accuracy and reliability of data fusion. Experiments show that the network achieves state-of-the-art results in speed and accuracy, and can accurately detect the location of the target when some sensors are out of focus or out of order. The fusion framework proposed in this paper is shown to be effective for intelligent vehicles in terms of real-time performance and reliability.
The in-core self-powered neutron detector (SPND) acts as a key measuring device for the monitoring of parameters and evaluation of the operating conditions of nuclear reactors. Prompt detection and tolerance of faulty SPNDs are indispensable for reliable reactor management. To completely extract the correlated state information of SPNDs, we constructed a twin model based on a generalized regression neural network (GRNN) that represents the common relationships among overall signals. Faulty SPNDs were determined from the functional concordance of the twin model and real monitoring systems, by calculating the error probability distribution between the model outputs and real values. Fault detection is followed by a tolerance phase to reinforce the stability of the twin model in the case of massive failures. A weighted K-nearest neighbor model was employed to reasonably reconstruct the values of the faulty signals and guarantee data purity. The experimental evaluation of the proposed method showed promising results, with excellent output consistency and high detection accuracy for both single- and multiple-point faulty SPNDs. For unexpected excessive failures, the proposed tolerance approach can efficiently repair fault behaviors and enhance the prediction performance of the twin model.
The open-circuit fault of the power switches in a shunt active power filter (SAPF) would exacerbate the harmonic pollution of the power grid and degrade the reliability of the devices and system. A fault diagnosis method based on a reference model, together with an over-modulation strategy under hardware fault tolerance, is proposed for the SAPF. First, a mathematical model is established for the SAPF. Second, the residuals are generated by comparing the outputs of the reference model with those of the actual model, and the open-switch fault is detected and diagnosed by residual evaluation. After that, hardware fault tolerance is performed with the three-phase four-switch (TPFS) topology to isolate the faulty phase. Finally, the over-modulation strategy is proposed to increase the voltage transfer ratio of the TPFS topology. Simulation and experimental results verified the feasibility and effectiveness of the proposed method.
The defects of an OLED-based display, mainly electrical shorts, cause pixels to stay dark, decrease the brightness of a panel, severely influence the display uniformity, and also consume a considerable amount of power. In this paper, for AM-OLEDs, a novel circuit employing p-type low-temperature poly-Si thin-film transistors is introduced to offer fault-tolerant capabilities for such defects. The results show that this circuit can save significant power and maintain the luminance of the pixel without changing the driving current.
The use of technology has increased vastly, and today computer systems are interconnected via different communication media. The use of distributed systems in day-to-day activities has grown with data distribution, because distributed systems enable nodes to organise and share their resources among connected systems or devices, integrating people with geographically distributed computing facilities. Distributed systems may suffer a lack of service availability due to multiple system failures at multiple failure points. This article highlights the different fault tolerance mechanisms used in distributed systems to prevent such failures, considering replication, high redundancy, and high availability of the distributed services.
Readout errors caused by measurement noise are a significant source of errors in quantum circuits; they severely affect the output results and are an urgent problem to be solved in noisy intermediate-scale quantum (NISQ) computing. In this paper, we use the bit-flip averaging (BFA) method to mitigate frequent readout errors in quantum generative adversarial networks (QGAN) for image generation. BFA simplifies the response matrix structure by averaging the qubits for each random bit-flip in advance, which addresses the high measurement cost of traditional error mitigation methods. Our experiments were simulated in Qiskit using the handwritten digit image recognition dataset. Under the BFA-based method, the Kullback-Leibler (KL) divergence of the generated images converges to 0.04, 0.05, and 0.1 for readout error probabilities of p=0.01, p=0.05, and p=0.1, respectively. Additionally, by evaluating the fidelity of the quantum states representing the images, we observe average fidelity values of 0.97, 0.96, and 0.95 for the three readout error probabilities, respectively. These results demonstrate the robustness of the model in mitigating readout errors and provide a highly fault-tolerant mechanism for image generation models.
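Response-matrix mitigation of the kind referred to above can be sketched for a single qubit. The sketch assumes that, after bit-flip averaging, the readout error is symmetric with one flip probability p, so the response matrix is [[1−p, p], [p, 1−p]] and can be inverted in closed form. That symmetric form and the function names are assumptions for illustration; the abstract does not give the matrices used in the paper.

```python
from typing import Tuple

Distribution = Tuple[float, float]  # (P(read 0), P(read 1))


def apply_readout_noise(ideal: Distribution, p: float) -> Distribution:
    """Apply a symmetric readout error: each outcome is flipped with probability p."""
    p0, p1 = ideal
    return ((1 - p) * p0 + p * p1, p * p0 + (1 - p) * p1)


def mitigate(measured: Distribution, p: float) -> Distribution:
    """Invert the symmetric 2x2 response matrix in closed form.

    The inverse of [[1-p, p], [p, 1-p]] is 1/(1-2p) * [[1-p, -p], [-p, 1-p]],
    which exists only for p != 0.5; p < 0.5 is required here.
    """
    if not 0 <= p < 0.5:
        raise ValueError("flip probability must satisfy 0 <= p < 0.5")
    m0, m1 = measured
    scale = 1.0 / (1.0 - 2.0 * p)
    return (scale * ((1 - p) * m0 - p * m1), scale * ((1 - p) * m1 - p * m0))


ideal = (0.8, 0.2)
noisy = apply_readout_noise(ideal, p=0.05)
recovered = mitigate(noisy, p=0.05)
```

With a single parameter per qubit there is only one number to calibrate, which is the sense in which a simplified response matrix lowers measurement cost. With finite-shot statistics the inverted values may fall slightly outside [0, 1] and would need clipping or a constrained fit.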
Mobile Edge Computing (MEC) is a technology designed for the on-demand provisioning of computing and storage services, strategically positioned close to users. In the MEC environment, frequently accessed content can be deployed and cached on edge servers to optimize the efficiency of content delivery, ultimately enhancing the quality of the user experience. However, due to the typical placement of edge devices and nodes at the network’s periphery, these components may face various potential fault tolerance challenges, including network instability, device failures, and resource constraints. Considering the dynamic nature of MEC, making high-quality content caching decisions for real-time mobile applications, especially those sensitive to latency, by effectively utilizing mobility information, continues to be a significant challenge. In response to this challenge, this paper introduces FT-MAACC, a mobility-aware caching solution grounded in multi-agent deep reinforcement learning and equipped with fault tolerance mechanisms. This approach comprehensively integrates content adaptivity algorithms to evaluate the priority of highly user-adaptive cached content. Furthermore, it relies on collaborative caching strategies based on multi-agent deep reinforcement learning models and establishes a fault-tolerance model to ensure the system’s reliability, availability, and persistence. Empirical results demonstrate that FT-MAACC outperforms its peer methods in cache hit rates and transmission latency.
Fault tolerance in microprocessor systems has become a popular topic of architecture research. Much work has been done at different levels to achieve reliability against soft errors, and some fault tolerance architectures have been proposed, but little attention has been paid to thread-level superscalar fault tolerance. This letter introduces the microthread concept into the superscalar processor fault tolerance domain and puts forward a novel fault tolerance architecture, namely the MicroThread Based (MTB) coarse-grained transient fault tolerance superscalar processor architecture, and then discusses some implementation details.
This paper proposes a policy-driven and multi-agent based model to enhance the fault tolerance and recovery capabilities of Web services in a distributed environment. The evaluation function of fault specifications and the corresponding handling mechanisms of the services are both defined in policies, which are expressed in XML. During the execution of the services, the occurrences of faults are monitored by the service monitor agent through local knowledge of the faults. Such local knowledge is dynamically generated by the service policy agent through querying and parsing the service policies from the service policy repository. When a fault occurs, the service process agent focuses on fault handling and service recovery, which are directed by the actions defined in the policies under the specific conditions. Such a policy-driven and multi-agent based fault handling approach can address the issues of flexibility, automation, and availability.
In a smart grid, a huge amount of data is collected for various applications, such as load monitoring and demand response. These data are used for analyzing the power state and formulating the optimal dispatching strategy. However, these big energy data, in terms of volume, velocity, and variety, raise concern over consumers' privacy. For instance, in order to optimize energy utilization and support demand response, numerous smart meters are installed at a consumer's home to collect energy consumption data at a fine granularity, but these fine-grained data may contain information on the appliances and thus the consumer's behaviors at home. In this paper, we propose a privacy-preserving data aggregation scheme based on secret sharing with fault tolerance in a smart grid, which ensures that the control center obtains the integrated data without compromising privacy. Meanwhile, we also consider fault tolerance and resistance to differential attack during the data aggregation. Finally, we perform a security analysis and performance evaluation of our scheme in comparison with other similar schemes. The analysis shows that our scheme can meet the security requirement, and it also shows better performance than other popular methods.
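How secret sharing can give both private aggregation and tolerance of a failed party can be sketched with Shamir's (t, n) threshold scheme, whose shares add homomorphically. Using Shamir here is an assumption, since the abstract does not name the construction; the field modulus, the three-server setup, and the function names are likewise hypothetical.

```python
import secrets
from typing import List, Tuple

PRIME = 2**61 - 1  # a Mersenne prime, used as the field modulus in this sketch
Share = Tuple[int, int]  # (x, y) point on the sharing polynomial


def split(secret: int, threshold: int, n_shares: int) -> List[Share]:
    """Split `secret` so that any `threshold` of the `n_shares` can rebuild it."""
    coeffs = [secret % PRIME] + [
        secrets.randbelow(PRIME) for _ in range(threshold - 1)
    ]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # Modular inverse via Fermat's little theorem
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


readings = [12, 30, 7]  # one reading per smart meter (made-up values)
per_meter = [split(r, threshold=2, n_shares=3) for r in readings]

# Server k only ever sees the k-th share of each meter and adds them up.
aggregated = [
    (k + 1, sum(shares[k][1] for shares in per_meter) % PRIME) for k in range(3)
]

total = reconstruct(aggregated[:2])  # the third server is unavailable
```

No single server learns an individual reading, yet the sum of the readings (49 here) is recoverable from any two of the three servers, which illustrates how a threshold scheme can combine privacy with tolerance of one failed party.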
In this paper, a fault-tolerant wide voltage conversion gain DC/DC converter for More Electric Aircraft (MEA) is proposed. The proposed converter consists of a basic Cuk converter module and n expandable units. By adjusting the operation state of the expandable units, the voltage conversion gain of the proposed converter can be regulated, which makes it suitable for wide voltage conversion applications. In particular, since mutual redundancy can be realized between the basic Cuk converter module and the expandable units, the converter can continue to work when an unpredictable fault occurs in the fault-tolerant parts of the proposed converter, which reflects the fault tolerance of the converter and significantly improves the reliability of the system. Moreover, the advantages of small input current ripple, automatic current sharing, and low voltage stress are also integrated in this converter. The working principle and features of the proposed converter are introduced, and an experimental prototype with 800 W output power has been manufactured to verify the practicability and availability of the proposed converter.
Funding for the ElasticMem paper: supported by the Fundamental Research Funds for the Central Universities (WK2150110022), the Anhui Provincial Natural Science Foundation (2208085QF189), and the National Natural Science Foundation of China (62202440).
Funding for the RobustPack paper: supported in part by the National Key R&D Program of China under No. 2023YFB2703800; the National Science Foundation of China under Grants U22B2027, 62172297, 62102262, 61902276 and 62272311; the Tianjin Intelligent Manufacturing Special Fund Project under Grant 20211097; the China Guangxi Science and Technology Plan Project (Guangxi Science and Technology Base and Talent Special Project) under Grant AD23026096 (Application Number 2022AC20001); the Henan Provincial Natural Science Foundation of China under Grant 622RC616; and the CCF-Nsfocus Kunpeng Fund Project under Grant CCF-NSFOCUS202207.
Funding: Supported by the Deanship of Scientific Research and Graduate Studies at King Khalid University under research grant number (R.G.P.2/93/45).
Abstract: The deployment of the Internet of Things (IoT) with smart sensors has facilitated the emergence of fog computing as an important technology for delivering services to smart environments such as campuses, smart cities, and smart transportation systems. Fog computing tackles a range of challenges, including processing, storage, bandwidth, latency, and reliability, by locally distributing secure information through end nodes. Consisting of endpoints, fog nodes, and back-end cloud infrastructure, it provides advanced capabilities beyond traditional cloud computing. In smart environments, particularly within smart city transportation systems, the abundance of devices and nodes poses significant challenges related to power consumption and system reliability. To address the challenges of latency, energy consumption, and fault tolerance in these environments, this paper proposes a latency-aware, fault-tolerant framework for resource scheduling and data management, referred to as the FORD framework, for smart cities in fog environments. This framework is designed to meet the demands of time-sensitive applications, such as those in smart transportation systems. The FORD framework incorporates latency-aware resource scheduling to optimize task execution in smart city environments, leveraging resources from both fog and cloud environments. Through simulation-based executions, tasks are allocated to the nearest available nodes with minimum latency. In the event of execution failure, a fault-tolerant mechanism is employed to ensure the successful completion of tasks. Upon successful execution, data is efficiently stored in the cloud data center, ensuring data integrity and reliability within the smart city ecosystem.
Funding: Supported by the National Natural Science Foundation of China (62325304, U22B2046, 62073079, 62376029), the Jiangsu Provincial Scientific Research Center of Applied Mathematics (BK20233002), and the China Postdoctoral Science Foundation (2023M730255, 2024T171123).
Abstract: Dear Editor, This letter studies the bipartite consensus tracking problem for heterogeneous multi-agent systems with actuator faults and a leader's unknown time-varying control input. To handle this problem, a continuous fault-tolerant control protocol based on observer design is developed. In addition, it is rigorously proved that the multi-agent system driven by the designed controllers can still achieve bipartite consensus tracking after faults occur.
Funding: Supported in part by the Open Fund of Intelligent Control Laboratory, China (No. ICL-2023–0202), and in part by the National Key R&D Program of China (Nos. 2021YFC2202600, 2021YFC2202603).
Abstract: In distributed fusion, when one or more sensors are disturbed by faults, a common problem is that their local estimates are inconsistent with those of the fault-free sensors. Most existing fault-tolerant distributed fusion algorithms, such as Covariance Union (CU) and Fault-tolerant Generalized Convex Combination (FGCC), apply only to the point estimation case, where local estimates and their associated error covariances are provided. A treatment focused on the fault-tolerant distributed fusion of arbitrary local Probability Density Functions (PDFs) is lacking. For this problem, we first propose functional Fuzzy c-Means (FCM) clustering algorithms induced by the Kullback–Leibler Divergence (KLD) and the reversed KLD, respectively, to soft-cluster all local PDFs. On this basis, two fault-tolerant distributed fusion algorithms for arbitrary local PDFs are then developed. They select the representative PDF of the cluster with the largest sum of memberships as the fused PDF. Numerical examples verify the better fault tolerance of the two developed distributed fusion algorithms.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51975294) and the Fundamental Research Funds for the Central Universities of China (Grant No. 30922010706).
Abstract: This paper is concerned with effective fault diagnosis and fault-tolerant control for aeronautic electromechanical actuators. By combining the advantages of model-driven and data-driven methods, a fault-tolerant nonsingular terminal sliding mode control method based on a support vector machine (SVM) is proposed. An SVM is designed to estimate the fault by off-line learning from small-sample data, solved as a convex quadratic program, and is introduced into a high-gain observer to improve state estimation and fault detection accuracy when a fault occurs. The state estimate of the observer is used for state reconfiguration. A novel nonsingular terminal sliding mode surface is designed, and the Lyapunov theorem is used to derive a parameter adaptation law and a control law. The proposed controller is guaranteed to achieve asymptotic stability, which is superior to many advanced fault-tolerant controllers. In addition, the parameter estimation can also help to diagnose system faults, because faults are reflected in parameter variations. Extensive comparative simulation and experimental results illustrate the effectiveness and advantages of the proposed controller compared with several other mainstream controllers.
Funding: Supported by the project Research and Application of Edge IoT Technology for Distributed New Energy Consumption in Distribution Areas (Project No. 5108-202218280A-2-394-XG).
Abstract: For permanent faults (PF) in the power communication network (PCN), such as link interruptions, the time-sensitive networking (TSN) that the PCN relies on typically employs spatial-redundancy fault-tolerance methods to maintain service stability and reliability, which often limits TSN scheduling performance in the fault-free ideal state. This paper therefore proposes a graph attention residual network-based routing and fault-tolerant scheduling mechanism (GRFS) for data flows in the PCN. First, it includes a communication system architecture for integrated terminals based on a cyclic queuing and forwarding (CQF) model and a fault recovery method, which reduces the impact of faults through the simplified scheduling configuration of CQF and a fault-tolerance policy that prioritizes the rerouting of faulty time-sensitive (TS) flows. Second, considering that PF-induced changes in network topology are more appropriately handled by making routing and time-slot injection decisions hop by hop, and that a reasonable network load can reduce the damage caused by PF and reserve resources for rerouting faulty TS flows, an optimization model for joint routing and scheduling is constructed with scheduling success rate as the objective and with traffic latency and network load as constraints. Third, to capture changes in TSN topology and traffic load, a D3QN algorithm based on a multi-head graph attention residual network (MGAR) is designed to solve the model, where the MGAR-based encoder reconstructs the TSN status into feature embedding vectors and a dueling network decoder decodes the reconstructed feature embedding vectors. Simulation results show that GRFS outperforms heuristic fault-tolerance algorithms and other benchmark schemes by approximately 10% in routing and scheduling success rate in ideal states and by 5% in rerouting and rescheduling success rate in fault states.
Funding: Supported by the National Key Research and Development Program of China (2017YFB1400700); the National Natural Science Foundation of China (61602537, U1509214); the Central University of Finance and Economics Funds for the Youth Talent Support Plan (QYP1808); First-Class Discipline Construction in 2019; and the open fund of the Key Laboratory of Grain Information Processing and Control (KFJJ-2018-202).
Abstract: Blockchain, with its decentralized structure, transparency, credibility, time-series data, and immutability, has been considered a promising technology. The consensus algorithm, one of the core techniques of blockchain, directly affects the scalability of blockchain systems. Existing probabilistic-finality consensus algorithms such as PoW and PoS suffer from high power consumption and low efficiency, while absolute-finality consensus algorithms such as PBFT and HoneyBadgerBFT cannot meet the scalability requirement in a large-scale network. In this paper, we propose a novel optimized practical Byzantine fault tolerance consensus algorithm based on the EigenTrust model, namely T-PBFT, which is a multi-stage consensus algorithm. It evaluates node trust from the transactions between nodes, so that high-quality nodes in the network are selected to construct a consensus group. To reduce the probability of view change, we propose replacing the single primary node with a primary group. Group signatures and mutual supervision further enhance the robustness of the primary group. Finally, we analyze T-PBFT and compare it with other Byzantine fault-tolerant consensus algorithms. Theoretical analysis shows that T-PBFT can improve the Byzantine fault-tolerance rate and reduce both the probability of view change and the communication complexity.
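The trust-based group selection described above can be illustrated with a minimal sketch of the basic EigenTrust iteration (row-normalized local trust, then repeated multiplication by the transposed trust matrix), followed by picking the highest-scoring nodes. This is an assumption-laden illustration, not T-PBFT itself: the abstract does not give its trust formula, so the `local` score matrix, the function names, and the top-`size` selection rule are all hypothetical.

```python
def eigentrust(local, iters=200, tol=1e-12):
    """Basic EigenTrust: global trust t satisfies t = C^T t.

    local[i][j] is node i's raw satisfaction score for node j
    (hypothetical input; negative scores are clipped to zero).
    """
    n = len(local)
    c = []
    for row in local:
        clipped = [max(v, 0.0) for v in row]
        total = sum(clipped)
        # A node with no positive ratings trusts everyone equally.
        c.append([v / total for v in clipped] if total > 0 else [1.0 / n] * n)

    t = [1.0 / n] * n  # uniform starting vector
    for _ in range(iters):
        nxt = [sum(c[i][j] * t[i] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(nxt, t)) < tol
        t = nxt
        if converged:
            break
    return t


def select_consensus_group(trust, size):
    """Return the indices of the `size` most trusted nodes (assumed rule)."""
    ranked = sorted(range(len(trust)), key=lambda i: trust[i], reverse=True)
    return ranked[:size]


if __name__ == "__main__":
    scores = [[0, 4, 1], [3, 0, 1], [5, 2, 0]]  # made-up transaction scores
    trust = eigentrust(scores)
    print(trust, select_consensus_group(trust, 2))
```

The original EigenTrust algorithm also mixes in a set of pre-trusted peers to guarantee convergence; that term is omitted here for brevity.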
Funding: Supported by the National Natural Science Foundation of China (Grants U1964201, 61790562, and 61803120) and by the Fundamental Research Funds for the Central Universities.
Abstract: Environmental perception is one of the key technologies for realizing autonomous vehicles. Autonomous vehicles are often equipped with multiple sensors to form a multi-source environmental perception system. These sensors are very sensitive to light and background conditions, which introduce a variety of global and local fault signals that pose great safety risks to the autonomous driving system during long-term operation. In this paper, a real-time data fusion network with fault diagnosis and fault tolerance mechanisms is designed. By introducing prior features to make the network lightweight, the features of the input data can be extracted in real time. A new sensor reliability evaluation method is proposed that calculates the global and local confidence of sensors. Using the temporal and spatial correlation between sensor data, sensor redundancy is exploited to diagnose the local and global confidence levels of sensor data in real time, eliminate faulty data, and ensure the accuracy and reliability of data fusion. Experiments show that the network achieves state-of-the-art results in speed and accuracy, and can accurately detect the location of the target when some sensors are out of focus or out of order. The proposed fusion framework is shown to be effective for intelligent vehicles in terms of real-time performance and reliability.
Funding: Supported by the Natural Science Foundation of Fujian Province, China (No. 2022J01566).
Abstract: The in-core self-powered neutron detector (SPND) is a key measuring device for monitoring parameters and evaluating the operating conditions of nuclear reactors. Prompt detection and tolerance of faulty SPNDs are indispensable for reliable reactor management. To fully extract the correlated state information of SPNDs, we constructed a twin model based on a generalized regression neural network (GRNN) that represents the common relationships among the overall signals. Faulty SPNDs were identified from the functional concordance between the twin model and the real monitoring systems, by calculating the error probability distribution between the model outputs and the real values. Fault detection is followed by a tolerance phase that reinforces the stability of the twin model in the case of massive failures. A weighted K-nearest neighbor model was employed to reasonably reconstruct the values of the faulty signals and guarantee data purity. The experimental evaluation of the proposed method showed promising results, with excellent output consistency and high detection accuracy for both single- and multiple-point faulty SPNDs. For unexpected excessive failures, the proposed tolerance approach can efficiently repair fault behaviors and enhance the prediction performance of the twin model.
Funding: Project (2012AA051601) supported by the High-Tech Research and Development Program of China.
Abstract: An open-circuit fault of the power switches in a shunt active power filter (SAPF) exacerbates the harmonic pollution of the power grid and degrades the reliability of the devices and the system. A fault diagnosis method based on a reference model, together with an over-modulation strategy under hardware fault tolerance, is proposed for the SAPF. First, a mathematical model is established for the SAPF. Second, residuals are generated by comparing the outputs of the reference model with those of the actual system, and the open-switch fault is detected and diagnosed by residual evaluation. After that, hardware fault tolerance is performed with the three-phase four-switch (TPFS) topology to isolate the faulty phase. Finally, the over-modulation strategy is proposed to increase the voltage transfer ratio of the TPFS topology. Simulation and experimental results verify the feasibility and effectiveness of the proposed method.
Abstract: The defects of an OLED-based display, mainly electrical shorts, cause pixels to stay dark, decrease the brightness of the panel, severely degrade display uniformity, and also consume a considerable amount of power. In this paper, a novel circuit for AM-OLEDs employing p-type low-temperature poly-Si thin-film transistors is introduced to offer fault-tolerant capabilities for such defects. The results show that this circuit can save significant power and maintain the luminance of the pixel without changing the driving current.
Abstract: The use of technology has increased vastly, and today computer systems are interconnected via different communication media. The use of distributed systems in our day-to-day activities has grown with data distribution, because distributed systems enable nodes to organise and share their resources among the connected systems or devices, giving people access to geographically distributed computing facilities. Distributed systems may suffer from a lack of service availability due to multiple system failures at multiple failure points. This article highlights the different fault tolerance mechanisms used in distributed systems to prevent multiple system failures at multiple failure points, considering replication, high redundancy, and high availability of the distributed services.
Funding: Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049) and the Joint Fund of Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001).
Abstract: Readout errors caused by measurement noise are a significant source of errors in quantum circuits; they severely affect the output results and are an urgent problem in noisy intermediate-scale quantum (NISQ) computing. In this paper, we use the bit-flip averaging (BFA) method to mitigate frequent readout errors in quantum generative adversarial networks (QGAN) for image generation. BFA simplifies the response matrix structure by averaging the qubits for each random bit-flip in advance, which addresses the high measurement cost of traditional error mitigation methods. Our experiments were simulated in Qiskit using the handwritten digit image recognition dataset. Under the BFA-based method, the Kullback-Leibler (KL) divergence of the generated images converges to 0.04, 0.05, and 0.1 for readout error probabilities of p=0.01, p=0.05, and p=0.1, respectively. Additionally, by evaluating the fidelity of the quantum states representing the images, we observe average fidelity values of 0.97, 0.96, and 0.95 for the three readout error probabilities, respectively. These results demonstrate the robustness of the model in mitigating readout errors and provide a highly fault-tolerant mechanism for image generation models.
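For readers unfamiliar with the KL divergence metric reported above, here is a minimal sketch of the discrete definition D(P‖Q) = Σ p_i ln(p_i/q_i). The abstract does not say which distributions are compared, in which direction, or with which logarithm base, so the natural logarithm, the argument order, and the `eps` floor are assumptions made only for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(P || Q) in nats.

    p and q are sequences of probabilities over the same outcomes.
    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    (an assumed safeguard) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


if __name__ == "__main__":
    target = [0.5, 0.5]        # hypothetical target distribution
    generated = [0.6, 0.4]     # hypothetical generated distribution
    print(kl_divergence(target, generated))
```

A value of 0 means the two distributions coincide, so smaller reported values indicate generated images closer to the target distribution.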
Funding: Supported by the Innovation Fund Project of Jiangxi Normal University (YJS2022065) and the Domestic Visiting Program of Jiangxi Normal University.
Abstract: Mobile Edge Computing (MEC) is a technology designed for the on-demand provisioning of computing and storage services, strategically positioned close to users. In the MEC environment, frequently accessed content can be deployed and cached on edge servers to optimize the efficiency of content delivery, ultimately enhancing the quality of the user experience. However, because edge devices and nodes are typically placed at the network's periphery, these components may face various fault tolerance challenges, including network instability, device failures, and resource constraints. Given the dynamic nature of MEC, making high-quality content caching decisions for real-time mobile applications, especially latency-sensitive ones, by effectively utilizing mobility information remains a significant challenge. In response to this challenge, this paper introduces FT-MAACC, a mobility-aware caching solution grounded in multi-agent deep reinforcement learning and equipped with fault tolerance mechanisms. This approach integrates content adaptivity algorithms to evaluate the priority of highly user-adaptive cached content. Furthermore, it relies on collaborative caching strategies based on multi-agent deep reinforcement learning models and establishes a fault-tolerance model to ensure the system's reliability, availability, and persistence. Empirical results demonstrate that FT-MAACC outperforms its peer methods in cache hit rate and transmission latency.
Abstract: Fault tolerance in microprocessor systems has become a popular topic in architecture research. Much work has been done at different levels to achieve reliability against soft errors, and several fault-tolerant architectures have been proposed. However, little attention has been paid to thread-level fault tolerance in superscalar processors. This letter introduces the microthread concept into the superscalar processor fault tolerance domain and puts forward a novel fault-tolerant architecture, namely the MicroThread Based (MTB) coarse-grained transient fault tolerance superscalar processor architecture, and then discusses some implementation details.
Abstract: This paper proposes a policy-driven, multi-agent-based model to enhance the fault tolerance and recovery capabilities of Web services in a distributed environment. The evaluation function for fault specifications and the corresponding handling mechanisms of the services are both defined in policies, which are expressed in XML. During service execution, the occurrence of faults is monitored by the service monitor agent using local knowledge of the faults. This local knowledge is dynamically generated by the service policy agent, which queries and parses the service policies from the service policy repository. When a fault occurs, the service process agent handles the fault and recovers the service, directed by the actions that the policies define for the specific conditions. Such a policy-driven, multi-agent-based fault handling approach can address the issues of flexibility, automation, and availability.
Abstract: In a smart grid, a huge amount of data is collected for various applications, such as load monitoring and demand response. These data are used for analyzing the power state and formulating the optimal dispatching strategy. However, the volume, velocity, and variety of these big energy data raise concerns over consumers' privacy. For instance, in order to optimize energy utilization and support demand response, numerous smart meters are installed at a consumer's home to collect energy consumption data at a fine granularity, but these fine-grained data may contain information on the appliances and thus the consumer's behaviors at home. In this paper, we propose a privacy-preserving data aggregation scheme based on secret sharing with fault tolerance in a smart grid, which ensures that the control center obtains the integrated data without compromising privacy. We also consider fault tolerance and resistance to differential attacks during the data aggregation. Finally, we perform a security analysis and performance evaluation of our scheme in comparison with other similar schemes. The analysis shows that our scheme meets the security requirements and performs better than other popular methods.
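The idea of aggregating meter readings without revealing any individual reading can be sketched with plain additive secret sharing. This is only an illustration under stated assumptions: the abstract does not specify which secret-sharing construction the scheme uses or how it achieves fault tolerance, so the modulus, the function names `split` and `aggregate`, and the aggregator layout are hypothetical, and fault tolerance is not modeled.

```python
import secrets

MODULUS = 2**61 - 1  # assumed public modulus, larger than any possible total


def split(reading: int, n_shares: int) -> list:
    """Split one reading into n_shares random values that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    shares.append((reading - sum(shares)) % MODULUS)
    return shares


def aggregate(readings: list, n_aggregators: int) -> int:
    """Each meter sends one share to each aggregator; only sums are combined."""
    partial = [0] * n_aggregators
    for reading in readings:
        for idx, share in enumerate(split(reading, n_aggregators)):
            partial[idx] = (partial[idx] + share) % MODULUS
    # The control center sees only the per-aggregator partial sums.
    return sum(partial) % MODULUS


if __name__ == "__main__":
    print(aggregate([3, 5, 7], n_aggregators=3))  # total consumption
```

Each individual share is uniformly random, so an aggregator that sees one share per meter learns nothing about a single reading; only the combined total is recoverable.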
Funding: Supported by the National Natural Science Foundation of China (No. 51707103) and the Hubei Provincial Key Laboratory on Operation and Control of Cascaded Hydropower Station, China (No. 2022KJX08).
Abstract: In this paper, a fault-tolerant DC/DC converter with wide voltage conversion gain for More Electric Aircraft (MEA) is proposed. The proposed converter consists of a basic Cuk converter module and n expandable units. By adjusting the operating state of the expandable units, the voltage conversion gain of the converter can be regulated, which makes it suitable for wide voltage conversion applications. In particular, since mutual redundancy can be realized between the basic Cuk converter module and the expandable units, the converter can continue to work when an unpredictable fault occurs in its fault-tolerant parts, which demonstrates the fault tolerance of the converter and significantly improves the reliability of the system. Moreover, the converter also offers small input current ripple, automatic current sharing, and low voltage stress. The working principle and features of the proposed converter are introduced, and an experimental prototype with 800 W output power has been built to verify the practicability and availability of the proposed converter.