Test data compression and test resource partitioning (TRP) are essential for reducing the volume of test data in system-on-chip testing. A novel variable-to-variable-length compression code, the advanced frequency-directed run-length (AFDR) code, is proposed. Unlike frequency-directed run-length (FDR) codes, AFDR encodes both 0-runs and 1-runs and assigns the same codes to runs of equal length. It also modifies the codes for 00 and 11 to improve compression performance. Experimental results on the ISCAS 89 benchmark circuits show that AFDR codes achieve higher compression ratios than FDR and other compression codes.
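For orientation, the baseline FDR code that AFDR builds on can be sketched as follows. This is a minimal illustration of the commonly described FDR scheme (runs of 0s terminated by a 1, group k covering run lengths 2^k - 2 to 2^(k+1) - 3, a k-bit prefix and a k-bit tail); it is not the AFDR code from the abstract. The function names fdr_codeword and fdr_encode are hypothetical, and the handling of trailing 0s without a terminating 1 is an assumption made only to keep the sketch total.

```python
def fdr_codeword(run_length: int) -> str:
    """Return the FDR codeword for a run of `run_length` zeros ended by a 1."""
    if run_length < 0:
        raise ValueError("run length must be non-negative")
    # Group k covers run lengths 2**k - 2 .. 2**(k + 1) - 3.
    k = 1
    while run_length > 2 ** (k + 1) - 3:
        k += 1
    prefix = "1" * (k - 1) + "0"           # k-bit group prefix
    offset = run_length - (2 ** k - 2)     # position inside the group
    tail = format(offset, f"0{k}b")        # k-bit tail
    return prefix + tail


def fdr_encode(bits: str) -> str:
    """Encode a 0/1 string as concatenated FDR codewords."""
    out = []
    run = 0
    for b in bits:
        if b == "0":
            run += 1
        elif b == "1":
            out.append(fdr_codeword(run))
            run = 0
        else:
            raise ValueError("input must contain only '0' and '1'")
    if run:
        # Assumption: a trailing run without a closing 1 is encoded
        # as if it were terminated.
        out.append(fdr_codeword(run))
    return "".join(out)


if __name__ == "__main__":
    # Runs of length 3, 0 and 6 give the codewords 1001, 00 and 110000.
    print(fdr_encode("0001" + "1" + "0000001"))
```

Short runs get short codewords, which is why the scheme pays off on test sets dominated by long 0-runs; per the abstract, AFDR extends the idea to 1-runs as well.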
The evolvable multiprocessor (EvoMP), a novel multiprocessor system-on-chip (MPSoC) machine with evolvable task decomposition and scheduling, offers low-cost and efficient fault tolerance as a major feature. Non-centralized control and adaptive distribution of the program among the available processors are two major capabilities of this platform that help achieve an efficient fault tolerance scheme. This letter presents the operational and architectural details of the scheme. When a processor becomes faulty, it is excluded from program execution for the remaining run time. The method also uses the dynamic rescheduling capability of the system to achieve the maximum possible efficiency after the number of processors is reduced. The results confirm the efficiency and advantages of the proposed approach over common redundancy-based techniques in similar systems.
In real-time fault-tolerant scheduling for multiprocessor systems, the primary-backup scheme plays an important role. A backup copy is preferably executed as a passive backup copy whenever possible, because passive copies can exploit the backup copy de-allocation and overloading techniques to improve schedulability. In this paper, we propose a novel efficient fault-tolerant rate-monotonic best-fit (ERMBF) algorithm for multiprocessor systems to enhance schedulability. Unlike existing scheduling algorithms, which start scheduling tasks with only one processor, ERMBF pre-allocates a certain number of processors before scheduling begins, which enlarges the search space for tasks. In addition, when a new processor is allocated, the task copies already assigned to the existing processors are reassigned in order to find a better task assignment. Both strategies aim to let as many backup copies as possible execute in passive status. As a result, ERMBF can schedule a set of tasks on fewer processors without losing the real-time and fault-tolerant capabilities of the system. Simulation results show that ERMBF significantly improves schedulability over comparable existing algorithms in the literature.
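As background for the "rate-monotonic best-fit" part of the name, the following is a minimal sketch of plain best-fit partitioning under the Liu and Layland utilization bound n(2^(1/n) - 1). It is not ERMBF: it has no backup copies, no processor pre-allocation and no reassignment step. The task format (execution time, period) and the names ll_bound and rm_best_fit are assumptions made for illustration.

```python
def ll_bound(n: int) -> float:
    """Liu and Layland utilization bound for n rate-monotonic tasks."""
    return n * (2 ** (1 / n) - 1)


def rm_best_fit(tasks):
    """Partition (execution_time, period) tasks onto processors.

    Tasks are taken in rate-monotonic order (shortest period first).
    Each task goes to the feasible processor with the highest current
    utilization (best fit); a new processor is opened if none fits.
    """
    processors = []  # each processor is a list of (c, t) tuples
    for c, t in sorted(tasks, key=lambda task: task[1]):
        u = c / t
        if u > 1:
            raise ValueError("task utilization exceeds one processor")
        best, best_load = None, -1.0
        for proc in processors:
            load = sum(ci / ti for ci, ti in proc)
            fits = load + u <= ll_bound(len(proc) + 1)
            if fits and load > best_load:
                best, best_load = proc, load
        if best is None:
            best = []
            processors.append(best)
        best.append((c, t))
    return processors


if __name__ == "__main__":
    # Two tasks of utilization 0.5 cannot share a processor under the
    # bound for two tasks (about 0.828), so two processors are used.
    for index, proc in enumerate(rm_best_fit([(1, 2), (1, 2), (1, 4)])):
        print(index, proc)
```

The utilization bound is a sufficient test only, so this sketch may open more processors than an exact schedulability analysis would require.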
High-speed computational performance is usually gained at the cost of large hardware resources, which restricts the use of high-accuracy algorithms because hardware cost is limited in practice. To solve this problem, this study proposes a novel method for designing a field programmable gate array (FPGA)-based non-uniform rational B-spline (NURBS) interpolator and motion controller that adopts the embedded multiprocessor technique. The hardware and software design of the multiprocessor, in which one processor performs NURBS interpolation and the other performs position servo control, is presented. Performance analysis and experiments on an X-Y table are carried out, and the hardware cost and the time consumed by interpolation and motion control are compared with those of existing methods. The results indicate that, compared with existing methods, the proposed method can reduce the hardware cost by 97.5% while using a higher-accuracy interpolation algorithm within a period of 0.5 ms. The proposed method ensures real-time performance and interpolation accuracy, significantly reduces hardware cost, and is practical for industrial applications.
Single-event effects (SEEs) induced by medium-energy protons in a 28 nm system-on-chip (SoC) were investigated at the China Institute of Atomic Energy. An on-chip memory block was irradiated with 90 MeV and 70 MeV protons. Single-bit upset and multicell upset events were observed, and a maximum of nine upset cells was found in the 90 MeV proton irradiation test. The results indicate that the SEE sensitivities of the 28 nm SoC to 90 MeV and 70 MeV protons were similar. Cosmic Ray Effects on Micro-Electronics Monte Carlo simulations were analyzed; they demonstrate that protons can induce effects in a 28 nm SoC if their energies are greater than 1.4 MeV and that the lowest corresponding linear energy transfer was 0.142 MeV cm^2 mg^-1. The similarities and discrepancies of the SEEs induced by the 90 MeV and 70 MeV protons were analyzed.
Single event effects (SEEs) in a 28-nm system-on-chip (SoC) were assessed using heavy ion irradiation, and susceptibilities under different processor configurations and data accessing patterns were investigated. The patterns included the sole processor (SP) and asymmetric multiprocessing (AMP) patterns with static and dynamic data accessing. Single event upset (SEU) cross sections in static accessing can be more than twice as high as those in dynamic accessing, and the processor configuration pattern is not a critical factor for the SEU cross sections. The cross section interval of upset events was evaluated, and the soft error rates of the SoC in the aerospace environment were predicted. The tests also indicated that ultra-high linear energy transfer (LET) particles can cause exception currents in the 28-nm SoC, some of which are even lower than in the normal case.
To address the limitations of register transfer level verification, a new functional verification method based on random testing is proposed for the system level of system-on-chip design. The validity of the method is proven theoretically. Specifically, test cases are generated using several randomization approaches. Moreover, the testbench for system-level verification under the proposed method is designed using an advanced modeling language. Because the testbench generates test cases quickly, hardware/software co-simulation and co-verification can be implemented and hardware/software partitioning plans can be evaluated easily. A comparison method is used to evaluate testing validity. The evaluation indicates that partition testing is more efficient than random testing only when one or more subdomains are covered by the error region, although random testing is generally more efficient than partition testing. The experimental results indicate that the method performs well in functional coverage and testing cost and can discover functional errors as early as possible.
To decrease the cost of exchanging load information among processors, a dynamic load-balancing (DLB) algorithm that adopts multicast tree technology is proposed. Multicast tree construction rules are also proposed to avoid wrongly transferred or redundant DLB messages caused by overlapping multicast trees. The proposed DLB algorithm is distributed-controlled and sender-initiated, and it helps heavily loaded processors completely distribute their redundant loads with a minimum number of executions. Experiments compared the proposed DLB algorithm with three other algorithms; the results demonstrate the effectiveness and practicality of the proposed algorithm for large-scale compute-intensive tasks.
The Pk|fix|Cmax problem is a new scheduling problem based on multiprocessor parallel jobs, and it has been proved to be NP-hard when k ≥ 3. This paper focuses on the case k = 3. Some new observations and new techniques for the P3|fix|Cmax problem are offered. The concept of semi-normal schedulings is introduced, and a very simple linear-time algorithm, the Semi-normal Algorithm, is developed for constructing them. Using the method of the classical Graham List Scheduling, a thorough analysis of the optimal scheduling on a special instance is provided, which shows that the algorithm is an approximation algorithm with ratio 9/8 for any instance of the P3|fix|Cmax problem, improving the previous best ratio of 7/6 by M. X. Goemans.
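To make the problem setting concrete: in Pk|fix|Cmax each job needs a fixed subset of the k processors simultaneously, and the goal is to minimize the makespan. The sketch below places jobs greedily in list order, starting each job as soon as all of its required processors are free. It is a simplified illustration only, not the Semi-normal Algorithm from the abstract, and it carries no approximation guarantee. The job format (duration, processor indices) and the name greedy_fixed_schedule are assumptions.

```python
def greedy_fixed_schedule(jobs, k=3):
    """Schedule jobs that each need a fixed set of processors.

    `jobs` is a list of (duration, processors) pairs, where `processors`
    is a non-empty tuple of indices in range(k). Jobs are placed in list
    order, each starting when all of its processors are free.
    Returns (schedule, makespan) with schedule entries (start, end, processors).
    """
    free_at = [0] * k  # time at which each processor becomes idle
    schedule = []
    for duration, procs in jobs:
        if not procs or any(p < 0 or p >= k for p in procs):
            raise ValueError("each job needs valid processor indices")
        start = max(free_at[p] for p in procs)
        end = start + duration
        for p in procs:
            free_at[p] = end
        schedule.append((start, end, tuple(procs)))
    return schedule, max(free_at)


if __name__ == "__main__":
    jobs = [(2, (0, 1)), (3, (2,)), (1, (1, 2)), (2, (0,))]
    schedule, makespan = greedy_fixed_schedule(jobs)
    for entry in schedule:
        print(entry)
    print("makespan:", makespan)
```

The result depends heavily on the order of the job list, which is exactly the degree of freedom that more careful algorithms for this problem exploit.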
In shared-memory bus-based multiprocessors, as the number of processors grows, the processors spend an increasing amount of time waiting for access to the bus (and shared memory). This contention reduces processor performance and limits the number of processors that can be used efficiently in bus-based systems. Because multiprocessor performance depends on many parameters that affect it in different ways, timed Petri nets are used to model shared-memory bus-based multiprocessors at the instruction execution level, and the developed models are used to study how processor performance changes with the number of processors in the system. The results clearly illustrate the restriction on the number of processors imposed by the shared bus. All performance characteristics presented in this paper are obtained by discrete-event simulation of the Petri net models.
The Fork-Join program consisting of K parallel tasks is a useful model for a large number of computing applications. When the parallel processor has multiple channels, later tasks may finish execution before earlier tasks and may join with tasks from other programs. This phenomenon is called exchangeable join (EJ), and it introduces correlation into the tasks' service times. In this work, we investigate the response time of multiprocessor systems with EJ using a new approach. We analyze two aspects of such systems: exchangeable join (EJ) and the capacity constraint (CC). We prove that EJ can effectively reduce the system response time, while the amount of reduction is constrained by the capacity of the multiprocessor. An upper bound model is constructed based on this analysis, and a quick estimation algorithm is proposed. The approximation formula is verified by extensive simulation results, which show that the relative approximation error is less than 5%.
This paper presents the bus arbitration mechanism of the PI-MPS multiprocessor system, describes the encoding approach, the arbiter timing states, and the uniqueness of the master module on the interconnection bus, and measures and analyzes the bus arbitration latency.
The design of parallel algorithms for shared-memory MIMD machines is studied in this paper, with emphasis on methods for designing efficient parallel algorithms. The design of efficient parallel algorithms should be based on the following considerations: algorithm parallelism and hardware parallelism; the granularity of the parallel algorithm; and algorithm optimization according to the underlying parallel machine. These principles are applied to solve a model PDE problem. The speedup of the new method is high. The results were tested and evaluated on a shared-memory MIMD machine, and the practical results agreed with the predicted performance.
This paper considers the scheduling problem observed in the chip sorting operation of LED manufacturing, where each lot (job), with a release time, has four operations to be processed on a set of processing stages without a predetermined route. Each stage has one or more identical sorting machines. The sorting machine scheduling problem can be treated as a four-stage multiprocessor open shop problem with dynamic job release, and the objective is to minimize the makespan. The problem is formulated as a mixed integer programming (MIP) model, which is shown empirically to be computationally intractable. A particle swarm optimization (PSO) algorithm is therefore proposed. A series of computational experiments is conducted to evaluate the performance of the proposed PSO in comparison with exact solutions on various small-size problem instances. The results show that the PSO algorithm can find optimal or better solutions in most cases within one second.
IEEE J. Solid-State Circuits, 2019, doi:10.1109/JSSC.2018.2884349. Nonvolatile processor (NVP) is promising for energy-harvesting-powered internet-of-things (IoT) devices, owing to its unique capability to sustain computation progress over power outages. Recently.
FMS is a basic and frontier technology of advanced manufacturing, and its critical component is the FMS control system. The reconstructable fault-tolerant multiprocessor control system YH-MCS is the result of research on high-performance, highly reliable FMS control systems. This paper describes its architecture, technical characteristics, academic value, and application potential.
Maintaining temporal consistency of real-time data is important for cyber-physical systems. Most previous studies focus on uniprocessor systems. In this paper, the problem of temporal consistency maintenance on multiprocessor platforms with instance skipping is formulated based on the (m,k)-constrained model. A partitioned scheduling method, SC-AD, is proposed to solve the problem. SC-AD uses a derived sufficient schedulability condition to calculate the initial value of m for each sensor transaction. It then partitions the transactions among the processors in a balanced way. To further reduce the average relative invalid time of real-time data, SC-AD judiciously increases the values of m for the transactions assigned to each processor. Experimental results show that SC-AD outperforms the baseline methods in terms of average relative invalid time and average valid ratio under different system workloads.
As IP-core reuse has become widespread, many intellectual property (IP) cores are embedded in different layers of a system-on-chip (SOC). Although this approach reduces development cycles and overhead, it poses a challenge for SOC testing. This paper proposes a scheduling method for hierarchical SOCs based on a virtual flattened architecture, which converts the hierarchical architecture into a virtual flattened one. The method has advantages over the traditional approach, which tests parent cores and child cores separately. Finally, the method is verified on the ITC'02 benchmarks and gives good results, effectively reducing test time and overhead.
Ultrasonic testing systems have been extensively used in medical imaging and non-destructive testing applications. Generally, these systems target a particular application or material. To make them portable and more adaptable to test environments, this study presents a reconfigurable ultrasonic testing system (RUTS) with dynamic reconfiguration capabilities. RUTS includes a fully programmable analog front-end (AFE), which facilitates beamforming and signal conditioning for a variety of applications. The RUTS AFE supports up to 8 transducers for phased-array implementation. A Xilinx Zynq system-on-chip (SoC)-based ZedBoard provides the back-end processing of RUTS. The ARM embedded processor within the Zynq SoC manages ultrasonic data acquisition and processing as well as overall system control, which makes RUTS a unique platform for ultrasonic researchers to experiment with and evaluate a wide range of real-time ultrasonic signal processing applications. This Linux-based system is used to implement ultrasonic data compression, providing a versatile environment for further development of ultrasonic imaging and testing systems. Furthermore, this study demonstrates the capabilities of RUTS by performing ultrasonic data acquisition and data compression in real time. This reconfigurable system thus enables ultrasonic designers and researchers to efficiently prototype different experiments and to incorporate and analyze high-performance ultrasonic signal and image processing algorithms.
Funding (AFDR test data compression paper): Supported by the National Natural Science Foundation of China (61076019, 61106018); the Aeronautical Science Foundation of China (20115552031); the China Postdoctoral Science Foundation (20100481134); the Jiangsu Province Key Technology R&D Program (BE2010003); the Nanjing University of Aeronautics and Astronautics Research Funding (NS2010115); and the Nanjing University of Aeronautics and Astronautics Initial Funding for Talented Faculty (1004-YAH10027).
Funding (ERMBF fault-tolerant scheduling paper): Supported by the National Basic Research Program of China (973 Program, 2004CB318200).
Funding (FPGA-based NURBS interpolator paper): Supported by the National Key Basic Research Program of China (973 Program, Grant No. 2011CB706804); the Shanghai Municipal Science and Technology Commission of China (Grant No. 11QH1401400); and the Research Project of the State Key Laboratory of Mechanical System & Vibration of China (Grant No. MSVMS201102).
Funding (proton-induced SEE paper): Supported by the National Natural Science Foundation of China (Grant Nos. 11575138, 11835006, 11690040, and 11690043).
Funding (heavy-ion SEE paper): Project supported by the National Natural Science Foundation of China (Grant Nos. 11575138, 11835006, 11690040, and 11690043); the Fund from the Innovation Center of Radiation Application (Grant No. KFZC2019050321); the Fund from the Science and Technology on Vacuum Technology and Physics Laboratory, Lanzhou Institute of Physics (Grant No. ZWK1804); and the Program of the China Scholarships Council (Grant No. 201906280343).
Funding (system-level SoC verification paper): Supported by the National High Technology Research and Development Program of China (863 Program) (2002AA1Z1490); the Specialized Research Fund for the Doctoral Program of Higher Education (20040486049); and the University Cooperative Research Fund of Huawei Technology Co., Ltd.
Funding (multicast-tree DLB paper): Supported by the National Natural Science Foundation of China (69973007).
Funding (Fork-Join response time paper): Project supported by the National Natural Science Foundation of China (Nos. 60274011 and 60574067) and the Program for New Century Excellent Talents in University (No. NCET-04-0094), China.
文摘The Fork-Join program consisting of K parallel tasks is a useful model for a large number of computing applications. When the parallel processor has multi-channels, later tasks may finish execution earlier than their earlier tasks and may join with tasks from other programs. This phenomenon is called exchangeable join (EJ), which introduces correlation to the task’s service time. In this work, we investigate the response time of multiprocessor systems with EJ with a new approach. We analyze two aspects of this kind of systems: exchangeable join (EJ) and the capacity constraint (CC). We prove that the system response time can be effectively reduced by EJ, while the reduced amount is constrained by the capacity of the multiprocessor. An upper bound model is constructed based on this analysis and a quick estimation algorithm is proposed. The approximation formula is verified by extensive simulation results, which show that the relative error of approximation is less than 5%.
Abstract: This paper presents the mechanism of bus arbitration in the PI-MPS multiprocessor system, describes the encoding approach, the arbiter timing states and the uniqueness of the master module in the interconnection bus, and also measures and analyses the latency of bus arbitration.
Abstract: The design of parallel algorithms is studied in this paper. These algorithms are applicable to shared memory MIMD machines. The emphasis is put on methods for designing efficient parallel algorithms. The design of efficient parallel algorithms should be based on the following considerations: algorithm parallelism and hardware parallelism; granularity of the parallel algorithm; and algorithm optimization according to the underlying parallel machine. These principles are applied to solve a model PDE problem. The speedup of the new method is high. The results were tested and evaluated on a shared memory MIMD machine. The practical results agreed with the predicted performance.
Abstract: This paper considers the scheduling problem observed in the chip sorting operation of LED manufacturing, where each lot (job) with a release time has four operations to be processed on a set of processing stages without a pre-determined route. Each stage has one or more identical sorting machines. The sorting machine scheduling problem can be treated as a four-stage multiprocessor open shop problem with dynamic job release, and the objective is to minimize the makespan. The problem is formulated as a mixed integer programming (MIP) model, and its computational intractability is shown empirically. Because of this intractability, a particle swarm optimization (PSO) algorithm is proposed. A series of computational experiments are conducted to evaluate the performance of the proposed PSO in comparison with the exact solution on various small-size problem instances. The results show that the PSO algorithm could find most optimal or better solutions in one second.
Abstract: IEEE J. Solid-State Circuits, 2019, doi:10.1109/JSSC.2018.2884349. Nonvolatile processor (NVP) is promising for energy-harvesting-powered internet-of-things (IoT) devices, owing to its unique capability to sustain computation progress over power outages. Recently.
Funding: the Commission of Science, Technology and Industry for National Defence
Abstract: FMS is the basic and frontier technology of advanced manufacturing. Its critical component is the FMS control system. The reconstructable fault-tolerant multiprocessor control system, YH-MCS, is the result of research on high-performance and highly reliable FMS control systems. This paper describes its architecture, technical characteristics, academic value and application potential.
Funding: Project (2020JJ4032) supported by the Hunan Provincial Natural Science Foundation of China.
Abstract: Maintaining temporal consistency of real-time data is important for cyber-physical systems. Most of the previous studies focus on uniprocessor systems. In this paper, the problem of temporal consistency maintenance on multiprocessor platforms with instance skipping was formulated based on the (m,k)-constrained model. A partitioned scheduling method, SC-AD, was proposed to solve the problem. SC-AD uses a derived sufficient schedulability condition to calculate the initial value of m for each sensor transaction. It then partitions the transactions among the processors in a balanced way. To further reduce the average relative invalid time of real-time data, SC-AD judiciously increases the values of m for transactions assigned to each processor. Experiment results show that SC-AD outperforms the baseline methods in terms of the average relative invalid time and the average valid ratio under different system workloads.
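As a rough illustration of what an (m,k) constraint on instance skipping means, the sketch below checks whether an execution pattern keeps at least m executed instances in every window of k consecutive instances. This follows the common (m,k)-firm reading and is an assumption; the abstract does not spell out the paper's exact definition. The function `satisfies_mk` and its parameters are hypothetical and are not part of SC-AD.

```python
def satisfies_mk(executed, m, k):
    """Return True if every window of k consecutive instances has >= m executed.

    executed: sequence of booleans, True if the instance ran, False if skipped.
    Assumption: the usual (m,k)-firm interpretation; patterns shorter than k
    contain no full window and are treated as satisfying the constraint.
    """
    if not 0 <= m <= k or k < 1:
        raise ValueError("require 0 <= m <= k and k >= 1")
    if len(executed) < k:
        return True
    # Sliding-window count of executed instances.
    count = sum(1 for e in executed[:k] if e)
    if count < m:
        return False
    for i in range(k, len(executed)):
        count += (1 if executed[i] else 0) - (1 if executed[i - k] else 0)
        if count < m:
            return False
    return True


if __name__ == "__main__":
    pattern = [True, True, False, True, True, False]
    print(satisfies_mk(pattern, 2, 3))
```

Raising m for a transaction means fewer skipped instances, so the data is refreshed more often at the cost of a higher processor load.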
Funding: Project supported by the Applied Materials Foundation Project of Science and Technology Commission of Shanghai Municipality (Grant No. 08700741000), the System Design on Chip Project of Science and Technology Commission of Shanghai Municipality (Grant No. 08706201000), the Leading Academic Discipline Project of Shanghai Municipal Education Committee (Grant No. J50104), and the Innovation Foundation Project of Shanghai University
Abstract: As IP-core reuse has become widespread, many intellectual property (IP) cores have been embedded in different layers of a system-on-chip (SOC). Although this approach shortens development cycles and reduces overhead, it poses a challenge for SOC testing. This paper proposes a scheduling method based on a virtual flattened architecture for hierarchical SOCs, which converts the hierarchical architecture into a virtual flattened one. This method has advantages over the traditional one, which tests the parent cores and child cores separately. Finally, the method is verified on the ITC'02 benchmark, and the results show that it effectively reduces test time and overhead.
Abstract: Ultrasonic testing systems have been extensively used in medical imaging and non-destructive testing applications. Generally, these systems target a particular application or material. To make these systems portable and more adaptable to test environments, this study presents a reconfigurable ultrasonic testing system (RUTS), which possesses dynamic reconfiguration capabilities. RUTS consists of a fully programmable Analog Front-End (AFE), which facilitates beamforming and signal conditioning for a variety of applications. The RUTS AFE supports up to 8 transducers for phased-array implementation. A Xilinx Zynq System-on-Chip (SoC) based Zedboard provides the back-end processing of RUTS. The powerful ARM embedded processor available within the Zynq SoC manages the ultrasonic data acquisition/processing and overall system control, which makes RUTS a unique platform for ultrasonic researchers to experiment with and evaluate a wide range of real-time ultrasonic signal processing applications. This Linux-based system is used to implement ultrasonic data compression, providing a versatile environment for further development of ultrasonic imaging and testing systems. Furthermore, this study demonstrates the capabilities of RUTS by performing ultrasonic data acquisition and data compression in real time. Thus, this reconfigurable system enables ultrasonic designers and researchers to efficiently prototype different experiments and to incorporate and analyze high-performance ultrasonic signal and image processing algorithms.