Funding: Supported by the Taif University Researchers Supporting Program (Project Number: TURSP-2020/195), Taif University, Saudi Arabia, and by the Princess Nourah bint Abdulrahman University Researchers Supporting Project (No. PNURSP2022R203), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: The development of multi-core systems (MCS) has considerably improved existing technologies in the field of computer architecture. An MCS comprises several processors that are heterogeneous in resource capacities, working environments, topologies, and so on. Existing multi-core technology opens additional research opportunities for energy minimization through effective task scheduling. At the same time, the task scheduling process is yet to be explored in multi-core systems. This paper presents a new hybrid genetic algorithm (GA) with a krill herd (KH) based energy-efficient scheduling technique for multi-core systems (GAKH-SMCS). The goal of the GAKH-SMCS technique is to schedule tasks so as to achieve faster completion time and minimum energy dissipation. The GAKH-SMCS model uses a multi-objective fitness function with four parameters (makespan, processor utilization, speedup, and energy consumption) to schedule tasks proficiently. The performance of the GAKH-SMCS model has been validated on two datasets, namely a random dataset and a benchmark dataset. The experimental outcomes confirmed the effectiveness of the GAKH-SMCS model in terms of makespan, processor utilization, speedup, and energy consumption. The overall simulation results showed that the presented GAKH-SMCS model achieves energy efficiency through optimal task scheduling in MCS.
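The abstract names the four parameters of the fitness function but not how they are combined. The following is a minimal sketch of one possible form, assuming independent tasks, a constant active/idle power model, and an equal-weight sum; the names `evaluate`, `fitness`, `active_power`, `idle_power`, and `weights` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ScheduleMetrics:
    makespan: float
    utilization: float
    speedup: float
    energy: float


def evaluate(assignment, durations, num_cores, active_power=1.0, idle_power=0.1):
    """Compute the four metrics for a task-to-core assignment.

    assignment[i] is the core index of task i; durations[i] is its run time.
    Tasks are assumed independent (no precedence constraints), and the
    constant active/idle power model is an illustrative assumption.
    """
    busy = [0.0] * num_cores
    for task, core in enumerate(assignment):
        busy[core] += durations[task]
    makespan = max(busy)                      # finish time of the busiest core
    utilization = sum(busy) / (num_cores * makespan)
    speedup = sum(durations) / makespan       # serial time over parallel time
    energy = sum(active_power * b + idle_power * (makespan - b) for b in busy)
    return ScheduleMetrics(makespan, utilization, speedup, energy)


def fitness(m, weights=(0.25, 0.25, 0.25, 0.25)):
    """Lower is better: penalize makespan and energy, reward utilization and speedup."""
    w_mk, w_ut, w_sp, w_en = weights
    return w_mk * m.makespan - w_ut * m.utilization - w_sp * m.speedup + w_en * m.energy


durations = [4.0, 2.0, 2.0]
balanced = evaluate([0, 1, 1], durations, num_cores=2)
unbalanced = evaluate([0, 0, 1], durations, num_cores=2)
print(fitness(balanced) < fitness(unbalanced))
```

A GA or KH search would call `fitness` on each candidate assignment; in practice the four terms would need normalization before weighting, since they have different units.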
Funding: Supported by the China Postdoctoral Science Foundation (No. 2014M552115), the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. CUGL140833), and the National Key Technology Support Program of China (No. 2011BAH06B04).
Abstract: To improve the concurrent access performance of a web-based spatial computing system in a cluster, a parallel scheduling strategy based on the multi-core environment is proposed, which includes two levels of parallel processing mechanisms. The first evenly allocates tasks to each server node in the cluster, and the second implements load balancing inside a server node. Based on this strategy, a new web-based spatial computing model is designed, focusing on a task response ratio calculation method, a request queue buffer mechanism, and a thread scheduling strategy. Experimental results show that the new model can fully use the multi-core computing advantage of each server node in a concurrent access environment and improve the average hits per second, average I/O hits, CPU utilization, and throughput. A speed-up ratio analysis of the traditional model and the new one shows that the new model performs better. The performance of the multi-core server nodes in the cluster is optimized, and the resource utilization and parallel processing capabilities are enhanced. The more CPU cores a node has, the higher the parallel processing capability obtained.
Funding: Project (2012B091100444) supported by the Production, Education and Research Cooperative Program of Guangdong Province and Ministry of Education, China; Project (2013ZM0091) supported by the Fundamental Research Funds for the Central Universities of China.
Abstract: To cope with the task scheduling problem under multi-task and transportation considerations in large-scale service-oriented manufacturing systems (SOMS), a service allocation optimization mathematical model was established, and a hybrid discrete particle swarm optimization-genetic algorithm (HDPSOGA) was proposed. In SOMS, each resource involved in the whole life cycle of a product, whether it is provided by a piece of software or a hardware device, is encapsulated into a service. The transportation required during the production of a task should therefore be taken into account, because the selected hard-services may be provided by various providers in different areas. In the service allocation optimization mathematical model, multi-task and transportation were considered simultaneously. In the proposed HDPSOGA algorithm, an integer coding method was applied to establish the mapping between the particle location matrix and the service allocation scheme. The position updating process was performed according to the cognition part, the social part, and the previous velocity and position, while introducing the crossover and mutation ideas of the genetic algorithm to fit the discrete space. Finally, simulation experiments were carried out to compare the algorithm with two previous algorithms. The results indicate the effectiveness and efficiency of the proposed hybrid algorithm.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 61762030 and 61971148), the Guangxi Natural Science Foundation, China (Nos. 2019GXNSFFA245007, 2018GXNSFDA281013 and 2016GXNSFGA380002), and the Key Science and Technology Project of Guangxi, China (Nos. AA18242021, ZY19183005, 2017AB13014, 2018JJA70209, AA19110044 and AA19110046).
Abstract: Cooperative multi-task systems of Unmanned Aerial Vehicles (UAVs) have become a research focus in recent years. However, existing UAV network frameworks are not flexible and efficient enough to deal with complex multi-task scheduling, because they are not able to perceive the different features. In this paper, a novel cooperative UAV network framework for multi-task scheduling is proposed. It is a three-layer network comprising a core layer, an aggregation layer, and an execution layer, which enhances the efficiency of multi-task distribution, aggregation, and transmission. Furthermore, an AggreGate Flow (AGFlow) based scheduler is designed to maximize the task completion rate. Its key idea is to aggregate the flows belonging to one task during multi-task transmission in the UAV network and to allocate priority by calculating the urgency level of each AGFlow. Simulation results demonstrate that, compared with a state-of-the-art scheduler, the AGFlow-based scheduler raises the average task completion rate by 0.278.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61225008, 61373074, and 61373090), the National Basic Research Program (973) of China (No. 2014CB349304), the Specialized Research Fund for the Doctoral Program of Higher Education, the Ministry of Education of China (No. 20120002110033), and the Tsinghua University Initiative Scientific Research Program.
Abstract: Multi-core homogeneous processors have been widely used to deal with computation-intensive embedded applications. However, with the continuous downscaling of CMOS technology, within-die variations in the manufacturing process lead to a significant spread in the operating speeds of cores within homogeneous multi-core processors. Task scheduling approaches that do not consider the heterogeneity caused by within-die variations can lead to overly pessimistic results in terms of performance. To realize optimal performance according to the actual maximum clock frequencies at which cores can run, we present a heterogeneity-aware schedule refining (HASR) scheme that fully exploits the heterogeneities of homogeneous multi-core processors in embedded domains. We analyze and show how the actual maximum frequencies of cores are used to guide the scheduling. In the scheme, representative chip operating points are selected and the corresponding optimal schedules are generated as candidate schedules. During the booting of each chip, according to the actual maximum clock frequencies of its cores, one of the candidate schedules is bound to the chip to maximize performance. A set of applications is designed to evaluate the proposed scheme. Experimental results show that the proposed scheme improves performance by an average of 22.2% compared with the baseline schedule based on worst-case timing analysis. Compared with the conventional task scheduling approach based on the actual maximum clock frequencies, the proposed scheme also improves performance by up to 12%.
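The boot-time binding step can be illustrated with a short sketch. The abstract does not state the selection rule, so the one below is an assumption: a candidate is treated as usable when every core's measured maximum frequency is at least the frequency of the operating point the candidate was generated for, and the usable candidate with the smallest makespan is chosen. The function name `select_schedule` and the data layout are hypothetical.

```python
def select_schedule(candidates, measured_freqs):
    """Pick the fastest candidate schedule that the measured chip can run.

    candidates: list of (operating_point, makespan) pairs, where
    operating_point holds the per-core frequency (MHz) the schedule was
    optimized for. A candidate is feasible if every core is at least as
    fast as its operating point assumes (an assumed rule, not the paper's).
    Returns the index of the chosen candidate.
    """
    feasible = [
        (makespan, idx)
        for idx, (point, makespan) in enumerate(candidates)
        if all(f >= p for f, p in zip(measured_freqs, point))
    ]
    if not feasible:
        raise ValueError("no candidate schedule fits this chip")
    return min(feasible)[1]


# Hypothetical operating points for a 2-core chip, from worst case upward.
candidates = [
    ((800, 800), 10.0),
    ((1000, 900), 8.0),
    ((1200, 1200), 6.5),
]
print(select_schedule(candidates, (1050, 950)))  # → 1
```

Including a worst-case operating point among the candidates guarantees that every chip that meets its specification has at least one feasible schedule.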
Funding: Project supported by the National Natural Science Foundation of China (Nos. 62276011 and 61202076) and the Natural Science Foundation of Beijing, China (No. 4192007).
Abstract: When multiple central processing unit (CPU) cores and integrated graphics processing units (GPUs) share off-chip main memory, CPU and GPU applications compete for the critical memory resource. This causes serious resource competition and has a negative impact on the overall performance of the system. We describe the competition for shared-memory resources in a CPU-GPU heterogeneous multi-core architecture and propose a shared-memory request scheduling strategy based on perceptual and predictive batch processing. By sensing the CPU and GPU memory request conditions in the request buffer, the proposed scheduling strategy estimates the GPU latency tolerance and reduces mutual interference between CPU and GPU by processing CPU or GPU memory requests in batches. According to the simulation results, the scheduling strategy improves CPU performance by 8.53% and reduces mutual interference by 10.38% with low hardware complexity.
Abstract: Multi-core processors are widely used as the running platform for safety-critical real-time systems such as spacecraft, in which various types of real-time tasks are dynamically added at runtime. To improve the utilization of multi-core processors and ensure the real-time performance of the system, a reasonable real-time task allocation method is necessary, but existing methods either target only single-core processors or perform too poorly to be applicable. Aiming at the task allocation problem when mixed real-time tasks are dynamically added, we propose a heuristic mixed real-time task allocation algorithm based on virtual utilization, VU-WF (Virtual Utilization Worst Fit), for multi-core processors. First, a 4-tuple task model is established to describe the fixed-point task and the sporadic task in a unified manner. Then, a VDS (Virtual Deferral Server) for serving execution requests of fixed-point tasks is constructed, and a schedulability test of the mixed task set is derived. Finally, combined with an analysis of the VDS's capacity, VU-WF is proposed, which selects cores in ascending order of virtual utilization for the schedulability test. Experiments show that the overall performance of VU-WF is better than that of available algorithms: it not only has a good schedulability ratio and load balancing but also has the lowest runtime overhead. On a 4-core processor, compared with available algorithms of the same schedulability ratio, load balancing is improved by 73.9% and runtime overhead is reduced by 38.3%. In addition, we develop an open-source visual multi-core mixed task scheduling simulator, RT-MCSS, to facilitate the design and verification of multi-core scheduling for users. Owing to its high performance, VU-WF can be widely used in resource-constrained and safety-critical real-time systems, such as spacecraft, self-driving cars, and industrial robots.
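The core-selection step (try cores in ascending order of utilization, admit the task on the first core that passes a schedulability test) can be sketched as follows. This is a generic worst-fit illustration, not the VU-WF algorithm itself: plain utilization stands in for the paper's virtual utilization, and the default test is the simple bound U ≤ 1, since the abstract gives neither the VDS analysis nor the derived test. The names `worst_fit_allocate` and `is_schedulable` are hypothetical.

```python
def worst_fit_allocate(task_u, core_utils, is_schedulable=None):
    """Place a new task on the least-loaded core that passes the test.

    task_u: utilization of the arriving task (execution time / period).
    core_utils: mutable list of current per-core utilizations.
    is_schedulable(core_u, task_u): admission test; the default U <= 1
    bound is a stand-in for the paper's test, which is not given here.
    Returns the chosen core index, or None if the task is rejected.
    """
    if is_schedulable is None:
        is_schedulable = lambda core_u, new_u: core_u + new_u <= 1.0
    # Worst fit: visit cores from lowest to highest current utilization.
    order = sorted(range(len(core_utils)), key=lambda c: core_utils[c])
    for core in order:
        if is_schedulable(core_utils[core], task_u):
            core_utils[core] += task_u
            return core
    return None


cores = [0.5, 0.2, 0.7]
print(worst_fit_allocate(0.4, cores))  # → 1 (least-loaded core)
print(worst_fit_allocate(0.3, cores))  # → 0
print(worst_fit_allocate(0.5, cores))  # → None (no core can admit it)
```

Choosing the least-loaded core first tends to even out load across cores, which matches the load-balancing goal stated in the abstract; with a richer admission test passed as `is_schedulable`, later cores in the order can still succeed when the first one fails.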
Funding: Supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea, funded by the Ministry of Education, Science and Technology (No. 2011-0020521), and by the Korea Communications Commission under the Communications Policy Research Center Support Program supervised by the Korea Communications Agency (No. KCA-2011-1194100004-110010100).
Abstract: Contemporary operating systems for single-ISA (instruction set architecture) multi-core systems attempt to distribute tasks equally among all the CPUs. This approach works relatively well when there is no difference in CPU capability. However, there are cases in which CPU capabilities differ from one another. For instance, static capability asymmetry results from the advent of new asymmetric hardware, and dynamic capability asymmetry comes from outside noise in the operating system (OS) caused by networking or I/O handling. These asymmetries can make it hard for the OS scheduler to distribute tasks evenly, resulting in less efficient load balancing. In this paper, we propose a user-level load balancer for parallel applications, called the 'capability balancer', which recognizes differences in CPU capability and makes subtasks share the entire CPU capability fairly. The balancer can coexist with the existing kernel-level load balancer without degrading the behavior of the kernel balancer. The capability balancer can fairly distribute CPU capability to tasks with very little overhead. For real workloads such as the NAS Parallel Benchmark (NPB), we achieved speedups of up to 9.8% and 8.5% under dynamic and static asymmetries, respectively. We also observed speedups of 13.3% for dynamic asymmetry and 24.1% for static asymmetry in a competitive environment. The impacts of our task selection policies, FIFO (first in, first out) and cache, were compared. Compared with the FIFO policy, which is used by default, the cache policy led to a speedup of 5.3% in overall execution time and a decrease of 4.7% in the overall cache miss count.