Funding: National Natural Science Foundation of China (61771066, 61629101).
Abstract: We consider a spectrum efficiency (SE) maximization problem for cooperative power beacon-enabled wireless powered communication networks (CPB-WPCNs), where each transmitter harvests energy from multi-antenna power beacons (PBs) and transmits data to its corresponding receiver. For data transmission, both orthogonal transmission, i.e., the time splitting (TS) mode, and non-orthogonal transmission, i.e., the interference channel (IC) mode, are considered. To improve the system SE, the energy beamformers of the PBs and the transmit power and transmit time duration of the transmitters are jointly optimized. For the TS mode, the original non-convex problem is transformed into a convex optimization problem by means of variable substitution and semidefinite relaxation (SDR). The SDR is proved to admit a rank-one solution, and a Lagrange-dual based fast algorithm is then proposed to obtain the optimal solution with much lower complexity. For the IC mode, to overcome the strong non-convexity of the problem, a branch-reduce-and-bound (BRB) monotonic optimization algorithm is designed as a benchmark. Furthermore, a low-complexity distributed successive convex approximation (SCA) algorithm is presented. Finally, simulation results validate the performance of the proposed algorithms: in the TS mode they reach the optimum in only 1% to 2% of the computation time required by the CVX solver, and in the IC mode they achieve 98% of the optimal performance.
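To make the time-splitting trade-off concrete, the sketch below evaluates the sum rate of a simplified harvest-then-transmit model and grid-searches the harvesting fraction. It is an illustration under stated assumptions, not the paper's formulation: it uses a single-antenna power beacon (no beamforming), equal data slots, and hypothetical names and values (`sum_rate`, `best_split`, `eta`, `p_beacon`, `noise`, and the channel gains).

```python
import math

def sum_rate(tau0, taus, h, g, p_beacon=1.0, eta=0.6, noise=1e-3):
    """Sum rate (bits/s/Hz, normalized frame) of an assumed TS model.

    tau0: fraction of the frame spent harvesting energy.
    taus: fraction of the frame each transmitter uses for data.
    h, g: assumed beacon-to-transmitter and transmitter-to-receiver gains.
    """
    total = 0.0
    for tau_k, h_k, g_k in zip(taus, h, g):
        energy = eta * tau0 * p_beacon * h_k  # harvested energy
        if tau_k > 0:
            # All harvested energy is spent during the data slot.
            snr = energy * g_k / (tau_k * noise)
            total += tau_k * math.log2(1.0 + snr)
    return total

def best_split(h, g, steps=100):
    """Grid-search tau0, sharing the remaining time equally (an assumption)."""
    best = (0.0, 0.0)
    for i in range(1, steps):
        tau0 = i / steps
        share = (1.0 - tau0) / len(h)
        rate = sum_rate(tau0, [share] * len(h), h, g)
        if rate > best[1]:
            best = (tau0, rate)
    return best

# Made-up channel gains for two transmitter-receiver pairs.
tau0, rate = best_split(h=[0.5, 0.2], g=[0.8, 0.3])
print(f"harvest fraction {tau0:.2f}, sum rate {rate:.3f}")
```

A longer harvesting phase gives each transmitter more energy but leaves less time for data, which is why an interior optimum exists and why the time durations are optimized jointly with the other variables.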
Abstract: Human saccade is a dynamic process of information pursuit. Many existing methods model human saccadic scan-paths using either global context cues or local context cues. In contrast, this paper introduces a model for gaze movement control that uses both global and local cues. To test the performance of the model, human eye movement data were collected with an SMI iVIEW X Hi-Speed eye tracker at a sampling rate of 1250 Hz. The experiment used a two-by-four mixed design with the target location and the four initial positions as factors. We compare the saccadic scan-paths generated by the proposed model against human eye movement data on a face benchmark dataset. Experimental results demonstrate that the scan-paths simulated by the proposed model are similar to human saccades in terms of fixation order, Hausdorff distance, and prediction accuracy for both static fixation locations and dynamic scan-paths.
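The Hausdorff distance mentioned above measures how far two point sets are from each other. The sketch below computes the standard symmetric Hausdorff distance between two scan-paths treated as sets of fixation points. The function names and the fixation coordinates are illustrative assumptions, and the paper may use a variant of this metric.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Made-up fixation coordinates in pixels, for illustration only.
human_path = [(100.0, 120.0), (220.0, 140.0), (310.0, 260.0)]
model_path = [(104.0, 123.0), (215.0, 150.0), (300.0, 250.0)]

print(hausdorff(human_path, model_path))
```

Because this metric ignores the order of the points, it complements rather than replaces an order-sensitive measure such as the fixation order.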
Funding: Supported by the National Natural Science Foundation of China (Nos. 62271062 and 62071063).
Abstract: Federated learning addresses issues such as data privacy by having participating devices collaborate to train a global model. However, in complex network environments, factors such as network topology and the computing power of devices can affect its training and communication processes. Computing and network convergence (CNC) of sixth-generation (6G) networks, a new network architecture and paradigm in which computing is measurable, perceptible, distributable, dispatchable, and manageable, can effectively support federated learning training and improve its communication efficiency. CNC achieves this by guiding the training of participating devices based on business requirements, resource load, network conditions, and the computing power of devices. In this paper, to improve the communication efficiency of federated learning in complex networks, we study communication efficiency optimization methods for federated learning over CNC of 6G networks that make decisions about the training process according to the network conditions and computing power of the participating devices. The simulations address the two architectures that exist for devices in federated learning and arrange devices to participate in training based on their computing power, while optimizing communication efficiency during the transfer of model parameters. The results show that the proposed methods cope well with complex network situations, effectively balance the distribution of local training delay across participating devices, improve communication efficiency during the transfer of model parameters, and improve resource utilization in the network.
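As a rough illustration of arranging devices by computing power before aggregating their updates, the sketch below filters devices by an estimated local training delay and then applies the standard sample-weighted federated average. The selection rule, the field names (`workload_flops`, `flops_per_s`), and all values are assumptions made for this example and are not taken from the paper.

```python
def select_devices(devices, deadline_s):
    """Keep devices whose estimated local training delay meets the deadline."""
    selected = []
    for d in devices:
        delay = d["workload_flops"] / d["flops_per_s"]  # assumed delay model
        if delay <= deadline_s:
            selected.append(d)
    return selected

def federated_average(updates):
    """Sample-weighted mean of model weights.

    updates: list of (num_samples, weights) pairs, weights as equal-length lists.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * w[i] for n, w in updates) / total for i in range(dim)]

# Hypothetical devices: the slow one is excluded from this round.
devices = [
    {"name": "phone", "workload_flops": 2e9, "flops_per_s": 1e9,
     "samples": 100, "weights": [0.0, 2.0]},
    {"name": "edge", "workload_flops": 2e9, "flops_per_s": 4e9,
     "samples": 300, "weights": [4.0, 2.0]},
    {"name": "sensor", "workload_flops": 2e9, "flops_per_s": 1e8,
     "samples": 50, "weights": [9.0, 9.0]},
]

chosen = select_devices(devices, deadline_s=5.0)
global_weights = federated_average([(d["samples"], d["weights"]) for d in chosen])
print([d["name"] for d in chosen], global_weights)
```

Dropping stragglers in this way narrows the spread of local training delays, at the cost of ignoring the data held by the excluded devices in that round.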
Funding: This work was supported by the National High Technology Research and Development 863 Program of China under Grant No. 2012AA010902, the National Natural Science Foundation of China (NSFC) under Grant No. 61432018, and the Innovation Research Group of NSFC under Grant No. 61221062.
Abstract: GPUs have become a ubiquitous choice as coprocessors because of their excellent concurrent processing ability. In the GPU architecture, shared memory plays a very important role in system performance, as it can greatly improve bandwidth utilization and accelerate memory operations. However, even for affine GPU applications with regular access patterns, optimizing for shared memory is not easy. It often requires programmer expertise and nontrivial parameter selection, and improper shared memory usage might even underutilize GPU resources. Even with state-of-the-art high-level programming models (e.g., OpenACC and OpenHMPP), it is still hard to exploit shared memory, since they lack inherent support for describing shared memory optimizations and selecting suitable parameters, let alone maintaining high resource utilization. Targeting higher productivity for affine applications, we propose a data-centric approach to shared memory optimization on GPUs. We design a pragma extension to OpenACC to convey programmers' data management hints to the compiler. Meanwhile, we devise a compiler framework that uses the polyhedral model to automatically select optimal parameters for shared arrays. We further propose optimization techniques to expose higher memory-level and instruction-level parallelism. The experimental results show that our shared-memory-centric approaches improve the performance of five typical GPU applications across four widely used platforms by 3.7x on average, without burdening programmers with many pragmas.