In this article, we review recent advances in the technology of writing fiber Bragg gratings (FBGs) in selected cores of multicore fibers (MCFs) using femtosecond laser pulses. The ability to write a key element such as the FBG opens up wide opportunities for creating next-generation fiber lasers and sensors based on MCFs. The advantages of the technology are illustrated with the examples of 3D shape sensors, acoustic emission sensors with spatially multiplexed channels, and multicore fiber Raman lasers.
Specific and sustained release of nutrients from capsules to the gastrointestinal tract has attracted much attention in the field of food and drug delivery. In this work, we report a monoaxial dispersion electrospraying-ionotropic gelation technique to prepare multicore millimeter-sized spherical capsules for specific and sustained release of fish oil. The diameters of the spherical capsules decreased from 2.05 mm to 0.35 mm as the applied voltage increased. The capsules consisted of uniform (at applied voltages of ≤10 kV) or nonuniform (at applied voltages of >10 kV) multicores. The obtained capsules had reasonable loading ratios (9.7%-6.3%) owing to the multicore structure. In addition, the capsules showed specific and sustained release of fish oil in the small intestinal phase of in vitro gastrointestinal tract and small intestinal tract models. The simple monoaxial dispersion electrospraying-ionotropic gelation technique does not involve complicated preparation formulations or polymer modification, which gives it potential applications in fish oil preparations and in the encapsulation of functional active substances in the food and drug industries.
Great strides have been made over the past decade to establish femtosecond lasers in advanced manufacturing systems for enabling new forms of non-contact processing of transparent materials. Research advances have shown that a myriad of additive and subtractive techniques are now possible for flexible 2D and 3D structuring of such materials with micro- and nano-scale precision. In this paper, these techniques have been refined and scaled up to demonstrate the potential for 3D writing of high-density optical packaging components, specifically addressing the major bottleneck of efficiently connecting optical fibres to silicon photonic (SiP) processors for use in telecom and data centres. An 84-channel fused silica interposer was introduced for high-density edge coupling of multicore fibres (MCFs) to a SiP chip. Femtosecond laser irradiation followed by chemical etching was further harnessed to open alignment sockets, permitting rapid assembly with precise locking of MCF positions for efficient coupling to laser-written optical waveguides in the interposer. A 3D waveguide fanout design provided an attractive balance of low losses, mode matching, high channel density, compact footprint, and low crosstalk. The 3D additive and subtractive processes thus demonstrated the potential for higher-scale integration and rapid photonic assembly and packaging of micro-optic components for telecom interconnects, with possible broader applications in integrated biophotonic chips or micro-displays.
Recent multicore systems use Dynamic Voltage/Frequency Scaling (DV/FS) technology to allow each core to operate at a voltage and/or frequency different from those of the other cores, in order to save power and enhance performance. In this paper, an effective and reliable hybrid model to reduce the energy and makespan in multicore systems is proposed. The proposed hybrid model enhances and integrates the greedy approach with dynamic programming to achieve optimal Voltage/Frequency (Vmin/F) levels. Then, the allocation process is applied based on the available workloads. The hybrid model consists of three stages: the first stage obtains the optimum safe voltage, the second stage sets the level of energy efficiency, and the third is the allocation stage. Experimental results on various benchmarks show that the proposed model can generate optimal solutions that save energy while minimizing the makespan penalty. Comparisons with other competitive algorithms show that the proposed model provides, on average, a 48% improvement in energy saving and an 18% reduction in computation time while ensuring a high degree of system reliability.
We propose a method for shape sensing using a few multicore fiber Bragg grating (FBG) sensors in a single-port continuum surgical robot (CSR). The traditional method of using a forward kinematic model to calculate the shape of a single-port CSR is limited by the accuracy of the model. If FBG sensors are used for shape sensing, their accuracy is affected by their number, especially in long and flexible CSRs. A fusion method based on an extended Kalman filter (EKF) is proposed to solve this problem. Shape reconstruction is performed using both the CSR forward kinematic model and the FBG sensors, and the two results are fused using an EKF. The CSR reconstruction method adopts the incremental form of the forward kinematic model, while the FBG sensor method adopts the discrete arc-segment assumption. The fusion method can eliminate the inaccuracy of the kinematic model and obtain more accurate shape reconstruction results using only a small number of FBG sensors. We validated our algorithm through experiments on multiple bending shapes under different load conditions. The results show that our method significantly outperforms the traditional methods in terms of robustness and effectiveness.
Multicore fiber (MCF), which contains more than one core in a single fiber cladding, has attracted ever-increasing attention for application in optical sensing systems owing to its unique capability of independent light transmission in multiple spatial channels. Unlike in standard single-mode fiber (SMF), fiber bending gives rise to tangential strain in off-center cores, and this unique feature has been employed for directional bending and shape sensing, where strain measurement is achieved by using fiber Bragg gratings (FBGs), optical frequency-domain reflectometry (OFDR), or Brillouin distributed sensing. In addition, the parallel spatial cores enable a space-division multiplexed (SDM) system configuration that allows multiple distributed sensing techniques to be multiplexed. As a result, multi-parameter sensing or performance-enhanced sensing can be achieved by using MCF. In this paper, we review the research progress in MCF-based distributed fiber sensors. Brief introductions to MCF and the multiplexing/de-multiplexing methods are presented. The bending sensitivity of off-center cores is analyzed. Curvature and shape sensing, as well as various SDM distributed sensing schemes using MCF, are summarized, and the working principles of diverse MCF sensors are discussed. Finally, we present the challenges and prospects of MCF for distributed sensing applications.
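As a rough illustration of why off-center cores are sensitive to bending, the sketch below recovers curvature and bend direction from the strains measured in the outer cores. It assumes the commonly used small-bend model strain_i = -kappa * r * cos(theta_i - theta_b) + common, with cores evenly spaced on a circle of radius r; the model, the function name `bend_from_strains`, and all numerical values are illustrative assumptions and are not taken from the reviewed paper.

```python
import math

def bend_from_strains(strains, core_radius, core_angles):
    """Recover curvature (1/m), bend direction (rad), and common strain.

    Assumed model (not from the abstract):
        strain_i = -kappa * r * cos(theta_i - theta_b) + common
    The closed-form inversion below is valid only for three or more
    cores evenly spaced on a circle of radius r.
    """
    n = len(strains)
    # Project the strains onto cos/sin of the core angles.
    a = (2.0 / n) * sum(e * math.cos(t) for e, t in zip(strains, core_angles))
    b = (2.0 / n) * sum(e * math.sin(t) for e, t in zip(strains, core_angles))
    common = sum(strains) / n           # axial strain / temperature term
    kappa = math.hypot(a, b) / core_radius
    theta_b = math.atan2(-b, -a)        # direction toward the bend center
    return kappa, theta_b, common

# Hypothetical three-core fiber: cores at 0, 120, and 240 degrees.
angles = [0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0]
radius = 35e-6                          # assumed core offset in meters
true_kappa, true_theta, true_common = 2.0, 0.5, 1e-5
measured = [-true_kappa * radius * math.cos(t - true_theta) + true_common
            for t in angles]
kappa, theta_b, common = bend_from_strains(measured, radius, angles)
```

A center core, if present, sees only the common term, which is why it is often used as a reference for axial strain and temperature.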
The general m-machine permutation flowshop problem with the total flow-time objective is known to be NP-hard for m ≥ 2. The only practical method for finding optimal solutions has been branch-and-bound algorithms. In this paper, we present an improved sequential algorithm which is based on a strict alternation of Generation and Exploration execution modes as well as Depth-First/Best-First hybrid strategies. The experimental results show that the proposed scheme exhibits improved performance compared with the algorithm in [1]. More importantly, our method can be easily extended and implemented with lightweight threads to speed up execution. Good speedups can be obtained on shared-memory multicore systems.
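To make the objective concrete, the following minimal sketch evaluates the total flow time of one job permutation and finds the best permutation by exhaustive search. The exhaustive search is a stand-in for illustration only and is not the branch-and-bound scheme of the paper; the function names and the processing-time data are hypothetical.

```python
from itertools import permutations

def total_flow_time(proc, order):
    """Sum of job completion times on the last machine.

    proc[j][k] is the processing time of job j on machine k; every job
    visits the machines in the same order (permutation flowshop).
    """
    machines = len(proc[0])
    finish = [0] * machines              # time each machine becomes free
    total = 0
    for job in order:
        ready = 0                        # time the job leaves the previous machine
        for k in range(machines):
            start = max(finish[k], ready)
            ready = start + proc[job][k]
            finish[k] = ready
        total += finish[-1]              # completion time of this job
    return total

def best_order_brute_force(proc):
    """Exhaustive search over all permutations; feasible only for tiny instances."""
    jobs = range(len(proc))
    return min(permutations(jobs), key=lambda order: total_flow_time(proc, order))

# Hypothetical instance: 2 jobs, 2 machines.
proc = [[3, 2],
        [1, 4]]
best = best_order_brute_force(proc)
```

The number of permutations grows factorially with the number of jobs, which is why bounding and pruning, as in branch-and-bound, are needed for anything beyond toy instances.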
Graphics processing units (GPUs) have been widely recognized as cost-efficient co-processors with acceptable size, weight, and power consumption. However, adopting GPUs in real-time systems is still challenging, owing to the lack of a framework for real-time analysis. In order to guarantee real-time requirements while maintaining system utilization in modern heterogeneous systems, such as multicore multi-GPU systems, a novel suspension-based k-exclusion real-time locking protocol and the associated suspension-aware schedulability analysis are proposed. The proposed protocol provides a synchronization framework that enables multiple GPUs to be efficiently integrated in multicore real-time systems. Comparative evaluations show that the proposed methods improve upon the existing work in terms of schedulability.
This study produced a statistical analysis of multicore eddy structures based on 23 years of altimetry data for the global oceans. Multicore structures were identified using a threshold-free closed-contour algorithm applied to sea surface height, which was improved for this study in certain technical details. In addition, a more accurate definition of the eddy boundary was used to estimate eddy scale. Generally, multicore structures, which have two or more closed eddies of the same polarity within their boundaries, represent an important transitional stage in the eddy life cycle during which the component eddies might split or merge. In comparison with global eddies, the lifetimes and propagation distances of multicore eddies were found to be much smaller because of their inherent structural instability. However, at the same latitude, the spatial scale of multicore eddies was found to be larger than that of single-core eddies, i.e., the eddy area could be at least twice as large. Multicore eddies were found to exhibit some features similar to those of global eddies. For example, multicore eddies tend to occur in the Antarctic Circumpolar Current, some western boundary currents, and mid-latitude regions around 25°N/S; the majority (70%) of eddies propagate westward while only 30% propagate eastward; and large-amplitude eddies are restricted mainly to reasonably confined regions of highly unstable currents.
In this paper, a typical experiment is carried out based on a high-resolution air-sea coupled model, namely, the coupled ocean-atmosphere-wave-sediment transport (COAWST) model, on both heterogeneous many-core (SW) and homogeneous multicore (Intel) supercomputing platforms. We construct a hindcast of Typhoon Lekima on both the SW and Intel platforms, compare the simulation results between the two platforms, and compare the key elements of the atmospheric and ocean modules with reanalysis data. The comparative experiment in this typhoon case indicates that the domestic many-core computing platform and the general cluster yield almost no differences in the simulated typhoon path and intensity, and the differences in surface pressure (PSFC) in the WRF model and sea surface temperature (SST) in the short-range forecast are very small, whereas a major difference can be identified at high latitudes after the first 10 days. Further heat budget analysis verifies that the differences in SST after 10 days are mainly caused by shortwave radiation variations, as influenced by typhoons subsequently generated in the system. These typhoons, generated in the hindcast after the first 10 days, follow clearly different trajectories on the two platforms.
Heterogeneous multicore clusters are becoming more popular for high-performance computing owing to their great computing power and cost-to-performance effectiveness. Nevertheless, parallel efficiency degradation is still a problem in large-scale structural analysis based on heterogeneous multicore clusters. To solve it, a hybrid hierarchical parallel algorithm (HHPA) is proposed on the basis of the conventional domain decomposition algorithm (CDDA) and the parallel sparse solver. In this new algorithm, a three-layer parallelization of the computational procedure is introduced to separate the communication between nodes, between heterogeneous-core-groups (HCGs), and inside heterogeneous-core-groups by mapping computing tasks to the various hardware layers. This approach not only achieves load balancing at the different layers efficiently but also improves the communication rate significantly through hierarchical communication. Additionally, the proposed hybrid parallel approach reduces the interface equation size and further reduces the solution time, which compensates for the growth in communication overhead with increasing interface equation size when employing CDDA. Moreover, distributed sparse storage of a large amount of data is introduced to improve memory access. Results from solving benchmark instances on the Shenwei-Taihuzhiguang supercomputer show that the proposed method obtains higher speedup and parallel efficiency than CDDA and better extensibility of the parallel partition than the two-level parallel computing algorithm (TPCA).
3D reverse time migration in tilted transversely isotropic media (3D RTM-TTI) is the most precise model for complex seismic imaging. However, the vast computing time of 3D RTM-TTI prevents it from being widely used. This problem is addressed by providing parallel solutions for 3D RTM-TTI on multicore and many-core processors. After data parallelization and memory optimization, the hot-spot function of 3D RTM-TTI gains a 35.99× speedup on two Intel Xeon CPUs, an 89.75× speedup on one Intel Xeon Phi, and an 89.92× speedup on one NVIDIA K20 GPU compared with the serial CPU baseline. This study makes RTM-TTI practical in industry. Since the computation pattern in RTM is a stencil, the approaches also benefit a wide range of stencil-based applications.
This thesis presents the research and implementation of a traffic light and traffic sign recognition system based on a multicore FPGA. The system consists of four parts: dynamic image acquisition, gray-value preprocessing, edge and pattern detection, and pattern-matching judgment. The multicore system consists of three cores, which process the incoming camera images in parallel according to different colors and graphic elements. The image data read from the camera serve as data shared by the three cores.
HPC (high performance computing) based on clusters of multicores is one of the main research lines in parallel programming. It is important to study the impact of the shared memory and message passing programming paradigms, or a combination of both, on these architectures in order to exploit their power efficiently. The Smith-Waterman algorithm for the local alignment of DNA sequences, which establishes the degree of similarity between two sequences, is used as a case study. In this paper, the Smith-Waterman algorithm is parallelized by means of a pipeline scheme, owing to the data dependencies inherent to the problem, using the various communication/synchronization models mentioned above, and a comparative analysis is then carried out. Finally, experimental results are presented, as well as future research lines.
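The data dependencies mentioned above are visible in a plain sequential version of the algorithm. The sketch below is a minimal scoring-only Smith-Waterman with a linear gap penalty; the scoring values and the function name are assumptions chosen for illustration and are not the parameters or the parallel implementation used in the paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of sequences a and b.

    Linear gap penalty; the scoring parameters are illustrative defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]    # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap           # gap in b
            left = h[i][j - 1] + gap         # gap in a
            # Each cell depends on its upper, left, and upper-left neighbours.
            h[i][j] = max(0, diag, up, left)
            best = max(best, h[i][j])
    return best

identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
shared_core = smith_waterman_score("TTACGTTT", "GGACGGG")
```

Because cell (i, j) needs cells (i-1, j), (i, j-1), and (i-1, j-1), rows cannot be computed independently; a worker can start on a block only after its upper and left neighbours are finished, which is the dependency pattern a pipeline scheme has to respect.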
Sparse Matrix Vector Multiplication (SpMV) is one of the most basic problems in scientific and engineering computations. It is the basic operation in many areas, such as solving linear systems or eigenvalue problems. Nowadays, more than 90 percent of the parallel computers in the TOP500 list use multicore architectures, so it is of practical importance to design efficient methods for computing SpMV on multicore parallel computers. Algorithms based on the compressed sparse row (CSR) format usually suffer from the uneven number of nonzero elements in each row and can therefore hardly use the multicore structure efficiently. The Compressed Sparse Block (CSB) format is an effective storage format that allows SpMV to be computed efficiently on a multicore computer. This paper presents a parallel multicore CSB format and an SpMV algorithm based on it. We carried out numerical experiments on a parallel multicore computer. The results show that our parallel multicore CSB format and SpMV algorithm reach high speedup and are highly scalable for banded matrices.
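For reference, a minimal sequential SpMV over the CSR format can be sketched as follows. This is the baseline format the abstract contrasts with CSB, not the paper's CSB algorithm; the function name and the example matrix are hypothetical.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR format.

    values  : nonzero entries, stored row by row
    col_idx : column index of each nonzero entry
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

# Hypothetical 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]].
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

If the outer loop is split across cores by rows, the work per core is proportional to the number of nonzeros in its rows, so rows of very different lengths lead to load imbalance.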
The propagation of an optical vortex in a hexagonally arranged single-mode multicore fiber structure is investigated for possible generation of additional vortices and their spread dynamics. Fields are separated into a slowly varying paraxial envelope and a rapidly changing exponential component. Solutions are derived from the paraxial inhomogeneous Schrödinger equation in two dimensions along with the index of refraction of the proposed structure. Numerical analyses are based on the beam propagation method and transparent boundary conditions in matrix form with different parameters to represent the intensity and phase of all derived fields. Vortices are numerically identified by their points of zero intensity and their phase change or polarity. The optical interferogram with a plane wave reference is also employed to distinguish the dislocation points in the transverse directions of the propagating fields.
The RSS load balancing strategy used in the PF_RING capture method does not balance the load well across the cores of multi-core processor platforms. This paper presents a packet load balancing method based on FD and RSS. The basic idea of the method is to first match captured packets against 5-tuple filters, then match the packets that cannot be classified against flow-oriented filters, and finally apply RSS to the packets that still cannot be classified. Experiments were designed to compare the packet capture and load balancing performance of PF_RING using the combined FD+RSS load balancing strategy with that using RSS alone. The results show that the packet stream load balancing method based on FD+RSS improves packet capture performance and load balancing among multiple cores.
This paper describes the design for testability (DFT) challenges and techniques of the Godson-3 microprocessor, a scalable multicore processor that is based on the scalable mesh of crossbar (SMOC) on-chip network and targets high-end applications. Advanced techniques are adopted to make the DFT design scalable and to achieve low-power and low-cost test with limited IO resources. To achieve scalable and flexible test access, a highly elaborate test access mechanism (TAM) is implemented to support multiple test instructions and test modes. Taking advantage of the multiple identical cores embedded in the processor, scan partition and on-chip comparisons are employed to reduce test power and test time. A test compression technique is also used to decrease test time. To further reduce test power, clock control logic is designed with the ability to turn off the clocks of partitions that are not under test. In addition, scan collars of CACHEs are designed to perform functional test with low-speed ATE for speed-binning purposes, which has low complexity and shows good correlation results.
In-vivo microendoscopy in animal models has become a groundbreaking technique in neuroscience that is rapidly expanding our understanding of the brain. Emerging hair-thin endoscopes based on multimode fibres are now opening up the prospect of ultra-minimally invasive neuroimaging of deeply located brain structures. Complementing these advancements with methods of functional imaging and optogenetics, as well as extending their applicability to awake and motile animals, constitute the most pressing challenges for this technology. Here we demonstrate a novel fibre design capable of both high-resolution imaging in immobilised animals and bending-resilient optical addressing of neurons in motile animals. The optimised refractive index profile and the probe structure allowed a spatial resolution of 2 μm to be reached across a 230 μm field of view for the initial layout of the fibre. Simultaneously, the fibre exhibits negligible cross-talk between individual inner cores during fibre deformation. This work provides a technological solution for imaging-assisted spatially selective photo-activation and activity monitoring in awake and freely moving animal models.
Funding (review of femtosecond-laser FBG writing in multicore fibers): Supported by the Russian Ministry of Science and Higher Education (14.Y26.31.0017) and the Russian Foundation for Basic Research (18-52-7822); the work concerning MCF fiber Raman lasers was supported by the Russian Science Foundation (21-72-30024).
Funding (multicore capsules for fish oil release): Supported by research grants from the National Key R&D Program (2019YFD0902003).
Funding (femtosecond laser written MCF interposer for silicon photonics): Financial support from Huawei Technologies Co., Ltd, China (Project YB2016020025) is gratefully acknowledged.
Funding (shape sensing for the single-port continuum surgical robot): The National Natural Science Foundation of China (Nos. 61873257 and U20A20195), the Project of Natural Science Foundation of Liaoning Province (No. 2021-MS-033), and the Foundation of Millions of Talents Project of the Department of Human Resources and Social Security of Liaoning Province (No. 2021921037).
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61003032/F020207.
Abstract: Graphics processing units (GPUs) have been widely recognized as cost-efficient co-processors with acceptable size, weight, and power consumption. However, adopting GPUs in real-time systems is still challenging because of the lack of a framework for real-time analysis. In order to guarantee real-time requirements while maintaining system utilization in modern heterogeneous systems, such as multicore multi-GPU systems, a novel suspension-based k-exclusion real-time locking protocol and the associated suspension-aware schedulability analysis are proposed. The proposed protocol provides a synchronization framework that enables multiple GPUs to be efficiently integrated in multicore real-time systems. Comparative evaluations show that the proposed methods improve upon the existing work in terms of schedulability.
Funding: The National Key Research and Development Program of China under contract No. 2016YFC1401800; the National Natural Science Foundation of China under contract No. 41576176; the National Programme on Global Change and Air-Sea Interaction; the Dragon 4 Project under contract No. 32292.
Abstract: This study produced a statistical analysis of multicore eddy structures based on 23 years of altimetry data in the global oceans. Multicore structures were identified using a threshold-free closed-contour algorithm of sea surface height, which was improved for this study with respect to certain technical details. In addition, a more accurate definition of the eddy boundary was used to estimate eddy scale. Generally, multicore structures, which have two or more closed eddies of the same polarity within their boundaries, represent an important transitional stage in eddy lives during which the component eddies might experience splitting or merging. In comparison with global eddies, the lifetimes and propagation distances of multicore eddies were found to be much smaller because of their inherent structural instability. However, at the same latitude, the spatial scale of multicore eddies was found to be larger than that of single-core eddies, i.e., the eddy area could be at least twice as large. Multicore eddies were found to exhibit some features similar to global eddies. For example, multicore eddies tend to occur in the Antarctic Circumpolar Current, some western boundary currents, and mid-latitude regions around 25°N/S; the majority (70%) of eddies propagate westward while only 30% propagate eastward; and large-amplitude eddies are restricted mainly to reasonably confined regions of highly unstable currents.
Funding: This work is supported by the National Key Research and Development Plan program of the Ministry of Science and Technology of China (No. 2016YFB0201100). Additionally, this work is supported by the National Laboratory for Marine Science and Technology (Qingdao) Major Project of the Aoshan Science and Technology Innovation Program (No. 2018ASKJ01-04) and the Open Foundation of the Key Laboratory of Marine Science and Numerical Simulation, Ministry of Natural Resources (No. 2021-YB-02).
Abstract: In this paper, a typical experiment is carried out based on a high-resolution air-sea coupled model, namely the coupled ocean-atmosphere-wave-sediment transport (COAWST) model, on both heterogeneous many-core (SW) and homogeneous multicore (Intel) supercomputing platforms. We construct a hindcast of Typhoon Lekima on both the SW and Intel platforms, compare the simulation results between the two platforms, and compare the key elements of the atmospheric and ocean modules with reanalysis data. The comparative experiment in this typhoon case indicates that the domestic many-core computing platform and the general cluster yield almost no differences in the simulated typhoon path and intensity, and the differences in surface pressure (PSFC) in the WRF model and sea surface temperature (SST) in the short-range forecast are very small, whereas a major difference can be identified at high latitudes after the first 10 days. Further heat budget analysis verifies that the differences in SST after 10 days are mainly caused by shortwave radiation variations, as influenced by subsequently generated typhoons in the system. These typhoons, generated in the hindcast after the first 10 days, follow clearly different trajectories on the two platforms.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11772192).
Abstract: Heterogeneous multicore clusters are becoming more popular for high-performance computing owing to their great computing power and cost-to-performance effectiveness. Nevertheless, parallel efficiency degradation is still a problem in large-scale structural analysis based on heterogeneous multicore clusters. To solve it, a hybrid hierarchical parallel algorithm (HHPA) is proposed on the basis of the conventional domain decomposition algorithm (CDDA) and the parallel sparse solver. In this new algorithm, a three-layer parallelization of the computational procedure is introduced to separate the communication between nodes, between heterogeneous core groups (HCGs), and inside heterogeneous core groups by mapping computing tasks to the corresponding hardware layers. This approach not only achieves load balancing at different layers efficiently but also improves the communication rate significantly through hierarchical communication. Additionally, the proposed hybrid parallel approach reduces the interface equation size and further reduces the solution time, which compensates for the growing communication overhead that CDDA incurs as the interface equation size increases. Moreover, distributed sparse storage of a large amount of data is introduced to improve memory access. Benchmark instances solved on the Shenwei-Taihuzhiguang supercomputer show that the proposed method obtains higher speedup and parallel efficiency than CDDA and better scalability of parallel partitioning than the two-level parallel computing algorithm (TPCA).
Funding: Supported by the National Natural Science Foundation of China (No. 61432018).
Abstract: 3D reverse time migration in tilted transversely isotropic media (3D RTM-TTI) is the most precise model for complex seismic imaging. However, the vast computing time of 3D RTM-TTI prevents it from being widely used. This problem is addressed by providing parallel solutions for 3D RTM-TTI on multicores and many-cores. After data parallelism and memory optimization, the hot spot function of 3D RTM-TTI gains a 35.99× speedup on two Intel Xeon CPUs, an 89.75× speedup on one Intel Xeon Phi, and an 89.92× speedup on one NVIDIA K20 GPU compared with the serial CPU baseline. This study makes RTM-TTI practical in industry. Since the computation pattern in RTM is a stencil, the approaches also benefit a wide range of stencil-based applications.
Abstract: This thesis presents the research and practice of a traffic light and traffic sign recognition system based on a multicore FPGA. The system consists of four parts: acquisition of dynamic images, gray-value preprocessing, detection of edges and patterns, and the pattern-matching decision. The multicore system consists of three cores. Each core processes, in parallel, the incoming camera images with respect to different colors and graphic elements. The image data read in from the camera serve as shared data for the three cores.
Abstract: HPC (high performance computing) based on clusters of multicores is one of the main research lines in parallel programming. It is important to study the impact of the shared-memory and message-passing programming paradigms, or a combination of both, on these architectures in order to exploit their power efficiently. The Smith-Waterman algorithm is used as a case study for the local alignment of DNA sequences, which allows establishing the degree of similarity between two sequences. In this paper, the Smith-Waterman algorithm is parallelized by means of a pipeline scheme because of the data dependencies that are inherent to the problem, using the various communication/synchronization models mentioned above and then carrying out a comparative analysis. Finally, experimental results are presented, as well as future research lines.
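The data dependencies that motivate a pipeline scheme can be seen in a plain sequential version of the Smith-Waterman recurrence, sketched below. This is a minimal score-only variant with a linear gap penalty; the scoring values and the function name are assumptions for illustration, and it is not the parallel implementation evaluated in the paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b (linear gap penalty).

    Each cell H[i][j] depends on H[i-1][j-1], H[i-1][j] and H[i][j-1], so a
    row (or block of columns) can only advance once its upper and left
    neighbours are available.
    """
    prev = [0] * (len(b) + 1)   # row i-1 of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)  # row i; column 0 stays 0
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                 # local alignment may restart anywhere
                prev[j - 1] + s,   # diagonal: align a[i-1] with b[j-1]
                prev[j] + gap,     # gap in b
                cur[j - 1] + gap,  # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best
```

In a pipelined decomposition, each worker would typically own a band of the matrix and forward its boundary values to the next worker, which is one way to respect the three dependencies noted in the docstring.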
Abstract: Sparse Matrix Vector Multiplication (SpMV) is one of the most basic problems in scientific and engineering computations. It is the basic operation in many areas, such as solving linear systems or eigenvalue problems. Nowadays, more than 90 percent of the world's highest-performance parallel computers in the Top 500 use a multicore architecture, so it is of practical importance to design efficient methods of computing SpMV on multicore parallel computers. Algorithms based on the compressed sparse row (CSR) format usually suffer from the number of nonzero elements in each row, which makes it hard for them to use the multicore structure efficiently. The Compressed Sparse Block (CSB) format is an effective storage format with which SpMV can be computed efficiently on a multicore computer. This paper presents a parallel multicore CSB format and an SpMV algorithm based on it. We carried out numerical experiments on a parallel multicore computer. The results show that our parallel multicore CSB format and SpMV algorithm can reach high speedup and are highly scalable for banded matrices.
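As background for the CSR discussion, the following minimal sketch shows a row-wise SpMV over the three standard CSR arrays and a helper that exposes the per-row nonzero counts. It illustrates the baseline CSR format only, not the CSB format proposed in the paper; the function names and the small matrix are illustrative.

```python
def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    indptr[r]:indptr[r+1] delimits the nonzeros of row r,
    indices holds their column numbers and data their values.
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):  # rows are independent, so this loop can be split across cores
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y

def row_nnz(indptr):
    """Number of nonzeros per row; a large spread means uneven work per row."""
    return [indptr[r + 1] - indptr[r] for r in range(len(indptr) - 1)]

# Illustrative matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]
y = csr_spmv(indptr, indices, data, [1.0, 2.0, 3.0])
```

When rows are distributed evenly by count rather than by nonzeros, a core that receives unusually dense rows becomes the bottleneck, which is the kind of inefficiency block-based formats aim to reduce.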
Abstract: The propagation of an optical vortex in a hexagonally arranged single-mode multicore fiber structure is investigated for possible generation of additional vortices and their spread dynamics. Fields are separated into a slowly varying paraxial envelope and a rapidly changing exponential component. Solutions are derived from the paraxial inhomogeneous Schrödinger equation in two dimensions along with the index of refraction of the proposed structure. Numerical analyses are based on the beam propagation method and transparent boundary conditions in matrix form with different parameters to represent the intensity and phase of all derived fields. Vortices are numerically identified by their points of zero intensity and their phase change or polarity. The optical interferogram with a plane wave reference is also employed to distinguish the dislocation points in the transverse directions of the propagating fields.
Abstract: The RSS load-balancing strategy used in the PF_RING capture method does not balance load well across the cores of multi-core processor platforms. This paper presents a packet load-balancing method based on FD and RSS. The basic idea is to first match captured packets against 5-tuple filters; packets that cannot be classified are then matched against flow-oriented filters; and packets that still cannot be classified are finally matched by RSS. Experiments were designed to test the packet capture performance and load-balancing performance of the PF_RING capture method under the combined FD+RSS load-balancing strategy and under RSS alone. The results show that the packet stream load-balancing method based on FD+RSS improves packet capture performance and load balancing among multiple cores.
Funding: Supported by the National High-Tech Research and Development 863 Program of China under Grant Nos. 2008AA010901, 2009AA01Z125, 2009AA01Z103; the National Natural Science Foundation of China under Grant Nos. 60736012, 60921002, 60803029, 61050002; the National Basic Research 973 Program of China under Grant No. 2005CB321600; and the Important National Science and Technology Specific Projects under Grant Nos. 2009ZX01028-002-003, 2009ZX01029-001-003.
Abstract: This paper describes the design for testability (DFT) challenges and techniques of the Godson-3 microprocessor, a scalable multicore processor that is based on the scalable mesh of crossbar (SMOC) on-chip network and targets high-end applications. Advanced techniques are adopted to make the DFT design scalable and to achieve low-power, low-cost testing with limited IO resources. To achieve scalable and flexible test access, a highly elaborate test access mechanism (TAM) is implemented to support multiple test instructions and test modes. Taking advantage of the multiple identical cores embedded in the processor, scan partitioning and on-chip comparisons are employed to reduce test power and test time. A test compression technique is also utilized to decrease test time. To further reduce test power, clock control logic is designed with the ability to turn off the clocks of partitions that are not under test. In addition, scan collars of CACHEs are designed to perform functional tests with low-speed ATE for speed-binning purposes, an approach that has low complexity and shows good correlation results.
Funding: The authors acknowledge the support from the European Research Council (724530); the Ministry of Education, Youth and Sports (CZ.02.1.01/0.0/15_003/0000476); the European Regional Development Fund (CZ.02.1.01/0.0/15_003/0000476); the Thüringer Ministerium für Wirtschaft, Wissenschaft und Digitale Gesellschaft; the Federal Ministry of Education and Research, Germany (BMBF); and the Thüringer Aufbaubank.
Abstract: In-vivo microendoscopy in animal models has become a groundbreaking technique in neuroscience that is rapidly expanding our understanding of the brain. Emerging hair-thin endoscopes based on multimode fibres are now opening up the prospect of ultra-minimally invasive neuroimaging of deeply located brain structures. Complementing these advancements with methods of functional imaging and optogenetics, as well as extending their applicability to awake and motile animals, constitute the most pressing challenges for this technology. Here we demonstrate a novel fibre design capable of both high-resolution imaging in immobilised animals and bending-resilient optical addressing of neurons in motile animals. The optimised refractive index profile and the probe structure allowed reaching a spatial resolution of 2 μm across a 230 μm field of view for the initial layout of the fibre. Simultaneously, the fibre exhibits negligible cross-talk between individual inner cores during fibre deformation. This work provides a technological solution for imaging-assisted, spatially selective photo-activation and activity monitoring in awake and freely moving animal models.