Cybertwin-enabled 6th Generation (6G) networks are envisioned to support artificial-intelligence-native management to meet the changing demands of 6G applications. Multi-Agent Deep Reinforcement Learning (MADRL) technologies driven by Cybertwins have been proposed for adaptive task offloading strategies. However, related works do not consider the random transmission delay between Cybertwin-driven agents and the underlying networks, which destroys the standard Markov property and increases the decision reaction time, degrading task offloading performance. To address this problem, we propose a pipelining task offloading method to lower the decision reaction time and model it as a delay-aware Markov Decision Process (MDP). Then, we design a delay-aware MADRL algorithm to minimize the weighted sum of task execution latency and energy consumption. First, the state space is augmented with the last-received state and historical actions to rebuild the Markov property. Second, Gate Transformer-XL is introduced to capture the importance of historical actions and to keep the input dimension consistent, since it otherwise changes dynamically with the random transmission delays. Third, a sampling method and a new loss function, which combines the difference between the current and target state values with the difference between the real and augmented state-action values, are designed to obtain state transition trajectories close to the real ones. Numerical results demonstrate that the proposed methods are effective in reducing reaction time and improving task offloading performance in random-delay Cybertwin-enabled 6G networks.
The implementation of the coordinate rotation digital computer (CORDIC) algorithm with the wave pipelining technique on a field programmable gate array (FPGA) is described. All data in FPGA-based wave pipelining pass through the same number of logic gates, in the same way that all data pass through the same number of registers in a conventional pipeline. Moreover, all paths are routed using identical routing resources. Manual placement, timing-driven routing, and timing analysis techniques are applied to optimize the layout and achieve good path balance. Experimental results show that a circuit with a logic depth of 256 LUTs mapped on an XC4VLX15-12 runs at up to 330 MHz, which is slightly lower than the 336 MHz achieved by conventional 16-stage pipelining on the same chip. The latency of the wave pipelining circuit is 30.3 ns, which is 36.4% shorter than the latency of the 16-stage conventional pipelining circuit.
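For readers unfamiliar with CORDIC, the following is a minimal floating-point sketch of the rotation-mode algorithm that such a circuit evaluates with shifts and adds. It is not the paper's hardware design: the function name `cordic_rotate` is hypothetical, and `iterations=16` is chosen only to echo the 16-stage pipeline mentioned above (that one iteration maps to one stage is an assumption).

```python
import math


def cordic_rotate(theta, iterations=16):
    """Approximate (cos(theta), sin(theta)) for |theta| <= pi/2 (rotation mode)."""
    # Elementary rotation angles atan(2^-i); hardware would store these in a table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # starting from x = 1/gain cancels the accumulated stretch.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate toward the remaining angle
        shift = 2.0 ** -i                # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return x, y


if __name__ == "__main__":
    c, s = cordic_rotate(0.5)
    print(c, s, math.cos(0.5), math.sin(0.5))
```

Each loop iteration uses only a shift, an add or subtract, and a table lookup, which is why the algorithm lends itself to deep pipelines, whether the stages are separated by registers or, as in wave pipelining, by balanced path delays.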
The yield stress of waxy crude oil is a fundamental parameter in pipelining process calculations and in flow safety analysis for heated oil transported by pipeline. Daqing crude oil was studied, and the variation of yield stress with shear history was explored through pipelining simulation experiments. The effect of throughput variation or shear rate on yield stress was found to be insignificant. As the final dynamic cooling temperature decreases, the yield stress of waxy crude oil decreases, although there is a slight increase at the beginning. A prediction model of yield stress for waxy crude oil under shutdown conditions is developed; it can be used to predict the yield stress of Daqing crude oil at a given heating temperature, final dynamic cooling temperature, and measurement temperature. For the 139 groups of yield stress data of Daqing crude oil from the pipelining simulation experiments, the average relative deviation between measured and predicted yield stress is 30.27%, and the correlation coefficient is 0.9623.
Communication optimization is very important for improving the performance of parallel programs. A communication optimization method called HVMP (Half Vector Message Pipelining) is presented. Compared with the widely used vector message pipelining, HVMP achieves a better tradeoff between reducing and hiding communication overhead, and eliminates the communication barrier of barrier synchronization problems [1]. For parallel systems with low bandwidth, such as clusters of workstations, and for barrier synchronization problems with a large amount of communication, the HVMP method performs well.
A software process is a framework for the effective and timely delivery of a software system, and it plays a crucial role in software success. However, the development of large-scale software still faces a crisis of high risk, low quality, high cost, and long cycle time. This paper proposes a three-phase parallel-pipelining software process model for improving speed and productivity and for reducing software costs and risks without sacrificing software quality. Two strategies are presented in this model. One strategy, based on subsystem-cost priority, is used to prevent wasted development cost and to reduce software complexity; the other, used for balancing subsystem complexity, is designed to reduce software complexity in the later development stages. Moreover, the proposed function-detailed and workload-simplified subsystem pipelining software process model offers much higher parallelism than the concurrent incremental model. Finally, component-based product line technology not only ensures software quality and further reduces cycle time, software costs, and software risks, but also makes full and rational use of previous software product resources and enhances the competitiveness of software development organizations.
On the basis of the Floyd algorithm with an extended path matrix, a parallel algorithm that solves the all-pairs shortest path (APSP) problem in a cluster environment is analyzed and designed. The parallel APSP pipelining algorithm makes full use of overlapping computation with communication. Compared with an implementation based on the broadcast operation, the parallel algorithm reduces communication cost. The algorithm has been implemented with MPI on a PC cluster. Theoretical analysis and experimental results show that the parallel algorithm is efficient and scalable.
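As a point of reference, here is a sequential sketch of the Floyd algorithm that keeps a path matrix alongside the distance matrix. It is not the paper's parallel algorithm: the names `floyd_with_paths`, `reconstruct`, and `nxt` are hypothetical, and reading "extended path matrix" as a next-hop matrix is an assumption. The comment on `row_k` marks the data a row-partitioned parallel version would need to communicate at step `k`.

```python
INF = float("inf")


def floyd_with_paths(weights):
    """All-pairs shortest paths; returns (dist, nxt) where nxt[i][j] is the next hop."""
    n = len(weights)
    dist = [row[:] for row in weights]
    nxt = [[j if weights[i][j] != INF else None for j in range(n)] for i in range(n)]
    for i in range(n):
        dist[i][i] = 0
        nxt[i][i] = i

    for k in range(n):
        # In a row-partitioned parallel version, row k is the only remote data
        # each process needs in step k; forwarding it along a chain of processes
        # lets communication overlap with the updates below.
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == INF:
                continue
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def reconstruct(nxt, i, j):
    """Rebuild the vertex sequence of a shortest path from i to j."""
    if nxt[i][j] is None:
        return []
    path = [i]
    while i != j:
        i = nxt[i][j]
        path.append(i)
    return path


if __name__ == "__main__":
    w = [
        [0, 3, INF, 7],
        [8, 0, 2, INF],
        [5, INF, 0, 1],
        [2, INF, INF, 0],
    ]
    dist, nxt = floyd_with_paths(w)
    print(dist[0][3], reconstruct(nxt, 0, 3))
```

The sketch assumes non-negative cycles only (no negative-weight cycles), which is the usual precondition for the Floyd algorithm.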
An adaptive pipelining scheme for an H.264/AVC context-based adaptive binary arithmetic coding (CABAC) decoder for high definition (HD) applications is proposed to solve the data hazards that arise from data dependencies in the CABAC decoding process. An efficiency model of the CABAC decoding pipeline is derived from the analysis of a common pipeline. Based on that model, several adaptive strategies are provided. With these strategies, the pipelining scheme adapts to different types of syntax elements (SEs), and the pipeline does not stall during decoding. In addition, the proposed decoder fully supports the H.264/AVC High 4:2:2 profile, and the experimental results show that its efficiency is much higher than that of other single-engine architectures. Taking both performance and cost into consideration, our design achieves a good tradeoff compared with other work and is sufficient for real-time HD decoding.
Communication overhead is an important factor in massively parallel processing systems and has a dramatic influence on system performance. If communication can be carried out as quickly as possible, system performance can be greatly improved. Based on the TORUS interconnection network, this paper presents pipelined broadcasting, which reduces the broadcasting delay and improves system performance.
This paper offers a new method for software pipelining of nested loops. We first introduce our new software pipelining method, the Ruminate Method, which can optimize programs with nested loops. We also outline an algorithm to realize it and introduce the hardware support we designed. The performance of the Ruminate Method is analyzed at the end of the paper with the aid of our preliminary experimental results.
This paper presents a hardware system for the ZUC-256 stream cipher algorithm, intended to counter advanced security threats to 5G wireless networks. The main innovation of the hardware system is a six-stage pipeline scheme, comprising an initialization stage and a work stage, that speeds up the critical logic paths. Moreover, the pipeline scheme adopts a novel optimized hardware structure to complete the mod (2^31 - 1) calculation quickly. The function of the hardware system has been validated experimentally in detail. Compared with systems of the same type in the recent literature, the logic delay is reduced by 47% at the cost of only 4 additional multiplexers, the throughput reaches 5.26 Gbps, which is at least 45% better, and the throughput per unit area increases by 14.8%. The hardware system provides a faster and safer encryption module for 5G wireless networks.
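To illustrate why arithmetic modulo 2^31 - 1 suits hardware, the sketch below uses the identity 2^31 ≡ 1 (mod 2^31 - 1): the bits above position 31 can simply be folded back onto the low 31 bits, and multiplying by a power of two becomes a 31-bit rotation. This is a software illustration only, not the optimized structure described in the abstract; the function names are hypothetical, and the sketch uses ordinary residues in [0, 2^31 - 2] rather than any cipher-specific encoding of zero.

```python
M31 = (1 << 31) - 1  # the Mersenne prime 2^31 - 1


def reduce_m31(x):
    """Reduce a non-negative integer modulo 2^31 - 1 by folding high bits down."""
    while x > M31:
        x = (x & M31) + (x >> 31)   # uses 2^31 == 1 (mod 2^31 - 1)
    return 0 if x == M31 else x


def add_m31(a, b):
    """Add two residues; one fold suffices because a + b < 2^32."""
    c = a + b
    c = (c & M31) + (c >> 31)       # end-around carry
    return 0 if c == M31 else c


def mul_pow2_m31(a, k):
    """Multiply a residue by 2^k; this is a k-bit left rotation of a 31-bit word."""
    k %= 31
    r = ((a << k) | (a >> (31 - k))) & M31
    return 0 if r == M31 else r


if __name__ == "__main__":
    a, b = 0x12345678, 0x7FFFFFF0
    print(add_m31(a, b), (a + b) % M31)
    print(mul_pow2_m31(a, 15), (a << 15) % M31)
```

Because none of these operations needs a divider or a general multiplier, the reduction costs only adders and wiring, which is the kind of path a pipeline stage can absorb.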
Strong surface impact produces strong vibration, which threatens the safety of nearby buried pipelines and other important lifeline projects. Based on a verified numerical method, a comprehensive numerical parametric analysis is conducted on the key factors influencing the vibration isolation hole (VIH): hole diameter, net hole spacing, hole depth, number of holes, hole arrangement, and soil parameters. The results indicate that a smaller ratio of net spacing to hole diameter, deeper holes, multiple rows of holes, staggered hole arrangements, and better site soil conditions can enhance the efficiency of the VIH barrier. The average maximum vibration reduction efficiency within the vibration isolation area can reach 42.2%. The vibration safety of adjacent oil pipelines during a dynamic compaction project was evaluated according to existing standards, and the VIH measure was recommended to reduce excessive vibration. A single-row vibration isolation scheme and a three-row staggered arrangement with the same hole parameters are suggested for different cases. The research findings can serve as a reference for the vibration safety analysis, assessment, and control of adjacent underground facilities under strong surface impact loads.
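A student team from the College of Computer Science, Sichuan University, has made important progress in research on parameter-efficient fine-tuning systems for large language models. Their paper, "mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs", was formally published in the Research Track of the international database conference VLDB 2025. VLDB (International Conference on Very Large Data Bases) is one of the major international academic conferences in the database field, covering database management systems, data-intensive systems, and large-scale data processing. The work has been deployed in the production environments of several internet companies in China and abroad, and one Chinese invention patent application and one US invention patent application for it have been accepted.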
Global software pipelining is a complex but efficient compilation technique for exploiting instruction-level parallelism in loops with branches. This paper presents a novel global software pipelining technique, called Trace Software Pipelining, targeted at instruction-level parallel processors such as Very Long Instruction Word (VLIW) and superscalar machines. Trace software pipelining applies a global code scheduling technique to compact the original loop body. The resulting loop is called trace software pipelined (TSP) code. TSP code can be executed directly with special architectural support, or it can be transformed into a globally software pipelined loop for current VLIW and superscalar processors. Thus, parallelism across all iterations of a loop can be exploited by compacting the original loop body with any global code scheduling technique, which makes the new technique very promising for practical compilers. Finally, we present preliminary experimental results to support our approach.
Widespread use of green hydrogen is a critical route to achieving a carbon-neutral society, but it cannot be accomplished without extensive hydrogen distribution. Hydrogen pipelines are the most energy-efficient approach to transporting hydrogen in areas with high, long-term demand for hydrogen. A well-known fact is that the properties of hydrogen differ from those of natural gas, which leads to significant variations in the pipeline transportation process. In addition, hydrogen can degrade the mechanical properties of steels, thereby affecting pipeline integrity. This situation has led to two inevitable key challenges in the current development of hydrogen-pipeline technology: economic viability and safety. Based on a review of the current state of hydrogen pipelines, including material compatibility with hydrogen, design methods, process operations, safety monitoring, and standards, this paper highlights key knowledge gaps in gaseous hydrogen pipelines. These gaps include the utilisation of high-strength materials for hydrogen pipelines, design of high-quality hydrogen pipelines, determination of hydrogen velocity, and repurposing of existing natural-gas pipelines. This review aims to identify the challenges in current hydrogen pipeline development and provide valuable suggestions for future research.
Dynamic neural network (NN) techniques are increasingly important because they enable deep learning with more complex network architectures. However, existing studies on deep neural network (DNN) accelerators usually focus on static neural networks, optimizing static computational graphs with static scheduling methods. We analyze the execution process of dynamic neural networks and observe that dynamic features introduce challenges for efficient scheduling and pipelining in existing DNN accelerators. We propose DyPipe, a holistic approach to optimizing dynamic neural network inference in enhanced DNN accelerators. DyPipe achieves significant performance improvements for dynamic neural networks while introducing negligible overhead for static neural networks. Our evaluation demonstrates that DyPipe achieves a 1.7x speedup on dynamic neural networks and maintains more than 96% of the performance for static neural networks.
The relative stiffness between underground structures and surrounding soil may significantly influence the dynamic response of such structures. In this study, two underground pipelines were fabricated using rubber joints with varying stiffness, and the corresponding dynamic response was evaluated. Model soils were prepared based on similarity ratios. Next, reduced-scale shaking table tests were conducted to investigate the impact of circular underground structures with varying stiffness joints on the amplification of ground acceleration, dynamic response, and deformation patterns of the underground pipelines. The comparative analysis showed that structures with lower stiffness exert less constraint on the surrounding soil, resulting in a higher amplification factor of ground acceleration. The seismic response of less stiff structures is generally 1.1 to 1.3 times the response of the stiffer structures. Therefore, the seismic response of the variable stiffness pipeline exhibits pronounced characteristics. Rubber joints effectively reduce the seismic response of underground structures, demonstrating favorable isolation effects. Consequently, relative stiffness plays a crucial role in the seismic design of underground structures, and the use of rubber materials in underground structures is advantageous.
Reconfigurable computing tries to balance the high efficiency of custom computing with the flexibility of general-purpose computing. This paper presents the implementation techniques in LEAP, a coarse-grained reconfigurable array, and proposes a speculative execution mechanism for dynamic loop scheduling, with the goal of one iteration per cycle, together with implementation techniques that support decoupled synchronization between the token generator and the collector. This paper also introduces techniques for exploiting both intra- and inter-iteration data dependences, with the help of two instructions for special data reuse in loop-carried dependences. The experimental results show that the number of memory accesses is, on average, 3% of that of a RISC processor simulator with no memory optimization. In a practical image matching application, the LEAP architecture achieves a speedup of about 34 times in execution cycles compared with general-purpose processors.
A numerical simulation analysis is conducted to examine the unsteady hydrodynamic characteristics of vortex-induced vibration (VIV) and the suppression effect of helical strakes on VIV in subsea pipelines. The analysis uses the standard k-ε turbulence model for 4.5- and 12.75-inch pipes, and its accuracy is verified by comparing the results with large-scale hydrodynamic experiments. These experiments are designed to evaluate the suppression efficiency of VIV with and without helical strakes, focusing on displacement and drag coefficients under different flow conditions. Furthermore, the influence of important geometric parameters of the helical strakes on drag coefficients and VIV suppression efficiency at different flow rates is compared and discussed. Numerical results agree well with experimental data for drag coefficient and vortex shedding frequency. Spring-pipe self-excited vibration experimental tests reveal that the installation of helical strakes substantially reduces the drag coefficient of VIV within a certain flow rate range, achieving suppression efficiencies exceeding 90% with strake heights larger than 0.15D. Notably, the optimized parameter combination of helical strakes, with a pitch of 15D, a fin height of 0.2D, and 45° edge slopes, maintains high suppression efficiency, thereby exhibiting superior performance. This study provides a valuable reference for the design and application of helical strakes and VIV suppression in subsea engineering.
Funding (Cybertwin-enabled 6G task offloading study): National Key Research and Development Program of China under Grant 2019YFB1803301; Beijing Natural Science Foundation (L202002).
Funding (waxy crude oil yield stress study): Project (07E1007) supported by the Youth Innovation Foundation for Petroleum Science and Technology of China National Petroleum Corporation; Project (2006AA09Z357) supported by the National High Technology Research and Development Program of China.
Funding (parallel APSP pipelining study): National Natural Science Foundation of China under Grant No. 60671033.
Funding (H.264/AVC CABAC decoder study): National Natural Science Foundation of China (No. 61076021); National Basic Research Program of China (No. 2009CB320903); China Postdoctoral Science Foundation (No. 2012M511364).
Funding (ZUC-256 hardware system study): supported in part by the National R&D Program for Major Research Instruments of China (Grant No. 62027814); the National Natural Science Foundation of China (Grant No. 62104054); the Natural Science Foundation of Heilongjiang Province (Grant No. F2018010); the Postdoctoral Science Foundation of Heilongjiang Province, China (No. LBH-Z20133); and the Fundamental Research Funds for the Central Universities, China (3072021CF0806).
Funding (vibration isolation hole study): National Natural Science Foundation of China under Grant Nos. 52078386 and 52308496; SINOMACH Youth Science and Technology Fund under Grant No. QNJJ-PY-2022-02; Young Elite Scientists Sponsorship Program under Grant No. BYESS2023432; Fund of State Key Laboratory of Precision Blasting and Hubei Key Laboratory of Blasting Engineering, Jianghan University, under Grant No. PBSKL2023A9; Fund of China Railway Construction Group Co., Ltd. under Grant No. LX19-04b.
Abstract: Strong surface impacts produce strong vibrations, which threaten the safety of nearby buried pipelines and other important lifeline projects. Based on a verified numerical method, a comprehensive numerical parameter analysis is conducted on the key factors influencing the vibration isolation hole (VIH), including hole diameter, net hole spacing, hole depth, number of holes, hole arrangement, and soil parameters. The results indicate that a smaller ratio of net spacing to hole diameter, deeper holes, multiple rows of holes, staggered hole arrangements, and better site soil conditions enhance the efficiency of the VIH barrier. The average maximum vibration reduction efficiency within the vibration isolation area can reach 42.2%. The vibration safety of adjacent oil pipelines during a dynamic compaction project was evaluated according to existing standards, and the VIH measure was recommended to reduce excessive vibration. A single-row vibration isolation scheme and a three-row staggered arrangement with the same hole parameters are suggested for different cases. The research findings can serve as a reference for the vibration safety analysis, assessment, and control of adjacent underground facilities under strong surface impact loads.
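Abstract: A student team from the College of Computer Science at Sichuan University has made important progress in research on parameter-efficient fine-tuning systems for large language models. Their paper, "mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs", was formally published in the VLDB 2025 Research Track. VLDB (International Conference on Very Large Data Bases) is one of the major international academic conferences in the database field, covering database management systems, data-intensive systems, and large-scale data processing. The work has been deployed in the production environments of several internet companies in China and abroad, and one Chinese invention patent application and one US invention patent application based on it have been accepted for examination.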
Abstract: Global software pipelining is a complex but efficient compilation technique for exploiting instruction-level parallelism in loops with branches. This paper presents a novel global software pipelining technique, called Trace Software Pipelining, targeted at instruction-level parallel processors such as Very Long Instruction Word (VLIW) and superscalar machines. Trace software pipelining applies a global code scheduling technique to compact the original loop body. The resulting loop is called trace software pipelined (TSP) code. The trace software pipelined code can be executed directly with special architectural support, or it can be transformed into a globally software pipelined loop for current VLIW and superscalar processors. Thus, parallelism across all iterations of a loop can be exploited by compacting the original loop body with any global code scheduling technique. This makes our new technique very promising for practical compilers. Finally, we present preliminary experimental results to support our new approach.
Funding: Supported by the National Key Research and Development Program of China (No. 2022YFB4003400), the Key Research and Development Program of Zhejiang Province of China (No. 2023C01225), and the State Key Laboratory of Clean Energy Utilization, China.
Abstract: Widespread use of green hydrogen is a critical route to achieving a carbon-neutral society, but it cannot be accomplished without extensive hydrogen distribution. Hydrogen pipelines are the most energy-efficient approach to transporting hydrogen in areas with high, long-term demand for hydrogen. The properties of hydrogen differ from those of natural gas, which leads to significant variations in the pipeline transportation process. In addition, hydrogen can degrade the mechanical properties of steels, thereby affecting pipeline integrity. This situation has led to two key challenges in the current development of hydrogen-pipeline technology: economic viability and safety. Based on a review of the current state of hydrogen pipelines, including material compatibility with hydrogen, design methods, process operations, safety monitoring, and standards, this paper highlights key knowledge gaps in gaseous hydrogen pipelines. These gaps include the utilisation of high-strength materials for hydrogen pipelines, design of high-quality hydrogen pipelines, determination of hydrogen velocity, and repurposing of existing natural-gas pipelines. This review aims to identify the challenges in current hydrogen pipeline development and provide valuable suggestions for future research.
Funding: Supported by the Beijing Natural Science Foundation under Grant No. JQ18013, the National Natural Science Foundation of China under Grant Nos. 61925208, 61732007, 61732002 and 61906179, the Strategic Priority Research Program of Chinese Academy of Sciences (CAS) under Grant No. XDB32050200, the Youth Innovation Promotion Association CAS, Beijing Academy of Artificial Intelligence (BAAI), and Xplore Prize.
Abstract: Dynamic neural network (NN) techniques are increasingly important because they enable deep learning with more complex network architectures. However, existing studies on deep neural network (DNN) accelerators usually focus on static neural networks, predominantly optimizing static computational graphs with static scheduling methods. We analyze the execution process of dynamic neural networks and observe that dynamic features introduce challenges for efficient scheduling and pipelining in existing DNN accelerators. We propose DyPipe, a holistic approach to optimizing dynamic neural network inference in enhanced DNN accelerators. DyPipe achieves significant performance improvements for dynamic neural networks while introducing negligible overhead for static neural networks. Our evaluation demonstrates that DyPipe achieves a 1.7x speedup on dynamic neural networks and maintains more than 96% of the performance for static neural networks.
Funding: Key International (Regional) Joint Research Project under Grant No. 52020105002; National Natural Science Foundation of China under Grant No. 51991393.
Abstract: The relative stiffness between underground structures and surrounding soil may significantly influence the dynamic response of such structures. In this study, two underground pipelines were fabricated using rubber joints with varying stiffness, and the corresponding dynamic response was evaluated. Model soils were prepared based on similarity ratios. Next, reduced-scale shaking table tests were conducted to investigate the impact of circular underground structures with varying stiffness joints on the amplification of ground acceleration, dynamic response, and deformation patterns of the underground pipelines. The comparative analysis showed that structures with lower stiffness exert less constraint on the surrounding soil, resulting in a higher amplification factor of ground acceleration. The seismic response of less stiff structures is generally 1.1 to 1.3 times the response of the stiffer structures. Therefore, the seismic response of the variable stiffness pipeline exhibits pronounced characteristics. Rubber joints effectively reduce the seismic response of underground structures, demonstrating favorable isolation effects. Consequently, relative stiffness plays a crucial role in the seismic design of underground structures, and the use of rubber materials in underground structures is advantageous.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60633050, 60621003) and the National High Technology Research and Development Program of China (Grant No. 2007AA01Z06).
Abstract: Reconfigurable computing tries to balance the high efficiency of custom computing with the flexibility of general-purpose computing. This paper presents the implementation techniques in LEAP, a coarse-grained reconfigurable array, and proposes a speculative execution mechanism for dynamic loop scheduling with the goal of one iteration per cycle, together with implementation techniques that support decoupled synchronization between the token generator and the collector. This paper also introduces techniques for exploiting both intra-iteration and inter-iteration data dependences, with the help of two instructions for special data reuse in loop-carried dependences. The experimental results show that the number of memory accesses is on average 3% of that of a RISC processor simulator with no memory optimization. In a practical image matching application, the LEAP architecture achieves a speedup of about 34 times in execution cycles compared with general-purpose processors.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52222111), the National Science and Technology Major Project of China "Key Technologies and Equipment for Deepwater Dry Oil and Gas Production and Processing Platforms" (No. 2024ZD1403300), Subproject 5 "Research on Safety Risk Assessment Technology System for Deepwater Dry Oil and Gas Production and Processing Platforms" (No. 2024ZD1403305), and the China Scholarship Council (202306440019).
Abstract: A numerical simulation analysis is conducted to examine the unsteady hydrodynamic characteristics of vortex-induced vibration (VIV) and the suppression effect of helical strakes on VIV in subsea pipelines. The analysis uses the standard k−ε turbulence model for 4.5- and 12.75-inch pipes, and its accuracy is verified by comparing the results with large-scale hydrodynamic experiments. These experiments are designed to evaluate the suppression efficiency of VIV with and without helical strakes, focusing on displacement and drag coefficients under different flow conditions. Furthermore, the influence of important geometric parameters of the helical strakes on drag coefficients and VIV suppression efficiency at different flow rates is compared and discussed. Numerical results agree well with experimental data for drag coefficient and vortex shedding frequency. Spring-pipe self-excited vibration experimental tests reveal that the installation of helical strakes substantially reduces the drag coefficient of VIV within a certain flow rate range, achieving suppression efficiencies exceeding 90% with strake heights larger than 0.15D. Notably, the optimized parameter combination of helical strakes, with a pitch of 15D, a fin height of 0.2D, and 45° edge slopes, maintains high suppression efficiency, thereby exhibiting superior performance. This study provides a valuable reference for the design and application of helical strakes and VIV suppression in subsea engineering.