Abstract: This paper presents an implementation of a novel signal-processing co-processor based on Geometric Algebra, tailored for fast, complex geometric calculations in multiple dimensions. This is the first hardware implementation of Geometric Algebra to specifically address scalability across multiple (1 to 8) dimensions. The implementation is described in detail, with particular focus on the optimization techniques used to improve performance. The results demonstrate at least a 3x performance improvement over previously published work.
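To make the dimension scaling concrete, the following is a minimal software sketch of the geometric product, not the co-processor's design. It assumes a Euclidean metric and uses the common bitmask encoding of basis blades; the names `reorder_sign` and `geometric_product` are hypothetical. A multivector in n dimensions has up to 2^n components, so the 8-dimensional case has 256.

```python
def reorder_sign(a: int, b: int) -> int:
    """Sign picked up when the basis vectors of blade a followed by blade b
    are sorted into canonical order (each swap flips the sign)."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1


def geometric_product(x: dict, y: dict) -> dict:
    """Geometric product of two multivectors stored as {blade_bitmask: coeff}.

    Bit i of a bitmask marks the presence of basis vector e_(i+1).
    Assumes a Euclidean metric, so every basis vector squares to +1 and
    the resulting blade is simply the XOR of the two bitmasks.
    """
    out = {}
    for blade_a, coeff_a in x.items():
        for blade_b, coeff_b in y.items():
            blade = blade_a ^ blade_b
            term = reorder_sign(blade_a, blade_b) * coeff_a * coeff_b
            out[blade] = out.get(blade, 0.0) + term
    # Drop components that cancelled exactly.
    return {k: v for k, v in out.items() if v != 0.0}


# Basis vectors e1, e2 and the bivector e12 in a 2-D algebra.
e1 = {0b01: 1.0}
e2 = {0b10: 1.0}
e12 = geometric_product(e1, e2)
```

Nothing in the sketch fixes the dimension: the same two functions handle any bitmask width, and only the number of components (and therefore the work per product) grows with n.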
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60633050, 60833004, and 60903057, and the National High-Technology Research and Development 863 Program of China under Grant No. 2009AA01Z101.
Abstract: QR and LU decompositions are the most important matrix decomposition algorithms. Many studies accelerate these algorithms on FPGAs or ASICs on a case-by-case basis. In this paper, we propose a unified framework for matrix decomposition that combines three QR decomposition algorithms and the LU algorithm with pivoting into a unified linear array structure. The QR and LU decomposition algorithms exhibit the same two-level loop structure and the same data dependencies. Exploiting these similarities, we derive a unified fine-grained algorithm for all four matrix decomposition algorithms. Furthermore, we present a unified co-processor structure with a scalable linear array of processing elements (PEs), in which the four types of PEs share the same memory-channel structure and PE connections and differ only in the internal structure of the data path. Our unified co-processor, which uses IEEE 32-bit floating-point precision, is implemented and mapped onto a Xilinx Virtex-5 FPGA chip. Experimental results show that our co-processors achieve speedups of 2.3x to 14.9x compared to a Pentium Dual CPU with double SSE threads.
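The shared two-level loop structure can be illustrated in software. The sketch below is not the authors' fine-grained algorithm or PE design; it only places LU with partial pivoting next to one QR variant so the common skeleton is visible. The abstract does not name its three QR algorithms, so the choice of modified Gram-Schmidt here is an assumption, and the function names are hypothetical.

```python
import math


def lu_partial_pivot(a):
    """LU with partial pivoting. Returns (perm, lu), where lu packs the
    unit-lower factor L below the diagonal and U on and above it, and
    row i of the permuted matrix is row perm[i] of the input."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):                       # outer loop: process column k
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]             # multipliers: column k of L
        for j in range(k + 1, n):            # inner loop: update column j with column k
            for i in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return perm, lu


def qr_mgs(a):
    """QR by modified Gram-Schmidt. Returns (q, r) with q having
    orthonormal columns and r upper triangular."""
    m, n = len(a), len(a[0])
    q = [row[:] for row in a]
    r = [[0.0] * n for _ in range(n)]
    for k in range(n):                       # outer loop: process column k
        r[k][k] = math.sqrt(sum(q[i][k] ** 2 for i in range(m)))
        if r[k][k] == 0.0:
            raise ValueError("columns are linearly dependent")
        for i in range(m):
            q[i][k] /= r[k][k]               # normalise column k
        for j in range(k + 1, n):            # inner loop: update column j with column k
            r[k][j] = sum(q[i][k] * q[i][j] for i in range(m))
            for i in range(m):
                q[i][j] -= r[k][j] * q[i][k]
    return q, r
```

In both routines, step k first finalises column k and then uses it to update each trailing column j > k, so column j at step k depends only on column k from the same step. This is one way to read the "same data dependency" that the abstract cites as the basis for a shared linear array.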
Funding: Project (No. KBü-BAP-13/1-DR-011) supported by the Department of Bilimsel Arastírma Progeleri, Karabük University, Turkey.
Abstract: Kinematic calculations are complicated, so in real-time motion control of multi-degree-of-freedom (MDOF) systems it is slow and difficult to obtain the desired accurate result with a single processor. An additional calculation unit is needed, especially as the number of degrees of freedom increases. The main central processing unit (CPU) carries extra load because of the numerous motion elements that move independently of each other and their closed-loop controls. The system design is also complicated by the many parts and the cabling. This paper presents the design and implementation of hardware that addresses these problems. It is realized using the Very High Speed Integrated Circuit Hardware Description Language (VHDL) and a field-programmable gate array (FPGA). The hardware is designed for a six-legged robot and has been working with servo motors controlled via the serial port. The hardware on the FPGA calculates the required joint angles for the foot positions received from the serial port and sends the calculated angles to the servo motors via the serial port. It includes a co-processor for the calculation of the kinematic equations and can be used together with equipment that reduces electromechanical clutter. It is intended as a tool that accelerates the transition from design to application for robots.
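The step from a foot position to joint angles can be sketched in software. The abstract gives neither the leg geometry nor the equations implemented on the FPGA, so everything below is an assumption: a common 3-joint leg (coxa, femur, tibia), a leg frame with its origin at the hip joint and z pointing up, and hypothetical link lengths and function names.

```python
import math


def leg_ik(x, y, z, coxa, femur, tibia):
    """Inverse kinematics for an assumed 3-joint leg.

    (x, y, z) is the foot position in the leg frame. Returns the angles
    (theta1, theta2, theta3) in radians: hip yaw, femur pitch, and tibia
    pitch relative to the femur (knee-up solution).
    """
    theta1 = math.atan2(y, x)
    r = math.hypot(x, y) - coxa              # horizontal reach past the coxa
    d2 = r * r + z * z                       # squared distance femur joint -> foot
    c3 = (d2 - femur ** 2 - tibia ** 2) / (2.0 * femur * tibia)
    if abs(c3) > 1.0:
        raise ValueError("target out of reach")
    theta3 = -math.acos(c3)
    theta2 = math.atan2(z, r) - math.atan2(
        tibia * math.sin(theta3), femur + tibia * math.cos(theta3)
    )
    return theta1, theta2, theta3


def leg_fk(theta1, theta2, theta3, coxa, femur, tibia):
    """Forward kinematics for the same assumed leg, used to check leg_ik."""
    r = coxa + femur * math.cos(theta2) + tibia * math.cos(theta2 + theta3)
    z = femur * math.sin(theta2) + tibia * math.sin(theta2 + theta3)
    return r * math.cos(theta1), r * math.sin(theta1), z


# Hypothetical link lengths in metres and a reachable foot target.
LINKS = (0.05, 0.08, 0.12)
angles = leg_ik(0.12, 0.05, -0.08, *LINKS)
foot = leg_fk(*angles, *LINKS)
```

Feeding the angles back through `leg_fk` should reproduce the target position, which is a convenient check before sending any values to servos. A six-legged robot would repeat this calculation once per leg.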