Floating-point summation is one of the most important operations in scientific and numerical computing and a basic subroutine (SUM) in the BLAS (Basic Linear Algebra Subprograms) library. However, summation algorithms built on standard floating-point arithmetic do not always produce accurate results, because catastrophic cancellation can occur. Worse, the order in which the additions are performed affects the final result, so the same input dataset can yield different sums on different computer platforms and compilers. The emergence of high-density reconfigurable hardware devices makes it possible to customize high-performance arithmetic units for specific computing problems. In this paper, we design an FPGA-based hardware algorithm for accurate floating-point summation using a group-alignment technique. The corresponding fully pipelined summation unit is proven to provide similar or even better numeri...
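The order dependence described above can be reproduced in a few lines of software. The following Python sketch is an illustration only and is not the hardware design of this paper: `naive_sum` and `group_aligned_sum` are hypothetical names, the 64 guard bits are an assumed width, and the second function reflects one common reading of group alignment (shift every mantissa to the largest exponent in the group, accumulate in wide fixed point, round once at the end).

```python
import math


def naive_sum(values):
    """Left-to-right summation in double precision (rounds after every add)."""
    total = 0.0
    for v in values:
        total += v
    return total


def group_aligned_sum(values, guard_bits=64):
    """Software analogue of group alignment (assumed behavior, not the paper's unit).

    Every mantissa is shifted to the largest exponent in the group, the
    shifted mantissas are added as integers, and the result is rounded once.
    """
    parts = [math.frexp(v) for v in values if v != 0.0]
    if not parts:
        return 0.0
    e_max = max(e for _, e in parts)
    acc = 0  # Python ints are arbitrary precision, so the adds are exact
    for m, e in parts:
        mant = int(m * 2**53)             # signed 53-bit integer mantissa
        shift = guard_bits - (e_max - e)  # alignment relative to e_max
        if shift >= 0:
            acc += mant << shift
        else:
            acc += mant >> -shift         # bits below the guard range are dropped (floor)
    # acc is scaled by 2**(53 + guard_bits) relative to 2**e_max
    return math.ldexp(acc, e_max - 53 - guard_bits)


data = [1e16, 1.0, -1e16, 1.0]
reordered = [1e16, -1e16, 1.0, 1.0]

print(naive_sum(data))               # 1.0: the first 1.0 is absorbed by 1e16
print(naive_sum(reordered))          # 2.0: same inputs, different order
print(math.fsum(data))               # 2.0: correctly rounded reference
print(group_aligned_sum(data))       # 2.0
print(group_aligned_sum(reordered))  # 2.0
```

For summands whose exponents differ by no more than `guard_bits`, the integer accumulation is exact and therefore independent of the input order; only the final conversion rounds. A hardware unit would use a fixed accumulator width rather than Python's unbounded integers, so this sketch says nothing about the area or latency of an FPGA implementation.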