In this paper, we present a parallel-array implementation of a new non-restoring square root algorithm (PASQRT). The parallel array uses carry-save adders (CSAs). The PASQRT differs from other implementations in four respects. First, it does not use a redundant representation for the square root result. Second, each iteration generates an exact result value. Third, it does not require any conversion on the inputs of the CSA. Fourth, a precise remainder can be obtained immediately. Furthermore, we present an improved version, a root-select parallel-array implementation (RS-PASQRT), for fast result value generation. The RS-PASQRT can achieve a speedup ratio of up to about 150% over the PASQRT. The simplicity of the implementations indicates that the proposed approach is an alternative worth considering when designing a fully pipelined square root unit.
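For orientation, the following is a minimal software sketch of a textbook sequential non-restoring integer square root. It is not the PASQRT array itself, and it does not model the carry-save adders or the root-select variant. It only illustrates two of the properties listed above: every iteration fixes one exact, non-redundant root bit, and the exact remainder follows from a single final correction. The function name `nonrestoring_sqrt` and the `width` parameter are assumptions made for this illustration.

```python
def nonrestoring_sqrt(d: int, width: int = 32) -> tuple[int, int]:
    """Return (q, r) with q = floor(sqrt(d)) and r = d - q*q.

    Illustrative sketch only: `width` (an even bit width for the
    radicand) and the function name are hypothetical choices.
    """
    if width <= 0 or width % 2 != 0:
        raise ValueError("width must be a positive even number")
    if not 0 <= d < (1 << width):
        raise ValueError("radicand does not fit in the given width")

    q = 0  # partial root, one exact bit appended per iteration
    r = 0  # partial remainder, allowed to go negative (non-restoring)

    # Consume the radicand two bits at a time, most significant pair first.
    for i in range(width // 2 - 1, -1, -1):
        pair = (d >> (2 * i)) & 0b11
        if r >= 0:
            # Previous remainder non-negative: shift in the pair, subtract (4q + 1).
            r = 4 * r + pair - ((q << 2) | 1)
        else:
            # Previous remainder negative: shift in the pair, add (4q + 3)
            # instead of restoring the remainder first.
            r = 4 * r + pair + ((q << 2) | 3)
        # The sign of the new remainder selects the next root bit.
        q = (q << 1) | (1 if r >= 0 else 0)

    # A single final correction yields the exact non-negative remainder.
    if r < 0:
        r += (q << 1) | 1
    return q, r
```

Under these assumptions, `nonrestoring_sqrt(10, width=4)` yields a root of 3 and a remainder of 1, since 3² + 1 = 10.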