Computing cores to be implemented on FPGAs may involve divisions by small integer constants in fixed or floating point. This article presents a family of architectures addressing this need. They are derived from a simple recurrence whose body can be implemented very efficiently as a look-up table that matches the hardware resources of the target FPGA. For instance, division of a 32-bit integer by the constant 3 may be implemented by a combinational circuit of 48 LUT6 on a Virtex-5. Other options are studied, including iterative implementations and architectures based on embedded memory blocks. The technique yields the remainder as well as the quotient. An efficient implementation of the correctly rounded division of a floating-point number by such a small integer constant is also presented.
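
To make the recurrence concrete, the following Python sketch models one plausible reading of it: ordinary long division in radix 2^k, where each step takes the running remainder and the next k-bit digit of the dividend and looks up the quotient digit and the new remainder in a table. The abstract does not spell out the recurrence, so this form is an assumption, as are the names `build_table` and `divide_by_constant` and the parameters `n_bits` and `k`. It is a software model of the idea, not the architecture itself.

```python
def build_table(d, k):
    """Tabulate one recurrence step for divisor d and radix 2**k.

    Maps (remainder, digit) -> (quotient digit, new remainder).
    This dictionary plays the role of the hardware look-up table.
    """
    return {
        (r, x): divmod((r << k) + x, d)
        for r in range(d)
        for x in range(1 << k)
    }


def divide_by_constant(x, d, n_bits=32, k=4):
    """Divide an n_bits-wide unsigned integer x by the small constant d.

    Digits are consumed most significant first; returns (quotient, remainder).
    """
    table = build_table(d, k)
    n_digits = -(-n_bits // k)  # ceiling of n_bits / k
    mask = (1 << k) - 1
    q, r = 0, 0
    for i in reversed(range(n_digits)):
        digit = (x >> (i * k)) & mask
        q_digit, r = table[(r, digit)]  # one table look-up per digit
        q = (q << k) | q_digit
    return q, r


if __name__ == "__main__":
    print(divide_by_constant(100, 3))  # (33, 1)
```

Under this assumed form, dividing by 3 with k = 4 needs a table with 6 input bits (a 2-bit remainder plus a 4-bit digit) and 6 output bits (a 4-bit quotient digit plus a 2-bit remainder). That is 6 LUT6 per digit and 8 digits for a 32-bit dividend, which would be consistent with the 48 LUT6 figure quoted above.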