This paper presents an improved hardware structure for the computation of the Whirlpool hash function. By merging the round key computation with the data compression and by using embedded memories to perform part of the Galois Field (28 ) multiplication, a core can be implemented in just 43% of the area of the best current related art while achieving a 12% higher throughput. The proposed core improves the Throughput per Slice compared to the state of the art by 160%, achieving a throughput of 5.47 Gbit/s with 2110 slices and 32 BRAMs on a VIRTEX II Pro FPGA. Results for a real application are also presented by considering a polymorphic computational approach.