1 We present a fast, efficient, and parameterized modular multiplier and a secure exponentiation circuit especially intended for FPGAs on the low end of the price range. The desig...
Recently a GPU has acquired programmability to perform general purpose computation fast by running ten thousands of threads concurrently. This paper presents a new algorithm for d...
As the number of cores on a single-chip grows, scalable barrier synchronization becomes increasingly difficult to implement. In software implementations, such as the tournament ba...
— In this paper, we propose a new architecture that can extract information of numerous objects in an image at highspeed. Various characteristics can be obtained from the image m...
Stochastic simulations and other scientific applications that depend on random numbers are increasingly implemented in a parallelized manner in programmable logic. High-quality ps...