GPGPU is becoming increasingly important, but when using CUDA-enabled GPUs, the special characteristics of NVIDIA's SIMT architecture have to be considered. In particular, it is not possible to run functions concurrently, although NVIDIA's GPUs consist of many processing units. Therefore, the processing power of a GPU cannot be shared among processes, and to use the GPU efficiently, it has to be fully utilized by a single function launch of a single process. In this contribution, we present an approach that overcomes these restrictions. A GPGPU service launches a persistent kernel that consists of a set of device functions. The service controls kernel execution via memory transfers and provides interfaces through which clients can access the device functions. Using this novel approach, the GPU is shared by many clients at the same time, which greatly increases flexibility without loss of performance.
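The control pattern described above, a long-running kernel that polls memory for commands and dispatches to one of several functions, can be illustrated with a minimal CPU-only analogue in C++. This is a sketch under stated assumptions, not the implementation presented in this contribution: a `std::thread` stands in for the persistent kernel, a struct of atomics stands in for the memory region the service writes to, and all names (`Mailbox`, `Command`, `persistent_worker`, `call`, `square`, `negate`, `demo`) are hypothetical.

```cpp
#include <atomic>
#include <functional>
#include <thread>

// Hypothetical command codes; the source does not name its device functions.
enum Command : int { CMD_IDLE = 0, CMD_SQUARE = 1, CMD_NEGATE = 2, CMD_EXIT = 3 };

// Stand-in for the memory region through which the service controls the kernel.
struct Mailbox {
    std::atomic<int> command{CMD_IDLE};  // written by the service, cleared by the worker
    std::atomic<bool> done{false};       // set by the worker when a result is ready
    int argument = 0;                    // request payload
    int result = 0;                      // response payload
};

// Two toy functions standing in for the set of device functions.
static int square(int x) { return x * x; }
static int negate(int x) { return -x; }

// Stand-in for the persistent kernel: launched once, it keeps polling the
// mailbox and dispatches each request until it receives CMD_EXIT.
void persistent_worker(Mailbox& box) {
    for (;;) {
        int cmd = box.command.load();
        if (cmd == CMD_IDLE) { std::this_thread::yield(); continue; }
        if (cmd == CMD_EXIT) return;
        box.result = (cmd == CMD_SQUARE) ? square(box.argument)
                                         : negate(box.argument);
        box.command.store(CMD_IDLE);  // mark the slot as free again
        box.done.store(true);         // publish the result
    }
}

// Service side: post one request and wait until the worker has answered.
int call(Mailbox& box, Command cmd, int arg) {
    box.argument = arg;
    box.done.store(false);
    box.command.store(cmd);
    while (!box.done.load()) std::this_thread::yield();
    return box.result;
}

// Starts the worker once, issues two requests, then shuts the worker down.
int demo() {
    Mailbox box;
    std::thread worker(persistent_worker, std::ref(box));
    int a = call(box, CMD_SQUARE, 7);  // dispatched to square
    int b = call(box, CMD_NEGATE, 5);  // dispatched to negate
    box.command.store(CMD_EXIT);
    worker.join();
    return a + b;
}
```

The worker is started only once and serves both requests, which mirrors the idea of a single kernel launch serving several function calls. The sketch uses one mailbox and serves requests one after another. Sharing among many clients at the same time would need one slot per client or a request queue, and on a real GPU the mailbox would have to live in memory visible to both host and device.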