High performance intra-node communication support for MPI applications is critical for achieving best performance from clusters of SMP workstations. Present day MPI stacks cannot make use of operating system kernel support for intra-node communication. This is primarily due to the lack of an efficient, portable, stable and MPI friendly interface to access the kernel functions. In this paper we attempt to address design challenges for implementing such a high performance and portable kernel module interface. We implement a kernel module interface called LiMIC and integrate it with MVAPICH, an open source MPI over InfiniBand. Our performance evaluation reveals that the pointto-point latency can be reduced by 71% and the bandwidth improved by 405% for 64KB message size. In addition, LiMIC can improve HPCC Effective Bandwidth and NAS IS class B benchmarks by 12% and 8%, respectively, on an 8-node dual SMP InfiniBand cluster.