RDMA reduces network latency by eliminating unnecessary copies from network interface cards to application buffers, but how to reduce memory registration cost is a challenge. Previous studies use pin-down cache and batched deregistration to address this issue. In this paper, we propose a new design of communication process: Fast RDMA Read and Write Process (FRRWP), to reduce the overhead of the memory registration and message synchronization in the critical data path of RDMA operations. FRRWP overlaps memory registrations between a client and a server, and allows applications to submit RDMA write operations without being blocked by message synchronization. We use a mathematic model to calculate the overall latency of FRRWP. Compared to traditional RDMA operations, our results show FRRWP reduces the total communication latency dramatically in the critical data path.