In a distributed file system built on network attached storage, client computers access data directly from shared storage, rather than submitting I/O requests through a server. Without a server marshaling access to data, if a computer fails or becomes isolated in a network partition while holding locks on cached data objects, those objects become inaccessible to other computers until a locking authority can guarantee that the lock holder will not again directly access these data. We describe a server that acts as the locking authority and implements a lease-based protocol for revoking access to data objects locked by an isolated or failed computer. When a lease expires, the server can be assured that the client no longer acts on locked data, and can safely redistribute locks to other clients. During normal operation, this protocol invokes no message overhead, and uses no memory and performs no computation at the locking authority.
Randal C. Burns, Robert M. Rees, Darrell D. E. Lon