Transient error tolerance techniques based on dual execution and checkpointing have been widely used in high-end mission-critical systems. These techniques, however, are not very attractive for cost-sensitive embedded systems because they require extra resources (e.g., large memory or special hardware) and thus increase the overall cost of the system. In this paper, we propose a transient-error-tolerant Java Virtual Machine (JVM) implementation for embedded systems. Our JVM uses dual execution and checkpointing to detect and recover from transient errors. However, our technique does not require any special hardware support (except for the memory page protection mechanism, which is commonly available in modern embedded processors), and the memory space overhead it incurs is not excessive. Therefore, it is suitable for memory-constrained embedded systems. We implemented our approach and performed experiments with seven embedded Java applications.
Categories and Subject Descriptors D.3.m [S...
Guangyu Chen, Mahmut T. Kandemir
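The general idea of combining dual execution with checkpointing can be illustrated with a small application-level sketch. It is not the JVM-internal mechanism proposed in the paper: the class `DualExecutionSketch`, the method `runProtected`, and the `maxRetries` parameter are hypothetical names introduced only for illustration, and the checkpoint here is a plain array copy rather than anything based on memory page protection. The sketch runs the same step twice from a saved state, commits the result only if both runs agree, and otherwise rolls back to the checkpoint and retries.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical illustration of dual execution plus checkpointing.
// This is an assumption-laden sketch, not the paper's JVM implementation.
class DualExecutionSketch {

    /**
     * Executes `step` twice from the same checkpointed state.
     * If the two results match, the result is committed and returned.
     * If they differ, a transient error is assumed, both results are
     * discarded, and execution restarts from the checkpoint.
     */
    static int[] runProtected(int[] state, UnaryOperator<int[]> step, int maxRetries) {
        // Checkpoint: a private copy of the state taken before execution.
        int[] checkpoint = state.clone();

        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            // Each execution works on its own copy, so the checkpoint stays intact.
            int[] first = step.apply(checkpoint.clone());
            int[] second = step.apply(checkpoint.clone());

            // Detection: compare the outcomes of the two executions.
            if (Arrays.equals(first, second)) {
                return first; // commit
            }
            // Recovery: fall through and re-execute from the checkpoint.
        }
        throw new IllegalStateException("results still disagree after retries");
    }

    public static void main(String[] args) {
        int[] calls = {0};

        // A step that doubles every element; the first invocation simulates
        // a transient bit flip in element 0.
        UnaryOperator<int[]> step = s -> {
            calls[0]++;
            for (int i = 0; i < s.length; i++) {
                s[i] *= 2;
            }
            if (calls[0] == 1) {
                s[0] ^= 4; // injected fault, for demonstration only
            }
            return s;
        };

        int[] result = runProtected(new int[] {1, 2, 3}, step, 3);
        System.out.println(Arrays.toString(result) + " after " + calls[0] + " executions");
    }
}
```

In this demonstration the first pair of executions disagrees because of the injected fault, so the sketch rolls back and the second pair produces the committed result. Copying the whole state for every checkpoint is the simplest choice for an example; it says nothing about how the paper keeps its memory overhead low.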