Machine code disassembly routines form a fundamental component of software systems that statically analyze or modify executable programs. The task of disassembly is complicated by indirect jumps and the presence of nonexecutable data—jump tables, alignment bytes, etc.—in the instruction stream. Existing disassembly algorithms are not always able to cope successfully with executable files containing such features and fail silently—i.e., produce incorrect disassemblies without any indication that the results they are producing are incorrect. This can be a serious problem, since it can compromise the correctness of a binary rewriting tool. In this paper we examine two commonlyused disassembly algorithms and illustrate their shortcomings. We propose a hybrid approach that performs better than these algorithms in the sense that it is able to detect situations where the disassembly may be incorrect and limit the extent of such disassembly errors. Experimental results indicate that th...
Benjamin Schwarz, Saumya K. Debray, Gregory R. And