AAAI
2008

Learning to Analyze Binary Computer Code

We present a novel application of structured classification: identifying function entry points (FEPs, the starting byte of each function) in program binaries. Such identification is the crucial first step in analyzing much malicious, commercial, and legacy software, which lacks the full symbol information that specifies FEPs. Existing pattern-matching FEP detection techniques are insufficient because compiler and link-time optimizations introduce variable instruction sequences. We formulate the FEP identification problem as structured classification using Conditional Random Fields (CRFs). Our CRFs incorporate both idiom features, which represent the sequence of instructions surrounding FEPs, and control flow structure features, which represent the interaction among FEPs. These features allow us to jointly label all FEPs in the binary. We perform feature selection and present an approximate inference method for massive program binaries. We evaluate our models on a large set of real-...
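
The two feature families named in the abstract can be illustrated with a toy scorer. This is only a sketch under stated assumptions and not the authors' model: it scores each byte offset independently with a linear function, whereas a CRF labels all offsets jointly. The idiom patterns, the weights, the bias, and the function names (`call_targets`, `score`, `predict_feps`) are all hypothetical, and the input is assumed to be raw 32-bit x86 code.

```python
import struct

# Hypothetical idiom features: byte prefixes that often begin x86 functions.
IDIOMS = {
    "push_ebp_mov_ebp_esp": bytes.fromhex("5589e5"),  # push ebp; mov ebp, esp
    "sub_esp_imm8": bytes.fromhex("83ec"),            # sub esp, imm8
}

# Hand-picked illustrative weights (assumptions; not learned, not from the paper).
WEIGHTS = {"push_ebp_mov_ebp_esp": 2.0, "sub_esp_imm8": 0.5, "call_target": 1.5}
BIAS = -1.0


def call_targets(code: bytes) -> set:
    """Offsets targeted by `call rel32` (opcode 0xE8), found by a naive byte scan.

    A real tool would disassemble; scanning raw bytes can report false calls.
    """
    targets = set()
    for i in range(len(code) - 4):
        if code[i] == 0xE8:
            (rel,) = struct.unpack_from("<i", code, i + 1)
            dest = i + 5 + rel  # target is relative to the next instruction
            if 0 <= dest < len(code):
                targets.add(dest)
    return targets


def score(code: bytes, offset: int, targets: set) -> float:
    """Linear score combining idiom features and one control flow feature."""
    s = BIAS
    for name, pattern in IDIOMS.items():
        if code.startswith(pattern, offset):
            s += WEIGHTS[name]
    if offset in targets:
        s += WEIGHTS["call_target"]
    return s


def predict_feps(code: bytes) -> list:
    """Return offsets whose independent score is positive."""
    targets = call_targets(code)
    return [o for o in range(len(code)) if score(code, o, targets) > 0]


# Two tiny functions: the first (offset 0) calls the second (offset 10).
code = bytes.fromhex("5589e5e802000000c9c35589e5c9c3")
print(predict_feps(code))  # [0, 10]
```

In this example the first function is found by its idiom alone, while the second is supported by both its idiom and the incoming call. That second signal is the kind of interaction among candidate entry points that joint labeling is meant to exploit.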
Nathan E. Rosenblum, Xiaojin Zhu, Barton P. Miller, Karen Hunt
Added 02 Oct 2010
Updated 02 Oct 2010
Type Conference
Year 2008
Where AAAI
Authors Nathan E. Rosenblum, Xiaojin Zhu, Barton P. Miller, Karen Hunt