Java applications rely on Just-In-Time (JIT) or adaptive compilers to generate and optimize binary code at runtime to boost performance. In a conventional Java Virtual Machine (JVM), however, the generated binary code is typically written into the data cache and then loaded into the instruction cache through the shared L2 cache or memory, which is inefficient in terms of both time and energy. In this paper, we study three hardware-based code caching strategies that write and read dynamically generated code faster and more energy-efficiently. Our experimental results indicate that writing code directly into the instruction cache can improve the performance of a variety of Java applications by 9.6% on average, and by up to 42.9%. In addition, the overall energy dissipation of these Java programs can be reduced by 6% on average.

Categories and Subject Descriptors
B.3.2 [HARDWARE]: Memory Structures--Design Styles--Cache Memories; D.3.4 [SOFTWARE]: Programming Languages--Code Generation

General Te...