Signature-based malicious code detection is the standard technique in all commercial anti-virus software. This method can detect a virus only after the virus has appeared and caused damage. Signature-based detection performs poorly when attempting to identify new viruses. Motivated by the standard signature-based technique for detecting viruses, and a recent successful text classification method, n-grams analysis, we explore the idea of automatically detecting new malicious code. We employ n-grams analysis to automatically generate signatures from malicious and benign software collections. The n-gramsbased signatures are capable of classifying unseen benign and malicious code. The datasets used are large compared to earlier applications of n-grams analysis.