Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering