Removing manually generated boilerplate from electronic texts: experiments with Project Gutenberg e-books