Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation