A prerequisite for all higher level information extraction tasks is the identication of unknown names in text. Today, when large corpora can consist of billions of words, it is of...
ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
A recent area of significant progress in speaker recognition is the use of high level features—idiolect, phonetic relations, prosody, discourse structure, etc. A speaker not on...
William M. Campbell, Joseph P. Campbell, Douglas A...
A number of content management tasks, including term categorization, term clustering, and automated thesaurus generation, view natural language terms (e.g. words, noun phrases) as...
Alberto Lavelli, Fabrizio Sebastiani, Roberto Zano...
Network intrusion detection systems typically detect worms by examining packet or flow logs for known signatures. Not only does this approach mean worms cannot be detected until ...