Information extraction is concerned with applying natural language processing to automatically extract the essential details from text documents. A great disadvantage of current ap...
For many years the Polish TeX Users Group newsletter has been published online on the GUST web site. The repository now contains valuable information on TeX, METAFONT, electronic d...
Unlike most information retrieval applications, which represent a document as a "bag of words", we propose a model that represents a web page as sets of named enti...
Nan Di, Conglei Yao, Mengcheng Duan, Jonathan J. H...
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
Abstract: Bioinformatic data sources available on the web are numerous and heterogeneous. The lack of documentation and the difficulty of interaction with these data banks require us...