Commercial, non-profit and public organizations are accumulating huge numbers of electronically available text documents. Although composed of unstructured texts, documents contai...
In this work we design algorithms for clustering relational columns into attributes, i.e., for identifying strong relationships between columns based on the common properties and ...
Biologists usually focus on only a small, individualized subdomain of the huge domain of biology. With respect to their subdomain, they often need data collected from v...
The Border Gateway Protocol (BGP) is the interdomain routing protocol used to exchange routing information between Autonomous Systems (ASes) on the Internet today. While intradoma...
Traditional retrieval models based on term matching are not effective on collections of degraded documents (for instance, the output of OCR or ASR systems). This paper presents a n...