In this paper, we introduce the fractal summarization model based on the fractal theory. In fractal summarization, the important information is captured from the source text by ex...
The phenomenal growth that the WWW currently experiences necessitates the integration of various types of information sources to its platform. We present an open, extensible multi...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
We introduce CiteSeer-API, a public API to CiteSeer-like services. CiteSeer-API is SOAP/WSDL based and allows for easy programatical access to all the specific functionalities off...
Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Prade...
Abstract: Query languages for XML such as XPath or XQuery support Boolean retrieval where a query result is a (possibly restructured) subset of XML elements or entire documents tha...