Text data in the Internet can be partitioned into many databases naturally. Efficient retrieval of desired data can be achieved if we can accurately predict the usefulness of each...
Weiyi Meng, King-Lup Liu, Clement T. Yu, Xiaodong ...
While we expect to discover knowledge in the texts available on the Web, such discovery usually requires many complex analysis steps, most of which require different text handling...
Current peer-to-peer (p2p) full-text keyword search techniques fall into the following categories: document-based partitioning, keyword-based partitioning, hybrid indexing, and se...
Determining the similarity of short text snippets, such as search queries, works poorly with traditional document similarity measures (e.g., cosine), since there are often few, if...
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over multip...