Indexing and retrieval of speech content in various forms such as broadcast news, customer care data and on-line media has gained a lot of interest for a wide range of application...
Dogan Can, Erica Cooper, Arnab Ghoshal, Martin Jan...
Today’s web is so huge and diverse that it arguably reflects the real world. For this reason, searching the web is a promising approach to find things in the real world. This ...
Abstract. Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
A web search with double checking model is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...