The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
With more than 250 million active users, Facebook (FB) is currently one of the most important online social networks. Our goal in this paper is to obtain a representative (unbiased...
Minas Gjoka, Maciej Kurant, Carter T. Butts, Athin...
The Web contains an abundance of useful semistructured information about real world objects, and our empirical study shows that strong sequence characteristics exist for Web infor...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
This paper describes the building of a research library for studying the Web, especially research on how the structure and content of the Web change over time. The library is part...
William Y. Arms, Selcuk Aya, Pavel Dmitriev, Blaze...
In this paper, we propose an iterative similarity propagation approach to explore the inter-relationships between Web images and their textual annotations for image retrieval. By ...