Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
An experimental GUI paradigm is presented which is based on the design goals of maximizing the amount of screen used for application data, reducing the amount that the UI diverts ...
Gordon Kurtenbach, George W. Fitzmaurice, Thomas B...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
In this paper we present a method of computing the posterior probability of conditional independence of two or more continuous variables from data, examined at several resolutions...