HTTP cookies have been widely used for maintaining session states, personalizing, authenticating, and tracking user behaviors. Despite their importance and usefulness, cookies hav...
Purpose – Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, so...
Ashish Kathuria, Bernard J. Jansen, Carolyn Hafern...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...