The Web is now a huge information repository with a rich semantic structure that, however, is primarily addressed to human understanding rather than automated processing by a compu...
We describe a component of a document analysis system for constructing ontologies for domain-specific web tables imported into Excel. This component automates extraction of the Wa...
Sharad C. Seth, Ramana Chakradhar Jandhyala, Mukka...
On script-generated web sites, many documents share a common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Recent advances in graph-based search techniques derived from Kleinberg’s work [1] have been impressive. This paper further improves the graph-based search algorithm ...
Syllabi are important documents created by instructors for students. Students use syllabi to find information and to prepare for class. Instructors often need to find similar syl...
Xiaoyan Yu, Manas Tungare, Weiguo Fan, Manuel A. P...