In this paper, we describe the lessons we learned in developing AgentBuilder, a commercial system for rapidly creating agents that extract information from web sites. AgentBuilder...
We consider the problem of content extraction from online news webpages. To explore to what extent the syntactic markup and the visual structure of a webpage facilitate the extrac...
Social networks play important roles in the Semantic Web: knowledge management, information retrieval, ubiquitous computing, and so on. We propose a social network extraction syst...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
The Web consists of a large amount of unstructured information that hardly can be elaborated by automatic agents. In recent years, a considerable number of techniques for informat...
Leonardo Rigutini, Ernesto Di Iorio, Marco Ernande...