The World Wide Web, the biggest distributed system ever built, experiences tremendous growth and change in Web sites, users, and technology. A realistic and accurate characterization ...
Katerina Goseva-Popstojanova, Fengbin Li, Xuan Wan...
We describe a system for extracting mentions of terms, such as company and product names, in a large and noisy corpus of documents, such as the World Wide Web. Since natural langua...
Einat Amitay, Rani Nelken, Wayne Niblack, Ron Siva...
This paper presents a near real-time multilingual news monitoring and analysis system that forms the backbone of our research work. The system integrates technologies to address t...
A focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...
Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
Abstract. A growing amount of information is currently being generated and stored in the World Wide Web (WWW). In particular, researchers in any field can find a lot of publicati...