The World Wide Web can be viewed as a gigantic distributed database including millions of interconnected hosts some of which publish information via web servers or peer-to-peer sys...
This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...
Today’s web is so huge and diverse that it arguably reflects the real world. For this reason, searching the web is a promising approach to find things in the real world. This ...
We propose two methods for constructing automated programs for extraction of information from a class of web pages that are very common and of high practical significance - varia...
Clipping Web pages, namely extracting the informative clips (areas) from Web pages, has many applications, such as Web printing and e-reading on small handheld devices. Although m...
Lei Zhang, Linpeng Tang, Ping Luo, Enhong Chen, Li...