Understanding the extent to which people'ssearch behaviors differ in terms of the interaction flow and information targeted is important in designing interfaces to help World...
We address the problem of answering broad-topic queries on the World Wide Web. We present a link based analysis algorithm SelHITS, which is an improvement over Kleinberg's HI...
Fully automated trading, such as e-procurement, using the Internet is virtually unheard of today. Three core technologies are needed to fully automate the trading process: data mi...
The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...