The problem of low-quality information on the Web is nowhere more important than in the domain of health, where unsound information and misleading advice can have serious consequen...
Thanh Tin Tang, David Hawking, Ramesh S. Sankarana...
Most template detection methods process web pages in batches that a newly crawled page can not be processed until enough pages have been collected. This results in large storage c...
Yu Wang, Binxing Fang, Xueqi Cheng, Li Guo, Hongbo...
In this study, we crawled a local Web domain, created its graph representation, and analyzed the network structure. The results of network analysis revealed local scalefree patter...
It is indispensable that the users surfing on the Internet could have web pages classified into a given topic as correct as possible. Toward this ends, this paper presents a topic-...
Sanguk Noh, Youngsoo Choi, Haesung Seo, Kyunghee C...
This paper presents structural properties of the Thai Web graph. We conduct an empirical study on the Web graphs induced from two Thai web snapshots crawled during January 2007 (5...