Sciweavers

57 search results - page 9 / 12
» Web spam challenge proposal for filtering in archives
SIGIR
2008
ACM
13 years 7 months ago
Exploring traversal strategy for web forum crawling
In this paper, we study the problem of Web forum crawling. Web forums have now become an important data source for many Web applications, while forum crawling is still a challenging ...
Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...
BTW
2003
Springer
14 years 18 days ago
Towards Federated Search Based on Web Services
Abstract: Some emerging trends in the recent development of the WWW can be observed. These trends are technical, like Web Services, as well as semantic, like the integration of ont...
Jens Graupmann, Michael Biwer, Patrick Zimmer
WWW
2008
ACM
14 years 8 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2003
ACM
14 years 8 months ago
Mining topic-specific concepts and definitions on the web
Traditionally, when one wants to learn about a particular topic, one reads a book or a survey paper. With the rapid expansion of the Web, learning in-depth knowledge about a topic...
Bing Liu, Chee Wee Chin, Hwee Tou Ng
EDBT
2009
ACM
13 years 11 months ago
Efficient maintenance techniques for views over active documents
Many Web applications are based on dynamic interactions between Web components exchanging flows of information. Such a situation arises for instance in mashup systems or when moni...
Serge Abiteboul, Pierre Bourhis, Bogdan Marinoiu