Sciweavers

» An Automatic Data Grabber for Large Web Sites
CICLING
2004
Springer
14 years 1 month ago
Automatic Learning Features Using Bootstrapping for Text Categorization
When text categorization is applied to complex tasks, it is tedious and expensive to hand-label the large amounts of training data necessary for good performance. In this paper, we...
Wenliang Chen, Jingbo Zhu, Honglin Wu, Tianshun Ya...
ICDE
2007
IEEE
14 years 9 months ago
Challenges on Distributed Web Retrieval
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
WWW
2002
ACM
14 years 8 months ago
A web-based resource migration protocol using WebDAV
The web's hyperlinks are notoriously brittle, and break whenever a resource migrates. One solution to this problem is a transparent resource migration mechanism, which separa...
Michael P. Evans, Steven Furnell
BTW
2005
Springer
14 years 1 month ago
Self-Extending Peer Data Management
Abstract: Peer data management systems (PDMS) are the natural extension of integrated information systems. Conventionally, a single integrating system manages an integrated schema,...
Ralf Heese, Sven Herschel, Felix Naumann, Armin Ro...
WCRE
1999
IEEE
13 years 12 months ago
Chava: Reverse Engineering and Tracking of Java Applets
Java applets have been used increasingly on web sites to perform client-side processing and provide dynamic content. While many web site analysis tools are available, their focus ...
Jeffrey L. Korn, Yih-Farn Chen, Eleftherios Koutso...