Sciweavers

WWW
2008
ACM
14 years 10 months ago
A larger scale study of robots.txt
A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots.txt file. The rules in the protocol enable the site...
Santanu Kolay
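
The access-control idea in the abstract above can be illustrated with a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the crawler names (`examplebot`, `otherbot`), and the `example.com` URLs are all made up for illustration and do not come from the paper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: examplebot
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler with no dedicated entry falls back to the "*" rules.
allowed_public = parser.can_fetch("otherbot", "https://example.com/index.html")
allowed_private = parser.can_fetch("otherbot", "https://example.com/private/data.html")

# A crawler with its own entry is governed by that entry instead.
allowed_examplebot = parser.can_fetch("examplebot", "https://example.com/index.html")

print(allowed_public, allowed_private, allowed_examplebot)
```

Here the site lets unnamed crawlers fetch everything except `/private/`, while the hypothetical `examplebot` is excluded from the whole site.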
BIOCOMP
2006
13 years 11 months ago
Mapping Biological XML DTDs Using Ontology
Several biological databases exist which use different formats for storing data. Further, each database has its own schema and a query interface. There exist no standard conversio...
Rana Hashmy
NAACL
2004
13 years 11 months ago
Acquiring Hyponymy Relations from Web Documents
This paper describes an automatic method for acquiring hyponymy relations from HTML documents on the WWW. Hyponymy relations can play a crucial role in various natural language pr...
Keiji Shinzato, Kentaro Torisawa
ACL
1998
13 years 11 months ago
Proper Name Translation in Cross-Language Information Retrieval
Recently, the language barrier has become the major problem for people to search, retrieve, and understand WWW documents in different languages. This paper deals with query translation i...
Hsin-Hsi Chen, Sheng-Jie Huang, Yung-Wei Ding, Shi...
RIAO
1997
13 years 11 months ago
Coupling information retrieval and information extraction: A new text technology for gathering information from the web
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how...
Robert J. Gaizauskas, Alexander M. Robertson