Sciweavers

308 search results - page 54 / 62
» Syntactic Similarity of Web Documents
BMCBI
2006
13 years 8 months ago
BIOZON: a system for unification, management and analysis of heterogeneous biological data
Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increases rapidly. Amongst the problems that o...
Aaron Birkland, Golan Yona
ICADL
2007
Springer
14 years 2 months ago
Using Automatic Metadata Extraction to Build a Structured Syllabus Repository
Syllabi are important documents created by instructors for students. Students use syllabi to find information and to prepare for class. Instructors often need to find similar syl...
Xiaoyan Yu, Manas Tungare, Weiguo Fan, Manuel A. P...
SEMWEB
2007
Springer
14 years 2 months ago
Media Watch on Climate Change: Building and Visualizing Contextualized Information Spaces
This paper presents the ‘Media Watch on Climate Change’, an interactive Web portal that combines a portfolio of semantic services with a visual interface based on tig...
Arno Scharl, Albert Weichselbraun, Alexander Hubma...
WWW
2009
ACM
14 years 8 months ago
A class-feature-centroid classifier for text categorization
Automated text categorization is an important technique for many web applications, such as document indexing, document filtering, and cataloging web resources. Many different appr...
Hu Guan, Jingyu Zhou, Minyi Guo
WSDM
2010
ACM
14 years 2 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...