Sciweavers

188 search results - page 4 / 38
» The hybrid representation model for web document classificat...
CIKM
2010
Springer
13 years 5 months ago
Using Wikipedia categories for compact representations of chemical documents
Today, Web pages are usually accessed using text search engines, whereas documents stored in the deep Web are accessed through domain-specific Web portals. These portals rely on e...
Benjamin Köhncke, Wolf-Tilo Balke
SIGIR
2004
ACM
14 years 16 days ago
Locality preserving indexing for document representation
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
Xiaofei He, Deng Cai, Haifeng Liu, Wei-Ying Ma
WWW
2001
ACM
14 years 7 months ago
Algorithms and programming models for efficient representation of XML for Internet applications
XML is poised to take the World-Wide-Web to the next level of innovation. XML data, large or small, with or without associated schema, will be exchanged between increasing number ...
Neel Sundaresan, Reshad Moussa
ECIR
2008
Springer
13 years 8 months ago
Semi-supervised Document Classification with a Mislabeling Error Model
This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
Anastasia Krithara, Massih-Reza Amini, Jean-Michel...
DOCENG
2007
ACM
13 years 11 months ago
Elimination of junk document surrogate candidates through pattern recognition
A surrogate is an object that stands for a document and enables navigation to that document. Hypermedia is often represented with textual surrogates, even though studies have show...
Eunyee Koh, Daniel Caruso, Andruid Kerne, Ricardo ...