Search Sciweavers | Sciweavers

328 search results - page 42 / 66

» A Multi-level Approach for Document Clustering

CIDU
2010

202 views · Machine Learning

Multi-label ASRS Dataset Classification Using Semi Supervised Subspace Clustering

15 years 5 months ago

Download www.utdallas.edu

There has been a lot of research targeting text classification. Many of them focus on a particular characteristic of text data - multi-labelity. This arises due to the fact that a ...

Mohammad Salim Ahmed, Latifur Khan, Nikunj C. Oza,...

VLDB
2002
ACM

120 views · Database

Efficient schemes for managing multiversion XML documents

16 years 7 months ago

Download www.cs.ucla.edu

Multiversion support for XML documents is needed in many critical applications, such as software configuration control, cooperative authoring, web information warehouses, and "...

Shu-Yao Chien, Vassilis J. Tsotras, Carlo Zaniolo

ICDAR
2009
IEEE

178 views · Document Analysis

Text Lines and Snippets Extraction for 19th Century Handwriting Documents Layout Analysis

16 years 2 months ago

Download liris.cnrs.fr

In this paper we propose a new approach to improve electronic editions of human science corpus, providing an efficient estimation of manuscripts pages structure. In any handwriti...

Vincent Malleron, Véronique Eglin, Hubert E...

EMNLP
2004

114 views · Natural Language Processing

Trained Named Entity Recognition using Distributional Clusters

15 years 9 months ago

Download www.cs.cmu.edu

This work applies boosted wrapper induction (BWI), a machine learning algorithm for information extraction from semi-structured documents, to the problem of named entity recogniti...

Dayne Freitag

WWW
2010
ACM

257 views · Internet Technology

CETR: content extraction via tag ratios

16 years 2 months ago

Download www.cs.illinois.edu

We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...

Tim Weninger, William H. Hsu, Jiawei Han
