Search Sciweavers | Sciweavers

38 search results - page 2 / 8

» Mining Tables from Large Scale HTML Texts

WSDM
2012
ACM

252views Data Mining» more WSDM 2012»

WebSets: extracting sets of entities from the web using unsupervised information extraction

14 years 2 months ago

Download www.cs.cmu.edu

We describe an open-domain information extraction method for extracting concept-instance pairs from an HTML corpus. Most earlier approaches to this problem rely on combining cluste...

Bhavana Bharat Dalvi, William W. Cohen, Jamie Call...

WWW
2007
ACM

144views Internet Technology» more WWW 2007»

Towards domain-independent information extraction from web tables

16 years 7 months ago

Download www2007.org

Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...

Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...

COLING
2010

108views Computational Linguistics» more COLING 2010»

Large Scale Parallel Document Mining for Machine Translation

15 years 1 months ago

Download static.googleusercontent.com

A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...

Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...

KDD
2008
ACM

128views Data Mining» more KDD 2008»

Scaling up text classification for large file systems

16 years 7 months ago

Download www.hpl.hp.com

We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifie...

George Forman, Shyamsundar Rajaram

OSDI
2008
ACM

227views Operating System» more OSDI 2008»

Mining Console Logs for Large-Scale System Problem Detection

16 years 7 months ago

Download www2.berkeley.intel-research.net

The console logs generated by an application contain messages that the application developers believed would be useful in debugging or monitoring the application. Despite the ubiq...

Wei Xu, Ling Huang, Armando Fox, David A. Patterso...
