Search Sciweavers | Sciweavers

910 search results - page 67 / 182

» Testbed for information extraction from deep web


CIKM
2005
Springer

Information Technology

Retrieving answers from frequently asked questions pages on the web


We address the task of answering natural language questions by using the large number of Frequently Asked Questions (FAQ) pages available on the web. The task involves three steps...

Valentin Jijkoun, Maarten de Rijke



ITCC
2002
IEEE

Information Technology

Web-Based Information Access: Multilingual Automatic Authoring


The need to manage similar documents in different languages increases with the growing amount of electronic information available in documents of the same type (e.g. news str...

Roberto Basili, Maria Teresa Pazienza, Fabio Massi...



SIGMOD
2009
ACM

Database

Robust web extraction: an approach based on a probabilistic tree-edit model


On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...

Nilesh N. Dalvi, Philip Bohannon, Fei Sha



CIKM
2008
Springer

Information Technology

Characterizing and predicting community members from evolutionary and heterogeneous networks


Mining different types of communities from web data has attracted a lot of research effort in recent years. However, none of the existing community mining techniques has taken i...

Qiankun Zhao, Sourav S. Bhowmick, Xin Zheng, Kai Y...



WWW
2010
ACM

Internet Technology

CETR: content extraction via tag ratios


We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...

Tim Weninger, William H. Hsu, Jiawei Han

