Search Sciweavers | Sciweavers

24 search results - page 2 / 5

» Automatic Extraction of Textual Elements from News Web Pages

WWW
2004
ACM

Automatic web news extraction using tree edit distance

16 years 8 months ago

Download www.iw3c2.org

The Web poses itself as the largest data repository ever available in the history of humankind. Major efforts have been made in order to provide efficient access to relevant infor...

Davi de Castro Reis, Paulo Braz Golgher, Altigran ...

LREC
2008

Automatic Identification of Temporal Information in Tourism Web Pages

15 years 8 months ago

Download www.lrec-conf.org

This paper presents our work on the detection of temporal information in web pages. The pages examined within the scope of this study were taken from the tourism sector and the te...

Stéphanie Weiser, Philippe Laublet, Jean-Lu...

HICSS
2008
IEEE

Using Visual Features for Fine-Grained Genre Classification of Web Pages

16 years 1 months ago

Download csdl2.computer.org

The field of automatic genre classification has primarily focused on extracting textual features from documents. The goal of this research is to investigate whether visual feature...

Ryan Levering, Michal Cutler, Lei Yu

KDD
1999
ACM

Text Mining: Finding Nuggets in Mountains of Textual Data

15 years 11 months ago

Download maya.cs.depaul.edu

Text mining applies the same analytical functions of data mining to the domain of textual information, relying on sophisticated text analysis techniques that distill information from f...

Jochen Dörre, Peter Gerstl, Roland Seiffert

LREC
2010

BlogBuster: A Tool for Extracting Corpora from the Blogosphere

15 years 8 months ago

Download www.lrec-conf.org

This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...

Georgios Petasis, Dimitrios Petasis
