LREC
2008

Automatic Identification of Temporal Information in Tourism Web Pages

This paper presents our work on detecting temporal information in web pages. The pages examined in this study come from the tourism sector, so the temporal information in question is specific to that domain. We highlight the differences between extraction from plain text and extraction from the web; these mainly concern the spatial arrangement of the text, the use of punctuation, and adherence to traditional syntactic rules. The temporal expressions to be extracted fall into two kinds: temporal information concerning one particular event, and repetitive temporal information. We adopt a symbolic approach that relies on patterns and rules for the detection, extraction and annotation of temporal expressions; our method is based on transducers. First evaluations have shown promising results. Since the visual structure of a web page is very important and often informs the user before he has even r...
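To make the pattern-and-rule idea concrete, here is a minimal sketch in Python. It is not the authors' system: the abstract does not show their transducers, lexicon, or annotation scheme, so the vocabulary, the `PATTERN` rule set, the `<time kind="...">` tag, and the function names `extract` and `annotate` are all assumptions made for illustration. Regular expressions stand in for the transducers, and English examples are used for readability.

```python
import re

# Hypothetical vocabulary; the paper's actual lexicon is not given in the abstract.
DAY = r"(?:Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)"
MONTH = (r"(?:January|February|March|April|May|June|July|"
         r"August|September|October|November|December)")

# Two illustrative rule classes, mirroring the abstract's two kinds of
# temporal information: repetitive ("every Monday", "Mondays") and
# single-event ("14 July 2008"). Each class is one named group.
PATTERN = re.compile(
    rf"(?P<repetitive>\b(?:every|each)\s+{DAY}\b|\b{DAY}s\b)"
    rf"|(?P<single>\b\d{{1,2}}\s+{MONTH}(?:\s+\d{{4}})?\b)",
    re.IGNORECASE,
)


def extract(text):
    """Return (kind, matched_text) pairs in order of appearance."""
    return [(m.lastgroup, m.group(0)) for m in PATTERN.finditer(text)]


def annotate(text):
    """Wrap each detected expression in an assumed <time kind="..."> tag."""
    def tag(match):
        return f'<time kind="{match.lastgroup}">{match.group(0)}</time>'
    return PATTERN.sub(tag, text)
```

For example, `extract("Open every Monday. Concert on 14 July 2008.")` yields one repetitive expression ("every Monday") and one single-event expression ("14 July 2008"). A real system for web pages would also need to handle the layout, punctuation, and syntax differences the abstract mentions, which this sketch ignores.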
Added: 29 Oct 2010
Updated: 29 Oct 2010
Type: Conference
Year: 2008
Where: LREC
Authors: Stéphanie Weiser, Philippe Laublet, Jean-Luc Minel