Sciweavers

SEMCO
2009
IEEE

LAIR: A Language for Automated Semantics-Aware Text Sanitization Based on Frame Semantics

We present LAIR, a domain-specific language that enables users to specify actions to be taken upon meeting specific semantic frames in a text, in particular to rephrase and redact the textual content. While LAIR presupposes superficial knowledge of frames and frame semantics, it requires only limited prior programming experience. It contains neither scripting nor I/O primitives, has no general loop constructions, and is not Turing-complete. We have implemented a LAIR compiler and integrated it into a pipeline for automated redaction of web pages. We detail our experience with automated redaction of web pages for subjectively undesirable content; initial experiments suggest that using a small language based on semantic recognition of undesirable terms can be highly useful as a supplement to traditional methods of text sanitization.
Keywords: Redaction; sanitization; frame semantics; domain-specific languages
I. SANITIZATION: REDACTION AND EXPURGATION OF NATURAL LANGUAGE
San...
Steffen Hedegaard, Søren Houen, Jakob Grue Simonsen
Added 21 May 2010
Updated 21 May 2010
Type Conference
Year 2009
Where SEMCO
Authors Steffen Hedegaard, Søren Houen, Jakob Grue Simonsen