We describe the process of converting plain text cultural heritage data to elements of a domain-specific knowledge base, using general machine learning techniques. First, digitise...
This paper proposes an extension of Sumida and Torisawa's method of acquiring hyponymy relations from hierachical layouts in Wikipedia (Sumida and Torisawa, 2008). We extract...
We report on an effort to build a corpus of Modern Hebrew tagged with parts of speech and morphology. We designed a tagset specific to Hebrew while focusing on four aspects: the t...
Meni Adler, Yael Dahan Netzer, Yoav Goldberg, Davi...
We describe the creation of a corpus that supports a real-world hierarchical text categorization task in the domain of electronic rulemaking (eRulemaking). Features of the task an...
Claire Cardie, Cynthia Farina, Matt Rawding, Adil ...
We present recent work in the area of Cross-Domain Dialogue Act (DA) tagging. We have previously reported on the use of a simple dialogue act classifier based on purely intra-utte...