We propose a novel HMM-based framework to accurately transliterate unseen named entities. The framework leverages features in letter alignment and letter n-gram pairs learned from ...
Bing Zhao, Nguyen Bach, Ian R. Lane, Stephan Vogel
PageRank computes the importance of each node in a directed graph under a random surfer model governed by a teleportation parameter. Commonly denoted alpha, this parameter models ...
David F. Gleich, Paul G. Constantine, Abraham D. F...
This paper discusses a methodology for applying general-purpose first-order inductive learning to extract information from Web documents structured as unranked ordered trees. The...
We study estimation of mixture models for problems in which multiple views of the instances are available. Examples of this setting include clustering web pages or research papers ...
We propose a web annotation system that adds the functionality of stickies to web pages and creates bidirectional links between the stickies. The stickies allow for impo...