As an important step towards bootstrapping the Semantic Web, many efforts have been made to extract triples from Wikipedia because of its wide coverage, good organization and rich ...
Qiaoling Liu, Kaifeng Xu, Lei Zhang, Haofen Wang, ...
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources o...
Two dimensional plots (2-D) in digital documents on the web are an important source of information that is largely under-utilized. In this paper, we outline how data and text can ...
Saurabh Kataria, William Browuer, Prasenjit Mitra,...
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...