How can we cull the facts we need from the overwhelming mass of information and misinformation that is the Web? The TextRunner extraction engine represents one approach, in which ...
The growing availability of on-line textual sources and the potential number of applications of knowledge acquisition from textual data has lead to an increase in Information Extr...
Several problems in text categorization are too hard to be solved by standard bag-of-words representations. Work in kernel-based learning has approached this problem by (i) consid...
If Kolmogorov complexity [25] measures information in one object and Information Distance [4, 23, 24, 42] measures information shared by two objects, how do we measure information...
Chinese is a language that does not have morphological tense markers that provide explicit grammaticalization of the temporal location of situations (events or states). However, i...