Populating the Cyc Knowledge Base (KB) has been a manual process until very recently. However, there is currently enough knowledge in Cyc for it to be feasible to attempt to acqui...
Purvesh Shah, David Schneider, Cynthia Matuszek, R...
Thepaper deals with investigations concerning potential structures of documentsthat will be subject to automated information extraction. The focus is on folding principles and the...
We describe a open-domain information extraction method for extracting concept-instance pairs from an HTML corpus. Most earlier approaches to this problem rely on combining cluste...
Bhavana Bharat Dalvi, William W. Cohen, Jamie Call...
Abstract. Data intensive information is often published on the internet in the format of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen