In recent years, there has been considerable research on information extraction and constructing RDF knowledge bases. In general, the goal is to extract all relevant information f...
Through the recent NTCIR workshops, patent retrieval casts many challenging issues to information retrieval community. Unlike newspaper articles, patent documents are very long an...
In-Su Kang, Seung-Hoon Na, Jungi Kim, Jong-Hyeok L...
Question Answering has been the recent focus of information retrieval research; many systems just incorporate a search engine as a black box and most effort has been devoted to the...
In this paper, we propose a novel document clustering method based on the non-negative factorization of the termdocument matrix of the given document corpus. In the latent semanti...
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...