As the web expands exponentially, the need to bring order to its content becomes apparent. Hypertext categorization, that is, the automatic classification of web documents into ...
In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper sh...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
Information overload is a growing threat to the productivity of today’s knowledge workers, who need to keep track of multiple streams of information from various sources. RSS fe...
Lichan Hong, Gregorio Convertino, Bongwon Suh, Ed ...
In this paper we describe an improved version of ANERsys, an Arabic Named Entity Recognition system for open-domain texts. The first version of ANERsys was totally based on the Ma...
Images on the Web present a major accessibility issue for the visually impaired, mainly because the majority of them do not have proper captions. This paper addresses the problem ...
Luis von Ahn, Shiry Ginosar, Mihir Kedia, Ruoran L...