We present initial results from an international, multidisciplinary research collaboration that aims to construct a reference corpus of web genres. The primary appli...
Georg Rehm, Marina Santini, Alexander Mehler, Pave...
Automatic image annotation has become an attractive research subject. Most current image annotation methods are based on training techniques. The major weaknesses ...
In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web for training Named Entity Recognition systems. We use...
In the field of technology monitoring, it is essential to be able to obtain relevant information from heterogeneous sources, especially on the World Wide Web. The com...
The research reported in this paper is the first phase of a larger project on the automatic classification of Web pages by genre. The long-term goal is the incorporation of...