Text mining concerns the discovery of knowledge from unstructured textual data. One important task is the discovery of rules that relate specific words and phrases. Although exist...
This paper1 presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
The larger amount of information on the Web is stored in document databases and is not indexed by general-purpose search engines (i.e., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
This paper is concerned with the problem of structured data extraction from Web pages. The objective of the research is to automatically segment data records in a page, extract da...
We present the problem of categorizing web services according to a shallow ontology for presentation on a specialist portal, using their WSDL and associated textual documents foun...