Data is routinely created, disseminated, and processed in distributed systems that span multiple administrative domains. To maintain accountability while the data is transformed b...
Semistructured data is characterized by the lack of any fixed and rigid schema, although typically the data has some implicit structure. While the lack of fixed schema makes extracting ...
Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simpl...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua...
This paper presents the first scalable context-sensitive, inclusion-based pointer alias analysis for Java programs. Our approach to context sensitivity is to create a clone of a m...
Document similarity search (i.e., query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...