Overload management has been an important problem for large-scale dynamic systems. In this paper, we study this problem in the context of our Borealis distributed stream processin...
Schema matching has been historically difficult to automate. Most previous studies have tried to find matches by exploiting information on schema and data instances. However, sche...
As the Web is increasingly recognized as a culturally valuable social artifact, many nations endeavor to create national Web archives for long-term preservation. However, ...
Semantic validation of the effectiveness of a schema matching system is traditionally performed by comparing system-generated mappings with those of human evaluators. The human ef...
Marko Smiljanic, Maurice van Keulen, Willem Jonker
Data-driven scientific applications utilize workflow frameworks to execute complex dataflows, resulting in derived data products of unknown quality. We discuss our ongoing resear...
Scalable Distributed Data Structures (SDDS) store data in a file of key-based records distributed over many storage sites. The number of storage sites utilized grows and shrinks w...
We consider the problem of query optimization in distributed stream-based systems where multiple continuous queries may be executing simultaneously. In such systems, distribution ...
Sangeetha Seshadri, Vibhore Kumar, Brian F. Cooper
The CVS (Concurrent Versions System) software is a popular method for recording modifications to data objects, in addition to concurrent access to data in a multi-user environmen...
Most traditional Information Retrieval (IR) systems, including web search engines, operationalize “relevant” as the frequency with which a set of keywords occurs in a document. Because ...
Hyun Woong Shin, Eduard H. Hovy, Dennis McLeod, La...