Exact substring matching queries on large data collections can be answered using q-gram indices, that store for each occurring q-byte pattern an (ordered) posting list with the po...
Effective security management depends upon good risk management, which is itself based upon a reliable risk assessment, involving data collection of all the facets influencing sys...
Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
Data warehouses contain large amounts of information, often collected from a variety of independent sources. Decision-support functions in a warehouse, such as on-line analytical ...
We present a novel approach for classifying documents that combines different pieces of evidence (e.g., textual features of documents, links, and citations) transparently, through...
Adriano Veloso, Wagner Meira Jr., Marco Cristo, Ma...