Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
Automatic identification of software faults has enormous practical significance. This requires characterizing program execution behavior. Equally important is the aspect of diagno...
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Large search engines process thousands of queries per second on billions of pages, making query processing a major factor in their operating costs. This has led to a lot of resear...
Database queries are often exploratory and users often find their queries return too many answers, many of them irrelevant. Existing work either categorizes or ranks the results t...