Sciweavers

48 search results - page 7 / 10
» Collection statistics for fast duplicate document detection
KDD
2007
ACM
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Benyah Shaparenko, Thorsten Joachims
DEXAW
2006
IEEE
Finding Syntactic Similarities Between XML Documents
Detecting structural similarities between XML documents has been the subject of several recent works, and the proposed algorithms mostly use tree edit distance between the correspo...
Davood Rafiei, Daniel L. Moise, Dabo Sun
WETICE
2007
IEEE
Collaborative Intrusion Prevention
Intrusion Prevention Systems (IPSs) have long been proposed as a defense against attacks that propagate too fast for any manual response to be useful. In an important class of IPS...
Simon P. Chung, Aloysius K. Mok
TPDS
2008
Detecting VoIP Floods Using the Hellinger Distance
Voice over IP (VoIP), also known as Internet telephony, is gaining market share rapidly and now competes favorably as one of the visible applications of the Internet. Nevertheless,...
Hemant Sengar, Haining Wang, Duminda Wijesekera, S...
IDEAS
2008
IEEE
EXsum: an XML summarization framework
We propose a new framework for the summarization of XML document properties called EXsum (Element-wise XML summarization), which can capture statistical information of all import...
José de Aguiar Moraes Filho, Theo Härd...