Sciweavers

466 search results - page 72 / 94
» Scalable Feature Extraction from Noisy Documents
DATAMINE
2006
13 years 6 months ago
Characteristic-Based Clustering for Time Series Data
With the growing importance of time series clustering research, particularly for similarity searches amongst long time series such as those arising in medicine or finance, it is cr...
Xiaozhe Wang, Kate A. Smith, Rob J. Hyndman
ICDAR
2005
IEEE
14 years 11 days ago
The Neural-based Segmentation of Cursive Words using Enhanced Heuristics
This paper presents an Enhanced Heuristic Segmenter (EHS) and an improved neural-based segmentation technique for segmenting cursive words and validating prospective segmentation ...
Chun Ki Cheng, Michael Blumenstein
LREC
2008
13 years 8 months ago
Unsupervised Learning-based Anomalous Arabic Text Detection
The growing dependence of modern society on the Web as a vital source of information and communication has become inevitable. However, the Web has become an ideal channel for vari...
Nasser Abouzakhar, Ben Allison, Louise Guthrie
CIKM
2008
Springer
13 years 8 months ago
Dr. Searcher and Mr. Browser: a unified hyperlink-click graph
We introduce a unified graph representation of the Web, which includes both structural and usage information. We model this graph using a simple union of the Web's hyperlink ...
Barbara Poblete, Carlos Castillo, Aristides Gionis
WSDM
2010
ACM
14 years 1 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...