Sciweavers

708 search results - page 32 / 142
» Identifying Content Blocks from Web Documents
ICDAR
2011
IEEE
12 years 8 months ago
Identification of Indic Scripts on Torn-Documents
Questioned Document Examination processes often encompass analysis of torn documents. To aid a forensic expert, automatic classification of content type in torn documents might ...
Sukalpa Chanda, Katrin Franke, Umapada Pal
HT
2007
ACM
14 years 15 days ago
Lesson learnt from a large-scale industrial semantic web application
The design and maintenance of an aero-engine generates a significant amount of documentation. When designing new engines, engineers must obtain knowledge gained from maintenance o...
Sylvia C. Wong, Richard M. Crowder, Gary B. Wills,...
CNIS
2006
13 years 10 months ago
Dynamically blocking access to web pages for spammers' harvesters
Almost all current anti spam measures are reactive, filtering being the most common. But to react means always to be one step behind. Reaction requires to predict the next action ...
Tobias Eggendorfer, Jörg Keller
SAC
2008
ACM
13 years 8 months ago
Exploring social annotations for web document classification
Social annotation via so-called collaborative tagging describes the process by which many users add metadata in the form of unstructured keywords to shared content. In this paper,...
Michael G. Noll, Christoph Meinel
KDD
2005
ACM
14 years 2 months ago
Web mining from competitors' websites
This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...
Xin Chen, Yi-fang Brook Wu