Sciweavers

263 search results - page 4 / 53
» Re-engineering structures from Web documents
Sort
View
NAACL
2010
13 years 5 months ago
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
Jason R. Smith, Chris Quirk, Kristina Toutanova
WWW
2006
ACM
14 years 8 months ago
Logical structure based semantic relationship extraction from semi-structured documents
Addressed in this paper is the issue of semantic relationship extraction from semi-structured documents. Many research efforts have been made so far on the semantic information ex...
Kuo Zhang, Gang Wu, Juan-Zi Li
CHI
1996
ACM
13 years 11 months ago
Silk from a Sow's Ear: Extracting Usable Structures from the Web
In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property has directly f...
Peter Pirolli, James E. Pitkow, Ramana Rao
NDSS
2009
IEEE
14 years 2 months ago
Document Structure Integrity: A Robust Basis for Cross-site Scripting Defense
Cross-site scripting (or XSS) has been the most dominant class of web vulnerabilities in 2007. The main underlying reason for XSS vulnerabilities is that web markup and client-sid...
Yacin Nadji, Prateek Saxena, Dawn Song
HT
2005
ACM
14 years 1 months ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov