Sciweavers

155 search results - page 18 / 31
» Matching web site structure and content
IAT
2007
IEEE
14 years 1 month ago
An Intelligent Web Agent to Mine Bilingual Parallel Pages via Automatic Discovery of URL Pairing Patterns
This paper describes an intelligent agent to facilitate bitext mining from the Web via automatic discovery of URL pairing patterns (or keys) for retrieving parallel web pages. The...
Chunyu Kit, Jessica Yee Ha Ng
WWW
2004
ACM
14 years 8 months ago
Automatic detection of fragments in dynamically generated web pages
Dividing web pages into fragments has been shown to provide significant benefits for both content generation and caching. In order for a web site to use fragment-based content gen...
Lakshmish Ramaswamy, Arun Iyengar, Ling Liu, Fred ...
WWW
2009
ACM
14 years 8 months ago
Less talk, more rock: automated organization of community-contributed collections of concert videos
We describe a system for synchronization and organization of user-contributed content from live music events. We start with a set of short video clips taken at a single event by m...
Lyndon S. Kennedy, Mor Naaman
WSE
2003
IEEE
14 years 19 days ago
Resolution of Static Clones in Dynamic Web Pages
Cloning is extremely likely to occur in web sites, much more so than in other software. While some clones exist for valid reasons, or are too small to eliminate, cloning percentag...
Nikita Synytskyy, James R. Cordy, Thomas R. Dean
DIS
2001
Springer
13 years 12 months ago
Eliminating Useless Parts in Semi-structured Documents Using Alternation Counts
We propose a preprocessing method for Web mining which, given semi-structured documents with the same structure and style, distinguishes useless parts and non-useless parts in each...
Daisuke Ikeda, Yasuhiro Yamada, Sachio Hirokawa