The World Wide Web (WWW) is rapidly becoming important for society as a medium for sharing data, information and services, and there is a growing interest in tools for understandi...
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...
Systems based on statistical and machine learning methods have been shown to be extremely effective and scalable for the analysis of large amount of textual data. However, in the r...
As WWW becomes more and more popular and powerful, how to search information on the web in database way becomes an important research topic. COMMIX, which is developed in the DB g...
Tengjiao Wang, Shiwei Tang, Dongqing Yang, Jun Gao...