While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
This paper shows how a corpus of instant messages can be employed to detect de facto communities of practice automatically. A novel algorithm based on the concept of Edge Stress F...
This contribution addresses the development of new web sites that reuse existing content from external sources. Unlike common links to other resources, which retrieve the w...
We examine the differences and similarities between two online computer science citation databases, DBLP and CiteSeer. The database entries in DBLP are inserted manually while the C...
Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G....
We propose a new browsing system called "Web2Talkshow". It transforms declarative-based web content into humorous dialog-based TV-program-like content that is presented t...
With the fast increase in Web activities, Web data mining has recently become an important research topic. However, most previous studies of mining path traversal patterns are bas...
The first part of the paper provides a brief description of the Language Observatory Project (LOP) and highlights the major technical difficulties to be overcome. The latter par...
Yoshiki Mikami, Pavol Zavarsky, Mohd Zaidi Abd Roz...