Many interesting Web-based AI problems require the ability to collect, store, and process large text datasets. To address this need, we have developed Slashpack, an integrated toolkit for collecting and managing hypertext data. We are currently using Slashpack to study the effectiveness of tagging as a mechanism for organizing and searching blogs, and to study community structure in the blogosphere.
Christopher H. Brooks, Monica Agarwal, Jason Endo,