Sciweavers

WWW
2010
ACM

Access: news and blog analysis for the social sciences

The social sciences strive to understand the political, social, and cultural world around us, but have been impaired by limited access to the quantitative data sources enjoyed by the hard sciences. Careful analysis of Web document streams holds enormous potential to solve longstanding problems in a variety of social science disciplines through massive data analysis. This paper introduces the TextMap Access system, which provides ready access to a wealth of interesting statistics on millions of people, places, and things across a number of interesting web corpora. Powered by a flexible and scalable distributed statistics computation framework using Hadoop, continually updated corpora include newspapers, blogs, patent records, legal documents, and scientific abstracts; well over a terabyte of raw text and growing daily. The Lydia TextMap Access system, available through http://www.textmap.com/access, provides instant access for students and scholars through a convenient web user-inter...
Added: 02 Aug 2010
Updated: 02 Aug 2010
Type: Conference
Year: 2010
Where: WWW
Authors: Mikhail Bautin, Charles B. Ward, Akshay Patil, Steven Skiena