Towards The Web of Concepts: Extracting Concepts from Large Datasets

Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form the backbone of the next generation of search technology, we develop a novel technique to extract concepts from large datasets. We approach the problem of concept extraction from corpora as a market-baskets problem [2], adapting statistical measures of support and confidence. We evaluate our concept extraction algorithm on datasets containing data from a large number of users (e.g., the AOL query log data set [11]), and we show that a high-precision concept set can be extracted.
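
To make the market-basket framing concrete, here is a minimal sketch of how support and confidence might be computed for word sequences in a query log. It is not the authors' algorithm: the definitions used (support as the number of queries containing a sequence, confidence as the support of a sequence divided by the support of its prefix), the thresholds, the function names, and the toy queries are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(words, n):
    """Return all contiguous word sequences of length n."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def support_counts(queries, max_len=3):
    """Count, for each word sequence, the number of queries containing it."""
    counts = Counter()
    for query in queries:
        words = query.lower().split()
        seen = set()
        for n in range(1, max_len + 1):
            seen.update(ngrams(words, n))
        counts.update(seen)  # each sequence counted once per query
    return counts


def confidence(counts, seq):
    """Assumed definition: support(seq) / support(seq without its last word)."""
    prefix = seq[:-1]
    return counts[seq] / counts[prefix] if counts[prefix] else 0.0


def candidate_concepts(queries, min_support=2, min_conf=0.6, max_len=3):
    """Keep multi-word sequences that pass both (hypothetical) thresholds."""
    counts = support_counts(queries, max_len)
    return sorted(
        seq for seq, count in counts.items()
        if len(seq) > 1
        and count >= min_support
        and confidence(counts, seq) >= min_conf
    )


# Toy query log, invented for this example.
queries = [
    "new york hotels",
    "new york weather",
    "cheap hotels in new york",
    "new car prices",
]
print(candidate_concepts(queries))  # [('new', 'york')]
```

In this toy log, "new york" appears in three of the four queries that contain "new", so it passes both thresholds, while "new car" appears only once and is dropped for low support.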

Aditya G. Parameswaran, Hector Garcia-Molina, Anand Rajaraman

AOL Query Log | Concept Extraction | Imaginary Entities | PVLDB 2010

» WebSets extracting sets of entities from the web using unsupervised information extraction

» Relational concept discovery in structured datasets

» Towards Efficient Learning of Neural Network Ensembles from Arbitrarily Large Datasets

» CUM An Efficient Framework for Mining Concept Units

» Toward interactive learning by concept ordering

» Semantic Interoperability in Archaeological Datasets Data Mapping and Extraction Via the C...

» Identifying overrepresented concepts in gene lists from literature a statistical approach ...

» Efficient concept clustering for ontology learning using an event life cycle on the web

Post Info

Added	30 Jan 2011
Updated	30 Jan 2011
Type	Journal
Year	2010
Where	PVLDB
Authors	Aditya G. Parameswaran, Hector Garcia-Molina, Anand Rajaraman
