A substantial subset of the web data follows some kind of underlying structure. In order to let software programs gain full benefit from these “semistructured” web sources, wra...
Set expansion refers to expanding a partial set of “seed” objects into a more complete set. One system that does set expansion is SEAL (Set Expander for Any Language), which e...
Background: Primer and probe sequences are the main components of nucleic acid-based detection systems. Biologists use primers and probes for different tasks, some related to the ...
We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database...
A major difficulty of supervised approaches for text classification is that they require a great number of training instances in order to construct an accurate classifier. This pap...