The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
We present a framework to extract the most important features (tree fragments) from a Tree Kernel (TK) space according to their importance in the target kernelbased machine, e.g. ...
In a world where massive amounts of data are recorded on a large scale we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction...
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Emergence of the web and online computing applications gave rise to rich large scale social activity data. One of the principal challenges then is to build models and understandin...