With the growing importance of foreign commerce come greater opportunities for fraudulent behaviour. Governments must therefore try to detect fraud as soon as it takes place if they are to avoid the profound damage it may cause to society. Although current fraud detection systems can be used in this endeavour with reasonable accuracy, they still suffer from the inconsistencies and ambiguities of unstructured databases, especially in customs. To deal with this problem, we propose a twofold approach: building a new structured database, kept as clean as possible, and mining the current database for the desired information. As our first contribution, we present a methodology for mining product attribute-value pairs in unstructured text datasets, bringing more structure to the current customs database. As our second contribution, we introduce a system for building a structured database for the Brazilian customs and keeping it with as few redund...
Norton Trevisan Roman, Cristiano D. Ferreira, Luis