The World Wide Web (WWW) is a vast source of information, and the problem of information overload is more acute than ever. Because of noise on the Web, it is becoming hard to find usable information. Real estate listings are frequently offered through several real estate agencies and published on different web sites. As a consequence, the same property may appear with differing prices and descriptions. At the same time, a potential buyer or renter may prefer to obtain a complete description of a property of interest, compiled from the data available on different portals, and, if possible, to track changes in its price. This problem illustrates a wider class of problems: integrating data from numerous semistructured web data sources. The paper investigates how clustering algorithms can be used to identify individual real estate properties described on different portals. Clustering algorithms were used to group the records acquired from different web sources. Both standard cluste...