Nowadays, the web-based architecture is the most frequently used for a wide range of internet services, as it allows to easily access and manage information and software on remote machines. The input of web applications is made up of queries, i.e. sequences of pairs attributevalue. A wide range of attacks exploits web application vulnerabilities, typically derived from input validation flaws. In this work we propose a new formulation of query analysis through Hidden Markov Models (HMM) and show that HMM are effective in detecting a wide range of either known or unknown attacks on web applications. In addition, despite previous works, we explicitly address the problem related to the presence of noise (i.e., attacks) in the training set. Finally, we show that performance can be increased when a sequence of symbols is modelled by an ensemble of HMM. Experimental results on real world data, show the effectiveness of the proposed system in terms of very high detection rates and low false al...