Recently, Balcan and Blum [1] suggested a theory of learning based on general similarity functions instead of positive semi-definite kernels. We study the gap between the learning guarantees of kernel-based learning and those that can be obtained by using the kernel as a similarity function, a question left open by Balcan and Blum. We provide a significantly improved bound on how good a kernel function is when used as a similarity function, and extend the result to the more practically relevant hinge loss rather than the zero-one error rate. Furthermore, we show that this bound is tight, and hence establish that there is in fact a real gap between the traditional kernel-based notion of margin and the newer similarity-based notion.