Visual information derived from accurate lip extraction can provide speech recognition systems with features that are invariant to noise perturbation, and it can also be used in a wide variety of other applications. Unlike many current automatic lip reading systems, which impose several restrictions on users, our work aims at an unconstrained system. In this paper we introduce a method for extracting the lip area that uses k-means color clustering with an automatically adapted number of clusters. The method's performance is improved by first applying nearest neighbor color segmentation. The extracted lip area is morphologically processed and fitted with a best-fit ellipse. The points of interest (keypoints) of the mouth area are then extracted, and a corner detector is applied to fine-tune the mouth corners. Experimental tests have shown that the algorithm works very well under natural conditions and that accurate extraction of lip keypoints is feasible.
Evangelos Skodras, Nikolaos D. Fakotakis
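
The core of the pipeline summarized above (color clustering, ellipse fitting, keypoint extraction) can be illustrated with a minimal sketch. It is not the authors' implementation: the number of clusters is fixed rather than automatically adapted, the nearest neighbor pre-segmentation, morphological processing, and corner detector are omitted, the lip cluster is assumed to be the one with the largest red-minus-green difference, and the ellipse is fitted from second-order moments. All function names (`kmeans_colors`, `lip_mask`, `fit_ellipse`, `keypoints`, `synthetic_mouth`) are hypothetical.

```python
import numpy as np

def nearest(pixels, centers):
    """Index of the closest center for every pixel (Euclidean distance in color space)."""
    d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans_colors(pixels, k, iters=20):
    """Lloyd's k-means on an (N, 3) float array; returns (labels, centers)."""
    # Deterministic seeding (an assumption of this sketch): take pixels evenly
    # spaced along the ordering by the first color channel.
    order = np.argsort(pixels[:, 0], kind="stable")
    centers = pixels[order[np.linspace(0, len(pixels) - 1, k).astype(int)]].copy()
    for _ in range(iters):
        labels = nearest(pixels, centers)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # leave the center unchanged if its cluster is empty
                centers[j] = members.mean(axis=0)
    return nearest(pixels, centers), centers

def lip_mask(image, k=2):
    """Cluster the pixel colors and return a boolean mask of the assumed lip cluster."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    labels, centers = kmeans_colors(pixels, k)
    # Assumption: the lip cluster is the one with the largest R - G difference.
    lip = np.argmax(centers[:, 0] - centers[:, 1])
    return (labels == lip).reshape(h, w)

def fit_ellipse(mask):
    """Moment-based ellipse fit: center, semi-axes, and axis directions in (x, y)."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys]).astype(float)
    center = pts.mean(axis=1)
    evals, evecs = np.linalg.eigh(np.cov(pts, bias=True))  # eigenvalues ascending
    # For a uniformly filled ellipse, the variance along a semi-axis r is r**2 / 4.
    minor, major = 2.0 * np.sqrt(evals)
    return center, major, minor, evecs[:, 1], evecs[:, 0]

def keypoints(mask):
    """Endpoints of the fitted ellipse axes, labeled by image position."""
    c, a, b, u, v = fit_ellipse(mask)
    # Eigenvector signs are arbitrary, so order the endpoints by coordinate.
    corners = sorted([c - a * u, c + a * u], key=lambda p: p[0])
    lips = sorted([c - b * v, c + b * v], key=lambda p: p[1])
    return {"left": corners[0], "right": corners[1], "top": lips[0], "bottom": lips[1]}

def synthetic_mouth(h=60, w=80, cx=40, cy=30, a=20, b=8):
    """Test image: a lip-colored ellipse on a skin-colored background (RGB)."""
    y, x = np.mgrid[0:h, 0:w]
    inside = ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0
    img = np.empty((h, w, 3), dtype=np.uint8)
    img[:] = (200, 160, 140)
    img[inside] = (180, 70, 90)
    return img
```

Calling `keypoints(lip_mask(synthetic_mouth()))` returns four (x, y) points that approximate the mouth corners and the upper and lower lip extremes of the synthetic ellipse. On real images, the omitted steps would be needed to cope with illumination changes, skin tone variation, and mouths that are not horizontally aligned.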