Understanding intents from search queries can improve a user’s search experience and boost a site’s advertising profits. Query tagging via statistical sequential labeling models has been shown to perform well, but annotating the training set for supervised learning requires substantial human effort. Domain-specific knowledge, such as semantic-class lexicons, reduces the amount of manual annotation needed, but considerable human effort is still required to maintain these lexicons as search topics evolve. This paper investigates semi-supervised learning algorithms that leverage structured data (HTML lists) from the Web to automatically generate semantic-class lexicons, which are then used to improve query tagging performance, even with far less training data. We focus on identifying which objectives of the semi-supervised lexicon learning algorithms are crucial for the success of query tagging. Prior work on lexicon acquisition has largely focused on the precision...