Information retrieval systems based on keyword matching are evolving into question answering systems that return short passages or direct answers to questions, rather than URLs pointing to whole pages. Most open-domain question answering systems depend on elaborately designed hierarchies of question types. A question is first classified into a fixed type, and then hand-engineered rules associated with that type yield keywords and/or predictive annotations that are likely to match indexed answer passages. Here we seek a more data-driven approach, assisted by machine learning. We propose a simple conditional exponential model over a pair of feature vectors, one derived from the question and the other derived from a candidate passage. Features are extracted using a lexical network and surface context, as in named entity extraction, except that no direct supervision is available in the form of fixed entity types and their examples. Using the exponential model, we filter candidate passa...
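
To make the idea of an exponential model over a pair of feature vectors concrete, the following is a minimal sketch and not the model as specified in the paper. It assumes that each (question feature, passage feature) pair carries one learned weight, that a passage's score is the sum of weights over co-occurring pairs, and that scores are exponentiated and normalized over the candidate passages for one question. The feature names, the weights, and the functions `pair_score` and `rank_passages` are all hypothetical.

```python
import math

def pair_score(weights, q_feats, p_feats):
    """Sum the weights of every (question feature, passage feature) pair that co-fires.

    Assumption: one scalar weight per feature pair; unseen pairs contribute 0.
    """
    return sum(weights.get((qf, pf), 0.0) for qf in q_feats for pf in p_feats)

def rank_passages(weights, q_feats, passages):
    """Return (passage_id, probability) pairs sorted by decreasing probability.

    Assumption: probabilities are normalized over the candidates of one question.
    """
    scores = {pid: pair_score(weights, q_feats, feats) for pid, feats in passages.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {pid: math.exp(s - top) for pid, s in scores.items()}
    z = sum(exp_scores.values())
    ranked = [(pid, e / z) for pid, e in exp_scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Toy, made-up weights: a "how far" question favors passages mentioning a unit of length.
weights = {
    ("qword:how_far", "hypernym:linear_unit"): 2.0,
    ("qword:who", "hypernym:person"): 1.5,
}
q_feats = {"qword:how_far"}
passages = {
    "p1": {"hypernym:linear_unit", "token:miles"},
    "p2": {"hypernym:person"},
}
ranked = rank_passages(weights, q_feats, passages)
```

In this toy setup the passage containing a length unit ranks first for the "how far" question, without the question ever being assigned to a fixed type. In a trained model the weights would be fit from question and answer-passage pairs rather than set by hand.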