Users who are unfamiliar with database query languages can search XML data sets using keyword queries. Current approaches for supporting such queries are either for textcentric XML, where the structure is very simple and long text fields predominate; or data-centric, where the structure is very rich. However, long text fields are becoming more common in data-centric XML, and existing approaches deliver relatively poor precision, recall, and ranking for such data sets. In this paper, we introduce an XML keyword search method that provides high precision, recall, and ranking quality for data-centric XML, even when long text fields are present. Our approach is based on a new group of structural relationships called normalized term presence correlation (NTPC). In a one-time setup phase, we compute the NTPCs for a representative DB instance, then use this information to rank candidate answers for all subsequent queries, based on each answer’s structure. Our experiments with 65 user-su...