Support vector machine learning from heterogeneous data: an empirical analysis using protein sequence and structure