Multi-label classification is a popular learning task. However, some algorithms that learn from multi-label data can only output a score for each label, so they cannot be readily used in applications that require bipartitions. In addition, several recent state-of-the-art multi-label classification algorithms primarily output a score vector and employ a single (sometimes simple) thresholding method to produce bipartitions. Furthermore, some approaches can naturally output both a score vector and a bipartition, but whether thresholding the scores could yield a better bipartition has not been investigated. This paper contributes a theoretical and empirical comparative study of existing thresholding methods, highlighting their importance for obtaining high-quality bipartitions.