Regularization-based Methods for Ordinal Quantification
Quantification, i.e., the task of predicting the class prevalence values in bags of unlabeled data items, has received increased attention in recent years. However, most quantification research has concentrated on developing algorithms for binary and multi-class problems in which the classes are not ordered. Here, we study the ordinal case, i.e., the case in which a total order is defined on the set of n > 2 classes. We make three main contributions to this field. First, we create and make available two datasets for ordinal quantification (OQ) research that overcome the inadequacies of the previously available ones. Second, we experimentally compare the most important OQ algorithms proposed in the literature so far. To this end, we bring together algorithms proposed by authors from very different research fields, such as data mining and astrophysics, who were unaware of each other's developments. Third, we propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments. The key to this gain in performance is that our regularization prevents ordinally implausible estimates, assuming that ordinal distributions tend to be smooth in practice. We informally verify this assumption for several real-world applications.
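The smoothness idea in the abstract can be illustrated with a small sketch. The abstract does not specify the regularizer, so everything below is an assumption made for illustration and not the authors' algorithm: the penalty is a second-difference (Tikhonov-style) term, the loss is least squares, and the names `second_difference_matrix`, `roughness`, `regularized_estimate`, `M`, `q`, and `tau` are hypothetical.

```python
import numpy as np

def second_difference_matrix(n):
    """Return the (n-2) x n matrix T whose rows are [1, -2, 1],
    so that (T @ p)[i] = p[i] - 2*p[i+1] + p[i+2]."""
    T = np.zeros((n - 2, n))
    for i in range(n - 2):
        T[i, i:i + 3] = [1.0, -2.0, 1.0]
    return T

def roughness(p):
    """Assumed smoothness penalty: half the squared norm of the second
    differences of p. Larger values mean a more jagged ordinal distribution."""
    p = np.asarray(p, dtype=float)
    T = second_difference_matrix(len(p))
    return 0.5 * float(np.sum((T @ p) ** 2))

def regularized_estimate(M, q, tau):
    """Illustrative regularized estimate of class prevalences.

    M   : matrix with M[i, j] = P(predicted class i | true class j)
    q   : observed distribution of predicted classes in the bag
    tau : regularization strength (tau = 0 disables the penalty)

    Minimizes ||q - M p||^2 + (tau / 2) * ||T p||^2 in closed form,
    then clips negative entries and renormalizes. The clipping step is
    a crude stand-in for a proper simplex constraint.
    """
    M = np.asarray(M, dtype=float)
    q = np.asarray(q, dtype=float)
    T = second_difference_matrix(M.shape[1])
    A = M.T @ M + 0.5 * tau * (T.T @ T)
    p = np.linalg.solve(A, M.T @ q)
    p = np.clip(p, 0.0, None)
    return p / p.sum()

smooth = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
jagged = np.array([0.3, 0.05, 0.3, 0.05, 0.3])
estimate = regularized_estimate(np.eye(5), jagged, tau=10.0)
```

Under these assumptions, `roughness(jagged)` exceeds `roughness(smooth)`, and increasing `tau` pulls the estimate away from jagged, ordinally implausible shapes toward smoother ones.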
- Published in: Data Mining and Knowledge Discovery
- Type: Article
- Authors: Bunse, Mirko; Moreo, Alejandro; Sebastiani, Fabrizio; Senz, Martin
- Year: 2024
Citation information
Bunse, Mirko; Moreo, Alejandro; Sebastiani, Fabrizio; Senz, Martin: Regularization-based Methods for Ordinal Quantification, Data Mining and Knowledge Discovery, August 2024, Springer Science and Business Media LLC.
@Article{Bunse.etal.2024a,
author={Bunse, Mirko and Moreo, Alejandro and Sebastiani, Fabrizio and Senz, Martin},
title={Regularization-based Methods for Ordinal Quantification},
journal={Data Mining and Knowledge Discovery},
month={August},
publisher={Springer Science and Business Media LLC},
year={2024},
abstract={Quantification, i.e., the task of predicting the class prevalence values in bags of unlabeled data items, has received increased attention in recent years. However, most quantification research has concentrated on developing algorithms for binary and multi-class problems in which the classes are not ordered. Here, we study the ordinal case, i.e., the case in which a total order is defined on the...}}