The empirical duality gap of constrained statistical learning, Luiz Chamon
This paper is concerned with the study of constrained statistical learning problems, the unconstrained version of which are at the core of virtually all of modern information processing. Accounting for constraints, however, is paramount to incorporate prior knowledge and impose desired structural and statistical properties on the solutions. Still, solving constrained statistical problems remains challenging and guarantees scarce, leaving them to be tackled using regularized formulations. Though practical and effective, selecting regularization parameters so as to satisfy requirements is challenging, if at all possible, due to the lack of a straightforward relation between parameters and constraints. In this work, we propose to directly tackle the constrained statistical problem overcoming its infinite dimensionality, unknown distributions, and constraints by leveraging finite dimensional parameterizations, sample averages, and duality theory. Aside from making the problem tractable, these tools allow us to bound the empirical duality gap, i.e., the difference between our approximate tractable solutions and the actual solutions of the original statistical problem. We demonstrate the effectiveness and usefulness of this constrained formulation in a fair learning application.