General Bounds on the Number of Examples Needed for Learning Probabilistic Concepts
Abstract. Given a p-concept class $C$, we define two important functions, $d_C(\gamma)$ and $d'_C(\gamma)$ (related to the notion of $\gamma$-shattering). We prove a lower bound of $\Omega((d_C(\gamma)-1)/(\epsilon\gamma^2))$ on the number of examples required for learning $C$ with an $(\epsilon,\gamma)$-good model of probability. We prove similar lower bounds for other learning models, such as learning with a $\gamma$-bounded absolute (or quadratic) difference and learning with a $\gamma$-good decision rule. For the class $ND$ of nondecreasing p-concepts on the real domain, $d_{ND}(\gamma)=\Omega(1/\gamma)$. It can be shown that the resulting lower bounds for learning $ND$ (within the models under consideration) are tight to within a logarithmic factor. In order to obtain the almost-matching upper bounds, we introduce a new method for designing learning algorithms: dynamic partitioning of the domain by means of splitting trees. The paper also contains a discussion of the gaps between the general lower bounds and the corresponding general upper bounds. It can be shown that, under very mild conditions, these gaps are quite narrow.
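For orientation, the following is a standard formulation of $\gamma$-shattering for p-concept classes, in the style introduced by Kearns and Schapire; it is a paraphrase for background, and the paper's functions $d_C(\gamma)$ and $d'_C(\gamma)$ should be read as variants built on this notion rather than as identical to it.

% Standard gamma-shattering for p-concepts (following Kearns and
% Schapire); a background sketch, not necessarily the paper's exact
% definition of d_C(gamma) or d'_C(gamma).
A sequence $x_1,\dots,x_d \in X$ is \emph{$\gamma$-shattered} by a
p-concept class $C$ if there exist witnesses $r_1,\dots,r_d \in [0,1]$
such that for every pattern $b \in \{0,1\}^d$ some $c \in C$ satisfies
\[
  c(x_i) \ge r_i + \gamma \ \text{ if } b_i = 1,
  \qquad
  c(x_i) \le r_i - \gamma \ \text{ if } b_i = 0 .
\]
The associated dimension is the length of the longest shattered sequence:
\[
  d_C(\gamma) \;=\; \max\bigl\{\, d : \text{some } x_1,\dots,x_d
  \text{ is $\gamma$-shattered by } C \,\bigr\}.
\]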

