Order Compression Schemes

Sample compression schemes "encode" a set of examples as a small subset of those examples. The long-standing open sample compression conjecture states that, for any concept class C of VC-dimension d, there is a sample compression scheme in which samples for concepts in C are compressed to samples of size at most d. We show that every order over C induces a special type of sample compression scheme for C, which we call an order compression scheme. It turns out that order compression schemes can compress to samples of size at most d if C is maximum, intersection-closed, a Dudley class, or of VC-dimension 1, and thus in most cases for which the sample compression conjecture is known to be true. Since order compression schemes are much simpler than sample compression schemes in general, their study seems to be a promising step towards resolving the sample compression conjecture. We reveal a number of fundamental properties of order compression schemes that are helpful in such a study. In particular, order compression schemes exhibit interesting graph-theoretic properties as well as connections to the theory of learning from teachers. To obtain small compressed sets, order compression schemes for a concept class C must often use a proper superset H of C as a hypothesis space. We therefore also compare order compression schemes for C to order compression schemes for such hypothesis spaces, which leads to a study of several mutually related combinatorial parameters that specify compressibility.