We investigate the complexity of neural learning problems in the form of
three basic questions (representation, modification, and construction)
related to algorithms that learn from examples in neural networks.
The first part is concerned with representations of example sets by
weight vectors. We establish bounds on the range of integer weights of
McCulloch-Pitts neurons computing linearly separable Boolean functions
in the case that these functions are incompletely specified. To this end, we
introduce the concept of weight complexity as a measure of the difficulty
of representing linear dichotomies of sets of binary words by
McCulloch-Pitts neurons. We give upper and lower bounds on the weight
complexity of linear dichotomies in terms of their cardinality. For
completely specified functions, the established upper bound improves the
previously known bound by a factor of 2. Concerning lower bounds, we
develop methods to construct linear dichotomies of linear cardinality
that have exponential weight complexity. Furthermore, we investigate weight
complexity in the average case and generalize the previously known lower
bound for completely specified functions to linear dichotomies. Finally,
we consider linear dichotomies consisting of binary words with bounded
Hamming weight and show that their weight complexity can be exponential
even if their Hamming weight is at most 3. For Hamming weight at most 2,
we prove that exponential weight complexity is equivalent to requiring
an exponential threshold.
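As a toy illustration of the objects involved (our own sketch, not a construction from the thesis, with hypothetical function names), the following code implements a McCulloch-Pitts neuron with integer weights and checks whether a given weight vector and threshold represent a linear dichotomy of binary words:

```python
from itertools import product

def mp_neuron(weights, threshold):
    """McCulloch-Pitts neuron: outputs 1 iff the weighted sum of the
    binary inputs reaches the threshold."""
    def f(x):
        return int(sum(w * xi for w, xi in zip(weights, x)) >= threshold)
    return f

def separates(weights, threshold, positive, negative):
    """Check whether the integer weights and threshold represent the
    linear dichotomy (positive, negative) of binary words."""
    f = mp_neuron(weights, threshold)
    return all(f(x) == 1 for x in positive) and all(f(x) == 0 for x in negative)

# Example: the 3-input majority function is linearly separable with
# all weights equal to 1 and threshold 2.
pos = [x for x in product((0, 1), repeat=3) if sum(x) >= 2]
neg = [x for x in product((0, 1), repeat=3) if sum(x) <= 1]
print(separates((1, 1, 1), 2, pos, neg))  # True
```

The weight complexity studied in the thesis concerns how large such integer weights must become for the hardest dichotomies, which this sketch does not measure.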
The second part considers learning as the modification of representations.
We define identification criteria for simple learning rules in McCulloch-Pitts
neurons that are weaker than exact identification. We show that the Perceptron
learning rule cannot satisfy any of these criteria in a polynomial number
of learning steps. For other known learning rules based on the error-correction
principle, we show similar results and survey further issues already treated
in the literature.
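The Perceptron learning rule referred to above is the classical error-correction procedure: on each misclassified example, the input vector is added to or subtracted from the weights. A minimal sketch (our own formulation, with a hypothetical epoch bound rather than the step counts analyzed in the thesis) is:

```python
def perceptron_rule(examples, max_epochs=100):
    """Perceptron learning rule (error correction) for a single neuron
    on binary inputs with labels in {0, 1}."""
    n = len(examples[0][0])
    w, theta = [0] * n, 0
    for _ in range(max_epochs):
        errors = 0
        for x, y in examples:
            out = int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)
            if out != y:
                delta = y - out  # +1 on a false negative, -1 on a false positive
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                theta -= delta   # lowering the threshold raises the output
                errors += 1
        if errors == 0:
            break                # hypothesis consistent with all examples
    return w, theta

# AND on two inputs is linearly separable, so the rule converges.
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = perceptron_rule(and_examples)
```

By the Perceptron convergence theorem, the loop terminates on any linearly separable example set; the thesis's negative results concern how many such update steps weaker identification criteria still require.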
The last part is devoted to the general problem of constructing representations
efficiently. We focus on the task of mapping example sets to weights in
neural architectures. The consistency problem and the approximation problem
are variants of this task that ask for the mere existence of appropriate
weights. In the thesis, we propose two parameters, coincidence and heaviness,
to characterize example sets. Using both to restrict the set of instances,
we create from each problem an infinite class of problems whose complexity
may vary with the choice of these parameters. For architectures with known
NP-complete consistency and approximation problems, we show that NP-completeness
already occurs when coincidence and heaviness are constant. We determine
some of these constants optimally, thereby proving interesting dichotomy
theorems. Finally, for two-layered architectures with more than two hidden
nodes, we show that the consistency problem is NP-complete for a larger
class of network functions than previously known.
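For a single neuron, the consistency problem can be made concrete by a brute-force search over bounded integer weights. This exhaustive sketch is our own illustration (exponential in the input dimension, so a toy only), not an algorithm from the thesis:

```python
from itertools import product

def consistent_neuron(positive, negative, max_weight):
    """Search for an integer weight vector and threshold, all bounded by
    max_weight in absolute value, consistent with the example set.
    Returns (weights, threshold) or None if no such neuron exists."""
    n = len((positive + negative)[0])
    rng = range(-max_weight, max_weight + 1)
    for w in product(rng, repeat=n):
        for theta in rng:
            if (all(sum(a * b for a, b in zip(w, x)) >= theta for x in positive)
                    and all(sum(a * b for a, b in zip(w, x)) < theta for x in negative)):
                return w, theta
    return None

# XOR admits no separating neuron, whatever the weight bound.
print(consistent_neuron([(0, 1), (1, 0)], [(0, 0), (1, 1)], 2))  # None
```

The NP-completeness results of the thesis show that, for the architectures considered, no algorithm essentially better than such exhaustive search is expected to exist unless P = NP.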
Some NP-completeness results obtained in the thesis are based on two
new variants of the satisfiability problem. We give their NP-completeness
proofs in the Appendix. They may also be interesting in their own right
and helpful in finding further NP-complete problems.
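For orientation, the standard positive 1-IN-3-SAT problem underlying the first variant asks whether an assignment sets exactly one variable per clause to true. The 1-disjointness restriction introduced in the thesis is not modeled in this brute-force sketch of ours:

```python
from itertools import product

def positive_one_in_three_sat(num_vars, clauses):
    """Brute-force decision for positive 1-IN-3-SAT: each clause is a
    triple of variable indices (no negations), and a satisfying
    assignment must set exactly one variable per clause to true."""
    for assignment in product((False, True), repeat=num_vars):
        if all(sum(assignment[v] for v in clause) == 1 for clause in clauses):
            return True
    return False

# Satisfiable, e.g. by setting only x1 to true.
print(positive_one_in_three_sat(4, [(0, 1, 2), (1, 2, 3)]))  # True
```

The NP-completeness of such restricted variants is what allows their use as sources of reductions to the consistency problems above.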
Table of Contents
1. Introduction
1.1 Notation
1.2 McCulloch-Pitts Neurons and Linearly Separable Functions
1.3 Learning from Examples
1.4 Computational Complexity
2. Linearly Separable Sets and Their Weight Complexity
2.1 Properties of Linearly Separable Sets
2.2 Weight Complexity
2.3 Upper Bounds
2.4 Lower Bounds
2.5 Average Case Weight Complexity
2.6 Weight Complexity of Light Sets
2.7 Remarks and Open Questions
3. Simple Learning Rules for McCulloch-Pitts Neurons
3.1 Learning from Errors: The Perceptron Learning Rule
3.2 Exact Identification
3.3 pac-Identification
3.4 Identification of Variable Order
3.5 Identification of Signs
3.6 Other Simple Learning Rules
3.7 Remarks and Open Questions
4. Consistency Problems
4.1 Neural Variants
4.2 Neurons with Bounded Weights
4.3 Two-layered Architectures
4.4 More than Two Hidden Nodes
4.5 Cascade Architectures
4.6 An Approximation Problem
4.7 Remarks and Open Questions
A Two NP-complete Satisfiability Problems
A.1 The 1-Disjoint Positive 1-IN-3-SAT Problem
A.2 The 1-Disjoint 3-SET-SPLITTING Problem
Summary
References
Author Index
Acknowledgements
