Concept class explained
In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory.
Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning.[1] In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map
, i.e. from examples to classifier labels (where
and where
c is a subset of
X),
c is then said to be a
concept. A
concept class
is then a collection of such concepts.
Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s. Not every subclass is reachable.[2]
Background
A sample
is a
partial function from
to
. Identifying a concept with its characteristic function mapping
to
, it is a special case of a sample.
Two samples are consistent if they agree on the intersection of their domains. A sample
extends another sample
if the two are consistent and the domain of
is contained in the domain of
.
Examples
Suppose that
. Then:
is reachable with the sample
;
for
are reachable with a sample that maps the elements of
to zero;
, which consists of the singleton sets, is
not reachable.
Applications
Let
be some concept class. For any concept
, we call this concept
-good for a positive integer
if, for all
, at least
of the concepts in
agree with
on the classification of
. The
fingerprint dimension
of the entire concept class
is the least positive integer
such that every reachable subclass
contains a concept that is
-good for it. This quantity can be used to bound the minimum number of equivalence queries needed to learn a class of concepts according to the following
inequality:
.
Notes and References
- https://arxiv.org/abs/1801.06566 Chase, H., & Freitag, J. (2018). Model Theory and Machine Learning. arXiv preprint arXiv:1801.06566
- Angluin. D.. 2004. Queries revisited. Theoretical Computer Science. 313. 2. 188–191. 10.1016/j.tcs.2003.11.004 .