Adaptive resonance theory explained

Adaptive resonance theory (ART) is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. It describes a number of artificial neural network models which use supervised and unsupervised learning methods, and address problems such as pattern recognition and prediction.

The primary intuition behind the ART model is that object identification and recognition generally occur as a result of the interaction of 'top-down' observer expectations with 'bottom-up' sensory information. The model postulates that 'top-down' expectations take the form of a memory template or prototype that is then compared with the actual features of an object as detected by the senses. This comparison gives rise to a measure of category belongingness. As long as the difference between sensation and expectation does not exceed a set threshold called the 'vigilance parameter', the sensed object will be considered a member of the expected class. The system thus offers a solution to the 'plasticity/stability' problem, i.e. the problem of acquiring new knowledge without disrupting existing knowledge, which is also called incremental learning.
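The comparison of expectation and sensation can be illustrated with a small Python sketch. It assumes binary feature vectors and expresses the test as a similarity ratio that must reach the vigilance value, which is the usual convention in ART 1-style models; the function names `match_score` and `resonates` are hypothetical and chosen only for this illustration.

```python
def match_score(observed, template):
    """Fraction of the observed features that the template also expects."""
    overlap = sum(min(o, t) for o, t in zip(observed, template))
    total = sum(observed)
    return overlap / total if total else 1.0


def resonates(observed, template, vigilance):
    """True when the match is good enough to accept the expected class."""
    return match_score(observed, template) >= vigilance


observed = [1, 1, 0, 1, 0]   # bottom-up sensory features
template = [1, 1, 0, 0, 0]   # top-down expectation (prototype)

print(match_score(observed, template))     # 2 of the 3 active features agree
print(resonates(observed, template, 0.6))  # → True
print(resonates(observed, template, 0.9))  # → False
```

A higher vigilance value demands a closer match, so the same object is accepted under a lenient setting and rejected under a strict one.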

Learning model

The basic ART system is an unsupervised learning model. It typically consists of a comparison field and a recognition field composed of neurons, a vigilance parameter (threshold of recognition), and a reset module.
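How these components interact can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it uses binary inputs and fast learning in the manner of ART 1, the function name `art1_present` is hypothetical, and the constant 0.5 in the choice function is an arbitrary small tie-breaking value chosen for the example.

```python
def art1_present(pattern, templates, vigilance):
    """Present one binary pattern and return the index of the category it joins.

    `templates` is a list of binary lists and is modified in place.
    """
    norm = sum(pattern)

    # Recognition field: rank the existing categories by how strongly they
    # respond to the input (overlap, normalised by template size).
    def choice(j):
        overlap = sum(p & t for p, t in zip(pattern, templates[j]))
        return overlap / (0.5 + sum(templates[j]))

    for j in sorted(range(len(templates)), key=choice, reverse=True):
        # Comparison field: intersect the input with the top-down template.
        matched = [p & t for p, t in zip(pattern, templates[j])]
        if norm == 0 or sum(matched) / norm >= vigilance:
            templates[j] = matched      # resonance: the template is refined
            return j
        # Otherwise the reset module rejects category j and the search
        # continues with the next-best candidate.

    # No existing category passed the vigilance test: commit a new one.
    templates.append(list(pattern))
    return len(templates) - 1


templates = []
patterns = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
labels = [art1_present(p, templates, vigilance=0.6) for p in patterns]
print(labels)     # → [0, 0, 1]
print(templates)  # → [[1, 1, 0, 0], [0, 0, 1, 1]]
```

The first two patterns share enough features to resonate with the same category, while the third fails the vigilance test against it and is assigned a new category.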

Training

There are two basic methods of training ART-based neural networks: slow and fast. In the slow learning method, the degree to which a recognition neuron's weights are trained towards the input vector is computed as continuous values using differential equations, and it therefore depends on the length of time for which the input vector is presented. With fast learning, algebraic equations are used to calculate the weight adjustments to be made, and binary values are used. While fast learning is effective and efficient for a variety of tasks, the slow learning method is more biologically plausible and can be used with continuous-time networks (i.e. when the input vector can vary continuously).
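The difference between the two methods can be illustrated with a short Python sketch. The differential equation used here, dw/dt = rate * (min(w, input) - w), is an assumed simplified form chosen for illustration and is not the equation set of any particular ART model; the names `fast_update` and `slow_update` and the parameters `rate` and `dt` are likewise hypothetical.

```python
def fast_update(weights, pattern):
    """Fast learning: one algebraic step straight to the final weights."""
    return [min(w, p) for w, p in zip(weights, pattern)]


def slow_update(weights, pattern, duration, rate=1.0, dt=0.01):
    """Slow learning: integrate an assumed weight ODE with Euler steps.

    The result depends on `duration`, i.e. on how long the input is shown.
    """
    w = list(weights)
    for _ in range(int(round(duration / dt))):
        w = [wi + dt * rate * (min(wi, pi) - wi) for wi, pi in zip(w, pattern)]
    return w


weights = [1.0, 1.0, 1.0]
pattern = [1, 0, 1]

print(fast_update(weights, pattern))                 # jumps to the target at once
print(slow_update(weights, pattern, duration=0.5))   # brief presentation
print(slow_update(weights, pattern, duration=10.0))  # long presentation
```

After a brief presentation the second weight has decayed only part of the way towards zero, whereas a long presentation brings it close to the value that fast learning reaches in a single step.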

Types

ART 1[1] [2] is the simplest variety of ART networks, accepting only binary inputs.

ART 2[3] extends network capabilities to support continuous inputs.

ART 2-A[4] is a streamlined form of ART 2 with a drastically accelerated runtime, and with qualitative results being only rarely inferior to the full ART 2 implementation.

ART 3[5] builds on ART 2 by simulating rudimentary neurotransmitter regulation of synaptic activity by incorporating simulated sodium (Na+) and calcium (Ca2+) ion concentrations into the system's equations, which results in a more physiologically realistic means of partially inhibiting categories that trigger mismatch resets.

ARTMAP,[6] also known as Predictive ART, combines two slightly modified ART 1 or ART 2 units into a supervised learning structure in which the first unit takes the input data and the second unit takes the correct output data. The correct output is then used to make the minimum possible adjustment of the vigilance parameter in the first unit that is needed to make the correct classification.
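The vigilance adjustment, often called match tracking, can be sketched in Python. This is a simplified illustration under several assumptions: inputs are binary, the second ART unit is replaced by plain class labels, and the function name `artmap_train` and the parameters `baseline_vigilance` and `epsilon` are hypothetical.

```python
def artmap_train(pattern, label, categories, baseline_vigilance=0.0, epsilon=1e-6):
    """Train on one labelled binary pattern and return the category index.

    `categories` is a list of [template, label] pairs, modified in place.
    """
    norm = sum(pattern)
    vigilance = baseline_vigilance

    def choice(j):
        template = categories[j][0]
        overlap = sum(p & t for p, t in zip(pattern, template))
        return overlap / (0.5 + sum(template))

    for j in sorted(range(len(categories)), key=choice, reverse=True):
        template, category_label = categories[j]
        matched = [p & t for p, t in zip(pattern, template)]
        match = sum(matched) / norm if norm else 1.0
        if match < vigilance:
            continue                    # reset: not similar enough
        if category_label == label:
            categories[j][0] = matched  # correct prediction: learn
            return j
        # Wrong prediction: raise vigilance just above this match value,
        # the smallest increase that rules the category out.
        vigilance = match + epsilon

    categories.append([list(pattern), label])
    return len(categories) - 1


categories = []
data = [([1, 1, 0, 0], "A"), ([1, 1, 1, 0], "B"),
        ([1, 1, 0, 0], "A"), ([1, 0, 1, 0], "B")]
print([artmap_train(p, y, categories) for p, y in data])  # → [0, 1, 0, 1]
```

The second pattern initially matches the category labelled "A"; because the label is wrong, vigilance is raised just enough to exclude that category and a new category labelled "B" is created.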

Fuzzy ART[7] incorporates fuzzy logic into ART's pattern recognition, thus enhancing generalizability. An optional (and very useful) feature of fuzzy ART is complement coding, a means of incorporating the absence of features into pattern classifications, which goes a long way towards preventing inefficient and unnecessary category proliferation. The applied similarity measures are based on the L1 norm. Fuzzy ART is known to be very sensitive to noise.
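A compact Python sketch of fuzzy ART with complement coding is given below. It assumes inputs already scaled to the interval [0, 1] and uses the commonly cited choice, match and learning rules based on the component-wise minimum and the L1 norm; the class name `FuzzyART` and the default values for `vigilance`, `alpha` and `beta` are assumptions made for this example.

```python
def complement_code(x):
    """Append 1 - x_i for every feature so that absent features count too."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Component-wise minimum, the fuzzy AND operator."""
    return [min(u, v) for u, v in zip(a, b)]


class FuzzyART:
    def __init__(self, vigilance=0.75, alpha=0.001, beta=1.0):
        self.vigilance = vigilance  # match threshold
        self.alpha = alpha          # small choice parameter
        self.beta = beta            # learning rate (1.0 means fast learning)
        self.weights = []           # one weight vector per category

    def train(self, x):
        """Present one input and return the index of the category it joins."""
        i = complement_code(x)

        def choice(j):
            w = self.weights[j]
            return sum(fuzzy_and(i, w)) / (self.alpha + sum(w))

        for j in sorted(range(len(self.weights)), key=choice, reverse=True):
            w = self.weights[j]
            overlap = fuzzy_and(i, w)
            # Match test: L1 norm of the overlap relative to the input.
            if sum(overlap) / sum(i) >= self.vigilance:
                self.weights[j] = [self.beta * o + (1 - self.beta) * wi
                                   for o, wi in zip(overlap, w)]
                return j
        self.weights.append(i)
        return len(self.weights) - 1


net = FuzzyART(vigilance=0.8)
points = [[0.1, 0.1], [0.15, 0.12], [0.9, 0.85]]
print([net.train(p) for p in points])  # → [0, 0, 1]
```

With complement coding, every coded input has the same L1 norm (the number of original features), so the match test is not biased towards inputs with large feature values.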

Fuzzy ARTMAP[8] is merely ARTMAP using fuzzy ART units, resulting in a corresponding increase in efficacy.

Simplified Fuzzy ARTMAP (SFAM)[9] constitutes a strongly simplified variant of fuzzy ARTMAP dedicated to classification tasks.

Gaussian ART[10] and Gaussian ARTMAP use Gaussian activation functions and computations based on probability theory. Therefore, they have some similarity with Gaussian mixture models. In comparison to fuzzy ART and fuzzy ARTMAP, they are less sensitive to noise. However, the stability of learnt representations is reduced, which may lead to category proliferation in open-ended learning tasks.

Fusion ART and related networks[11] [12] [13] extend ART and ARTMAP to multiple pattern channels. They support several learning paradigms, including unsupervised learning, supervised learning and reinforcement learning.

TopoART[14] combines fuzzy ART with topology learning networks such as the growing neural gas. Furthermore, it adds a noise reduction mechanism. There are several derived neural networks which extend TopoART to further learning paradigms.

Hypersphere ART[15] and Hypersphere ARTMAP are closely related to fuzzy ART and fuzzy ARTMAP, respectively. However, because they use a different type of category representation (namely hyperspheres), they do not require their input to be normalised to the interval [0, 1]. They apply similarity measures based on the L2 norm.

The Laterally Primed Adaptive Resonance Theory (LAPART)[16] neural networks couple two fuzzy ART algorithms to create a mechanism for making predictions based on learned associations. The coupling of the two fuzzy ART units has a unique stability that allows the system to converge rapidly towards a clear solution. Additionally, it can perform logical inference and supervised learning similar to fuzzy ARTMAP.

Criticism

It has been noted that results of Fuzzy ART and ART 1 (i.e., the learnt categories) depend critically upon the order in which the training data are processed. The effect can be reduced to some extent by using a slower learning rate, but is present regardless of the size of the input data set. Hence Fuzzy ART and ART 1 estimates do not possess the statistical property of consistency.[17] This problem can be considered as a side effect of the respective mechanisms ensuring stable learning in both networks.
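The order dependence can be reproduced with a small Python experiment. The sketch assumes a simplified ART 1-style clusterer with binary inputs and fast learning; the function names `art1_cluster` and `grouping`, the three example patterns and the vigilance value of 0.6 are all chosen for this illustration.

```python
def art1_cluster(patterns, vigilance=0.6):
    """Cluster binary patterns in the given order; return one label per pattern."""
    templates, labels = [], []
    for pattern in patterns:
        norm = sum(pattern)
        ranked = sorted(
            range(len(templates)),
            key=lambda j: sum(p & t for p, t in zip(pattern, templates[j]))
            / (0.5 + sum(templates[j])),
            reverse=True,
        )
        for j in ranked:
            matched = [p & t for p, t in zip(pattern, templates[j])]
            if norm == 0 or sum(matched) / norm >= vigilance:
                templates[j] = matched   # fast learning shrinks the template
                labels.append(j)
                break
        else:
            templates.append(list(pattern))
            labels.append(len(templates) - 1)
    return labels


def grouping(patterns, labels):
    """Express a clustering as a set of groups, independent of label numbers."""
    groups = {}
    for pattern, label in zip(patterns, labels):
        groups.setdefault(label, set()).add(tuple(pattern))
    return {frozenset(g) for g in groups.values()}


A, B, C = [1, 1, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]
first, second = [A, B, C], [C, A, B]
groups_first = grouping(first, art1_cluster(first))
groups_second = grouping(second, art1_cluster(second))
print(groups_first == groups_second)  # → False
```

Presented as A, B, C, the patterns A and B end up in the same category; presented as C, A, B, pattern A joins C instead. The same three patterns therefore yield different categories depending solely on the presentation order.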

More advanced ART networks that summarise categories to clusters, such as TopoART and Hypersphere TopoART, may solve this problem, as the shapes of the clusters do not depend on the order of creation of the associated categories (cf. Fig. 3(g, h) and Fig. 4 of [18]).

References

Wasserman, Philip D. (1989), Neural computing: theory and practice, New York: Van Nostrand Reinhold

Notes and References

  1. Carpenter, G.A. & Grossberg, S. (2003), Adaptive Resonance Theory, In Michael A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, Second Edition (pp. 87-90). Cambridge, MA: MIT Press
  2. Grossberg, S. (1987), Competitive learning: From interactive activation to adaptive resonance, Cognitive Science (journal), 11, 23-63
  3. Carpenter, G.A. & Grossberg, S. (1987), ART 2: Self-organization of stable category recognition codes for analog input patterns, Applied Optics, 26(23), 4919-4930
  4. Carpenter, G.A., Grossberg, S., & Rosen, D.B. (1991a), ART 2-A: An adaptive resonance algorithm for rapid category learning and recognition, Neural Networks, 4, 493-504
  5. Carpenter, G.A. & Grossberg, S. (1990), ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures, Neural Networks, 3, 129-152
  6. Carpenter, G.A., Grossberg, S., & Reynolds, J.H. (1991), ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network, Neural Networks, 4, 565-588
  7. Carpenter, G.A., Grossberg, S., & Rosen, D.B. (1991b), Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system, Neural Networks, 4, 759-771
  8. Carpenter, G.A., Grossberg, S., Markuzon, N., Reynolds, J.H., & Rosen, D.B. (1992), Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps, IEEE Transactions on Neural Networks, 3, 698-713
  9. Mohammad-Taghi Vakil-Baghmisheh and Nikola Pavešić. (2003) A Fast Simplified Fuzzy ARTMAP Network, Neural Processing Letters, 17(3):273–316
  10. James R. Williamson. (1996), Gaussian ARTMAP: A Neural Network for Fast Incremental Learning of Noisy Multidimensional Maps, Neural Networks, 9(5):881-897
  11. Y.R. Asfour, G.A. Carpenter, S. Grossberg, and G.W. Lesher. (1993) Fusion ARTMAP: an adaptive fuzzy network for multi-channel classification. In: Proceedings of the Third International Conference on Industrial Fuzzy Control and Intelligent Systems (IFIS).
  12. Tan, A.-H., Carpenter, G.A., & Grossberg, S. (2007), Intelligence Through Interaction: Towards a Unified Theory for Learning, In: Liu, D., Fei, S., Hou, Z.-G., Zhang, H., & Sun, C. (Eds.), Advances in Neural Networks – ISNN 2007, Lecture Notes in Computer Science, vol. 4491, Berlin, Heidelberg: Springer, 1094–1103, doi:10.1007/978-3-540-72383-7_128, ISBN 978-3-540-72383-7, https://ink.library.smu.edu.sg/sis_research/6558
  13. Tan, A.-H., Subagdja, B., Wang, D., & Meng, L. (2019), Self-organizing neural networks for universal learning and multimodal memory encoding, Neural Networks, 120, 58–73, doi:10.1016/j.neunet.2019.08.020, PMID 31537437, S2CID 202703163
  14. Marko Tscherepanow. (2010) TopoART: A Topology Learning Hierarchical ART Network, In: Proceedings of the International Conference on Artificial Neural Networks (ICANN), Part III, LNCS 6354, 157-167
  15. Georgios C. Anagnostopoulos and Michael Georgiopoulos. (2000), Hypersphere ART and ARTMAP for Unsupervised and Supervised Incremental Learning, In: Proceedings of the International Joint Conference on Neural Networks (IJCNN), vol. 6, 59-64
  16. Sandia National Laboratories (2017) Lapart-python documentation
  17. Sarle, Warren S. (1995), Why Statisticians Should Not FART
  18. Marko Tscherepanow. (2012) Incremental On-line Clustering with a Topology-Learning Hierarchical ART Neural Network Using Hyperspherical Categories, In: Poster and Industry Proceedings of the Industrial Conference on Data Mining (ICDM), 22–34