Chainer | |
Author: | Seiya Tokui |
Developer: | Community, Preferred Networks, Inc. |
Released: | June 2015[1] [2] |
Programming Language: | Python |
Platform: | cross-platform |
Language: | Python |
Genre: | Deep learning library |
License: | MIT |
Chainer is an open-source deep learning framework written purely in Python on top of the NumPy and CuPy libraries. Development is led by the Japanese venture company Preferred Networks in partnership with IBM, Intel, Microsoft, and Nvidia.[3] [4] [5] [6]
Chainer is notable for its early adoption of the "define-by-run" scheme, as well as for its performance on large-scale systems. The first version was released in June 2015, and it has since gained wide popularity in Japan. In 2017, KDnuggets listed it among the top 10 open-source machine learning Python projects.[7]
In December 2019, Preferred Networks announced that it would move its development effort from Chainer to PyTorch and that, after releasing v7, it would provide only maintenance patches for Chainer.[8]
Chainer was the first deep learning framework to introduce the define-by-run approach.[9] [10] Traditionally, training a network involved two phases: first define the fixed connections between mathematical operations (such as matrix multiplication and nonlinear activations) in the network, and then run the actual training calculation. This is called the define-and-run or static-graph approach. Theano and TensorFlow are among the notable frameworks that took this approach. In contrast, in the define-by-run or dynamic-graph approach, the connections in a network are not fixed when training starts. Instead, the network is determined during training, as the actual calculation is performed.
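The difference between the two phases can be illustrated with a minimal sketch in plain Python. It does not use Chainer, Theano, or TensorFlow; the `Node`, `placeholder`, `run`, and `forward` names are hypothetical helpers invented for this illustration. The first half builds a symbolic graph and evaluates it later, while the second half simply executes the computation as it is written. The recording of the graph for backpropagation, which a real define-by-run framework performs during execution, is omitted here.

```python
# Define-and-run: describe the computation first, execute it later.
class Node:
    """A symbolic node in a static graph (hypothetical, not a real framework class)."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # one of "input", "mul", "add"
        self.inputs = inputs  # child nodes this operation consumes
        self.name = name      # only used by "input" nodes

    def __mul__(self, other):
        return Node("mul", (self, other))

    def __add__(self, other):
        return Node("add", (self, other))


def placeholder(name):
    """Create a named input whose value is supplied only at run time."""
    return Node("input", name=name)


def run(node, feed):
    """Phase 2: evaluate a previously defined graph with concrete values."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(child, feed) for child in node.inputs)
    return left * right if node.op == "mul" else left + right


# Phase 1: define the fixed graph y = w * x + b. No numbers are involved yet.
x, w, b = placeholder("x"), placeholder("w"), placeholder("b")
static_y = w * x + b

# Phase 2: run the same fixed graph on different data.
static_results = [run(static_y, {"x": v, "w": 2.0, "b": 1.0}) for v in (0.0, 3.0)]


# Define-by-run: writing the computation and executing it are the same step.
def forward(x, w, b):
    return w * x + b  # evaluated immediately on concrete values


dynamic_results = [forward(v, 2.0, 1.0) for v in (0.0, 3.0)]
```

Both halves compute the same values. The difference is that `static_y` exists as a data structure before any number is supplied, whereas `forward` has no separate definition step.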
One of the advantages of this approach is that it is intuitive and flexible.[11] If the network has complicated control flow such as conditionals and loops, the define-and-run approach needs specially designed operations for these constructs. In the define-by-run approach, by contrast, the programming language's native constructs, such as if statements and for loops, can be used to describe the flow. This flexibility is especially useful for implementing recurrent neural networks.[12] [13]
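As a rough illustration of how native control flow shapes the network, the following sketch implements a tiny define-by-run automatic differentiation scheme in plain Python. It is not Chainer's implementation or API: the `Var` class and the `forward` function are hypothetical and support only scalar addition and multiplication. Each operation records its inputs as it executes, so the graph that `backward` walks is whatever the `for` loop and `if` statement happened to produce on that particular call.

```python
class Var:
    """Scalar value that records how it was computed (hypothetical, not chainer.Variable)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        """Propagate gradients through the graph recorded during the forward pass."""
        order, seen = [], set()

        def visit(var):
            # Post-order traversal: parents are appended before the nodes using them.
            if id(var) not in seen:
                seen.add(id(var))
                for parent, _ in var.parents:
                    visit(parent)
                order.append(var)

        visit(self)
        self.grad = 1.0
        for var in reversed(order):  # output first, inputs last
            for parent, local_grad in var.parents:
                parent.grad += local_grad * var.grad


def forward(x, w, steps):
    """The number of recorded operations depends on `steps` and on runtime values."""
    h = x
    for _ in range(steps):      # native loop: graph depth is chosen per call
        h = h * w
        if h.value > 10.0:      # native conditional on a value computed at run time
            h = h * Var(0.5)
    return h


x, w = Var(3.0), Var(2.0)
y = forward(x, w, steps=3)
y.backward()
```

With these inputs the conditional branch is taken on the second and third iterations but not the first, so the recorded graph corresponds to y = 0.25 · x · w³. Calling `forward` with a different `steps` value or different inputs records a different graph without any change to the model code.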
Another advantage is ease of debugging. In the define-and-run approach, if an error (such as a numeric error) occurs during the training calculation, it is often difficult to locate the fault, because the code that defines the network is separated from the place where the error actually occurs. In the define-by-run approach, the calculation can simply be suspended with the language's built-in debugger, and the data flowing through the network code can be inspected directly.
Define-by-run has gained popularity since its introduction by Chainer and is now implemented in many other frameworks, including PyTorch[14] and TensorFlow.
Chainer has four extension libraries: ChainerMN, ChainerRL, ChainerCV, and ChainerUI. ChainerMN enables Chainer to be used on multiple GPUs, with performance significantly faster than other deep learning frameworks. A supercomputer running Chainer on 1024 GPUs processed 90 epochs of the ImageNet dataset on a ResNet-50 network in 15 minutes, four times faster than the previous record, held by Facebook.[15] [16] ChainerRL adds state-of-the-art deep reinforcement learning algorithms, and ChainerUI is a management and visualization tool.
Chainer is used as the framework for PaintsChainer, a service that automatically colorizes black-and-white, line-only draft drawings with minimal user input.[17] [18]