Region Based Convolutional Neural Networks Explained

Region-based Convolutional Neural Networks (R-CNN) are a family of machine learning models for computer vision, specifically object detection.

History

The original goal of R-CNN was to take an input image and produce a set of bounding boxes as output, where each bounding box encloses an object and is labeled with that object's category (e.g. car or pedestrian). More recently, R-CNN has been extended to perform other computer vision tasks, and several versions of R-CNN have been developed.
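The output described above, a set of bounding boxes that each carry a category, can be pictured as a small data structure. The following is a minimal sketch, not the representation used by any particular R-CNN implementation. The names `Detection`, `filter_by_category`, and `example_detections`, the corner-coordinate box layout, the `score` field, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One detected object: a bounding box plus a category label."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    category: str  # e.g. "car" or "pedestrian"
    score: float   # assumed confidence in [0, 1]; not mentioned in the article

    def area(self) -> float:
        """Box area in square pixels (0 for a degenerate box)."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def filter_by_category(detections: List[Detection], category: str) -> List[Detection]:
    """Keep only the detections labeled with the given category."""
    return [d for d in detections if d.category == category]


# Hypothetical detector output for a single street-scene image.
example_detections = [
    Detection(10.0, 20.0, 110.0, 80.0, "car", 0.92),
    Detection(130.0, 30.0, 160.0, 120.0, "pedestrian", 0.81),
    Detection(200.0, 25.0, 290.0, 90.0, "car", 0.67),
]

cars = filter_by_category(example_detections, "car")
```

Here `cars` holds the two detections labeled "car". A real detector would produce such a list per image, and later versions that perform other tasks would attach extra per-object fields to each entry.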

Applications

Region-based convolutional neural networks have been used for tracking objects from a drone-mounted camera,[5] locating text in an image,[6] and enabling object detection in Google Lens.[7] Mask R-CNN serves as one of seven tasks in the MLPerf Training Benchmark, which is a competition to speed up the training of neural networks.[8]

References

  1. Bhatia, Richa (September 10, 2018). "What is region of interest pooling?". Analytics India. Retrieved March 12, 2020.
  2. Farooq, Umer (February 15, 2018). "From R-CNN to Mask R-CNN". Medium. Retrieved March 12, 2020.
  3. Weng, Lilian (December 31, 2017). "Object Detection for Dummies Part 3: R-CNN Family". Lil'Log. Retrieved March 12, 2020.
  4. Wiggers, Kyle (October 29, 2019). "Facebook highlights AI that converts 2D objects into 3D shapes". VentureBeat. Retrieved March 12, 2020.
  5. Nene, Vidi (August 2, 2019). "Deep Learning-Based Real-Time Multiple-Object Detection and Tracking via Drone". Drone Below. Retrieved March 28, 2020.
  6. Ray, Tiernan (September 11, 2018). "Facebook pumps up character recognition to mine memes". Retrieved March 28, 2020.
  7. Sagar, Ram (September 9, 2019). "These machine learning methods make google lens a success". Analytics India. Retrieved March 28, 2020.
  8. Mattson, Peter; et al. (2019). "MLPerf Training Benchmark". arXiv:1910.01500v3 [math.LG].