AWS Glue explained

AWS Glue
Developer: Amazon.com
Released: [1]
Operating System: Cross-platform
Language: English

AWS Glue is a serverless extract, transform, and load (ETL) service provided by Amazon as part of Amazon Web Services. It was introduced in August 2017.[2]

The primary purpose of Glue is to scan other services[3] in the same Virtual Private Cloud (or an equivalent accessible network element, even if not provided by AWS), particularly S3. Jobs are billed according to compute time, with a minimum billing duration of 1 minute.[4] Glue discovers the source data and stores the associated metadata (e.g. a table's schema: field names, types, and lengths) in the AWS Glue Data Catalog, which is then accessible via the AWS console or APIs.[5]
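
The billing rule described above can be illustrated with a minimal sketch. It assumes that runtime is rounded up to whole seconds and that cost scales linearly with the allocated capacity and an hourly rate; the function names, the capacity parameter, and the rate are hypothetical placeholders for illustration, not values taken from AWS documentation.

```python
import math

# 1-minute minimum billing duration, as stated in the text above.
MIN_BILLED_SECONDS = 60


def billed_seconds(runtime_seconds: float) -> int:
    """Return the billable duration in whole seconds.

    Assumption: partial seconds are rounded up, and the result is
    never lower than the 1-minute minimum.
    """
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    return max(MIN_BILLED_SECONDS, math.ceil(runtime_seconds))


def estimate_job_cost(
    runtime_seconds: float,
    capacity_units: float,
    rate_per_unit_hour: float,
) -> float:
    """Estimate the cost of one job run.

    capacity_units and rate_per_unit_hour are hypothetical inputs:
    look up the actual unit and price for your region before use.
    """
    billed_hours = billed_seconds(runtime_seconds) / 3600
    return capacity_units * billed_hours * rate_per_unit_hour
```

Under these assumptions, a job that finishes in 12 seconds is still billed as 60 seconds, while a job that runs for 90.2 seconds is billed as 91 seconds.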

Languages supported

Scala and Python are officially supported.[6]

Catalog interrogation via API

The catalog can be read in the AWS console (via a browser) and via the API, which is divided into topics.[7]
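
As an illustration of reading the catalog programmatically, the following sketch lists the tables of one database along with their column names and types. It assumes a client object that behaves like the Glue client of the boto3 SDK (a `get_tables` paginator whose pages contain a `TableList`); the helper name `iter_table_schemas` and the database name used in the usage note are hypothetical.

```python
from typing import Any, Dict, Iterator


def iter_table_schemas(glue_client: Any, database_name: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per catalog table: its name and (column, type) pairs.

    glue_client is assumed to behave like boto3.client("glue").
    """
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page.get("TableList", []):
            # Some tables may lack a storage descriptor; treat them as having no columns.
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            yield {
                "table": table["Name"],
                "columns": [(col["Name"], col.get("Type", "")) for col in columns],
            }
```

With boto3 installed and AWS credentials configured, this could be called as `iter_table_schemas(boto3.client("glue"), "example_db")`, where `example_db` stands in for a real database name. Passing the client in as a parameter keeps the helper easy to exercise with a stub object instead of a live AWS account.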

Notes and References

  1. Web site: Introducing AWS Glue: A Simple, Flexible, and Cost-Effective Extract, Transfer, and Load (ETL) Service.
  2. Web site: AWS Services List. ParkMyCloud. October 6, 2020.
  3. Web site: AWS Glue: crawlers and use cases. January 5, 2022. July 13, 2022.
  4. Web site: AWS Glue version 2.0 featuring 10x faster job start times and 1-minute minimum billing duration. AWS. August 10, 2020. October 6, 2020.
  5. Web site: AWS Glue API Documentation. AWS. October 6, 2020.
  6. Web site: AWS Glue Now Supports Scala in Addition to Python. AWS. January 12, 2018. October 6, 2020.
  7. Web site: Catalog API. AWS. October 8, 2020.