Ceph (software) explained

Ceph Storage
Author: Inktank Storage (Sage Weil, Yehuda Sadeh Weinraub, Gregory Farnum, Josh Durgin, Samuel Just, Wido den Hollander)
Developer: Red Hat, Intel, CERN, Cisco, Fujitsu, SanDisk, Canonical and SUSE[1]
Latest release version:
Programming language: C++, Python[2]
Operating system: Linux, FreeBSD,[3] Windows
Genre: Distributed object store
License: LGPLv2.1[4]

Ceph is a free and open-source software-defined storage platform that provides object storage,[5] block storage, and file storage built on a common distributed cluster foundation. Ceph provides distributed operation without a single point of failure and scalability to the exabyte level. Since version 12 (Luminous), Ceph does not rely on any other conventional filesystem, directly managing HDDs and SSDs with its own storage back end, BlueStore, and can expose a POSIX filesystem.

Ceph replicates data with fault tolerance,[6] using commodity hardware and Ethernet IP and requiring no specific hardware support. Ceph is highly available and ensures strong data durability through techniques including replication, erasure coding, snapshots and clones. By design, the system is both self-healing and self-managing, minimizing administration time and other costs.

Large-scale production Ceph deployments include CERN,[7] [8] OVH[9] [10] [11] [12] and DigitalOcean.[13] [14]

Design

Ceph employs five distinct kinds of daemons:[15]

- Cluster monitors (ceph-mon), which track active and failed cluster nodes, cluster configuration, and information about data placement and global cluster state
- Object storage devices (ceph-osd), which manage bulk data storage directly on raw devices via the BlueStore back end,[16] which since the v12.x release replaces the Filestore back end[17]
- Metadata servers (ceph-mds), which maintain and broker access to inodes and directories inside a CephFS filesystem
- HTTP gateways (ceph-rgw), which expose the object storage layer as an interface compatible with Amazon S3 or OpenStack Swift APIs
- Managers (ceph-mgr), which perform cluster monitoring, bookkeeping, and maintenance tasks, and interface to external monitoring and management systems[18]

All of these are fully distributed, and may be deployed on disjoint, dedicated servers or in a converged topology. Clients with different needs directly interact with appropriate cluster components.[19]

Ceph distributes data across multiple storage devices and nodes to achieve higher throughput, in a fashion similar to RAID. Adaptive load balancing is supported whereby frequently accessed services may be replicated over more nodes.[20]
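Ceph's real placement algorithm is CRUSH (described in Weil et al.'s SC'06 paper), which computes placement deterministically instead of consulting a central lookup table. As a loose, hypothetical illustration of that property — not the CRUSH algorithm itself — the following sketch uses rendezvous (highest-random-weight) hashing to map an object name to a replica set of storage daemons; all names here are invented:

```python
import hashlib

def place_object(obj_name: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Deterministically choose `replicas` distinct OSDs for an object.

    Every client computes the same answer from the object name and the
    OSD list alone, with no central directory -- the property CRUSH
    provides in real Ceph. (This sketch is NOT the CRUSH algorithm.)
    """
    scored = sorted(
        osds,
        key=lambda osd: hashlib.sha256(f"{obj_name}/{osd}".encode()).hexdigest(),
        reverse=True,
    )
    return scored[:replicas]

# Example: place one object onto a hypothetical 8-OSD cluster.
osds = [f"osd.{i}" for i in range(8)]
primary, *secondaries = place_object("rbd_data.1234.0000000000000007", osds)
```

Because the mapping is a pure function of the object name and cluster membership, any client can locate data independently; rebalancing after adding or removing OSDs simply re-evaluates the function.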

BlueStore is the default and recommended storage back end for production environments.[21] It provides better latency and configurability than the older Filestore back end, and avoids the shortcomings of filesystem-based storage, which involves additional processing and caching layers. The Filestore back end is deprecated as of the Reef release in mid 2023. XFS was the recommended underlying filesystem for Filestore OSDs, and Btrfs could be used at one's own risk; ext4 filesystems were not recommended due to limited metadata capacity.[22] The BlueStore back end does still use XFS for a small metadata partition.[23]
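For illustration, the object-store back end for newly provisioned OSDs can be selected in ceph.conf; the option names below are real Ceph settings, while the device paths are placeholder examples:

```ini
[osd]
# BlueStore is the default on Luminous and later; shown explicitly here
osd objectstore = bluestore

# BlueStore consumes raw devices or partitions directly; the RocksDB
# metadata and write-ahead log can optionally be split onto faster media:
bluestore block db path = /dev/nvme0n1p1
bluestore block wal path = /dev/nvme0n1p2
```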

Object storage (RGW)

Ceph implements distributed object storage via the RADOS Gateway (RGW), which exposes the underlying storage layer via an interface compatible with Amazon S3 or OpenStack Swift.

Ceph RGW deployments scale readily and often utilize large and dense storage media for bulk use cases including big data (data lakes), backups and archives, IoT, media, video recording, and deployment images for virtual machines and containers.[24]

Ceph's software libraries provide client applications with direct access to the reliable autonomic distributed object store (RADOS) object-based storage system. More frequently used are libraries for Ceph's RADOS Block Device (RBD), RADOS Gateway, and Ceph File System services. In this way, administrators can maintain their storage devices within a unified system, which makes it easier to replicate and protect the data.

The "librados" software libraries provide access in C, C++, Java, PHP, and Python. The RADOS Gateway also exposes the object store as a RESTful interface which can present as both native Amazon S3 and OpenStack Swift APIs.
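Because RGW speaks the same AWS Signature Version 4 scheme as Amazon S3, standard SDKs such as boto3 or tools like s3cmd work unmodified when pointed at an RGW endpoint. As a sketch of what is actually on the wire, the following standard-library-only function builds the signed headers for a GET request; the host, bucket, credentials, and region are hypothetical placeholders (RGW deployments commonly use the zonegroup name or a fixed string as the SigV4 region):

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote

def sign_s3_get(host, bucket, key, access_key, secret_key, region="default"):
    """Build SigV4-signed headers for GET /bucket/key against an S3 endpoint."""
    now = datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    payload_hash = hashlib.sha256(b"").hexdigest()  # GET request: empty body

    # Canonical request: method, URI, query, headers, signed-header list, payload hash
    canonical_uri = quote(f"/{bucket}/{key}")
    signed_headers = "host;x-amz-content-sha256;x-amz-date"
    canonical_headers = (
        f"host:{host}\nx-amz-content-sha256:{payload_hash}\nx-amz-date:{amz_date}\n"
    )
    canonical_request = "\n".join(
        ["GET", canonical_uri, "", canonical_headers, signed_headers, payload_hash]
    )

    # String to sign, scoped to date/region/service
    scope = f"{datestamp}/{region}/s3/aws4_request"
    string_to_sign = "\n".join(
        ["AWS4-HMAC-SHA256", amz_date, scope,
         hashlib.sha256(canonical_request.encode()).hexdigest()]
    )

    # Derive the signing key by chained HMACs, then sign
    def _hmac(k, msg):
        return hmac.new(k, msg.encode(), hashlib.sha256).digest()

    k = _hmac(("AWS4" + secret_key).encode(), datestamp)
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    signature = hmac.new(k, string_to_sign.encode(), hashlib.sha256).hexdigest()

    return {
        "Host": host,
        "x-amz-date": amz_date,
        "x-amz-content-sha256": payload_hash,
        "Authorization": (
            f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}"
        ),
    }

headers = sign_s3_get("rgw.example.com", "my-bucket", "my-key",
                      "ACCESSKEYID", "secretkey")
```

In practice clients never hand-roll this; any S3-compatible library handles signing once given the RGW endpoint URL and a key pair issued by `radosgw-admin`.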

Block storage

Ceph can provide clients with thin-provisioned block devices. When an application writes data to Ceph using a block device, Ceph automatically stripes and replicates the data across the cluster. Ceph's RADOS Block Device (RBD) also integrates with Kernel-based Virtual Machines (KVMs).

Ceph block storage may be deployed on traditional HDDs and/or SSDs for use cases including databases, virtual machines, data analytics, artificial intelligence, and machine learning. Block storage clients often require high throughput and IOPS, so Ceph RBD deployments increasingly utilize SSDs with NVMe interfaces.

RBD is built on Ceph's foundational RADOS object storage system, the same layer that provides the librados interface and the CephFS file system. Since RBD is built on librados, RBD inherits librados's abilities, including clones and snapshots. By striping volumes across the cluster, Ceph improves performance for large block device images.
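This striping works by chunking an image into fixed-size RADOS objects (4 MiB by default), so a block-device byte offset maps deterministically to one object and an offset inside it. The sketch below follows the documented `rbd_data.<image id>.<object number>` naming pattern for format-2 images; the image id is a made-up example:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # default RBD object size: 4 MiB

def rbd_locate(image_id: str, byte_offset: int) -> tuple[str, int]:
    """Map a byte offset in an RBD image to (RADOS object name, offset in object)."""
    obj_no, in_obj = divmod(byte_offset, OBJECT_SIZE)
    # RBD data objects are named rbd_data.<image id>.<16-digit hex object number>
    return f"rbd_data.{image_id}.{obj_no:016x}", in_obj

# Byte 10 MiB + 123 of the image falls in object 2 (which covers 8-12 MiB),
# at 2 MiB + 123 bytes into that object:
name, off = rbd_locate("5e6a9f3b2c4d", 10 * 1024 * 1024 + 123)
```

Because each data object is placed independently by CRUSH, sequential I/O against one image fans out across many OSDs, which is where the performance benefit for large images comes from.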

"Ceph-iSCSI" is a gateway which enables access to distributed, highly available block storage from Microsoft Windows and VMware vSphere servers or clients capable of speaking the iSCSI protocol. By using ceph-iscsi on one or more iSCSI gateway hosts, Ceph RBD images become available as Logical Units (LUs) associated with iSCSI targets, which can be accessed in an optionally load-balanced, highly available fashion.

Since ceph-iscsi configuration is stored in the Ceph RADOS object store, ceph-iscsi gateway hosts are inherently stateless and thus can be replaced, augmented, or reduced at will. As a result, Ceph enables customers to run a truly distributed, highly available, resilient, and self-healing enterprise storage technology on commodity hardware and an entirely open-source platform.

The block device can be virtualized, providing block storage to virtual machines, in virtualization platforms such as OpenShift, OpenStack, Kubernetes, OpenNebula, Ganeti, Apache CloudStack and Proxmox Virtual Environment.

File storage

Ceph's file system (CephFS) runs on top of the same RADOS foundation as Ceph's object storage and block device services. The CephFS metadata server (MDS) provides a service that maps the directories and file names of the file system to objects stored within RADOS clusters. The metadata server cluster can expand or contract, and it can rebalance file system metadata ranks dynamically to distribute data evenly among cluster hosts. This ensures high performance and prevents heavy loads on specific hosts within the cluster.
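For the data path — file contents, as opposed to the directory tree handled by the MDS — CephFS stripes each file over RADOS objects named after the file's inode number, so clients can read and write file data directly against OSDs once the MDS has resolved a path to an inode. The sketch below shows that mapping under the default 4 MiB object size; the inode value is a made-up example:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # default CephFS stripe/object size: 4 MiB

def cephfs_data_object(inode: int, byte_offset: int) -> str:
    """Name of the RADOS object holding the given byte of a CephFS file.

    CephFS data objects are conventionally named "<inode hex>.<block hex>",
    which keeps the metadata server out of the data path entirely.
    """
    block = byte_offset // OBJECT_SIZE
    return f"{inode:x}.{block:08x}"

# Byte 5 MiB of inode 0x10000000000 lives in that file's second object:
obj = cephfs_data_object(0x10000000000, 5 * 1024 * 1024)  # "10000000000.00000001"
```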

Clients mount the POSIX-compatible file system using a Linux kernel client. An older FUSE-based client is also available. The servers run as regular Unix daemons.

Ceph's file storage is often used for log collection, messaging, and general file storage.

Dashboard

Since 2018 there has also been a Dashboard web UI project, which helps to manage the cluster. It is developed by the Ceph community under the LGPL-3 license and uses Ceph-mgr, Python, the Angular framework, and Grafana.[25] The landing page was refreshed in early 2023.[26]

Earlier dashboard projects were developed but have since been discontinued: Calamari (2013–2018), OpenAttic (2013–2019), VSM (2014–2016), Inkscope (2015–2016) and Ceph-Dash (2015–2017).[27]

Crimson

Beginning in 2019, the Crimson project has been reimplementing the OSD data path. The goal of Crimson is to minimize latency and CPU overhead. Modern storage devices and interfaces, including NVMe and 3D XPoint, have become much faster than HDDs and even SAS/SATA SSDs, but CPU performance has not kept pace. Moreover, crimson-osd is meant to be a backward-compatible drop-in replacement for ceph-osd. While Crimson can work with the BlueStore back end (via AlienStore), a new native ObjectStore implementation called SeaStore is also being developed, along with CyanStore for testing purposes. One reason for creating SeaStore is that transaction support in the BlueStore back end is provided by RocksDB, which needs to be re-implemented to achieve better parallelism.[28] [29] [30]

History

Ceph was created by Sage Weil for his doctoral dissertation,[31] which was advised by Professor Scott A. Brandt at the Jack Baskin School of Engineering, University of California, Santa Cruz (UCSC), and sponsored by the Advanced Simulation and Computing Program (ASC), including Los Alamos National Laboratory (LANL), Sandia National Laboratories (SNL), and Lawrence Livermore National Laboratory (LLNL).[32] The first line of code that ended up being part of Ceph was written by Sage Weil in 2004 while at a summer internship at LLNL, working on scalable filesystem metadata management (known today as Ceph's MDS).[33] In 2005, as part of a summer project initiated by Scott A. Brandt and led by Carlos Maltzahn, Sage Weil created a fully functional file system prototype which adopted the name Ceph. Ceph made its debut with Sage Weil giving two presentations in November 2006, one at USENIX OSDI 2006[34] and another at SC'06.[35]

After his graduation in autumn 2007, Weil continued to work on Ceph full-time, and the core development team expanded to include Yehuda Sadeh Weinraub and Gregory Farnum. On March 19, 2010, Linus Torvalds merged the Ceph client into Linux kernel version 2.6.34[36] [37] which was released on May 16, 2010. In 2012, Weil created Inktank Storage for professional services and support for Ceph.[38] [39]

In April 2014, Red Hat purchased Inktank, bringing the majority of Ceph development in-house to make it a production version for enterprises with support (hotline) and continuous maintenance (new versions).[40]

In October 2015, the Ceph Community Advisory Board was formed to assist the community in driving the direction of open source software-defined storage technology. The charter advisory board includes Ceph community members from global IT organizations that are committed to the Ceph project, including individuals from Red Hat, Intel, Canonical, CERN, Cisco, Fujitsu, SanDisk, and SUSE.[41]

In November 2018, the Linux Foundation launched the Ceph Foundation as a successor to the Ceph Community Advisory Board. Founding members of the Ceph Foundation included Amihan, Canonical, China Mobile, DigitalOcean, Intel, OVH, ProphetStor Data Services, Red Hat, SoftIron, SUSE, Western Digital, XSKY Data Technology, and ZTE.[42]

In March 2021, SUSE discontinued its Enterprise Storage product incorporating Ceph in favor of Rancher's Longhorn,[43] and the former Enterprise Storage website was updated stating "SUSE has refocused the storage efforts around serving our strategic SUSE Enterprise Storage Customers and are no longer actively selling SUSE Enterprise Storage."[44]

Release history

| Name | First release | End of life | Milestones |
|---|---|---|---|
| Argonaut | July 3, 2012 | | First major "stable" release |
| Bobtail | January 1, 2013 | | |
| Cuttlefish | May 7, 2013 | | ceph-deploy is stable |
| Dumpling | August 14, 2013 | May 2015 | namespace, region, monitoring REST API |
| Emperor | November 9, 2013 | May 2014 | multi-datacenter replication for RGW |
| Firefly | May 7, 2014 | April 2016 | erasure coding, cache tiering, primary affinity, key/value OSD backend (experimental), standalone RGW (experimental) |
| Giant | October 29, 2014 | April 2015 | |
| Hammer | April 7, 2015 | August 2017 | |
| Infernalis | November 6, 2015 | April 2016 | |
| Jewel | April 21, 2016 | June 2018 | Stable CephFS, experimental OSD back end named BlueStore, daemons no longer run as the root user |
| Kraken | January 20, 2017 | August 2017 | BlueStore is stable, EC for RBD pools |
| Luminous | August 29, 2017 | March 2020 | pg-upmap balancer |
| Mimic | June 1, 2018 | July 2020 | snapshots are stable, Beast is stable, official GUI (Dashboard) |
| Nautilus | March 19, 2019 | June 2021 | asynchronous replication, auto-retry of failed writes due to grown defect remapping |
| Octopus | March 23, 2020 | June 2022 | |
| Pacific | March 31, 2021[45] | June 2023 | |
| Quincy | April 19, 2022[46] | June 2024 | auto-setting of min_alloc_size for novel media |
| Reef | August 3, 2023[47] | | |
| Squid | TBA | | |

Available platforms

While primarily built for Linux, Ceph has also been partially ported to Windows. It is production-ready on Windows Server 2016 (although some commands may be unavailable due to the lack of a UNIX socket implementation), Windows Server 2019, and Windows Server 2022; testing and development can also be done on Windows 10 and Windows 11. Ceph RBD and CephFS can be used on Windows, but the OSD daemon is not supported on this platform.[48] [49] [50]

There is also a FreeBSD port of Ceph.[3]

Etymology

The name "Ceph" is a shortened form of "cephalopod", a class of molluscs that includes squids, cuttlefish, nautiloids, and octopuses. The name (emphasized by the logo) suggests the highly parallel behavior of an octopus and was chosen to associate the file system with "Sammy", the banana slug mascot of UCSC.[15] Both cephalopods and banana slugs are molluscs.

Notes and References

  1. Web site: Ceph Community Forms Advisory Board . 2015-10-28 . 2016-01-20 . https://web.archive.org/web/20190129064135/https://www.storagereview.com/ceph_community_forms_advisory_board . 2019-01-29 . dead .
  2. Web site: GitHub Repository. GitHub.
  3. Web site: FreeBSD Quarterly Status Report.
  4. Web site: LGPL2.1 license file in the Ceph sources . . 2014-10-24 . 2014-10-24.
  5. News: Nicolas. Philippe. 2016-07-15. The History Boys: Object storage ... from the beginning. en. The Register.
  6. Web site: 2007-11-15 . Jeremy Andrews . Ceph Distributed Network File System . . 2007-11-15 . https://web.archive.org/web/20071117102035/http://kerneltrap.org/Linux/Ceph_Distributed_Network_File_System . 2007-11-17 . dead.
  7. Web site: Ceph Clusters . CERN . 12 November 2022.
  8. Web site: Ceph Operations at CERN: Where Do We Go From Here? - Dan van der Ster & Teo Mouratidis, CERN . YouTube . 24 May 2019 . 12 November 2022.
  9. Web site: Dorosz . Filip . Journey to next-gen Ceph storage at OVHcloud with LXD . OVHcloud . 15 June 2020 . 12 November 2022.
  10. Web site: CephFS distributed filesystem . OVHcloud . 12 November 2022.
  11. Web site: Ceph - Distributed Storage System in OVH [en] - Bartłomiej Święcki . YouTube . 7 April 2016 . 12 November 2022.
  12. Web site: 200 Clusters vs 1 Admin - Bartosz Rabiega, OVH . YouTube . 24 May 2019 . 15 November 2022.
  13. Web site: D'Atri . Anthony . Why We Chose Ceph to Build Block Storage . DigitalOcean . 31 May 2018 . 12 November 2022.
  14. Web site: Ceph Tech Talk: Ceph at DigitalOcean . YouTube . 7 October 2021 . 12 November 2022.
  15. Web site: Ceph: A Linux petabyte-scale distributed file system . 2010-06-04 . 2014-12-03 . M. Tim Jones . .
  16. Web site: BlueStore . 2017-09-29 . Ceph.
  17. Web site: BlueStore Migration. 2020-04-12. 2019-12-04. https://web.archive.org/web/20191204094405/https://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/. dead.
  18. Web site: Ceph Manager Daemon — Ceph Documentation. https://web.archive.org/web/20180606153846/http://docs.ceph.com/docs/mimic/mgr/. dead. June 6, 2018. docs.ceph.com. 2019-01-31.
  19. Web site: 2007-11-14 . Jake Edge . The Ceph filesystem . LWN.net.
  20. Web site: 2017-10-01. Anthony D'Atri, Vaibhav Bhembre. Learning Ceph, Second Edition. Packt.
  21. Web site: 2017-08-29 . Sage Weil . v12.2.0 Luminous Released. Ceph Blog.
  22. Web site: Hard Disk and File System Recommendations. ceph.com. 2017-06-26. https://web.archive.org/web/20170714142019/http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/. 2017-07-14. dead.
  23. Web site: BlueStore Config Reference. April 12, 2020. July 20, 2019. https://web.archive.org/web/20190720185522/http://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/. dead.
  24. Web site: 2023-07-03 . 10th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2023) . 2023-08-09 . JINR (Indico).
  25. Web site: Ceph Dashboard . Ceph documentation . 11 April 2023.
  26. News: Gomez . Pedro Gonzalez . Introducing the new Dashboard Landing Page . 23 February 2023 . 11 April 2023.
  27. Web site: Operating Ceph from the Ceph Dashboard: past, present and future . YouTube . 22 November 2022 . 11 April 2023.
  28. Web site: Just . Sam . Crimson: evolving Ceph for high performance NVMe . Red Hat Emerging Technologies . 18 January 2021 . 12 November 2022.
  29. Web site: Just . Samuel . What's new with Crimson and Seastore? . YouTube . 10 November 2022 . 12 November 2022.
  30. Web site: Crimson: Next-generation Ceph OSD for Multi-core Scalability . Ceph blog . Ceph . 7 February 2023 . 11 April 2023.
  31. Web site: 2007-12-01 . Sage Weil . Ceph: Reliable, Scalable, and High-Performance Distributed Storage . . 2017-03-11 . 2017-07-06 . https://web.archive.org/web/20170706201040/https://ceph.com/wp-content/uploads/2016/08/weil-thesis.pdf . dead .
  32. Web site: The ASCI/DOD Scalable I/O History and Strategy . Gary Grider . 2004-05-01 . University of Minnesota . 2019-07-17 . en-US.
  33. Dynamic Metadata Management for Petabyte-Scale File Systems, SA Weil, KT Pollack, SA Brandt, EL Miller, Proc. SC'04, Pittsburgh, PA, November, 2004
  34. "Ceph: A scalable, high-performance distributed file system," SA Weil, SA Brandt, EL Miller, DDE Long, C Maltzahn, Proc. OSDI, Seattle, WA, November, 2006
  35. "CRUSH: Controlled, scalable, decentralized placement of replicated data," SA Weil, SA Brandt, EL Miller, DDE Long, C Maltzahn, SC'06, Tampa, FL, November, 2006
  36. Web site: 2010-02-19 . Sage Weil . Client merged for 2.6.34 . ceph.newdream.net . 2010-03-21 . 2010-03-23 . https://web.archive.org/web/20100323004234/http://ceph.newdream.net/2010/03/client-merged-for-2-6-34/ . dead .
  37. Web site: 2010-05-20 . Tim Stephens . New version of Linux OS includes Ceph file system developed at UCSC . news.ucsc.edu.
  38. Web site: 2012-05-03 . Bryan Bogensberger . And It All Comes Together . Inktank Blog . 2012-07-10 . https://web.archive.org/web/20120719100928/http://www.inktank.com/uncategorized/and-it-all-comes-together-2/ . 2012-07-19 . dead.
  39. News: The 10 Coolest Storage Startups Of 2012 (So Far) . Joseph F. Kovar . CRN . July 10, 2012 . July 19, 2013.
  40. Web site: 2014-04-30 . 2014-08-19 . Red Hat Inc . Red Hat to Acquire Inktank, Provider of Ceph . Red Hat.
  41. Web site: Ceph Community Forms Advisory Board. 2015-10-28. 2016-01-20. https://web.archive.org/web/20190129064135/https://www.storagereview.com/ceph_community_forms_advisory_board. 2019-01-29. dead.
  42. Web site: The Linux Foundation Launches Ceph Foundation To Advance Open Source Storage . 2018-11-12 .
  43. Web site: SUSE says tschüss to Ceph-based enterprise storage product – it's Rancher's Longhorn from here on out.
  44. Web site: SUSE Enterprise Software-Defined Storage.
  45. https://ceph.io/releases/v16-2-0-pacific-released/ Ceph.io — v16.2.0 Pacific released
  46. https://ceph.com/en/news/blog/2022/v17-2-0-quincy-released/ Ceph.io — v17.2.0 Quincy released
  47. Web site: Flores . Laura . v18.2.0 Reef released . Ceph Blog . 6 August 2023 . 26 August 2023.
  48. Web site: Ceph for Windows . Cloudbase Solutions . 2 July 2023.
  49. Web site: Installing Ceph on Windows . Ceph . 2 July 2023.
  50. Web site: Pilotti . Alessandro . Ceph on Windows . YouTube . 2 July 2023.