Large-file support (LFS) is the term frequently applied to the ability to create files larger than either 2 or 4 GiB on 32-bit filesystems.
Traditionally, many operating systems and their underlying file system implementations used 32-bit integers to represent file sizes and positions. Consequently, no file could be larger than 2^32 − 1 bytes (4 GiB − 1). In many implementations, the problem was exacerbated by treating the sizes as signed numbers, which further lowered the limit to 2^31 − 1 bytes (2 GiB − 1). Files that were too large for 32-bit operating systems to handle came to be known as large files.
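The two limits can be checked with a few lines of C. This is only an illustrative sketch: the helper names fits_signed32, fits_unsigned32 and check_limits are hypothetical and do not come from any particular operating system or library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a file of `size` bytes can be described by a signed 32-bit offset. */
bool fits_signed32(uint64_t size) {
    return size <= (uint64_t)INT32_MAX;
}

/* True if a file of `size` bytes can be described by an unsigned 32-bit offset. */
bool fits_unsigned32(uint64_t size) {
    return size <= (uint64_t)UINT32_MAX;
}

/* Sanity check of the limits quoted in the text; aborts if they do not hold. */
void check_limits(void) {
    assert(INT32_MAX == 2147483647);    /* 2^31 - 1, i.e. 2 GiB - 1 */
    assert(UINT32_MAX == 4294967295u);  /* 2^32 - 1, i.e. 4 GiB - 1 */
}
```

A file of exactly 2 GiB (2147483648 bytes) already fails fits_signed32, which is why the signed interpretation halves the usable range.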
While the limit was quite acceptable at a time when hard disks were smaller, the general increase in storage capacity combined with increased server and desktop file usage, especially for database and multimedia files, led to intense pressure on OS vendors to overcome the limitation.
In 1996, multiple vendors responded by forming an industry initiative known as the Large File Summit, an obvious backronym of "LFS", to support large files on POSIX (at the time, Windows NT already supported large files on NTFS). The summit was tasked with defining a standardized way to switch to 64-bit numbers to represent file sizes.
This switch caused deployment issues and required design modifications, the consequences of which can still be seen:
fseek and ftell operate on file positions of type long int, which is typically 32 bits wide on 32-bit platforms and cannot be made larger without sacrificing backward compatibility. (This was resolved by introducing the new functions fseeko and ftello in POSIX. On Windows machines, under Visual C++, the functions _fseeki64 and _ftelli64 are used.)
The use of the large-file API in 32-bit programs remained incomplete for a long time. An analysis in 2002 showed that many base libraries of operating systems were still shipped without large-file support, thereby limiting the applications that used them. The much-used zlib library did not support 64-bit large files on 32-bit platforms until 2006.
The problem slowly disappeared as PCs and workstations moved completely to 64-bit computing. Microsoft Windows Server 2008 was the last server version to be shipped in a 32-bit edition. Red Hat Enterprise Linux 7 was published in 2014 only as a 64-bit operating system. Ubuntu Linux stopped delivering a 32-bit variant in 2019. Nvidia stopped developing 32-bit drivers in 2018 and stopped delivering updates after January 2019. Apple stopped developing 32-bit Mac OS versions in 2018, delivering macOS Mojave only as a 64-bit operating system. On the desktop, the end of life of Windows 10 has been set for 2025; this is related to the last upgrades from older systems such as Windows 7 and Windows 8 in January 2020, as some of those systems ran on old computers built on the i386 architecture. Windows 11, however, has shipped only as a 64-bit operating system since its first version in 2021.
A similar development can be seen in the mobile area. Google required applications in its app store to support 64-bit by August 2019, which allows 32-bit support for Android to be discontinued later. The shift towards 64-bit started in 2014, when all new processors were designed for a 64-bit architecture and Android 5 ("Lollipop") was published, providing a matching 64-bit variant of the operating system. Apple had made the shift the year before, starting production of the 64-bit Apple A7 in 2013. Google started to deliver the development environment for Linux only in 64-bit in 2015. By May 2019, the share of Android versions below 5 had fallen to ten percent. As app developers concentrate on a single compilation variant, many manufacturers, for example Niantic, started to require Android 5 as the minimum version by mid-2019. Subsequently, the 32-bit versions were hard to get.
Except for embedded systems with their special programs, program code written after 2020 no longer needs to account for varying large-file support.
The year 2038 problem is a well-known example of another case in which a 32-bit "long" on 32-bit platforms leads to problems. Like the large-file limitation, it becomes obsolete once systems move to 64-bit only. In the meantime, a 64-bit timestamp was introduced. In the Win32 API it is visible in functions having a "64" suffix alongside the earlier "32" suffix. When large-file support was added to the Win32 API, it led to functions having an additional "i64" suffix, which sometimes makes for four combinations (_findfirst32, _findfirst64, _findfirst32i64, _findfirst64i32). By comparison, the UNIX98 API introduces functions with a "64" suffix when "_LARGEFILE64_SOURCE" is defined.
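The date behind the year 2038 problem follows from the same 2^31 − 1 limit, applied to seconds since 1 January 1970 instead of bytes. The sketch below converts that value to a calendar date in UTC; the function name last_32bit_second_utc is hypothetical, and the code only assumes a standard C library.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Return the broken-down UTC time of the last second that a signed
 * 32-bit counter of seconds since the Unix epoch can represent. */
struct tm last_32bit_second_utc(void) {
    time_t t = (time_t)INT32_MAX;  /* 2^31 - 1 seconds after the epoch */
    struct tm *utc = gmtime(&t);   /* convert to UTC calendar fields */
    assert(utc != NULL);
    return *utc;                   /* copy out of gmtime's static buffer */
}
```

The result is 19 January 2038, 03:14:07 UTC; one second later a signed 32-bit counter overflows, whereas a 64-bit timestamp continues to count.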
Related to the large-file API is a limitation on block numbers for mass storage media. With a common size of 512 bytes per data block, the barrier resulting from 32-bit numbers was reached later. When hard disk drives reached a size of 2 terabytes (around 2010), the master boot record had to be replaced by the GUID Partition Table, which uses 64-bit numbers for the LBA (logical block address). On Unix-like operating systems, this also required enlarging the inode numbers that are used in some functions (stat64, setrlimit64). The Linux kernel introduced that in 2001, leading to version 2.4, which was picked up by glibc in the same year. As large-file support and large-disk support were introduced at the same time, the GNU C Library exports 64-bit inode structures on 32-bit architectures whenever the Unix LFS API is activated in program code.
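The 2-terabyte barrier is simply the block size multiplied by the number of blocks that a 32-bit block number can address. The following sketch makes the arithmetic explicit; the function name addressable_bytes is hypothetical and not part of any storage API.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with 2^address_bits blocks of block_size bytes each.
 * Limited to address_bits <= 32 so the result always fits in 64 bits. */
uint64_t addressable_bytes(uint32_t block_size, unsigned address_bits) {
    assert(address_bits <= 32);
    return (uint64_t)block_size << address_bits;  /* block_size * 2^address_bits */
}
```

With 512-byte blocks and 32-bit block numbers the result is 2^41 bytes, or 2 TiB. With 4096-byte blocks the same calculation gives 2^44 bytes (16 TiB), or 2^43 bytes (8 TiB) if only 31 bits are usable.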
When the kernel moved to 64-bit inodes, the ext3 file system used them internally in the driver by 2001. However, the inode format on the storage media itself was stuck at 32-bit numbers. As mass storage devices moved to the Advanced Format of 4 kilobytes per block, the actual limit of that file system format is 8 or 16 terabytes. Handling larger disk partitions requires a different file system such as XFS, which was designed with 64-bit inodes from the start, allowing for exabyte files and partitions. The first 16-terabyte magnetic disk drives were delivered by mid-2019. Solid-state drives with 32 TiB for data centers were available as early as 2016, with some manufacturers forecasting 100 TiB SSDs by 2020.