In information technology, an information repository or simply a repository is "a central place in which an aggregation of data is kept and maintained in an organized way, usually in computer storage."[1] It "may be just the aggregation of data itself into some accessible place of storage or it may also imply some ability to selectively extract data."
The concept of a universal digital library was described as "within reach" by a 2012 European Union Copyright Directive,[2] which discussed Google's attempts to "mass-digitize" what are termed "orphan works" (i.e. out-of-print copyrighted works).
Both the U.S. Copyright Office and European Union copyright law have addressed this issue. Google has reached agreements in France under an arrangement that "lets the publisher choose which works can be scanned or sold." By contrast, in the USA Google has sought a "free to digitize and sell any works unless the copyright holders opted out" deal, so far without success.[3]
Attempts to develop what was called an information repository have been underway for decades:
A federated information repository is intended as a straightforward way to deploy a secondary tier of data storage that can comprise multiple networked data storage technologies running on diverse operating systems. Data that no longer needs to be in primary storage is protected, classified according to captured metadata, processed, de-duplicated, and then purged automatically, based on data service level objectives and requirements. In federated information repositories, data storage resources are virtualized as composite storage sets and operate as a federated environment.[7]
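The lifecycle described above (classify by metadata, de-duplicate, then purge) can be illustrated with a minimal sketch. It is not taken from any particular product: the `SecondaryStore` class, its method names, the use of SHA-256 content hashes for de-duplication, and the age-based purge rule are all assumptions chosen for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SecondaryStore:
    """Hypothetical secondary-tier store that keeps one copy per unique content."""

    blobs: dict = field(default_factory=dict)    # content digest -> bytes
    catalog: dict = field(default_factory=dict)  # file name -> metadata record

    def ingest(self, name: str, data: bytes, age_days: int) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # De-duplicate: identical content is stored only once.
        self.blobs.setdefault(digest, data)
        # Classify: record captured metadata next to the content reference.
        self.catalog[name] = {
            "digest": digest,
            "size": len(data),
            "age_days": age_days,
        }
        return digest

    def purge(self, max_age_days: int) -> list:
        """Drop catalog entries older than the limit, then unreferenced blobs."""
        expired = sorted(
            name for name, meta in self.catalog.items()
            if meta["age_days"] > max_age_days
        )
        for name in expired:
            del self.catalog[name]
        live = {meta["digest"] for meta in self.catalog.values()}
        for digest in list(self.blobs):
            if digest not in live:
                del self.blobs[digest]
        return expired


store = SecondaryStore()
store.ingest("report_v1.txt", b"quarterly figures", age_days=400)
store.ingest("report_copy.txt", b"quarterly figures", age_days=30)
store.ingest("notes.txt", b"meeting notes", age_days=10)
unique_blobs_before = len(store.blobs)   # two files share one blob
purged = store.purge(max_age_days=365)   # only the 400-day-old entry expires
```

In this sketch the shared content survives the purge because a newer catalog entry still references it, which is the usual reason for separating the catalog from the stored content.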
Federated information repositories were developed to mitigate problems arising from data proliferation and to eliminate the need for separately deployed data storage solutions, a need that arises when diverse storage technologies running diverse operating systems are deployed concurrently. They feature centralized management for all deployed data storage resources. They are self-contained, support heterogeneous storage resources, support resource management to add, maintain, recycle, and terminate media, track off-line media, and operate autonomously.
Since one of the main reasons for the implementation of a federated information repository is to reduce the maintenance workload placed on IT staff by traditional data storage systems, federated information repositories are automated. Automation is accomplished via policies that can process data based on time, events, data age, and data content. Policies manage the following:
Data is processed according to media type, storage pool, and storage technology.
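Policy-driven automation of the kind described above can be sketched as an ordered list of rules evaluated against each stored item. The `Item` and `Policy` types, the rule names, the thresholds, and the first-match-wins ordering are hypothetical choices for illustration, not a description of any specific repository product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Item:
    name: str
    age_days: int
    content: str
    media_type: str  # e.g. "disk" or "tape"


@dataclass
class Policy:
    name: str
    condition: Callable[[Item], bool]
    action: str


# Rules are checked in order; the first matching rule decides the action.
# Thresholds below are illustrative assumptions only.
POLICIES = [
    # Content-based rule: keep anything that mentions a contract.
    Policy("retain-legal", lambda i: "contract" in i.content.lower(), "retain"),
    # Age-based rule: purge data older than roughly seven years.
    Policy("purge-expired", lambda i: i.age_days > 2555, "purge"),
    # Age and media-type rule: move year-old disk data to cheaper media.
    Policy(
        "move-to-tape",
        lambda i: i.age_days > 365 and i.media_type == "disk",
        "migrate",
    ),
]


def decide(item: Item, policies=POLICIES) -> str:
    """Return the action of the first policy whose condition matches."""
    for policy in policies:
        if policy.condition(item):
            return policy.action
    return "keep"


actions = [
    decide(Item("a.doc", 3000, "Signed contract", "disk")),
    decide(Item("b.log", 3000, "old log", "tape")),
    decide(Item("c.log", 400, "recent log", "disk")),
    decide(Item("d.log", 5, "new log", "disk")),
]
```

Placing the retention rule first means that a content match overrides the age-based purge, which is one simple way to express precedence between policies.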
Because federated information repositories are intended to reduce IT staff workload, they are designed to be easy to deploy and offer configuration flexibility, virtually limitless extensibility, redundancy, and reliable failover.
Federated information repositories feature robust, client-based data search and recovery capabilities. Subject to their permissions, end users can search the information repository, view its contents (including data on off-line media), and recover individual or multiple files to either their original network computer or another network computer.
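Permission-filtered search and recovery can be sketched as follows. The `Entry` record, the group-based permission model, and the `search` and `recover` helpers are hypothetical and only illustrate the idea that users see and restore only what they are permitted to, including entries held on off-line media.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    path: str
    owner_group: str
    media: str  # "online" or "offline"


def search(catalog, term, user_groups):
    """Return entries matching the term that the user's groups may see."""
    return [
        entry for entry in catalog
        if term.lower() in entry.path.lower()
        and entry.owner_group in user_groups
    ]


def recover(entry, destination_host):
    """List the steps needed to restore an entry to the chosen computer."""
    steps = []
    if entry.media == "offline":
        # Off-line media has to be made available before data can be copied.
        steps.append(f"request mount of off-line media for {entry.path}")
    steps.append(f"copy {entry.path} to {destination_host}")
    return steps


catalog = [
    Entry("/finance/budget_2019.xls", "finance", "offline"),
    Entry("/finance/budget_2024.xls", "finance", "online"),
    Entry("/hr/budget_training.doc", "hr", "online"),
]

hits = search(catalog, "budget", user_groups={"finance"})
plan = recover(hits[0], destination_host="workstation-17")
```

The destination is a parameter rather than a property of the entry, reflecting that files may be restored either to their original computer or to a different one.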