Information repository explained

In information technology, an information repository or simply a repository is "a central place in which an aggregation of data is kept and maintained in an organized way, usually in computer storage."^[1] It "may be just the aggregation of data itself into some accessible place of storage or it may also imply some ability to selectively extract data."

Universal digital library

The concept of a universal digital library was described as "within reach" by a 2012 European Union Copyright Directive^[2] which discussed Google's attempts to "mass-digitize" what are termed "orphan works" (i.e. out-of-print copyrighted works).

Both the U.S. Copyright Office and European Union copyright law have addressed the issue. Google has reached agreements in France under which it "lets the publisher choose which works can be scanned or sold." By contrast, in the United States Google has sought a deal under which it would be "free to digitize and sell any works unless the copyright holders opted out," so far without success.^[3]

Information repository

Attempts to develop what was called an information repository have been underway for decades:

In 1989, IBM tried to have OfficeVision combine mainframes and PCs to enable "an information repository."^[4]
In 1996, a library marking its 100th year obtained additional funding to expand its mission and become a major "local resource center and regional information repository."^[6] The New York Times described it as "the second largest in the New York City region, second only to the New York Public Library on Fifth Avenue." Its services include "a computer information center devoted to outside-item requests."
In 2003, Microsoft introduced OneNote as an extension to Microsoft Office 2003; it would support "a personal information repository."^[5]

Federated information repository

A federated information repository is a secondary tier of data storage that can comprise multiple, networked data storage technologies running on diverse operating systems. Data that no longer needs to be in primary storage is protected there, classified according to captured metadata, processed, de-duplicated, and then purged automatically, based on data service level objectives and requirements. In federated information repositories, data storage resources are virtualized as composite storage sets and operate as a federated environment.^[7]
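The classify, de-duplicate, and purge steps described above can be illustrated with a minimal Python sketch. It is not taken from any particular product: the names StoredItem, RETENTION_DAYS, classify, deduplicate, purge, and process are hypothetical, the classification rule (file extension) is a stand-in for richer captured metadata, and the retention periods stand in for real service level objectives.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredItem:
    name: str
    content: bytes
    age_days: int          # captured metadata: age of the data
    data_class: str = ""   # filled in by classify()


# Hypothetical service level objectives: retention period per data class.
RETENTION_DAYS = {"document": 365, "log": 30}


def classify(item: StoredItem) -> StoredItem:
    """Assign a data class from captured metadata (here, only the file extension)."""
    item.data_class = "log" if item.name.endswith(".log") else "document"
    return item


def deduplicate(items):
    """Keep the first copy of each distinct content, identified by SHA-256 digest."""
    seen = set()
    unique = []
    for item in items:
        digest = hashlib.sha256(item.content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(item)
    return unique


def purge(items):
    """Drop items older than the retention period of their data class."""
    return [i for i in items if i.age_days <= RETENTION_DAYS[i.data_class]]


def process(items):
    """Run the secondary-tier steps in order: classify, de-duplicate, purge."""
    return purge(deduplicate([classify(i) for i in items]))
```

In this sketch, two items with identical content collapse to the first one seen, and a log item older than 30 days is dropped by the purge step.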

Federated information repositories were developed to mitigate problems arising from data proliferation and to eliminate the need for separately deployed data storage solutions, a need that arises when diverse storage technologies running diverse operating systems are deployed concurrently. They feature centralized management for all deployed data storage resources. They are self-contained, support heterogeneous storage resources, provide resource management to add, maintain, recycle, and terminate media, keep track of off-line media, and operate autonomously.
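Centralized management of media across heterogeneous storage resources can be pictured as a single catalog that records each medium's technology and lifecycle state. The following Python sketch is illustrative only: MediaState, ALLOWED, and MediaCatalog are hypothetical names, and the set of permitted transitions is an assumption rather than a description of any specific system.

```python
from enum import Enum


class MediaState(Enum):
    ACTIVE = "active"
    OFFLINE = "offline"
    RECYCLED = "recycled"
    TERMINATED = "terminated"


# Assumed lifecycle rules: which state changes the catalog permits.
ALLOWED = {
    MediaState.ACTIVE: {MediaState.OFFLINE, MediaState.RECYCLED, MediaState.TERMINATED},
    MediaState.OFFLINE: {MediaState.ACTIVE, MediaState.TERMINATED},
    MediaState.RECYCLED: {MediaState.ACTIVE, MediaState.TERMINATED},
    MediaState.TERMINATED: set(),  # terminated media cannot be reused
}


class MediaCatalog:
    """One central record of all media, regardless of storage technology."""

    def __init__(self):
        self._media = {}  # media id -> (technology, state)

    def add(self, media_id, technology):
        if media_id in self._media:
            raise ValueError(f"{media_id} is already registered")
        self._media[media_id] = (technology, MediaState.ACTIVE)

    def transition(self, media_id, new_state):
        technology, state = self._media[media_id]
        if new_state not in ALLOWED[state]:
            raise ValueError(f"{media_id}: {state.value} -> {new_state.value} not allowed")
        self._media[media_id] = (technology, new_state)

    def offline_media(self):
        """List the ids of media currently tracked as off-line."""
        return sorted(m for m, (_, s) in self._media.items() if s is MediaState.OFFLINE)
```

Keeping the transition table in one place means the same add, recycle, and terminate rules apply to tape, disk, or any other technology registered in the catalog.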

Automated data management

Since one of the main reasons for the implementation of a federated information repository is to reduce the maintenance workload placed on IT staff by traditional data storage systems, federated information repositories are automated. Automation is accomplished via policies that can process data based on time, events, data age, and data content. Policies manage the following:

File system space management
Irrelevant data elimination (mp3, games, etc.)
Secondary storage resource management

Data is processed according to media type, storage pool, and storage technology.
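Policy-driven automation of this kind can be sketched as a list of rules that are evaluated against each file record. The Python example below is a simplified illustration: FileRecord, Policy, POLICIES, and evaluate are hypothetical names, the 180-day threshold is an arbitrary assumption, and the actions are plain labels rather than real storage operations.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FileRecord:
    path: str
    age_days: int
    size_bytes: int


@dataclass
class Policy:
    name: str
    matches: Callable[[FileRecord], bool]
    action: str  # a label only; a real system would carry out the action


# Hypothetical policies, checked in order; the first match decides.
POLICIES = [
    # Content-based rule: eliminate data considered irrelevant (e.g. mp3 files).
    Policy("irrelevant-data", lambda f: f.path.lower().endswith(".mp3"), "delete"),
    # Age-based rule: move data older than an assumed 180 days to secondary storage.
    Policy("aged-data", lambda f: f.age_days > 180, "move-to-secondary"),
]


def evaluate(record: FileRecord, policies=POLICIES) -> str:
    """Return the action of the first matching policy, or "keep" if none match."""
    for policy in policies:
        if policy.matches(record):
            return policy.action
    return "keep"


print(evaluate(FileRecord("song.mp3", 3, 4_000_000)))   # delete
print(evaluate(FileRecord("report.doc", 200, 50_000)))  # move-to-secondary
print(evaluate(FileRecord("report.doc", 5, 50_000)))    # keep
```

Time-based and event-based triggers would decide when evaluate is called; they are left out here to keep the sketch short.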

Because federated information repositories are intended to reduce IT staff workload, they are designed to be easy to deploy and offer configuration flexibility, virtually limitless extensibility, redundancy, and reliable failover.

Data recovery

Federated information repositories feature client-based data search and recovery capabilities. Subject to permissions, end users can search the information repository, view its contents, including data on off-line media, and recover individual or multiple files to either their original network computer or another network computer.
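Permission-based search and recovery can be illustrated with a short Python sketch. All names here (Entry, Repository, search, recover) are hypothetical, the permission model is reduced to simple ownership, and a dictionary stands in for the file system of the target network computer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    path: str
    owner: str
    media: str      # e.g. "disk" or "offline-tape"
    content: bytes


class Repository:
    def __init__(self, entries):
        self._entries = list(entries)

    def search(self, user, term):
        """Return entries the user may see whose path contains the term.

        Entries on off-line media are included, since the catalog still lists them.
        """
        return [e for e in self._entries if e.owner == user and term in e.path]

    def recover(self, user, path, target):
        """Copy one entry into `target`, a dict standing in for a computer's file system."""
        for entry in self._entries:
            if entry.path == path:
                if entry.owner != user:
                    raise PermissionError(f"{user} may not recover {path}")
                target[path] = entry.content
                return
        raise FileNotFoundError(path)
```

Because recover takes the target as a parameter, the same call can restore a file to its original computer or to a different one.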

Notes and References

Web site: Rouse . Margaret . Definition: repository . whatis.com . TechTarget . 1 May 2019 . April 2005.
Web site: A universal digital library is within reach . Pamela Samuelson . May 1, 2012.
News: The New York Times. In France, Publisher and Google Reach Deal . Eric Pfanner . August 25, 2011.
News: The New York Times. IBM software to integrate systems . May 17, 1989.
Web site: The New York Times. For Doodlers and Pack Rats, a Multi-Media Binder . December 11, 2003 . John Markoff.
News: The New York Times. Mt. Vernon Library Marks Its 100th Year . F. Romall . May 12, 1996.
Web site: Armstrong . Mark . Benefits of a Federated Information Repository as a Secondary Storage Tier . SNIA Enterprise Information World 2007 Conference . Storage Networking Industry Association (SNIA) . 1 May 2019 . 9 August 2007. https://web.archive.org/web/20081121202134/http://www.enterpriseinformationworld.com/abstracts/benefits_federated_info.htm . 2008-11-21 .