Data redundancy explained

In computer main memory, auxiliary storage and computer buses, data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data. The additional data can simply be a complete copy of the actual data (a type of repetition code), or only select pieces of data that allow detection of errors and reconstruction of lost or damaged data up to a certain level.
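The complete-copy case can be illustrated with a minimal sketch of a repetition code. The function names `encode` and `decode` and the choice of three copies are assumptions made for this illustration; the sketch recovers each byte by majority vote, so it tolerates damage to any single copy.

```python
from collections import Counter


def encode(data: bytes, copies: int = 3) -> list[bytes]:
    """Store several complete copies of the data (a repetition code)."""
    return [bytes(data) for _ in range(copies)]


def decode(stored: list[bytes]) -> bytes:
    """Rebuild the data by taking a majority vote at each byte position."""
    return bytes(
        Counter(column).most_common(1)[0][0]  # most frequent value wins
        for column in zip(*stored)
    )


copies = encode(b"redundancy")

# Simulate damage to one byte in one of the three copies.
damaged = bytearray(copies[1])
damaged[0] ^= 0xFF
copies[1] = bytes(damaged)

recovered = decode(copies)
```

With three copies the stored size triples, which is why practical schemes usually store only selected check data instead of full copies.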

For example, by including computed check bits, ECC memory is capable of detecting and correcting single-bit errors within each memory word, while RAID 1 combines two hard disk drives (HDDs) into a logical storage unit that allows stored data to survive a complete failure of one drive.[1] [2] Data redundancy can also be used as a measure against silent data corruption; for example, file systems such as Btrfs and ZFS use data and metadata checksumming in combination with copies of stored data to detect silent data corruption and repair its effects.[3]
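The combination of checksums and copies can be sketched as follows. This is not how Btrfs or ZFS are implemented; `MirroredStore` is a hypothetical in-memory class, and the use of SHA-256 is an assumption for the illustration. It only shows the general idea: a checksum detects a silently corrupted copy, and an intact copy is used to repair it.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum used to detect corruption (SHA-256 is an assumption here)."""
    return hashlib.sha256(block).hexdigest()


class MirroredStore:
    """Hypothetical store that keeps two copies of each block plus a checksum."""

    def __init__(self) -> None:
        self.copies = [{}, {}]  # two simulated devices
        self.checksums = {}     # expected checksum for each key

    def write(self, key: str, block: bytes) -> None:
        self.checksums[key] = checksum(block)
        for device in self.copies:
            device[key] = block

    def read(self, key: str) -> bytes:
        expected = self.checksums[key]
        # Find any copy whose checksum still matches.
        good = next(
            (d[key] for d in self.copies if checksum(d[key]) == expected),
            None,
        )
        if good is None:
            raise IOError(f"all copies of {key!r} are corrupt")
        # Repair every copy that no longer matches the checksum.
        for device in self.copies:
            if checksum(device[key]) != expected:
                device[key] = good
        return good


store = MirroredStore()
store.write("block-0", b"important data")
store.copies[0]["block-0"] = b"important dbta"  # simulate silent corruption
restored = store.read("block-0")
```

After the read, both simulated devices hold the original block again, because the corrupted copy was overwritten from the intact one.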

In database systems

While different in nature, data redundancy also occurs in database systems when values are repeated unnecessarily in one or more records or fields within a table, or when a field is replicated in two or more tables. This is often found in unnormalized database designs; it complicates database management, introduces the risk of corrupting the data, and increases the amount of storage required. When redundancy is introduced on purpose into a previously normalized database schema, it may be considered a form of database denormalization, used to improve the performance of database queries (that is, to shorten the database response time).

For instance, when customer data are duplicated and attached to each product bought, the redundancy becomes a known source of inconsistency, since a given customer might appear with different values for one or more of their attributes.[4] Data redundancy leads to data anomalies and corruption and should generally be avoided by design;[5] applying database normalization prevents redundancy and makes the best possible use of storage.[6]
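The customer example can be sketched with Python's built-in `sqlite3` module. The table and column names (`purchase_flat`, `customer`, `purchase`) and the sample rows are hypothetical and chosen only for this illustration. The first half shows an update that leaves a customer with two different cities; the second half shows the same update against a normalized schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized design: customer attributes are repeated on every purchase.
cur.execute(
    "CREATE TABLE purchase_flat (product TEXT, customer_name TEXT, customer_city TEXT)"
)
cur.executemany(
    "INSERT INTO purchase_flat VALUES (?, ?, ?)",
    [("lamp", "Ada", "London"), ("desk", "Ada", "London")],
)
# An update that touches only one of the duplicated rows creates an inconsistency.
cur.execute("UPDATE purchase_flat SET customer_city = 'Paris' WHERE product = 'lamp'")
flat_cities = {
    row[0]
    for row in cur.execute(
        "SELECT customer_city FROM purchase_flat WHERE customer_name = 'Ada'"
    )
}

# Normalized design: each customer attribute is stored exactly once.
cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.execute(
    "CREATE TABLE purchase (product TEXT, customer_id INTEGER REFERENCES customer(id))"
)
cur.execute("INSERT INTO customer VALUES (1, 'Ada', 'London')")
cur.executemany("INSERT INTO purchase VALUES (?, 1)", [("lamp",), ("desk",)])
cur.execute("UPDATE customer SET city = 'Paris' WHERE id = 1")
normalized_cities = {
    row[0]
    for row in cur.execute(
        "SELECT c.city FROM purchase p JOIN customer c ON c.id = p.customer_id"
    )
}

conn.close()
```

In the flat table the customer ends up with both cities, while the normalized tables report a single city for every purchase, at the cost of a join when reading.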

Notes and References

  1. Xin Li; Michael C. Huang; Kai Shen; Lingkun Chu (9 May 2010). "A Realistic Evaluation of Memory Hardware Errors and Software System Susceptibility". cs.rochester.edu. Retrieved 16 January 2015.
  2. Remzi H. Arpaci-Dusseau; Andrea C. Arpaci-Dusseau (3 January 2015). "Operating Systems – Three Easy Pieces: Redundant Arrays of Inexpensive Disks (RAIDs)". cs.wisc.edu. Retrieved 16 January 2015.
  3. Margaret Bierman; Lenz Grimmer (August 2012). "How I Use the Advanced Capabilities of Btrfs". Oracle Corporation. Retrieved 26 January 2015.
  4. Jorge H. Doorn; Laura C. Rivero (2002). Database integrity: challenges and solutions. Idea Group Inc (IGI). pp. 4–5. ISBN 978-1-930708-38-9. Retrieved 23 January 2011.
  5. Peter Rob; Carlos Coronel (2009). Database systems: design, implementation, and management. Cengage Learning. p. 88. ISBN 978-1-4239-0201-0. Retrieved 22 January 2011.
  6. ITL Education Solutions Limited (2009). Introduction to Information Technology. Pearson Education India. p. 522. ISBN 978-81-7758-118-8. Retrieved 4 February 2011.