Fifth normal form explained

Fifth normal form (5NF), also known as projection–join normal form (PJ/NF), is a level of database normalization designed to remove redundancy in relational databases that record multi-valued facts, by isolating semantically related multiple relationships. A table is in 5NF if and only if every non-trivial join dependency in that table is implied by the candidate keys. It is the final normal form as far as removing redundancy is concerned.

A sixth normal form (6NF) also exists, but its purpose is not to remove redundancy. It is therefore adopted only by a few data warehouses, where it can be useful to make tables irreducible.

A join dependency *{A, B, …, Z} on R is implied by the candidate key(s) of R if and only if each of A, B, …, Z is a superkey for R.[1]
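The criterion above can be turned into a small check. The following is a minimal sketch, assuming attributes are single letters and that the candidate keys are already known; the function names and both example relations are hypothetical and chosen only for illustration.

```python
def is_superkey(attrs, candidate_keys):
    """A set of attributes is a superkey if it contains some candidate key."""
    return any(key <= attrs for key in candidate_keys)


def jd_implied_by_keys(components, candidate_keys):
    """Apply the criterion as stated: every component must be a superkey."""
    return all(is_superkey(c, candidate_keys) for c in components)


# Hypothetical relation R(A, B, C) whose only candidate key is {A}.
keys_r = [frozenset("A")]
jd_r = [frozenset("AB"), frozenset("AC")]
implied_r = jd_implied_by_keys(jd_r, keys_r)

# Hypothetical all-key relation R(A, B, C): the only candidate key is {A, B, C}.
keys_all = [frozenset("ABC")]
jd_all = [frozenset("AB"), frozenset("BC"), frozenset("AC")]
implied_all = jd_implied_by_keys(jd_all, keys_all)

print(implied_r, implied_all)
```

In the first relation both components contain the key {A}, so the check succeeds. In the all-key relation no two-attribute component is a superkey, so a join dependency of that shape, if it holds, is not implied by the candidate keys.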

The fifth normal form was first described by Ronald Fagin in his 1979 conference paper Normal forms and relational database operators.[2]

Example

Consider the following example:

Traveling-salesman product availability by brand

| Traveling salesman | Brand | Product type |
|---|---|---|
| Jack Schneider | Acme | Vacuum cleaner |
| Jack Schneider | Acme | Breadbox |
| Mary Jones | Robusto | Pruning shears |
| Mary Jones | Robusto | Vacuum cleaner |
| Mary Jones | Robusto | Breadbox |
| Mary Jones | Robusto | Umbrella stand |
| Louis Ferguson | Robusto | Vacuum cleaner |
| Louis Ferguson | Robusto | Telescope |
| Louis Ferguson | Acme | Vacuum cleaner |
| Louis Ferguson | Acme | Lava lamp |
| Louis Ferguson | Nimbus | Tie rack |

The table's predicate is: products of the type designated by product type, made by the brand designated by brand, are available from the traveling salesman designated by traveling salesman.

The primary key is the composite of all three columns. Note also that the table is in 4NF, since it contains no non-trivial multivalued dependencies (2-part join dependencies): no single column (none of which is by itself a candidate key or a superkey) is a determinant for the other two columns.
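The absence of multivalued dependencies can be checked on the sample data. For a three-column table, a single column multidetermines the other two exactly when the table can be split on that column into two projections and rejoined without gaining rows. The sketch below tries all three such splits; the names `ROWS`, `COLUMNS`, and `splits_losslessly_on` are illustrative assumptions, and the result only describes this sample, not every possible state of the table.

```python
ROWS = {
    ("Jack Schneider", "Acme", "Vacuum cleaner"),
    ("Jack Schneider", "Acme", "Breadbox"),
    ("Mary Jones", "Robusto", "Pruning shears"),
    ("Mary Jones", "Robusto", "Vacuum cleaner"),
    ("Mary Jones", "Robusto", "Breadbox"),
    ("Mary Jones", "Robusto", "Umbrella stand"),
    ("Louis Ferguson", "Robusto", "Vacuum cleaner"),
    ("Louis Ferguson", "Robusto", "Telescope"),
    ("Louis Ferguson", "Acme", "Vacuum cleaner"),
    ("Louis Ferguson", "Acme", "Lava lamp"),
    ("Louis Ferguson", "Nimbus", "Tie rack"),
}
COLUMNS = ("traveling salesman", "brand", "product type")


def splits_losslessly_on(rows, shared):
    """Project onto (shared, x) and (shared, y), rejoin on the shared
    column, and report whether exactly the original rows come back."""
    x, y = [i for i in range(3) if i != shared]
    left = {(r[shared], r[x]) for r in rows}
    right = {(r[shared], r[y]) for r in rows}
    rejoined = set()
    for key, a in left:
        for key2, b in right:
            if key == key2:
                row = [None, None, None]
                row[shared], row[x], row[y] = key, a, b
                rejoined.add(tuple(row))
    return rejoined == rows


# One entry per column; every split of the sample data is lossy.
lossless = {COLUMNS[i]: splits_losslessly_on(ROWS, i) for i in range(3)}
print(lossless)
```

Every two-table split introduces spurious rows here. Splitting on brand, for example, would claim that Jack Schneider offers Acme lava lamps.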

In the absence of any rules restricting the valid possible combinations of traveling salesman, brand, and product type, the three-attribute table above is necessary in order to model the situation correctly.

Suppose, however, that the following rule applies: a traveling salesman has certain brands and certain product types in their repertoire. If brand B1 and brand B2 are in their repertoire, and product type P is in their repertoire, then (assuming brand B1 and brand B2 both make product type P) the traveling salesman must offer products of type P made by brand B1 as well as those made by brand B2.

In that case, it is possible to split the table into three:

Product types by traveling salesman

| Traveling salesman | Product type |
|---|---|
| Jack Schneider | Vacuum cleaner |
| Jack Schneider | Breadbox |
| Mary Jones | Pruning shears |
| Mary Jones | Vacuum cleaner |
| Mary Jones | Breadbox |
| Mary Jones | Umbrella stand |
| Louis Ferguson | Telescope |
| Louis Ferguson | Vacuum cleaner |
| Louis Ferguson | Lava lamp |
| Louis Ferguson | Tie rack |

Brands by traveling salesman

| Traveling salesman | Brand |
|---|---|
| Jack Schneider | Acme |
| Mary Jones | Robusto |
| Louis Ferguson | Robusto |
| Louis Ferguson | Acme |
| Louis Ferguson | Nimbus |

Product types by brand

| Brand | Product type |
|---|---|
| Acme | Vacuum cleaner |
| Acme | Breadbox |
| Acme | Lava lamp |
| Robusto | Pruning shears |
| Robusto | Vacuum cleaner |
| Robusto | Breadbox |
| Robusto | Umbrella stand |
| Robusto | Telescope |
| Nimbus | Tie rack |
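One way to confirm that the three tables carry the same information as the original is to project the original rows onto each pair of columns and join the projections back together. The following is a minimal sketch using plain Python sets; the variable and function names are assumptions made for illustration.

```python
ROWS = {
    ("Jack Schneider", "Acme", "Vacuum cleaner"),
    ("Jack Schneider", "Acme", "Breadbox"),
    ("Mary Jones", "Robusto", "Pruning shears"),
    ("Mary Jones", "Robusto", "Vacuum cleaner"),
    ("Mary Jones", "Robusto", "Breadbox"),
    ("Mary Jones", "Robusto", "Umbrella stand"),
    ("Louis Ferguson", "Robusto", "Vacuum cleaner"),
    ("Louis Ferguson", "Robusto", "Telescope"),
    ("Louis Ferguson", "Acme", "Vacuum cleaner"),
    ("Louis Ferguson", "Acme", "Lava lamp"),
    ("Louis Ferguson", "Nimbus", "Tie rack"),
}


def project(rows, columns):
    """Keep only the given column positions, dropping duplicate rows."""
    return {tuple(row[i] for i in columns) for row in rows}


salesman_brand = project(ROWS, (0, 1))    # brands by traveling salesman
salesman_product = project(ROWS, (0, 2))  # product types by traveling salesman
brand_product = project(ROWS, (1, 2))     # product types by brand

# Three-way natural join: a row is produced only when all three
# pairwise facts are present.
rejoined = {
    (s, b, p)
    for (s, b) in salesman_brand
    for (s2, p) in salesman_product
    if s2 == s and (b, p) in brand_product
}

print(len(salesman_brand), len(salesman_product), len(brand_product))
print(rejoined == ROWS)
```

For this sample the join returns exactly the eleven original rows, so the decomposition loses nothing and adds nothing.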

In this case, it is impossible for Louis Ferguson to refuse to offer vacuum cleaners made by Acme (assuming Acme makes vacuum cleaners) if he sells anything else made by Acme (lava lamp) and he also sells vacuum cleaners made by any other brand (Robusto). Note how this setup helps to remove redundancy. Suppose that Jack Schneider starts selling Robusto's breadboxes and vacuum cleaners. In the previous setup we would have to add two new entries, one for each product type: (Jack Schneider, Robusto, Breadbox) and (Jack Schneider, Robusto, Vacuum cleaner). With the new setup we need to add only a single entry, (Jack Schneider, Robusto), in "brands by traveling salesman".
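The saving can be illustrated by holding the three small tables as dictionaries of sets and deriving the full list of offers from them. This is only a sketch: the dictionary layout and the `offers` helper are assumptions chosen for brevity, not a prescribed storage format.

```python
products_by_salesman = {
    "Jack Schneider": {"Vacuum cleaner", "Breadbox"},
    "Mary Jones": {"Pruning shears", "Vacuum cleaner", "Breadbox", "Umbrella stand"},
    "Louis Ferguson": {"Telescope", "Vacuum cleaner", "Lava lamp", "Tie rack"},
}
brands_by_salesman = {
    "Jack Schneider": {"Acme"},
    "Mary Jones": {"Robusto"},
    "Louis Ferguson": {"Robusto", "Acme", "Nimbus"},
}
products_by_brand = {
    "Acme": {"Vacuum cleaner", "Breadbox", "Lava lamp"},
    "Robusto": {"Pruning shears", "Vacuum cleaner", "Breadbox",
                "Umbrella stand", "Telescope"},
    "Nimbus": {"Tie rack"},
}


def offers():
    """Derive (salesman, brand, product type) rows by joining the three tables."""
    return {
        (salesman, brand, product)
        for salesman, brands in brands_by_salesman.items()
        for brand in brands
        for product in products_by_salesman[salesman] & products_by_brand[brand]
    }


before = offers()
# A single new entry in "brands by traveling salesman" ...
brands_by_salesman["Jack Schneider"].add("Robusto")
# ... yields both new offers in the derived three-column view.
new_rows = offers() - before
print(sorted(new_rows))
```

The one added pair produces the two rows (Jack Schneider, Robusto, Breadbox) and (Jack Schneider, Robusto, Vacuum cleaner), because both product types are already in Jack Schneider's repertoire and both are made by Robusto.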

Usage

Only in rare situations does a 4NF table not conform to 5NF; for instance, when the decomposed tables are cyclic. These are situations in which a complex real-world constraint governing the valid combinations of attribute values in the 4NF table is not implicit in the structure of that table. If such a table is not normalized to 5NF, the burden of maintaining the logical consistency of the data within the table must be carried partly by the application responsible for insertions, deletions, and updates to it; and there is a heightened risk that the data within the table will become inconsistent. In contrast, the 5NF design excludes the possibility of such inconsistencies.
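To make the application's burden concrete, the sketch below shows the kind of check an application might have to run against the single three-column table from the example. It computes the rows that the rule requires but that are absent from the table. The helper name `missing_rows` is hypothetical, and the check assumes the rule is exactly the three-way join dependency described in the example.

```python
ROWS = {
    ("Jack Schneider", "Acme", "Vacuum cleaner"),
    ("Jack Schneider", "Acme", "Breadbox"),
    ("Mary Jones", "Robusto", "Pruning shears"),
    ("Mary Jones", "Robusto", "Vacuum cleaner"),
    ("Mary Jones", "Robusto", "Breadbox"),
    ("Mary Jones", "Robusto", "Umbrella stand"),
    ("Louis Ferguson", "Robusto", "Vacuum cleaner"),
    ("Louis Ferguson", "Robusto", "Telescope"),
    ("Louis Ferguson", "Acme", "Vacuum cleaner"),
    ("Louis Ferguson", "Acme", "Lava lamp"),
    ("Louis Ferguson", "Nimbus", "Tie rack"),
}


def missing_rows(rows):
    """Rows required by the join dependency but absent from the table."""
    salesman_brand = {(s, b) for s, b, _ in rows}
    salesman_product = {(s, p) for s, _, p in rows}
    brand_product = {(b, p) for _, b, p in rows}
    required = {
        (s, b, p)
        for (s, b) in salesman_brand
        for (s2, p) in salesman_product
        if s2 == s and (b, p) in brand_product
    }
    return required - rows


consistent_gaps = missing_rows(ROWS)

# Inserting only one of the two rows needed when Jack Schneider takes on
# Robusto leaves the single-table design in violation of the rule.
partial = ROWS | {("Jack Schneider", "Robusto", "Breadbox")}
partial_gaps = missing_rows(partial)

print(consistent_gaps, partial_gaps)
```

The original table has no gaps, while the partially updated table is missing (Jack Schneider, Robusto, Vacuum cleaner). In the three-table design this state cannot be represented at all, so no such check is needed.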

A table T is in fifth normal form (5NF) or projection-join normal form (PJ/NF) if it has no lossless decomposition into any number of smaller tables, other than decompositions in which all of the smaller tables have the same candidate key as T.

Notes and References

  1. Analysis of normal forms for anchor-tables. http://www.anchormodeling.com/wp-content/uploads/2010/08/6nf.pdf
  2. S. Krishna (1991). Introduction to Data Base and Knowledge Base Systems. World Scientific. ISBN 9810206208. "The fifth normal form was introduced by Fagin."