Normalization helps eliminate redundancies in data design. Redundancies are systematically eliminated using a three or four step approach. The four important normal forms are 1NF, 2NF, 3NF, and BCNF. This tutorial will cover the first three normal shapes.

The three normal forms apply only to Dr. Codd’s relational model, also called the relational data model.

In first normal form (1NF), multi-valued columns are not allowed. If there are multiple value columns, such as multiple subjects recorded by students in a course registration system or multiple phone numbers used to call patients in a patient record system, they should be split into multiple records, one for each value in the multi-value column. In this form of normalization, each record should be uniquely identifiable and would therefore imply the use of the primary key.

In second normal form (2NF), in addition to the database being in first normal form, all non-key attributes must be functionally dependent only on the primary key. In the case of a composite key, the non-key attributes must be completely dependent on the full composite key. For example, consider the following customer registration system. The following tuple (Customer ID, Order ID, Order Date, Store Location) has the primary key as both Customer ID and Order ID. The two non-key attributes are Order Date and Store Location. Here, the store location is not dependent on the primary key, or in other words, this non-key attribute is functionally not dependent on the primary key. So this table needs to be split into two tables to make it 2NF.

Customer -Order(Customer ID, Order ID, Location ID) Location(Location ID, Location Name).

In third normal form, the database must be in second normal form, and furthermore, normal form ensures that redundancy is further reduced by ensuring that each non-key attribute does not have any dependencies on any other non-key attribute or does not have any transitive functional dependencies. In popular relational database terminology, this means that the data model has referential integrity constraints enabled by design.

As an example from a non-3NF database is the tuple with the attributes (Client ID, Name, Address Line 1, Address Line 2, Address Line 3, PIN Code). This is not in 3NF as the address line elements depend only on the PIN code). Address line 3). Here, the Pincode attribute on the client tuple is the foreign key in the table that matches the primary key in the Pincode tuple. This results in what is called a referential integrity constraint where the foreign key in one table corresponds to the primary key in another table.