Normalization in DBMS | Types of Normalization with Examples
Normalization in DBMS refers to the process of organizing data in a database into separate tables in order to reduce data redundancy and improve data integrity. The goal of normalization is to eliminate data anomalies that can lead to data inconsistencies, and to make sure that data is stored in a logical, consistent manner. There are several normalization forms (such as First Normal Form, Second Normal Form, Third Normal Form, etc.) that provide guidelines for organizing data in a normalized way. Normalization is an important aspect of database design and can help to improve the performance and scalability of a database.
Types of Normalization
There are several types of normalization in DBMS, each with its own set of rules and guidelines for organizing data in a normalized way. The most commonly used normalization forms include:
- First Normal Form (1NF): All data must be atomic, meaning that each cell in a table should contain only a single value and not a list of values.
- Second Normal Form (2NF): In addition to meeting the rules of 1NF, a table must not contain any partial dependencies. A partial dependency exists when a non-primary key column depends on only part of a composite primary key.
- Third Normal Form (3NF): In addition to meeting the rules of 2NF, a table must not contain any transitive dependencies. A transitive dependency exists when a non-primary key column depends on another non-primary key column.
- Boyce-Codd Normal Form (BCNF): A relation is in BCNF if and only if for every one of its non-trivial functional dependencies X → Y, X is a superkey.
- Fourth Normal Form (4NF): A table is in 4NF if it is in BCNF and it has no multi-valued dependencies.
- Fifth Normal Form (5NF): A relation is in 5NF if every non-trivial join dependency in R is implied by the candidate keys of R.
First Normal Form (1NF)
First Normal Form (1NF) is the most basic level of normalization in a DBMS. The rules for achieving 1NF are as follows:
- Each table should have a primary key, which uniquely identifies each record in the table.
- Each column in the table should contain only atomic values, which means that a single cell should contain a single value and not a list of values.
- There should be no repeating groups of data.
For example, let’s say we have a table named “Students” that stores information about students in a school. A table that is not in 1NF might look like this:
StudentID | Name | Subject | Grade |
---|---|---|---|
1 | John Smith | Math, Science | A |
2 | Jane Doe | English, History | B |
In this table, the Subject column contains multiple values separated by commas, which violates the rule of atomic values. This table is not in 1NF. To bring this table to 1NF, we would need to split the Subject column into multiple columns, one for each subject, and repeat the student information for each subject taken.
StudentID | Name | Subject | Grade |
---|---|---|---|
1 | John Smith | Math | A |
1 | John Smith | Science | A |
2 | Jane Doe | English | B |
2 | Jane Doe | History | B |
This table is now in 1NF, since all the data is atomic, each cell contains only one value, and there are no repeating groups of data.
1NF is the starting point for normalization, and to ensure the data integrity and consistency it’s necessary to move to next normalization forms.
Second Normal Form (2NF)
Second Normal Form (2NF) builds upon the rules of First Normal Form (1NF) by addressing the issue of partial dependencies. In 2NF, a table must not have any partial dependencies. A partial dependency exists when a non-primary key column depends on only part of a composite primary key.
For example, let’s say we have a table named “Orders” that stores information about customer orders. A table that is not in 2NF might look like this:
OrderID | CustomerID | Product | Quantity | Price |
---|---|---|---|---|
1 | 1 | A | 2 | 10 |
2 | 1 | B | 1 | 20 |
3 | 2 | A | 3 | 10 |
In this table, the Price column depends on the Product column and the primary key is composed of OrderID and CustomerID. The table is not in 2NF because the Price column is functionally dependent on only part of the primary key (Product) and not on the whole primary key (OrderID and CustomerID).
To bring this table to 2NF, we need to separate the table into two separate tables: one for the Orders and one for the Products.
Table: Orders
OrderID | CustomerID | Product | Quantity |
---|---|---|---|
1 | 1 | A | 2 |
2 | 1 | B | 1 |
3 | 2 | A | 3 |
Table: Products
Product | Price |
---|---|
A | 10 |
B | 20 |
Now, the Price column is dependent on the primary key of the Products table (Product) and the Orders table has no partial dependencies. This design is now in 2NF.
2NF eliminates partial dependencies and improves the data integrity by reducing the data anomalies. However, it’s not enough to ensure the data consistency and to avoid data anomalies, so it’s necessary to move to the next normalization forms.
Third Normal Form (3NF)
Third Normal Form (3NF) builds upon the rules of Second Normal Form (2NF) by addressing the issue of transitive dependencies. In 3NF, a table must not have any transitive dependencies. A transitive dependency exists when a non-primary key column depends on another non-primary key column, rather than on the primary key.
To achieve 3NF, a table must already be in 2NF and all non-primary key columns must be directly dependent on the primary key.
For example, let’s say we have a table named “Employees” that stores information about employees in a company. A table that is not in 3NF might look like this:
EmployeeID | Name | Department | Manager |
---|---|---|---|
1 | John | IT | Tom |
2 | Jane | HR | Tom |
3 | Tom | Management |
In this table, the Manager column depends on the Department column, and neither of them depend on the primary key (EmployeeID). This table is not in 3NF because of the transitive dependency between the Manager column and the Department column.
To bring this table to 3NF, we need to separate the table into two separate tables: one for the Employees and one for the Departments.
Table: Employees
EmployeeID | Name | Department |
---|---|---|
1 | John | IT |
2 | Jane | HR |
3 | Tom | Management |
Table: Departments
Department | Manager |
---|---|
IT | Tom |
HR | Tom |
Management |
Now, the Manager column is dependent on the primary key of the Departments table (Department) and the Employees table has no transitive dependencies. This design is now in 3NF.
3NF eliminates transitive dependencies and improves the data integrity and consistency by reducing the data anomalies. However, it’s still not enough to ensure the data consistency and to avoid data anomalies, so it’s necessary to move to the next normalization forms like Boyce-Codd Normal Form (BCNF) or Fourth Normal Form (4NF) in some cases.
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF) is a higher level of normalization than Third Normal Form (3NF). It addresses the issue of non-trivial functional dependencies. A functional dependency is said to be non-trivial if it doesn’t hold true for the case where the left-hand side is a subset of the primary key.
A table is in BCNF if and only if for every non-trivial functional dependency X → Y, X is a superkey. A superkey is a set of columns that can uniquely identify a row in a table.
For example, let’s say we have a table named “Invoices” that stores information about invoices of a company. A table that is not in BCNF might look like this:
InvoiceNumber | CustomerID | Product | Quantity | Price |
---|---|---|---|---|
1 | 1 | A | 2 | 10 |
2 | 1 | B | 1 | 20 |
3 | 2 | A | 3 | 10 |
In this table, the primary key is (InvoiceNumber) and it has a non-trivial functional dependency CustomerID → Product. This dependency doesn’t hold true for the case where the left-hand side is a subset of the primary key (InvoiceNumber) because the product can be different for the same customer. So this table is not in BCNF.
To bring this table to BCNF, we need to separate the table into two separate tables: one for the Invoices and one for the Customer_Product.
Table: Invoices
InvoiceNumber | Quantity | Price |
---|---|---|
1 | 2 | 10 |
2 | 1 | 20 |
3 | 3 | 10 |
Table: Customer_Product
CustomerID | Product |
---|---|
1 | A |
1 | B |
2 | A |
Now, the table Invoices has the primary key (InvoiceNumber) and the table Customer_Product has the primary key (CustomerID, Product) . This design is now in BCNF.
BCNF is considered to be the strongest normal form and it eliminates non-trivial functional dependencies and improves the data integrity and consistency by reducing the data anomalies. However, it’s not necessary to go through BCNF before going to higher normal forms like fourth normal form (4NF) or fifth normal form (5NF).
Fourth Normal Form (4NF)
Fourth Normal Form (4NF) is a higher level of normalization than Boyce-Codd Normal Form (BCNF). It addresses the issue of multi-valued dependencies. A multi-valued dependency is a type of dependency where a non-primary key column is functionally dependent on two or more independent non-primary key columns.
A table is in 4NF if and only if it does not contain any multi-valued dependencies.
For example, let’s say we have a table named “Orders” that stores information about orders of a company. A table that is not in 4NF might look like this:
OrderNumber | Customer | Product | Supplier |
---|---|---|---|
1 | John | A | S1 |
2 | Jane | B | S2 |
3 | Tom | A | S1 |
In this table, there is a multi-valued dependency between Product and Supplier. This dependency doesn’t hold true for the case where a product is supplied by multiple suppliers or a supplier supplies multiple products. So this table is not in 4NF.
To bring this table to 4NF, we need to separate the table into two separate tables: one for the Orders and one for the Product_Supplier.
Table: Orders
OrderNumber | Customer | Product |
---|---|---|
1 | John | A |
2 | Jane | B |
3 | Tom | A |
Table: Product_Supplier
Product | Supplier |
---|---|
A | S1 |
B | S2 |
Now, the table Orders has no multi-valued dependencies and the table Product_Supplier has primary key (Product, Supplier) . This design is now in 4NF.
4NF eliminates multi-valued dependencies and improves the data integrity and consistency by reducing the data anomalies. However, it’s not necessary to go through 4NF before going to higher normal forms like fifth normal form (5NF).
Fifth Normal Form (5NF)
Fifth Normal Form (5NF), also known as Projection-Join Normal Form (PJ/NF) is a higher level of normalization than Fourth Normal Form (4NF). It addresses the issue of join dependency. A join dependency is a type of dependency where a table can be split into two or more tables, each of which contains a subset of the original table’s columns, and can be recombined by joining the tables on their common columns.
A table is in 5NF if and only if it does not contain any join dependencies.
For example, let’s say we have a table named “Employee_Department” that stores information about employees and their departments. A table that is not in 5NF might look like this:
EmployeeID | EmployeeName | Department | Salary |
---|---|---|---|
1 | John | IT | 50000 |
2 | Jane | HR | 40000 |
3 | Tom | IT | 60000 |
In this table, there is a join dependency between Employee and Department. This dependency doesn’t hold true for the case where an employee can belong to multiple departments or a department can have multiple employees. So this table is not in 5NF.
To bring this table to 5NF, we need to separate the table into three separate tables: one for the Employee, one for the Department, and one for the Employee_Department_Salary.
Table: Employee
EmployeeID | EmployeeName |
---|---|
1 | John |
2 | Jane |
3 | Tom |
Table: Department
Department |
---|
IT |
HR |
Table: Employee_Department_Salary
EmployeeID | Department | Salary |
---|---|---|
1 | IT | 50000 |
2 | HR | 40000 |
3 | IT | 60000 |
Now, the tables Employee, Department, and Employee_Department_Salary has no join dependency and each table has its own primary key. This design is now in 5NF.
5NF eliminates join dependencies and improves the data integrity and consistency by reducing the data anomalies. However, it’s not often used in practice as it requires a large number of tables and can be difficult to maintain.

Further Reading
- What is DBMS? (Components, Features & Advantages)
- Types of DBMS
- DBMS Architecture
- Types of Data Models in DBMS
- Hashing in DBMS
- Deadlock in DBMS
- Functional Dependency in DBMS
- Referential Integrity in DBMS
- Difference between DBMS and RDBMS
- ACID Properties DBMS (Atomicity, Consistency, Isolation, Durability)