Normalization in DBMS

Normalization (notes by Sumit Verma)

I will discuss all the key points of Normalization in DBMS in this blog.

Normalization is the process of breaking up the table or relation into multiple tables or relations so that we can reduce redundancy and we can make our database more efficient as well as error free.

Consider StudentInfo table,

IdNameDepartment
1S1CSE
2S2EC
3S3CSE
4S5IT
5S4CSE

Now if you have to add one more column for name of HOD also, so now Student table will look like,

IdNameDepartmentHOD
1S1CSEProf. A
2S2ECProf. B
3S3CSEProf. A
4S5ITProf. C
5S4CSEProf. A

Here you can see that whenever name of department is repeating, HOD name is also repeating, this is causing redundancy in the database.

Now suppose we also want to add phone number column for every department. So our table will be like,

IdNameDepartmentHODPNO
1S1CSEProf. A32272
2S2ECProf. B67897
3S3CSEProf. A32272
4S5ITProf. C67327
5S4CSEProf. A32272

Now suppose if phone number of CSE department changed from 32272 to 32222, you have to make the changes at three places.

IdNameDepartmentHODPNO
1S1CSEProf. A32222
2S2ECProf. B67897
3S3CSEProf. A32222
4S5ITProf. C67327
5S4CSEProf. A32222

The same is true for deletion also. So you can see that for insertion, deletion as well as updation, this database is inefficient.

So we will use normalization here,

We will break StudentInfo table into two tables as Student and Department as,

Student

IdNameDepartment
1S1CSE
2S2EC
3S3CSE
4S5IT
5S4CSE

Department:

Dept_IDHODPNO
CSEProf. A32222
ECProf. B67897
ITProf. C67327

We can make Department column of Student table as a foreign key which will refer to Dept_ID of Department table.

Here whenever we will need details of department, we will refer to the Department table with the help of the foreign key.

Now, suppose you want to update phone number of any department so you just need to update that only one place. Hence, our database is more efficient now.

First Normal Form (1NF):

The following criteria must be satisfied for 1NF.

  1. Values of each attribute are atomic.
  2. No composite values
  3. All entries in any column must be of the same kind
  4. Each column must have a unique name
  5. No two rows are identical  

Consider the below Student table

RollNameCourses
101AsifDBMS, CN, CD, SE
102AmitCO, OS
103ArpitCD, OS, CN, DBMS

Here you can see that column Courses contain multiple values in a single row. This is not allowed in 1NF.

We can convert this table into 1NF as below,

RollNameCourses
101AsifDBMS
101AsifCN
101AsifCD
101AsifSE
102AmitCO
102AmitOS
103ArpitCD
103ArpitOS
103ArpitCN
103ArpitDBMS

Second Normal Form (2NF):

A relation is said to be in 2NF if it is in 1NF and all non-prime attributes are fully functional dependent on any key of R.

Example: Consider the relation R(A,B,C,D,E,F) and the following functional dependencies,

F = { A→ BCDEF

         BC →ADEF

         B→F

         D→E }

Candidate Keys = {A, BC}

Prime Attributes = A, B, C

Non prime Attributes = D, E, F

BC is a candidate key so any of it’s subset can not determine other attributes of the relation but B is determining attribute F here.

So B→F is a partial dependency and hence this relation is not in 2NF.

In order to make it 2NF, we will break the relation into two parts: R1 (ABCDE) and R2 (BF).

Now the relations R1 and R2 are in 2NF.

Third Normal Form (3NF):

Criteria for 3NF:

1. Relation should be in 2NF.

2. No non-prime attribute should be transitively dependent on candidate key. 

   OR There should not be the case that a non prime attribute is determined by another non prime attribute.

Example: Consider the relation R(A,B,C,D,E,F) and the following functional dependencies,

F = { A→ BCDEF

         BC →ADEF

         B→F

         D→E }

Check for 2NF:

Candidate Keys = {A, BC}

Prime Attributes = A, B, C

Non prime Attributes = D, E, F

BC is a candidate key so any of it’s subset can not determine other attributes of the relation but B is determining attribute F here.

So B→F is a partial dependency and hence this relation is not in 2NF.

In order to make it 2NF, we will break the relation into two parts: R1 (ABCDE) and R2 (BF).

Now the relations R1 and R2 are in 2NF.

Check for 3NF: 

We have two relations now R1 (ABCDE) and R2 (BF).

R2 (BF):  We have B→F dependency here.

Candidate key = B

Prime Attribute = B

Non Prime Attribute = F

Since there is only one non prime attribute so no chance of violation of 3NF. So R2 (BF) is in 3NF.

R1 (ABCDE): We have following dependencies here:

A→ BCDEF

BC →ADEF

D→E 

Candidate keys = A, BC

Prime Attributes = A,B,C

Non Prime Attributes = D, E

Here D→E dependency is violating 3NF because ‘D’ is a non prime attribute and it is determining another non prime attribute ‘E’. So in order to make that relation in 3NF we will break the relation into two parts as,

1. R11(ABCD)

2. R12(DE)

Now all relations R11(ABCD), R12(DE) and R2 (BF) are in 3NF.

Boyce Codd Normal Form  (BCNF):

Criteria for  BCNF:  

1. Relation should be in 3NF

2. For each dependency X→Y, X should be a super key.

Example: Consider the realtion R(ABCD) and the following FDs,

F = { A→BCD

         BC→AD

         D→B

      }

Candidate Keys = A, BC

Prime Attributes = A, B, C

Non prime attributes = D

Check for 2NF:

There is no partial dependency, hence relation is in 2NF.

Check for 3NF:

There is only one non prime attribute so no chance of 3NF violation. Hence relation is in 3NF.

Check for BCNF:

Here  D→B is violating BCNF because ‘D’ is not a superkey. So it is not in BCNF.

In order to make it BCNF, relation will be divided into two parts as,

1. R1(ADC)

2. R2(DB)

Now relations R1(ADC) and R2(DB) are in BCNF.

Thanks for reading.

Share This Post
Have your say!
00
Do NOT follow this link or you will be banned from the site!
Skip to toolbar