Normalization Principles
Learning Objectives
- Understand why normalization is crucial for database design
- Master First Normal Form (1NF) requirements
- Apply Second Normal Form (2NF) principles
- Implement Third Normal Form (3NF) structures
- Recognize when to denormalize for performance
What is Normalization?
Normalization is the process of organizing data to minimize redundancy and dependency. It's like organizing your closet - everything has its proper place! 🗂️
Real-World Analogy
Normal Forms Overview
graph TD
UN[Unnormalized Data]
UN -->|Remove repeating groups| NF1[1NF: First Normal Form]
NF1 -->|Remove partial dependencies| NF2[2NF: Second Normal Form]
NF2 -->|Remove transitive dependencies| NF3[3NF: Third Normal Form]
NF3 -->|Remove multi-valued dependencies| BCNF[BCNF/4NF/5NF]
style UN fill:#fee2e2,color:#991b1b
style NF1 fill:#fef3c7,color:#92400e
style NF2 fill:#dbeafe,color:#1e40af
style NF3 fill:#dcfce7,color:#166534
style BCNF fill:#e9d5ff,color:#6b21a8
Practical Goal
First Normal Form (1NF)
1NF Rules
- ✅ Each column contains atomic (indivisible) values
- ✅ Each column contains values of a single type
- ✅ Each column has a unique name
- ✅ The order of rows doesn't matter
- ✅ No repeating groups or arrays
Example: Converting to 1NF
Second Normal Form (2NF)
2NF Rules
- ✅ Must be in 1NF
- ✅ No partial dependencies on composite primary keys
- ✅ All non-key attributes depend on the ENTIRE primary key
Key Concept
-- Violates 2NF (partial dependency)
CREATE TABLE order_items_bad (
order_id INT,
product_id INT,
product_name VARCHAR(100), -- Depends only on product_id!
product_price DECIMAL(10,2), -- Depends only on product_id!
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- Satisfies 2NF (split into two tables)
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
product_price DECIMAL(10,2)
);
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
Third Normal Form (3NF)
3NF Rules
- ✅ Must be in 2NF
- ✅ No transitive dependencies
- ✅ All attributes depend directly on the primary key
Remember:"Every non-key attribute must depend on the key, the whole key, and nothing but the key!"
Normalization in Practice
E-Commerce Database Example
graph TB
subgraph "Unnormalized"
UN[Orders
----
OrderID
CustomerName
CustomerEmail
ProductName
ProductPrice
Quantity
OrderDate] end subgraph "Normalized 3NF" C[Customers
----
CustomerID
Name
Email] P[Products
----
ProductID
Name
Price] O[Orders
----
OrderID
CustomerID
OrderDate] OI[OrderItems
----
OrderID
ProductID
Quantity] C -->|1:N| O O -->|1:N| OI P -->|1:N| OI end style UN fill:#fee2e2 style C fill:#dbeafe style P fill:#dcfce7 style O fill:#fef3c7 style OI fill:#e9d5ff
----
OrderID
CustomerName
CustomerEmail
ProductName
ProductPrice
Quantity
OrderDate] end subgraph "Normalized 3NF" C[Customers
----
CustomerID
Name
Email] P[Products
----
ProductID
Name
Price] O[Orders
----
OrderID
CustomerID
OrderDate] OI[OrderItems
----
OrderID
ProductID
Quantity] C -->|1:N| O O -->|1:N| OI P -->|1:N| OI end style UN fill:#fee2e2 style C fill:#dbeafe style P fill:#dcfce7 style O fill:#fef3c7 style OI fill:#e9d5ff
When to Denormalize
Practice: Normalize This Table
Exercise
💡 Click for Solution
-- Books table
CREATE TABLE books (
isbn VARCHAR(13) PRIMARY KEY,
title VARCHAR(200) NOT NULL,
author_id INT,
publisher_id INT,
FOREIGN KEY (author_id) REFERENCES authors(author_id),
FOREIGN KEY (publisher_id) REFERENCES publishers(publisher_id)
);
-- Authors table
CREATE TABLE authors (
author_id INT PRIMARY KEY AUTO_INCREMENT,
author_name VARCHAR(100) NOT NULL,
country VARCHAR(50)
);
-- Publishers table
CREATE TABLE publishers (
publisher_id INT PRIMARY KEY AUTO_INCREMENT,
publisher_name VARCHAR(100) NOT NULL,
address VARCHAR(255)
);
-- Borrowers table
CREATE TABLE borrowers (
borrower_id INT PRIMARY KEY AUTO_INCREMENT,
borrower_name VARCHAR(100) NOT NULL
);
-- Loans table (junction)
CREATE TABLE loans (
loan_id INT PRIMARY KEY AUTO_INCREMENT,
isbn VARCHAR(13),
borrower_id INT,
borrow_date DATE,
FOREIGN KEY (isbn) REFERENCES books(isbn),
FOREIGN KEY (borrower_id) REFERENCES borrowers(borrower_id)
);