Subtopic Notes
8.1 Database Concepts
8. Databases
File Based System
Data is kept in separate files on a computer, where users can access, modify, or delete it as needed.
Disadvantages:
- Data Redundancy: Duplicate data across files wastes space.
- Lack of Security and limited control over data access.
- Challenging for multiple users to access or update files simultaneously.
- Harder to sort or filter data
- There is no enforced structure for organizing file
- Data can exist in various formats, making it hard to use
Database Terminologies
- Database: Organized collection of non-redundant interrelated data stored electronically
- DBMS (Database Management System): Software that facilitates the creation, maintenance, and manipulation of databases
- Entity: A distinct object or concept that can be identified and stored in a database, often representing a real-world object
- Table: A structured arrangement of data in rows and columns within a database, representing entities
- Record: A single entry in a table, consisting of fields that contain related data about an entity
- Field/Column: Contains specific property about a record
- Tuple: A single row/record
- Attribute: A field/ column
- Primary key: Attribute/s which is the unique identifier
- Candidate key: Attribute/s that has potential to be a primary key
- Secondary key: Candidate keys not chosen as the primary key
- Foreign key: Attribute/s relating 2 different tables
- Referential integrity: A constraint ensuring that relationships between tables remain consistent, preventing orphaned records
- Indexing: Creating a secondary key on an attribute enables quick access when searching by that attribute; however, the index must be updated whenever the table data is modified.
Example Database
Let’s Say there is a database with two tables, where one is for customers and one is for vehicles.
Customer Table
| License_No | Phone | Name | DOB |
|---|---|---|---|
| ABC111 | 01829999999 | Lewis Hamilton | 07/01/1985 |
| ABC112 | 01799999999 | Max Verstappen | 30/09/1997 |
| ABC113 | 01699999999 | Michael Schumacher | 03/01/1969 |
| ABC114 | 01981231231 | Charles Leclerc | 16/10/1997 |
Vehicle Table
| VIN | Make | Model | Year | Mileage | Body | Buyer | Price |
|---|---|---|---|---|---|---|---|
| TYT1 | Toyota | Prado | 2023 | 15000 | SUV | ABC111 | 55000 |
| HND1 | Honda | Vezel | 2022 | 20000 | Crossover | ABC112 | 25000 |
| BUG1 | Bugatti | Veyron | 2012 | 10000 | Supercar | ABC113 | 2200000 |
| HND2 | Honda | CR-V | 2021 | 30000 | SUV | ABC114 | 30000 |
| TYT2 | Toyota | C-HR | 2020 | 25000 | Crossover | ABC114 | 22000 |
| LAM1 | Lamborghini | Gallardo | 2008 | 40000 | Supercar | ABC113 | 150000 |
| ASM1 | Aston Martin | DB12 | 2019 | 12000 | Grand Tourer | ABC114 | 300000 |
Examples from the above database:
- Tables: Vehicle, Customer
- Entity: Vehicle, Customer
- Record: Row 2 to 8 in Vehicle table
- Field/Column/Attribute: Make, Model, Year in Vehicle Table
- Tuple: Row 2
- Primary Key: License_No in Customer table and VIN in Vehicle table
- Candidate Key: Phone, License_No in Customer Table
- Secondary Key: Phone in Customer table
- Foreign Key: Buyer in Vehicle table
Entity Relation Diagram Symbols
Normalization:
Not Normalized Example:
| License_No | Cust_Name | Cust_Phone | City | Area | Cars | Models |
|---|---|---|---|---|---|---|
| L124 | Lewis Hamilton | 01852111113 | Dhaka | Uttara | "Tesla" | "Model S" |
| L125 | Max Verstappen | 01852111114 | Chittagong | Agrabad | "Ford" | "Mustang" |
| L126 | Carlos Sainz | 01852111115 | Dhaka | Gulshan | "Toyota", "Nissan" | "Corolla", "Altima" |
| L127 | Charles Leclerc | 01852111116 | Syhlet | Ratargul | "Mazda", "Honda" | "CX-5", "Accord" |
1st Normal Form (1NF): Contains no repeating attribute or groups of attributes. The intersection of each tuple and attribute contains only 1 value.
Example: (Split Cars and Models into individual rows)
| License_No | Cust_Name | Cust_Phone | City | Area | Car | Model |
|---|---|---|---|---|---|---|
| L124 | Lewis Hamilton | 01852111113 | Dhaka | Uttara | Tesla | Model S |
| L125 | Max Verstappen | 01852111114 | Chittagong | Agrabad | Ford | Mustang |
| L126 | Carlos Sainz | 01852111115 | Dhaka | Gulshan | Toyota | Corolla |
| L126 | Carlos Sainz | 01852111115 | Dhaka | Gulshan | Nissan | Altima |
| L127 | Charles Leclerc | 01852111116 | Syhlet | Ratargul | Mazda | CX-5 |
| L127 | Charles Leclerc | 01852111116 | Syhlet | Ratargul | Honda | Accord |
2nd Normal Form (2NF): it is in 1NF and every non-primary key attribute is fully dependent on the primary; all the incomplete dependencies have been removed.
Example: Model depends on car, The others depend on License_No
Customer Table
| License_No | Cust_Name | Cust_Phone | City | Area |
|---|---|---|---|---|
| L124 | Lewis Hamilton | 01852111113 | Dhaka | Uttara |
| L125 | Max Verstappen | 01852111114 | Chittagong | Agrabad |
| L126 | Carlos Sainz | 01852111115 | Dhaka | Gulshan |
| L127 | Charles Leclerc | 01852111116 | Sylhet | Ratargul |
Car Table
| Car_ID | Car | Model |
|---|---|---|
| 1 | Tesla | Model S |
| 2 | Ford | Mustang |
| 3 | Toyota | Corolla |
| 4 | Nissan | Altima |
| 5 | Mazda | CX-5 |
| 6 | Honda | Accord |
Customer_Car Table
| License_No | Car_ID |
|---|---|
| L124 | 1 |
| L125 | 2 |
| L126 | 3 |
| L126 | 4 |
| L127 | 5 |
| L127 | 6 |
3rd Normal Form (3NF): It is in 1NF and 2NF, and all non-key attributes are fully dependent on the primary key. There should be no inter-dependencies between non-key attributes.
Example: The Car Table and Customer_Car Table stay the same. The Customer Table is broken down because Area depends on City.
Customer Table
| License_No | Cust_Name | Cust_Phone | Area_ID |
|---|---|---|---|
| L124 | Lewis Hamilton | 01852111113 | 1 |
| L125 | Max Verstappen | 01852111114 | 2 |
| L126 | Carlos Sainz | 01852111115 | 3 |
| L127 | Charles Leclerc | 01852111116 | 4 |
Area Table
| Area_ID | Area | City |
|---|---|---|
| 1 | Uttara | Dhaka |
| 2 | Agrabad | Chittagong |
| 3 | Gulshan | Dhaka |
| 4 | Ratargul | Sylhet |
Note:
MANY-TO-MANY functions cannot be directly normalized to 3NF, must use a 2 step process e.g. The 2nd diagram below represents 3rd normal form
