Subtopic Notes

8.1 Database Concepts

8. Databases

File Based System

Data is kept in separate files on a computer, where users can access, modify, or delete it as needed.

Disadvantages:

  • Data Redundancy: Duplicate data across files wastes space.
  • Lack of Security and limited control over data access.
  • Challenging for multiple users to access or update files simultaneously.
  • Harder to sort or filter data
  • There is no enforced structure for organizing file
  • Data can exist in various formats, making it hard to use

Database Terminologies

  • Database: Organized collection of non-redundant interrelated data stored electronically
  • DBMS (Database Management System): Software that facilitates the creation, maintenance, and manipulation of databases
  • Entity: A distinct object or concept that can be identified and stored in a database, often representing a real-world object
  • Table: A structured arrangement of data in rows and columns within a database, representing entities
  • Record: A single entry in a table, consisting of fields that contain related data about an entity
  • Field/Column: Contains specific property about a record
  • Tuple: A single row/record
  • Attribute: A field/ column
  • Primary key: Attribute/s which is the unique identifier
  • Candidate key: Attribute/s that has potential to be a primary key
  • Secondary key: Candidate keys not chosen as the primary key
  • Foreign key: Attribute/s relating 2 different tables
  • Referential integrity: A constraint ensuring that relationships between tables remain consistent, preventing orphaned records
  • Indexing: Creating a secondary key on an attribute enables quick access when searching by that attribute; however, the index must be updated whenever the table data is modified.

Example Database
Let’s Say there is a database with two tables, where one is for customers and one is for vehicles.

Customer Table

License_NoPhoneNameDOB
ABC11101829999999Lewis Hamilton07/01/1985
ABC11201799999999Max Verstappen30/09/1997
ABC11301699999999Michael Schumacher03/01/1969
ABC11401981231231Charles Leclerc16/10/1997

Vehicle Table

VINMakeModelYearMileageBodyBuyerPrice
TYT1ToyotaPrado202315000SUVABC11155000
HND1HondaVezel202220000CrossoverABC11225000
BUG1BugattiVeyron201210000SupercarABC1132200000
HND2HondaCR-V202130000SUVABC11430000
TYT2ToyotaC-HR202025000CrossoverABC11422000
LAM1LamborghiniGallardo200840000SupercarABC113150000
ASM1Aston MartinDB12201912000Grand TourerABC114300000

Examples from the above database:

  • Tables: Vehicle, Customer
  • Entity: Vehicle, Customer
  • Record: Row 2 to 8 in Vehicle table
  • Field/Column/Attribute: Make, Model, Year in Vehicle Table
  • Tuple: Row 2
  • Primary Key: License_No in Customer table and VIN in Vehicle table
  • Candidate Key: Phone, License_No in Customer Table
  • Secondary Key: Phone in Customer table
  • Foreign Key: Buyer in Vehicle table

Entity Relation Diagram Symbols

Normalization:

Not Normalized Example:

License_NoCust_NameCust_PhoneCityAreaCarsModels
L124Lewis Hamilton01852111113DhakaUttara"Tesla""Model S"
L125Max Verstappen01852111114ChittagongAgrabad"Ford""Mustang"
L126Carlos Sainz01852111115DhakaGulshan"Toyota", "Nissan""Corolla", "Altima"
L127Charles Leclerc01852111116SyhletRatargul"Mazda", "Honda""CX-5", "Accord"

1st Normal Form (1NF): Contains no repeating attribute or groups of attributes. The intersection of each tuple and attribute contains only 1 value.
Example: (Split Cars and Models into individual rows)

License_NoCust_NameCust_PhoneCityAreaCarModel
L124Lewis Hamilton01852111113DhakaUttaraTeslaModel S
L125Max Verstappen01852111114ChittagongAgrabadFordMustang
L126Carlos Sainz01852111115DhakaGulshanToyotaCorolla
L126Carlos Sainz01852111115DhakaGulshanNissanAltima
L127Charles Leclerc01852111116SyhletRatargulMazdaCX-5
L127Charles Leclerc01852111116SyhletRatargulHondaAccord

2nd Normal Form (2NF): it is in 1NF and every non-primary key attribute is fully dependent on the primary; all the incomplete dependencies have been removed.

Example: Model depends on car, The others depend on License_No

Customer Table

License_NoCust_NameCust_PhoneCityArea
L124Lewis Hamilton01852111113DhakaUttara
L125Max Verstappen01852111114ChittagongAgrabad
L126Carlos Sainz01852111115DhakaGulshan
L127Charles Leclerc01852111116SylhetRatargul

Car Table

Car_IDCarModel
1TeslaModel S
2FordMustang
3ToyotaCorolla
4NissanAltima
5MazdaCX-5
6HondaAccord

Customer_Car Table

License_NoCar_ID
L1241
L1252
L1263
L1264
L1275
L1276

3rd Normal Form (3NF): It is in 1NF and 2NF, and all non-key attributes are fully dependent on the primary key. There should be no inter-dependencies between non-key attributes.

Example: The Car Table and Customer_Car Table stay the same. The Customer Table is broken down because Area depends on City.

Customer Table

License_NoCust_NameCust_PhoneArea_ID
L124Lewis Hamilton018521111131
L125Max Verstappen018521111142
L126Carlos Sainz018521111153
L127Charles Leclerc018521111164

Area Table

Area_IDAreaCity
1UttaraDhaka
2AgrabadChittagong
3GulshanDhaka
4RatargulSylhet

Note:
MANY-TO-MANY functions cannot be directly normalized to 3NF, must use a 2 step process e.g. The 2nd diagram below represents 3rd normal form