Subtopic Notes
1.3 Data storage and compression
1. Data representation
Prefixes
| Denary Prefix | Factor Value | Binary Prefix | Factor Value |
|---|---|---|---|
| Kilobyte (kB) | × 10³ | Kibibyte (KiB) | × 2¹⁰ |
| Megabyte (MB) | × 10⁶ | Mebibyte (MiB) | × 2²⁰ |
| Gigabyte (GB) | × 10⁹ | Gibibyte (GiB) | × 2³⁰ |
| Terabyte (TB) | × 10¹² | Tebibyte (TiB) | × 2⁴⁰ |
| Petabyte (PB) | × 10¹⁵ | Pebibyte (PiB) | × 2⁵⁰ |
| Exabyte (EB) | × 10¹⁸ | Exbibyte (EiB) | × 2⁶⁰ |
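The factors in the table can be checked with a short sketch. This is only an illustration: the names `DENARY`, `BINARY`, `to_bytes` and `convert` are made up for these notes, and the factors are taken directly from the table (powers of 10 for denary prefixes, powers of 2 for binary prefixes).

```python
# Factors in bytes, copied from the prefix table.
DENARY = {"kB": 10**3, "MB": 10**6, "GB": 10**9,
          "TB": 10**12, "PB": 10**15, "EB": 10**18}
BINARY = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30,
          "TiB": 2**40, "PiB": 2**50, "EiB": 2**60}
FACTORS = {**DENARY, **BINARY}


def to_bytes(value, unit):
    """Convert a value in the given unit to a number of bytes."""
    return value * FACTORS[unit]


def convert(value, from_unit, to_unit):
    """Convert between any two units by going through bytes."""
    return to_bytes(value, from_unit) / FACTORS[to_unit]


print(to_bytes(1, "KiB"))          # 1024
print(to_bytes(1, "kB"))           # 1000
print(convert(1, "GiB", "MiB"))    # 1024.0
```

Note that a binary unit is always slightly larger than the denary unit with the similar name, and the gap grows with each step up the table.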
Compression
Compression is the process of reducing file size while maintaining acceptable quality. There are two types: lossless and lossy.
Reasons for compressing:
- Less Bandwidth Required
- Less Storage Space Required
- Shorter Transmission Time
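The link between file size and transmission time can be sketched as time = size in bits ÷ bandwidth. The file sizes and the link speed below are hypothetical values chosen only to make the arithmetic easy, and the function names are invented for this example.

```python
def transmission_time_seconds(size_bytes, bandwidth_bits_per_second):
    """Time to send a file: convert bytes to bits, then divide by bandwidth."""
    return size_bytes * 8 / bandwidth_bits_per_second


def compression_ratio(original_bytes, compressed_bytes):
    """How many times smaller the compressed file is."""
    return original_bytes / compressed_bytes


# Hypothetical example: a 5 MB file compressed to 1 MB, sent over an 8 Mbit/s link.
original = 5 * 10**6
compressed = 1 * 10**6
link = 8 * 10**6

print(transmission_time_seconds(original, link))    # 5.0
print(transmission_time_seconds(compressed, link))  # 1.0
print(compression_ratio(original, compressed))      # 5.0
```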
Lossless Compression
- A method that allows data to be perfectly reconstructed from the compressed file
- Used where no data can be lost, e.g. bitmap (.bmp), vector graphic (.svg) and .png images, text files, database records
Run-length Encoding (RLE)
- A form of lossless compression
- Used for text files and bitmap images.
- Encodes consecutive identical elements using two values: run count and run value.
- E.g. RLE of a bitmap image:
  - We can represent the first row as a sequence of pixels: “WBBWWWBBBBW” (W: white, B: black)
  - After applying RLE, each run is written as run count then run value: “1W 2B 3W 4B 1W”
  - The process is repeated for the other rows.
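The RLE steps above can be sketched in a few lines. This is a minimal illustration, not a real image codec: the function names are invented for these notes, a row is treated as a string of pixel letters, and every run is written with an explicit count (including runs of length 1).

```python
from itertools import groupby


def rle_encode(row):
    """Return a list of (run count, run value) pairs for one row."""
    return [(len(list(group)), value) for value, group in groupby(row)]


def rle_decode(runs):
    """Rebuild the original row from (run count, run value) pairs."""
    return "".join(value * count for count, value in runs)


def format_runs(runs):
    """Write the runs in the 'count then value' notation used in the notes."""
    return " ".join(f"{count}{value}" for count, value in runs)


row = "WBBWWWBBBBW"
runs = rle_encode(row)
print(format_runs(runs))        # 1W 2B 3W 4B 1W
print(rle_decode(runs) == row)  # True
```

Decoding gives back exactly the original row, which is what makes RLE lossless. RLE only saves space when the data contains long runs; a row such as “WBWBWB” would take more space after encoding.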
Lossy Compression
- A method that permanently removes less important data to achieve higher compression rates, so the original file cannot be perfectly reconstructed
- While the file size is significantly reduced, some quality is lost
- E.g. sound files (.mp3), .jpeg images
- Sound file compression (.mp3) uses perceptual coding to remove parts of the sound that are less audible to human hearing.
