Subtopic Notes

13.3 Floating-point numbers, representation and manipulation

13. Data Representation

Mantissa

  • Stores the actual digits of the number
  • Increasing the number of bits in mantissa increases accuracy

Exponent

  • Tells you where the decimal point is placed by shifting
  • Increasing the number of bits in exponent increases range

Floating Point Number Representation

  • ± mantissa × 2^exponent
  • Can represent fractions and negative numbers
  • In scientific notation a number is written in the following way: 2.15 × 10^4
  • Here 2.15 acts like the mantissa (the significant digits) while the power 4 acts like the exponent (it shifts the decimal point 4 places to the right)
  • In a binary number the decimal point is called the binary point

Converting binary to floating-point real number (Method 1)

  1. You will be given the value of mantissa and exponent
    Mantissa: 01110100 Exponent: 0010
  2. Write the binary number with decimal point: by default consider that the decimal point is after the first digit of mantissa
    0.1110100
  3. Evaluate the value of exponent
    0010 = 2
  4. Shift the decimal point to the right the number of times equaling the value of the exponent
    011.10100
  5. Evaluate the value
    -4   2   1  .  1/2  1/4  1/8  1/16  1/32
     0   1   1  .   1    0    1    0     0

Value: 2 + 1 + (½) + (⅛) = 3.625
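
The steps of Method 1 can be sketched in Python as below. This is only an illustration: the function name `shift_and_sum` is made up for these notes, and it assumes a two's complement mantissa and a non-negative exponent small enough that the point stays inside the mantissa, as in the example above.

```python
def shift_and_sum(mantissa: str, exponent: str) -> float:
    """Method 1: move the binary point right, then add up the place values."""
    shift = int(exponent, 2)  # step 3: evaluate the exponent
    # Assumption: exponent is non-negative and the point stays inside the mantissa
    if exponent[0] == "1" or shift > len(mantissa) - 1:
        raise ValueError("this sketch only handles small non-negative exponents")

    point = 1 + shift  # the point starts after the first bit, then moves right
    whole, frac = mantissa[:point], mantissa[point:]

    top = len(whole) - 1
    value = -int(whole[0]) * 2 ** top  # leftmost bit has a negative place value
    for i, bit in enumerate(whole[1:], start=1):
        value += int(bit) * 2 ** (top - i)
    for i, bit in enumerate(frac, start=1):
        value += int(bit) * 2 ** -i  # 1/2, 1/4, 1/8, ...
    return float(value)


print(shift_and_sum("01110100", "0010"))  # 3.625
```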

Converting binary to floating-point real number (Method 2)

  1. You will be given the value of mantissa and exponent
    Mantissa: 01110100 Exponent: 0010
  2. Answer will be mantissa × 2^exponent
    -1 (sign)  .  1/2  1/4  1/8  1/16  1/32  1/64  1/128
     0         .   1    1    1    0     1     0     0

    = 0.1110100 × 2^0010
    = (½ + ¼ + ⅛ + 1/32) × 2^2 = 3.625
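
Method 2 is plain arithmetic, so it translates directly into code. The sketch below assumes both the mantissa and the exponent are stored in two's complement, with the binary point after the first mantissa bit. The helper names `twos_complement` and `decode` are illustrative only.

```python
from fractions import Fraction


def twos_complement(bits: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def decode(mantissa: str, exponent: str) -> Fraction:
    """Return mantissa x 2^exponent as an exact fraction."""
    # Dividing by 2^(n-1) puts the binary point after the first mantissa bit
    m = Fraction(twos_complement(mantissa), 1 << (len(mantissa) - 1))
    e = twos_complement(exponent)
    return m * Fraction(2) ** e


print(float(decode("01110100", "0010")))  # 3.625
```

`Fraction` is used here so that the result is exact and no extra rounding is introduced by the check itself.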

Converting denary to binary floating-point number (example)

  1. You will be given a denary number and the number of bits for mantissa and exponent
    Example (matching the table in step 2): -102.75 with a 10-bit mantissa and a 6-bit exponent
  2. Convert to binary, putting the decimal point in the correct place
    -128  64  32  16   8   4   2   1  .  1/2  1/4
       1   0   0   1   1   0   0   1  .   0    1
  3. Move the decimal point until the number is normalized (the value represented is the mantissa)
    1.001100101
  4. Calculate the exponent.
    Since the decimal point moved 7 places to the left, the value of the exponent will be 7: 000111
    Mantissa: 1001100101 Exponent: 000111

Note: If a given number cannot be represented fully, represent the closest number possible (Precision may be lost)
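
The conversion can be checked with a short Python sketch. The bit pattern 10011001.01 in the example corresponds to -102.75 under the place values shown, so that value is used here. The function names are hypothetical, and the sketch assumes a two's complement mantissa and exponent and simply truncates any bits that do not fit (one way of illustrating the loss of precision mentioned in the note).

```python
import math
from fractions import Fraction


def to_twos_complement(n: int, bits: int) -> str:
    """Encode integer n as a two's complement bit string of the given width."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise OverflowError(f"{n} does not fit in {bits} bits")
    return format(n & ((1 << bits) - 1), f"0{bits}b")


def encode(value, mantissa_bits: int, exponent_bits: int):
    """Return (mantissa, exponent) bit strings for a normalized representation."""
    m = Fraction(value)
    if m == 0:
        return "0" * mantissa_bits, "0" * exponent_bits
    e = 0
    # Move the point left until the mantissa lies in [-1, 1)
    while not (-1 <= m < 1):
        m /= 2
        e += 1
    # Move the point right until normalized: [0.5, 1) or [-1, -0.5)
    while not (m >= Fraction(1, 2) or m < Fraction(-1, 2)):
        m *= 2
        e -= 1
    # Keep only the bits that fit; anything beyond them is lost
    mantissa_int = math.floor(m * (1 << (mantissa_bits - 1)))
    return (to_twos_complement(mantissa_int, mantissa_bits),
            to_twos_complement(e, exponent_bits))


print(encode(-102.75, 10, 6))  # ('1001100101', '000111')
```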

Normalization

  • Normalization is done to ensure uniqueness of number representation
  • Makes calculations more straightforward
  • Stores the maximum range of numbers in the minimum number of bits
  • Enables very large/small numbers to be stored with accuracy
  • A normalized positive number starts with 01 (reading from the left) and a normalized negative number starts with 10
  • Normalization Steps:
    • Example: Normalize 0010110110 0010
    • Step 1: Adjust the decimal point by shifting it left or right until the number begins with 01 or 10
      • 0.010110110 becomes 0.101101100
    • Step 2: Adjust the exponent as you shift the decimal point: moving it to the left increases the exponent, while moving it to the right decreases the exponent.
      • Decimal point moves 1 to the right, so 1 is deducted from the exponent. Answer becomes 0101101100 0001
  • Normalizing Negative Floating Point Values
    • Example: 11110010 0111
    • By default the number is in the format
      1.1110010 × 2^0111 = 1.1110010 × 2^7
    • Move the decimal point until normalized
      1111.0010000 (the three leading 1s only repeat the sign bit, so they are dropped)
    • Since the decimal point moved by 3 places, deduct 3 from exponent
      = 1.0010000 × 2^4
      = 10010000 0100
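
Both normalization examples above follow the same rule: shift the mantissa left while its first two bits are equal, and subtract 1 from the exponent for every shift. A minimal Python sketch of that rule is given below. The name `normalise` is illustrative, and two's complement is assumed for both fields.

```python
def normalise(mantissa: str, exponent: str):
    """Shift left until the mantissa starts with 01 or 10, adjusting the exponent."""
    if set(mantissa) == {"0"}:
        raise ValueError("zero has no normalized form")

    # Read the exponent as a two's complement integer
    e = int(exponent, 2)
    if exponent[0] == "1":
        e -= 1 << len(exponent)

    # Equal leading bits mean the second bit only repeats the sign bit
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + "0"  # point moves one place to the right
        e -= 1                         # so the exponent decreases by 1

    if e < -(1 << (len(exponent) - 1)):
        raise OverflowError("exponent is too negative for the available bits")
    width = len(exponent)
    return mantissa, format(e & ((1 << width) - 1), f"0{width}b")


print(normalise("0010110110", "0010"))  # ('0101101100', '0001')
print(normalise("11110010", "0111"))    # ('10010000', '0100')
```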

Overflow

  • Happens when a number is too large to be represented within the available number of bits.
  • The exponent exceeds its maximum value.
  • Precision may be lost
  • Excess bits are cut off
  • Example: Trying to store 2^20 with only a 4-bit exponent (the largest value a 4-bit two's complement exponent can hold is 7).

Underflow

  • Happens when a number is too small (close to 0) to be represented within the available bits.
  • The exponent is too negative (less than the minimum allowed).
  • Result: The number may be stored as zero, losing precision.
  • Example: 0.0000001 with limited mantissa and exponent bits.
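
Overflow and underflow can both be detected by comparing a value against the limits of the format. The sketch below is an illustration only: `classify` is a made-up name, and the limits are worked out assuming a two's complement mantissa and exponent (8 and 4 bits by default).

```python
from fractions import Fraction


def classify(value, mantissa_bits: int = 8, exponent_bits: int = 4) -> str:
    """Report whether a value overflows, underflows or lies within range."""
    v = Fraction(value)
    max_exp = (1 << (exponent_bits - 1)) - 1   # e.g. 0111 = 7
    min_exp = -(1 << (exponent_bits - 1))      # e.g. 1000 = -8
    step = Fraction(1, 1 << (mantissa_bits - 1))  # value of the last mantissa bit

    largest_positive = (1 - step) * Fraction(2) ** max_exp
    most_negative = -Fraction(2) ** max_exp
    smallest_magnitude = step * Fraction(2) ** min_exp

    if v > largest_positive or v < most_negative:
        return "overflow"
    if v != 0 and abs(v) < smallest_magnitude:
        return "underflow"
    return "in range"


print(classify(2 ** 20))      # overflow
print(classify("0.0000001"))  # underflow
print(classify(3.625))        # in range
```

"In range" here only means the value lies between the limits. It may still need rounding to fit the mantissa.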

Rounding Error

  • When a number cannot be represented exactly in binary, it is approximated to the closest possible value.
  • This small difference can cause inaccuracies, especially when such numbers are used in multiple calculations.
  • Over time, these tiny errors accumulate and become noticeable
  • Example: 0.1 + 0.7 might give 0.7999999999999 instead of 0.8.
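
The effect is easy to reproduce in Python, whose `float` type is a 64-bit binary floating-point number on common platforms. The short sketch below shows the example above and how repeated additions drift. Comparing with a tolerance (here `math.isclose`) is one common way of coping, not the only one.

```python
import math
from fractions import Fraction

total = 0.1 + 0.7
print(total)                     # 0.7999999999999999
print(total == 0.8)              # False
print(math.isclose(total, 0.8))  # True: compare with a tolerance instead

# 0.1 itself is not stored exactly, so the error is there before any addition
print(Fraction(0.1) == Fraction(1, 10))  # False

# Errors accumulate when the value is used many times
running = sum([0.1] * 10)
print(running == 1.0)            # False
```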

Important values

Consider the following for an 8-bit mantissa and a 4-bit exponent:

                                        Mantissa    Exponent
Largest positive number                 0111 1111   0111
Smallest positive number                0000 0001   1000
Smallest normalized positive number     0100 0000   1000
Largest magnitude negative number       1000 0000   0111
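
The same four patterns can be built for any field widths. The sketch below is a hypothetical helper (`important_patterns` is not a standard function) that assumes two's complement for both fields, so the largest exponent is 0 followed by 1s and the most negative exponent is 1 followed by 0s.

```python
def important_patterns(mantissa_bits: int = 8, exponent_bits: int = 4) -> dict:
    """Build the limiting bit patterns for the given field widths."""
    max_exp = "0" + "1" * (exponent_bits - 1)  # largest positive exponent
    min_exp = "1" + "0" * (exponent_bits - 1)  # most negative exponent
    return {
        "Largest positive number":
            ("0" + "1" * (mantissa_bits - 1), max_exp),
        "Smallest positive number":
            ("0" * (mantissa_bits - 1) + "1", min_exp),
        "Smallest normalized positive number":
            ("01" + "0" * (mantissa_bits - 2), min_exp),
        "Largest magnitude negative number":
            ("1" + "0" * (mantissa_bits - 1), max_exp),
    }


for name, (mantissa, exponent) in important_patterns().items():
    print(f"{name}: {mantissa} {exponent}")
```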