Part 10. Attention: The Floating-Point!
content by Xing Chen
Suppose a bank used the IEEE 754 single-precision (32-bit) floating-point format to do its banking. You opened an account, the bank asked for a minimum deposit of 10€, you had exactly 12.56€ in your pocket, and you deposited all of it.
We count currency in decimal because we have ten fingers, 😉. But, you know, computers use binary. The bank's first step is to convert your 12.56€ into a binary number. There you go: according to the bank, you now have 01000001010010001111010111000011€, an IEEE 754 single-precision (32-bit) floating-point number.
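You can reproduce this bit pattern yourself. Here is a minimal Python sketch using only the standard library (the helper name `float32_bits` is mine):

```python
import struct

def float32_bits(x: float) -> str:
    # Pack the value as a big-endian IEEE 754 single-precision float,
    # reinterpret the 4 bytes as an unsigned integer, and print all 32 bits.
    (n,) = struct.unpack('>I', struct.pack('>f', x))
    return format(n, '032b')

print(float32_bits(12.56))  # 01000001010010001111010111000011
```

Reading the 32 bits left to right: 1 sign bit, 8 exponent bits, 23 mantissa bits.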
The other day, you won a lottery prize of 12.56 trillion €, that is 1.256E+13€. You took the check to the bank and happily deposited your money; according to the bank, you deposited 01010101001101101100010110100101€.
The next day, you decided to withdraw all your money! The bank had to agree, and it wrote you a check for 12559999565824€, which is 1.2559999565824E+13€.
You were puzzled and asked why you didn't get a check for 12.56 trillion euros plus 12.56€, which makes 12560000000012.56€.
The bank, dumbfounded, asked you: "Why do you expect that amount of money? You have only 12559999565824€ in your account!"
What happened?
If you like, you can play the bank's game yourself using the following tool: IEEE-754 Floating Point Converter
There are infinitely many real numbers, but because single precision has only 32 bits, it can represent at most 2³² distinct values out of that infinity (positive and negative all included)! The computer has to round an unrepresentable real number to a representable one (per the IEEE 754 standard). Unfortunately, 12.56€, 12.56 trillion €, and the sum of the two are all unrepresentable; each has to be rounded to a representable value, and both the lottery amount and the total round to the same 12559999565824€.
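You can replay the whole story in a few lines of Python. This is a sketch that simulates the bank's 32-bit arithmetic by round-tripping every value through the single-precision format with `struct` (the helper name `f32` is mine):

```python
import struct

def f32(x: float) -> float:
    # Round a Python double to the nearest IEEE 754 single-precision value,
    # simulating a bank that stores balances as 32-bit floats.
    return struct.unpack('>f', struct.pack('>f', x))[0]

balance = f32(12.56)                    # opening deposit
balance = f32(balance + f32(12.56e12))  # lottery deposit, rounded again
print(balance)                          # 12559999565824.0 -- your 12.56 is gone
```

The 12.56€ disappears because it is far smaller than the gap between consecutive single-precision values near 12.56 trillion (that gap is 2²⁰ = 1048576), so the sum rounds straight back to the same representable number.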
Virtually all CPUs on the market implement the IEEE 754 standard, but not all of them round numbers the same way: the standard defines several rounding modes, such as round toward zero and round to nearest, etc. The part of the number that gets discarded by rounding is called the rounding error!
As the examples above show, floating-point precision is relative to the absolute value of the represented number: big numbers come with big rounding errors, small numbers with small ones. For single precision, the relative precision is always around one part in 10⁷ of the number's absolute value; for double precision, around one part in 10¹⁶.
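A quick way to see this relative behaviour, again simulating single precision with a `struct` round-trip (the helper name `f32` is an assumption of this sketch):

```python
import struct

def f32(x: float) -> float:
    # Nearest IEEE 754 single-precision value of x.
    return struct.unpack('>f', struct.pack('>f', x))[0]

for x in (12.56, 125.6, 12.56e12):
    err = abs(f32(x) - x)
    # The absolute error grows with the magnitude of x, but the relative
    # error stays bounded by 2**-24, i.e. roughly one part in 10**7.
    print(f'{x:>16g}  absolute error {err:.3g}  relative error {err / x:.1e}')
```

The absolute error jumps from microscopic fractions of a euro to hundreds of thousands of euros, yet the relative error never exceeds about 6×10⁻⁸.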
I've shared a very short set of slides that explain all of this:
https://www.linkedin.com/feed/update/urn:li:activity:7270845679986503680