• UnknownFryingObject@feddit.de
    1 year ago

    Well, that’s how floating-point numbers work.

    The following explains the principle rather than precisely describing float values in programming, since binary representation has its own quirks, especially for values below one. Anyway:

    Think of a number written as a significand multiplied by a base raised to an exponent. For example,
    1,000,000
    can be represented as 1 * 10^6.

    1,000,001 then becomes 1.000001 * 10^6.

    If you want more precision, or bigger numbers at the same precision, you have to keep adding digits to the significand, and the number of digits that can be stored is fixed.

    So with a fixed number of digits, a floating-point value can either be very large or resolve very fine differences between small numbers. But it cannot do both at the same time.
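
    A minimal Python sketch of the same idea, assuming a made-up limit of 7 significant decimal digits (real floats use binary digits, so the numbers differ, but the effect is the same). The last two lines show the equivalent behaviour with an actual 64-bit double:

```python
from decimal import Context, Decimal

# Hypothetical format that keeps only 7 significant decimal digits.
ctx = Context(prec=7)

fits = ctx.create_decimal(1000001)       # 7 digits: stored exactly
too_long = ctx.create_decimal(10000001)  # 8 digits: the trailing 1 is rounded away

print(fits == Decimal(1000001))          # the value survived
print(too_long == Decimal(10000000))     # the last digit was lost

# Same effect with a real 64-bit double, which carries 53 binary digits.
big = 2.0 ** 53
print(big + 1.0 == big)                  # adding 1 no longer changes the value
```

    All three lines print True: the value that fits in the digit budget is kept, and the ones that need one more digit silently lose it.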

    • Traister101@lemmy.today
      1 year ago

      Alternatively, since both floats (32 bit) and doubles (64 bit) are stored in binary, we can compare them directly to the possible values of an int (32 bit) and a long (64 bit). A float has the same number of bit patterns as an int, 2^32, and a double has the same number as a long, 2^64 (a small share of those patterns encode NaN and the infinities rather than numbers). That’s a lot of values, but still ultimately limited.
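
      As a rough illustration, assuming the usual IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits), the sketch below reinterprets a float's 32 bits as an unsigned int and counts how the 2^32 patterns are divided up. The helper name `float32_bits` is made up for this example:

```python
import struct

def float32_bits(x: float) -> int:
    """Round x to single precision and return its 32 bits as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

TOTAL_PATTERNS = 2 ** 32           # same count as a 32-bit int
FINITE = 2 * 255 * 2 ** 23         # sign * non-special exponents * fractions
INFINITIES = 2                     # +inf and -inf
NANS = 2 * (2 ** 23 - 1)           # all-ones exponent with a non-zero fraction

print(hex(float32_bits(1.0)))      # bit pattern of 1.0
print(FINITE + INFINITIES + NANS == TOTAL_PATTERNS)
```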

      Since we generally use decimal numbers that look like 1.5 or 3.14, the format is set up so that the values are clustered around 0. Every interval between consecutive powers of 2 holds the same number of values, so each time the magnitude doubles the values are spread half as densely. That gives you high precision around zero (what you use and care about in practice) and less precision as you move towards negative and positive infinity.

      In essence it’s a fancy fraction that is most precise when representing a small value and less precise as the value gets farther from zero.
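
      The widening gaps can be measured directly. This is a small sketch using Python's `math.ulp` (available from Python 3.9), which returns the gap between a double and the next larger one; Python floats are 64-bit doubles, so the exact figures apply to doubles rather than 32-bit floats:

```python
import math

# Gap to the next representable double at a few magnitudes.
for x in (1.0, 2.0, 4.0, 1024.0, 1e16):
    print(x, math.ulp(x))

# The gap doubles each time the value crosses a power of two.
doubles_at_two = math.ulp(2.0) == 2 * math.ulp(1.0)
doubles_at_four = math.ulp(4.0) == 2 * math.ulp(2.0)
print(doubles_at_two, doubles_at_four)

# Around 1e16 neighbouring doubles are already 2 apart,
# so not every whole number can be represented there.
gap_at_1e16 = math.ulp(1e16)
print(gap_at_1e16)
```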