Re: Representing a double as a Fast scaled decimal
Walter Mascarenhas / GeoCAD
9 Oct 2008 11:45AM ET

Daniel and Dimitri,

These floating point issues can get very tricky. I,
for instance, have been misled by them many, many times.

One point I am trying to make is that you don't have the
12.34500 value in the double world to start with.
There is no such thing as a nice double with value x = 12.34500.
The closest you will ever get to x is xr equal to
123450000000000006394884621840901672840118408203125 x 10^(-49)
with an error of about 6.4 x 10^(-16).
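
A quick way to check this is to print the double with 49 digits after the decimal point. This is only a sketch: the helper name exact_digits is made up, and it assumes a C library whose printf family expands the binary value exactly (glibc does; some older runtimes pad with zeros instead).

```c
#include <stdio.h>

/* Hypothetical helper: writes x into buf with 49 digits after the
   decimal point. Assumes snprintf produces the exact decimal expansion
   of the double (true for glibc, not guaranteed everywhere). */
static int exact_digits(double x, char *buf, size_t n)
{
    return snprintf(buf, n, "%.49f", x);
}
```

Called with 12.345 and a buffer of 80 chars, such a library should give back the digits of xr above, with the decimal point after the "12".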

As a result, in most cases the double produced by rounding
will have its least significant decimal different from zero
(for instance 12.34500 would look like 12.3450000000001)
and you would have no compression: the full 64 bits of the
number would need to be sent down the wire, because you would
never know if that last one was there on purpose or it
was caused by rounding. Or, in the bit vector alternative
suggested a few messages earlier, the bit vector would have
its maximum length most of the time.
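
To make the compression point concrete, here is a rough sketch of trailing-zero stripping on a decimal mantissa. It assumes the convention value = mantissa * 10^exponent (with a negative exponent for fractions), and the name strip_trailing_zeros is made up for the illustration.

```c
#include <stdint.h>

/* Hypothetical helper: removes trailing decimal zeros from the mantissa
   and raises the exponent accordingly, so mantissa * 10^exponent is
   unchanged. Returns the number of zeros removed. */
static int strip_trailing_zeros(int64_t *mantissa, int32_t *exponent)
{
    int removed = 0;
    while (*mantissa != 0 && *mantissa % 10 == 0) {
        *mantissa /= 10;
        ++*exponent;
        ++removed;
    }
    return removed;
}
```

With mantissa 123450 and exponent -4 this shrinks to 12345 and -3. With a mantissa like 123450000000001 nothing can be removed, which is the "no compression" case described above.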

As for Daniel's suggestion,

>> float x = 123.4500;
>> int64_t mantissa = (int64_t)(x * 10000);
>> int32_t exponent = 4;

I would do something different, but
his suggestion is much simpler and he
is right about the details: it can get
very messy in general cases.

I would go roughly like this
(assuming x > 0, that we have an unsigned
int with 128 bits and disregarding some details,
so that you get the idea):

int e;
double xm = frexp(x,&e);            // x = xm * 2^e with 0.5 <= xm < 1
UINT64 m = (UINT64) ldexp(xm,53);   // now x = 2^(e - 53) m exactly

UINT128 result = Mult(m,625);       // 625 = 5^4 (in C, ^ would be XOR)
// now x = result * 2^(e - 49) * 10^(-4), exactly

if( e >= 49 )
{
  result <<= (e - 49);
  // now x = result * 10^(-4) exactly
}
else
{
  // drop the low (49 - e) bits, rounding to nearest (ties go up)
  UINT128 one = 1;
  UINT128 mask = ((one << (49 - e)) - 1);
  UINT128 remainder = result & mask;
  if( 2 * remainder > mask )
    result = (result >> (49 - e)) + 1;
  else
    result >>= 49 - e;
}
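
For the specific case of 4 decimals the 128-bit type can be avoided: m < 2^53 and 5^4 = 625 < 2^10, so the product fits in 64 bits. Below is a minimal runnable sketch under that observation. The name scaled_mantissa4 is made up, and it assumes x is finite, x > 0 and x * 10^4 < 2^63.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: returns x * 10^4 rounded to the nearest integer
   (ties go up), computed exactly from the bits of x.
   Assumes x is finite, x > 0 and x * 10^4 < 2^63. */
static uint64_t scaled_mantissa4(double x)
{
    int e;
    double xm = frexp(x, &e);              /* x = xm * 2^e, 0.5 <= xm < 1 */
    uint64_t m = (uint64_t)ldexp(xm, 53);  /* x = m * 2^(e - 53) exactly  */
    uint64_t result = m * 625u;            /* m < 2^53, so no overflow    */
    /* now x = result * 2^(e - 49) * 10^(-4) exactly */

    if (e >= 49)
        return result << (e - 49);         /* fits by the assumption on x */

    int k = 49 - e;                        /* number of bits to drop      */
    if (k >= 64)
        return 0;                          /* result < 2^63: rounds to 0  */

    uint64_t half = (uint64_t)1 << (k - 1);
    uint64_t mask = (half << 1) - 1;
    uint64_t remainder = result & mask;
    return (result >> k) + (remainder >= half ? 1 : 0);
}
```

With this sketch scaled_mantissa4(12.345) gives 123450, because the error of about 6.4 x 10^(-16) in xr is far below half a unit in the fourth decimal.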

Hope this helps (but don't trust the details...)

        Walter.

> Dimitry, There are two ways to look at this problem. If you know ahead of
> time that you are going to always use a fixed number of decimal
> precision (again, let's say your business rules require a precision of 4
> decimal points), and you have already either rounded your float to those
> 4 points of precision, or truncation is okay for your application, then
> the conversion is simple:
>
> float x = 123.4500;
> int64_t mantissa = (int64_t)(x * 10000);
> int32_t exponent = 4;
>
>
> The more complicated case is when you are trying to generically convert
> a floating point number to a FAST scaled decimal with
> a.) not losing any precision, and
> b.) optimizing the exponent so as to keep the mantissa as small as
> possible.
>
> For this, take a look at modf() in the standard C library. It breaks a
> float into the whole and fractional parts. You can then cast or convert
> the float whole and fractional parts to integers. Next, you would remove
> any unnecessary precision from the fractional integer by using a mod and
> divide by 10 while there are trailing zeros left.
>
> Finally, to create the FAST scaled decimal mantissa, you must determine
> the FAST exponent by inspecting the size of the fractional integer, and
> then adjust the whole integer by that factor, and add back the
> fractional part.
>
> I am working on a C/C++ example for you...
>
> /Daniel

