In a .c file written by someone else, I saw this:
const float c = 0.70710678118654752440084436210485f;
where he wants to avoid computing sqrt(1/2).
Can this value really be stored in plain C/C++? I mean without losing precision. It seems impossible to me.
I am using C++, but I do not believe the precision differs much between the two languages (if at all), which is why I did not test it in C.
So, I wrote these few lines, to have a look at the behaviour of the code:
std::cout << "Number: 0.70710678118654752440084436210485\n";
const float f = 0.70710678118654752440084436210485f;
std::cout << "float: " << std::setprecision(32) << f << std::endl;
const double d = 0.70710678118654752440084436210485; // no f extension
std::cout << "double: " << std::setprecision(32) << d << std::endl;
const double df = 0.70710678118654752440084436210485f;
std::cout << "doublef: " << std::setprecision(32) << df << std::endl;
const long double ld = 0.70710678118654752440084436210485;
std::cout << "l double: " << std::setprecision(32) << ld << std::endl;
const long double ldl = 0.70710678118654752440084436210485l; // l suffix!
std::cout << "l doublel: " << std::setprecision(32) << ldl << std::endl;
The output is this:
                   *       ** ***
                   v        v v
Number:    0.70710678118654752440084436210485    // 32 decimal digits
float:     0.707106769084930419921875            // 24 decimal digits
double:    0.70710678118654757273731092936941
doublef:   0.707106769084930419921875            // same as float
l double:  0.70710678118654757273731092936941    // same as double
l doublel: 0.70710678118654752438189403651592    // suffix l
where * is the last accurate digit of float, ** the last accurate digit of double and *** the last accurate digit of long double.
The double output has 32 decimal digits, since I set the precision of std::cout to that value.
The float output has 24 decimal digits, as expected, as said here:
float has 24 binary bits of precision, and double has 53.
I would have expected the doublef output to be the same as the double output, i.e. that the f suffix would not prevent the number from being stored as a double. I think that when I write this:
const double df = 0.70710678118654752440084436210485f;
what happens is that the number first becomes a float and is then stored as a double, so after the 24th decimal digit it has only zeroes, and that is why the double precision stops there.
Am I correct?
From this answer I found some relevant information:
float x = 0 has an implicit typecast from int to float.
float x = 0.0f does not have such a typecast.
float x = 0.0 has an implicit typecast from double to float.
[EDIT]
About __float128: it is not standard, so it is out of the competition. See more here.