IMO this conversion should fail if the number represented is not valid, or fall ...

fffggg · on April 12, 2012

How do you test that it isn't valid? I think you may be underestimating the difficulty in predicting whether a particular decimal number can be accurately represented as a floating point type. It may be non-obvious, but the representation of precise numbers changes depending on the number base. For example, in base 10, we can't precisely represent 1/3. In base 3, we can (0.1). In base 2, we can't precisely represent 0.1, or 1/10. A simple number such as 0.1 has no precise representation in base 2.

In this case in php, the truncation happens due to loss of precision in the mantissa of the double precision float. But there are so many other ways to lose precision, I don't think it's reasonable to ask a language to attempt to account for them.

This is why languages should have clear rules about when type conversion occurs, and allow the user to prevent it when it isn't desirable.

edit: in fact, amusingly, php seems to be doing some non-standard stuff with its floats. I was going to make a point about how you can't determine if a double is a "correct" representation of a string decimal, but in mocking an example I discovered something odd. Check this out:

This is what one should expect:

$ ruby -e'puts "%5.25f" % 0.1'

0.1000000000000000055511151

$ perl -wle'printf "%5.25f\n", 0.1'

0.1000000000000000055511151

But in php:

$ php -r'printf("%5.25f\n", 0.1);'

0.1000000000000000000000000

$ php -r'printf("%5.25f\n", "0.1");'

0.1000000000000000000000000

Is php changing the type conversion? Or not using double precision at all?

andreasvc · on April 12, 2012

The idea of casting everything to float is just wrong; a string of digits without a dot should be converted to a (big) integer, without any loss of precision. Anyway I just can't fathom how anyone could think weak typing is a good idea; it might make some superficial things "easier", but you'll soon shoot yourself in the foot with it.

muyuu · on April 12, 2012

Precision in floats is accepted as a fact of life. Converting exceedingly big INT string literals to bigger float types is a hack to win some naive benchmarks against languages doing native proper arbitrary precision. This shouldn't have happened in the first place, but since it's there and backwards compatibility is important, it could be shown in a warning error_level[1] that the conversion happened, so the user could at least check that and hack a solution together.

[1] This doesn't really happen in PHP, but you have $php_errormsg that can be set without stopping execution (as happens with some errors/warnings when error_level is not set to E_STRICT, and below that depending on the error). This errors could be triggered in a new level, let's say "E_PEDANTIC".

fffggg · on April 13, 2012

You just reiterated my point. This is precisely why your original suggestion of failing an "invalid" conversion is untenable. ALL conversions lack precision -- there is no such thing as an "invalid" conversion.

muyuu · on April 13, 2012

Nope, there are conventions.

We have two distinct problems here:

- strings converting to numbers without there being any number on any side. "Peculiar" of PHP but easy to circumvent using string comparison. IMO belongs in PHP4 but not at all in PHP5, which is an attempt at a "general-purpose" language. To be frank, I thought PHP4 made more sense because it was 1st of class at what it did, while PHP5 falls short to a number of languages in basically everything.

- automatic integer-to-float comparison to accomodate bigger integers. A horrible hack to squeeze a little extra performance in naive benchmarks in computers with no native 64 bit integer support. This really makes no sense whatsoever now and may have had some partial justification in the early 90s, prior to PHP4 even.

Both ideas are terrible and pretty much unique to PHP of all popular languages.

This is not a philosophical debate about typing styles or the existence of perfect type conversions. PHP's problems in this regard are relics from a dubious past.

fffggg · on April 13, 2012

- automatic integer-to-float comparison to accommodate bigger integers. A horrible hack to squeeze a little extra performance in naive benchmarks in computers with no native 64 bit integer support. This really makes no sense whatsoever now and may have had some partial justification in the early 90s, prior to PHP4 even.

No, this is not unique to php. Many popular, comparable languages perform an int -> float conversion. For example, Perl:

$ perl -wle'print "20938410923849012834092834" + 0 if "20938410923849012834092834" == "20938410923849012834092835"'

2.0938410923849e+25

- This is not a philosophical debate about typing styles or the existence of perfect type conversions. PHP's problems in this regard are relics from a dubious past.

Conversion from string -> number, and loose numeric types which auto-convert to float are near universal in loosely typed languages, out of necessity -- if such a scheme doesn't work consistently it can't be used at all. This brings me back to my point. You said "IMO this conversion should fail if the number represented is not valid, or fall back to arbitrary precision math". My response is that you cannot provide such a rule on the basis of "is it valid" because there is no such thing as a "valid" type conversion -- ALL have precision loss. It is inherent in the datatype. When I said "you may be underestimating the difficulty in predicting whether a particular decimal number can be accurately represented as a floating point type" you should perhaps read that as "you cannot do this, it is not possible".

Instead you might suggest that no loose conversion, no loose typing be permitted in a language design -- and I would agree wholeheartedly. But your suggestion that this be handled on a case-by-case basis depending on the numeric value is fundamentally unworkable. Big integers are not the only area this type of problem presents.

muyuu · on April 13, 2012

Perl5 is old enough so this behaviour has a niche. Possibly even PHP4 is old enough for that. But PHP5 was born when both 64 bit ints and good, open arbitrary precision libraries were available and very fast.

This, now, where it's being used, is absurd. There is no two ways to that. And this doesn't happen elsewhere to this extent.

I will leave you the last word though. Cheerio.

mikeash · on April 12, 2012

Testing validity should be pretty easy. Remove insignificant zeroes from both ends, then only accept the conversion when it is precisely correct. This can be done by simply converting to double and then back and seeing if you get the same thing. If there's any difference, it's not sufficiently accurate, bail out.

fffggg · on April 13, 2012

You're missing the point -- there is no such thing as a precisely correct conversion to a floating point number.

Your suggestion would make type conversion utterly unusable as it would fail seemingly randomly -- for example on simple numbers such as "0.1"

mikeash · on April 13, 2012

Sure there is such a thing. 0.25 can be precisely converted to float. You're right that such a thing would fail a lot, but that doesn't mean the goal is impossible, merely that achieving it is not very useful.

fffggg · on April 14, 2012

I did not say it was "impossible," I said it would be "utterly unusable."

I am happy to see you agree with me.