On 01/20/2016 02:02 AM, Markus Armbruster wrote: >> @@ -519,6 +519,8 @@ static QObject *parse_literal(JSONParserContext *ctxt) >> } >> case JSON_FLOAT: >> /* FIXME dependent on locale */ >> + /* FIXME our lexer matches RFC7159 in forbidding Inf or NaN, > > For what it's worth, the RFC spells this "RFC 7159". Looks like we use space more often than not, but that we're inconsistent. For example: slirp/tcp.h: * Per RFC 793, September, 1981. slirp/tcp.h: * Per RFC793, September, 1981. Will fix if I need to respin, otherwise I assume you can do it. >> + /* FIXME: snprintf() is locale dependent; but JSON requires >> + * numbers to be formatted as if in the C locale. */ > > The new FIXME matches the existing one in parse_literal(), visible > above. > > However, dependence on C locale is a pervasive issue in QEMU. These two > comments could give readers the idea that it's an issue just here. > Suggest to add something like "Dependence on C locale is a pervasive > issue in QEMU." Good idea. > >> + /* FIXME: This risks printing Inf or NaN, which are not valid >> + * JSON values. */ >> + /* FIXME: the default precision of %f may be insufficient to >> + * tell this number apart from others. */ > > Yup. > > The easy way to avoid loss of precision is %a, but of course that's way > too much sophistication for JSON. > > Avoiding loss of precision with a decimal format is non-trivial; see > Steele, Jr., Guy L., and White, Jon L. How to print floating-point > numbers accurately, SIGPLAN ’90, and later improvements collected here: > http://stackoverflow.com/questions/7153979/algorithm-to-convert-an-ieee-754-double-to-a-string Ah, memories. I read and implemented that paper when working on the jikes compiler for the Java language back in the late nineties, as it is the Java language specification which had the very neat property of requiring the shortest decimal string that will unambiguously round back to the same floating point pattern. One alternative is to always output a guaranteed unambiguous decimal string (although not necessarily the shortest), by using %.17f, using DBL_DECIMAL_DIG. (Note that DBL_DIG of 15 is NOT sufficient - it is the lower limit that says that a decimal->float->decimal will not change the decimal; but we want the converse where a float->decimal->float will not change the float. There are stretches of numbers where the pigeonhole principle applies; you can think of it this way: there is no way to map all possible 2^10 (1024) binary values inside 2^3 (1000) decimal digits without at least 24 of them needing one more decimal digit. But by the same arguments, DBL_DECIMAL_DIG is an upper limit and usually more than you need.) So, the question is whether we want to always output 17 digits, or whether we want to do the poor-man's truncation scheme (easy to understand, but not optimal use of the processor), or go all the way to the algorithm of that paper (faster but lots harder to understand). For reference, here's the poor-man's algorithm in pseudocode: if 0, inf, nan: special case else: Obtain the DBL_DECIMAL_DIG string via sprintf %.17f i = 17; do { truncate the original string to i-1 decimal characters parse that with strtod() if the bit pattern differs: break; } while (--i); assert(i) use i digits of the string As a separate patch, of course, but I have a pending patch that provides a single place where we could drop in such an improvement: https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03932.html -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org