From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42841) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bDZNQ-000702-1G for qemu-devel@nongnu.org; Thu, 16 Jun 2016 11:38:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bDZNJ-0007Az-Ru for qemu-devel@nongnu.org; Thu, 16 Jun 2016 11:38:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58324) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bDZNJ-0007AM-KP for qemu-devel@nongnu.org; Thu, 16 Jun 2016 11:38:33 -0400 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1904563E01 for ; Thu, 16 Jun 2016 15:38:33 +0000 (UTC) From: Markus Armbruster References: <1465526889-8339-1-git-send-email-eblake@redhat.com> <1465526889-8339-4-git-send-email-eblake@redhat.com> Date: Thu, 16 Jun 2016 17:38:30 +0200 In-Reply-To: <1465526889-8339-4-git-send-email-eblake@redhat.com> (Eric Blake's message of "Thu, 9 Jun 2016 20:48:08 -0600") Message-ID: <87porhw3d5.fsf@dusky.pond.sub.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH 3/4] qobject: Parse non-finite numbers, as an extension List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake Cc: qemu-devel@nongnu.org, Luiz Capitulino Eric Blake writes: > There have been times in the past where we have been careless and > allowed non-finite values to escape as a 'number' type in QMP > output (such as before commit 27ff42e). This is in violation of > the JSON specification in RFC 7159, and usually caused by > floating-point division by 0 resulting in NaN, although infinity > is also a possible culprit. While a future patch will tighten the > output generator to avoid a lexical error from a strict JSON > parser, we might as well first fix our parser to accept non-finite > values as an extension, so that we can always read what we have > output in the past ("be liberal in what you accept, strict in what > you produce"). > > Rather than complicate the lexer to add LOTS of states for each > letter within 'nan' and 'inf[inity]', I chose to just abuse the > 'keyword' token, but had to make it accept upper case. Also, since > I want to parse '-inf', I had to tweak IN_NEG_NONZERO_NUMBER; and > renamed that in the process (we have always accepted '-0', so the > state name was a misnomer). Then the parser then does the real magic Then the parser does > of creating a QFloat object if a non-finite "keyword" was recognized. > > I intentionally did not support NAN(n-char-sequence) forms, even > though C99 requires it to work for strtod(), in part because > "implementation-specified" n-char-sequence is rather hard to test > in a platform-agnostic manner, and in part because we've never > actually output it. I'm afraid only because libc didn't. =C2=A77.19.6.1: A double argument representing an infinity is converted in one of the styles [-]inf or [-]infinity -- which style is implementation-defined. A double argument representing a NaN is converted in one of the styles [-]nan or [-]nan(n-char-sequence) -- which style, and the meaning of any n-char- sequence, is implementation-defined. > Improve the testsuite to cover the new extension, including working > around the fact that g_assert_cmpfloat() does NOT test equivalence, > but merely mathematical equality (0.0 and -0.0 are equal but not > equivalent, NaN and NaN are equivalent but not equal), and enhance > the existing tests for -0.0 in the process. > > Checkpatch has a false negative, complaining that there should be Isn't this a false positive? > spaces around binary '-' in what is really the float literal > -32.20e-10. It was over my head to figure out how to silence that > one. > > Signed-off-by: Eric Blake > --- > qobject/json-lexer.c | 15 ++++++++----- > qobject/json-parser.c | 13 +++++++++-- > tests/check-qjson.c | 62 ++++++++++++++++++++++++++++++++++++++++++++-= ------ > docs/qmp-spec.txt | 4 ++++ > 4 files changed, 79 insertions(+), 15 deletions(-) > > diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c > index ebd15d8..de16219 100644 > --- a/qobject/json-lexer.c > +++ b/qobject/json-lexer.c > @@ -18,13 +18,14 @@ > #define MAX_TOKEN_SIZE (64ULL << 20) > > /* > - * Required by JSON (RFC 7159), plus \' extension in "": > + * Required by JSON (RFC 7159), plus \' extension in "", and extension > + * of parsing case-insensitive non-finite numbers like "NaN" and "-Inf": > * > * \"([^\\\"]|(\\\"|\\'|\\\\|\\/|\\b|\\f|\\n|\\r|\\t| > * \\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*\" > * -?(0|[1-9][0-9]*)(.[0-9]+)?([eE][-+]?[0-9]+)? > * [{}\[\],:] > - * [a-z]+ # covers null, true, false > + * -?[a-zA-Z]+ # covers null, true, false, nan, inf[inity] > * > * Extension of '' strings: > * > @@ -58,7 +59,7 @@ enum json_lexer_state { > IN_MANTISSA, > IN_MANTISSA_DIGITS, > IN_NONZERO_NUMBER, > - IN_NEG_NONZERO_NUMBER, > + IN_NEG_NUMBER, > IN_KEYWORD, > IN_ESCAPE, > IN_ESCAPE_L, > @@ -206,15 +207,18 @@ static const uint8_t json_lexer[][256] =3D { > ['.'] =3D IN_MANTISSA, > }, > > - [IN_NEG_NONZERO_NUMBER] =3D { > + [IN_NEG_NUMBER] =3D { > ['0'] =3D IN_ZERO, > ['1' ... '9'] =3D IN_NONZERO_NUMBER, > + ['a' ... 'z'] =3D IN_KEYWORD, > + ['A' ... 'Z'] =3D IN_KEYWORD, > }, > > /* keywords */ > [IN_KEYWORD] =3D { > TERMINAL(JSON_KEYWORD), > ['a' ... 'z'] =3D IN_KEYWORD, > + ['A' ... 'Z'] =3D IN_KEYWORD, > }, > > /* whitespace */ > @@ -264,7 +268,7 @@ static const uint8_t json_lexer[][256] =3D { > ['\''] =3D IN_SQ_STRING, > ['0'] =3D IN_ZERO, > ['1' ... '9'] =3D IN_NONZERO_NUMBER, > - ['-'] =3D IN_NEG_NONZERO_NUMBER, > + ['-'] =3D IN_NEG_NUMBER, > ['{'] =3D JSON_LCURLY, > ['}'] =3D JSON_RCURLY, > ['['] =3D JSON_LSQUARE, > @@ -272,6 +276,7 @@ static const uint8_t json_lexer[][256] =3D { > [','] =3D JSON_COMMA, > [':'] =3D JSON_COLON, > ['a' ... 'z'] =3D IN_KEYWORD, > + ['A' ... 'Z'] =3D IN_KEYWORD, > ['%'] =3D IN_ESCAPE, > [' '] =3D IN_WHITESPACE, > ['\t'] =3D IN_WHITESPACE, > diff --git a/qobject/json-parser.c b/qobject/json-parser.c > index 67ed727..12519b6 100644 > --- a/qobject/json-parser.c > +++ b/qobject/json-parser.c > @@ -450,6 +450,16 @@ static QObject *parse_keyword(JSONParserContext *ctx= t) > return QOBJECT(qbool_from_bool(false)); > } else if (!strcmp(token->str, "null")) { > return qnull(); > + } else { > + double d; > + char *p; > + > + /* The lexer feeds us "NaN" and "-Inf" as keywords */ > + errno =3D 0; > + d =3D strtod(token->str, &p); > + if (!errno && p !=3D token->str && !*p) { > + return QOBJECT(qfloat_from_double(d)); > + } > } > parse_error(ctxt, token, "invalid keyword '%s'", token->str); > return NULL; > @@ -519,8 +529,7 @@ static QObject *parse_literal(JSONParserContext *ctxt) > } > case JSON_FLOAT: > /* FIXME dependent on locale; a pervasive issue in QEMU */ > - /* FIXME our lexer matches RFC 7159 in forbidding Inf or NaN, > - * but those might be useful extensions beyond JSON */ > + /* NaN and infinity are parsed as extensions under parse_keyword= () */ "Under"? What about "extensions, by parse_keyword()"? > return QOBJECT(qfloat_from_double(strtod(token->str, NULL))); > default: > abort(); > diff --git a/tests/check-qjson.c b/tests/check-qjson.c > index 0e158f6..95adf64 100644 > --- a/tests/check-qjson.c > +++ b/tests/check-qjson.c > @@ -11,6 +11,7 @@ > * > */ > #include "qemu/osdep.h" > +#include > > #include "qapi/qmp/qstring.h" > #include "qapi/qmp/qint.h" > @@ -932,34 +933,78 @@ static void float_number(void) > struct { > const char *encoded; > double decoded; > - int skip; > + const char *canonical; > } test_cases[] =3D { > { "32.43", 32.43 }, > { "0.222", 0.222 }, > { "-32.12313", -32.12313 }, > - { "-32.20e-10", -32.20e-10, .skip =3D 1 }, > + { "-32.20e-10", -32.20e-10, "-0" }, /* FIXME: Our rounding is lo= usy */ > + { "-0.0", -0.0, "-0" }, > { }, > }; > > for (i =3D 0; test_cases[i].encoded; i++) { > QObject *obj; > QFloat *qfloat; > + QString *str; > > obj =3D qobject_from_json(test_cases[i].encoded); > g_assert(obj !=3D NULL); > g_assert(qobject_type(obj) =3D=3D QTYPE_QFLOAT); > > qfloat =3D qobject_to_qfloat(obj); > - g_assert(qfloat_get_double(qfloat) =3D=3D test_cases[i].decoded); > + g_assert_cmpfloat(qfloat_get_double(qfloat), =3D=3D, > + test_cases[i].decoded); > + g_assert_cmpint(signbit(qfloat_get_double(qfloat)), =3D=3D, > + signbit(test_cases[i].decoded)); Could use a comment. > > - if (test_cases[i].skip =3D=3D 0) { > - QString *str; > + str =3D qobject_to_json(obj); > + g_assert_cmpstr(qstring_get_str(str), =3D=3D, > + test_cases[i].canonical ?: test_cases[i].encoded= ); > + QDECREF(str); > + QDECREF(qfloat); > + } > +} > > - str =3D qobject_to_json(obj); > - g_assert(strcmp(qstring_get_str(str), test_cases[i].encoded)= =3D=3D 0); > - QDECREF(str); > +static void non_finite_number(void) > +{ > + int i; > + struct { > + const char *encoded; > + double decoded; > + const char *canonical; > + } test_cases[] =3D { > + { "nan", NAN }, > + { "NaN", NAN, "nan" }, > + /* Not all libc implementations can round-trip '-nan' */ > + /* We do not support forms like 'NAN(0)' */ > + { "inf", INFINITY }, > + { "-INFINITY", -INFINITY, "-inf" }, > + { }, > + }; > + > + for (i =3D 0; test_cases[i].encoded; i++) { > + QObject *obj; > + QFloat *qfloat; > + QString *str; > + > + obj =3D qobject_from_json(test_cases[i].encoded); > + g_assert(obj !=3D NULL); > + g_assert(qobject_type(obj) =3D=3D QTYPE_QFLOAT); > + > + qfloat =3D qobject_to_qfloat(obj); > + /* g_assert_cmpfloat(NAN, =3D=3D, NAN) doesn't do what we want */ > + g_assert_cmpint(fpclassify(qfloat_get_double(qfloat)), =3D=3D, > + fpclassify(test_cases[i].decoded)); > + if (!isnan(test_cases[i].decoded)) { > + g_assert_cmpfloat(qfloat_get_double(qfloat), =3D=3D, > + test_cases[i].decoded); > } > > + str =3D qobject_to_json(obj); > + g_assert_cmpstr(qstring_get_str(str), =3D=3D, > + test_cases[i].canonical ?: test_cases[i].encoded= ); > + QDECREF(str); > QDECREF(qfloat); > } > } > @@ -1520,6 +1565,7 @@ int main(int argc, char **argv) > > g_test_add_func("/literals/number/simple", simple_number); > g_test_add_func("/literals/number/float", float_number); > + g_test_add_func("/literals/number/non-finite", non_finite_number); > g_test_add_func("/literals/number/vararg", vararg_number); > > g_test_add_func("/literals/keyword", keyword_literal); > diff --git a/docs/qmp-spec.txt b/docs/qmp-spec.txt > index f8b5356..bfba431 100644 > --- a/docs/qmp-spec.txt > +++ b/docs/qmp-spec.txt > @@ -51,6 +51,10 @@ json-string, and both input forms of strings understan= d an additional > escape sequence of "\'" for a single quote. The server will only use > double quoting on output. > > +As an extension, the server understands case-insensitive forms of > +non-finite numbers, such as "NaN", "Inf", and "-infinity" (but not > +"NaN(...)"). > + > 2.1 General Definitions > -----------------------