All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are multiple results
Date: Fri, 20 Jul 2018 10:49:21 +0200	[thread overview]
Message-ID: <87fu0es35a.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <20180719184111.5129-11-marcandre.lureau@redhat.com> (=?utf-8?Q?=22Marc-Andr=C3=A9?= Lureau"'s message of "Thu, 19 Jul 2018 20:41:03 +0200")

Marc-André Lureau <marcandre.lureau@redhat.com> writes:

> qobject_from_jsonv() returns a single object. Let's make sure that
> during parsing we don't leak an intermediary object. Instead of
> returning the last object, set a parsing error.
>
> Also, the lexer/parser keeps consuming all the data. There might be an
> error set earlier. Let's keep it and not call json_parser_parse() with
> the remaining data. Eventually, we may teach the message parser to
> stop consuming the data.

Took me a while to figure out what you mean :)

qobject_from_jsonv() feeds a complete string to the JSON parser, with
json_message_parser_feed().  This actually feeds one character after the
other to the lexer, via json_lexer_feed_char().

Whenever a character completes a token, the lexer feeds that token to
the streamer via a callback that is always json_message_process_token().

The attentive reader may wonder what it does with trailing characters
that aren't a complete token.  More on that below.

The streamer accumulates tokens, counting various parenthesis.  When all
counts are zero or any is negative, the group is complete, and is fed to
the parser via another callback.  That callback is parse_json() here.
There are more elsewhere.

The attentive reader may wonder what it does with trailing tokens that
aren't a complete group.  More on that below.

The callbacks all pass the tokens to json_parser_parse() to do the
actual parsing.  Returns the abstract syntax tree as QObject on success.

Note that the callback can be called any number of times.

In my opinion, this is over-engineered and under-thought.  There's more
flexibility than we're using, and the associated complexity breeds bugs.

In particular, parse_json() is wrong:

    s->result = json_parser_parse(tokens, s->ap, &s->err);

If the previous call succeeded, we leak its result.  This is the leak
you mentioned.

If any previous call set an error, we pass &s->err pointing to non-null,
which is forbidden.  If json_parser_parse() passed it to error_setg(),
this would crash.  Since it passes it only to error_propagate(), all
errors but the first one are ignored.  Latent crash bug.

If the last call succeeds and any previous call failed, the function
simultaneously succeeds and fails: we return both a result and set an
error.

Let's demonstrate these bugs before we fix them, by inserting the
appended patch before this one.

Are the other callbacks similarly buggy?  Turns out they're okay:

* handle_qmp_command() consumes result and error on each call.

* process_event() does, too.

* qmp_response() treats errors as fatal, and asserts "no previous
  response".

On trailing tokens that don't form a group: they get silently ignored,
as demonstrated by check-qjson test cases unterminated_array(),
unterminated_array_comma(), unterminated_dict(),
unterminated_dict_comma(), all marked BUG.

On trailing characters that don't form a token: they get silently
ignored, as demonstrated by check-qjson test cases
unterminated_string(), unterminated_sq_string(), unterminated_escape(),
all marked BUG, except when they aren't, as in test case
unterminated_literal().

The BUG marks are all gone at the end of this series.  I guess you're
fixing all that :)

Note that these "trailing characters / tokens are silently ignored" bugs
*do* affect the other callbacks.  Reproducer for handle_qmp_command():

    $ echo -e '{ "execute": "qmp_capabilities" }\n{ "execute": "query-name" }\n"unterminated' | socat UNIX:test-qmp STDIO
    {"QMP": {"version": {"qemu": {"micro": 90, "minor": 12, "major": 2}, "package": "v3.0.0-rc1-20-g6a024cd461"}, "capabilities": ["oob"]}}
    {"return": {}}
    {"return": {}}

Note there's no error reported for the last line.

> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  qobject/qjson.c     | 16 +++++++++++++++-
>  tests/check-qjson.c | 11 +++++++++++
>  2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/qobject/qjson.c b/qobject/qjson.c
> index ee04e61096..8a9d116150 100644
> --- a/qobject/qjson.c
> +++ b/qobject/qjson.c
> @@ -22,6 +22,7 @@
>  #include "qapi/qmp/qlist.h"
>  #include "qapi/qmp/qnum.h"
>  #include "qapi/qmp/qstring.h"
> +#include "qapi/qmp/qerror.h"
>  #include "qemu/unicode.h"
>  
>  typedef struct JSONParsingState
> @@ -36,7 +37,20 @@ static void parse_json(JSONMessageParser *parser, GQueue *tokens)
>  {
>      JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>  
> -    s->result = json_parser_parse(tokens, s->ap, &s->err);
> +    if (s->result || s->err) {
> +        if (s->result) {
> +            qobject_unref(s->result);
> +            s->result = NULL;
> +            if (!s->err) {

Conditional is necessary because a previous call to json_parser_parse()
could have set s->err already.  Without it, error_setg() would fail the
assertion in error_setv() then.  Subtle.

> +                error_setg(&s->err, QERR_JSON_PARSING);

Hmm.  Is this really "Invalid JSON syntax"?  In a way, it is, because
RFC 7159 defines JSON-text as a single value.  Still, a friendlier error
message like "Expecting at most one JSON value" woun't hurt.

What if !tokens?  That shouldn't be an error.

> +            }
> +        }
> +        if (tokens) {
> +            g_queue_free_full(tokens, g_free);

g_queue_free_full(NULL, ...) doesn't work?!?  That would be lame.

> +        }
> +    } else {
> +        s->result = json_parser_parse(tokens, s->ap, &s->err);
> +    }
>  }

What about this:

       JSONParsingState *s = container_of(parser, JSONParsingState, parser);
       Error *err = NULL;

       if (!tokens) {
           return;
       }
       if (s->result || s->err) {
           if (s->result) {
               qobject_unref(s->result);
               s->result = NULL;
               error_setg(&err, "Expecting at most one JSON value");
           }
           g_queue_free_full(tokens, g_free);
       } else {
           s->result = json_parser_parse(tokens, s->ap, &err);
       }
       error_propagate(&s->err, err);
       assert(!s->result != !s->err);

>  
>  QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp)
> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
> index da582df3e9..7d3547e0cc 100644
> --- a/tests/check-qjson.c
> +++ b/tests/check-qjson.c
> @@ -1417,6 +1417,16 @@ static void limits_nesting(void)
>      g_assert(obj == NULL);
>  }
>  
> +static void multiple_objects(void)
> +{
> +    Error *err = NULL;
> +    QObject *obj;
> +
> +    obj = qobject_from_json("{} {}", &err);
> +    g_assert(obj == NULL);
> +    error_free_or_abort(&err);
> +}
> +
>  int main(int argc, char **argv)
>  {
>      g_test_init(&argc, &argv, NULL);
> @@ -1454,6 +1464,7 @@ int main(int argc, char **argv)
>      g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma);
>      g_test_add_func("/errors/unterminated/literal", unterminated_literal);
>      g_test_add_func("/errors/limits/nesting", limits_nesting);
> +    g_test_add_func("/errors/multiple_objects", multiple_objects);
>  
>      return g_test_run();
>  }

>From 18064b774ea7a79a9dbabe5cf75458f0326070d5 Mon Sep 17 00:00:00 2001
From: Markus Armbruster <armbru@redhat.com>
Date: Fri, 20 Jul 2018 10:22:41 +0200
Subject: [PATCH] check-qjson: Cover multiple JSON objects in same string

qobject_from_json() & friends misbehave when the JSON text has more
than one JSON value.  Add test coverage to demonstrate the bugs.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 tests/check-qjson.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index da582df3e9..c5fd439263 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -1417,6 +1417,25 @@ static void limits_nesting(void)
     g_assert(obj == NULL);
 }
 
+static void multiple_objects(void)
+{
+    Error *err = NULL;
+    QObject *obj;
+
+    /* BUG this leaks the syntax tree for "false" */
+    obj = qobject_from_json("false true", &err);
+    g_assert(qbool_get_bool(qobject_to(QBool, obj)));
+    g_assert(!err);
+    qobject_unref(obj);
+
+    /* BUG simultaneously succeeds and fails */
+    /* BUG calls json_parser_parse() with errp pointing to non-null */
+    obj = qobject_from_json("} true", &err);
+    g_assert(qbool_get_bool(qobject_to(QBool, obj)));
+    error_free_or_abort(&err);
+    qobject_unref(obj);
+}
+
 int main(int argc, char **argv)
 {
     g_test_init(&argc, &argv, NULL);
@@ -1454,6 +1473,7 @@ int main(int argc, char **argv)
     g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma);
     g_test_add_func("/errors/unterminated/literal", unterminated_literal);
     g_test_add_func("/errors/limits/nesting", limits_nesting);
+    g_test_add_func("/errors/multiple_objects", multiple_objects);
 
     return g_test_run();
 }
-- 
2.17.1

  reply	other threads:[~2018-07-20  8:49 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-19 18:40 [Qemu-devel] [PATCH v2 00/18] monitor: various code simplification and fixes Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 01/18] tests: change /0.15/* tests to /qmp/* Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 02/18] monitor: consitify qmp_send_response() QDict argument Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 03/18] qmp: constify qmp_is_oob() Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 04/18] Revert "qmp: isolate responses into io thread" Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 05/18] monitor: no need to save need_resume Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 06/18] qga: process_event() simplification and leak fix Marc-André Lureau
2018-07-24  0:03   ` Michael Roth
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 07/18] qmp: drop json_parser_parse() wrapper Marc-André Lureau
2018-07-20  6:26   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 08/18] json-parser: simplify and avoid JSONParserContext allocation Marc-André Lureau
2018-07-20  6:28   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 09/18] json-parser: further simplify freeing JSONParserContext Marc-André Lureau
2018-07-20  6:40   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are multiple results Marc-André Lureau
2018-07-20  8:49   ` Markus Armbruster [this message]
2018-07-20 10:41     ` Marc-André Lureau
2018-07-23  5:34       ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 11/18] qjson: report error on unterminated string Marc-André Lureau
2018-07-23  6:40   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 12/18] qjson: return parsing error if unterminated input Marc-André Lureau
2018-07-23  6:47   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 13/18] json-parser: set an error if parsing returned NULL Marc-André Lureau
2018-07-23  8:15   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 14/18] json-lexer: make it safe to call multiple times Marc-André Lureau
2018-08-09 11:58   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 15/18] tests: add a few qemu-qmp tests Marc-André Lureau
2018-08-09 12:36   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 16/18] tests: add a qmp success-response test Marc-André Lureau
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 17/18] qga: process_event() simplification Marc-André Lureau
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 18/18] RFC: qmp: common 'id' handling & make QGA conform to QMP spec Marc-André Lureau
2018-08-09 13:02   ` Markus Armbruster
2018-08-09 11:48 ` [Qemu-devel] [PATCH v2 00/18] monitor: various code simplification and fixes Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fu0es35a.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.