All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are multiple results
Date: Mon, 23 Jul 2018 07:34:32 +0200	[thread overview]
Message-ID: <878t62ldlj.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <CAMxuvazpf1oRMDLpPw25C99gV8YZu506LSPGR0N9mU2MeYEE-A@mail.gmail.com> (=?utf-8?Q?=22Marc-Andr=C3=A9?= Lureau"'s message of "Fri, 20 Jul 2018 12:41:31 +0200")

Marc-André Lureau <marcandre.lureau@redhat.com> writes:

> Hi
>
> On Fri, Jul 20, 2018 at 10:49 AM, Markus Armbruster <armbru@redhat.com> wrote:
>> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
>>
>>> qobject_from_jsonv() returns a single object. Let's make sure that
>>> during parsing we don't leak an intermediary object. Instead of
>>> returning the last object, set a parsing error.
>>>
>>> Also, the lexer/parser keeps consuming all the data. There might be an
>>> error set earlier. Let's keep it and not call json_parser_parse() with
>>> the remaining data. Eventually, we may teach the message parser to
>>> stop consuming the data.
>>
>> Took me a while to figure out what you mean :)
>>
>> qobject_from_jsonv() feeds a complete string to the JSON parser, with
>> json_message_parser_feed().  This actually feeds one character after the
>> other to the lexer, via json_lexer_feed_char().
>>
>> Whenever a character completes a token, the lexer feeds that token to
>> the streamer via a callback that is always json_message_process_token().
>>
>> The attentive reader may wonder what it does with trailing characters
>> that aren't a complete token.  More on that below.
>>
>> The streamer accumulates tokens, counting various parenthesis.  When all
>> counts are zero or any is negative, the group is complete, and is fed to
>> the parser via another callback.  That callback is parse_json() here.
>> There are more elsewhere.
>>
>> The attentive reader may wonder what it does with trailing tokens that
>> aren't a complete group.  More on that below.
>>
>> The callbacks all pass the tokens to json_parser_parse() to do the
>> actual parsing.  Returns the abstract syntax tree as QObject on success.
>>
>> Note that the callback can be called any number of times.
>>
>> In my opinion, this is over-engineered and under-thought.  There's more
>> flexibility than we're using, and the associated complexity breeds bugs.

Two obvious simplifications come to mind:

* Call json_message_process_token() directly.

* Move the call json_parser_parse() into the streamer, and have the
  streamer pass QObject * instead of GQueue * to its callback.

I'll post patches for your consideration, but first I'll continue to
review your series.

>> In particular, parse_json() is wrong:
>>
>>     s->result = json_parser_parse(tokens, s->ap, &s->err);
>>
>> If the previous call succeeded, we leak its result.  This is the leak
>> you mentioned.
>>
>> If any previous call set an error, we pass &s->err pointing to non-null,
>> which is forbidden.  If json_parser_parse() passed it to error_setg(),
>> this would crash.  Since it passes it only to error_propagate(), all
>> errors but the first one are ignored.  Latent crash bug.
>>
>> If the last call succeeds and any previous call failed, the function
>> simultaneously succeeds and fails: we return both a result and set an
>> error.
>>
>> Let's demonstrate these bugs before we fix them, by inserting the
>> appended patch before this one.
>>
>> Are the other callbacks similarly buggy?  Turns out they're okay:
>>
>> * handle_qmp_command() consumes result and error on each call.
>>
>> * process_event() does, too.
>>
>> * qmp_response() treats errors as fatal, and asserts "no previous
>>   response".
>>
>> On trailing tokens that don't form a group: they get silently ignored,
>> as demonstrated by check-qjson test cases unterminated_array(),
>> unterminated_array_comma(), unterminated_dict(),
>> unterminated_dict_comma(), all marked BUG.
>>
>> On trailing characters that don't form a token: they get silently
>> ignored, as demonstrated by check-qjson test cases
>> unterminated_string(), unterminated_sq_string(), unterminated_escape(),
>> all marked BUG, except when they aren't, as in test case
>> unterminated_literal().
>>
>> The BUG marks are all gone at the end of this series.  I guess you're
>> fixing all that :)
>>
>> Note that these "trailing characters / tokens are silently ignored" bugs
>> *do* affect the other callbacks.  Reproducer for handle_qmp_command():
>>
>>     $ echo -e '{ "execute": "qmp_capabilities" }\n{ "execute": "query-name" }\n"unterminated' | socat UNIX:test-qmp STDIO
>>     {"QMP": {"version": {"qemu": {"micro": 90, "minor": 12, "major": 2}, "package": "v3.0.0-rc1-20-g6a024cd461"}, "capabilities": ["oob"]}}
>>     {"return": {}}
>>     {"return": {}}
>>
>> Note there's no error reported for the last line.
>>
>>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>> ---
>>>  qobject/qjson.c     | 16 +++++++++++++++-
>>>  tests/check-qjson.c | 11 +++++++++++
>>>  2 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qobject/qjson.c b/qobject/qjson.c
>>> index ee04e61096..8a9d116150 100644
>>> --- a/qobject/qjson.c
>>> +++ b/qobject/qjson.c
>>> @@ -22,6 +22,7 @@
>>>  #include "qapi/qmp/qlist.h"
>>>  #include "qapi/qmp/qnum.h"
>>>  #include "qapi/qmp/qstring.h"
>>> +#include "qapi/qmp/qerror.h"
>>>  #include "qemu/unicode.h"
>>>
>>>  typedef struct JSONParsingState
>>> @@ -36,7 +37,20 @@ static void parse_json(JSONMessageParser *parser, GQueue *tokens)
>>>  {
>>>      JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>>>
>>> -    s->result = json_parser_parse(tokens, s->ap, &s->err);
>>> +    if (s->result || s->err) {
>>> +        if (s->result) {
>>> +            qobject_unref(s->result);
>>> +            s->result = NULL;
>>> +            if (!s->err) {
>>
>> Conditional is necessary because a previous call to json_parser_parse()
>> could have set s->err already.  Without it, error_setg() would fail the
>> assertion in error_setv() then.  Subtle.
>>
>>> +                error_setg(&s->err, QERR_JSON_PARSING);
>>
>> Hmm.  Is this really "Invalid JSON syntax"?  In a way, it is, because
>> RFC 7159 defines JSON-text as a single value.  Still, a friendlier error
>> message like "Expecting at most one JSON value" woun't hurt.
>>
>> What if !tokens?  That shouldn't be an error.
>>
>>> +            }
>>> +        }
>>> +        if (tokens) {
>>> +            g_queue_free_full(tokens, g_free);
>>
>> g_queue_free_full(NULL, ...) doesn't work?!?  That would be lame.
>
> It warns and return.

Lame.

> We could use g_clear_pointer(&tokens, g_queue_free_full)...

Slightly terser, but the reader better knows g_clear_pointer().  So far,
we don't use it.

>>> +        }
>>> +    } else {
>>> +        s->result = json_parser_parse(tokens, s->ap, &s->err);
>>> +    }
>>>  }
>>
>> What about this:
>>
>>        JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>>        Error *err = NULL;
>>
>>        if (!tokens) {
>>            return;
>>        }
>
> I would rather leave handling of NULL tokens to json_parser_parse()
> for consistency with other callers.

Fair point.

I expect the issue to evaporate when I simplify the parser interface.

>>        if (s->result || s->err) {
>>            if (s->result) {
>>                qobject_unref(s->result);
>>                s->result = NULL;
>>                error_setg(&err, "Expecting at most one JSON value");
>>            }
>>            g_queue_free_full(tokens, g_free);
>
> missing null check

@tokens can't be null here.

>>        } else {
>>            s->result = json_parser_parse(tokens, s->ap, &err);
>>        }
>>        error_propagate(&s->err, err);
>
> how do you ensure s->err is not already set?

error_propagate() is fine with that, unlike error_setg().

>>        assert(!s->result != !s->err);
>>
>>>
>>>  QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp)
>>> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
>>> index da582df3e9..7d3547e0cc 100644
>>> --- a/tests/check-qjson.c
>>> +++ b/tests/check-qjson.c
>>> @@ -1417,6 +1417,16 @@ static void limits_nesting(void)
>>>      g_assert(obj == NULL);
>>>  }
>>>
>>> +static void multiple_objects(void)
>>> +{
>>> +    Error *err = NULL;
>>> +    QObject *obj;
>>> +
>>> +    obj = qobject_from_json("{} {}", &err);
>>> +    g_assert(obj == NULL);
>>> +    error_free_or_abort(&err);
>>> +}
>>> +
>>>  int main(int argc, char **argv)
>>>  {
>>>      g_test_init(&argc, &argv, NULL);
>>> @@ -1454,6 +1464,7 @@ int main(int argc, char **argv)
>>>      g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma);
>>>      g_test_add_func("/errors/unterminated/literal", unterminated_literal);
>>>      g_test_add_func("/errors/limits/nesting", limits_nesting);
>>> +    g_test_add_func("/errors/multiple_objects", multiple_objects);
>>>
>>>      return g_test_run();
>>>  }
>>
>> From 18064b774ea7a79a9dbabe5cf75458f0326070d5 Mon Sep 17 00:00:00 2001
>> From: Markus Armbruster <armbru@redhat.com>
>> Date: Fri, 20 Jul 2018 10:22:41 +0200
>> Subject: [PATCH] check-qjson: Cover multiple JSON objects in same string
>>
>> qobject_from_json() & friends misbehave when the JSON text has more
>> than one JSON value.  Add test coverage to demonstrate the bugs.
>>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
> test looks better than mine, should I include it in the series before the fix?

Yes, please.

  reply	other threads:[~2018-07-23  5:34 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-19 18:40 [Qemu-devel] [PATCH v2 00/18] monitor: various code simplification and fixes Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 01/18] tests: change /0.15/* tests to /qmp/* Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 02/18] monitor: consitify qmp_send_response() QDict argument Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 03/18] qmp: constify qmp_is_oob() Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 04/18] Revert "qmp: isolate responses into io thread" Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 05/18] monitor: no need to save need_resume Marc-André Lureau
2018-07-19 18:40 ` [Qemu-devel] [PATCH v2 06/18] qga: process_event() simplification and leak fix Marc-André Lureau
2018-07-24  0:03   ` Michael Roth
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 07/18] qmp: drop json_parser_parse() wrapper Marc-André Lureau
2018-07-20  6:26   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 08/18] json-parser: simplify and avoid JSONParserContext allocation Marc-André Lureau
2018-07-20  6:28   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 09/18] json-parser: further simplify freeing JSONParserContext Marc-André Lureau
2018-07-20  6:40   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are multiple results Marc-André Lureau
2018-07-20  8:49   ` Markus Armbruster
2018-07-20 10:41     ` Marc-André Lureau
2018-07-23  5:34       ` Markus Armbruster [this message]
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 11/18] qjson: report error on unterminated string Marc-André Lureau
2018-07-23  6:40   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 12/18] qjson: return parsing error if unterminated input Marc-André Lureau
2018-07-23  6:47   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 13/18] json-parser: set an error if parsing returned NULL Marc-André Lureau
2018-07-23  8:15   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 14/18] json-lexer: make it safe to call multiple times Marc-André Lureau
2018-08-09 11:58   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 15/18] tests: add a few qemu-qmp tests Marc-André Lureau
2018-08-09 12:36   ` Markus Armbruster
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 16/18] tests: add a qmp success-response test Marc-André Lureau
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 17/18] qga: process_event() simplification Marc-André Lureau
2018-07-19 18:41 ` [Qemu-devel] [PATCH v2 18/18] RFC: qmp: common 'id' handling & make QGA conform to QMP spec Marc-André Lureau
2018-08-09 13:02   ` Markus Armbruster
2018-08-09 11:48 ` [Qemu-devel] [PATCH v2 00/18] monitor: various code simplification and fixes Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878t62ldlj.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.