All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] keyval: accept escaped commas in implied option
@ 2020-11-11 10:45 Paolo Bonzini
  2020-11-11 10:45 ` [PATCH 1/2] " Paolo Bonzini
  2020-11-11 10:45 ` [PATCH 2/2] keyval: simplify keyval_parse_one Paolo Bonzini
  0 siblings, 2 replies; 10+ messages in thread
From: Paolo Bonzini @ 2020-11-11 10:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

The patches that I posted so far tried to make QemuOpts accept only
(slightly more) well-formed command lines.  With this one, QemuOpts
and keyval's grammars roughly meet in the middle: keyval gains the
ability to parse values that have an implied key and contain commas.

Paolo Bonzini (2):
  keyval: accept escaped commas in implied option
  keyval: simplify keyval_parse_one

 include/qemu/help_option.h |  11 ---
 tests/test-keyval.c        |  21 +++--
 util/keyval.c              | 155 +++++++++++++++++++------------------
 3 files changed, 94 insertions(+), 93 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-11 10:45 [PATCH 0/2] keyval: accept escaped commas in implied option Paolo Bonzini
@ 2020-11-11 10:45 ` Paolo Bonzini
  2020-11-11 10:53   ` Daniel P. Berrangé
  2020-11-27  8:38   ` Markus Armbruster
  2020-11-11 10:45 ` [PATCH 2/2] keyval: simplify keyval_parse_one Paolo Bonzini
  1 sibling, 2 replies; 10+ messages in thread
From: Paolo Bonzini @ 2020-11-11 10:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

This is used with the weirdly-named device "SUNFD,two", so accepting it
is also a preparatory step towards keyval-ifying -device and the
device_add monitor command.  But in general it is an unexpected wart
of the keyval syntax and leads to suboptimal errors compared to QemuOpts:

  $ ./qemu-system-x86_64 -object foo,,bar,id=obj
  qemu-system-x86_64: -object foo,,bar,id=obj: invalid object type: foo,bar
  $ storage-daemon/qemu-storage-daemon --object foo,,bar,id=obj
  qemu-storage-daemon: Invalid parameter ''

To implement this, the flow of the parser is changed to first unescape
everything up to the next comma or equal sign.  This is done in a
new function keyval_fetch_string for both the key and value part.
Keys therefore are now parsed in unescaped form, but this makes no
difference in practice because a comma is an invalid character for a
QAPI name.  Thus keys with a comma in them are rejected anyway, as
demonstrated by the new testcase.

As a side effect of the new code, parse errors are slightly improved as
well: "Invalid parameter ''" becomes "Expected parameter before '='"
when keyval is fed a string starting with an equal sign.

The slightly baroque interface of keyval_fetch_string lets me keep the
key parsing loop mostly untouched.  It is simplified in the next patch,
however.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/help_option.h |  11 ---
 tests/test-keyval.c        |  21 +++---
 util/keyval.c              | 142 ++++++++++++++++++++-----------------
 3 files changed, 91 insertions(+), 83 deletions(-)

diff --git a/include/qemu/help_option.h b/include/qemu/help_option.h
index ca6389a154..328d2a89fd 100644
--- a/include/qemu/help_option.h
+++ b/include/qemu/help_option.h
@@ -19,15 +19,4 @@ static inline bool is_help_option(const char *s)
     return !strcmp(s, "?") || !strcmp(s, "help");
 }
 
-static inline int starts_with_help_option(const char *s)
-{
-    if (*s == '?') {
-        return 1;
-    }
-    if (g_str_has_prefix(s, "help")) {
-        return 4;
-    }
-    return 0;
-}
-
 #endif
diff --git a/tests/test-keyval.c b/tests/test-keyval.c
index ee927fe4e4..19f664f535 100644
--- a/tests/test-keyval.c
+++ b/tests/test-keyval.c
@@ -89,6 +89,11 @@ static void test_keyval_parse(void)
     error_free_or_abort(&err);
     g_assert(!qdict);
 
+    /* Keys must be QAPI identifiers */
+    qdict = keyval_parse("weird,,=key", NULL, NULL, &err);
+    error_free_or_abort(&err);
+    g_assert(!qdict);
+
     /* Multiple keys, last one wins */
     qdict = keyval_parse("a=1,b=2,,x,a=3", NULL, NULL, &error_abort);
     g_assert_cmpuint(qdict_size(qdict), ==, 2);
@@ -178,15 +183,15 @@ static void test_keyval_parse(void)
     error_free_or_abort(&err);
     g_assert(!qdict);
 
-    /* Likewise (qemu_opts_parse(): implied key with comma value) */
-    qdict = keyval_parse(",,,a=1", "implied", NULL, &err);
-    error_free_or_abort(&err);
-    g_assert(!qdict);
+    /* Implied key's value can have a comma */
+    qdict = keyval_parse(",,,a=1", "implied", NULL, &error_abort);
+    g_assert_cmpstr(qdict_get_try_str(qdict, "implied"), ==, ",");
+    g_assert_cmpstr(qdict_get_try_str(qdict, "a"), ==, "1");
+    qobject_unref(qdict);
 
-    /* Implied key's value can't have comma (qemu_opts_parse(): it can) */
-    qdict = keyval_parse("val,,ue", "implied", NULL, &err);
-    error_free_or_abort(&err);
-    g_assert(!qdict);
+    qdict = keyval_parse("val,,ue", "implied", NULL, &error_abort);
+    g_assert_cmpstr(qdict_get_try_str(qdict, "implied"), ==, "val,ue");
+    qobject_unref(qdict);
 
     /* Empty key is not an implied key */
     qdict = keyval_parse("=val", "implied", NULL, &err);
diff --git a/util/keyval.c b/util/keyval.c
index 7f625ad33c..2213a5fe78 100644
--- a/util/keyval.c
+++ b/util/keyval.c
@@ -16,8 +16,8 @@
  *   key-vals     = [ key-val { ',' key-val } [ ',' ] ]
  *   key-val      = key '=' val | help
  *   key          = key-fragment { '.' key-fragment }
- *   key-fragment = / [^=,.]+ /
- *   val          = { / [^,]+ / | ',,' }
+ *   key-fragment = { / [^=,.] / | ',,' }
+ *   val          = { / [^,] / | ',,' }
  *   help         = 'help' | '?'
  *
  * Semantics defined by reduction to JSON:
@@ -78,13 +78,13 @@
  * Alternative syntax for use with an implied key:
  *
  *   key-vals     = [ key-val-1st { ',' key-val } [ ',' ] ]
- *   key-val-1st  = val-no-key | key-val
- *   val-no-key   = / [^=,]+ / - help
+ *   key-val-1st  = (val-no-key - help) | key-val
+ *   val-no-key   = { / [^=,] / | ',,' }
  *
  * where val-no-key is syntactic sugar for implied-key=val-no-key.
  *
- * Note that you can't use the sugared form when the value contains
- * '=' or ','.
+ * Note that you can't use the sugared form when the value is empty
+ * or contains '='.
  */
 
 #include "qemu/osdep.h"
@@ -141,7 +141,7 @@ static int key_to_index(const char *key, const char **end)
  * On failure, store an error through @errp and return NULL.
  */
 static QObject *keyval_parse_put(QDict *cur,
-                                 const char *key_in_cur, QString *value,
+                                 const char *key_in_cur, const char *value,
                                  const char *key, const char *key_cursor,
                                  Error **errp)
 {
@@ -152,20 +152,56 @@ static QObject *keyval_parse_put(QDict *cur,
         if (qobject_type(old) != (value ? QTYPE_QSTRING : QTYPE_QDICT)) {
             error_setg(errp, "Parameters '%.*s.*' used inconsistently",
                        (int)(key_cursor - key), key);
-            qobject_unref(value);
             return NULL;
         }
         if (!value) {
             return old;         /* already QDict, do nothing */
         }
-        new = QOBJECT(value);   /* replacement */
-    } else {
-        new = value ? QOBJECT(value) : QOBJECT(qdict_new());
     }
+    new = value ? QOBJECT(qstring_from_str(value)) : QOBJECT(qdict_new());
     qdict_put_obj(cur, key_in_cur, new);
     return new;
 }
 
+/*
+ * Parse and unescape the string (key or value) pointed to by @start,
+ * stopping at a single comma or if @key is true an equal sign.
+ * The string is unescaped and NUL-terminated in place.
+ *
+ * On return:
+ * - either NUL or the separator (comma or equal sign) is returned.
+ * - the length of the string is stored in @len.
+ * - @start is advanced to either the NUL or the first character past the
+ *   separator.
+ */
+static char keyval_fetch_string(char **start, size_t *len, bool key)
+{
+    char sep;
+    char *p, *unescaped;
+    p = unescaped = *start;
+    for (;;) {
+        sep = *p;
+        if (!sep) {
+            break;
+        }
+        if (key && sep == '=') {
+            ++p;
+            break;
+        }
+        if (sep == ',') {
+            if (*++p != ',') {
+                break;
+            }
+        }
+        *unescaped++ = *p++;
+    }
+
+    *unescaped = 0;
+    *len = unescaped - *start;
+    *start = p;
+    return sep;
+}
+
 /*
  * Parse one parameter from @params.
  *
@@ -179,35 +215,42 @@ static QObject *keyval_parse_put(QDict *cur,
  * On success, return a pointer to the next parameter, or else to '\0'.
  * On failure, return NULL.
  */
-static const char *keyval_parse_one(QDict *qdict, const char *params,
-                                    const char *implied_key, bool *help,
-                                    Error **errp)
+static char *keyval_parse_one(QDict *qdict, char *params,
+                              const char *implied_key, bool *help,
+                              Error **errp)
 {
-    const char *key, *key_end, *val_end, *s, *end;
+    const char *key, *key_end, *s, *end;
+    const char *val = NULL;
+    char sep;
     size_t len;
     char key_in_cur[128];
     QDict *cur;
     int ret;
     QObject *next;
-    QString *val;
 
     key = params;
-    val_end = NULL;
-    len = strcspn(params, "=,");
-    if (len && key[len] != '=') {
-        if (starts_with_help_option(key) == len) {
+    sep = keyval_fetch_string(&params, &len, true);
+    if (!len) {
+        if (sep) {
+            error_setg(errp, "Expected parameter before '%c'", sep);
+        } else {
+            error_setg(errp, "Expected parameter at end of string");
+        }
+        return NULL;
+    }
+    if (sep != '=') {
+        if (is_help_option(key)) {
             *help = true;
-            s = key + len;
-            if (*s == ',') {
-                s++;
-            }
-            return s;
+            return params;
         }
         if (implied_key) {
             /* Desugar implied key */
+            val = key;
             key = implied_key;
-            val_end = params + len;
             len = strlen(implied_key);
+        } else {
+            error_setg(errp, "Expected '=' after parameter '%s'", key);
+            return NULL;
         }
     }
     key_end = key + len;
@@ -218,7 +261,7 @@ static const char *keyval_parse_one(QDict *qdict, const char *params,
      */
     cur = qdict;
     s = key;
-    for (;;) {
+    do {
         /* Want a key index (unless it's first) or a QAPI name */
         if (s != key && key_to_index(s, &end) >= 0) {
             len = end - s;
@@ -254,46 +297,16 @@ static const char *keyval_parse_one(QDict *qdict, const char *params,
         memcpy(key_in_cur, s, len);
         key_in_cur[len] = 0;
         s += len;
+    } while (*s++ == '.');
 
-        if (*s != '.') {
-            break;
-        }
-        s++;
-    }
-
-    if (key == implied_key) {
-        assert(!*s);
-        val = qstring_from_substr(params, 0, val_end - params);
-        s = val_end;
-        if (*s == ',') {
-            s++;
-        }
-    } else {
-        if (*s != '=') {
-            error_setg(errp, "Expected '=' after parameter '%.*s'",
-                       (int)(s - key), key);
-            return NULL;
-        }
-        s++;
-
-        val = qstring_new();
-        for (;;) {
-            if (!*s) {
-                break;
-            } else if (*s == ',') {
-                s++;
-                if (*s != ',') {
-                    break;
-                }
-            }
-            qstring_append_chr(val, *s++);
-        }
+    if (key != implied_key) {
+        val = params;
+        keyval_fetch_string(&params, &len, false);
     }
-
     if (!keyval_parse_put(cur, key_in_cur, val, key, key_end, errp)) {
         return NULL;
     }
-    return s;
+    return params;
 }
 
 static char *reassemble_key(GSList *key)
@@ -438,10 +451,11 @@ QDict *keyval_parse(const char *params, const char *implied_key,
 {
     QDict *qdict = qdict_new();
     QObject *listified;
-    const char *s;
+    g_autofree char *dup;
+    char *s;
     bool help = false;
 
-    s = params;
+    s = dup = g_strdup(params);
     while (*s) {
         s = keyval_parse_one(qdict, s, implied_key, &help, errp);
         if (!s) {
-- 
2.26.2




^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] keyval: simplify keyval_parse_one
  2020-11-11 10:45 [PATCH 0/2] keyval: accept escaped commas in implied option Paolo Bonzini
  2020-11-11 10:45 ` [PATCH 1/2] " Paolo Bonzini
@ 2020-11-11 10:45 ` Paolo Bonzini
  1 sibling, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2020-11-11 10:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

Now that the key is NULL terminated, we can remove some of the contortions
that were done to operate on possibly '='-terminated strings in
keyval_parse_one.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/keyval.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/util/keyval.c b/util/keyval.c
index 2213a5fe78..76daab0885 100644
--- a/util/keyval.c
+++ b/util/keyval.c
@@ -170,11 +170,10 @@ static QObject *keyval_parse_put(QDict *cur,
  *
  * On return:
  * - either NUL or the separator (comma or equal sign) is returned.
- * - the length of the string is stored in @len.
  * - @start is advanced to either the NUL or the first character past the
  *   separator.
  */
-static char keyval_fetch_string(char **start, size_t *len, bool key)
+static char keyval_fetch_string(char **start, bool key)
 {
     char sep;
     char *p, *unescaped;
@@ -197,7 +196,6 @@ static char keyval_fetch_string(char **start, size_t *len, bool key)
     }
 
     *unescaped = 0;
-    *len = unescaped - *start;
     *start = p;
     return sep;
 }
@@ -219,7 +217,7 @@ static char *keyval_parse_one(QDict *qdict, char *params,
                               const char *implied_key, bool *help,
                               Error **errp)
 {
-    const char *key, *key_end, *s, *end;
+    const char *key, *s, *end;
     const char *val = NULL;
     char sep;
     size_t len;
@@ -229,8 +227,8 @@ static char *keyval_parse_one(QDict *qdict, char *params,
     QObject *next;
 
     key = params;
-    sep = keyval_fetch_string(&params, &len, true);
-    if (!len) {
+    sep = keyval_fetch_string(&params, true);
+    if (!*key) {
         if (sep) {
             error_setg(errp, "Expected parameter before '%c'", sep);
         } else {
@@ -247,13 +245,11 @@ static char *keyval_parse_one(QDict *qdict, char *params,
             /* Desugar implied key */
             val = key;
             key = implied_key;
-            len = strlen(implied_key);
         } else {
             error_setg(errp, "Expected '=' after parameter '%s'", key);
             return NULL;
         }
     }
-    key_end = key + len;
 
     /*
      * Loop over key fragments: @s points to current fragment, it
@@ -269,24 +265,21 @@ static char *keyval_parse_one(QDict *qdict, char *params,
             ret = parse_qapi_name(s, false);
             len = ret < 0 ? 0 : ret;
         }
-        assert(s + len <= key_end);
-        if (!len || (s + len < key_end && s[len] != '.')) {
+        if (!len || (s[len] != '\0' && s[len] != '.')) {
             assert(key != implied_key);
-            error_setg(errp, "Invalid parameter '%.*s'",
-                       (int)(key_end - key), key);
+            error_setg(errp, "Invalid parameter '%s'", key);
             return NULL;
         }
         if (len >= sizeof(key_in_cur)) {
             assert(key != implied_key);
             error_setg(errp, "Parameter%s '%.*s' is too long",
-                       s != key || s + len != key_end ? " fragment" : "",
+                       s != key || s[len] == '.' ? " fragment" : "",
                        (int)len, s);
             return NULL;
         }
 
         if (s != key) {
-            next = keyval_parse_put(cur, key_in_cur, NULL,
-                                    key, s - 1, errp);
+            next = keyval_parse_put(cur, key_in_cur, NULL, key, s - 1, errp);
             if (!next) {
                 return NULL;
             }
@@ -301,9 +294,9 @@ static char *keyval_parse_one(QDict *qdict, char *params,
 
     if (key != implied_key) {
         val = params;
-        keyval_fetch_string(&params, &len, false);
+        keyval_fetch_string(&params, false);
     }
-    if (!keyval_parse_put(cur, key_in_cur, val, key, key_end, errp)) {
+    if (!keyval_parse_put(cur, key_in_cur, val, key, s - 1, errp)) {
         return NULL;
     }
     return params;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-11 10:45 ` [PATCH 1/2] " Paolo Bonzini
@ 2020-11-11 10:53   ` Daniel P. Berrangé
  2020-11-11 11:05     ` Paolo Bonzini
  2020-11-11 11:14     ` Mark Cave-Ayland
  2020-11-27  8:38   ` Markus Armbruster
  1 sibling, 2 replies; 10+ messages in thread
From: Daniel P. Berrangé @ 2020-11-11 10:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, armbru

On Wed, Nov 11, 2020 at 05:45:20AM -0500, Paolo Bonzini wrote:
> This is used with the weirdly-named device "SUNFD,two", so accepting it
> is also a preparatory step towards keyval-ifying -device and the
> device_add monitor command.  But in general it is an unexpected wart
> of the keyval syntax and leads to suboptimal errors compared to QemuOpts:

If "SUNFD,two" is the only wierdly named device, can we just rename
it to get rid of the comma, and then put validation in QOM to forbid
commas entirely.  eg rename it to "SUNFD-two"

Just have a targetted hack in vl.c to replace any use of "SUNFD,two"
with the new name before parsing in keyval, if we care enough about
back compat for this niche hardware device.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-11 10:53   ` Daniel P. Berrangé
@ 2020-11-11 11:05     ` Paolo Bonzini
  2020-11-11 11:14     ` Mark Cave-Ayland
  1 sibling, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2020-11-11 11:05 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, armbru

On 11/11/20 11:53, Daniel P. Berrangé wrote:
> On Wed, Nov 11, 2020 at 05:45:20AM -0500, Paolo Bonzini wrote:
>> This is used with the weirdly-named device "SUNFD,two", so accepting it
>> is also a preparatory step towards keyval-ifying -device and the
>> device_add monitor command.  But in general it is an unexpected wart
>> of the keyval syntax and leads to suboptimal errors compared to QemuOpts:
> 
> If "SUNFD,two" is the only wierdly named device, can we just rename
> it to get rid of the comma, and then put validation in QOM to forbid
> commas entirely.  eg rename it to "SUNFD-two"
> 
> Just have a targetted hack in vl.c to replace any use of "SUNFD,two"
> with the new name before parsing in keyval, if we care enough about
> back compat for this niche hardware device.

See the rest of the commit message.  The patch improves error messages 
as a side effect, and in my opinion also the code.  So it can be 
considered independent of the original reason why it was developed.

Paolo



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-11 10:53   ` Daniel P. Berrangé
  2020-11-11 11:05     ` Paolo Bonzini
@ 2020-11-11 11:14     ` Mark Cave-Ayland
  1 sibling, 0 replies; 10+ messages in thread
From: Mark Cave-Ayland @ 2020-11-11 11:14 UTC (permalink / raw)
  To: Daniel P. Berrangé, Paolo Bonzini; +Cc: qemu-devel, armbru

On 11/11/2020 10:53, Daniel P. Berrangé wrote:

> On Wed, Nov 11, 2020 at 05:45:20AM -0500, Paolo Bonzini wrote:
>> This is used with the weirdly-named device "SUNFD,two", so accepting it
>> is also a preparatory step towards keyval-ifying -device and the
>> device_add monitor command.  But in general it is an unexpected wart
>> of the keyval syntax and leads to suboptimal errors compared to QemuOpts:
> 
> If "SUNFD,two" is the only wierdly named device, can we just rename
> it to get rid of the comma, and then put validation in QOM to forbid
> commas entirely.  eg rename it to "SUNFD-two"
> 
> Just have a targetted hack in vl.c to replace any use of "SUNFD,two"
> with the new name before parsing in keyval, if we care enough about
> back compat for this niche hardware device.

There's also SUNW,tcx I can think of immediately, plus there's a few others mentioned 
in hw/sparc/sun4m.c. The reason these names have commas is simply because that's the 
name of the corresponding node in the PROM device tree.

A quick grep suggests that all these devices are in-built: the selection between 
SUNW,tcx and cg3 framebuffers is decided with the -vga command line option, so 
potentially these could be renamed without any user-visible changes.


ATB,

Mark.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-11 10:45 ` [PATCH 1/2] " Paolo Bonzini
  2020-11-11 10:53   ` Daniel P. Berrangé
@ 2020-11-27  8:38   ` Markus Armbruster
  2020-11-27  9:15     ` Paolo Bonzini
  1 sibling, 1 reply; 10+ messages in thread
From: Markus Armbruster @ 2020-11-27  8:38 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> This is used with the weirdly-named device "SUNFD,two", so accepting it

"SUNW,fdtwo"

> is also a preparatory step towards keyval-ifying -device and the
> device_add monitor command.

Not quite.  Device "SUNW,fdtwo" has user_creatable = false (it's a
sysbus device), and therefore isn't available with -device or
device_add.

There are more device names containing comma, but only one is available
with -device or device_add: xlnx,zynqmp-pmu-soc.  I doubt it makes sense
there.

Parameter values with comma need special treatment before and after the
patch.  Before the patch, you have to use the unsugared form and double
the commas.  After the patch, that still works, but you can now *also*
use the sugared form and double the commas.  Not much of an improvement,
I'm afraid.

Of course, we aren't really after improvement, we're after switching to
keyval_parse() with fewer compatiblity issues.  But this issue only
exists where values with comma are valid.

None of the QemuOpts I looked at take such values, except for -device
xlnx,zynqmp-pmu-soc, which I believe to be useless.  If an actual use
exists, I missed it.

Permit me to digress: we should kill the anti-social device names
regardless of this patch.

I want device names to conform to QAPI naming rules for enum values.  If
we ever do QAPI-schema-based device configuration, "driver" changes from
str to enum.  Even if we're ready to reject QAPI-schema-based device
configuration as impractical forever, there's still a consistency
argument: names of things should conform to a common set of rules.

This applies to all implied parameters with an emumeration-like value.
I'm not aware of other offenders, though.

>                              But in general it is an unexpected wart
> of the keyval syntax and leads to suboptimal errors compared to QemuOpts:
>
>   $ ./qemu-system-x86_64 -object foo,,bar,id=obj
>   qemu-system-x86_64: -object foo,,bar,id=obj: invalid object type: foo,bar
>   $ storage-daemon/qemu-storage-daemon --object foo,,bar,id=obj
>   qemu-storage-daemon: Invalid parameter ''

The suboptimal error message is due to the way I coded the parser, not
due to the grammar.

> To implement this, the flow of the parser is changed to first unescape
> everything up to the next comma or equal sign.  This is done in a
> new function keyval_fetch_string for both the key and value part.
> Keys therefore are now parsed in unescaped form, but this makes no
> difference in practice because a comma is an invalid character for a
> QAPI name.  Thus keys with a comma in them are rejected anyway, as
> demonstrated by the new testcase.
>
> As a side effect of the new code, parse errors are slightly improved as
> well: "Invalid parameter ''" becomes "Expected parameter before '='"
> when keyval is fed a string starting with an equal sign.

Yes, my parse errors are less than friendly.  Let's review some of them.
I'm testing with

    $ qemu-storage-daemon --nbd $ARG

because that one doesn't have an implied key, which permits slightly
simpler $ARG.

* Empty key

  --nbd ,

    master:       Invalid parameter ''
    your patch:   Expected parameter before ','

    Improvement.

  --nbd key=val,=,fob=

    master:       Invalid parameter ''
    your patch:   Expected parameter before '='

    Improvement, but which '='?  Possibly better:

                  Expected parameter before '=,fob='

* Empty key fragment

  --nbd key..=

    master:       Invalid parameter 'key..'
    your patch:   same

    Slightly better, I think:

                  Name or number expected after 'key.'

  --nbd .key=

    master:       Invalid parameter '..key'
    your patch:   same

    Better, I think:

                  Name expected before '..key'

  Odd: if I omit the '=', your patch's message often changes to

                  Expected '=' after parameter ...

  This means the parser reports a non-first syntax error.  Parsing
  smell, I'm afraid :)

* Invalid key fragment

  --nbd _=

    master:       Invalid parameter '_'
    your patch:   same

  --nbd key.1a.b=

    master:       Invalid parameter 'key.1a.b'
    your patch:   same

    Slightly better, I think:

                  'key.1a' is not a valid parameter name

  --ndb anti,,social,,key=

    master:       Expected '=' after parameter 'anti'
    your patch:   Invalid parameter 'anti,social,key'

    The new error message shows the *unescaped* string.  Okay.

> The slightly baroque interface of keyval_fetch_string lets me keep the
> key parsing loop mostly untouched.  It is simplified in the next patch,
> however.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

I believe there are two, maybe three reasons for this series:

1. Support ',' in values with an implicit keys.

2. Improve error reporting.

3. Maybe nicer code.

1. is a feature without a known use.  2. can be had with much less churn
(I'm ready to back that up with a patch).  Since I haven't looked at
PATCH 2, I'm reserving judgement on 3.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-27  8:38   ` Markus Armbruster
@ 2020-11-27  9:15     ` Paolo Bonzini
  2020-11-27 14:39       ` Markus Armbruster
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2020-11-27  9:15 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel

[huge snip]

On 27/11/20 09:38, Markus Armbruster wrote:
> The suboptimal error message is due to the way I coded the parser, not
> due to the grammar.

Small digression: a different grammar influences how the parser is 
written.  You coded the parser like this because you thought of implied 
options as "key without ="; instead I thought of them as "value not 
preceded by key=".

> 
>   --nbd key=val,=,fob=
> 
>     master:       Invalid parameter ''
>     your patch:   Expected parameter before '='
> 
>     Improvement, but which '='?  Possibly better:
> 
>                   Expected parameter before '=,fob='

Yup, easy.

>    --nbd .key=
> 
>      master:       Invalid parameter '..key'
>      your patch:   same
> 
>      Better, I think:
> 
>                    Name expected before '..key'
> 
>    Odd: if I omit the '=', your patch's message often changes to
> 
>                    Expected '=' after parameter ...
> 
>    This means the parser reports a non-first syntax error.  Parsing
>    smell, I'm afraid :)

Nah, just lazy cut-and-paste of the existing error message.  I should 
rename that error to something "No implicit parameter name for '.key'" 
(again, different grammar -> different parser -> different error).  That 
error message actually makes sense: "--object .key" would create an 
object of type ".key" both without or with these changes.

> * Invalid key fragment
> 
>    --nbd key.1a.b=
> 
>      master:       Invalid parameter 'key.1a.b'
>      your patch:   same
> 
>      Slightly better, I think:
> 
>                    'key.1a' is not a valid parameter name

Or just "Invalid parameter '1a'".  I'm not going to do that in v2 
though, since parameter parsing is not being

> I believe there are two, maybe three reasons for this series:
> 
> 1. Support ',' in values with an implicit keys.
> 
> 2. Improve error reporting.
> 
> 3. Maybe nicer code.
> 
> 1. is a feature without a known use.

Breaking news: there is actually a use.  I should have pointed out in 
the commit message, but I didn't realize at the time, that this patch 
fixes device-introspect-test once device_add is switched to keyval-based 
parsing.  And why is that?  Because even though SUNW,fdtwo is not 
user-creatable, you can still do "device_add SUNW,,fdtwo,help".  It even 
works from the command line:

$ qemu-system-sparc -device SUNW,,fdtwo,help
SUNW,fdtwo options:
   drive=<str>            - Node name or ID of a block device to use as 
a backend
   fallback=<FdcDriveType> - FDC drive type, 144/288/120/none/auto 
(default: "144")
   ...

This invocation is useful (for some value of useful) to see which 
properties you can pass with -global.  So there *is* a valid (for some 
value of valid) use of escaped commas in implied options.  It can be 
fixed with deprecation etc. but that would be a more complicated 
endeavor than just adjusting keyval.

> 2. can be had with much less churn
> (I'm ready to back that up with a patch).  Since I haven't looked at
> PATCH 2, I'm reserving judgement on 3.

FWIW I think this patch is already an improvement in code niceness, 
though I accept that it's in the eye of the beholder.

Paolo



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-27  9:15     ` Paolo Bonzini
@ 2020-11-27 14:39       ` Markus Armbruster
  2020-11-27 15:39         ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Markus Armbruster @ 2020-11-27 14:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> [huge snip]
>
> On 27/11/20 09:38, Markus Armbruster wrote:
>> The suboptimal error message is due to the way I coded the parser, not
>> due to the grammar.
>
> Small digression: a different grammar influences how the parser is 
> written.  You coded the parser like this because you thought of implied 
> options as "key without ="; instead I thought of them as "value not 
> preceded by key=".

Which requires relaxing the key syntax, as you did.  Without that, we'd
need unbounded lookahead to decide how to reduce a comma.

>> 
>>   --nbd key=val,=,fob=
>> 
>>     master:       Invalid parameter ''
>>     your patch:   Expected parameter before '='
>> 
>>     Improvement, but which '='?  Possibly better:
>> 
>>                   Expected parameter before '=,fob='
>
> Yup, easy.
>
>>    --nbd .key=
>> 
>>      master:       Invalid parameter '..key'
>>      your patch:   same
>> 
>>      Better, I think:
>> 
>>                    Name expected before '..key'
>> 
>>    Odd: if I omit the '=', your patch's message often changes to
>> 
>>                    Expected '=' after parameter ...
>> 
>>    This means the parser reports a non-first syntax error.  Parsing
>>    smell, I'm afraid :)
>
> Nah, just lazy cut-and-paste of the existing error message.  I should 
> rename that error to something "No implicit parameter name for '.key'" 
> (again, different grammar -> different parser -> different error).  That 
> error message actually makes sense: "--object .key" would create an 
> object of type ".key" both without or with these changes.

However, --object a=b,.key would not, because the sugar is available
for the leftmost value only.

"No implicit parameter name" assumes the user intended .key as a value,
and forgot to write the key.  We could instead assume the user intended
.key as key, and messed it up (forgot a fragment, fat-fingered '.',
whatever).  The absence of '=' makes the value assumption more
plausible, but that's already lookahead.

Error messages based on guesses what the user has in mind can be quite
confusing when we guess wrong.  A strictly factual syntax error style
like "I expected FOO instead of BAR here" may not be great, but has a
relatively low risk of being confusing.

>> * Invalid key fragment
>> 
>>    --nbd key.1a.b=
>> 
>>      master:       Invalid parameter 'key.1a.b'
>>      your patch:   same
>> 
>>      Slightly better, I think:
>> 
>>                    'key.1a' is not a valid parameter name
>
> Or just "Invalid parameter '1a'".  I'm not going to do that in v2 
> though, since parameter parsing is not being

Sentence not being finished?

>> I believe there are two, maybe three reasons for this series:
>> 
>> 1. Support ',' in values with an implicit keys.
>> 
>> 2. Improve error reporting.
>> 
>> 3. Maybe nicer code.
>> 
>> 1. is a feature without a known use.
>
> Breaking news: there is actually a use.  I should have pointed out in 
> the commit message, but I didn't realize at the time, that this patch 
> fixes device-introspect-test once device_add is switched to keyval-based 
> parsing.  And why is that?  Because even though SUNW,fdtwo is not 
> user-creatable, you can still do "device_add SUNW,,fdtwo,help".  It even 
> works from the command line:
>
> $ qemu-system-sparc -device SUNW,,fdtwo,help
> SUNW,fdtwo options:
>    drive=<str>            - Node name or ID of a block device to use as 
> a backend
>    fallback=<FdcDriveType> - FDC drive type, 144/288/120/none/auto 
> (default: "144")
>    ...

Right.  I actually had that knowledge filed in my brain, but it failed
to bubble up.

It fixes device-introspect-test only because you also fixed the test to
escape comma (commit e27bd498769, in rc1).  Quoting myself: "Parameter
values with comma need special treatment before and after the patch."

> This invocation is useful (for some value of useful) to see which 
> properties you can pass with -global.  So there *is* a valid (for some 
> value of valid) use of escaped commas in implied options.  It can be 
> fixed with deprecation etc. but that would be a more complicated 
> endeavor than just adjusting keyval.

The question becomes whether CLI help syntax is subject to the
compatibility promise.

It's certainly not something we'd want programs to use.  We provide QMP
commands for the purpose.

For human users, the usability goodness of keeping

    -device SUNW,,fdtwo,help

working would in my opinion be dwarved several times over by renaming
the the offenders so that you don't need arcane knowledge "double the
comma" just to get help.  We should do that regardless of this patch.

>> 2. can be had with much less churn
>> (I'm ready to back that up with a patch).  Since I haven't looked at
>> PATCH 2, I'm reserving judgement on 3.
>
> FWIW I think this patch is already an improvement in code niceness, 
> though I accept that it's in the eye of the beholder.
>
> Paolo



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] keyval: accept escaped commas in implied option
  2020-11-27 14:39       ` Markus Armbruster
@ 2020-11-27 15:39         ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2020-11-27 15:39 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel

On 27/11/20 15:39, Markus Armbruster wrote:
>> Nah, just lazy cut-and-paste of the existing error message.  I should
>> rename that error to something "No implicit parameter name for '.key'"
>> (again, different grammar -> different parser -> different error).  That
>> error message actually makes sense: "--object .key" would create an
>> object of type ".key" both without or with these changes.
> However, --object a=b,.key would not, because the sugar is available
> for the leftmost value only.
> 
> "No implicit parameter name" assumes the user intended .key as a value,
> and forgot to write the key.  We could instead assume the user intended
> .key as key, and messed it up (forgot a fragment, fat-fingered '.',
> whatever).  The absence of '=' makes the value assumption more
> plausible, but that's already lookahead.

To be fair, lookahead is a common trick to get better error messages.

The typical example is C's "id1 id2".  After "id1 id2" you already know 
it's a syntax error, but you do some lookahead because "id1 id2;" can be 
recovered as "id1 was supposed to be a type, so treat this as declaring 
a variable id2".  "id1 id2)" is not handled the same way.

Of course that's done for a completely different reason (cascading error 
messages---QEMU only reports the first), but it goes to show that 
parsing ahead is not necessarily a bad idea

> Error messages based on guesses what the user has in mind can be quite
> confusing when we guess wrong.  A strictly factual syntax error style
> like "I expected FOO instead of BAR here" may not be great, but has a
> relatively low risk of being confusing.

This is true.  That's a point in favor of "Expected '=' after parameter".

>>>
>>>
>>>      master:       Invalid parameter 'key.1a.b'
>>>      your patch:   same
>>>
>>>      Slightly better, I think:
>>>
>>>                    'key.1a' is not a valid parameter name
>>
>> Or just "Invalid parameter '1a'".  I'm not going to do that in v2 
>> though, since parameter parsing is not being
> 
> Sentence not being finished?

not being modified.

>> This invocation is useful (for some value of useful) to see which 
>> properties you can pass with -global.  So there *is* a valid (for some 
>> value of valid) use of escaped commas in implied options.  It can be 
>> fixed with deprecation etc. but that would be a more complicated 
>> endeavor than just adjusting keyval.
> 
> The question becomes whether CLI help syntax is subject to the
> compatibility promise.

Indeed.  But I still don't see it as a good reason _not_ to do the 
change, as I find the modified definition (grammar, code, etc.) to be 
easier on the brain too.

Paolo



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-11-27 15:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-11 10:45 [PATCH 0/2] keyval: accept escaped commas in implied option Paolo Bonzini
2020-11-11 10:45 ` [PATCH 1/2] " Paolo Bonzini
2020-11-11 10:53   ` Daniel P. Berrangé
2020-11-11 11:05     ` Paolo Bonzini
2020-11-11 11:14     ` Mark Cave-Ayland
2020-11-27  8:38   ` Markus Armbruster
2020-11-27  9:15     ` Paolo Bonzini
2020-11-27 14:39       ` Markus Armbruster
2020-11-27 15:39         ` Paolo Bonzini
2020-11-11 10:45 ` [PATCH 2/2] keyval: simplify keyval_parse_one Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.