All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3)
@ 2013-01-16 18:28 Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod, Laszlo Ersek

Changes v2 -> v3:
 - Add 'base' parameter to parse_uint*() (patch 1/8)
 - Keep existing base=10 behavior when parsing "nodeid" and "cpus"
   (patches 6/8, 8/8)
 - Trivial whitespace change on patch 1/8
 - Fix fprintf() format string on patch 5/8

This series contains only the most important fixes from the previous "-numa
option parsing fixes & improvements" series I have submitted.

I have introduced parse_uint*() helpers that can be reused by other code, later.
I plan to submit parse_int*() (for signed integers) and parse_double*()
functions too, later, and change string-input-visitor.c and opts-visitor.c to
use those common functions instead of duplicating the number parsing code.


Eduardo Habkost (8):
  cutils: unsigned int parsing functions
  vl.c: Fix off-by-one bug when handling "-numa node" argument
  vl.c: Abort on unknown -numa option type
  vl.c: Check for NUMA node limit inside numa_add()
  vl.c: numa_add(): Validate nodeid before using it
  vl.c: Use parse_uint_full() for NUMA nodeid
  vl.c: Extract -numa "cpus" parsing to separate function
  vl.c: validate -numa "cpus" parameter properly

 include/qemu-common.h |   4 +
 tests/Makefile        |   3 +
 tests/test-cutils.c   | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++
 util/cutils.c         |  79 ++++++++++++++++++
 vl.c                  |  93 ++++++++++++++++------
 5 files changed, 370 insertions(+), 25 deletions(-)
 create mode 100644 tests/test-cutils.c

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-17 18:29   ` Laszlo Ersek
  2013-01-18 10:01   ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 2/8] vl.c: Fix off-by-one bug when handling "-numa node" argument Eduardo Habkost
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Laszlo Ersek, Chegu Vinod

There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().

Parsing functions for signed ints and floats will be submitted later.

parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:

 - Check for NULL (returns -EINVAL)
 - Check for negative numbers (returns -ERANGE)
 - Check for empty string (returns -EINVAL)
 - Check for overflow or other errno values set by strtoll() (returns
   -errno)
 - Check for end of string (reject invalid characters after number)
   (returns -EINVAL)

parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.

Unit tests included.

[1] string-input-visitor.c:parse_int() could use the same parsing code
    used by opts-visitor.c:opts_type_int(), instead of duplicating that
    logic.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Blake <eblake@redhat.com>

Changes v2:
 - Trivial whitespace change
 - Add 'base' parameter to the functions
---
 include/qemu-common.h |   4 +
 tests/Makefile        |   3 +
 tests/test-cutils.c   | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++
 util/cutils.c         |  79 ++++++++++++++++++
 4 files changed, 302 insertions(+)
 create mode 100644 tests/test-cutils.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index ca464bb..f134629 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -170,6 +170,10 @@ int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
 
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base);
+int parse_uint_full(const char *s, unsigned long long *value, int base);
+
 /*
  * strtosz() suffixes used to specify the default treatment of an
  * argument passed to strtosz() without an explicit suffix.
diff --git a/tests/Makefile b/tests/Makefile
index d97a571..2dba3e5 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
 check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
+check-unit-y += tests/test-cutils$(EXESUF)
+gcov-files-test-cutils-y += util/cutils.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -84,6 +86,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
+tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-cutils.c b/tests/test-cutils.c
new file mode 100644
index 0000000..f4f6951
--- /dev/null
+++ b/tests/test-cutils.c
@@ -0,0 +1,216 @@
+/*
+ * cutils.c unit-tests
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Eduardo Habkost <ehabkost@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+
+
+static void test_parse_uint_null(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    int r;
+
+    r = parse_uint(NULL, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == NULL);
+}
+
+static void test_parse_uint_empty(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+static void test_parse_uint_trailing(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + 3);
+}
+
+static void test_parse_uint_correct(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_octal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 0123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_decimal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 10);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+
+static void test_parse_uint_llong_max(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    char *str = g_strdup_printf("%llu", (unsigned long long)LLONG_MAX + 1);
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, (unsigned long long)LLONG_MAX + 1);
+    g_assert(endptr == str + strlen(str));
+
+    g_free(str);
+}
+
+static void test_parse_uint_overflow(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "99999999999999999999999999999999999999";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, ULLONG_MAX);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_negative(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t -321";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str + 3);
+}
+
+
+static void test_parse_uint_full_trailing(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 123);
+}
+
+static void test_parse_uint_full_correct(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
+    g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
+    g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
+    g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
+    g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
+    g_test_add_func("/cutils/parse_uint/decimal", test_parse_uint_decimal);
+    g_test_add_func("/cutils/parse_uint/llong_max", test_parse_uint_llong_max);
+    g_test_add_func("/cutils/parse_uint/overflow", test_parse_uint_overflow);
+    g_test_add_func("/cutils/parse_uint/negative", test_parse_uint_negative);
+    g_test_add_func("/cutils/parse_uint_full/trailing",
+                    test_parse_uint_full_trailing);
+    g_test_add_func("/cutils/parse_uint_full/correct",
+                    test_parse_uint_full_correct);
+
+    return g_test_run();
+}
diff --git a/util/cutils.c b/util/cutils.c
index 80bb1dc..7f09251 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -270,6 +270,85 @@ int64_t strtosz(const char *nptr, char **end)
     return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
 }
 
+/* Try to parse an unsigned integer
+ *
+ * Error checks done by the function:
+ * - NULL pointer will return -EINVAL.
+ * - Empty strings will return -EINVAL.
+ * - Overflow errors or other errno values  set by strtoull() will
+ *   return -errno (-ERANGE in case of overflow).
+ * - Differently from strtoull(), values starting with a minus sign are
+ *   rejected (returning -ERANGE).
+ *
+ * Sets endptr to point to the first invalid character. Callers may rely
+ * on *value and *endptr to be always set by the function, even in case of
+ * errors.
+ *
+ * The 'base' parameter has the same meaning of 'base' on strtoull().
+ *
+ * Returns 0 on success, negative errno value on error.
+ */
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base)
+{
+    int r = 0;
+    char *endp = (char *)s;
+    unsigned long long val = 0;
+
+    if (!s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+    /* make sure we reject negative numbers: */
+    while (isspace((unsigned char)*s)) {
+        ++s; endp++;
+    }
+    if (*s == '-') {
+        r = -ERANGE;
+        goto out;
+    }
+
+    errno = 0;
+    val = strtoull(s, &endp, base);
+    if (errno) {
+        r = -errno;
+        goto out;
+    }
+
+    if (endp == s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+out:
+    *value = val;
+    *endptr = endp;
+    return r;
+}
+
+/* Try to parse an unsigned integer, making sure the whole string is parsed
+ *
+ * Have the same behavior of parse_uint(), but with an additional check
+ * for additional data after the parsed number (in that case, the function
+ * will return -EINVAL).
+ */
+int parse_uint_full(const char *s, unsigned long long *value, int base)
+{
+    char *endp;
+    int r;
+
+    r = parse_uint(s, value, &endp, base);
+    if (r < 0) {
+        return r;
+    }
+    if (*endp) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 int qemu_parse_fd(const char *param)
 {
     int fd;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 2/8] vl.c: Fix off-by-one bug when handling "-numa node" argument
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 3/8] vl.c: Abort on unknown -numa option type Eduardo Habkost
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

The numa_add() code was unconditionally adding 1 to the get_opt_name()
return value, making it point after the end of the string if no ','
separator is present.

Example of weird behavior caused by the bug:

  $ qemu-img create -f qcow2 this-file-image-has,cpus=5,mem=1000,in-its-name.qcow2 5G
  Formatting 'this-file-image-has,cpus=5,mem=1000,in-its-name.qcow2', fmt=qcow2 size=5368709120 encryption=off cluster_size=65536
  $ ./x86_64-softmmu/qemu-system-x86_64 -S -monitor stdio -numa node 'this-file-image-has,cpus=5,mem=1000,in-its-name.qcow2'
  QEMU 1.3.50 monitor - type 'help' for more information
  (qemu) info numa
  1 nodes
  node 0 cpus: 0
  node 0 size: 1000 MB
  (qemu)

This changes the code to nove the pointer only if ',' is found.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 vl.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index 15e0280..f29d926 100644
--- a/vl.c
+++ b/vl.c
@@ -1250,7 +1250,10 @@ static void numa_add(const char *optarg)
 
     value = endvalue = 0ULL;
 
-    optarg = get_opt_name(option, 128, optarg, ',') + 1;
+    optarg = get_opt_name(option, 128, optarg, ',');
+    if (*optarg == ',') {
+        optarg++;
+    }
     if (!strcmp(option, "node")) {
         if (get_param_value(option, 128, "nodeid", optarg) == 0) {
             nodenr = nb_numa_nodes;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 3/8] vl.c: Abort on unknown -numa option type
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 2/8] vl.c: Fix off-by-one bug when handling "-numa node" argument Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 4/8] vl.c: Check for NUMA node limit inside numa_add() Eduardo Habkost
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

Abort in case an invalid -numa option is provided, instead of silently
ignoring it.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 vl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/vl.c b/vl.c
index f29d926..1cf8ba6 100644
--- a/vl.c
+++ b/vl.c
@@ -1290,6 +1290,9 @@ static void numa_add(const char *optarg)
             bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
         }
         nb_numa_nodes++;
+    } else {
+        fprintf(stderr, "Invalid -numa option: %s\n", option);
+        exit(1);
     }
 }
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 4/8] vl.c: Check for NUMA node limit inside numa_add()
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
                   ` (2 preceding siblings ...)
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 3/8] vl.c: Abort on unknown -numa option type Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 5/8] vl.c: numa_add(): Validate nodeid before using it Eduardo Habkost
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

Instead of checking the limit before calling numa_add(), check the limit
only when we already know we're going to add a new node.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
Changes v2:
 - Implement the change without adding numa_node_add() function
---
 vl.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index 1cf8ba6..dabbba1 100644
--- a/vl.c
+++ b/vl.c
@@ -1255,6 +1255,12 @@ static void numa_add(const char *optarg)
         optarg++;
     }
     if (!strcmp(option, "node")) {
+
+        if (nb_numa_nodes >= MAX_NODES) {
+            fprintf(stderr, "qemu: too many NUMA nodes\n");
+            exit(1);
+        }
+
         if (get_param_value(option, 128, "nodeid", optarg) == 0) {
             nodenr = nb_numa_nodes;
         } else {
@@ -2960,10 +2966,6 @@ int main(int argc, char **argv, char **envp)
                 }
                 break;
             case QEMU_OPTION_numa:
-                if (nb_numa_nodes >= MAX_NODES) {
-                    fprintf(stderr, "qemu: too many NUMA nodes\n");
-                    exit(1);
-                }
                 numa_add(optarg);
                 break;
             case QEMU_OPTION_display:
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 5/8] vl.c: numa_add(): Validate nodeid before using it
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
                   ` (3 preceding siblings ...)
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 4/8] vl.c: Check for NUMA node limit inside numa_add() Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 6/8] vl.c: Use parse_uint_full() for NUMA nodeid Eduardo Habkost
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

Without this check, QEMU will corrupt memory if a too-large nodeid is
provided in the command-line. e.g.:

  -numa node,mem=...,cpus=...,nodeid=65

This changes nodenr to unsigned long long, to avoid integer conversion
issues when converting the strtoull() result to int.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Changes v2:
 - Implement change without creation of numa_node_add() function

Changes v3:
 - Fix fprintf() format to use "%llu" for unsigned long long nodenr
---
 vl.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index dabbba1..b39cd9a 100644
--- a/vl.c
+++ b/vl.c
@@ -1246,7 +1246,7 @@ static void numa_add(const char *optarg)
     char option[128];
     char *endptr;
     unsigned long long value, endvalue;
-    int nodenr;
+    unsigned long long nodenr;
 
     value = endvalue = 0ULL;
 
@@ -1267,6 +1267,11 @@ static void numa_add(const char *optarg)
             nodenr = strtoull(option, NULL, 10);
         }
 
+        if (nodenr >= MAX_NODES) {
+            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
+            exit(1);
+        }
+
         if (get_param_value(option, 128, "mem", optarg) == 0) {
             node_mem[nodenr] = 0;
         } else {
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 6/8] vl.c: Use parse_uint_full() for NUMA nodeid
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
                   ` (4 preceding siblings ...)
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 5/8] vl.c: numa_add(): Validate nodeid before using it Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 7/8] vl.c: Extract -numa "cpus" parsing to separate function Eduardo Habkost
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

This should catch many kinds of errors that the current code wasn't
checking for:

 - Values that can't be parsed as a number
 - Negative values
 - Overflow
 - Empty string

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Cc: Eric Blake <eblake@redhat.com>

Changes v2:
 - Use base=10 to keep the existing behavior
---
 vl.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index b39cd9a..abc2af5 100644
--- a/vl.c
+++ b/vl.c
@@ -1264,7 +1264,10 @@ static void numa_add(const char *optarg)
         if (get_param_value(option, 128, "nodeid", optarg) == 0) {
             nodenr = nb_numa_nodes;
         } else {
-            nodenr = strtoull(option, NULL, 10);
+            if (parse_uint_full(option, &nodenr, 10) < 0) {
+                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
+                exit(1);
+            }
         }
 
         if (nodenr >= MAX_NODES) {
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 7/8] vl.c: Extract -numa "cpus" parsing to separate function
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
                   ` (5 preceding siblings ...)
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 6/8] vl.c: Use parse_uint_full() for NUMA nodeid Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 8/8] vl.c: validate -numa "cpus" parameter properly Eduardo Habkost
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

This will make it easier to refactor that code later.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Changes v2:
 - Implemented change without creation of numa_node_add() function
---
 vl.c | 41 +++++++++++++++++++++++------------------
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/vl.c b/vl.c
index abc2af5..d037193 100644
--- a/vl.c
+++ b/vl.c
@@ -1241,15 +1241,34 @@ char *get_boot_devices_list(uint32_t *size)
     return list;
 }
 
+static void numa_node_parse_cpus(int nodenr, const char *cpus)
+{
+    char *endptr;
+    unsigned long long value, endvalue;
+
+    value = strtoull(cpus, &endptr, 10);
+    if (*endptr == '-') {
+        endvalue = strtoull(endptr+1, &endptr, 10);
+    } else {
+        endvalue = value;
+    }
+
+    if (!(endvalue < MAX_CPUMASK_BITS)) {
+        endvalue = MAX_CPUMASK_BITS - 1;
+        fprintf(stderr,
+            "A max of %d CPUs are supported in a guest\n",
+             MAX_CPUMASK_BITS);
+    }
+
+    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+}
+
 static void numa_add(const char *optarg)
 {
     char option[128];
     char *endptr;
-    unsigned long long value, endvalue;
     unsigned long long nodenr;
 
-    value = endvalue = 0ULL;
-
     optarg = get_opt_name(option, 128, optarg, ',');
     if (*optarg == ',') {
         optarg++;
@@ -1287,21 +1306,7 @@ static void numa_add(const char *optarg)
             node_mem[nodenr] = sval;
         }
         if (get_param_value(option, 128, "cpus", optarg) != 0) {
-            value = strtoull(option, &endptr, 10);
-            if (*endptr == '-') {
-                endvalue = strtoull(endptr+1, &endptr, 10);
-            } else {
-                endvalue = value;
-            }
-
-            if (!(endvalue < MAX_CPUMASK_BITS)) {
-                endvalue = MAX_CPUMASK_BITS - 1;
-                fprintf(stderr,
-                    "A max of %d CPUs are supported in a guest\n",
-                     MAX_CPUMASK_BITS);
-            }
-
-            bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+            numa_node_parse_cpus(nodenr, option);
         }
         nb_numa_nodes++;
     } else {
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 8/8] vl.c: validate -numa "cpus" parameter properly
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
                   ` (6 preceding siblings ...)
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 7/8] vl.c: Extract -numa "cpus" parsing to separate function Eduardo Habkost
@ 2013-01-16 18:28 ` Eduardo Habkost
  2013-01-16 20:07 ` [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eric Blake
       [not found] ` <20130128165559.GA6849@otherpad.lan.raisama.net>
  9 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 18:28 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Chegu Vinod

- Accept empty strings without aborting
- Use parse_uint*() to parse numbers
- Abort if anything except '-' or end-of-string is found after the first
  number.
- Check for endvalue < value

Also change the MAX_CPUMASK_BITS warning message from "A max of %d CPUs
are supported in a guest" to "qemu: NUMA: A max of %d VCPUs are
supported".

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Cc: Eric Blake <eblake@redhat.com>

Changes v2:
 - Use base=10 to keep existing behavior
---
 vl.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/vl.c b/vl.c
index d037193..73fefe6 100644
--- a/vl.c
+++ b/vl.c
@@ -1246,21 +1246,43 @@ static void numa_node_parse_cpus(int nodenr, const char *cpus)
     char *endptr;
     unsigned long long value, endvalue;
 
-    value = strtoull(cpus, &endptr, 10);
+    /* Empty CPU range strings will be considered valid, they will simply
+     * not set any bit in the CPU bitmap.
+     */
+    if (!*cpus) {
+        return;
+    }
+
+    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
+        goto error;
+    }
     if (*endptr == '-') {
-        endvalue = strtoull(endptr+1, &endptr, 10);
-    } else {
+        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
+            goto error;
+        }
+    } else if (*endptr == '\0') {
         endvalue = value;
+    } else {
+        goto error;
     }
 
-    if (!(endvalue < MAX_CPUMASK_BITS)) {
+    if (endvalue >= MAX_CPUMASK_BITS) {
         endvalue = MAX_CPUMASK_BITS - 1;
         fprintf(stderr,
-            "A max of %d CPUs are supported in a guest\n",
+            "qemu: NUMA: A max of %d VCPUs are supported\n",
              MAX_CPUMASK_BITS);
     }
 
+    if (endvalue < value) {
+        goto error;
+    }
+
     bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+    return;
+
+error:
+    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
+    exit(1);
 }
 
 static void numa_add(const char *optarg)
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3)
  2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
                   ` (7 preceding siblings ...)
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 8/8] vl.c: validate -numa "cpus" parameter properly Eduardo Habkost
@ 2013-01-16 20:07 ` Eric Blake
       [not found] ` <20130128165559.GA6849@otherpad.lan.raisama.net>
  9 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2013-01-16 20:07 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori

[-- Attachment #1: Type: text/plain, Size: 949 bytes --]

On 01/16/2013 11:28 AM, Eduardo Habkost wrote:
> Changes v2 -> v3:
>  - Add 'base' parameter to parse_uint*() (patch 1/8)
>  - Keep existing base=10 behavior when parsing "nodeid" and "cpus"
>    (patches 6/8, 8/8)
>  - Trivial whitespace change on patch 1/8
>  - Fix fprintf() format string on patch 5/8
> 
> This series contains only the most important fixes from the previous "-numa
> option parsing fixes & improvements" series I have submitted.
> 
> I have introduced parse_uint*() helpers that can be reused by other code, later.
> I plan to submit parse_int*() (for signed integers) and parse_double*()
> functions too, later, and change string-input-visitor.c and opts-visitor.c to
> use those common functions instead of duplicating the number parsing code.

Series:
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
@ 2013-01-17 18:29   ` Laszlo Ersek
  2013-01-17 18:50     ` Eduardo Habkost
  2013-01-18 10:01   ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
  1 sibling, 1 reply; 31+ messages in thread
From: Laszlo Ersek @ 2013-01-17 18:29 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Chegu Vinod, qemu-devel, Anthony Liguori

On 01/16/13 19:28, Eduardo Habkost wrote:

> +static void test_parse_uint_negative(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = " \t -321";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, -ERANGE);
> +    g_assert_cmpint(i, ==, 0);
> +    g_assert(endptr == str + 3);
> +}

I think it would be more true to the strtol() family if in this case

(a) we reported -EINVAL (invalid subject sequence) -- but I certainly
don't insist on that,

(b) and, independently,

(b1) we either consumed all of the whitespace sequence *and* the subject
sequence (which would be consistent with ERANGE; see
test_parse_uint_overflow()),

(b2) or we didn't consume anything (not even part of the whitespace
sequence). This would be easy to implement and also consistent with the
strtol() family's behavior when it sees an invalid subject sequence:

"If the subject sequence is empty or does not have the expected form, no
conversion is performed; the value of /str/ is stored in the object
pointed to by /endptr/, provided that /endptr/ is not a null pointer."

But I don't insist on (b) either :)

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-17 18:29   ` Laszlo Ersek
@ 2013-01-17 18:50     ` Eduardo Habkost
  2013-01-17 19:06       ` [Qemu-devel] [PATCH 1/8 v4] " Eduardo Habkost
  0 siblings, 1 reply; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-17 18:50 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Chegu Vinod, qemu-devel, Anthony Liguori

On Thu, Jan 17, 2013 at 07:29:17PM +0100, Laszlo Ersek wrote:
> On 01/16/13 19:28, Eduardo Habkost wrote:
> 
> > +static void test_parse_uint_negative(void)
> > +{
> > +    unsigned long long i = 999;
> > +    char f = 'X';
> > +    char *endptr = &f;
> > +    const char *str = " \t -321";
> > +    int r;
> > +
> > +    r = parse_uint(str, &i, &endptr, 0);
> > +
> > +    g_assert_cmpint(r, ==, -ERANGE);
> > +    g_assert_cmpint(i, ==, 0);
> > +    g_assert(endptr == str + 3);
> > +}
> 
> I think it would be more true to the strtol() family if in this case
> 
> (a) we reported -EINVAL (invalid subject sequence) -- but I certainly
> don't insist on that,

It makes sense, as we didn't consume any number and simply aborted
parsing as soon as '-' was found.

> 
> (b) and, independently,
> 
> (b1) we either consumed all of the whitespace sequence *and* the subject
> sequence (which would be consistent with ERANGE; see
> test_parse_uint_overflow()),
> 
> (b2) or we didn't consume anything (not even part of the whitespace
> sequence). This would be easy to implement and also consistent with the
> strtol() family's behavior when it sees an invalid subject sequence:

This makes sense as well, especially if we return -EINVAL.

I will submit a new version of just this patch. Thanks!

> 
> "If the subject sequence is empty or does not have the expected form, no
> conversion is performed; the value of /str/ is stored in the object
> pointed to by /endptr/, provided that /endptr/ is not a null pointer."
> 
> But I don't insist on (b) either :)
> 
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>


-- 
Eduardo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 1/8 v4] cutils: unsigned int parsing functions
  2013-01-17 18:50     ` Eduardo Habkost
@ 2013-01-17 19:06       ` Eduardo Habkost
  2013-01-17 19:25         ` Eduardo Habkost
                           ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-17 19:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Chegu Vinod, Laszlo Ersek, Anthony Liguori

There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().

Parsing functions for signed ints and floats will be submitted later.

parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:

 - Check for NULL (returns -EINVAL)
 - Check for negative numbers (returns -ERANGE)
 - Check for empty string (returns -EINVAL)
 - Check for overflow or other errno values set by strtoll() (returns
   -errno)
 - Check for end of string (reject invalid characters after number)
   (returns -EINVAL)

parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.

Unit tests included.

[1] string-input-visitor.c:parse_int() could use the same parsing code
    used by opts-visitor.c:opts_type_int(), instead of duplicating that
    logic.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Changes v2:
 - Trivial whitespace change
 - Add 'base' parameter to the functions

Changes v4:
 - Return -EINVAL in case a minus sign is found
 - Make endptr point to beginning of string in case -EINVAL
   is returned (like the strtoull() behavior)

v3->v4 interdiff:

    diff --git a/tests/test-cutils.c b/tests/test-cutils.c
    index f4f6951..f8395f4 100644
    --- a/tests/test-cutils.c
    +++ b/tests/test-cutils.c
    @@ -61,6 +61,22 @@ static void test_parse_uint_empty(void)
         g_assert(endptr == str);
     }
     
    +static void test_parse_uint_invalid(void)
    +{
    +    unsigned long long i = 999;
    +    char f = 'X';
    +    char *endptr = &f;
    +    const char *str = " \t xxx";
    +    int r;
    +
    +    r = parse_uint(str, &i, &endptr, 0);
    +
    +    g_assert_cmpint(r, ==, -EINVAL);
    +    g_assert_cmpint(i, ==, 0);
    +    g_assert(endptr == str);
    +}
    +
    +
     static void test_parse_uint_trailing(void)
     {
         unsigned long long i = 999;
    @@ -164,9 +180,9 @@ static void test_parse_uint_negative(void)
     
         r = parse_uint(str, &i, &endptr, 0);
     
    -    g_assert_cmpint(r, ==, -ERANGE);
    +    g_assert_cmpint(r, ==, -EINVAL);
         g_assert_cmpint(i, ==, 0);
    -    g_assert(endptr == str + 3);
    +    g_assert(endptr == str);
     }
     
     
    @@ -200,6 +216,7 @@ int main(int argc, char **argv)
     
         g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
         g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
    +    g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
         g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
         g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
         g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
    diff --git a/util/cutils.c b/util/cutils.c
    index 7f09251..ce80a2a 100644
    --- a/util/cutils.c
    +++ b/util/cutils.c
    @@ -278,7 +278,7 @@ int64_t strtosz(const char *nptr, char **end)
      * - Overflow errors or other errno values  set by strtoull() will
      *   return -errno (-ERANGE in case of overflow).
      * - Differently from strtoull(), values starting with a minus sign are
    - *   rejected (returning -ERANGE).
    + *   rejected (returning -EINVAL).
      *
      * Sets endptr to point to the first invalid character. Callers may rely
      * on *value and *endptr to be always set by the function, even in case of
    @@ -294,6 +294,7 @@ int parse_uint(const char *s, unsigned long long *value, char **endptr,
         int r = 0;
         char *endp = (char *)s;
         unsigned long long val = 0;
    +    const char *sp;
     
         if (!s) {
             r = -EINVAL;
    @@ -301,11 +302,12 @@ int parse_uint(const char *s, unsigned long long *value, char **endptr,
         }
     
         /* make sure we reject negative numbers: */
    -    while (isspace((unsigned char)*s)) {
    -        ++s; endp++;
    +    sp = s;
    +    while (isspace((unsigned char)*sp)) {
    +        ++sp;
         }
    -    if (*s == '-') {
    -        r = -ERANGE;
    +    if (*sp == '-') {
    +        r = -EINVAL;
             goto out;
         }

---
 include/qemu-common.h |   4 +
 tests/Makefile        |   3 +
 tests/test-cutils.c   | 233 ++++++++++++++++++++++++++++++++++++++++++++++++++
 util/cutils.c         |  81 ++++++++++++++++++
 4 files changed, 321 insertions(+)
 create mode 100644 tests/test-cutils.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index ca464bb..f134629 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -170,6 +170,10 @@ int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
 
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base);
+int parse_uint_full(const char *s, unsigned long long *value, int base);
+
 /*
  * strtosz() suffixes used to specify the default treatment of an
  * argument passed to strtosz() without an explicit suffix.
diff --git a/tests/Makefile b/tests/Makefile
index d86e95a..e5929cd 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
 check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
+check-unit-y += tests/test-cutils$(EXESUF)
+gcov-files-test-cutils-y += util/cutils.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -86,6 +88,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
+tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-cutils.c b/tests/test-cutils.c
new file mode 100644
index 0000000..f8395f4
--- /dev/null
+++ b/tests/test-cutils.c
@@ -0,0 +1,233 @@
+/*
+ * cutils.c unit-tests
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Eduardo Habkost <ehabkost@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+
+
+static void test_parse_uint_null(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    int r;
+
+    r = parse_uint(NULL, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == NULL);
+}
+
+static void test_parse_uint_empty(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+static void test_parse_uint_invalid(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_trailing(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + 3);
+}
+
+static void test_parse_uint_correct(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_octal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 0123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_decimal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 10);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+
+static void test_parse_uint_llong_max(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    char *str = g_strdup_printf("%llu", (unsigned long long)LLONG_MAX + 1);
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, (unsigned long long)LLONG_MAX + 1);
+    g_assert(endptr == str + strlen(str));
+
+    g_free(str);
+}
+
+static void test_parse_uint_overflow(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "99999999999999999999999999999999999999";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, ULLONG_MAX);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_negative(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t -321";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_full_trailing(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 123);
+}
+
+static void test_parse_uint_full_correct(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
+    g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
+    g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
+    g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
+    g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
+    g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
+    g_test_add_func("/cutils/parse_uint/decimal", test_parse_uint_decimal);
+    g_test_add_func("/cutils/parse_uint/llong_max", test_parse_uint_llong_max);
+    g_test_add_func("/cutils/parse_uint/overflow", test_parse_uint_overflow);
+    g_test_add_func("/cutils/parse_uint/negative", test_parse_uint_negative);
+    g_test_add_func("/cutils/parse_uint_full/trailing",
+                    test_parse_uint_full_trailing);
+    g_test_add_func("/cutils/parse_uint_full/correct",
+                    test_parse_uint_full_correct);
+
+    return g_test_run();
+}
diff --git a/util/cutils.c b/util/cutils.c
index 80bb1dc..ce80a2a 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -270,6 +270,87 @@ int64_t strtosz(const char *nptr, char **end)
     return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
 }
 
+/* Try to parse an unsigned integer
+ *
+ * Error checks done by the function:
+ * - NULL pointer will return -EINVAL.
+ * - Empty strings will return -EINVAL.
+ * - Overflow errors or other errno values  set by strtoull() will
+ *   return -errno (-ERANGE in case of overflow).
+ * - Differently from strtoull(), values starting with a minus sign are
+ *   rejected (returning -EINVAL).
+ *
+ * Sets endptr to point to the first invalid character. Callers may rely
+ * on *value and *endptr to be always set by the function, even in case of
+ * errors.
+ *
+ * The 'base' parameter has the same meaning of 'base' on strtoull().
+ *
+ * Returns 0 on success, negative errno value on error.
+ */
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base)
+{
+    int r = 0;
+    char *endp = (char *)s;
+    unsigned long long val = 0;
+    const char *sp;
+
+    if (!s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+    /* make sure we reject negative numbers: */
+    sp = s;
+    while (isspace((unsigned char)*sp)) {
+        ++sp;
+    }
+    if (*sp == '-') {
+        r = -EINVAL;
+        goto out;
+    }
+
+    errno = 0;
+    val = strtoull(s, &endp, base);
+    if (errno) {
+        r = -errno;
+        goto out;
+    }
+
+    if (endp == s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+out:
+    *value = val;
+    *endptr = endp;
+    return r;
+}
+
+/* Try to parse an unsigned integer, making sure the whole string is parsed
+ *
+ * Have the same behavior of parse_uint(), but with an additional check
+ * for additional data after the parsed number (in that case, the function
+ * will return -EINVAL).
+ */
+int parse_uint_full(const char *s, unsigned long long *value, int base)
+{
+    char *endp;
+    int r;
+
+    r = parse_uint(s, value, &endp, base);
+    if (r < 0) {
+        return r;
+    }
+    if (*endp) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 int qemu_parse_fd(const char *param)
 {
     int fd;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8 v4] cutils: unsigned int parsing functions
  2013-01-17 19:06       ` [Qemu-devel] [PATCH 1/8 v4] " Eduardo Habkost
@ 2013-01-17 19:25         ` Eduardo Habkost
  2013-01-17 19:31         ` Laszlo Ersek
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-17 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laszlo Ersek, Chegu Vinod, Anthony Liguori

On Thu, Jan 17, 2013 at 05:06:50PM -0200, Eduardo Habkost wrote:
> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
> 
> Parsing functions for signed ints and floats will be submitted later.
> 
> parse_uint_full() has all the checks made by opts_type_uint64() at
> opts-visitor.c:
> 
>  - Check for NULL (returns -EINVAL)
>  - Check for negative numbers (returns -ERANGE)

Oops, I forgot to update the commit message. I hope the commiter can
change it before committing, to avoid having to resubmit it again.


>  - Check for empty string (returns -EINVAL)
>  - Check for overflow or other errno values set by strtoll() (returns
>    -errno)
>  - Check for end of string (reject invalid characters after number)
>    (returns -EINVAL)
> 
> parse_uint() does everything above except checking for the end of the
> string, so callers can continue parsing the remainder of string after
> the number.
> 
> Unit tests included.
> 
> [1] string-input-visitor.c:parse_int() could use the same parsing code
>     used by opts-visitor.c:opts_type_int(), instead of duplicating that
>     logic.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
> Changes v2:
>  - Trivial whitespace change
>  - Add 'base' parameter to the functions
> 
> Changes v4:
>  - Return -EINVAL in case a minus sign is found
>  - Make endptr point to beginning of string in case -EINVAL
>    is returned (like the strtoull() behavior)
[...]

-- 
Eduardo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8 v4] cutils: unsigned int parsing functions
  2013-01-17 19:06       ` [Qemu-devel] [PATCH 1/8 v4] " Eduardo Habkost
  2013-01-17 19:25         ` Eduardo Habkost
@ 2013-01-17 19:31         ` Laszlo Ersek
  2013-01-17 19:55         ` Eric Blake
  2013-01-17 21:04         ` Blue Swirl
  3 siblings, 0 replies; 31+ messages in thread
From: Laszlo Ersek @ 2013-01-17 19:31 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Chegu Vinod, qemu-devel, Anthony Liguori

On 01/17/13 20:06, Eduardo Habkost wrote:

> Changes v4:
>  - Return -EINVAL in case a minus sign is found
>  - Make endptr point to beginning of string in case -EINVAL
>    is returned (like the strtoull() behavior)

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8 v4] cutils: unsigned int parsing functions
  2013-01-17 19:06       ` [Qemu-devel] [PATCH 1/8 v4] " Eduardo Habkost
  2013-01-17 19:25         ` Eduardo Habkost
  2013-01-17 19:31         ` Laszlo Ersek
@ 2013-01-17 19:55         ` Eric Blake
  2013-01-17 21:04         ` Blue Swirl
  3 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2013-01-17 19:55 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Chegu Vinod, Laszlo Ersek, qemu-devel, Anthony Liguori

[-- Attachment #1: Type: text/plain, Size: 915 bytes --]

On 01/17/2013 12:06 PM, Eduardo Habkost wrote:
> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
> 
> Parsing functions for signed ints and floats will be submitted later.
> 

> 
> Changes v4:
>  - Return -EINVAL in case a minus sign is found
>  - Make endptr point to beginning of string in case -EINVAL
>    is returned (like the strtoull() behavior)
> 
> v3->v4 interdiff:

Thanks; that helped review.

The rest of the series needs no modifications to enjoy the improvements
of this v4 patch.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8 v4] cutils: unsigned int parsing functions
  2013-01-17 19:06       ` [Qemu-devel] [PATCH 1/8 v4] " Eduardo Habkost
                           ` (2 preceding siblings ...)
  2013-01-17 19:55         ` Eric Blake
@ 2013-01-17 21:04         ` Blue Swirl
  3 siblings, 0 replies; 31+ messages in thread
From: Blue Swirl @ 2013-01-17 21:04 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori

On Thu, Jan 17, 2013 at 7:06 PM, Eduardo Habkost <ehabkost@redhat.com> wrote:
> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
>
> Parsing functions for signed ints and floats will be submitted later.
>
> parse_uint_full() has all the checks made by opts_type_uint64() at
> opts-visitor.c:
>
>  - Check for NULL (returns -EINVAL)
>  - Check for negative numbers (returns -ERANGE)
>  - Check for empty string (returns -EINVAL)
>  - Check for overflow or other errno values set by strtoll() (returns
>    -errno)
>  - Check for end of string (reject invalid characters after number)
>    (returns -EINVAL)
>
> parse_uint() does everything above except checking for the end of the
> string, so callers can continue parsing the remainder of string after
> the number.
>
> Unit tests included.

I think it would be pretty cool if we also used a string fuzzer to
check the robustness of string parsers.

>
> [1] string-input-visitor.c:parse_int() could use the same parsing code
>     used by opts-visitor.c:opts_type_int(), instead of duplicating that
>     logic.
>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
> Changes v2:
>  - Trivial whitespace change
>  - Add 'base' parameter to the functions
>
> Changes v4:
>  - Return -EINVAL in case a minus sign is found
>  - Make endptr point to beginning of string in case -EINVAL
>    is returned (like the strtoull() behavior)
>
> v3->v4 interdiff:
>
>     diff --git a/tests/test-cutils.c b/tests/test-cutils.c
>     index f4f6951..f8395f4 100644
>     --- a/tests/test-cutils.c
>     +++ b/tests/test-cutils.c
>     @@ -61,6 +61,22 @@ static void test_parse_uint_empty(void)
>          g_assert(endptr == str);
>      }
>
>     +static void test_parse_uint_invalid(void)
>     +{
>     +    unsigned long long i = 999;
>     +    char f = 'X';
>     +    char *endptr = &f;
>     +    const char *str = " \t xxx";
>     +    int r;
>     +
>     +    r = parse_uint(str, &i, &endptr, 0);
>     +
>     +    g_assert_cmpint(r, ==, -EINVAL);
>     +    g_assert_cmpint(i, ==, 0);
>     +    g_assert(endptr == str);
>     +}
>     +
>     +
>      static void test_parse_uint_trailing(void)
>      {
>          unsigned long long i = 999;
>     @@ -164,9 +180,9 @@ static void test_parse_uint_negative(void)
>
>          r = parse_uint(str, &i, &endptr, 0);
>
>     -    g_assert_cmpint(r, ==, -ERANGE);
>     +    g_assert_cmpint(r, ==, -EINVAL);
>          g_assert_cmpint(i, ==, 0);
>     -    g_assert(endptr == str + 3);
>     +    g_assert(endptr == str);
>      }
>
>
>     @@ -200,6 +216,7 @@ int main(int argc, char **argv)
>
>          g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
>          g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
>     +    g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
>          g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
>          g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
>          g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
>     diff --git a/util/cutils.c b/util/cutils.c
>     index 7f09251..ce80a2a 100644
>     --- a/util/cutils.c
>     +++ b/util/cutils.c
>     @@ -278,7 +278,7 @@ int64_t strtosz(const char *nptr, char **end)
>       * - Overflow errors or other errno values  set by strtoull() will
>       *   return -errno (-ERANGE in case of overflow).
>       * - Differently from strtoull(), values starting with a minus sign are
>     - *   rejected (returning -ERANGE).
>     + *   rejected (returning -EINVAL).
>       *
>       * Sets endptr to point to the first invalid character. Callers may rely
>       * on *value and *endptr to be always set by the function, even in case of
>     @@ -294,6 +294,7 @@ int parse_uint(const char *s, unsigned long long *value, char **endptr,
>          int r = 0;
>          char *endp = (char *)s;
>          unsigned long long val = 0;
>     +    const char *sp;
>
>          if (!s) {
>              r = -EINVAL;
>     @@ -301,11 +302,12 @@ int parse_uint(const char *s, unsigned long long *value, char **endptr,
>          }
>
>          /* make sure we reject negative numbers: */
>     -    while (isspace((unsigned char)*s)) {
>     -        ++s; endp++;
>     +    sp = s;
>     +    while (isspace((unsigned char)*sp)) {
>     +        ++sp;
>          }
>     -    if (*s == '-') {
>     -        r = -ERANGE;
>     +    if (*sp == '-') {
>     +        r = -EINVAL;
>              goto out;
>          }
>
> ---
>  include/qemu-common.h |   4 +
>  tests/Makefile        |   3 +
>  tests/test-cutils.c   | 233 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  util/cutils.c         |  81 ++++++++++++++++++
>  4 files changed, 321 insertions(+)
>  create mode 100644 tests/test-cutils.c
>
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index ca464bb..f134629 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -170,6 +170,10 @@ int qemu_fdatasync(int fd);
>  int fcntl_setfl(int fd, int flag);
>  int qemu_parse_fd(const char *param);
>
> +int parse_uint(const char *s, unsigned long long *value, char **endptr,
> +               int base);
> +int parse_uint_full(const char *s, unsigned long long *value, int base);
> +
>  /*
>   * strtosz() suffixes used to specify the default treatment of an
>   * argument passed to strtosz() without an explicit suffix.
> diff --git a/tests/Makefile b/tests/Makefile
> index d86e95a..e5929cd 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -45,6 +45,8 @@ gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
>  gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
>  check-unit-y += tests/test-thread-pool$(EXESUF)
>  gcov-files-test-thread-pool-y = thread-pool.c
> +check-unit-y += tests/test-cutils$(EXESUF)
> +gcov-files-test-cutils-y += util/cutils.c
>
>  check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
>
> @@ -86,6 +88,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
>  tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
> +tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
>
>  tests/test-qapi-types.c tests/test-qapi-types.h :\
>  $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
> diff --git a/tests/test-cutils.c b/tests/test-cutils.c
> new file mode 100644
> index 0000000..f8395f4
> --- /dev/null
> +++ b/tests/test-cutils.c
> @@ -0,0 +1,233 @@
> +/*
> + * cutils.c unit-tests
> + *
> + * Copyright (C) 2013 Red Hat Inc.
> + *
> + * Authors:
> + *  Eduardo Habkost <ehabkost@redhat.com>
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include <glib.h>
> +#include <errno.h>
> +#include <string.h>
> +
> +#include "qemu-common.h"
> +
> +
> +static void test_parse_uint_null(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    int r;
> +
> +    r = parse_uint(NULL, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, -EINVAL);
> +    g_assert_cmpint(i, ==, 0);
> +    g_assert(endptr == NULL);
> +}
> +
> +static void test_parse_uint_empty(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = "";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, -EINVAL);
> +    g_assert_cmpint(i, ==, 0);
> +    g_assert(endptr == str);
> +}
> +
> +static void test_parse_uint_invalid(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = " \t xxx";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, -EINVAL);
> +    g_assert_cmpint(i, ==, 0);
> +    g_assert(endptr == str);
> +}
> +
> +
> +static void test_parse_uint_trailing(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = "123xxx";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, 0);
> +    g_assert_cmpint(i, ==, 123);
> +    g_assert(endptr == str + 3);
> +}
> +
> +static void test_parse_uint_correct(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = "123";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, 0);
> +    g_assert_cmpint(i, ==, 123);
> +    g_assert(endptr == str + strlen(str));
> +}
> +
> +static void test_parse_uint_octal(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = "0123";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, 0);
> +    g_assert_cmpint(i, ==, 0123);
> +    g_assert(endptr == str + strlen(str));
> +}
> +
> +static void test_parse_uint_decimal(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = "0123";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 10);
> +
> +    g_assert_cmpint(r, ==, 0);
> +    g_assert_cmpint(i, ==, 123);
> +    g_assert(endptr == str + strlen(str));
> +}
> +
> +
> +static void test_parse_uint_llong_max(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    char *str = g_strdup_printf("%llu", (unsigned long long)LLONG_MAX + 1);
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, 0);
> +    g_assert_cmpint(i, ==, (unsigned long long)LLONG_MAX + 1);
> +    g_assert(endptr == str + strlen(str));
> +
> +    g_free(str);
> +}
> +
> +static void test_parse_uint_overflow(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = "99999999999999999999999999999999999999";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, -ERANGE);
> +    g_assert_cmpint(i, ==, ULLONG_MAX);
> +    g_assert(endptr == str + strlen(str));
> +}
> +
> +static void test_parse_uint_negative(void)
> +{
> +    unsigned long long i = 999;
> +    char f = 'X';
> +    char *endptr = &f;
> +    const char *str = " \t -321";
> +    int r;
> +
> +    r = parse_uint(str, &i, &endptr, 0);
> +
> +    g_assert_cmpint(r, ==, -EINVAL);
> +    g_assert_cmpint(i, ==, 0);
> +    g_assert(endptr == str);
> +}
> +
> +
> +static void test_parse_uint_full_trailing(void)
> +{
> +    unsigned long long i = 999;
> +    const char *str = "123xxx";
> +    int r;
> +
> +    r = parse_uint_full(str, &i, 0);
> +
> +    g_assert_cmpint(r, ==, -EINVAL);
> +    g_assert_cmpint(i, ==, 123);
> +}
> +
> +static void test_parse_uint_full_correct(void)
> +{
> +    unsigned long long i = 999;
> +    const char *str = "123";
> +    int r;
> +
> +    r = parse_uint_full(str, &i, 0);
> +
> +    g_assert_cmpint(r, ==, 0);
> +    g_assert_cmpint(i, ==, 123);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    g_test_init(&argc, &argv, NULL);
> +
> +    g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
> +    g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
> +    g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
> +    g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
> +    g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
> +    g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
> +    g_test_add_func("/cutils/parse_uint/decimal", test_parse_uint_decimal);
> +    g_test_add_func("/cutils/parse_uint/llong_max", test_parse_uint_llong_max);
> +    g_test_add_func("/cutils/parse_uint/overflow", test_parse_uint_overflow);
> +    g_test_add_func("/cutils/parse_uint/negative", test_parse_uint_negative);
> +    g_test_add_func("/cutils/parse_uint_full/trailing",
> +                    test_parse_uint_full_trailing);
> +    g_test_add_func("/cutils/parse_uint_full/correct",
> +                    test_parse_uint_full_correct);
> +
> +    return g_test_run();
> +}
> diff --git a/util/cutils.c b/util/cutils.c
> index 80bb1dc..ce80a2a 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -270,6 +270,87 @@ int64_t strtosz(const char *nptr, char **end)
>      return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
>  }
>
> +/* Try to parse an unsigned integer
> + *
> + * Error checks done by the function:
> + * - NULL pointer will return -EINVAL.
> + * - Empty strings will return -EINVAL.
> + * - Overflow errors or other errno values  set by strtoull() will
> + *   return -errno (-ERANGE in case of overflow).
> + * - Differently from strtoull(), values starting with a minus sign are
> + *   rejected (returning -EINVAL).
> + *
> + * Sets endptr to point to the first invalid character. Callers may rely
> + * on *value and *endptr to be always set by the function, even in case of
> + * errors.
> + *
> + * The 'base' parameter has the same meaning of 'base' on strtoull().
> + *
> + * Returns 0 on success, negative errno value on error.
> + */
> +int parse_uint(const char *s, unsigned long long *value, char **endptr,
> +               int base)
> +{
> +    int r = 0;
> +    char *endp = (char *)s;
> +    unsigned long long val = 0;
> +    const char *sp;
> +
> +    if (!s) {
> +        r = -EINVAL;
> +        goto out;
> +    }
> +
> +    /* make sure we reject negative numbers: */
> +    sp = s;
> +    while (isspace((unsigned char)*sp)) {
> +        ++sp;
> +    }
> +    if (*sp == '-') {
> +        r = -EINVAL;
> +        goto out;
> +    }
> +
> +    errno = 0;
> +    val = strtoull(s, &endp, base);
> +    if (errno) {
> +        r = -errno;
> +        goto out;
> +    }
> +
> +    if (endp == s) {
> +        r = -EINVAL;
> +        goto out;
> +    }
> +
> +out:
> +    *value = val;
> +    *endptr = endp;
> +    return r;
> +}
> +
> +/* Try to parse an unsigned integer, making sure the whole string is parsed
> + *
> + * Have the same behavior of parse_uint(), but with an additional check
> + * for additional data after the parsed number (in that case, the function
> + * will return -EINVAL).
> + */
> +int parse_uint_full(const char *s, unsigned long long *value, int base)
> +{
> +    char *endp;
> +    int r;
> +
> +    r = parse_uint(s, value, &endp, base);
> +    if (r < 0) {
> +        return r;
> +    }
> +    if (*endp) {
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
>  int qemu_parse_fd(const char *param)
>  {
>      int fd;
> --
> 1.7.11.7
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 18:28 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
  2013-01-17 18:29   ` Laszlo Ersek
@ 2013-01-18 10:01   ` Markus Armbruster
  2013-01-18 13:26     ` Eduardo Habkost
  1 sibling, 1 reply; 31+ messages in thread
From: Markus Armbruster @ 2013-01-18 10:01 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Chegu Vinod, Laszlo Ersek, qemu-devel, Anthony Liguori

Eduardo Habkost <ehabkost@redhat.com> writes:

> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
>
> Parsing functions for signed ints and floats will be submitted later.
>
> parse_uint_full() has all the checks made by opts_type_uint64() at
> opts-visitor.c:
>
>  - Check for NULL (returns -EINVAL)
>  - Check for negative numbers (returns -ERANGE)
>  - Check for empty string (returns -EINVAL)
>  - Check for overflow or other errno values set by strtoll() (returns
>    -errno)
>  - Check for end of string (reject invalid characters after number)
>    (returns -EINVAL)
>
> parse_uint() does everything above except checking for the end of the
> string, so callers can continue parsing the remainder of string after
> the number.
>
> Unit tests included.
>
> [1] string-input-visitor.c:parse_int() could use the same parsing code
>     used by opts-visitor.c:opts_type_int(), instead of duplicating that
>     logic.
[...]
> diff --git a/util/cutils.c b/util/cutils.c
> index 80bb1dc..7f09251 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -270,6 +270,85 @@ int64_t strtosz(const char *nptr, char **end)
>      return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
>  }
>  
> +/* Try to parse an unsigned integer
> + *
> + * Error checks done by the function:
> + * - NULL pointer will return -EINVAL.
> + * - Empty strings will return -EINVAL.

Same for strings containing only whitespace.

> + * - Overflow errors or other errno values  set by strtoull() will

Extra space before 'set'.

> + *   return -errno (-ERANGE in case of overflow).
> + * - Differently from strtoull(), values starting with a minus sign are
> + *   rejected (returning -ERANGE).
> + *
> + * Sets endptr to point to the first invalid character.

Mention that unlike strtol() & friends, endptr must not be null?
Probably not necessary.

The two ERANGE cases treat endptr differently: it's either set to point
just behind the out-of-range number, or to point to the beginning of the
out-of-range number.  Ugh.

> +                                                        Callers may rely
> + * on *value and *endptr to be always set by the function, even in case of
> + * errors.

You neglect to specify what gets assigned to *value in error cases.

Your list of error checks isn't quite complete.  Here's my try:

/**
 * Parse unsigned integer
 *
 * @s: String to parse
 * @value: Destination for parsed integer value
 * @endptr: Destination for pointer to first character not consumed
 * @base: integer base, between 2 and 36 inclusive, or 0
 *
 * Parsed syntax is just like strol()'s: arbitrary whitespace, a single
 * optional '+' or '-', an optional "0x" if @base is 0 or 16, one or
 * more digits.
 *
 * If @s is null, or @base is invalid, or @s doesn't start with an
 * integer in the syntax above, set *@value to 0, *@endptr to @s, and
 * return -EINVAL.
 *
 * Set @endptr to point right beyond the parsed integer.
 *
 * If the integer is negative, set *@value to 0, and return -ERANGE.
 * If the integer overflows unsigned long long, set *@value to
 * ULLONG_MAX, and return -ERANGE.
 * Else, set *@value to the parsed integer, and return 0.
 */

> + *
> + * The 'base' parameter has the same meaning of 'base' on strtoull().
> + *
> + * Returns 0 on success, negative errno value on error.
> + */
> +int parse_uint(const char *s, unsigned long long *value, char **endptr,
> +               int base)
> +{
> +    int r = 0;
> +    char *endp = (char *)s;
> +    unsigned long long val = 0;
> +
> +    if (!s) {
> +        r = -EINVAL;
> +        goto out;
> +    }
> +
> +    /* make sure we reject negative numbers: */
> +    while (isspace((unsigned char)*s)) {
> +        ++s; endp++;

Mixing pre- and post-increment like that is ugly.  Recommend to stick to
post-increment.

I'd set endp after the loop.  Right before goto.

Or simply change the specification to set *endptr to point beyond the
integer instead of to the '-'.  I took the liberty to do exactly that in
my proposed function comment.

> +    }
> +    if (*s == '-') {
> +        r = -ERANGE;
> +        goto out;
> +    }

This creates the special case that made me go "ugh" above.  Suggest to
drop the if here, and check right after strtoull() instead, as shown
below.  That way, we get the same endptr behavior for all out-of-range
numbers.

> +
> +    errno = 0;
> +    val = strtoull(s, &endp, base);

       if (*s = '-') {
           r = -ERANGE;
           val = 0;
           goto out;
       }

> +    if (errno) {
> +        r = -errno;
> +        goto out;
> +    }
> +
> +    if (endp == s) {
> +        r = -EINVAL;
> +        goto out;
> +    }
> +
> +out:
> +    *value = val;
> +    *endptr = endp;
> +    return r;
> +}
> +
> +/* Try to parse an unsigned integer, making sure the whole string is parsed
> + *
> + * Have the same behavior of parse_uint(), but with an additional check
> + * for additional data after the parsed number (in that case, the function
> + * will return -EINVAL).

What's assigned to *value then?

> + */

Alternatively, you could make into parse_uint() do that check when
endptr is null, and drop this function.  Matter of taste.

> +int parse_uint_full(const char *s, unsigned long long *value, int base)
> +{
> +    char *endp;
> +    int r;
> +
> +    r = parse_uint(s, value, &endp, base);
> +    if (r < 0) {
> +        return r;
> +    }
> +    if (*endp) {

Answering my own question: the parsed integer is assigned to *value then.

> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
>  int qemu_parse_fd(const char *param)
>  {
>      int fd;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-18 10:01   ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
@ 2013-01-18 13:26     ` Eduardo Habkost
  2013-01-18 13:32       ` Andreas Färber
                         ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-18 13:26 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Chegu Vinod, Laszlo Ersek, qemu-devel, Anthony Liguori


On Fri, Jan 18, 2013 at 11:01:14AM +0100, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > There are lots of duplicate parsing code using strto*() in QEMU, and
> > most of that code is broken in one way or another. Even the visitors
> > code have duplicate integer parsing code[1]. This introduces functions
> > to help parsing unsigned int values: parse_uint() and parse_uint_full().
> >
> > Parsing functions for signed ints and floats will be submitted later.
> >
> > parse_uint_full() has all the checks made by opts_type_uint64() at
> > opts-visitor.c:
> >
> >  - Check for NULL (returns -EINVAL)
> >  - Check for negative numbers (returns -ERANGE)
> >  - Check for empty string (returns -EINVAL)
> >  - Check for overflow or other errno values set by strtoll() (returns
> >    -errno)
> >  - Check for end of string (reject invalid characters after number)
> >    (returns -EINVAL)
> >
> > parse_uint() does everything above except checking for the end of the
> > string, so callers can continue parsing the remainder of string after
> > the number.
> >
> > Unit tests included.
> >
> > [1] string-input-visitor.c:parse_int() could use the same parsing code
> >     used by opts-visitor.c:opts_type_int(), instead of duplicating that
> >     logic.
> [...]
> > diff --git a/util/cutils.c b/util/cutils.c
> > index 80bb1dc..7f09251 100644
> > --- a/util/cutils.c
> > +++ b/util/cutils.c
> > @@ -270,6 +270,85 @@ int64_t strtosz(const char *nptr, char **end)
> >      return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
> >  }
> >  
> > +/* Try to parse an unsigned integer
> > + *
> > + * Error checks done by the function:
> > + * - NULL pointer will return -EINVAL.
> > + * - Empty strings will return -EINVAL.
> 
> Same for strings containing only whitespace.

Oh, you're right.

> 
> > + * - Overflow errors or other errno values  set by strtoull() will
> 
> Extra space before 'set'.

Oops.

> 
> > + *   return -errno (-ERANGE in case of overflow).
> > + * - Differently from strtoull(), values starting with a minus sign are
> > + *   rejected (returning -ERANGE).
> > + *
> > + * Sets endptr to point to the first invalid character.
> 
> Mention that unlike strtol() & friends, endptr must not be null?
> Probably not necessary.

I believe it is implicit if I document it as "Sets *endptr". But worth
documenting as it is different from strtoull() behavior.

> 
> The two ERANGE cases treat endptr differently: it's either set to point
> just behind the out-of-range number, or to point to the beginning of the
> out-of-range number.  Ugh.

This should be fixed in the newer version I sent.

> 
> > +                                                        Callers may rely
> > + * on *value and *endptr to be always set by the function, even in case of
> > + * errors.
> 
> You neglect to specify what gets assigned to *value in error cases.
> 

Undefined.  :-)

Anyway, I agree it not very useful to document that "*value and *endptr
will be always set" if it is undefined. I will try to reword it.


> Your list of error checks isn't quite complete.  Here's my try:
> 
> /**
>  * Parse unsigned integer
>  *
>  * @s: String to parse
>  * @value: Destination for parsed integer value
>  * @endptr: Destination for pointer to first character not consumed
>  * @base: integer base, between 2 and 36 inclusive, or 0
>  *
>  * Parsed syntax is just like strol()'s: arbitrary whitespace, a single
>  * optional '+' or '-', an optional "0x" if @base is 0 or 16, one or
>  * more digits.

The newer version has '-' as _not_ part of the syntax.

>  *
>  * If @s is null, or @base is invalid, or @s doesn't start with an
>  * integer in the syntax above, set *@value to 0, *@endptr to @s, and
>  * return -EINVAL.

This matches the behavior of the newer version. But I would like to keep
*value documented as undefined in case of errors.

>  *
>  * Set @endptr to point right beyond the parsed integer.
>  *
>  * If the integer is negative, set *@value to 0, and return -ERANGE.

This isn't the case in the newer version.

>  * If the integer overflows unsigned long long, set *@value to
>  * ULLONG_MAX, and return -ERANGE.
>  * Else, set *@value to the parsed integer, and return 0.
>  */

Thanks for taking the time to write this. I hate having to look up
what's the right syntax to use in docstrings, so I often just write
plain comments before functions.  :-)

I have a minor problem with the documentation above: as a developer, the
most important question I have is: what's the difference between using
parse_uint() and using strtoull() directly? But maybe it is a good
thing: instead of referencing the strtoull() specification (and an
implementation detail), now we have a well-defined and well-documented
behavior.

> 
> > + *
> > + * The 'base' parameter has the same meaning of 'base' on strtoull().
> > + *
> > + * Returns 0 on success, negative errno value on error.
> > + */
> > +int parse_uint(const char *s, unsigned long long *value, char **endptr,
> > +               int base)
> > +{
> > +    int r = 0;
> > +    char *endp = (char *)s;
> > +    unsigned long long val = 0;
> > +
> > +    if (!s) {
> > +        r = -EINVAL;
> > +        goto out;
> > +    }
> > +
> > +    /* make sure we reject negative numbers: */
> > +    while (isspace((unsigned char)*s)) {
> > +        ++s; endp++;
> 
> Mixing pre- and post-increment like that is ugly.  Recommend to stick to
> post-increment.

Agreed. I blame the copy & paste I made from opts-visitor. Later I added
the postfix increment without noticing how ugly it looked like

Anyway, this was changed in the newer patch version.

> 
> I'd set endp after the loop.  Right before goto.

Fixed in the newer version.

> 
> Or simply change the specification to set *endptr to point beyond the
> integer instead of to the '-'.  I took the liberty to do exactly that in
> my proposed function comment.

The newer version was changed to return -EINVAL and set endptr to the
beginning of the string.

> 
> > +    }
> > +    if (*s == '-') {
> > +        r = -ERANGE;
> > +        goto out;
> > +    }
> 
> This creates the special case that made me go "ugh" above.  Suggest to
> drop the if here, and check right after strtoull() instead, as shown
> below.  That way, we get the same endptr behavior for all out-of-range
> numbers.

Is returning -EINVAL acceptable for you, as well? In your proposal we
consider "-1234" part of the language but out-of-range. Returning
-EINVAL, we consider "-1234" not part of the language, and invalid
input. The newer version returns -EINVAL.

> 
> > +
> > +    errno = 0;
> > +    val = strtoull(s, &endp, base);
> 
>        if (*s = '-') {
>            r = -ERANGE;
>            val = 0;
>            goto out;
>        }

This has another bug: makes endptr point to the middle of the string in
case we are parsing "   xxx" or "   -1234".

> 
> > +    if (errno) {
> > +        r = -errno;
> > +        goto out;
> > +    }
> > +
> > +    if (endp == s) {
> > +        r = -EINVAL;
> > +        goto out;
> > +    }
> > +
> > +out:
> > +    *value = val;
> > +    *endptr = endp;
> > +    return r;
> > +}
> > +
> > +/* Try to parse an unsigned integer, making sure the whole string is parsed
> > + *
> > + * Have the same behavior of parse_uint(), but with an additional check
> > + * for additional data after the parsed number (in that case, the function
> > + * will return -EINVAL).
> 
> What's assigned to *value then?

Undefined.  :-)


> 
> > + */
> 
> Alternatively, you could make into parse_uint() do that check when
> endptr is null, and drop this function.  Matter of taste.

I considered doing that, but I believe it would be too subtle.

> 
> > +int parse_uint_full(const char *s, unsigned long long *value, int base)
> > +{
> > +    char *endp;
> > +    int r;
> > +
> > +    r = parse_uint(s, value, &endp, base);
> > +    if (r < 0) {
> > +        return r;
> > +    }
> > +    if (*endp) {
> 
> Answering my own question: the parsed integer is assigned to *value then.

I prefer to document it as undefined. If you want to use the parsed
integer even if there's additional data after it, you should use
parse_int() instead.

> 
> > +        return -EINVAL;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >  int qemu_parse_fd(const char *param)
> >  {
> >      int fd;

-- 
Eduardo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-18 13:26     ` Eduardo Habkost
@ 2013-01-18 13:32       ` Andreas Färber
  2013-01-18 17:57       ` [Qemu-devel] [PATCH 1/8 v5] " Eduardo Habkost
  2013-01-18 18:06       ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
  2 siblings, 0 replies; 31+ messages in thread
From: Andreas Färber @ 2013-01-18 13:32 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Laszlo Ersek, Chegu Vinod, Markus Armbruster, Anthony Liguori,
	qemu-devel

Am 18.01.2013 14:26, schrieb Eduardo Habkost:
> 
> On Fri, Jan 18, 2013 at 11:01:14AM +0100, Markus Armbruster wrote:
>> Your list of error checks isn't quite complete.  Here's my try:

Even better would be:

>> /**

 * parse_uint:

>>  * @s: String to parse
>>  * @value: Destination for parsed integer value
>>  * @endptr: Destination for pointer to first character not consumed
>>  * @base: integer base, between 2 and 36 inclusive, or 0
>>  *

 * Parse unsigned integer

>>  * Parsed syntax is just like strol()'s: arbitrary whitespace, a single
>>  * optional '+' or '-', an optional "0x" if @base is 0 or 16, one or
>>  * more digits.
[snip]

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 1/8 v5] cutils: unsigned int parsing functions
  2013-01-18 13:26     ` Eduardo Habkost
  2013-01-18 13:32       ` Andreas Färber
@ 2013-01-18 17:57       ` Eduardo Habkost
  2013-01-18 18:11         ` Eric Blake
  2013-01-18 18:06       ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
  2 siblings, 1 reply; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-18 17:57 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster; +Cc: Laszlo Ersek, Chegu Vinod, Anthony Liguori

There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().

Parsing functions for signed ints and floats will be submitted later.

parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:

 - Check for NULL (returns -EINVAL)
 - Check for negative numbers (returns -EINVAL)
 - Check for empty string (returns -EINVAL)
 - Check for overflow or other errno values set by strtoll() (returns
   -errno)
 - Check for end of string (reject invalid characters after number)
   (returns -EINVAL)

parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.

Unit tests included.

[1] string-input-visitor.c:parse_int() could use the same parsing code
    used by opts-visitor.c:opts_type_int(), instead of duplicating that
    logic.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>

Changes v2:
 - Trivial whitespace change
 - Add 'base' parameter to the functions

Changes v4:
 - Return -EINVAL in case a minus sign is found
 - Make endptr point to beginning of string in case -EINVAL
   is returned (like the strtoull() behavior)

Changes v5:
 - Updated function documentation to be very specific about
   the syntax and every error case
   Suggested-by: Markus Armbruster <armbru@redhat.com>
 - Added additional test case for whitespace-only string
 - parse_uint_full() will set *value to 0 if returning -EINVAL,
   so callers won't rely on the parsed value being returned.

v4 -> v5 interdiff:

    diff -u b/tests/test-cutils.c b/tests/test-cutils.c
    --- b/tests/test-cutils.c
    +++ b/tests/test-cutils.c
    @@ -61,6 +61,22 @@
         g_assert(endptr == str);
     }
     
    +static void test_parse_uint_whitespace(void)
    +{
    +    unsigned long long i = 999;
    +    char f = 'X';
    +    char *endptr = &f;
    +    const char *str = "   \t   ";
    +    int r;
    +
    +    r = parse_uint(str, &i, &endptr, 0);
    +
    +    g_assert_cmpint(r, ==, -EINVAL);
    +    g_assert_cmpint(i, ==, 0);
    +    g_assert(endptr == str);
    +}
    +
    +
     static void test_parse_uint_invalid(void)
     {
         unsigned long long i = 999;
    @@ -195,7 +211,7 @@
         r = parse_uint_full(str, &i, 0);
     
         g_assert_cmpint(r, ==, -EINVAL);
    -    g_assert_cmpint(i, ==, 123);
    +    g_assert_cmpint(i, ==, 0);
     }
     
     static void test_parse_uint_full_correct(void)
    @@ -216,6 +232,8 @@
     
         g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
         g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
    +    g_test_add_func("/cutils/parse_uint/whitespace",
    +                    test_parse_uint_whitespace);
         g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
         g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
         g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
    diff -u b/util/cutils.c b/util/cutils.c
    --- b/util/cutils.c
    +++ b/util/cutils.c
    @@ -270,23 +270,30 @@
         return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
     }
     
    -/* Try to parse an unsigned integer
    +/**
    + * parse_uint:
      *
    - * Error checks done by the function:
    - * - NULL pointer will return -EINVAL.
    - * - Empty strings will return -EINVAL.
    - * - Overflow errors or other errno values  set by strtoull() will
    - *   return -errno (-ERANGE in case of overflow).
    - * - Differently from strtoull(), values starting with a minus sign are
    - *   rejected (returning -EINVAL).
    - *
    - * Sets endptr to point to the first invalid character. Callers may rely
    - * on *value and *endptr to be always set by the function, even in case of
    - * errors.
    + * @s: String to parse
    + * @value: Destination for parsed integer value
    + * @endptr: Destination for pointer to first character not consumed
    + * @base: integer base, between 2 and 36 inclusive, or 0
      *
    - * The 'base' parameter has the same meaning of 'base' on strtoull().
    + * Parse unsigned integer
      *
    - * Returns 0 on success, negative errno value on error.
    + * Parsed syntax is: arbitrary whitespace, a single optional '+', an optional
    + * "0x"if @base is 0 or 16, one or more digits. It's similar to strtoull()'s
    + * syntax, except that the minus sign ('-') is rejected, so negative numbers
    + * won't be considered valid.
    + *
    + * If @s is null, or @base is invalid, or @s doesn't start with an
    + * integer in the syntax above, set *@value to 0, *@endptr to @s, and
    + * return -EINVAL.
    + *
    + * Set @endptr to point right beyond the parsed integer.
    + *
    + * If the integer overflows unsigned long long, set *@value to
    + * ULLONG_MAX, and return -ERANGE.
    + * Else, set *@value to the parsed integer, and return 0.
      */
     int parse_uint(const char *s, unsigned long long *value, char **endptr,
                    int base)
    @@ -329,11 +336,19 @@
         return r;
     }
     
    -/* Try to parse an unsigned integer, making sure the whole string is parsed
    +/**
    + * parse_uint_full:
    + *
    + * @s: String to parse
    + * @value: Destination for parsed integer value
    + * @base: integer base, between 2 and 36 inclusive, or 0
    + *
    + * Parse unsigned integer from entire string
      *
      * Have the same behavior of parse_uint(), but with an additional check
    - * for additional data after the parsed number (in that case, the function
    - * will return -EINVAL).
    + * for additional data after the parsed number. If extra characters are present
    + * after the parsed number, the function will return -EINVAL, and the caller
    + * should not rely on the value set on *@value.
      */
     int parse_uint_full(const char *s, unsigned long long *value, int base)
     {
    @@ -345,6 +360,7 @@
             return r;
         }
         if (*endp) {
    +        *value = 0;
             return -EINVAL;
         }
     

---
 include/qemu-common.h |   4 +
 tests/Makefile        |   3 +
 tests/test-cutils.c   | 251 ++++++++++++++++++++++++++++++++++++++++++++++++++
 util/cutils.c         |  97 +++++++++++++++++++
 4 files changed, 355 insertions(+)
 create mode 100644 tests/test-cutils.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index ca464bb..f134629 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -170,6 +170,10 @@ int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
 
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base);
+int parse_uint_full(const char *s, unsigned long long *value, int base);
+
 /*
  * strtosz() suffixes used to specify the default treatment of an
  * argument passed to strtosz() without an explicit suffix.
diff --git a/tests/Makefile b/tests/Makefile
index d86e95a..e5929cd 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
 check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
+check-unit-y += tests/test-cutils$(EXESUF)
+gcov-files-test-cutils-y += util/cutils.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -86,6 +88,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
+tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-cutils.c b/tests/test-cutils.c
new file mode 100644
index 0000000..7f94828
--- /dev/null
+++ b/tests/test-cutils.c
@@ -0,0 +1,251 @@
+/*
+ * cutils.c unit-tests
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Eduardo Habkost <ehabkost@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+
+
+static void test_parse_uint_null(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    int r;
+
+    r = parse_uint(NULL, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == NULL);
+}
+
+static void test_parse_uint_empty(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+static void test_parse_uint_whitespace(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "   \t   ";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_invalid(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_trailing(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + 3);
+}
+
+static void test_parse_uint_correct(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_octal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 0123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_decimal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 10);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+
+static void test_parse_uint_llong_max(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    char *str = g_strdup_printf("%llu", (unsigned long long)LLONG_MAX + 1);
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, (unsigned long long)LLONG_MAX + 1);
+    g_assert(endptr == str + strlen(str));
+
+    g_free(str);
+}
+
+static void test_parse_uint_overflow(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "99999999999999999999999999999999999999";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, ULLONG_MAX);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_negative(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t -321";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_full_trailing(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+}
+
+static void test_parse_uint_full_correct(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
+    g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
+    g_test_add_func("/cutils/parse_uint/whitespace",
+                    test_parse_uint_whitespace);
+    g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
+    g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
+    g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
+    g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
+    g_test_add_func("/cutils/parse_uint/decimal", test_parse_uint_decimal);
+    g_test_add_func("/cutils/parse_uint/llong_max", test_parse_uint_llong_max);
+    g_test_add_func("/cutils/parse_uint/overflow", test_parse_uint_overflow);
+    g_test_add_func("/cutils/parse_uint/negative", test_parse_uint_negative);
+    g_test_add_func("/cutils/parse_uint_full/trailing",
+                    test_parse_uint_full_trailing);
+    g_test_add_func("/cutils/parse_uint_full/correct",
+                    test_parse_uint_full_correct);
+
+    return g_test_run();
+}
diff --git a/util/cutils.c b/util/cutils.c
index 80bb1dc..80da9df 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -270,6 +270,103 @@ int64_t strtosz(const char *nptr, char **end)
     return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
 }
 
+/**
+ * parse_uint:
+ *
+ * @s: String to parse
+ * @value: Destination for parsed integer value
+ * @endptr: Destination for pointer to first character not consumed
+ * @base: integer base, between 2 and 36 inclusive, or 0
+ *
+ * Parse unsigned integer
+ *
+ * Parsed syntax is: arbitrary whitespace, a single optional '+', an optional
+ * "0x"if @base is 0 or 16, one or more digits. It's similar to strtoull()'s
+ * syntax, except that the minus sign ('-') is rejected, so negative numbers
+ * won't be considered valid.
+ *
+ * If @s is null, or @base is invalid, or @s doesn't start with an
+ * integer in the syntax above, set *@value to 0, *@endptr to @s, and
+ * return -EINVAL.
+ *
+ * Set @endptr to point right beyond the parsed integer.
+ *
+ * If the integer overflows unsigned long long, set *@value to
+ * ULLONG_MAX, and return -ERANGE.
+ * Else, set *@value to the parsed integer, and return 0.
+ */
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base)
+{
+    int r = 0;
+    char *endp = (char *)s;
+    unsigned long long val = 0;
+    const char *sp;
+
+    if (!s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+    /* make sure we reject negative numbers: */
+    sp = s;
+    while (isspace((unsigned char)*sp)) {
+        ++sp;
+    }
+    if (*sp == '-') {
+        r = -EINVAL;
+        goto out;
+    }
+
+    errno = 0;
+    val = strtoull(s, &endp, base);
+    if (errno) {
+        r = -errno;
+        goto out;
+    }
+
+    if (endp == s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+out:
+    *value = val;
+    *endptr = endp;
+    return r;
+}
+
+/**
+ * parse_uint_full:
+ *
+ * @s: String to parse
+ * @value: Destination for parsed integer value
+ * @base: integer base, between 2 and 36 inclusive, or 0
+ *
+ * Parse unsigned integer from entire string
+ *
+ * Have the same behavior of parse_uint(), but with an additional check
+ * for additional data after the parsed number. If extra characters are present
+ * after the parsed number, the function will return -EINVAL, and the caller
+ * should not rely on the value set on *@value.
+ */
+int parse_uint_full(const char *s, unsigned long long *value, int base)
+{
+    char *endp;
+    int r;
+
+    r = parse_uint(s, value, &endp, base);
+    if (r < 0) {
+        return r;
+    }
+    if (*endp) {
+        *value = 0;
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 int qemu_parse_fd(const char *param)
 {
     int fd;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-18 13:26     ` Eduardo Habkost
  2013-01-18 13:32       ` Andreas Färber
  2013-01-18 17:57       ` [Qemu-devel] [PATCH 1/8 v5] " Eduardo Habkost
@ 2013-01-18 18:06       ` Markus Armbruster
  2 siblings, 0 replies; 31+ messages in thread
From: Markus Armbruster @ 2013-01-18 18:06 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Chegu Vinod, Laszlo Ersek, qemu-devel, Anthony Liguori

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Fri, Jan 18, 2013 at 11:01:14AM +0100, Markus Armbruster wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>> 
>> > There are lots of duplicate parsing code using strto*() in QEMU, and
>> > most of that code is broken in one way or another. Even the visitors
>> > code have duplicate integer parsing code[1]. This introduces functions
>> > to help parsing unsigned int values: parse_uint() and parse_uint_full().
>> >
>> > Parsing functions for signed ints and floats will be submitted later.
>> >
>> > parse_uint_full() has all the checks made by opts_type_uint64() at
>> > opts-visitor.c:
>> >
>> >  - Check for NULL (returns -EINVAL)
>> >  - Check for negative numbers (returns -ERANGE)
>> >  - Check for empty string (returns -EINVAL)
>> >  - Check for overflow or other errno values set by strtoll() (returns
>> >    -errno)
>> >  - Check for end of string (reject invalid characters after number)
>> >    (returns -EINVAL)
>> >
>> > parse_uint() does everything above except checking for the end of the
>> > string, so callers can continue parsing the remainder of string after
>> > the number.
>> >
>> > Unit tests included.
>> >
>> > [1] string-input-visitor.c:parse_int() could use the same parsing code
>> >     used by opts-visitor.c:opts_type_int(), instead of duplicating that
>> >     logic.
>> [...]
>> > diff --git a/util/cutils.c b/util/cutils.c
>> > index 80bb1dc..7f09251 100644
>> > --- a/util/cutils.c
>> > +++ b/util/cutils.c
>> > @@ -270,6 +270,85 @@ int64_t strtosz(const char *nptr, char **end)
>> >      return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
>> >  }
>> >  
>> > +/* Try to parse an unsigned integer
>> > + *
>> > + * Error checks done by the function:
>> > + * - NULL pointer will return -EINVAL.
>> > + * - Empty strings will return -EINVAL.
>> 
>> Same for strings containing only whitespace.
>
> Oh, you're right.
>
>> 
>> > + * - Overflow errors or other errno values  set by strtoull() will
>> 
>> Extra space before 'set'.
>
> Oops.
>
>> 
>> > + *   return -errno (-ERANGE in case of overflow).
>> > + * - Differently from strtoull(), values starting with a minus sign are
>> > + *   rejected (returning -ERANGE).
>> > + *
>> > + * Sets endptr to point to the first invalid character.
>> 
>> Mention that unlike strtol() & friends, endptr must not be null?
>> Probably not necessary.
>
> I believe it is implicit if I document it as "Sets *endptr". But worth
> documenting as it is different from strtoull() behavior.
>
>> 
>> The two ERANGE cases treat endptr differently: it's either set to point
>> just behind the out-of-range number, or to point to the beginning of the
>> out-of-range number.  Ugh.
>
> This should be fixed in the newer version I sent.
>
>> 
>> > +                                                        Callers may rely
>> > + * on *value and *endptr to be always set by the function, even in case of
>> > + * errors.
>> 
>> You neglect to specify what gets assigned to *value in error cases.
>> 
>
> Undefined.  :-)
>
> Anyway, I agree it not very useful to document that "*value and *endptr
> will be always set" if it is undefined. I will try to reword it.

Yes, please.

I'm not fond of undefined behavior, unless there's a good reason.

If you don't want to assign a defined value, consider not assigning
anything.

>> Your list of error checks isn't quite complete.  Here's my try:
>> 
>> /**
>>  * Parse unsigned integer
>>  *
>>  * @s: String to parse
>>  * @value: Destination for parsed integer value
>>  * @endptr: Destination for pointer to first character not consumed
>>  * @base: integer base, between 2 and 36 inclusive, or 0
>>  *
>>  * Parsed syntax is just like strol()'s: arbitrary whitespace, a single
>>  * optional '+' or '-', an optional "0x" if @base is 0 or 16, one or
>>  * more digits.
>
> The newer version has '-' as _not_ part of the syntax.
>
>>  *
>>  * If @s is null, or @base is invalid, or @s doesn't start with an
>>  * integer in the syntax above, set *@value to 0, *@endptr to @s, and
>>  * return -EINVAL.
>
> This matches the behavior of the newer version. But I would like to keep
> *value documented as undefined in case of errors.
>
>>  *
>>  * Set @endptr to point right beyond the parsed integer.
>>  *
>>  * If the integer is negative, set *@value to 0, and return -ERANGE.
>
> This isn't the case in the newer version.
>
>>  * If the integer overflows unsigned long long, set *@value to
>>  * ULLONG_MAX, and return -ERANGE.
>>  * Else, set *@value to the parsed integer, and return 0.
>>  */
>
> Thanks for taking the time to write this. I hate having to look up
> what's the right syntax to use in docstrings, so I often just write
> plain comments before functions.  :-)

I don't particularly like GLib-style doc-strings, but it's what we use
when we use anything, which is rarely.

> I have a minor problem with the documentation above: as a developer, the
> most important question I have is: what's the difference between using
> parse_uint() and using strtoull() directly? But maybe it is a good
> thing: instead of referencing the strtoull() specification (and an
> implementation detail), now we have a well-defined and well-documented
> behavior.

I understand why you're asking for the difference to stroull().  Trouble
is strtoull() is complex, and often misunderstood.

>> > + *
>> > + * The 'base' parameter has the same meaning of 'base' on strtoull().
>> > + *
>> > + * Returns 0 on success, negative errno value on error.
>> > + */
>> > +int parse_uint(const char *s, unsigned long long *value, char **endptr,
>> > +               int base)
>> > +{
>> > +    int r = 0;
>> > +    char *endp = (char *)s;
>> > +    unsigned long long val = 0;
>> > +
>> > +    if (!s) {
>> > +        r = -EINVAL;
>> > +        goto out;
>> > +    }
>> > +
>> > +    /* make sure we reject negative numbers: */
>> > +    while (isspace((unsigned char)*s)) {
>> > +        ++s; endp++;
>> 
>> Mixing pre- and post-increment like that is ugly.  Recommend to stick to
>> post-increment.
>
> Agreed. I blame the copy & paste I made from opts-visitor. Later I added
> the postfix increment without noticing how ugly it looked like
>
> Anyway, this was changed in the newer patch version.
>
>> 
>> I'd set endp after the loop.  Right before goto.
>
> Fixed in the newer version.
>
>> 
>> Or simply change the specification to set *endptr to point beyond the
>> integer instead of to the '-'.  I took the liberty to do exactly that in
>> my proposed function comment.
>
> The newer version was changed to return -EINVAL and set endptr to the
> beginning of the string.
>
>> 
>> > +    }
>> > +    if (*s == '-') {
>> > +        r = -ERANGE;
>> > +        goto out;
>> > +    }
>> 
>> This creates the special case that made me go "ugh" above.  Suggest to
>> drop the if here, and check right after strtoull() instead, as shown
>> below.  That way, we get the same endptr behavior for all out-of-range
>> numbers.
>
> Is returning -EINVAL acceptable for you, as well? In your proposal we
> consider "-1234" part of the language but out-of-range. Returning
> -EINVAL, we consider "-1234" not part of the language, and invalid
> input. The newer version returns -EINVAL.

I prefer -ERANGE on underflow, for symmetry with parsing *signed*
integers.  A similar parsing function for signed integers would surely
assign LLONG_MIN and return -ERANGE then.

A secondary, weak argument: caller can more easily distinguish the
failure modes:

* -EINVAL: @s invalid, @base invalid (both programming errors), or @s
  doesn't start with an integer.  I use "integer" in the mathematical
  sense here.

* -ERANGE, *value == 0: integer underflows *value

* -ERANGE, *value == ULLONG_MAX: integer overflows *value.

If we lump underflow into the -EINVAL case, you have to check *endptr ==
'-' to find it.

The argument is weak, because I don't have a caller ready that actually
wants to find it.

>> > +
>> > +    errno = 0;
>> > +    val = strtoull(s, &endp, base);
>> 
>>        if (*s = '-') {
>>            r = -ERANGE;
>>            val = 0;
>>            goto out;
>>        }
>
> This has another bug: makes endptr point to the middle of the string in
> case we are parsing "   xxx" or "   -1234".

It makes endptr point right behind the integer, doesn't it?

When parsing "   xxx", it points to the first 'x' (we skipped the spaces
already).

When parsing "   -1234", it points behind '4'.

>> > +    if (errno) {
>> > +        r = -errno;
>> > +        goto out;
>> > +    }
>> > +
>> > +    if (endp == s) {
>> > +        r = -EINVAL;
>> > +        goto out;
>> > +    }
>> > +
>> > +out:
>> > +    *value = val;
>> > +    *endptr = endp;
>> > +    return r;
>> > +}
>> > +
>> > +/* Try to parse an unsigned integer, making sure the whole string is parsed
>> > + *
>> > + * Have the same behavior of parse_uint(), but with an additional check
>> > + * for additional data after the parsed number (in that case, the function
>> > + * will return -EINVAL).
>> 
>> What's assigned to *value then?
>
> Undefined.  :-)
>
>
>> 
>> > + */
>> 
>> Alternatively, you could make into parse_uint() do that check when
>> endptr is null, and drop this function.  Matter of taste.
>
> I considered doing that, but I believe it would be too subtle.
>
>> 
>> > +int parse_uint_full(const char *s, unsigned long long *value, int base)
>> > +{
>> > +    char *endp;
>> > +    int r;
>> > +
>> > +    r = parse_uint(s, value, &endp, base);
>> > +    if (r < 0) {
>> > +        return r;
>> > +    }
>> > +    if (*endp) {
>> 
>> Answering my own question: the parsed integer is assigned to *value then.
>
> I prefer to document it as undefined. If you want to use the parsed
> integer even if there's additional data after it, you should use
> parse_int() instead.

Maybe.

What I want is a proper function contract.  That includes how *value is
changed.  *Especially* when it's an unspecified change.

>> > +        return -EINVAL;
>> > +    }
>> > +
>> > +    return 0;
>> > +}
>> > +
>> >  int qemu_parse_fd(const char *param)
>> >  {
>> >      int fd;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8 v5] cutils: unsigned int parsing functions
  2013-01-18 17:57       ` [Qemu-devel] [PATCH 1/8 v5] " Eduardo Habkost
@ 2013-01-18 18:11         ` Eric Blake
  2013-01-18 19:41           ` [Qemu-devel] [PATCH 1/8 v6] " Eduardo Habkost
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2013-01-18 18:11 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori,
	Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 3738 bytes --]

On 01/18/2013 10:57 AM, Eduardo Habkost wrote:
> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
> 
> Parsing functions for signed ints and floats will be submitted later.
> 
> parse_uint_full() has all the checks made by opts_type_uint64() at
> opts-visitor.c:
> 
>  - Check for NULL (returns -EINVAL)
>  - Check for negative numbers (returns -EINVAL)
>  - Check for empty string (returns -EINVAL)
>  - Check for overflow or other errno values set by strtoll() (returns
>    -errno)
>  - Check for end of string (reject invalid characters after number)
>    (returns -EINVAL)
> 
> parse_uint() does everything above except checking for the end of the
> string, so callers can continue parsing the remainder of string after
> the number.
> 
> Unit tests included.
> 
> [1] string-input-visitor.c:parse_int() could use the same parsing code
>     used by opts-visitor.c:opts_type_int(), instead of duplicating that
>     logic.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>     + *
>     + * If @s is null, or @base is invalid, or @s doesn't start with an
>     + * integer in the syntax above, set *@value to 0, *@endptr to @s, and
>     + * return -EINVAL.
>     + *
>     + * Set @endptr to point right beyond the parsed integer.
>     + *
>     + * If the integer overflows unsigned long long, set *@value to
>     + * ULLONG_MAX, and return -ERANGE.

Is it worth explicitly mentioning that *@endptr is set past the last
digit parsed in the -ERANGE case?  It's implied that it was set beyond
the parsed integer, but did you stop parsing the moment you detected
overflow (and thus *endptr might still be pointing to a digit), or is it
set beyond all possible digits to the first non-digit?

>     +/**
>     + * parse_uint_full:
>     + *
>     + * @s: String to parse
>     + * @value: Destination for parsed integer value
>     + * @base: integer base, between 2 and 36 inclusive, or 0
>     + *
>     + * Parse unsigned integer from entire string
>       *
>       * Have the same behavior of parse_uint(), but with an additional check
>     - * for additional data after the parsed number (in that case, the function
>     - * will return -EINVAL).
>     + * for additional data after the parsed number. If extra characters are present
>     + * after the parsed number, the function will return -EINVAL, and the caller
>     + * should not rely on the value set on *@value.

This says *value is unreliable;

>       */
>      int parse_uint_full(const char *s, unsigned long long *value, int base)
>      {
>     @@ -345,6 +360,7 @@
>              return r;
>          }
>          if (*endp) {
>     +        *value = 0;
>              return -EINVAL;

while this says it is explicitly 0.  Is this an intentional mismatch,
especially given that parse_uint explicitly documents that *value is
always set to a reliable value even on -EINVAL?


> +    /* make sure we reject negative numbers: */
> +    sp = s;
> +    while (isspace((unsigned char)*sp)) {
> +        ++sp;
> +    }
> +    if (*sp == '-') {
> +        r = -EINVAL;
> +        goto out;
> +    }
> +
> +    errno = 0;
> +    val = strtoull(s, &endp, base);

Is it worth a micro-optimization of calling strtoull(sp,...) instead os
strtoull(s,...), to avoid reparsing all the space that we just skipped?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 1/8 v6] cutils: unsigned int parsing functions
  2013-01-18 18:11         ` Eric Blake
@ 2013-01-18 19:41           ` Eduardo Habkost
  2013-01-18 20:20             ` Eric Blake
  0 siblings, 1 reply; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-18 19:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Chegu Vinod, Markus Armbruster, Anthony Liguori, Laszlo Ersek,
	Andreas Färber

There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().

Parsing functions for signed ints and floats will be submitted later.

parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:

 - Check for NULL (returns -EINVAL)
 - Check for negative numbers (returns -EINVAL)
 - Check for empty string (returns -EINVAL)
 - Check for overflow or other errno values set by strtoll() (returns
   -errno)
 - Check for end of string (reject invalid characters after number)
   (returns -EINVAL)

parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.

Unit tests included.

[1] string-input-visitor.c:parse_int() could use the same parsing code
    used by opts-visitor.c:opts_type_int(), instead of duplicating that
    logic.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Changes v2:
 - Trivial whitespace change
 - Add 'base' parameter to the functions

Changes v4:
 - Return -EINVAL in case a minus sign is found
 - Make endptr point to beginning of string in case -EINVAL
   is returned (like the strtoull() behavior)

Changes v5:
 - Updated function documentation to be very specific about
   the syntax and every error case
   Suggested-by: Markus Armbruster <armbru@redhat.com>
 - Added additional test case for whitespace-only string
 - parse_uint_full() will set *value to 0 if returning -EINVAL,
   so callers won't rely on the parsed value being returned.

Changes v6:
 - Return -ERANGE (and set *endptr beyond all digits) for negative
   numbers
   Suggested-by: Markus Armbruster <armbru@redhat.com>
 - Clarify that function will consume all digits even in case of
   overflow/underflow
 - Doesn't define *v as undefined in case parse_uint_full() finds extra
   characters. Explicitly document it as set to 0.

6 version of a simple integer parsing function. This is getting funny.
:-)

I decided to follow Markus' advice and return -ERANGE in case of
negative numbers. It sounds more intuitive, and will allow us to make
parse_uint() and a future parse_int() function (for signed integers) to
accept exactly the same syntax, with just different ranges of valid
number values. Most callers won't care, anyway.


Interdiff v5->v6:

    diff -u b/tests/test-cutils.c b/tests/test-cutils.c
    --- b/tests/test-cutils.c
    +++ b/tests/test-cutils.c
    @@ -196,9 +196,9 @@
     
         r = parse_uint(str, &i, &endptr, 0);
     
    -    g_assert_cmpint(r, ==, -EINVAL);
    +    g_assert_cmpint(r, ==, -ERANGE);
         g_assert_cmpint(i, ==, 0);
    -    g_assert(endptr == str);
    +    g_assert(endptr == str + strlen(str));
     }
     
     
    diff -u b/util/cutils.c b/util/cutils.c
    --- b/util/cutils.c
    +++ b/util/cutils.c
    @@ -280,19 +280,22 @@
      *
      * Parse unsigned integer
      *
    - * Parsed syntax is: arbitrary whitespace, a single optional '+', an optional
    - * "0x"if @base is 0 or 16, one or more digits. It's similar to strtoull()'s
    - * syntax, except that the minus sign ('-') is rejected, so negative numbers
    - * won't be considered valid.
    + * Parsed syntax is like strtoull()'s: arbitrary whitespace, a single optional
    + * '+' or '-', an optional "0x"if @base is 0 or 16, one or more digits.
      *
      * If @s is null, or @base is invalid, or @s doesn't start with an
      * integer in the syntax above, set *@value to 0, *@endptr to @s, and
      * return -EINVAL.
      *
    - * Set @endptr to point right beyond the parsed integer.
    + * Set *@endptr to point right beyond the parsed integer (even if the integer
    + * overflows or is negative, all digits will be parsed and *@endptr will
    + * point right beyond them).
    + *
    + * If the integer is negative, set *@value to 0, and return -ERANGE.
      *
      * If the integer overflows unsigned long long, set *@value to
      * ULLONG_MAX, and return -ERANGE.
    + *
      * Else, set *@value to the parsed integer, and return 0.
      */
     int parse_uint(const char *s, unsigned long long *value, char **endptr,
    @@ -301,23 +304,12 @@
         int r = 0;
         char *endp = (char *)s;
         unsigned long long val = 0;
    -    const char *sp;
     
         if (!s) {
             r = -EINVAL;
             goto out;
         }
     
    -    /* make sure we reject negative numbers: */
    -    sp = s;
    -    while (isspace((unsigned char)*sp)) {
    -        ++sp;
    -    }
    -    if (*sp == '-') {
    -        r = -EINVAL;
    -        goto out;
    -    }
    -
         errno = 0;
         val = strtoull(s, &endp, base);
         if (errno) {
    @@ -330,6 +322,16 @@
             goto out;
         }
     
    +    /* make sure we reject negative numbers: */
    +    while (isspace((unsigned char)*s)) {
    +        s++;
    +    }
    +    if (*s == '-') {
    +        val = 0;
    +        r = -ERANGE;
    +        goto out;
    +    }
    +
     out:
         *value = val;
         *endptr = endp;
    @@ -347,8 +349,8 @@
      *
      * Have the same behavior of parse_uint(), but with an additional check
      * for additional data after the parsed number. If extra characters are present
    - * after the parsed number, the function will return -EINVAL, and the caller
    - * should not rely on the value set on *@value.
    + * after the parsed number, the function will return -EINVAL, and *@v will
    + * be set to 0.
      */
     int parse_uint_full(const char *s, unsigned long long *value, int base)
     {
---
 include/qemu-common.h |   4 +
 tests/Makefile        |   3 +
 tests/test-cutils.c   | 251 ++++++++++++++++++++++++++++++++++++++++++++++++++
 util/cutils.c         |  99 ++++++++++++++++++++
 4 files changed, 357 insertions(+)
 create mode 100644 tests/test-cutils.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index ca464bb..f134629 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -170,6 +170,10 @@ int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
 
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base);
+int parse_uint_full(const char *s, unsigned long long *value, int base);
+
 /*
  * strtosz() suffixes used to specify the default treatment of an
  * argument passed to strtosz() without an explicit suffix.
diff --git a/tests/Makefile b/tests/Makefile
index d86e95a..e5929cd 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
 check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
+check-unit-y += tests/test-cutils$(EXESUF)
+gcov-files-test-cutils-y += util/cutils.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -86,6 +88,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
+tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-cutils.c b/tests/test-cutils.c
new file mode 100644
index 0000000..2a4556d
--- /dev/null
+++ b/tests/test-cutils.c
@@ -0,0 +1,251 @@
+/*
+ * cutils.c unit-tests
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Eduardo Habkost <ehabkost@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+
+
+static void test_parse_uint_null(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    int r;
+
+    r = parse_uint(NULL, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == NULL);
+}
+
+static void test_parse_uint_empty(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+static void test_parse_uint_whitespace(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "   \t   ";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_invalid(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+
+static void test_parse_uint_trailing(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + 3);
+}
+
+static void test_parse_uint_correct(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_octal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 0123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_decimal(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "0123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 10);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+
+static void test_parse_uint_llong_max(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    char *str = g_strdup_printf("%llu", (unsigned long long)LLONG_MAX + 1);
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, (unsigned long long)LLONG_MAX + 1);
+    g_assert(endptr == str + strlen(str));
+
+    g_free(str);
+}
+
+static void test_parse_uint_overflow(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "99999999999999999999999999999999999999";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, ULLONG_MAX);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_negative(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t -321";
+    int r;
+
+    r = parse_uint(str, &i, &endptr, 0);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str + strlen(str));
+}
+
+
+static void test_parse_uint_full_trailing(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+}
+
+static void test_parse_uint_full_correct(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint_full(str, &i, 0);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
+    g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
+    g_test_add_func("/cutils/parse_uint/whitespace",
+                    test_parse_uint_whitespace);
+    g_test_add_func("/cutils/parse_uint/invalid", test_parse_uint_invalid);
+    g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
+    g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
+    g_test_add_func("/cutils/parse_uint/octal", test_parse_uint_octal);
+    g_test_add_func("/cutils/parse_uint/decimal", test_parse_uint_decimal);
+    g_test_add_func("/cutils/parse_uint/llong_max", test_parse_uint_llong_max);
+    g_test_add_func("/cutils/parse_uint/overflow", test_parse_uint_overflow);
+    g_test_add_func("/cutils/parse_uint/negative", test_parse_uint_negative);
+    g_test_add_func("/cutils/parse_uint_full/trailing",
+                    test_parse_uint_full_trailing);
+    g_test_add_func("/cutils/parse_uint_full/correct",
+                    test_parse_uint_full_correct);
+
+    return g_test_run();
+}
diff --git a/util/cutils.c b/util/cutils.c
index 80bb1dc..8e43fae 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -270,6 +270,105 @@ int64_t strtosz(const char *nptr, char **end)
     return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
 }
 
+/**
+ * parse_uint:
+ *
+ * @s: String to parse
+ * @value: Destination for parsed integer value
+ * @endptr: Destination for pointer to first character not consumed
+ * @base: integer base, between 2 and 36 inclusive, or 0
+ *
+ * Parse unsigned integer
+ *
+ * Parsed syntax is like strtoull()'s: arbitrary whitespace, a single optional
+ * '+' or '-', an optional "0x"if @base is 0 or 16, one or more digits.
+ *
+ * If @s is null, or @base is invalid, or @s doesn't start with an
+ * integer in the syntax above, set *@value to 0, *@endptr to @s, and
+ * return -EINVAL.
+ *
+ * Set *@endptr to point right beyond the parsed integer (even if the integer
+ * overflows or is negative, all digits will be parsed and *@endptr will
+ * point right beyond them).
+ *
+ * If the integer is negative, set *@value to 0, and return -ERANGE.
+ *
+ * If the integer overflows unsigned long long, set *@value to
+ * ULLONG_MAX, and return -ERANGE.
+ *
+ * Else, set *@value to the parsed integer, and return 0.
+ */
+int parse_uint(const char *s, unsigned long long *value, char **endptr,
+               int base)
+{
+    int r = 0;
+    char *endp = (char *)s;
+    unsigned long long val = 0;
+
+    if (!s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+    errno = 0;
+    val = strtoull(s, &endp, base);
+    if (errno) {
+        r = -errno;
+        goto out;
+    }
+
+    if (endp == s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+    /* make sure we reject negative numbers: */
+    while (isspace((unsigned char)*s)) {
+        s++;
+    }
+    if (*s == '-') {
+        val = 0;
+        r = -ERANGE;
+        goto out;
+    }
+
+out:
+    *value = val;
+    *endptr = endp;
+    return r;
+}
+
+/**
+ * parse_uint_full:
+ *
+ * @s: String to parse
+ * @value: Destination for parsed integer value
+ * @base: integer base, between 2 and 36 inclusive, or 0
+ *
+ * Parse unsigned integer from entire string
+ *
+ * Have the same behavior of parse_uint(), but with an additional check
+ * for additional data after the parsed number. If extra characters are present
+ * after the parsed number, the function will return -EINVAL, and *@v will
+ * be set to 0.
+ */
+int parse_uint_full(const char *s, unsigned long long *value, int base)
+{
+    char *endp;
+    int r;
+
+    r = parse_uint(s, value, &endp, base);
+    if (r < 0) {
+        return r;
+    }
+    if (*endp) {
+        *value = 0;
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 int qemu_parse_fd(const char *param)
 {
     int fd;
-- 
1.7.11.7

On Fri, Jan 18, 2013 at 11:11:11AM -0700, Eric Blake wrote:
> On 01/18/2013 10:57 AM, Eduardo Habkost wrote:
> > There are lots of duplicate parsing code using strto*() in QEMU, and
> > most of that code is broken in one way or another. Even the visitors
> > code have duplicate integer parsing code[1]. This introduces functions
> > to help parsing unsigned int values: parse_uint() and parse_uint_full().
> > 
> > Parsing functions for signed ints and floats will be submitted later.
> > 
> > parse_uint_full() has all the checks made by opts_type_uint64() at
> > opts-visitor.c:
> > 
> >  - Check for NULL (returns -EINVAL)
> >  - Check for negative numbers (returns -EINVAL)
> >  - Check for empty string (returns -EINVAL)
> >  - Check for overflow or other errno values set by strtoll() (returns
> >    -errno)
> >  - Check for end of string (reject invalid characters after number)
> >    (returns -EINVAL)
> > 
> > parse_uint() does everything above except checking for the end of the
> > string, so callers can continue parsing the remainder of string after
> > the number.
> > 
> > Unit tests included.
> > 
> > [1] string-input-visitor.c:parse_int() could use the same parsing code
> >     used by opts-visitor.c:opts_type_int(), instead of duplicating that
> >     logic.
> > 
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> >     + *
> >     + * If @s is null, or @base is invalid, or @s doesn't start with an
> >     + * integer in the syntax above, set *@value to 0, *@endptr to @s, and
> >     + * return -EINVAL.
> >     + *
> >     + * Set @endptr to point right beyond the parsed integer.
> >     + *
> >     + * If the integer overflows unsigned long long, set *@value to
> >     + * ULLONG_MAX, and return -ERANGE.
> 
> Is it worth explicitly mentioning that *@endptr is set past the last
> digit parsed in the -ERANGE case?  It's implied that it was set beyond
> the parsed integer, but did you stop parsing the moment you detected
> overflow (and thus *endptr might still be pointing to a digit), or is it
> set beyond all possible digits to the first non-digit?
> 
> >     +/**
> >     + * parse_uint_full:
> >     + *
> >     + * @s: String to parse
> >     + * @value: Destination for parsed integer value
> >     + * @base: integer base, between 2 and 36 inclusive, or 0
> >     + *
> >     + * Parse unsigned integer from entire string
> >       *
> >       * Have the same behavior of parse_uint(), but with an additional check
> >     - * for additional data after the parsed number (in that case, the function
> >     - * will return -EINVAL).
> >     + * for additional data after the parsed number. If extra characters are present
> >     + * after the parsed number, the function will return -EINVAL, and the caller
> >     + * should not rely on the value set on *@value.
> 
> This says *value is unreliable;
> 
> >       */
> >      int parse_uint_full(const char *s, unsigned long long *value, int base)
> >      {
> >     @@ -345,6 +360,7 @@
> >              return r;
> >          }
> >          if (*endp) {
> >     +        *value = 0;
> >              return -EINVAL;
> 
> while this says it is explicitly 0.  Is this an intentional mismatch,
> especially given that parse_uint explicitly documents that *value is
> always set to a reliable value even on -EINVAL?
> 
> 
> > +    /* make sure we reject negative numbers: */
> > +    sp = s;
> > +    while (isspace((unsigned char)*sp)) {
> > +        ++sp;
> > +    }
> > +    if (*sp == '-') {
> > +        r = -EINVAL;
> > +        goto out;
> > +    }
> > +
> > +    errno = 0;
> > +    val = strtoull(s, &endp, base);
> 
> Is it worth a micro-optimization of calling strtoull(sp,...) instead os
> strtoull(s,...), to avoid reparsing all the space that we just skipped?
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 



-- 
Eduardo

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8 v6] cutils: unsigned int parsing functions
  2013-01-18 19:41           ` [Qemu-devel] [PATCH 1/8 v6] " Eduardo Habkost
@ 2013-01-18 20:20             ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2013-01-18 20:20 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Laszlo Ersek, qemu-devel, Markus Armbruster, Anthony Liguori,
	Chegu Vinod, Andreas Färber

[-- Attachment #1: Type: text/plain, Size: 2617 bytes --]

On 01/18/2013 12:41 PM, Eduardo Habkost wrote:
> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
> 

> Changes v6:
>  - Return -ERANGE (and set *endptr beyond all digits) for negative
>    numbers
>    Suggested-by: Markus Armbruster <armbru@redhat.com>
>  - Clarify that function will consume all digits even in case of
>    overflow/underflow
>  - Doesn't define *v as undefined in case parse_uint_full() finds extra
>    characters. Explicitly document it as set to 0.

Looks sane to me, except for the introduction of one typo (please, can
whoever queues up the PULL request just fix that, without needing to see
a v7)?

> 
> 6 version of a simple integer parsing function. This is getting funny.
> :-)

Yeah, but just think how much easier parse_int() is going to be :)

> ---
>  include/qemu-common.h |   4 +
>  tests/Makefile        |   3 +
>  tests/test-cutils.c   | 251 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  util/cutils.c         |  99 ++++++++++++++++++++
>  4 files changed, 357 insertions(+)
>  create mode 100644 tests/test-cutils.c

Reviewed-by: Eric Blake <eblake@redhat.com>

> + *
> + * Parsed syntax is like strtoull()'s: arbitrary whitespace, a single optional
> + * '+' or '-', an optional "0x"if @base is 0 or 16, one or more digits.

s/"0x"if/"0x" if/

> +
> +/**
> + * parse_uint_full:
> + *
> + * @s: String to parse
> + * @value: Destination for parsed integer value
> + * @base: integer base, between 2 and 36 inclusive, or 0
> + *
> + * Parse unsigned integer from entire string
> + *
> + * Have the same behavior of parse_uint(), but with an additional check
> + * for additional data after the parsed number. If extra characters are present
> + * after the parsed number, the function will return -EINVAL, and *@v will
> + * be set to 0.
> + */
> +int parse_uint_full(const char *s, unsigned long long *value, int base)

[Side note - in libvirt, we have just one function instead of two; you
get the parse_uint_full semantics by passing NULL for endptr to the
parse_uint counterpart.  But I'm fine with your approach of mandating
endptr to be a non-NULL pointer to parse_uint, and having a second
function for the case where no end characters are allowed.]

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH for-1.4 0/8] -numa option parsing fixes (v3)
       [not found]   ` <20130131154227.GH6849@otherpad.lan.raisama.net>
@ 2013-02-01 20:45     ` Anthony Liguori
  0 siblings, 0 replies; 31+ messages in thread
From: Anthony Liguori @ 2013-02-01 20:45 UTC (permalink / raw)
  To: Eduardo Habkost, qemu-devel
  Cc: Chegu Vinod, Laszlo Ersek, Andreas Färber

Eduardo Habkost <ehabkost@redhat.com> writes:

> Ping?
>
> If this is not 1.4 material, it's no problem to me. I just wanted to
> know if this is being ignored because it's too late for 1.4, or because
> it was lost in some black hole.

I would apply this but this thread is a PITA to sort through.  You
should do full resubmits of series if you can a patch.

Please resubmit a full v7 series.  It's bug fixes so it can go in after -rc0.

Regards,

Anthony Liguori

>
>
> On Mon, Jan 28, 2013 at 02:55:59PM -0200, Eduardo Habkost wrote:
>> 
>> I would like to get these bug fixes in 1.4. Who is the right maintainer
>> to pull this in? Anthony?
>> 
>> (Note that patch 1/8 has a newer version (v6))
>> 
>> 
>> On Wed, Jan 16, 2013 at 04:28:45PM -0200, Eduardo Habkost wrote:
>> > Changes v2 -> v3:
>> >  - Add 'base' parameter to parse_uint*() (patch 1/8)
>> >  - Keep existing base=10 behavior when parsing "nodeid" and "cpus"
>> >    (patches 6/8, 8/8)
>> >  - Trivial whitespace change on patch 1/8
>> >  - Fix fprintf() format string on patch 5/8
>> > 
>> > This series contains only the most important fixes from the previous "-numa
>> > option parsing fixes & improvements" series I have submitted.
>> > 
>> > I have introduced parse_uint*() helpers that can be reused by other code, later.
>> > I plan to submit parse_int*() (for signed integers) and parse_double*()
>> > functions too, later, and change string-input-visitor.c and opts-visitor.c to
>> > use those common functions instead of duplicating the number parsing code.
>> > 
>> > 
>> > Eduardo Habkost (8):
>> >   cutils: unsigned int parsing functions
>> >   vl.c: Fix off-by-one bug when handling "-numa node" argument
>> >   vl.c: Abort on unknown -numa option type
>> >   vl.c: Check for NUMA node limit inside numa_add()
>> >   vl.c: numa_add(): Validate nodeid before using it
>> >   vl.c: Use parse_uint_full() for NUMA nodeid
>> >   vl.c: Extract -numa "cpus" parsing to separate function
>> >   vl.c: validate -numa "cpus" parameter properly
>> > 
>> >  include/qemu-common.h |   4 +
>> >  tests/Makefile        |   3 +
>> >  tests/test-cutils.c   | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >  util/cutils.c         |  79 ++++++++++++++++++
>> >  vl.c                  |  93 ++++++++++++++++------
>> >  5 files changed, 370 insertions(+), 25 deletions(-)
>> >  create mode 100644 tests/test-cutils.c
>> > 
>> > -- 
>> > 1.7.11.7
>> > 
>> > 
>> 
>> -- 
>> Eduardo
>> 
>
> -- 
> Eduardo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 17:50       ` Eric Blake
@ 2013-01-16 17:56         ` Eduardo Habkost
  0 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 17:56 UTC (permalink / raw)
  To: Eric Blake; +Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori

On Wed, Jan 16, 2013 at 10:50:18AM -0700, Eric Blake wrote:
> On 01/16/2013 10:33 AM, Eduardo Habkost wrote:
> > On Wed, Jan 16, 2013 at 10:10:42AM -0700, Eric Blake wrote:
> >> On 01/16/2013 08:24 AM, Eduardo Habkost wrote:
> >>> There are lots of duplicate parsing code using strto*() in QEMU, and
> >>> most of that code is broken in one way or another. Even the visitors
> >>> code have duplicate integer parsing code[1]. This introduces functions
> >>> to help parsing unsigned int values: parse_uint() and parse_uint_full().
> >>>
> >>
> >> Your test case lacks test of octal or hexadecimal input strings; is that
> >> worth adding?
> > 
> > I believe I trust strtoll() enough to not require tests for those cases.
> 
> But see my comments in 8/8 about whether it makes sense to add a 'base'
> parameter to let the caller choose whether they want to allow octal or
> require strict decimal parsing.

In that case, I will add at least a test cases to check if "010" gives
the expected result depending on the 'base' parameter.

> 
> >>> +        r =  -errno;
> >>
> >> Why two spaces?
> > 
> > Typo. I will send a fixed version (keeping your Reviewed-by, if you
> > don't mind).
> 
> No problem - I wouldn't have left a reviewed-by if I didn't think the
> changes couldn't be trivially fixed.  On the other hand, you may have a
> non-trivial fix in the form of adding a base parameter...

I was planning to keep the reviewed-by line only if fixing the trivial
whitespace issue above. I won't keep it when adding the 'base'
parameter, don't worry. :-)

-- 
Eduardo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 17:33     ` Eduardo Habkost
@ 2013-01-16 17:50       ` Eric Blake
  2013-01-16 17:56         ` Eduardo Habkost
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2013-01-16 17:50 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori

[-- Attachment #1: Type: text/plain, Size: 1288 bytes --]

On 01/16/2013 10:33 AM, Eduardo Habkost wrote:
> On Wed, Jan 16, 2013 at 10:10:42AM -0700, Eric Blake wrote:
>> On 01/16/2013 08:24 AM, Eduardo Habkost wrote:
>>> There are lots of duplicate parsing code using strto*() in QEMU, and
>>> most of that code is broken in one way or another. Even the visitors
>>> code have duplicate integer parsing code[1]. This introduces functions
>>> to help parsing unsigned int values: parse_uint() and parse_uint_full().
>>>
>>
>> Your test case lacks test of octal or hexadecimal input strings; is that
>> worth adding?
> 
> I believe I trust strtoll() enough to not require tests for those cases.

But see my comments in 8/8 about whether it makes sense to add a 'base'
parameter to let the caller choose whether they want to allow octal or
require strict decimal parsing.

>>> +        r =  -errno;
>>
>> Why two spaces?
> 
> Typo. I will send a fixed version (keeping your Reviewed-by, if you
> don't mind).

No problem - I wouldn't have left a reviewed-by if I didn't think the
changes couldn't be trivially fixed.  On the other hand, you may have a
non-trivial fix in the form of adding a base parameter...

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 17:10   ` Eric Blake
@ 2013-01-16 17:33     ` Eduardo Habkost
  2013-01-16 17:50       ` Eric Blake
  0 siblings, 1 reply; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 17:33 UTC (permalink / raw)
  To: Eric Blake; +Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori

On Wed, Jan 16, 2013 at 10:10:42AM -0700, Eric Blake wrote:
> On 01/16/2013 08:24 AM, Eduardo Habkost wrote:
> > There are lots of duplicate parsing code using strto*() in QEMU, and
> > most of that code is broken in one way or another. Even the visitors
> > code have duplicate integer parsing code[1]. This introduces functions
> > to help parsing unsigned int values: parse_uint() and parse_uint_full().
> > 
> > Parsing functions for signed ints and floats will be submitted later.
> > 
> > parse_uint_full() has all the checks made by opts_type_uint64() at
> > opts-visitor.c:
> > 
> >  - Check for NULL (returns -EINVAL)
> >  - Check for negative numbers (returns -ERANGE)
> >  - Check for empty string (returns -EINVAL)
> >  - Check for overflow or other errno values set by strtoll() (returns
> >    -errno)
> >  - Check for end of string (reject invalid characters after number)
> >    (returns -EINVAL)
> > 
> > parse_uint() does everything above except checking for the end of the
> > string, so callers can continue parsing the remainder of string after
> > the number.
> > 
> > Unit tests included.
> > 
> > [1] string-input-visitor.c:parse_int() could use the same parsing code
> >     used by opts-visitor.c:opts_type_int(), instead of duplicating that
> >     logic.
> > 
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Cc: Eric Blake <eblake@redhat.com>
> > ---
> 
> > +++ b/tests/test-cutils.c
> 
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> 
> Interesting that you chose a BSD license instead of GPL, but doesn't
> affect my review.

It's a personal preference, but by coincidence is also the license of
cutils.c.

> 
> Your test case lacks test of octal or hexadecimal input strings; is that
> worth adding?

I believe I trust strtoll() enough to not require tests for those cases.
Just like I didn't add tests involving trailing spaces or tabs, except
on the negative-number test case.


> > +++ b/util/cutils.c
> > @@ -270,6 +270,82 @@ int64_t strtosz(const char *nptr, char **end)
> >      return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
> >  }
> >  
> > +/* Try to parse an unsigned integer
> > + *
> > + * Error checks done by the function:
> > + * - NULL pointer will return -EINVAL.
> > + * - Empty strings will return -EINVAL.
> > + * - Overflow errors or other errno values  set by strtoull() will
> > + *   return -errno (-ERANGE in case of overflow).
> > + * - Differently from strtoull(), values starting with a minus sign are
> > + *   rejected (returning -ERANGE).
> 
> Interesting that you chose to reject negative numbers, even though
> strtoull() is required to accept them.  But you documented it and tested
> for it, so I can live with it.

I find this behavior of strtoull() confusing for users. If an
option/field requires positive numbers, I expect QEMU to reject "-1"
instead of silently converting it to ULLONG_MAX.

> 
> > +    errno = 0;
> > +    val = strtoull(s, &endp, 0);
> > +    if (errno) {
> > +        r =  -errno;
> 
> Why two spaces?

Typo. I will send a fixed version (keeping your Reviewed-by, if you
don't mind).

> 
> Reviewed-by: Eric Blake <eblake@redhat.com>

Thanks!

-- 
Eduardo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 15:24 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
@ 2013-01-16 17:10   ` Eric Blake
  2013-01-16 17:33     ` Eduardo Habkost
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2013-01-16 17:10 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Laszlo Ersek, Chegu Vinod, qemu-devel, Anthony Liguori

[-- Attachment #1: Type: text/plain, Size: 2696 bytes --]

On 01/16/2013 08:24 AM, Eduardo Habkost wrote:
> There are lots of duplicate parsing code using strto*() in QEMU, and
> most of that code is broken in one way or another. Even the visitors
> code have duplicate integer parsing code[1]. This introduces functions
> to help parsing unsigned int values: parse_uint() and parse_uint_full().
> 
> Parsing functions for signed ints and floats will be submitted later.
> 
> parse_uint_full() has all the checks made by opts_type_uint64() at
> opts-visitor.c:
> 
>  - Check for NULL (returns -EINVAL)
>  - Check for negative numbers (returns -ERANGE)
>  - Check for empty string (returns -EINVAL)
>  - Check for overflow or other errno values set by strtoll() (returns
>    -errno)
>  - Check for end of string (reject invalid characters after number)
>    (returns -EINVAL)
> 
> parse_uint() does everything above except checking for the end of the
> string, so callers can continue parsing the remainder of string after
> the number.
> 
> Unit tests included.
> 
> [1] string-input-visitor.c:parse_int() could use the same parsing code
>     used by opts-visitor.c:opts_type_int(), instead of duplicating that
>     logic.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> ---

> +++ b/tests/test-cutils.c

> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy

Interesting that you chose a BSD license instead of GPL, but doesn't
affect my review.

Your test case lacks test of octal or hexadecimal input strings; is that
worth adding?

> +++ b/util/cutils.c
> @@ -270,6 +270,82 @@ int64_t strtosz(const char *nptr, char **end)
>      return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
>  }
>  
> +/* Try to parse an unsigned integer
> + *
> + * Error checks done by the function:
> + * - NULL pointer will return -EINVAL.
> + * - Empty strings will return -EINVAL.
> + * - Overflow errors or other errno values  set by strtoull() will
> + *   return -errno (-ERANGE in case of overflow).
> + * - Differently from strtoull(), values starting with a minus sign are
> + *   rejected (returning -ERANGE).

Interesting that you chose to reject negative numbers, even though
strtoull() is required to accept them.  But you documented it and tested
for it, so I can live with it.

> +    errno = 0;
> +    val = strtoull(s, &endp, 0);
> +    if (errno) {
> +        r =  -errno;

Why two spaces?

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions
  2013-01-16 15:24 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v2) Eduardo Habkost
@ 2013-01-16 15:24 ` Eduardo Habkost
  2013-01-16 17:10   ` Eric Blake
  0 siblings, 1 reply; 31+ messages in thread
From: Eduardo Habkost @ 2013-01-16 15:24 UTC (permalink / raw)
  To: qemu-devel, Anthony Liguori; +Cc: Laszlo Ersek, Chegu Vinod

There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().

Parsing functions for signed ints and floats will be submitted later.

parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:

 - Check for NULL (returns -EINVAL)
 - Check for negative numbers (returns -ERANGE)
 - Check for empty string (returns -EINVAL)
 - Check for overflow or other errno values set by strtoll() (returns
   -errno)
 - Check for end of string (reject invalid characters after number)
   (returns -EINVAL)

parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.

Unit tests included.

[1] string-input-visitor.c:parse_int() could use the same parsing code
    used by opts-visitor.c:opts_type_int(), instead of duplicating that
    logic.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
---
 include/qemu-common.h |   3 +
 tests/Makefile        |   3 +
 tests/test-cutils.c   | 183 ++++++++++++++++++++++++++++++++++++++++++++++++++
 util/cutils.c         |  76 +++++++++++++++++++++
 4 files changed, 265 insertions(+)
 create mode 100644 tests/test-cutils.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index ca464bb..83f89fd 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -170,6 +170,9 @@ int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
 
+int parse_uint(const char *s, unsigned long long *value, char **endptr);
+int parse_uint_full(const char *s, unsigned long long *value);
+
 /*
  * strtosz() suffixes used to specify the default treatment of an
  * argument passed to strtosz() without an explicit suffix.
diff --git a/tests/Makefile b/tests/Makefile
index d97a571..2dba3e5 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
 check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
+check-unit-y += tests/test-cutils$(EXESUF)
+gcov-files-test-cutils-y += util/cutils.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -84,6 +86,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
+tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-cutils.c b/tests/test-cutils.c
new file mode 100644
index 0000000..d573eb1
--- /dev/null
+++ b/tests/test-cutils.c
@@ -0,0 +1,183 @@
+/*
+ * cutils.c unit-tests
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Eduardo Habkost <ehabkost@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+
+
+static void test_parse_uint_null(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    int r;
+
+    r = parse_uint(NULL, &i, &endptr);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == NULL);
+}
+
+static void test_parse_uint_empty(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "";
+    int r;
+
+    r = parse_uint(str, &i, &endptr);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str);
+}
+
+static void test_parse_uint_trailing(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint(str, &i, &endptr);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + 3);
+}
+
+static void test_parse_uint_correct(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint(str, &i, &endptr);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_llong_max(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    char *str = g_strdup_printf("%llu", (unsigned long long)LLONG_MAX + 1);
+    int r;
+
+    r = parse_uint(str, &i, &endptr);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, (unsigned long long)LLONG_MAX + 1);
+    g_assert(endptr == str + strlen(str));
+
+    g_free(str);
+}
+
+static void test_parse_uint_overflow(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = "99999999999999999999999999999999999999";
+    int r;
+
+    r = parse_uint(str, &i, &endptr);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, ULLONG_MAX);
+    g_assert(endptr == str + strlen(str));
+}
+
+static void test_parse_uint_negative(void)
+{
+    unsigned long long i = 999;
+    char f = 'X';
+    char *endptr = &f;
+    const char *str = " \t -321";
+    int r;
+
+    r = parse_uint(str, &i, &endptr);
+
+    g_assert_cmpint(r, ==, -ERANGE);
+    g_assert_cmpint(i, ==, 0);
+    g_assert(endptr == str + 3);
+}
+
+
+static void test_parse_uint_full_trailing(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123xxx";
+    int r;
+
+    r = parse_uint_full(str, &i);
+
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpint(i, ==, 123);
+}
+
+static void test_parse_uint_full_correct(void)
+{
+    unsigned long long i = 999;
+    const char *str = "123";
+    int r;
+
+    r = parse_uint_full(str, &i);
+
+    g_assert_cmpint(r, ==, 0);
+    g_assert_cmpint(i, ==, 123);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/cutils/parse_uint/null", test_parse_uint_null);
+    g_test_add_func("/cutils/parse_uint/empty", test_parse_uint_empty);
+    g_test_add_func("/cutils/parse_uint/trailing", test_parse_uint_trailing);
+    g_test_add_func("/cutils/parse_uint/correct", test_parse_uint_correct);
+    g_test_add_func("/cutils/parse_uint/llong_max", test_parse_uint_llong_max);
+    g_test_add_func("/cutils/parse_uint/overflow", test_parse_uint_overflow);
+    g_test_add_func("/cutils/parse_uint/negative", test_parse_uint_negative);
+    g_test_add_func("/cutils/parse_uint_full/trailing",
+                    test_parse_uint_full_trailing);
+    g_test_add_func("/cutils/parse_uint_full/correct",
+                    test_parse_uint_full_correct);
+
+    return g_test_run();
+}
diff --git a/util/cutils.c b/util/cutils.c
index 80bb1dc..b5c4b3d 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -270,6 +270,82 @@ int64_t strtosz(const char *nptr, char **end)
     return strtosz_suffix(nptr, end, STRTOSZ_DEFSUFFIX_MB);
 }
 
+/* Try to parse an unsigned integer
+ *
+ * Error checks done by the function:
+ * - NULL pointer will return -EINVAL.
+ * - Empty strings will return -EINVAL.
+ * - Overflow errors or other errno values  set by strtoull() will
+ *   return -errno (-ERANGE in case of overflow).
+ * - Differently from strtoull(), values starting with a minus sign are
+ *   rejected (returning -ERANGE).
+ *
+ * Sets endptr to point to the first invalid character. Callers may rely
+ * on *value and *endptr to be always set by the function, even in case of
+ * errors.
+ *
+ * Returns 0 on success, negative errno value on error.
+ */
+int parse_uint(const char *s, unsigned long long *value, char **endptr)
+{
+    int r = 0;
+    char *endp = (char *)s;
+    unsigned long long val = 0;
+
+    if (!s) {
+        r = -EINVAL;
+        goto out;
+    }
+
+    /* make sure we reject negative numbers: */
+    while (isspace((unsigned char)*s)) {
+        ++s; endp++;
+    }
+    if (*s == '-') {
+        r = -ERANGE;
+        goto out;
+    }
+
+    errno = 0;
+    val = strtoull(s, &endp, 0);
+    if (errno) {
+        r =  -errno;
+        goto out;
+    }
+
+    if (endp == s) {
+        r =  -EINVAL;
+        goto out;
+    }
+
+out:
+    *value = val;
+    *endptr = endp;
+    return r;
+}
+
+/* Try to parse an unsigned integer, making sure the whole string is parsed
+ *
+ * Have the same behavior of parse_uint(), but with an additional check
+ * for additional data after the parsed number (in that case, the function
+ * will return -EINVAL).
+ */
+int parse_uint_full(const char *s, unsigned long long *value)
+{
+    char *endp;
+    int r;
+
+    r = parse_uint(s, value, &endp);
+    if (r < 0) {
+        return r;
+    }
+    if (*endp) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 int qemu_parse_fd(const char *param)
 {
     int fd;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2013-02-01 20:57 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-16 18:28 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
2013-01-17 18:29   ` Laszlo Ersek
2013-01-17 18:50     ` Eduardo Habkost
2013-01-17 19:06       ` [Qemu-devel] [PATCH 1/8 v4] " Eduardo Habkost
2013-01-17 19:25         ` Eduardo Habkost
2013-01-17 19:31         ` Laszlo Ersek
2013-01-17 19:55         ` Eric Blake
2013-01-17 21:04         ` Blue Swirl
2013-01-18 10:01   ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
2013-01-18 13:26     ` Eduardo Habkost
2013-01-18 13:32       ` Andreas Färber
2013-01-18 17:57       ` [Qemu-devel] [PATCH 1/8 v5] " Eduardo Habkost
2013-01-18 18:11         ` Eric Blake
2013-01-18 19:41           ` [Qemu-devel] [PATCH 1/8 v6] " Eduardo Habkost
2013-01-18 20:20             ` Eric Blake
2013-01-18 18:06       ` [Qemu-devel] [PATCH 1/8] " Markus Armbruster
2013-01-16 18:28 ` [Qemu-devel] [PATCH 2/8] vl.c: Fix off-by-one bug when handling "-numa node" argument Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 3/8] vl.c: Abort on unknown -numa option type Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 4/8] vl.c: Check for NUMA node limit inside numa_add() Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 5/8] vl.c: numa_add(): Validate nodeid before using it Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 6/8] vl.c: Use parse_uint_full() for NUMA nodeid Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 7/8] vl.c: Extract -numa "cpus" parsing to separate function Eduardo Habkost
2013-01-16 18:28 ` [Qemu-devel] [PATCH 8/8] vl.c: validate -numa "cpus" parameter properly Eduardo Habkost
2013-01-16 20:07 ` [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v3) Eric Blake
     [not found] ` <20130128165559.GA6849@otherpad.lan.raisama.net>
     [not found]   ` <20130131154227.GH6849@otherpad.lan.raisama.net>
2013-02-01 20:45     ` [Qemu-devel] [PATCH for-1.4 " Anthony Liguori
  -- strict thread matches above, loose matches on Subject: below --
2013-01-16 15:24 [Qemu-devel] [PATCH 0/8] -numa option parsing fixes (v2) Eduardo Habkost
2013-01-16 15:24 ` [Qemu-devel] [PATCH 1/8] cutils: unsigned int parsing functions Eduardo Habkost
2013-01-16 17:10   ` Eric Blake
2013-01-16 17:33     ` Eduardo Habkost
2013-01-16 17:50       ` Eric Blake
2013-01-16 17:56         ` Eduardo Habkost

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.