All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Improve do_strtosz precision
@ 2021-02-04 19:07 Eric Blake
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: vsementsov, berrange, qemu-block, tao3.xu, rjones, armbru

Recently, commit 8b1170012b tweaked the maximum size the block layer
will allow, which in turn affects nbdkit's testsuite of edge-case
behaviors, where Rich noted [1] that our use of double meant rounding
errors that cause spurious failures in qemu-io (among other places).
So I decided to fix that.  In the process, I was reminded that we have
attempted this before, most recently in Dec 2019 [2], where Markus
(rightly, IMHO) said that any approach that tries to reimplement
strtoull() or which compares the amount of data consumed between
strtod() and strtoull() adds more complexity than it solves.

So first, our analysis of what we absolutely need:
- Existing clients expect decimal scaling to work (1M is a lot easier
  to type than 1048576)
- Existing clients expect hex to work (my initial attempt disabled it,
  and promptly hung in iotests when things like 0x10000 got parsed as
  zero rather than their intended byte value) (although our existing
  testsuite coverage was only via iotests, rather than unit tests)
- Existing clients expect sane decimal fractions to work (1.5k is
  easier to type than 1536), although our testsuite coverage of this
  was less apparant
- Existing clients are surprised when something that looks like a byte
  value is truncated or rounded due to an intermediate pass through
  double (hence Rich's bug report)

[1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00812.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html

My solution, instead of parsing twice and picking the longest, is to
always parse the integral portion with strtoull(), then special case
on what is left: if it is 'x' or 'X', the user wanted hex (and does
NOT want a fraction, and in fact we can optionally warn about the use
of scaling suffixes in this case); if it is '.', the user wanted the
convenience of a floating point adjustment which we can do via
strtod() on the suffix (and where we can optionally warn about
fractions that are not evenly divisible into bytes).  I also enhanced
the testsuite (our coverage for hex constants was implicit via
iotests, and now explicit in the unit test; and our testsuite did not
flag the fact that we were previously accepting nonsense like '0x1.8k'
for 1536, or '1.5e1M' for 53477376).

We may decide that one or both of patch 2 and 3 goes too far, which is
why I split them out from the main enhancement.

Eric Blake (3):
  utils: Improve qemu_strtosz() to have 64 bits of precision
  utils: Deprecate hex-with-suffix sizes
  utils: Deprecate inexact fractional suffix sizes

 docs/system/deprecated.rst |  8 ++++
 tests/test-cutils.c        | 83 ++++++++++++++++++++++++++++---------
 tests/test-keyval.c        | 28 +++++++++----
 tests/test-qemu-opts.c     | 24 +++++++----
 util/cutils.c              | 85 +++++++++++++++++++++++++++++---------
 tests/qemu-iotests/049.out |  7 +++-
 6 files changed, 178 insertions(+), 57 deletions(-)

-- 
2.30.0



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake
@ 2021-02-04 19:07 ` Eric Blake
  2021-02-04 20:12   ` Eric Blake
                     ` (4 more replies)
  2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake
  2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
  2 siblings, 5 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu, rjones,
	armbru, Max Reitz

We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
the keyval visitor), and it gets annoying that edge-case testing is
impacted by implicit rounding to 53 bits of precision due to parsing
with strtod().  As an example posted by Rich Jones:
 $ nbdkit memory $(( 2**63 - 2**30 )) --run \
   'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
 write failed: Input/output error

because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
out of bounds.

It is also worth noting that our existing parser, by virtue of using
strtod(), accepts decimal AND hex numbers, even though test-cutils
previously lacked any coverage of the latter.  We do have existing
clients that expect a hex parse to work (for example, iotest 33 using
qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
rather than as an invalid octal number, so we know there are no
clients that depend on octal.  Our use of strtod() also means that
"0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
than 1843 (if the fraction were 8/10); but as this was not covered in
the testsuite, I have no qualms forbidding hex fractions as invalid,
so this patch declares that the use of fractions is only supported
with decimal input, and enhances the testsuite to document that.

Our previous use of strtod() meant that -1 parsed as a negative; now
that we parse with strtoull(), negative values can wrap around module
2^64, so we have to explicitly check whether the user passed in a '-'.

We also had no testsuite coverage of "1.1e0k", which happened to parse
under strtod() but is unlikely to occur in practice; as long as we are
making things more robust, it is easy enough to reject the use of
exponents in a strtod parse.

The fix is done by breaking the parse into an integer prefix (no loss
in precision), rejecting negative values (since we can no longer rely
on strtod() to do that), determining if a decimal or hexadecimal parse
was intended (with the new restriction that a fractional hex parse is
not allowed), and where appropriate, using a floating point fractional
parse (where we also scan to reject use of exponents in the fraction).
The bulk of the patch is then updates to the testsuite to match our
new precision, as well as adding new cases we reject (whether they
were rejected or inadvertenly accepted before).

Signed-off-by: Eric Blake <eblake@redhat.com>

---

Note that this approach differs from what has been attempted in the
past; see the thread starting at
https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html
That approach tried to parse both as strtoull and strtod and take
whichever was longer, but that was harder to document.
---
 tests/test-cutils.c        | 79 ++++++++++++++++++++++++++++++--------
 tests/test-keyval.c        | 24 +++++++++---
 tests/test-qemu-opts.c     | 20 +++++++---
 util/cutils.c              | 77 +++++++++++++++++++++++++++----------
 tests/qemu-iotests/049.out |  7 +++-
 5 files changed, 156 insertions(+), 51 deletions(-)

diff --git a/tests/test-cutils.c b/tests/test-cutils.c
index 1aa8351520ae..0c2c89d6f113 100644
--- a/tests/test-cutils.c
+++ b/tests/test-cutils.c
@@ -1960,6 +1960,13 @@ static void test_qemu_strtosz_simple(void)
     g_assert_cmpint(res, ==, 0);
     g_assert(endptr == str + 1);

+    /* Leading 0 gives decimal results, not octal */
+    str = "08";
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, 0);
+    g_assert_cmpint(res, ==, 8);
+    g_assert(endptr == str + 2);
+
     str = "12345";
     err = qemu_strtosz(str, &endptr, &res);
     g_assert_cmpint(err, ==, 0);
@@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void)
     g_assert_cmpint(err, ==, 0);
     g_assert_cmpint(res, ==, 12345);

-    /* Note: precision is 53 bits since we're parsing with strtod() */
+    /* Note: If there is no '.', we get full 64 bits of precision */

     str = "9007199254740991"; /* 2^53-1 */
     err = qemu_strtosz(str, &endptr, &res);
@@ -1987,7 +1994,7 @@ static void test_qemu_strtosz_simple(void)
     str = "9007199254740993"; /* 2^53+1 */
     err = qemu_strtosz(str, &endptr, &res);
     g_assert_cmpint(err, ==, 0);
-    g_assert_cmpint(res, ==, 0x20000000000000); /* rounded to 53 bits */
+    g_assert_cmpint(res, ==, 0x20000000000001);
     g_assert(endptr == str + 16);

     str = "18446744073709549568"; /* 0xfffffffffffff800 (53 msbs set) */
@@ -1999,11 +2006,35 @@ static void test_qemu_strtosz_simple(void)
     str = "18446744073709550591"; /* 0xfffffffffffffbff */
     err = qemu_strtosz(str, &endptr, &res);
     g_assert_cmpint(err, ==, 0);
-    g_assert_cmpint(res, ==, 0xfffffffffffff800); /* rounded to 53 bits */
+    g_assert_cmpint(res, ==, 0xfffffffffffffbff);
     g_assert(endptr == str + 20);

-    /* 0x7ffffffffffffe00..0x7fffffffffffffff get rounded to
-     * 0x8000000000000000, thus -ERANGE; see test_qemu_strtosz_erange() */
+    str = "18446744073709551615"; /* 0xffffffffffffffff */
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, 0);
+    g_assert_cmpint(res, ==, 0xffffffffffffffff);
+    g_assert(endptr == str + 20);
+
+}
+
+static void test_qemu_strtosz_hex(void)
+{
+    const char *str;
+    const char *endptr;
+    int err;
+    uint64_t res = 0xbaadf00d;
+
+    str = "0x0";
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, 0);
+    g_assert_cmpint(res, ==, 0);
+    g_assert(endptr == str + 3);
+
+    str = "0xa";
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, 0);
+    g_assert_cmpint(res, ==, 10);
+    g_assert(endptr == str + 3);
 }

 static void test_qemu_strtosz_units(void)
@@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void)
     err = qemu_strtosz(str, &endptr, &res);
     g_assert_cmpint(err, ==, -EINVAL);
     g_assert(endptr == str);
+
+    str = "1.1e5";
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, -EINVAL);
+    g_assert(endptr == str);
+
+    str = "1.1B";
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, -EINVAL);
+    g_assert(endptr == str);
+
+    str = "0x1.8k";
+    err = qemu_strtosz(str, &endptr, &res);
+    g_assert_cmpint(err, ==, -EINVAL);
+    g_assert(endptr == str);
 }

 static void test_qemu_strtosz_trailing(void)
@@ -2145,17 +2191,7 @@ static void test_qemu_strtosz_erange(void)
     g_assert_cmpint(err, ==, -ERANGE);
     g_assert(endptr == str + 2);

-    str = "18446744073709550592"; /* 0xfffffffffffffc00 */
-    err = qemu_strtosz(str, &endptr, &res);
-    g_assert_cmpint(err, ==, -ERANGE);
-    g_assert(endptr == str + 20);
-
-    str = "18446744073709551615"; /* 2^64-1 */
-    err = qemu_strtosz(str, &endptr, &res);
-    g_assert_cmpint(err, ==, -ERANGE);
-    g_assert(endptr == str + 20);
-
-    str = "18446744073709551616"; /* 2^64 */
+    str = "18446744073709551616"; /* 2^64; see strtosz_simple for 2^64-1 */
     err = qemu_strtosz(str, &endptr, &res);
     g_assert_cmpint(err, ==, -ERANGE);
     g_assert(endptr == str + 20);
@@ -2168,15 +2204,22 @@ static void test_qemu_strtosz_erange(void)

 static void test_qemu_strtosz_metric(void)
 {
-    const char *str = "12345k";
+    const char *str;
     int err;
     const char *endptr;
     uint64_t res = 0xbaadf00d;

+    str = "12345k";
     err = qemu_strtosz_metric(str, &endptr, &res);
     g_assert_cmpint(err, ==, 0);
     g_assert_cmpint(res, ==, 12345000);
     g_assert(endptr == str + 6);
+
+    str = "12.345M";
+    err = qemu_strtosz_metric(str, &endptr, &res);
+    g_assert_cmpint(err, ==, 0);
+    g_assert_cmpint(res, ==, 12345000);
+    g_assert(endptr == str + 7);
 }

 int main(int argc, char **argv)
@@ -2443,6 +2486,8 @@ int main(int argc, char **argv)

     g_test_add_func("/cutils/strtosz/simple",
                     test_qemu_strtosz_simple);
+    g_test_add_func("/cutils/strtosz/hex",
+                    test_qemu_strtosz_hex);
     g_test_add_func("/cutils/strtosz/units",
                     test_qemu_strtosz_units);
     g_test_add_func("/cutils/strtosz/float",
diff --git a/tests/test-keyval.c b/tests/test-keyval.c
index ee927fe4e427..13be763650b2 100644
--- a/tests/test-keyval.c
+++ b/tests/test-keyval.c
@@ -445,9 +445,9 @@ static void test_keyval_visit_size(void)
     visit_end_struct(v, NULL);
     visit_free(v);

-    /* Note: precision is 53 bits since we're parsing with strtod() */
+    /* Note: full 64 bits of precision */

-    /* Around limit of precision: 2^53-1, 2^53, 2^53+1 */
+    /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */
     qdict = keyval_parse("sz1=9007199254740991,"
                          "sz2=9007199254740992,"
                          "sz3=9007199254740993",
@@ -460,7 +460,7 @@ static void test_keyval_visit_size(void)
     visit_type_size(v, "sz2", &sz, &error_abort);
     g_assert_cmphex(sz, ==, 0x20000000000000);
     visit_type_size(v, "sz3", &sz, &error_abort);
-    g_assert_cmphex(sz, ==, 0x20000000000000);
+    g_assert_cmphex(sz, ==, 0x20000000000001);
     visit_check_struct(v, &error_abort);
     visit_end_struct(v, NULL);
     visit_free(v);
@@ -475,7 +475,7 @@ static void test_keyval_visit_size(void)
     visit_type_size(v, "sz1", &sz, &error_abort);
     g_assert_cmphex(sz, ==, 0x7ffffffffffffc00);
     visit_type_size(v, "sz2", &sz, &error_abort);
-    g_assert_cmphex(sz, ==, 0x7ffffffffffffc00);
+    g_assert_cmphex(sz, ==, 0x7ffffffffffffdff);
     visit_check_struct(v, &error_abort);
     visit_end_struct(v, NULL);
     visit_free(v);
@@ -490,14 +490,26 @@ static void test_keyval_visit_size(void)
     visit_type_size(v, "sz1", &sz, &error_abort);
     g_assert_cmphex(sz, ==, 0xfffffffffffff800);
     visit_type_size(v, "sz2", &sz, &error_abort);
-    g_assert_cmphex(sz, ==, 0xfffffffffffff800);
+    g_assert_cmphex(sz, ==, 0xfffffffffffffbff);
+    visit_check_struct(v, &error_abort);
+    visit_end_struct(v, NULL);
+    visit_free(v);
+
+    /* Actual limit */
+    qdict = keyval_parse("sz1=18446744073709551615", /* ffffffffffffffff */
+                         NULL, NULL, &error_abort);
+    v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
+    qobject_unref(qdict);
+    visit_start_struct(v, NULL, NULL, 0, &error_abort);
+    visit_type_size(v, "sz1", &sz, &error_abort);
+    g_assert_cmphex(sz, ==, 0xffffffffffffffff);
     visit_check_struct(v, &error_abort);
     visit_end_struct(v, NULL);
     visit_free(v);

     /* Beyond limits */
     qdict = keyval_parse("sz1=-1,"
-                         "sz2=18446744073709550592", /* fffffffffffffc00 */
+                         "sz2=18446744073709551616", /* 2^64 */
                          NULL, NULL, &error_abort);
     v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
     qobject_unref(qdict);
diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c
index 8bbb17b1c726..f79b698e6715 100644
--- a/tests/test-qemu-opts.c
+++ b/tests/test-qemu-opts.c
@@ -654,9 +654,9 @@ static void test_opts_parse_size(void)
     g_assert_cmpuint(opts_count(opts), ==, 1);
     g_assert_cmpuint(qemu_opt_get_size(opts, "size1", 1), ==, 0);

-    /* Note: precision is 53 bits since we're parsing with strtod() */
+    /* Note: full 64 bits of precision */

-    /* Around limit of precision: 2^53-1, 2^53, 2^54 */
+    /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */
     opts = qemu_opts_parse(&opts_list_02,
                            "size1=9007199254740991,"
                            "size2=9007199254740992,"
@@ -668,7 +668,7 @@ static void test_opts_parse_size(void)
     g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
                      ==, 0x20000000000000);
     g_assert_cmphex(qemu_opt_get_size(opts, "size3", 1),
-                     ==, 0x20000000000000);
+                     ==, 0x20000000000001);

     /* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */
     opts = qemu_opts_parse(&opts_list_02,
@@ -679,7 +679,7 @@ static void test_opts_parse_size(void)
     g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1),
                      ==, 0x7ffffffffffffc00);
     g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
-                     ==, 0x7ffffffffffffc00);
+                     ==, 0x7ffffffffffffdff);

     /* Close to actual upper limit 0xfffffffffffff800 (53 msbs set) */
     opts = qemu_opts_parse(&opts_list_02,
@@ -690,14 +690,22 @@ static void test_opts_parse_size(void)
     g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1),
                      ==, 0xfffffffffffff800);
     g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
-                     ==, 0xfffffffffffff800);
+                     ==, 0xfffffffffffffbff);
+
+    /* Actual limit */
+    opts = qemu_opts_parse(&opts_list_02,
+                           "size1=18446744073709551615", /* ffffffffffffffff */
+                           false, &error_abort);
+    g_assert_cmpuint(opts_count(opts), ==, 1);
+    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1),
+                     ==, 0xffffffffffffffff);

     /* Beyond limits */
     opts = qemu_opts_parse(&opts_list_02, "size1=-1", false, &err);
     error_free_or_abort(&err);
     g_assert(!opts);
     opts = qemu_opts_parse(&opts_list_02,
-                           "size1=18446744073709550592", /* fffffffffffffc00 */
+                           "size1=18446744073709551616", /* 2^64 */
                            false, &err);
     error_free_or_abort(&err);
     g_assert(!opts);
diff --git a/util/cutils.c b/util/cutils.c
index 0b5073b33012..0234763bd70b 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -241,10 +241,21 @@ static int64_t suffix_mul(char suffix, int64_t unit)
 }

 /*
- * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
- * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
- * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
- * other error.
+ * Convert size string to bytes.
+ *
+ * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
+ * T/t for TB, with scaling based on @unit, and with @default_suffix
+ * implied if no explicit suffix was given.
+ *
+ * The end pointer will be returned in *end, if not NULL.  If there is
+ * no fraction, the input can be decimal or hexadecimal; if there is a
+ * fraction, then the input must be decimal and there must be a suffix
+ * (possibly by @default_suffix) larger than Byte, and the fractional
+ * portion may suffer from precision loss or rounding.  The input must
+ * be positive.
+ *
+ * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
+ * other error (with *@end left unchanged).
  */
 static int do_strtosz(const char *nptr, const char **end,
                       const char default_suffix, int64_t unit,
@@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end,
     int retval;
     const char *endptr;
     unsigned char c;
-    int mul_required = 0;
-    double val, mul, integral, fraction;
+    bool mul_required = false;
+    uint64_t val;
+    int64_t mul;
+    double fraction = 0.0;

-    retval = qemu_strtod_finite(nptr, &endptr, &val);
+    /* Parse integral portion as decimal. */
+    retval = qemu_strtou64(nptr, &endptr, 10, &val);
     if (retval) {
         goto out;
     }
-    fraction = modf(val, &integral);
-    if (fraction != 0) {
-        mul_required = 1;
+    if (strchr(nptr, '-') != NULL) {
+        retval = -ERANGE;
+        goto out;
+    }
+    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
+        /* Input looks like hex, reparse, and insist on no fraction. */
+        retval = qemu_strtou64(nptr, &endptr, 16, &val);
+        if (retval) {
+            goto out;
+        }
+        if (*endptr == '.') {
+            endptr = nptr;
+            retval = -EINVAL;
+            goto out;
+        }
+    } else if (*endptr == '.') {
+        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
+        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
+        if (retval) {
+            endptr = nptr;
+            goto out;
+        }
+        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
+            || memchr(nptr, 'E', endptr - nptr)) {
+            endptr = nptr;
+            retval = -EINVAL;
+            goto out;
+        }
+        if (fraction != 0) {
+            mul_required = true;
+        }
     }
     c = *endptr;
     mul = suffix_mul(c, unit);
-    if (mul >= 0) {
+    if (mul > 0) {
         endptr++;
     } else {
         mul = suffix_mul(default_suffix, unit);
-        assert(mul >= 0);
+        assert(mul > 0);
     }
     if (mul == 1 && mul_required) {
+        endptr = nptr;
         retval = -EINVAL;
         goto out;
     }
-    /*
-     * Values near UINT64_MAX overflow to 2**64 when converting to double
-     * precision.  Compare against the maximum representable double precision
-     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
-     * the direction of 0".
-     */
-    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
+    if (val > UINT64_MAX / mul) {
         retval = -ERANGE;
         goto out;
     }
-    *result = val * mul;
+    *result = val * mul + (uint64_t) (fraction * mul);
     retval = 0;

 out:
diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out
index b1d8fd9107ef..f38d3ccd5978 100644
--- a/tests/qemu-iotests/049.out
+++ b/tests/qemu-iotests/049.out
@@ -98,10 +98,13 @@ qemu-img create -f qcow2 -o size=-1024 TEST_DIR/t.qcow2
 qemu-img: TEST_DIR/t.qcow2: Value '-1024' is out of range for parameter 'size'

 qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- -1k
-qemu-img: Invalid image size specified. Must be between 0 and 9223372036854775807.
+qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for
+qemu-img: kilobytes, megabytes, gigabytes, terabytes, petabytes and exabytes.

 qemu-img create -f qcow2 -o size=-1k TEST_DIR/t.qcow2
-qemu-img: TEST_DIR/t.qcow2: Value '-1k' is out of range for parameter 'size'
+qemu-img: TEST_DIR/t.qcow2: Parameter 'size' expects a non-negative number below 2^64
+Optional suffix k, M, G, T, P or E means kilo-, mega-, giga-, tera-, peta-
+and exabytes, respectively.

 qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1kilobyte
 qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for
-- 
2.30.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
@ 2021-02-04 19:07 ` Eric Blake
  2021-02-05 10:25   ` Vladimir Sementsov-Ogievskiy
  2021-02-05 11:13   ` Daniel P. Berrangé
  2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
  2 siblings, 2 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: vsementsov, berrange, qemu-block, reviewer:Incompatible changes,
	tao3.xu, rjones, armbru

Supporting '0x20M' looks odd, particularly since we have an 'E' suffix
that is ambiguous between a hex digit and the extremely large exibyte
suffix, as well as a 'B' suffix for bytes.  In practice, people using
hex inputs are specifying values in bytes (and would have written
0x2000000, or possibly relied on default_suffix in the case of
qemu_strtosz_MiB), and the use of scaling suffixes makes the most
sense for inputs in decimal (where the user would write 32M).  But
rather than outright dropping support for hex-with-suffix, let's
follow our deprecation policy.  Sadly, since qemu_strtosz() does not
have an Err** parameter, we pollute to stderr.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/system/deprecated.rst | 8 ++++++++
 util/cutils.c              | 6 +++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 6ac757ed9fa7..c404c3926e6f 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -146,6 +146,14 @@ library enabled as a cryptography provider.
 Neither the ``nettle`` library, or the built-in cryptography provider are
 supported on FIPS enabled hosts.

+hexadecimal sizes with scaling multipliers (since 6.0)
+''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+Input parameters that take a size value should only use a size suffix
+(such as 'k' or 'M') when the base is written in decimal, and not when
+the value is hexadecimal.  That is, '0x20M' is deprecated, and should
+be written either as '32M' or as '0x2000000'.
+
 QEMU Machine Protocol (QMP) commands
 ------------------------------------

diff --git a/util/cutils.c b/util/cutils.c
index 0234763bd70b..75190565cbb5 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char **end,
     int retval;
     const char *endptr;
     unsigned char c;
-    bool mul_required = false;
+    bool mul_required = false, hex = false;
     uint64_t val;
     int64_t mul;
     double fraction = 0.0;
@@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end,
     c = *endptr;
     mul = suffix_mul(c, unit);
     if (mul > 0) {
+        if (hex) {
+            fprintf(stderr, "Using a multiplier suffix on hex numbers "
+                    "is deprecated: %s\n", nptr);
+        }
         endptr++;
     } else {
         mul = suffix_mul(default_suffix, unit);
-- 
2.30.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
  2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake
@ 2021-02-04 19:07 ` Eric Blake
  2021-02-04 20:02   ` Eric Blake
                     ` (3 more replies)
  2 siblings, 4 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: vsementsov, berrange, qemu-block, tao3.xu, rjones, armbru

The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
happen to truncate it to 1126.  Our use of fractional sizes is
intended for convenience, but when a user specifies a fraction that is
not a clean translation to binary, truncating/rounding behind their
backs can cause confusion.  Better is to deprecate inexact values,
which still leaves '1.5k' as valid, but alerts the user to spell out
their values as a precise byte number in cases where they are
currently being rounded.

Note that values like '0.1G' in the testsuite need adjustment as a
result.

Sadly, since qemu_strtosz() does not have an Err** parameter, we
pollute to stderr.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 tests/test-cutils.c    | 4 ++--
 tests/test-keyval.c    | 4 ++--
 tests/test-qemu-opts.c | 4 ++--
 util/cutils.c          | 4 ++++
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/tests/test-cutils.c b/tests/test-cutils.c
index 0c2c89d6f113..ad51fb1baa51 100644
--- a/tests/test-cutils.c
+++ b/tests/test-cutils.c
@@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void)

 static void test_qemu_strtosz_float(void)
 {
-    const char *str = "12.345M";
+    const char *str = "12.125M";
     int err;
     const char *endptr;
     uint64_t res = 0xbaadf00d;

     err = qemu_strtosz(str, &endptr, &res);
     g_assert_cmpint(err, ==, 0);
-    g_assert_cmpint(res, ==, 12.345 * MiB);
+    g_assert_cmpint(res, ==, 12.125 * MiB);
     g_assert(endptr == str + 7);
 }

diff --git a/tests/test-keyval.c b/tests/test-keyval.c
index 13be763650b2..c951ac54cd23 100644
--- a/tests/test-keyval.c
+++ b/tests/test-keyval.c
@@ -522,7 +522,7 @@ static void test_keyval_visit_size(void)
     visit_free(v);

     /* Suffixes */
-    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T",
+    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T",
                          NULL, NULL, &error_abort);
     v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
     qobject_unref(qdict);
@@ -534,7 +534,7 @@ static void test_keyval_visit_size(void)
     visit_type_size(v, "sz3", &sz, &error_abort);
     g_assert_cmphex(sz, ==, 2 * MiB);
     visit_type_size(v, "sz4", &sz, &error_abort);
-    g_assert_cmphex(sz, ==, GiB / 10);
+    g_assert_cmphex(sz, ==, GiB / 8);
     visit_type_size(v, "sz5", &sz, &error_abort);
     g_assert_cmphex(sz, ==, 16777215ULL * TiB);
     visit_check_struct(v, &error_abort);
diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c
index f79b698e6715..6a1ea1d01c4f 100644
--- a/tests/test-qemu-opts.c
+++ b/tests/test-qemu-opts.c
@@ -717,10 +717,10 @@ static void test_opts_parse_size(void)
     g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8);
     g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536);
     g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB);
-    opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T",
+    opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T",
                            false, &error_abort);
     g_assert_cmpuint(opts_count(opts), ==, 2);
-    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10);
+    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8);
     g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB);

     /* Beyond limit with suffix */
diff --git a/util/cutils.c b/util/cutils.c
index 75190565cbb5..5ec6101ae778 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end,
         retval = -ERANGE;
         goto out;
     }
+    if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) {
+        fprintf(stderr, "Using a fractional size that is not an exact byte "
+                "multiple is deprecated: %s\n", nptr);
+    }
     *result = val * mul + (uint64_t) (fraction * mul);
     retval = 0;

-- 
2.30.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
@ 2021-02-04 20:02   ` Eric Blake
  2021-02-05 10:34   ` Richard W.M. Jones
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-04 20:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: vsementsov, berrange, qemu-block, tao3.xu, armbru, libvirt-list

On 2/4/21 1:07 PM, Eric Blake wrote:
> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
> happen to truncate it to 1126.  Our use of fractional sizes is
> intended for convenience, but when a user specifies a fraction that is
> not a clean translation to binary, truncating/rounding behind their
> backs can cause confusion.  Better is to deprecate inexact values,
> which still leaves '1.5k' as valid, but alerts the user to spell out
> their values as a precise byte number in cases where they are
> currently being rounded.

And I promptly forgot to save my changes in my editor.  Consider this
squashed in:

diff --git i/docs/system/deprecated.rst w/docs/system/deprecated.rst
index c404c3926e6f..8e5e05a10642 100644
--- i/docs/system/deprecated.rst
+++ w/docs/system/deprecated.rst
@@ -154,6 +154,15 @@ Input parameters that take a size value should only
use a size suffix
 the value is hexadecimal.  That is, '0x20M' is deprecated, and should
 be written either as '32M' or as '0x2000000'.

+inexact sizes via scaled fractions (since 6.0)
+''''''''''''''''''''''''''''''''''''''''''''''
+
+Input parameters that take a size value should only use a fractional
+size (such as '1.5M') that will result in an exact byte value.  The
+use of inexact values (such as '1.1M') that require truncation or
+rounding is deprecated, and you should instead consider writing your
+unusual size in bytes (here, '1153433' or '1153434' as desired).
+
 QEMU Machine Protocol (QMP) commands
 ------------------------------------



-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
@ 2021-02-04 20:12   ` Eric Blake
  2021-02-05 10:06     ` Vladimir Sementsov-Ogievskiy
  2021-02-05 10:07   ` Vladimir Sementsov-Ogievskiy
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2021-02-04 20:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu, armbru, Max Reitz

On 2/4/21 1:07 PM, Eric Blake wrote:
> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
> the keyval visitor), and it gets annoying that edge-case testing is
> impacted by implicit rounding to 53 bits of precision due to parsing
> with strtod().  As an example posted by Rich Jones:
>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>  write failed: Input/output error
> 
> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
> out of bounds.
> 
> It is also worth noting that our existing parser, by virtue of using
> strtod(), accepts decimal AND hex numbers, even though test-cutils
> previously lacked any coverage of the latter.  We do have existing
> clients that expect a hex parse to work (for example, iotest 33 using
> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
> rather than as an invalid octal number, so we know there are no
> clients that depend on octal.  Our use of strtod() also means that
> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
> than 1843 (if the fraction were 8/10); but as this was not covered in
> the testsuite, I have no qualms forbidding hex fractions as invalid,
> so this patch declares that the use of fractions is only supported
> with decimal input, and enhances the testsuite to document that.
> 
> Our previous use of strtod() meant that -1 parsed as a negative; now
> that we parse with strtoull(), negative values can wrap around module

modulo

> 2^64, so we have to explicitly check whether the user passed in a '-'.
> 
> We also had no testsuite coverage of "1.1e0k", which happened to parse
> under strtod() but is unlikely to occur in practice; as long as we are
> making things more robust, it is easy enough to reject the use of
> exponents in a strtod parse.
> 
> The fix is done by breaking the parse into an integer prefix (no loss
> in precision), rejecting negative values (since we can no longer rely
> on strtod() to do that), determining if a decimal or hexadecimal parse
> was intended (with the new restriction that a fractional hex parse is
> not allowed), and where appropriate, using a floating point fractional
> parse (where we also scan to reject use of exponents in the fraction).
> The bulk of the patch is then updates to the testsuite to match our
> new precision, as well as adding new cases we reject (whether they
> were rejected or inadvertenly accepted before).

inadvertently

> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> 

Is it a bad sign when I review my own code?

> 
>  /*
> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
> - * other error.
> + * Convert size string to bytes.
> + *
> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
> + * T/t for TB, with scaling based on @unit, and with @default_suffix
> + * implied if no explicit suffix was given.

Reformatted existing text, but incomplete; we also support 'P' and 'E'.

> + *
> + * The end pointer will be returned in *end, if not NULL.  If there is
> + * no fraction, the input can be decimal or hexadecimal; if there is a
> + * fraction, then the input must be decimal and there must be a suffix
> + * (possibly by @default_suffix) larger than Byte, and the fractional
> + * portion may suffer from precision loss or rounding.  The input must
> + * be positive.
> + *
> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
> + * other error (with *@end left unchanged).

If we take patch 2 and 3, this contract should also be updated to
mention what we consider to be deprecated.

>   */
>  static int do_strtosz(const char *nptr, const char **end,
>                        const char default_suffix, int64_t unit,
> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end,
>      int retval;
>      const char *endptr;
>      unsigned char c;
> -    int mul_required = 0;
> -    double val, mul, integral, fraction;
> +    bool mul_required = false;
> +    uint64_t val;
> +    int64_t mul;
> +    double fraction = 0.0;
> 
> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
> +    /* Parse integral portion as decimal. */
> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>      if (retval) {
>          goto out;
>      }
> -    fraction = modf(val, &integral);
> -    if (fraction != 0) {
> -        mul_required = 1;
> +    if (strchr(nptr, '-') != NULL) {
> +        retval = -ERANGE;
> +        goto out;
> +    }
> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
> +        /* Input looks like hex, reparse, and insist on no fraction. */
> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
> +        if (retval) {
> +            goto out;
> +        }
> +        if (*endptr == '.') {
> +            endptr = nptr;
> +            retval = -EINVAL;
> +            goto out;
> +        }
> +    } else if (*endptr == '.') {
> +        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
> +        if (retval) {
> +            endptr = nptr;
> +            goto out;
> +        }
> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
> +            || memchr(nptr, 'E', endptr - nptr)) {
> +            endptr = nptr;
> +            retval = -EINVAL;
> +            goto out;
> +        }
> +        if (fraction != 0) {
> +            mul_required = true;
> +        }
>      }
>      c = *endptr;
>      mul = suffix_mul(c, unit);
> -    if (mul >= 0) {
> +    if (mul > 0) {
>          endptr++;
>      } else {
>          mul = suffix_mul(default_suffix, unit);
> -        assert(mul >= 0);
> +        assert(mul > 0);
>      }
>      if (mul == 1 && mul_required) {
> +        endptr = nptr;
>          retval = -EINVAL;
>          goto out;
>      }
> -    /*
> -     * Values near UINT64_MAX overflow to 2**64 when converting to double
> -     * precision.  Compare against the maximum representable double precision
> -     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
> -     * the direction of 0".
> -     */
> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
> +    if (val > UINT64_MAX / mul) {

Hmm, do we care about:
15.9999999999999999999999999999E
where the fractional portion becomes large enough to actually bump our
sum below to 16E which indeed overflows?  Then again, we rejected a
fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0
due to rounding.
Maybe it's just worth a good comment why the overflow check here works
without consulting fraction.

>          retval = -ERANGE;
>          goto out;
>      }
> -    *result = val * mul;
> +    *result = val * mul + (uint64_t) (fraction * mul);
>      retval = 0;
> 
>  out:
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 20:12   ` Eric Blake
@ 2021-02-05 10:06     ` Vladimir Sementsov-Ogievskiy
  2021-02-05 10:18       ` Vladimir Sementsov-Ogievskiy
  2021-02-05 14:06       ` Eric Blake
  0 siblings, 2 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:06 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, armbru, Max Reitz

04.02.2021 23:12, Eric Blake wrote:
> On 2/4/21 1:07 PM, Eric Blake wrote:
>> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
>> the keyval visitor), and it gets annoying that edge-case testing is
>> impacted by implicit rounding to 53 bits of precision due to parsing
>> with strtod().  As an example posted by Rich Jones:
>>   $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>>     'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>>   write failed: Input/output error
>>
>> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
>> out of bounds.
>>
>> It is also worth noting that our existing parser, by virtue of using
>> strtod(), accepts decimal AND hex numbers, even though test-cutils
>> previously lacked any coverage of the latter.  We do have existing
>> clients that expect a hex parse to work (for example, iotest 33 using
>> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
>> rather than as an invalid octal number, so we know there are no
>> clients that depend on octal.  Our use of strtod() also means that
>> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
>> than 1843 (if the fraction were 8/10); but as this was not covered in
>> the testsuite, I have no qualms forbidding hex fractions as invalid,
>> so this patch declares that the use of fractions is only supported
>> with decimal input, and enhances the testsuite to document that.
>>
>> Our previous use of strtod() meant that -1 parsed as a negative; now
>> that we parse with strtoull(), negative values can wrap around module
> 
> modulo
> 
>> 2^64, so we have to explicitly check whether the user passed in a '-'.
>>
>> We also had no testsuite coverage of "1.1e0k", which happened to parse
>> under strtod() but is unlikely to occur in practice; as long as we are
>> making things more robust, it is easy enough to reject the use of
>> exponents in a strtod parse.
>>
>> The fix is done by breaking the parse into an integer prefix (no loss
>> in precision), rejecting negative values (since we can no longer rely
>> on strtod() to do that), determining if a decimal or hexadecimal parse
>> was intended (with the new restriction that a fractional hex parse is
>> not allowed), and where appropriate, using a floating point fractional
>> parse (where we also scan to reject use of exponents in the fraction).
>> The bulk of the patch is then updates to the testsuite to match our
>> new precision, as well as adding new cases we reject (whether they
>> were rejected or inadvertenly accepted before).
> 
> inadvertently
> 
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>
>> ---
>>
> 
> Is it a bad sign when I review my own code?
> 
>>
>>   /*
>> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
>> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
>> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
>> - * other error.
>> + * Convert size string to bytes.
>> + *
>> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
>> + * T/t for TB, with scaling based on @unit, and with @default_suffix
>> + * implied if no explicit suffix was given.
> 
> Reformatted existing text, but incomplete; we also support 'P' and 'E'.
> 
>> + *
>> + * The end pointer will be returned in *end, if not NULL.  If there is
>> + * no fraction, the input can be decimal or hexadecimal; if there is a
>> + * fraction, then the input must be decimal and there must be a suffix
>> + * (possibly by @default_suffix) larger than Byte, and the fractional
>> + * portion may suffer from precision loss or rounding.  The input must
>> + * be positive.
>> + *
>> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
>> + * other error (with *@end left unchanged).
> 
> If we take patch 2 and 3, this contract should also be updated to
> mention what we consider to be deprecated.
> 
>>    */
>>   static int do_strtosz(const char *nptr, const char **end,
>>                         const char default_suffix, int64_t unit,
>> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end,
>>       int retval;
>>       const char *endptr;
>>       unsigned char c;
>> -    int mul_required = 0;
>> -    double val, mul, integral, fraction;
>> +    bool mul_required = false;
>> +    uint64_t val;
>> +    int64_t mul;
>> +    double fraction = 0.0;
>>
>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>> +    /* Parse integral portion as decimal. */
>> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>>       if (retval) {
>>           goto out;
>>       }
>> -    fraction = modf(val, &integral);
>> -    if (fraction != 0) {
>> -        mul_required = 1;
>> +    if (strchr(nptr, '-') != NULL) {
>> +        retval = -ERANGE;
>> +        goto out;
>> +    }
>> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
>> +        /* Input looks like hex, reparse, and insist on no fraction. */
>> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
>> +        if (retval) {
>> +            goto out;
>> +        }
>> +        if (*endptr == '.') {
>> +            endptr = nptr;
>> +            retval = -EINVAL;
>> +            goto out;
>> +        }
>> +    } else if (*endptr == '.') {
>> +        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
>> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
>> +        if (retval) {
>> +            endptr = nptr;
>> +            goto out;
>> +        }
>> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
>> +            || memchr(nptr, 'E', endptr - nptr)) {
>> +            endptr = nptr;
>> +            retval = -EINVAL;
>> +            goto out;
>> +        }
>> +        if (fraction != 0) {
>> +            mul_required = true;
>> +        }
>>       }
>>       c = *endptr;
>>       mul = suffix_mul(c, unit);
>> -    if (mul >= 0) {
>> +    if (mul > 0) {
>>           endptr++;
>>       } else {
>>           mul = suffix_mul(default_suffix, unit);
>> -        assert(mul >= 0);
>> +        assert(mul > 0);
>>       }
>>       if (mul == 1 && mul_required) {
>> +        endptr = nptr;
>>           retval = -EINVAL;
>>           goto out;
>>       }
>> -    /*
>> -     * Values near UINT64_MAX overflow to 2**64 when converting to double
>> -     * precision.  Compare against the maximum representable double precision
>> -     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
>> -     * the direction of 0".
>> -     */
>> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
>> +    if (val > UINT64_MAX / mul) {
> 
> Hmm, do we care about:
> 15.9999999999999999999999999999E
> where the fractional portion becomes large enough to actually bump our
> sum below to 16E which indeed overflows?  Then again, we rejected a
> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0
> due to rounding.
> Maybe it's just worth a good comment why the overflow check here works
> without consulting fraction.

worth a good comment, because I don't follow :)

If mul is big enough and fraction=0.5, why val*mul + fraction*mul will not overflow?

Also, if we find '.' in the number, why not just reparse the whole number with qemu_strtod_finite? And don't care about 1.0?

> 
>>           retval = -ERANGE;
>>           goto out;
>>       }
>> -    *result = val * mul;
>> +    *result = val * mul + (uint64_t) (fraction * mul);
>>       retval = 0;
>>
>>   out:


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
  2021-02-04 20:12   ` Eric Blake
@ 2021-02-05 10:07   ` Vladimir Sementsov-Ogievskiy
  2021-02-05 14:12     ` Eric Blake
  2021-02-05 10:28   ` Richard W.M. Jones
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:07 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: rjones, berrange, armbru, qemu-block, tao3.xu, Kevin Wolf, Max Reitz

04.02.2021 22:07, Eric Blake wrote:
> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
> the keyval visitor), and it gets annoying that edge-case testing is
> impacted by implicit rounding to 53 bits of precision due to parsing
> with strtod().  As an example posted by Rich Jones:
>   $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>     'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>   write failed: Input/output error
> 
> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
> out of bounds.
> 
> It is also worth noting that our existing parser, by virtue of using
> strtod(), accepts decimal AND hex numbers, even though test-cutils
> previously lacked any coverage of the latter.  We do have existing
> clients that expect a hex parse to work (for example, iotest 33 using
> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
> rather than as an invalid octal number, so we know there are no
> clients that depend on octal.  Our use of strtod() also means that
> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
> than 1843 (if the fraction were 8/10); but as this was not covered in
> the testsuite, I have no qualms forbidding hex fractions as invalid,
> so this patch declares that the use of fractions is only supported
> with decimal input, and enhances the testsuite to document that.
> 
> Our previous use of strtod() meant that -1 parsed as a negative; now
> that we parse with strtoull(), negative values can wrap around module
> 2^64, so we have to explicitly check whether the user passed in a '-'.
> 
> We also had no testsuite coverage of "1.1e0k", which happened to parse
> under strtod() but is unlikely to occur in practice; as long as we are
> making things more robust, it is easy enough to reject the use of
> exponents in a strtod parse.
> 
> The fix is done by breaking the parse into an integer prefix (no loss
> in precision), rejecting negative values (since we can no longer rely
> on strtod() to do that), determining if a decimal or hexadecimal parse
> was intended (with the new restriction that a fractional hex parse is
> not allowed), and where appropriate, using a floating point fractional
> parse (where we also scan to reject use of exponents in the fraction).
> The bulk of the patch is then updates to the testsuite to match our
> new precision, as well as adding new cases we reject (whether they
> were rejected or inadvertenly accepted before).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> 
> Note that this approach differs from what has been attempted in the
> past; see the thread starting at
> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html
> That approach tried to parse both as strtoull and strtod and take
> whichever was longer, but that was harder to document.
> ---
>   tests/test-cutils.c        | 79 ++++++++++++++++++++++++++++++--------
>   tests/test-keyval.c        | 24 +++++++++---
>   tests/test-qemu-opts.c     | 20 +++++++---
>   util/cutils.c              | 77 +++++++++++++++++++++++++++----------
>   tests/qemu-iotests/049.out |  7 +++-
>   5 files changed, 156 insertions(+), 51 deletions(-)
> 
> diff --git a/tests/test-cutils.c b/tests/test-cutils.c
> index 1aa8351520ae..0c2c89d6f113 100644
> --- a/tests/test-cutils.c
> +++ b/tests/test-cutils.c
> @@ -1960,6 +1960,13 @@ static void test_qemu_strtosz_simple(void)
>       g_assert_cmpint(res, ==, 0);
>       g_assert(endptr == str + 1);
> 
> +    /* Leading 0 gives decimal results, not octal */
> +    str = "08";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 8);
> +    g_assert(endptr == str + 2);
> +
>       str = "12345";
>       err = qemu_strtosz(str, &endptr, &res);
>       g_assert_cmpint(err, ==, 0);
> @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void)
>       g_assert_cmpint(err, ==, 0);
>       g_assert_cmpint(res, ==, 12345);
> 
> -    /* Note: precision is 53 bits since we're parsing with strtod() */
> +    /* Note: If there is no '.', we get full 64 bits of precision */
> 
>       str = "9007199254740991"; /* 2^53-1 */
>       err = qemu_strtosz(str, &endptr, &res);
> @@ -1987,7 +1994,7 @@ static void test_qemu_strtosz_simple(void)
>       str = "9007199254740993"; /* 2^53+1 */
>       err = qemu_strtosz(str, &endptr, &res);
>       g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 0x20000000000000); /* rounded to 53 bits */
> +    g_assert_cmpint(res, ==, 0x20000000000001);
>       g_assert(endptr == str + 16);
> 
>       str = "18446744073709549568"; /* 0xfffffffffffff800 (53 msbs set) */
> @@ -1999,11 +2006,35 @@ static void test_qemu_strtosz_simple(void)
>       str = "18446744073709550591"; /* 0xfffffffffffffbff */
>       err = qemu_strtosz(str, &endptr, &res);
>       g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 0xfffffffffffff800); /* rounded to 53 bits */
> +    g_assert_cmpint(res, ==, 0xfffffffffffffbff);
>       g_assert(endptr == str + 20);
> 
> -    /* 0x7ffffffffffffe00..0x7fffffffffffffff get rounded to
> -     * 0x8000000000000000, thus -ERANGE; see test_qemu_strtosz_erange() */
> +    str = "18446744073709551615"; /* 0xffffffffffffffff */
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 0xffffffffffffffff);
> +    g_assert(endptr == str + 20);
> +
> +}
> +
> +static void test_qemu_strtosz_hex(void)
> +{
> +    const char *str;
> +    const char *endptr;
> +    int err;
> +    uint64_t res = 0xbaadf00d;
> +
> +    str = "0x0";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 0);
> +    g_assert(endptr == str + 3);
> +
> +    str = "0xa";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 10);
> +    g_assert(endptr == str + 3);
>   }
> 
>   static void test_qemu_strtosz_units(void)
> @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void)
>       err = qemu_strtosz(str, &endptr, &res);
>       g_assert_cmpint(err, ==, -EINVAL);
>       g_assert(endptr == str);
> +
> +    str = "1.1e5";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, -EINVAL);
> +    g_assert(endptr == str);

I'd add also test with 'E'

> +
> +    str = "1.1B";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, -EINVAL);
> +    g_assert(endptr == str);
> +
> +    str = "0x1.8k";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, -EINVAL);
> +    g_assert(endptr == str);

ha this function looks like it should have

const cher *str[] = ["", " \t ", ... "0x1.8k"]

and all cases in a for()... and all other test cases may be ... Oh, I should say myself "don't refactor everything you see".

>   }
> 
>   static void test_qemu_strtosz_trailing(void)
> @@ -2145,17 +2191,7 @@ static void test_qemu_strtosz_erange(void)
>       g_assert_cmpint(err, ==, -ERANGE);
>       g_assert(endptr == str + 2);
> 
> -    str = "18446744073709550592"; /* 0xfffffffffffffc00 */
> -    err = qemu_strtosz(str, &endptr, &res);
> -    g_assert_cmpint(err, ==, -ERANGE);
> -    g_assert(endptr == str + 20);
> -
> -    str = "18446744073709551615"; /* 2^64-1 */
> -    err = qemu_strtosz(str, &endptr, &res);
> -    g_assert_cmpint(err, ==, -ERANGE);
> -    g_assert(endptr == str + 20);
> -
> -    str = "18446744073709551616"; /* 2^64 */
> +    str = "18446744073709551616"; /* 2^64; see strtosz_simple for 2^64-1 */
>       err = qemu_strtosz(str, &endptr, &res);
>       g_assert_cmpint(err, ==, -ERANGE);
>       g_assert(endptr == str + 20);
> @@ -2168,15 +2204,22 @@ static void test_qemu_strtosz_erange(void)
> 
>   static void test_qemu_strtosz_metric(void)
>   {
> -    const char *str = "12345k";
> +    const char *str;
>       int err;
>       const char *endptr;
>       uint64_t res = 0xbaadf00d;
> 
> +    str = "12345k";
>       err = qemu_strtosz_metric(str, &endptr, &res);
>       g_assert_cmpint(err, ==, 0);
>       g_assert_cmpint(res, ==, 12345000);
>       g_assert(endptr == str + 6);
> +
> +    str = "12.345M";
> +    err = qemu_strtosz_metric(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 12345000);
> +    g_assert(endptr == str + 7);
>   }
> 
>   int main(int argc, char **argv)
> @@ -2443,6 +2486,8 @@ int main(int argc, char **argv)
> 
>       g_test_add_func("/cutils/strtosz/simple",
>                       test_qemu_strtosz_simple);
> +    g_test_add_func("/cutils/strtosz/hex",
> +                    test_qemu_strtosz_hex);
>       g_test_add_func("/cutils/strtosz/units",
>                       test_qemu_strtosz_units);
>       g_test_add_func("/cutils/strtosz/float",
> diff --git a/tests/test-keyval.c b/tests/test-keyval.c
> index ee927fe4e427..13be763650b2 100644
> --- a/tests/test-keyval.c
> +++ b/tests/test-keyval.c
> @@ -445,9 +445,9 @@ static void test_keyval_visit_size(void)
>       visit_end_struct(v, NULL);
>       visit_free(v);
> 
> -    /* Note: precision is 53 bits since we're parsing with strtod() */
> +    /* Note: full 64 bits of precision */
> 
> -    /* Around limit of precision: 2^53-1, 2^53, 2^53+1 */
> +    /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */
>       qdict = keyval_parse("sz1=9007199254740991,"
>                            "sz2=9007199254740992,"
>                            "sz3=9007199254740993",
> @@ -460,7 +460,7 @@ static void test_keyval_visit_size(void)
>       visit_type_size(v, "sz2", &sz, &error_abort);
>       g_assert_cmphex(sz, ==, 0x20000000000000);
>       visit_type_size(v, "sz3", &sz, &error_abort);
> -    g_assert_cmphex(sz, ==, 0x20000000000000);
> +    g_assert_cmphex(sz, ==, 0x20000000000001);
>       visit_check_struct(v, &error_abort);
>       visit_end_struct(v, NULL);
>       visit_free(v);
> @@ -475,7 +475,7 @@ static void test_keyval_visit_size(void)
>       visit_type_size(v, "sz1", &sz, &error_abort);
>       g_assert_cmphex(sz, ==, 0x7ffffffffffffc00);
>       visit_type_size(v, "sz2", &sz, &error_abort);
> -    g_assert_cmphex(sz, ==, 0x7ffffffffffffc00);
> +    g_assert_cmphex(sz, ==, 0x7ffffffffffffdff);
>       visit_check_struct(v, &error_abort);
>       visit_end_struct(v, NULL);
>       visit_free(v);
> @@ -490,14 +490,26 @@ static void test_keyval_visit_size(void)
>       visit_type_size(v, "sz1", &sz, &error_abort);
>       g_assert_cmphex(sz, ==, 0xfffffffffffff800);
>       visit_type_size(v, "sz2", &sz, &error_abort);
> -    g_assert_cmphex(sz, ==, 0xfffffffffffff800);
> +    g_assert_cmphex(sz, ==, 0xfffffffffffffbff);
> +    visit_check_struct(v, &error_abort);
> +    visit_end_struct(v, NULL);
> +    visit_free(v);
> +
> +    /* Actual limit */
> +    qdict = keyval_parse("sz1=18446744073709551615", /* ffffffffffffffff */
> +                         NULL, NULL, &error_abort);
> +    v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
> +    qobject_unref(qdict);
> +    visit_start_struct(v, NULL, NULL, 0, &error_abort);
> +    visit_type_size(v, "sz1", &sz, &error_abort);
> +    g_assert_cmphex(sz, ==, 0xffffffffffffffff);
>       visit_check_struct(v, &error_abort);
>       visit_end_struct(v, NULL);
>       visit_free(v);
> 
>       /* Beyond limits */
>       qdict = keyval_parse("sz1=-1,"
> -                         "sz2=18446744073709550592", /* fffffffffffffc00 */
> +                         "sz2=18446744073709551616", /* 2^64 */
>                            NULL, NULL, &error_abort);
>       v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
>       qobject_unref(qdict);
> diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c
> index 8bbb17b1c726..f79b698e6715 100644
> --- a/tests/test-qemu-opts.c
> +++ b/tests/test-qemu-opts.c
> @@ -654,9 +654,9 @@ static void test_opts_parse_size(void)
>       g_assert_cmpuint(opts_count(opts), ==, 1);
>       g_assert_cmpuint(qemu_opt_get_size(opts, "size1", 1), ==, 0);
> 
> -    /* Note: precision is 53 bits since we're parsing with strtod() */
> +    /* Note: full 64 bits of precision */
> 
> -    /* Around limit of precision: 2^53-1, 2^53, 2^54 */
> +    /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */
>       opts = qemu_opts_parse(&opts_list_02,
>                              "size1=9007199254740991,"
>                              "size2=9007199254740992,"
> @@ -668,7 +668,7 @@ static void test_opts_parse_size(void)
>       g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
>                        ==, 0x20000000000000);
>       g_assert_cmphex(qemu_opt_get_size(opts, "size3", 1),
> -                     ==, 0x20000000000000);
> +                     ==, 0x20000000000001);
> 
>       /* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */

should fix comment?

>       opts = qemu_opts_parse(&opts_list_02,
> @@ -679,7 +679,7 @@ static void test_opts_parse_size(void)
>       g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1),
>                        ==, 0x7ffffffffffffc00);
>       g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
> -                     ==, 0x7ffffffffffffc00);
> +                     ==, 0x7ffffffffffffdff);
> 
>       /* Close to actual upper limit 0xfffffffffffff800 (53 msbs set) */

and here?

and probably in a previous file

>       opts = qemu_opts_parse(&opts_list_02,
> @@ -690,14 +690,22 @@ static void test_opts_parse_size(void)
>       g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1),
>                        ==, 0xfffffffffffff800);
>       g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
> -                     ==, 0xfffffffffffff800);
> +                     ==, 0xfffffffffffffbff);
> +
> +    /* Actual limit */
> +    opts = qemu_opts_parse(&opts_list_02,
> +                           "size1=18446744073709551615", /* ffffffffffffffff */
> +                           false, &error_abort);
> +    g_assert_cmpuint(opts_count(opts), ==, 1);
> +    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1),
> +                     ==, 0xffffffffffffffff);
> 
>       /* Beyond limits */
>       opts = qemu_opts_parse(&opts_list_02, "size1=-1", false, &err);
>       error_free_or_abort(&err);
>       g_assert(!opts);
>       opts = qemu_opts_parse(&opts_list_02,
> -                           "size1=18446744073709550592", /* fffffffffffffc00 */
> +                           "size1=18446744073709551616", /* 2^64 */
>                              false, &err);
>       error_free_or_abort(&err);
>       g_assert(!opts);

Test modifications above all looks OK.

> diff --git a/util/cutils.c b/util/cutils.c
> index 0b5073b33012..0234763bd70b 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -241,10 +241,21 @@ static int64_t suffix_mul(char suffix, int64_t unit)
>   }
> 
>   /*
> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
> - * other error.
> + * Convert size string to bytes.
> + *
> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
> + * T/t for TB, with scaling based on @unit, and with @default_suffix
> + * implied if no explicit suffix was given.
> + *
> + * The end pointer will be returned in *end, if not NULL.  If there is
> + * no fraction, the input can be decimal or hexadecimal; if there is a
> + * fraction, then the input must be decimal and there must be a suffix
> + * (possibly by @default_suffix) larger than Byte, and the fractional
> + * portion may suffer from precision loss or rounding.  The input must
> + * be positive.
> + *
> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
> + * other error (with *@end left unchanged).
>    */
>   static int do_strtosz(const char *nptr, const char **end,
>                         const char default_suffix, int64_t unit,
> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end,
>       int retval;
>       const char *endptr;
>       unsigned char c;
> -    int mul_required = 0;
> -    double val, mul, integral, fraction;
> +    bool mul_required = false;
> +    uint64_t val;
> +    int64_t mul;
> +    double fraction = 0.0;
> 
> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
> +    /* Parse integral portion as decimal. */
> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>       if (retval) {
>           goto out;
>       }
> -    fraction = modf(val, &integral);
> -    if (fraction != 0) {
> -        mul_required = 1;
> +    if (strchr(nptr, '-') != NULL) {
> +        retval = -ERANGE;
> +        goto out;
> +    }

Hmm, I have a question: does do_strtosz assumes that the whole nptr string is a number?

If yes, then I don't understand what the matter of **end OUT-argument.

If not, this if() is wrong, as you'll redject parsing first number of "123425 -  2323" string..

> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
> +        /* Input looks like hex, reparse, and insist on no fraction. */
> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
> +        if (retval) {
> +            goto out;
> +        }
> +        if (*endptr == '.') {
> +            endptr = nptr;
> +            retval = -EINVAL;
> +            goto out;
> +        }
> +    } else if (*endptr == '.') {
> +        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
> +        if (retval) {
> +            endptr = nptr;
> +            goto out;
> +        }
> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
> +            || memchr(nptr, 'E', endptr - nptr)) {
> +            endptr = nptr;
> +            retval = -EINVAL;
> +            goto out;
> +        }
> +        if (fraction != 0) {
> +            mul_required = true;
> +        }
>       }
>       c = *endptr;
>       mul = suffix_mul(c, unit);
> -    if (mul >= 0) {
> +    if (mul > 0) {
>           endptr++;
>       } else {
>           mul = suffix_mul(default_suffix, unit);
> -        assert(mul >= 0);
> +        assert(mul > 0);
>       }
>       if (mul == 1 && mul_required) {
> +        endptr = nptr;
>           retval = -EINVAL;
>           goto out;
>       }
> -    /*
> -     * Values near UINT64_MAX overflow to 2**64 when converting to double
> -     * precision.  Compare against the maximum representable double precision
> -     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
> -     * the direction of 0".
> -     */
> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
> +    if (val > UINT64_MAX / mul) {
>           retval = -ERANGE;
>           goto out;
>       }
> -    *result = val * mul;
> +    *result = val * mul + (uint64_t) (fraction * mul);
>       retval = 0;
> 
>   out:
> diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out
> index b1d8fd9107ef..f38d3ccd5978 100644
> --- a/tests/qemu-iotests/049.out
> +++ b/tests/qemu-iotests/049.out
> @@ -98,10 +98,13 @@ qemu-img create -f qcow2 -o size=-1024 TEST_DIR/t.qcow2
>   qemu-img: TEST_DIR/t.qcow2: Value '-1024' is out of range for parameter 'size'
> 
>   qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- -1k
> -qemu-img: Invalid image size specified. Must be between 0 and 9223372036854775807.
> +qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for
> +qemu-img: kilobytes, megabytes, gigabytes, terabytes, petabytes and exabytes.
> 
>   qemu-img create -f qcow2 -o size=-1k TEST_DIR/t.qcow2
> -qemu-img: TEST_DIR/t.qcow2: Value '-1k' is out of range for parameter 'size'
> +qemu-img: TEST_DIR/t.qcow2: Parameter 'size' expects a non-negative number below 2^64
> +Optional suffix k, M, G, T, P or E means kilo-, mega-, giga-, tera-, peta-
> +and exabytes, respectively.
> 
>   qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1kilobyte
>   qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 10:06     ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 10:18       ` Vladimir Sementsov-Ogievskiy
  2021-02-05 14:06       ` Eric Blake
  1 sibling, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:18 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, armbru, Max Reitz

05.02.2021 13:06, Vladimir Sementsov-Ogievskiy wrote:
> 04.02.2021 23:12, Eric Blake wrote:
>> On 2/4/21 1:07 PM, Eric Blake wrote:
>>> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
>>> the keyval visitor), and it gets annoying that edge-case testing is
>>> impacted by implicit rounding to 53 bits of precision due to parsing
>>> with strtod().  As an example posted by Rich Jones:
>>>   $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>>>     'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>>>   write failed: Input/output error
>>>
>>> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
>>> out of bounds.
>>>
>>> It is also worth noting that our existing parser, by virtue of using
>>> strtod(), accepts decimal AND hex numbers, even though test-cutils
>>> previously lacked any coverage of the latter.  We do have existing
>>> clients that expect a hex parse to work (for example, iotest 33 using
>>> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
>>> rather than as an invalid octal number, so we know there are no
>>> clients that depend on octal.  Our use of strtod() also means that
>>> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
>>> than 1843 (if the fraction were 8/10); but as this was not covered in
>>> the testsuite, I have no qualms forbidding hex fractions as invalid,
>>> so this patch declares that the use of fractions is only supported
>>> with decimal input, and enhances the testsuite to document that.
>>>
>>> Our previous use of strtod() meant that -1 parsed as a negative; now
>>> that we parse with strtoull(), negative values can wrap around module
>>
>> modulo
>>
>>> 2^64, so we have to explicitly check whether the user passed in a '-'.
>>>
>>> We also had no testsuite coverage of "1.1e0k", which happened to parse
>>> under strtod() but is unlikely to occur in practice; as long as we are
>>> making things more robust, it is easy enough to reject the use of
>>> exponents in a strtod parse.
>>>
>>> The fix is done by breaking the parse into an integer prefix (no loss
>>> in precision), rejecting negative values (since we can no longer rely
>>> on strtod() to do that), determining if a decimal or hexadecimal parse
>>> was intended (with the new restriction that a fractional hex parse is
>>> not allowed), and where appropriate, using a floating point fractional
>>> parse (where we also scan to reject use of exponents in the fraction).
>>> The bulk of the patch is then updates to the testsuite to match our
>>> new precision, as well as adding new cases we reject (whether they
>>> were rejected or inadvertenly accepted before).
>>
>> inadvertently
>>
>>>
>>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>>
>>> ---
>>>
>>
>> Is it a bad sign when I review my own code?
>>
>>>
>>>   /*
>>> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
>>> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
>>> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
>>> - * other error.
>>> + * Convert size string to bytes.
>>> + *
>>> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
>>> + * T/t for TB, with scaling based on @unit, and with @default_suffix
>>> + * implied if no explicit suffix was given.
>>
>> Reformatted existing text, but incomplete; we also support 'P' and 'E'.
>>
>>> + *
>>> + * The end pointer will be returned in *end, if not NULL.  If there is
>>> + * no fraction, the input can be decimal or hexadecimal; if there is a
>>> + * fraction, then the input must be decimal and there must be a suffix
>>> + * (possibly by @default_suffix) larger than Byte, and the fractional
>>> + * portion may suffer from precision loss or rounding.  The input must
>>> + * be positive.
>>> + *
>>> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
>>> + * other error (with *@end left unchanged).
>>
>> If we take patch 2 and 3, this contract should also be updated to
>> mention what we consider to be deprecated.
>>
>>>    */
>>>   static int do_strtosz(const char *nptr, const char **end,
>>>                         const char default_suffix, int64_t unit,
>>> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end,
>>>       int retval;
>>>       const char *endptr;
>>>       unsigned char c;
>>> -    int mul_required = 0;
>>> -    double val, mul, integral, fraction;
>>> +    bool mul_required = false;
>>> +    uint64_t val;
>>> +    int64_t mul;
>>> +    double fraction = 0.0;
>>>
>>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>>> +    /* Parse integral portion as decimal. */
>>> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>>>       if (retval) {
>>>           goto out;
>>>       }
>>> -    fraction = modf(val, &integral);
>>> -    if (fraction != 0) {
>>> -        mul_required = 1;
>>> +    if (strchr(nptr, '-') != NULL) {
>>> +        retval = -ERANGE;
>>> +        goto out;
>>> +    }
>>> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
>>> +        /* Input looks like hex, reparse, and insist on no fraction. */
>>> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
>>> +        if (retval) {
>>> +            goto out;
>>> +        }
>>> +        if (*endptr == '.') {
>>> +            endptr = nptr;
>>> +            retval = -EINVAL;
>>> +            goto out;
>>> +        }
>>> +    } else if (*endptr == '.') {
>>> +        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
>>> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
>>> +        if (retval) {
>>> +            endptr = nptr;
>>> +            goto out;
>>> +        }
>>> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
>>> +            || memchr(nptr, 'E', endptr - nptr)) {
>>> +            endptr = nptr;
>>> +            retval = -EINVAL;
>>> +            goto out;
>>> +        }
>>> +        if (fraction != 0) {
>>> +            mul_required = true;
>>> +        }
>>>       }
>>>       c = *endptr;
>>>       mul = suffix_mul(c, unit);
>>> -    if (mul >= 0) {
>>> +    if (mul > 0) {
>>>           endptr++;
>>>       } else {
>>>           mul = suffix_mul(default_suffix, unit);
>>> -        assert(mul >= 0);
>>> +        assert(mul > 0);
>>>       }
>>>       if (mul == 1 && mul_required) {
>>> +        endptr = nptr;
>>>           retval = -EINVAL;
>>>           goto out;
>>>       }
>>> -    /*
>>> -     * Values near UINT64_MAX overflow to 2**64 when converting to double
>>> -     * precision.  Compare against the maximum representable double precision
>>> -     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
>>> -     * the direction of 0".
>>> -     */
>>> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
>>> +    if (val > UINT64_MAX / mul) {
>>
>> Hmm, do we care about:
>> 15.9999999999999999999999999999E
>> where the fractional portion becomes large enough to actually bump our
>> sum below to 16E which indeed overflows?  Then again, we rejected a
>> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0
>> due to rounding.
>> Maybe it's just worth a good comment why the overflow check here works
>> without consulting fraction.
> 
> worth a good comment, because I don't follow :)
> 
> If mul is big enough and fraction=0.5, why val*mul + fraction*mul will not overflow?
> 
> Also, if we find '.' in the number, why not just reparse the whole number with qemu_strtod_finite? And don't care about 1.0?

Ah understand now, because mul is a power of two, so 2^64 - (val * mul) >= mul. Still, is it possible that (due to float calculations limited precision) some double close to 1 (but still < 1.0) being multiplied to mul will result in something >= mul?

Also I failed to understand what you write because I forget about E=exbibyte and thought only about exponent

> 
>>
>>>           retval = -ERANGE;
>>>           goto out;
>>>       }
>>> -    *result = val * mul;
>>> +    *result = val * mul + (uint64_t) (fraction * mul);
>>>       retval = 0;
>>>
>>>   out:
> 
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake
@ 2021-02-05 10:25   ` Vladimir Sementsov-Ogievskiy
  2021-02-05 10:31     ` Richard W.M. Jones
  2021-02-05 13:38     ` Eric Blake
  2021-02-05 11:13   ` Daniel P. Berrangé
  1 sibling, 2 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:25 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: rjones, berrange, armbru, qemu-block, tao3.xu,
	reviewer:Incompatible changes

04.02.2021 22:07, Eric Blake wrote:
> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix

What about also deprecating 'E' suffix? (just my problem of reviewing previous patch)

> that is ambiguous between a hex digit and the extremely large exibyte
> suffix, as well as a 'B' suffix for bytes.  In practice, people using
> hex inputs are specifying values in bytes (and would have written
> 0x2000000, or possibly relied on default_suffix in the case of
> qemu_strtosz_MiB), and the use of scaling suffixes makes the most
> sense for inputs in decimal (where the user would write 32M).  But
> rather than outright dropping support for hex-with-suffix, let's
> follow our deprecation policy.  Sadly, since qemu_strtosz() does not
> have an Err** parameter, we pollute to stderr.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   docs/system/deprecated.rst | 8 ++++++++
>   util/cutils.c              | 6 +++++-
>   2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> index 6ac757ed9fa7..c404c3926e6f 100644
> --- a/docs/system/deprecated.rst
> +++ b/docs/system/deprecated.rst
> @@ -146,6 +146,14 @@ library enabled as a cryptography provider.
>   Neither the ``nettle`` library, or the built-in cryptography provider are
>   supported on FIPS enabled hosts.
> 
> +hexadecimal sizes with scaling multipliers (since 6.0)
> +''''''''''''''''''''''''''''''''''''''''''''''''''''''
> +
> +Input parameters that take a size value should only use a size suffix
> +(such as 'k' or 'M') when the base is written in decimal, and not when
> +the value is hexadecimal.  That is, '0x20M' is deprecated, and should
> +be written either as '32M' or as '0x2000000'.
> +
>   QEMU Machine Protocol (QMP) commands
>   ------------------------------------
> 
> diff --git a/util/cutils.c b/util/cutils.c
> index 0234763bd70b..75190565cbb5 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char **end,
>       int retval;
>       const char *endptr;
>       unsigned char c;
> -    bool mul_required = false;
> +    bool mul_required = false, hex = false;
>       uint64_t val;
>       int64_t mul;
>       double fraction = 0.0;
> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end,

you forget to set hex to true in corresponding if(){...}

>       c = *endptr;
>       mul = suffix_mul(c, unit);
>       if (mul > 0) {
> +        if (hex) {
> +            fprintf(stderr, "Using a multiplier suffix on hex numbers "
> +                    "is deprecated: %s\n", nptr);
> +        }
>           endptr++;
>       } else {
>           mul = suffix_mul(default_suffix, unit);

should we also deprecate hex where default_suffix is not 'B' ?


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
  2021-02-04 20:12   ` Eric Blake
  2021-02-05 10:07   ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 10:28   ` Richard W.M. Jones
  2021-02-05 14:15     ` Eric Blake
  2021-02-05 11:02   ` Daniel P. Berrangé
  2021-02-05 11:34   ` Daniel P. Berrangé
  4 siblings, 1 reply; 31+ messages in thread
From: Richard W.M. Jones @ 2021-02-05 10:28 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu,
	qemu-devel, armbru, Max Reitz

On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote:
> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
> the keyval visitor), and it gets annoying that edge-case testing is
> impacted by implicit rounding to 53 bits of precision due to parsing
> with strtod().  As an example posted by Rich Jones:
>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>  write failed: Input/output error
> 
> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
> out of bounds.
> 
> It is also worth noting that our existing parser, by virtue of using
> strtod(), accepts decimal AND hex numbers, even though test-cutils
> previously lacked any coverage of the latter.  We do have existing
> clients that expect a hex parse to work (for example, iotest 33 using
> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
> rather than as an invalid octal number, so we know there are no
> clients that depend on octal.  Our use of strtod() also means that
> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
> than 1843 (if the fraction were 8/10); but as this was not covered in
> the testsuite, I have no qualms forbidding hex fractions as invalid,
> so this patch declares that the use of fractions is only supported
> with decimal input, and enhances the testsuite to document that.
> 
> Our previous use of strtod() meant that -1 parsed as a negative; now
> that we parse with strtoull(), negative values can wrap around module

^^ modulo

The patch looked fine to me although Vladimir found some problems
which I didn't spot.  I have a question: What happens with leading or
trailing whitespace?  Is that ignored, rejected or impossible?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-05 10:25   ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 10:31     ` Richard W.M. Jones
  2021-02-05 13:38     ` Eric Blake
  1 sibling, 0 replies; 31+ messages in thread
From: Richard W.M. Jones @ 2021-02-05 10:31 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: berrange, qemu-block, reviewer:Incompatible changes, tao3.xu,
	qemu-devel, armbru

On Fri, Feb 05, 2021 at 01:25:18PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 04.02.2021 22:07, Eric Blake wrote:
> >Supporting '0x20M' looks odd, particularly since we have an 'E' suffix
> 
> What about also deprecating 'E' suffix? (just my problem of reviewing previous patch)

Ha! What if people want to specify exabytes?!  That actually works at
the moment:

$ nbdkit memory 7E --run 'qemu-io -f raw -c "r -v 1E 512" "$uri"'

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
  2021-02-04 20:02   ` Eric Blake
@ 2021-02-05 10:34   ` Richard W.M. Jones
  2021-02-05 14:19     ` Eric Blake
  2021-02-05 10:38   ` Vladimir Sementsov-Ogievskiy
  2021-02-05 11:10   ` Daniel P. Berrangé
  3 siblings, 1 reply; 31+ messages in thread
From: Richard W.M. Jones @ 2021-02-05 10:34 UTC (permalink / raw)
  To: Eric Blake; +Cc: vsementsov, berrange, qemu-block, tao3.xu, qemu-devel, armbru

On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote:
> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
> happen to truncate it to 1126.  Our use of fractional sizes is
> intended for convenience, but when a user specifies a fraction that is
> not a clean translation to binary, truncating/rounding behind their
> backs can cause confusion.  Better is to deprecate inexact values,
> which still leaves '1.5k' as valid, but alerts the user to spell out
> their values as a precise byte number in cases where they are
> currently being rounded.
> 
> Note that values like '0.1G' in the testsuite need adjustment as a
> result.
> 
> Sadly, since qemu_strtosz() does not have an Err** parameter, we
> pollute to stderr.

Does "deprecate" mean there's a plan to eventually remove this?  If so
I think it should say that (in docs/system/deprecated.rst I think).

I think it would be better to remove this now or in the future since
it's a trap for users.

Rich.

> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  tests/test-cutils.c    | 4 ++--
>  tests/test-keyval.c    | 4 ++--
>  tests/test-qemu-opts.c | 4 ++--
>  util/cutils.c          | 4 ++++
>  4 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/test-cutils.c b/tests/test-cutils.c
> index 0c2c89d6f113..ad51fb1baa51 100644
> --- a/tests/test-cutils.c
> +++ b/tests/test-cutils.c
> @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void)
> 
>  static void test_qemu_strtosz_float(void)
>  {
> -    const char *str = "12.345M";
> +    const char *str = "12.125M";
>      int err;
>      const char *endptr;
>      uint64_t res = 0xbaadf00d;
> 
>      err = qemu_strtosz(str, &endptr, &res);
>      g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 12.345 * MiB);
> +    g_assert_cmpint(res, ==, 12.125 * MiB);
>      g_assert(endptr == str + 7);
>  }
> 
> diff --git a/tests/test-keyval.c b/tests/test-keyval.c
> index 13be763650b2..c951ac54cd23 100644
> --- a/tests/test-keyval.c
> +++ b/tests/test-keyval.c
> @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void)
>      visit_free(v);
> 
>      /* Suffixes */
> -    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T",
> +    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T",
>                           NULL, NULL, &error_abort);
>      v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
>      qobject_unref(qdict);
> @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void)
>      visit_type_size(v, "sz3", &sz, &error_abort);
>      g_assert_cmphex(sz, ==, 2 * MiB);
>      visit_type_size(v, "sz4", &sz, &error_abort);
> -    g_assert_cmphex(sz, ==, GiB / 10);
> +    g_assert_cmphex(sz, ==, GiB / 8);
>      visit_type_size(v, "sz5", &sz, &error_abort);
>      g_assert_cmphex(sz, ==, 16777215ULL * TiB);
>      visit_check_struct(v, &error_abort);
> diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c
> index f79b698e6715..6a1ea1d01c4f 100644
> --- a/tests/test-qemu-opts.c
> +++ b/tests/test-qemu-opts.c
> @@ -717,10 +717,10 @@ static void test_opts_parse_size(void)
>      g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8);
>      g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536);
>      g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB);
> -    opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T",
> +    opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T",
>                             false, &error_abort);
>      g_assert_cmpuint(opts_count(opts), ==, 2);
> -    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10);
> +    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8);
>      g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB);
> 
>      /* Beyond limit with suffix */
> diff --git a/util/cutils.c b/util/cutils.c
> index 75190565cbb5..5ec6101ae778 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end,
>          retval = -ERANGE;
>          goto out;
>      }
> +    if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) {
> +        fprintf(stderr, "Using a fractional size that is not an exact byte "
> +                "multiple is deprecated: %s\n", nptr);
> +    }
>      *result = val * mul + (uint64_t) (fraction * mul);
>      retval = 0;
> 
> -- 
> 2.30.0

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
  2021-02-04 20:02   ` Eric Blake
  2021-02-05 10:34   ` Richard W.M. Jones
@ 2021-02-05 10:38   ` Vladimir Sementsov-Ogievskiy
  2021-02-05 11:10   ` Daniel P. Berrangé
  3 siblings, 0 replies; 31+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:38 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: rjones, berrange, armbru, qemu-block, tao3.xu

04.02.2021 22:07, Eric Blake wrote:
> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
> happen to truncate it to 1126.  Our use of fractional sizes is
> intended for convenience, but when a user specifies a fraction that is
> not a clean translation to binary, truncating/rounding behind their
> backs can cause confusion.  Better is to deprecate inexact values,
> which still leaves '1.5k' as valid, but alerts the user to spell out
> their values as a precise byte number in cases where they are
> currently being rounded.
> 
> Note that values like '0.1G' in the testsuite need adjustment as a
> result.
> 

Honestly, I don't agree with the reasoning. User shouldn't expect something extremely precise when he (is it correct use "he" here, or what is the right word?) specify fractional size. I doubt that someone use fractional sizes for production.

Also, this will break almost everything except for 0.5.. I don't think such deprecation will force people to change their habits to use 0.125 instead of 0.1. Simpler is to move to integrals and don't care.

So, in this way we can just deprecate fractional sizes at all and don't care with them anymore. Or allow only "0.5" fraction as the only exclusion :)

> Sadly, since qemu_strtosz() does not have an Err** parameter, we
> pollute to stderr.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   tests/test-cutils.c    | 4 ++--
>   tests/test-keyval.c    | 4 ++--
>   tests/test-qemu-opts.c | 4 ++--
>   util/cutils.c          | 4 ++++
>   4 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/test-cutils.c b/tests/test-cutils.c
> index 0c2c89d6f113..ad51fb1baa51 100644
> --- a/tests/test-cutils.c
> +++ b/tests/test-cutils.c
> @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void)
> 
>   static void test_qemu_strtosz_float(void)
>   {
> -    const char *str = "12.345M";
> +    const char *str = "12.125M";
>       int err;
>       const char *endptr;
>       uint64_t res = 0xbaadf00d;
> 
>       err = qemu_strtosz(str, &endptr, &res);
>       g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 12.345 * MiB);
> +    g_assert_cmpint(res, ==, 12.125 * MiB);
>       g_assert(endptr == str + 7);
>   }
> 
> diff --git a/tests/test-keyval.c b/tests/test-keyval.c
> index 13be763650b2..c951ac54cd23 100644
> --- a/tests/test-keyval.c
> +++ b/tests/test-keyval.c
> @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void)
>       visit_free(v);
> 
>       /* Suffixes */
> -    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T",
> +    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T",
>                            NULL, NULL, &error_abort);
>       v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
>       qobject_unref(qdict);
> @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void)
>       visit_type_size(v, "sz3", &sz, &error_abort);
>       g_assert_cmphex(sz, ==, 2 * MiB);
>       visit_type_size(v, "sz4", &sz, &error_abort);
> -    g_assert_cmphex(sz, ==, GiB / 10);
> +    g_assert_cmphex(sz, ==, GiB / 8);
>       visit_type_size(v, "sz5", &sz, &error_abort);
>       g_assert_cmphex(sz, ==, 16777215ULL * TiB);
>       visit_check_struct(v, &error_abort);
> diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c
> index f79b698e6715..6a1ea1d01c4f 100644
> --- a/tests/test-qemu-opts.c
> +++ b/tests/test-qemu-opts.c
> @@ -717,10 +717,10 @@ static void test_opts_parse_size(void)
>       g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8);
>       g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536);
>       g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB);
> -    opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T",
> +    opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T",
>                              false, &error_abort);
>       g_assert_cmpuint(opts_count(opts), ==, 2);
> -    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10);
> +    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8);
>       g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB);
> 
>       /* Beyond limit with suffix */
> diff --git a/util/cutils.c b/util/cutils.c
> index 75190565cbb5..5ec6101ae778 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end,
>           retval = -ERANGE;
>           goto out;
>       }
> +    if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) {
> +        fprintf(stderr, "Using a fractional size that is not an exact byte "
> +                "multiple is deprecated: %s\n", nptr);
> +    }
>       *result = val * mul + (uint64_t) (fraction * mul);
>       retval = 0;
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
                     ` (2 preceding siblings ...)
  2021-02-05 10:28   ` Richard W.M. Jones
@ 2021-02-05 11:02   ` Daniel P. Berrangé
  2021-02-05 14:27     ` Eric Blake
  2021-02-05 11:34   ` Daniel P. Berrangé
  4 siblings, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 11:02 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru,
	qemu-devel, Max Reitz

On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote:
> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
> the keyval visitor), and it gets annoying that edge-case testing is
> impacted by implicit rounding to 53 bits of precision due to parsing
> with strtod().  As an example posted by Rich Jones:
>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>  write failed: Input/output error
> 
> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
> out of bounds.
> 
> It is also worth noting that our existing parser, by virtue of using
> strtod(), accepts decimal AND hex numbers, even though test-cutils
> previously lacked any coverage of the latter.  We do have existing
> clients that expect a hex parse to work (for example, iotest 33 using
> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
> rather than as an invalid octal number, so we know there are no
> clients that depend on octal.  Our use of strtod() also means that
> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
> than 1843 (if the fraction were 8/10); but as this was not covered in
> the testsuite, I have no qualms forbidding hex fractions as invalid,
> so this patch declares that the use of fractions is only supported
> with decimal input, and enhances the testsuite to document that.
> 
> Our previous use of strtod() meant that -1 parsed as a negative; now
> that we parse with strtoull(), negative values can wrap around module
> 2^64, so we have to explicitly check whether the user passed in a '-'.
> 
> We also had no testsuite coverage of "1.1e0k", which happened to parse
> under strtod() but is unlikely to occur in practice; as long as we are
> making things more robust, it is easy enough to reject the use of
> exponents in a strtod parse.
> 
> The fix is done by breaking the parse into an integer prefix (no loss
> in precision), rejecting negative values (since we can no longer rely
> on strtod() to do that), determining if a decimal or hexadecimal parse
> was intended (with the new restriction that a fractional hex parse is
> not allowed), and where appropriate, using a floating point fractional
> parse (where we also scan to reject use of exponents in the fraction).
> The bulk of the patch is then updates to the testsuite to match our
> new precision, as well as adding new cases we reject (whether they
> were rejected or inadvertenly accepted before).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 


> diff --git a/util/cutils.c b/util/cutils.c
> index 0b5073b33012..0234763bd70b 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -241,10 +241,21 @@ static int64_t suffix_mul(char suffix, int64_t unit)
>  }
> 
>  /*
> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
> - * other error.
> + * Convert size string to bytes.
> + *
> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
> + * T/t for TB, with scaling based on @unit, and with @default_suffix
> + * implied if no explicit suffix was given.
> + *
> + * The end pointer will be returned in *end, if not NULL.  If there is
> + * no fraction, the input can be decimal or hexadecimal; if there is a
> + * fraction, then the input must be decimal and there must be a suffix
> + * (possibly by @default_suffix) larger than Byte, and the fractional
> + * portion may suffer from precision loss or rounding.  The input must
> + * be positive.

Even though the test suite gives some illustrations, I think we should
document here the patterns we're intending to support. IIUC, we aim for

[quote]
The size parsing supports the following syntaxes

 - 12345   - decimal, bytes
 - 12345{bBkKmMgGtT} - decimal, scaled bytes
 - 12345.678 - fractional decimal, bytes
 - 12345.678{bBkKmMgGtT} - fractional decimal, scaled bytes
 - 0x7FEE  - hex, bytes

The following are intentionally not supported

 - octal
 - fractional hex
 - floating point exponents
[/quote]

> + *
> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
> + * other error (with *@end left unchanged).
>   */

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
                     ` (2 preceding siblings ...)
  2021-02-05 10:38   ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 11:10   ` Daniel P. Berrangé
  2021-02-05 14:28     ` Eric Blake
  3 siblings, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 11:10 UTC (permalink / raw)
  To: Eric Blake; +Cc: vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel

On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote:
> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
> happen to truncate it to 1126.  Our use of fractional sizes is
> intended for convenience, but when a user specifies a fraction that is
> not a clean translation to binary, truncating/rounding behind their
> backs can cause confusion.  Better is to deprecate inexact values,
> which still leaves '1.5k' as valid, but alerts the user to spell out
> their values as a precise byte number in cases where they are
> currently being rounded.

I don't think we should be deprecating this, as I think it makes
it very user hostile.  Users who require exact answers, won't be
using fractional syntax in the first place. IOW, by using fractional
syntax you've decided that approximation is acceptable. Given that,
I should not have to worry about whether or not the fraction I'm
using is exact or truncated. It is horrible usability to say that
"1.1k" is invalid, while "1.5k" is valid - both are valid from my
POV as a user of this.



> Note that values like '0.1G' in the testsuite need adjustment as a
> result.
> 
> Sadly, since qemu_strtosz() does not have an Err** parameter, we
> pollute to stderr.

This is only an warning, so setting an Err ** would not be appropriate
right now.

None the less we should add an Err **, because many of the callers
want an Err ** object populated, or use error_report().

> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  tests/test-cutils.c    | 4 ++--
>  tests/test-keyval.c    | 4 ++--
>  tests/test-qemu-opts.c | 4 ++--
>  util/cutils.c          | 4 ++++
>  4 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/test-cutils.c b/tests/test-cutils.c
> index 0c2c89d6f113..ad51fb1baa51 100644
> --- a/tests/test-cutils.c
> +++ b/tests/test-cutils.c
> @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void)
> 
>  static void test_qemu_strtosz_float(void)
>  {
> -    const char *str = "12.345M";
> +    const char *str = "12.125M";
>      int err;
>      const char *endptr;
>      uint64_t res = 0xbaadf00d;
> 
>      err = qemu_strtosz(str, &endptr, &res);
>      g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 12.345 * MiB);
> +    g_assert_cmpint(res, ==, 12.125 * MiB);
>      g_assert(endptr == str + 7);
>  }
> 
> diff --git a/tests/test-keyval.c b/tests/test-keyval.c
> index 13be763650b2..c951ac54cd23 100644
> --- a/tests/test-keyval.c
> +++ b/tests/test-keyval.c
> @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void)
>      visit_free(v);
> 
>      /* Suffixes */
> -    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T",
> +    qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T",
>                           NULL, NULL, &error_abort);
>      v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
>      qobject_unref(qdict);
> @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void)
>      visit_type_size(v, "sz3", &sz, &error_abort);
>      g_assert_cmphex(sz, ==, 2 * MiB);
>      visit_type_size(v, "sz4", &sz, &error_abort);
> -    g_assert_cmphex(sz, ==, GiB / 10);
> +    g_assert_cmphex(sz, ==, GiB / 8);
>      visit_type_size(v, "sz5", &sz, &error_abort);
>      g_assert_cmphex(sz, ==, 16777215ULL * TiB);
>      visit_check_struct(v, &error_abort);
> diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c
> index f79b698e6715..6a1ea1d01c4f 100644
> --- a/tests/test-qemu-opts.c
> +++ b/tests/test-qemu-opts.c
> @@ -717,10 +717,10 @@ static void test_opts_parse_size(void)
>      g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8);
>      g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536);
>      g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB);
> -    opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T",
> +    opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T",
>                             false, &error_abort);
>      g_assert_cmpuint(opts_count(opts), ==, 2);
> -    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10);
> +    g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8);
>      g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB);
> 
>      /* Beyond limit with suffix */
> diff --git a/util/cutils.c b/util/cutils.c
> index 75190565cbb5..5ec6101ae778 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end,
>          retval = -ERANGE;
>          goto out;
>      }
> +    if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) {
> +        fprintf(stderr, "Using a fractional size that is not an exact byte "
> +                "multiple is deprecated: %s\n", nptr);
> +    }
>      *result = val * mul + (uint64_t) (fraction * mul);
>      retval = 0;
> 
> -- 
> 2.30.0
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake
  2021-02-05 10:25   ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 11:13   ` Daniel P. Berrangé
  2021-02-05 13:40     ` Eric Blake
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 11:13 UTC (permalink / raw)
  To: Eric Blake
  Cc: vsementsov, qemu-block, rjones, reviewer:Incompatible changes,
	tao3.xu, armbru, qemu-devel

On Thu, Feb 04, 2021 at 01:07:07PM -0600, Eric Blake wrote:
> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix
> that is ambiguous between a hex digit and the extremely large exibyte
> suffix, as well as a 'B' suffix for bytes.  In practice, people using
> hex inputs are specifying values in bytes (and would have written
> 0x2000000, or possibly relied on default_suffix in the case of
> qemu_strtosz_MiB), and the use of scaling suffixes makes the most
> sense for inputs in decimal (where the user would write 32M).  But
> rather than outright dropping support for hex-with-suffix, let's
> follow our deprecation policy.  Sadly, since qemu_strtosz() does not
> have an Err** parameter, we pollute to stderr.

Err** is only appropriate for errors, not warnings, so this statement
can be removed.

> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  docs/system/deprecated.rst | 8 ++++++++
>  util/cutils.c              | 6 +++++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> index 6ac757ed9fa7..c404c3926e6f 100644
> --- a/docs/system/deprecated.rst
> +++ b/docs/system/deprecated.rst
> @@ -146,6 +146,14 @@ library enabled as a cryptography provider.
>  Neither the ``nettle`` library, or the built-in cryptography provider are
>  supported on FIPS enabled hosts.
> 
> +hexadecimal sizes with scaling multipliers (since 6.0)
> +''''''''''''''''''''''''''''''''''''''''''''''''''''''
> +
> +Input parameters that take a size value should only use a size suffix
> +(such as 'k' or 'M') when the base is written in decimal, and not when
> +the value is hexadecimal.  That is, '0x20M' is deprecated, and should
> +be written either as '32M' or as '0x2000000'.
> +
>  QEMU Machine Protocol (QMP) commands
>  ------------------------------------
> 
> diff --git a/util/cutils.c b/util/cutils.c
> index 0234763bd70b..75190565cbb5 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char **end,
>      int retval;
>      const char *endptr;
>      unsigned char c;
> -    bool mul_required = false;
> +    bool mul_required = false, hex = false;
>      uint64_t val;
>      int64_t mul;
>      double fraction = 0.0;
> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end,
>      c = *endptr;
>      mul = suffix_mul(c, unit);
>      if (mul > 0) {
> +        if (hex) {
> +            fprintf(stderr, "Using a multiplier suffix on hex numbers "
> +                    "is deprecated: %s\n", nptr);

warn_report(), since IIUC, that'll get into HMP response correctly.

> +        }
>          endptr++;
>      } else {
>          mul = suffix_mul(default_suffix, unit);

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
                     ` (3 preceding siblings ...)
  2021-02-05 11:02   ` Daniel P. Berrangé
@ 2021-02-05 11:34   ` Daniel P. Berrangé
  2021-02-05 14:36     ` Eric Blake
  4 siblings, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 11:34 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru,
	qemu-devel, Max Reitz

On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote:
> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
> the keyval visitor), and it gets annoying that edge-case testing is
> impacted by implicit rounding to 53 bits of precision due to parsing
> with strtod().  As an example posted by Rich Jones:
>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>  write failed: Input/output error
> 
> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
> out of bounds.
> 
> It is also worth noting that our existing parser, by virtue of using
> strtod(), accepts decimal AND hex numbers, even though test-cutils
> previously lacked any coverage of the latter.  We do have existing
> clients that expect a hex parse to work (for example, iotest 33 using
> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
> rather than as an invalid octal number, so we know there are no
> clients that depend on octal.  Our use of strtod() also means that
> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
> than 1843 (if the fraction were 8/10); but as this was not covered in
> the testsuite, I have no qualms forbidding hex fractions as invalid,
> so this patch declares that the use of fractions is only supported
> with decimal input, and enhances the testsuite to document that.
> 
> Our previous use of strtod() meant that -1 parsed as a negative; now
> that we parse with strtoull(), negative values can wrap around module
> 2^64, so we have to explicitly check whether the user passed in a '-'.
> 
> We also had no testsuite coverage of "1.1e0k", which happened to parse
> under strtod() but is unlikely to occur in practice; as long as we are
> making things more robust, it is easy enough to reject the use of
> exponents in a strtod parse.
> 
> The fix is done by breaking the parse into an integer prefix (no loss
> in precision), rejecting negative values (since we can no longer rely
> on strtod() to do that), determining if a decimal or hexadecimal parse
> was intended (with the new restriction that a fractional hex parse is
> not allowed), and where appropriate, using a floating point fractional
> parse (where we also scan to reject use of exponents in the fraction).
> The bulk of the patch is then updates to the testsuite to match our
> new precision, as well as adding new cases we reject (whether they
> were rejected or inadvertenly accepted before).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> 
> Note that this approach differs from what has been attempted in the
> past; see the thread starting at
> https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html
> That approach tried to parse both as strtoull and strtod and take
> whichever was longer, but that was harder to document.
> ---
>  tests/test-cutils.c        | 79 ++++++++++++++++++++++++++++++--------
>  tests/test-keyval.c        | 24 +++++++++---
>  tests/test-qemu-opts.c     | 20 +++++++---
>  util/cutils.c              | 77 +++++++++++++++++++++++++++----------
>  tests/qemu-iotests/049.out |  7 +++-
>  5 files changed, 156 insertions(+), 51 deletions(-)
> 
> diff --git a/tests/test-cutils.c b/tests/test-cutils.c
> index 1aa8351520ae..0c2c89d6f113 100644
> --- a/tests/test-cutils.c
> +++ b/tests/test-cutils.c
> @@ -1960,6 +1960,13 @@ static void test_qemu_strtosz_simple(void)
>      g_assert_cmpint(res, ==, 0);
>      g_assert(endptr == str + 1);
> 
> +    /* Leading 0 gives decimal results, not octal */
> +    str = "08";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 8);
> +    g_assert(endptr == str + 2);
> +
>      str = "12345";
>      err = qemu_strtosz(str, &endptr, &res);
>      g_assert_cmpint(err, ==, 0);
> @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void)
>      g_assert_cmpint(err, ==, 0);
>      g_assert_cmpint(res, ==, 12345);
> 
> -    /* Note: precision is 53 bits since we're parsing with strtod() */
> +    /* Note: If there is no '.', we get full 64 bits of precision */

IIUC, our goal is that we should never loose precision for the
non-fractional part. 

> 
>      str = "9007199254740991"; /* 2^53-1 */
>      err = qemu_strtosz(str, &endptr, &res);
> @@ -1987,7 +1994,7 @@ static void test_qemu_strtosz_simple(void)
>      str = "9007199254740993"; /* 2^53+1 */
>      err = qemu_strtosz(str, &endptr, &res);
>      g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 0x20000000000000); /* rounded to 53 bits */
> +    g_assert_cmpint(res, ==, 0x20000000000001);
>      g_assert(endptr == str + 16);
> 
>      str = "18446744073709549568"; /* 0xfffffffffffff800 (53 msbs set) */
> @@ -1999,11 +2006,35 @@ static void test_qemu_strtosz_simple(void)
>      str = "18446744073709550591"; /* 0xfffffffffffffbff */
>      err = qemu_strtosz(str, &endptr, &res);
>      g_assert_cmpint(err, ==, 0);
> -    g_assert_cmpint(res, ==, 0xfffffffffffff800); /* rounded to 53 bits */
> +    g_assert_cmpint(res, ==, 0xfffffffffffffbff);
>      g_assert(endptr == str + 20);
> 
> -    /* 0x7ffffffffffffe00..0x7fffffffffffffff get rounded to
> -     * 0x8000000000000000, thus -ERANGE; see test_qemu_strtosz_erange() */
> +    str = "18446744073709551615"; /* 0xffffffffffffffff */
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 0xffffffffffffffff);
> +    g_assert(endptr == str + 20);
> +
> +}
> +
> +static void test_qemu_strtosz_hex(void)
> +{
> +    const char *str;
> +    const char *endptr;
> +    int err;
> +    uint64_t res = 0xbaadf00d;
> +
> +    str = "0x0";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 0);
> +    g_assert(endptr == str + 3);
> +
> +    str = "0xa";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, 0);
> +    g_assert_cmpint(res, ==, 10);
> +    g_assert(endptr == str + 3);

I'd stick a  '0xab' or "0xae" in there, to demonstrate that we're
not accidentally applyin the scaling units for "b" and "e" in hex.

>  }
> 
>  static void test_qemu_strtosz_units(void)
> @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void)
>      err = qemu_strtosz(str, &endptr, &res);
>      g_assert_cmpint(err, ==, -EINVAL);
>      g_assert(endptr == str);
> +
> +    str = "1.1e5";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, -EINVAL);
> +    g_assert(endptr == str);
> +
> +    str = "1.1B";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, -EINVAL);
> +    g_assert(endptr == str);
> +
> +    str = "0x1.8k";
> +    err = qemu_strtosz(str, &endptr, &res);
> +    g_assert_cmpint(err, ==, -EINVAL);
> +    g_assert(endptr == str);
>  }

A test case to reject negative numers ?

> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end,
>      int retval;
>      const char *endptr;
>      unsigned char c;
> -    int mul_required = 0;
> -    double val, mul, integral, fraction;
> +    bool mul_required = false;
> +    uint64_t val;
> +    int64_t mul;
> +    double fraction = 0.0;
> 
> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
> +    /* Parse integral portion as decimal. */
> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>      if (retval) {
>          goto out;
>      }
> -    fraction = modf(val, &integral);
> -    if (fraction != 0) {
> -        mul_required = 1;
> +    if (strchr(nptr, '-') != NULL) {
> +        retval = -ERANGE;
> +        goto out;
> +    }
> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {

This works as an approach but it might be more clear to
just check for hex upfront.

  if (g_str_has_prefix("0x", val) ||
      g_str_has_prefix("0X", val)) {
      ... do hex parse..
  } else {
      ... do dec parse..
  }
      

> +        /* Input looks like hex, reparse, and insist on no fraction. */
> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
> +        if (retval) {
> +            goto out;
> +        }
> +        if (*endptr == '.') {
> +            endptr = nptr;
> +            retval = -EINVAL;
> +            goto out;
> +        }
> +    } else if (*endptr == '.') {
> +        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
> +        if (retval) {
> +            endptr = nptr;
> +            goto out;
> +        }
> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
> +            || memchr(nptr, 'E', endptr - nptr)) {

Can this be simplified ?  Fraction can be > 1, if-and only-if exponent
syntax is used IIUC.

Shouldn't we be passing 'endptr+1' as the first arg to memchr ?

nptr points to the leading decimal part, and we only need to
scan the fractional part.

Also what happens with   "1.1e" - that needs to be treated
as a exabyte suffix, and not rejected as an exponent. We
ought to test that if we don't already.

> +            endptr = nptr;
> +            retval = -EINVAL;
> +            goto out;
> +        }
> +        if (fraction != 0) {
> +            mul_required = true;
> +        }
>      }
>      c = *endptr;
>      mul = suffix_mul(c, unit);
> -    if (mul >= 0) {
> +    if (mul > 0) {
>          endptr++;
>      } else {
>          mul = suffix_mul(default_suffix, unit);
> -        assert(mul >= 0);
> +        assert(mul > 0);
>      }
>      if (mul == 1 && mul_required) {
> +        endptr = nptr;
>          retval = -EINVAL;
>          goto out;
>      }
> -    /*
> -     * Values near UINT64_MAX overflow to 2**64 when converting to double
> -     * precision.  Compare against the maximum representable double precision
> -     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
> -     * the direction of 0".
> -     */
> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
> +    if (val > UINT64_MAX / mul) {
>          retval = -ERANGE;
>          goto out;
>      }
> -    *result = val * mul;
> +    *result = val * mul + (uint64_t) (fraction * mul);
>      retval = 0;
> 
>  out:

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-05 10:25   ` Vladimir Sementsov-Ogievskiy
  2021-02-05 10:31     ` Richard W.M. Jones
@ 2021-02-05 13:38     ` Eric Blake
  1 sibling, 0 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-05 13:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: berrange, qemu-block, reviewer:Incompatible changes, tao3.xu,
	rjones, armbru

On 2/5/21 4:25 AM, Vladimir Sementsov-Ogievskiy wrote:
> 04.02.2021 22:07, Eric Blake wrote:
>> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix
> 
> What about also deprecating 'E' suffix? (just my problem of reviewing
> previous patch)

No, we want to keep '1E' as a valid way to spell 1 exabyte.  That has
uses in the wild (admittedly corner-case, as that is a LOT of storage).
It is only '0x1E' where the use of 'E' as a hex digit has priority over
'E' as a suffix meaning exabyte.

> 
>> that is ambiguous between a hex digit and the extremely large exibyte
>> suffix, as well as a 'B' suffix for bytes.  In practice, people using
>> hex inputs are specifying values in bytes (and would have written
>> 0x2000000, or possibly relied on default_suffix in the case of
>> qemu_strtosz_MiB), and the use of scaling suffixes makes the most
>> sense for inputs in decimal (where the user would write 32M).  But
>> rather than outright dropping support for hex-with-suffix, let's
>> follow our deprecation policy.  Sadly, since qemu_strtosz() does not
>> have an Err** parameter, we pollute to stderr.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---

>> +++ b/util/cutils.c
>> @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char
>> **end,
>>       int retval;
>>       const char *endptr;
>>       unsigned char c;
>> -    bool mul_required = false;
>> +    bool mul_required = false, hex = false;
>>       uint64_t val;
>>       int64_t mul;
>>       double fraction = 0.0;
>> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const
>> char **end,
> 
> you forget to set hex to true in corresponding if(){...}
> 
>>       c = *endptr;
>>       mul = suffix_mul(c, unit);
>>       if (mul > 0) {
>> +        if (hex) {
>> +            fprintf(stderr, "Using a multiplier suffix on hex numbers "
>> +                    "is deprecated: %s\n", nptr);
>> +        }

D'oh.  Now I get to rerun my tests to see when the warning triggers.

>>           endptr++;
>>       } else {
>>           mul = suffix_mul(default_suffix, unit);
> 
> should we also deprecate hex where default_suffix is not 'B' ?

That's exactly what this patch is (supposed to be) doing.  If we parsed
a hex number, and there was an explicit suffix at all (which is
necessarily neither 'B' nor 'E', since those were already consumed while
parsing the hex number), issue a warning.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-05 11:13   ` Daniel P. Berrangé
@ 2021-02-05 13:40     ` Eric Blake
  2021-02-05 14:02       ` Daniel P. Berrangé
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2021-02-05 13:40 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: vsementsov, qemu-block, rjones, reviewer:Incompatible changes,
	tao3.xu, armbru, qemu-devel

On 2/5/21 5:13 AM, Daniel P. Berrangé wrote:
> On Thu, Feb 04, 2021 at 01:07:07PM -0600, Eric Blake wrote:
>> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix
>> that is ambiguous between a hex digit and the extremely large exibyte
>> suffix, as well as a 'B' suffix for bytes.  In practice, people using
>> hex inputs are specifying values in bytes (and would have written
>> 0x2000000, or possibly relied on default_suffix in the case of
>> qemu_strtosz_MiB), and the use of scaling suffixes makes the most
>> sense for inputs in decimal (where the user would write 32M).  But
>> rather than outright dropping support for hex-with-suffix, let's
>> follow our deprecation policy.  Sadly, since qemu_strtosz() does not
>> have an Err** parameter, we pollute to stderr.
> 
> Err** is only appropriate for errors, not warnings, so this statement
> can be removed.

The point of the comment was that we have no other mechanism for
reporting the deprecation other than polluting to stderr.  And if we
ever remove the support, we'll either have to silently reject input that
we used to accept, or plumb through Err** handling to allow better
reporting of WHY we are rejecting input.  But I didn't feel like adding
Err** support now.

>> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end,
>>      c = *endptr;
>>      mul = suffix_mul(c, unit);
>>      if (mul > 0) {
>> +        if (hex) {
>> +            fprintf(stderr, "Using a multiplier suffix on hex numbers "
>> +                    "is deprecated: %s\n", nptr);
> 
> warn_report(), since IIUC, that'll get into HMP response correctly.

Yes, that sounds better.  I'll use that in v2, as well as tweaking the
commit message.

> 
>> +        }
>>          endptr++;
>>      } else {
>>          mul = suffix_mul(default_suffix, unit);
> 
> Regards,
> Daniel
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes
  2021-02-05 13:40     ` Eric Blake
@ 2021-02-05 14:02       ` Daniel P. Berrangé
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 14:02 UTC (permalink / raw)
  To: Eric Blake
  Cc: vsementsov, qemu-block, rjones, reviewer:Incompatible changes,
	tao3.xu, armbru, qemu-devel

On Fri, Feb 05, 2021 at 07:40:36AM -0600, Eric Blake wrote:
> On 2/5/21 5:13 AM, Daniel P. Berrangé wrote:
> > On Thu, Feb 04, 2021 at 01:07:07PM -0600, Eric Blake wrote:
> >> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix
> >> that is ambiguous between a hex digit and the extremely large exibyte
> >> suffix, as well as a 'B' suffix for bytes.  In practice, people using
> >> hex inputs are specifying values in bytes (and would have written
> >> 0x2000000, or possibly relied on default_suffix in the case of
> >> qemu_strtosz_MiB), and the use of scaling suffixes makes the most
> >> sense for inputs in decimal (where the user would write 32M).  But
> >> rather than outright dropping support for hex-with-suffix, let's
> >> follow our deprecation policy.  Sadly, since qemu_strtosz() does not
> >> have an Err** parameter, we pollute to stderr.
> > 
> > Err** is only appropriate for errors, not warnings, so this statement
> > can be removed.
> 
> The point of the comment was that we have no other mechanism for
> reporting the deprecation other than polluting to stderr.  And if we
> ever remove the support, we'll either have to silently reject input that
> we used to accept, or plumb through Err** handling to allow better
> reporting of WHY we are rejecting input.  But I didn't feel like adding
> Err** support now.

Yeah, I guess what I meant was that warning on stderr is the expected
standard practice for deprecations. We only need to worry about other
thing once the deprecation turns into a hard error later.

> 
> >> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end,
> >>      c = *endptr;
> >>      mul = suffix_mul(c, unit);
> >>      if (mul > 0) {
> >> +        if (hex) {
> >> +            fprintf(stderr, "Using a multiplier suffix on hex numbers "
> >> +                    "is deprecated: %s\n", nptr);
> > 
> > warn_report(), since IIUC, that'll get into HMP response correctly.
> 
> Yes, that sounds better.  I'll use that in v2, as well as tweaking the
> commit message.
> 
> > 
> >> +        }
> >>          endptr++;
> >>      } else {
> >>          mul = suffix_mul(default_suffix, unit);
> > 
> > Regards,
> > Daniel
> > 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 10:06     ` Vladimir Sementsov-Ogievskiy
  2021-02-05 10:18       ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 14:06       ` Eric Blake
  2021-02-05 14:10         ` Daniel P. Berrangé
  1 sibling, 1 reply; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, armbru, Max Reitz

On 2/5/21 4:06 AM, Vladimir Sementsov-Ogievskiy wrote:

>>> -    /*
>>> -     * Values near UINT64_MAX overflow to 2**64 when converting to
>>> double
>>> -     * precision.  Compare against the maximum representable double
>>> precision
>>> -     * value below 2**64, computed as "the next value after 2**64
>>> (0x1p64) in
>>> -     * the direction of 0".
>>> -     */
>>> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
>>> +    if (val > UINT64_MAX / mul) {
>>
>> Hmm, do we care about:
>> 15.9999999999999999999999999999E
>> where the fractional portion becomes large enough to actually bump our
>> sum below to 16E which indeed overflows?  Then again, we rejected a
>> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0
>> due to rounding.
>> Maybe it's just worth a good comment why the overflow check here works
>> without consulting fraction.
> 
> worth a good comment, because I don't follow :)
> 
> If mul is big enough and fraction=0.5, why val*mul + fraction*mul will
> not overflow?

When mul is a power of 2, we know that fraction*mul does not change the
number of significant bits, but merely moves the exponent, so starting
with fraction < 1.0, we know fraction*mul < mul.  But when @unit is
1000, there is indeed a rare possibility that the multiplication will
cause an inexact answer that will trigger rounding, so we could end up
with fraction*mul == mul.  So I'm not yet 100% confident that there is
no possible combination where we can't cause an overflow to result in
val*mul + (uint64_t)(fraction*mul) resulting in 0 instead of UINT64_MAX,
and I think I will have to tighten this code up for v2.


> 
> Also, if we find '.' in the number, why not just reparse the whole
> number with qemu_strtod_finite? And don't care about 1.0?

Reparsing the whole number loses precision. Since we already have a
64-bit precise integer, why throw it away?

> 
>>
>>>           retval = -ERANGE;
>>>           goto out;
>>>       }
>>> -    *result = val * mul;
>>> +    *result = val * mul + (uint64_t) (fraction * mul);
>>>       retval = 0;
>>>
>>>   out:
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 14:06       ` Eric Blake
@ 2021-02-05 14:10         ` Daniel P. Berrangé
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 14:10 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, tao3.xu,
	qemu-devel, armbru, Max Reitz

On Fri, Feb 05, 2021 at 08:06:53AM -0600, Eric Blake wrote:
> On 2/5/21 4:06 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
> >>> -    /*
> >>> -     * Values near UINT64_MAX overflow to 2**64 when converting to
> >>> double
> >>> -     * precision.  Compare against the maximum representable double
> >>> precision
> >>> -     * value below 2**64, computed as "the next value after 2**64
> >>> (0x1p64) in
> >>> -     * the direction of 0".
> >>> -     */
> >>> -    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
> >>> +    if (val > UINT64_MAX / mul) {
> >>
> >> Hmm, do we care about:
> >> 15.9999999999999999999999999999E
> >> where the fractional portion becomes large enough to actually bump our
> >> sum below to 16E which indeed overflows?  Then again, we rejected a
> >> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0
> >> due to rounding.
> >> Maybe it's just worth a good comment why the overflow check here works
> >> without consulting fraction.
> > 
> > worth a good comment, because I don't follow :)
> > 
> > If mul is big enough and fraction=0.5, why val*mul + fraction*mul will
> > not overflow?
> 
> When mul is a power of 2, we know that fraction*mul does not change the
> number of significant bits, but merely moves the exponent, so starting
> with fraction < 1.0, we know fraction*mul < mul.  But when @unit is
> 1000, there is indeed a rare possibility that the multiplication will
> cause an inexact answer that will trigger rounding, so we could end up
> with fraction*mul == mul.  So I'm not yet 100% confident that there is
> no possible combination where we can't cause an overflow to result in
> val*mul + (uint64_t)(fraction*mul) resulting in 0 instead of UINT64_MAX,
> and I think I will have to tighten this code up for v2.
> 
> 
> > 
> > Also, if we find '.' in the number, why not just reparse the whole
> > number with qemu_strtod_finite? And don't care about 1.0?
> 
> Reparsing the whole number loses precision. Since we already have a
> 64-bit precise integer, why throw it away?

Yep, it isn't acceptable to throw away precision of the non-fractional
part of the input IMHO.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 10:07   ` Vladimir Sementsov-Ogievskiy
@ 2021-02-05 14:12     ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, rjones, armbru, Max Reitz

On 2/5/21 4:07 AM, Vladimir Sementsov-Ogievskiy wrote:

>> @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void)
>>       err = qemu_strtosz(str, &endptr, &res);
>>       g_assert_cmpint(err, ==, -EINVAL);
>>       g_assert(endptr == str);
>> +
>> +    str = "1.1e5";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, -EINVAL);
>> +    g_assert(endptr == str);
> 
> I'd add also test with 'E'

Will do.  For v2, I'll probably split this patch, first into adding new
test cases (with demonstrations of what we deem to be buggy parses), and
the second showing how those cases improve as we change the code.

> 
>> +
>> +    str = "1.1B";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, -EINVAL);
>> +    g_assert(endptr == str);
>> +
>> +    str = "0x1.8k";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, -EINVAL);
>> +    g_assert(endptr == str);
> 
> ha this function looks like it should have
> 
> const cher *str[] = ["", " \t ", ... "0x1.8k"]
> 
> and all cases in a for()... and all other test cases may be ... Oh, I
> should say myself "don't refactor everything you see".

I'll think about it.  It's already odd that the tests are split between
so many functions.


>> @@ -668,7 +668,7 @@ static void test_opts_parse_size(void)
>>       g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1),
>>                        ==, 0x20000000000000);
>>       g_assert_cmphex(qemu_opt_get_size(opts, "size3", 1),
>> -                     ==, 0x20000000000000);
>> +                     ==, 0x20000000000001);
>>
>>       /* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */
> 
> should fix comment?

Yes, I did it in one file but not the other, so I'll make it consistent.
 The point of the comment post-patch is that we are demonstrating that
our implementation is NOT bound by the limits of a double's precision.


>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>> +    /* Parse integral portion as decimal. */
>> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>>       if (retval) {
>>           goto out;
>>       }
>> -    fraction = modf(val, &integral);
>> -    if (fraction != 0) {
>> -        mul_required = 1;
>> +    if (strchr(nptr, '-') != NULL) {
>> +        retval = -ERANGE;
>> +        goto out;
>> +    }
> 
> Hmm, I have a question: does do_strtosz assumes that the whole nptr
> string is a number?

No.  In fact, our testsuite demonstrates the difference between endptr
as a pointer (we parse what we can and advance *endptr to the tail) and
as NULL (trailing garbage is an error).

> 
> If yes, then I don't understand what the matter of **end OUT-argument.
> 
> If not, this if() is wrong, as you'll redject parsing first number of
> "123425 -  2323" string..

Yep, good catch.  This needs to use memchr, similar to the check for 'e'
in the fraction portion below.


>> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
>> +        if (retval) {
>> +            endptr = nptr;
>> +            goto out;
>> +        }
>> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
>> +            || memchr(nptr, 'E', endptr - nptr)) {
>> +            endptr = nptr;


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 10:28   ` Richard W.M. Jones
@ 2021-02-05 14:15     ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:15 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu,
	qemu-devel, armbru, Max Reitz

On 2/5/21 4:28 AM, Richard W.M. Jones wrote:
> On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote:
>> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
>> the keyval visitor), and it gets annoying that edge-case testing is
>> impacted by implicit rounding to 53 bits of precision due to parsing
>> with strtod().  As an example posted by Rich Jones:
>>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>>  write failed: Input/output error
>>
>> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
>> out of bounds.
>>
>> It is also worth noting that our existing parser, by virtue of using
>> strtod(), accepts decimal AND hex numbers, even though test-cutils
>> previously lacked any coverage of the latter.  We do have existing
>> clients that expect a hex parse to work (for example, iotest 33 using
>> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8
>> rather than as an invalid octal number, so we know there are no
>> clients that depend on octal.  Our use of strtod() also means that
>> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather
>> than 1843 (if the fraction were 8/10); but as this was not covered in
>> the testsuite, I have no qualms forbidding hex fractions as invalid,
>> so this patch declares that the use of fractions is only supported
>> with decimal input, and enhances the testsuite to document that.
>>
>> Our previous use of strtod() meant that -1 parsed as a negative; now
>> that we parse with strtoull(), negative values can wrap around module
> 
> ^^ modulo
> 
> The patch looked fine to me although Vladimir found some problems
> which I didn't spot.  I have a question: What happens with leading or
> trailing whitespace?  Is that ignored, rejected or impossible?

leading whitespace: ignored (because both strtod() pre-patch, and now
strtoull() post-patch, do so for free).  And that is why we have to
memchr() (and not strchr(), as pointed out by Vladimir) for a '-' sign,
because merely checking *nptr=='-' would be wrong in the presence of
leading space.

trailing whitespace: treated the same as any other trailing garbage
(again, what strtod() and strtoull() give you for free).  If endptr was
non-NULL, then *endptr now points to that trailing space; if it was
NULL, the parse is rejected because of the trailing garbage.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-05 10:34   ` Richard W.M. Jones
@ 2021-02-05 14:19     ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:19 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: vsementsov, berrange, qemu-block, tao3.xu, qemu-devel, armbru

On 2/5/21 4:34 AM, Richard W.M. Jones wrote:
> On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote:
>> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
>> happen to truncate it to 1126.  Our use of fractional sizes is
>> intended for convenience, but when a user specifies a fraction that is
>> not a clean translation to binary, truncating/rounding behind their
>> backs can cause confusion.  Better is to deprecate inexact values,
>> which still leaves '1.5k' as valid, but alerts the user to spell out
>> their values as a precise byte number in cases where they are
>> currently being rounded.
>>
>> Note that values like '0.1G' in the testsuite need adjustment as a
>> result.
>>
>> Sadly, since qemu_strtosz() does not have an Err** parameter, we
>> pollute to stderr.
> 
> Does "deprecate" mean there's a plan to eventually remove this?  If so
> I think it should say that (in docs/system/deprecated.rst I think).

Yes (well, if we agree to the patch; Dan has already voiced concern that
we may not want 3/3 after all).  And that's why I posted a followup
message mentioning that I failed to commit that docs/ change before
sending my v1.  It will be in v2.

> 
> I think it would be better to remove this now or in the future since
> it's a trap for users.

It's borderline - Markus has expressed a desire to remove inexact
fractions, while Dan has expressed the desire that user convenience is
important (as long as we are clear that non-fractional values are ALWAYS
exact, and that the use of a fraction is for convenience at which point
you KNOW you are risking rounding, then allowing both 1.1M and 1.5M
instead of only one of the two is nicer).

I submitted this patch because of IRC discussion, but if it is up to
just me, I'd keep 2/3 but drop 3/3 (that is, I'm happy to deprecate
0x4000M, but not happy to deprecate 0.1G, but rather merely posted the
patch to see what the fallout is because the question had been asked on
IRC).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 11:02   ` Daniel P. Berrangé
@ 2021-02-05 14:27     ` Eric Blake
  2021-02-05 14:36       ` Daniel P. Berrangé
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:27 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru,
	qemu-devel, Max Reitz

On 2/5/21 5:02 AM, Daniel P. Berrangé wrote:

>>  /*
>> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
>> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
>> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
>> - * other error.
>> + * Convert size string to bytes.
>> + *
>> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
>> + * T/t for TB, with scaling based on @unit, and with @default_suffix
>> + * implied if no explicit suffix was given.
>> + *
>> + * The end pointer will be returned in *end, if not NULL.  If there is
>> + * no fraction, the input can be decimal or hexadecimal; if there is a
>> + * fraction, then the input must be decimal and there must be a suffix
>> + * (possibly by @default_suffix) larger than Byte, and the fractional
>> + * portion may suffer from precision loss or rounding.  The input must
>> + * be positive.
> 
> Even though the test suite gives some illustrations, I think we should
> document here the patterns we're intending to support. IIUC, we aim for
> 
> [quote]
> The size parsing supports the following syntaxes
> 
>  - 12345   - decimal, bytes
>  - 12345{bBkKmMgGtT} - decimal, scaled bytes

Yes.  Actually 12345{bBkKmMgGtTpPeE}, as long as it doesn't overflow 16E.

>  - 12345.678 - fractional decimal, bytes

No - this was already rejected prior to my patch, and my patch keeps it
as rejected (that's the whole mul_required bool, which checks whether
mul > 1).

>  - 12345.678{bBkKmMgGtT} - fractional decimal, scaled bytes

Close; we reject bB in this situation for the same reason that we reject
fractional decimal without suffix.  Also, patch 3/3 was questioning
whether the fractional portion must be exact or is permitted to round

>  - 0x7FEE  - hex, bytes

Yes, and with the additional note that 'E' and 'B' are treated as hex
digits, not scale suffixes, in this form.

> 
> The following are intentionally not supported
> 
>  - octal

Never has worked.

>  - fractional hex

worked by accident, dropping it in this patch is not deemed worth a
deprecation period.

>  - floating point exponents

worked by accident, dropping it in this patch is not deemed worth a
deprecation period.

> [/quote]

and with one more form:

patch 2/3
 - 0x123abc{kKmMgGtTpP} - hex, scaled bytes, with limited set of working
suffixes, and slated for future removal (this one is barely plausible
enough that we need the deprecation period to see who uses it)

At any rate, I like the idea of being more explicit in the function
description, so I will enhance v2 along these lines.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-05 11:10   ` Daniel P. Berrangé
@ 2021-02-05 14:28     ` Eric Blake
  2021-02-05 14:40       ` Daniel P. Berrangé
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:28 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel

On 2/5/21 5:10 AM, Daniel P. Berrangé wrote:
> On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote:
>> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
>> happen to truncate it to 1126.  Our use of fractional sizes is
>> intended for convenience, but when a user specifies a fraction that is
>> not a clean translation to binary, truncating/rounding behind their
>> backs can cause confusion.  Better is to deprecate inexact values,
>> which still leaves '1.5k' as valid, but alerts the user to spell out
>> their values as a precise byte number in cases where they are
>> currently being rounded.
> 
> I don't think we should be deprecating this, as I think it makes
> it very user hostile.  Users who require exact answers, won't be
> using fractional syntax in the first place. IOW, by using fractional
> syntax you've decided that approximation is acceptable. Given that,
> I should not have to worry about whether or not the fraction I'm
> using is exact or truncated. It is horrible usability to say that
> "1.1k" is invalid, while "1.5k" is valid - both are valid from my
> POV as a user of this.
> 
> 
> 
>> Note that values like '0.1G' in the testsuite need adjustment as a
>> result.
>>
>> Sadly, since qemu_strtosz() does not have an Err** parameter, we
>> pollute to stderr.
> 
> This is only an warning, so setting an Err ** would not be appropriate
> right now.
> 
> None the less we should add an Err **, because many of the callers
> want an Err ** object populated, or use error_report().

That is more effort.  What's the consensus - is it important enough that
I should spend that effort getting rid of technical debt by adding
versions of qemu_strto* that take Err** at this point in time?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 11:34   ` Daniel P. Berrangé
@ 2021-02-05 14:36     ` Eric Blake
  0 siblings, 0 replies; 31+ messages in thread
From: Eric Blake @ 2021-02-05 14:36 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru,
	qemu-devel, Max Reitz

On 2/5/21 5:34 AM, Daniel P. Berrangé wrote:
> On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote:
>> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
>> the keyval visitor), and it gets annoying that edge-case testing is
>> impacted by implicit rounding to 53 bits of precision due to parsing
>> with strtod().  As an example posted by Rich Jones:
>>  $ nbdkit memory $(( 2**63 - 2**30 )) --run \
>>    'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
>>  write failed: Input/output error
>>

>> @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void)
>>      g_assert_cmpint(err, ==, 0);
>>      g_assert_cmpint(res, ==, 12345);
>>
>> -    /* Note: precision is 53 bits since we're parsing with strtod() */
>> +    /* Note: If there is no '.', we get full 64 bits of precision */
> 
> IIUC, our goal is that we should never loose precision for the
> non-fractional part. 

Correct. Maybe I can tweak that comment more as part of splitting this
patch into testsuite improvements (highlighting existing shortfalls)
then the actual implementation change (with its corresponding testsuite
tweaks to demonstrate how it is fixed).


>> +
>> +    str = "0xa";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, 0);
>> +    g_assert_cmpint(res, ==, 10);
>> +    g_assert(endptr == str + 3);
> 
> I'd stick a  '0xab' or "0xae" in there, to demonstrate that we're
> not accidentally applyin the scaling units for "b" and "e" in hex.

Will do.


>> +    str = "0x1.8k";
>> +    err = qemu_strtosz(str, &endptr, &res);
>> +    g_assert_cmpint(err, ==, -EINVAL);
>> +    g_assert(endptr == str);
>>  }
> 
> A test case to reject negative numers ?

We already have one, under the _erange section.  Then again, there's an
oddity: with ERANGE, we tend to leave endptr pointing to the end of what
we did parse, while with EINVAL, we tend to leave endptr pointing to
nptr saying we parsed nothing at all.  It's hard to decide whether -1 is
an ERANGE (negative is not permitted, but we successfully parsed two
characters and so advanced endptr), or an EINVAL (we don't allow
negative at all, so endptr is nptr).  If anyone has preferences, now
would be the time to document the semantics we want; although since we
already have an existing test under _range for -1, I tried to keep that
behavior unchanged.


>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>> +    /* Parse integral portion as decimal. */
>> +    retval = qemu_strtou64(nptr, &endptr, 10, &val);
>>      if (retval) {
>>          goto out;
>>      }
>> -    fraction = modf(val, &integral);
>> -    if (fraction != 0) {
>> -        mul_required = 1;
>> +    if (strchr(nptr, '-') != NULL) {
>> +        retval = -ERANGE;
>> +        goto out;
>> +    }
>> +    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
> 
> This works as an approach but it might be more clear to
> just check for hex upfront.
> 
>   if (g_str_has_prefix("0x", val) ||
>       g_str_has_prefix("0X", val)) {

Won't work nicely.  strtoull() allows for leading whitespace, at which
point we are duplicating a lot of functionality to likewise skip such
leading whitespace before checking the prefix.

>       ... do hex parse..
>   } else {
>       ... do dec parse..
>   }
>       
> 
>> +        /* Input looks like hex, reparse, and insist on no fraction. */
>> +        retval = qemu_strtou64(nptr, &endptr, 16, &val);
>> +        if (retval) {
>> +            goto out;
>> +        }
>> +        if (*endptr == '.') {
>> +            endptr = nptr;
>> +            retval = -EINVAL;
>> +            goto out;
>> +        }
>> +    } else if (*endptr == '.') {
>> +        /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */
>> +        retval = qemu_strtod_finite(endptr, &endptr, &fraction);
>> +        if (retval) {
>> +            endptr = nptr;
>> +            goto out;
>> +        }
>> +        if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr)
>> +            || memchr(nptr, 'E', endptr - nptr)) {
> 
> Can this be simplified ?  Fraction can be > 1, if-and only-if exponent
> syntax is used IIUC.

True for > 1.0, but false for == 1.0, because of the possibility of
overflow while parsing a fraction like .99999999999999999999999999999999

> 
> Shouldn't we be passing 'endptr+1' as the first arg to memchr ?

Not quite.  I used endptr as both input and output in the strtod() call,
so at this point, I'd have to add another variable to track where the
'.' since endptr is no longer pointing there.  But being lazy, I know
that the integral portion contains no 'e', so even though the memchr is
wasting effort searching bytes before the '.' for 'e', it is not wrong.

> 
> nptr points to the leading decimal part, and we only need to
> scan the fractional part.
> 
> Also what happens with   "1.1e" - that needs to be treated
> as a exabyte suffix, and not rejected as an exponent. We
> ought to test that if we don't already.

Agree that we need to support it (well, depending on patch 3, it may be
cleaner to write the test to look for support for 1.5e).  Will make sure
the test covers it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision
  2021-02-05 14:27     ` Eric Blake
@ 2021-02-05 14:36       ` Daniel P. Berrangé
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 14:36 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru,
	qemu-devel, Max Reitz

On Fri, Feb 05, 2021 at 08:27:14AM -0600, Eric Blake wrote:
> On 2/5/21 5:02 AM, Daniel P. Berrangé wrote:
> 
> >>  /*
> >> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
> >> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
> >> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
> >> - * other error.
> >> + * Convert size string to bytes.
> >> + *
> >> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or
> >> + * T/t for TB, with scaling based on @unit, and with @default_suffix
> >> + * implied if no explicit suffix was given.
> >> + *
> >> + * The end pointer will be returned in *end, if not NULL.  If there is
> >> + * no fraction, the input can be decimal or hexadecimal; if there is a
> >> + * fraction, then the input must be decimal and there must be a suffix
> >> + * (possibly by @default_suffix) larger than Byte, and the fractional
> >> + * portion may suffer from precision loss or rounding.  The input must
> >> + * be positive.
> > 
> > Even though the test suite gives some illustrations, I think we should
> > document here the patterns we're intending to support. IIUC, we aim for
> > 
> > [quote]
> > The size parsing supports the following syntaxes
> > 
> >  - 12345   - decimal, bytes
> >  - 12345{bBkKmMgGtT} - decimal, scaled bytes
> 
> Yes.  Actually 12345{bBkKmMgGtTpPeE}, as long as it doesn't overflow 16E.
> 
> >  - 12345.678 - fractional decimal, bytes
> 
> No - this was already rejected prior to my patch, and my patch keeps it
> as rejected (that's the whole mul_required bool, which checks whether
> mul > 1).

Oh yes, of course.

> >  - 12345.678{bBkKmMgGtT} - fractional decimal, scaled bytes
> 
> Close; we reject bB in this situation for the same reason that we reject
> fractional decimal without suffix.  Also, patch 3/3 was questioning
> whether the fractional portion must be exact or is permitted to round

Yep, rejecting bB makes sense.

> 
> >  - 0x7FEE  - hex, bytes
> 
> Yes, and with the additional note that 'E' and 'B' are treated as hex
> digits, not scale suffixes, in this form.
> 
> > 
> > The following are intentionally not supported
> > 
> >  - octal
> 
> Never has worked.
> 
> >  - fractional hex
> 
> worked by accident, dropping it in this patch is not deemed worth a
> deprecation period.
> 
> >  - floating point exponents
> 
> worked by accident, dropping it in this patch is not deemed worth a
> deprecation period.
> 
> > [/quote]
> 
> and with one more form:
> 
> patch 2/3
>  - 0x123abc{kKmMgGtTpP} - hex, scaled bytes, with limited set of working
> suffixes, and slated for future removal (this one is barely plausible
> enough that we need the deprecation period to see who uses it)

Ok


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes
  2021-02-05 14:28     ` Eric Blake
@ 2021-02-05 14:40       ` Daniel P. Berrangé
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05 14:40 UTC (permalink / raw)
  To: Eric Blake; +Cc: vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel

On Fri, Feb 05, 2021 at 08:28:31AM -0600, Eric Blake wrote:
> On 2/5/21 5:10 AM, Daniel P. Berrangé wrote:
> > On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote:
> >> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we
> >> happen to truncate it to 1126.  Our use of fractional sizes is
> >> intended for convenience, but when a user specifies a fraction that is
> >> not a clean translation to binary, truncating/rounding behind their
> >> backs can cause confusion.  Better is to deprecate inexact values,
> >> which still leaves '1.5k' as valid, but alerts the user to spell out
> >> their values as a precise byte number in cases where they are
> >> currently being rounded.
> > 
> > I don't think we should be deprecating this, as I think it makes
> > it very user hostile.  Users who require exact answers, won't be
> > using fractional syntax in the first place. IOW, by using fractional
> > syntax you've decided that approximation is acceptable. Given that,
> > I should not have to worry about whether or not the fraction I'm
> > using is exact or truncated. It is horrible usability to say that
> > "1.1k" is invalid, while "1.5k" is valid - both are valid from my
> > POV as a user of this.
> > 
> > 
> > 
> >> Note that values like '0.1G' in the testsuite need adjustment as a
> >> result.
> >>
> >> Sadly, since qemu_strtosz() does not have an Err** parameter, we
> >> pollute to stderr.
> > 
> > This is only an warning, so setting an Err ** would not be appropriate
> > right now.
> > 
> > None the less we should add an Err **, because many of the callers
> > want an Err ** object populated, or use error_report().
> 
> That is more effort.  What's the consensus - is it important enough that
> I should spend that effort getting rid of technical debt by adding
> versions of qemu_strto* that take Err** at this point in time?

To be clear, I'm not requesting that you make it use Err** right now.

I just raise it as an idea for future technical debt removal.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-02-05 14:41 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake
2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake
2021-02-04 20:12   ` Eric Blake
2021-02-05 10:06     ` Vladimir Sementsov-Ogievskiy
2021-02-05 10:18       ` Vladimir Sementsov-Ogievskiy
2021-02-05 14:06       ` Eric Blake
2021-02-05 14:10         ` Daniel P. Berrangé
2021-02-05 10:07   ` Vladimir Sementsov-Ogievskiy
2021-02-05 14:12     ` Eric Blake
2021-02-05 10:28   ` Richard W.M. Jones
2021-02-05 14:15     ` Eric Blake
2021-02-05 11:02   ` Daniel P. Berrangé
2021-02-05 14:27     ` Eric Blake
2021-02-05 14:36       ` Daniel P. Berrangé
2021-02-05 11:34   ` Daniel P. Berrangé
2021-02-05 14:36     ` Eric Blake
2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake
2021-02-05 10:25   ` Vladimir Sementsov-Ogievskiy
2021-02-05 10:31     ` Richard W.M. Jones
2021-02-05 13:38     ` Eric Blake
2021-02-05 11:13   ` Daniel P. Berrangé
2021-02-05 13:40     ` Eric Blake
2021-02-05 14:02       ` Daniel P. Berrangé
2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake
2021-02-04 20:02   ` Eric Blake
2021-02-05 10:34   ` Richard W.M. Jones
2021-02-05 14:19     ` Eric Blake
2021-02-05 10:38   ` Vladimir Sementsov-Ogievskiy
2021-02-05 11:10   ` Daniel P. Berrangé
2021-02-05 14:28     ` Eric Blake
2021-02-05 14:40       ` Daniel P. Berrangé

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.