From: Tao Xu <tao3.xu@intel.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"mdroth@linux.vnet.ibm.com" <mdroth@linux.vnet.ibm.com>,
"ehabkost@redhat.com" <ehabkost@redhat.com>
Subject: Re: [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits
Date: Mon, 9 Dec 2019 13:38:39 +0800
Message-ID: <b7c442e3-cc7e-155e-5370-db9a371928a6@intel.com>
In-Reply-To: <87a786sse9.fsf@dusky.pond.sub.org>
On 12/5/19 11:29 PM, Markus Armbruster wrote:
> Tao Xu <tao3.xu@intel.com> writes:
>
>> Parse input string both as a double and as a uint64_t, then use the
>> method which consumes more characters. Update the related test cases.
>>
>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
>> ---
> [...]
>> diff --git a/util/cutils.c b/util/cutils.c
>> index 77acadc70a..b08058c57c 100644
>> --- a/util/cutils.c
>> +++ b/util/cutils.c
>> @@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char **end,
>> const char default_suffix, int64_t unit,
>> uint64_t *result)
>> {
>> - int retval;
>> - const char *endptr;
>> + int retval, retd, retu;
>> + const char *suffix, *suffixd, *suffixu;
>> unsigned char c;
>> int mul_required = 0;
>> - double val, mul, integral, fraction;
>> + bool use_strtod;
>> + uint64_t valu;
>> + double vald, mul, integral, fraction;
>
> Note for later: @mul is double.
>
>> +
>> + retd = qemu_strtod_finite(nptr, &suffixd, &vald);
>> + retu = qemu_strtou64(nptr, &suffixu, 0, &valu);
>> + use_strtod = strlen(suffixd) < strlen(suffixu);
>> +
>> + /*
>> + * Parse @nptr both as a double and as a uint64_t, then use the method
>> + * which consumes more characters.
>> + */
>
> The comment is in a funny place. I'd put it right before the
> qemu_strtod_finite() line.
>
>> + if (use_strtod) {
>> + suffix = suffixd;
>> + retval = retd;
>> + } else {
>> + suffix = suffixu;
>> + retval = retu;
>> + }
>>
>> - retval = qemu_strtod_finite(nptr, &endptr, &val);
>> if (retval) {
>> goto out;
>> }
>
> This is even more subtle than it looks.
>
> A close reading of the function contracts leads to three cases for each
> conversion:
>
> * parse error (including infinity and NaN)
>
> @retu / @retd is -EINVAL
> @valu / @vald is uninitialized
> @suffixu / @suffixd is @nptr
>
> * range error
>
> @retu / @retd is -ERANGE
> @valu / @vald is our best approximation of the conversion result
> @suffixu / @suffixd points to the first character not consumed by the
> conversion.
>
> Sub-cases:
>
> - uint64_t overflow
>
> We know the conversion result exceeds UINT64_MAX.
>
> - double overflow
>
> we know the conversion result's magnitude exceeds the largest
> representable finite double DBL_MAX.
>
> - double underflow
>
> we know the conversion result is close to zero (closer than DBL_MIN,
> the smallest normalized positive double).
>
> * success
>
> @retu / @retd is 0
> @valu / @vald is the conversion result
> @suffixu / @suffixd points to the first character not consumed by the
> conversion.
>
> This leads to a matrix (parse error, uint64_t overflow, success) x
> (parse error, double overflow, double underflow, success). We need to
> check the code does what we want for each element of this matrix, and
> document any behavior that's not perfectly obvious.
>
> (success, success): we pick uint64_t if qemu_strtou64() consumed more
> characters than qemu_strtod_finite(), else double. "More" is important
> here; when they consume the same characters, we *need* to use the
> uint64_t result. Example: for "18446744073709551615", we need to use
> uint64_t 18446744073709551615, not double 18446744073709551616.0. But
> for "18446744073709551616.", we need to use the double. Good.
>
> (success, parse error) and (parse error, success): we pick the one that
> succeeds, because success consumes characters, and failure to parse does
> not. Good.
>
> (parse error, parse error): neither consumes characters, so we pick
> uint64_t. Good.
>
> (parse error, double overflow), (parse error, double underflow) and
> (uint64_t overflow, parse error): we pick the range error, because it
> consumes characters. Good.
>
> These are the simple combinations. The remainder are hairier: (success,
> double overflow), (success, double underflow), (uint64_t overflow,
> success). I lack the time to analyze them today. Must be done before
> we take this patch. Any takers?

(success, double overflow) and (success, double underflow): we pick the
double range error and return -ERANGE, because that conversion consumes
more characters. Example: for "1.79769e+309", qemu_strtou64() consumes
only "1" and parses it as a uint64_t, while qemu_strtod_finite() returns
-ERANGE and consumes all characters. This is OK.

(uint64_t overflow, success): when both conversions consume the same
characters, we use the uint64_t conversion and return -ERANGE. Even
though qemu_strtod_finite() can parse an input such as
"18446744073709551617", the result has to fit in a uint64_t, so we still
need to return -ERANGE. This is OK.

Thank you for your analysis and suggestions. I will add more test cases
to cover some of these combinations.

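To make the selection rule concrete, here is a minimal standalone sketch.
It is not the patch: it uses plain strtod() and strtoull() as stand-ins
for qemu_strtod_finite() and qemu_strtou64() (an assumption, the QEMU
wrappers add their own error handling), and pick_conversion() is a
hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not QEMU code: parse @s with both strtod() and
 * strtoull(), then pick the conversion that consumed strictly more
 * characters. Returns 'd' (double) or 'u' (uint64_t) and stores the
 * errno value of the chosen conversion in *@err.
 */
static char pick_conversion(const char *s, int *err)
{
    char *endd, *endu;
    int errd, erru;

    assert(s && err);

    errno = 0;
    (void)strtod(s, &endd);
    errd = errno;

    errno = 0;
    (void)strtoull(s, &endu, 0);
    erru = errno;

    /* Strictly more: on a tie the integer conversion wins. */
    if (endd > endu) {
        *err = errd;
        return 'd';
    }
    *err = erru;
    return 'u';
}
```

With this rule, "18446744073709551615" is a tie and goes to the integer
conversion without error, "18446744073709551616." goes to the double
conversion because of the trailing ".", "1.79769e+309" goes to the
double conversion with ERANGE, and "18446744073709551617" is a tie that
goes to the integer conversion with ERANGE.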
>
>> - fraction = modf(val, &integral);
>> - if (fraction != 0) {
>> - mul_required = 1;
>> + if (use_strtod) {
>> + fraction = modf(vald, &integral);
>> + if (fraction != 0) {
>> + mul_required = 1;
>> + }
>> }
>
> Here, @suffix points to the suffix character, if any.
>
>> - c = *endptr;
>> + c = *suffix;
>> mul = suffix_mul(c, unit);
>> if (mul >= 0) {
>> - endptr++;
>> + suffix++;
>
> Now @suffix points to the first character not consumed, *not* the
> suffix.
>
> Your patch effectively renames @endptr to @suffix. I think @endptr is
> the better name. Keeping the name also makes the diff smaller and
> slightly easier to review.
>
>> } else {
>> mul = suffix_mul(default_suffix, unit);
>
> suffix_mul() returns int64_t. The assignment converts it to double.
> Fine before the patch, because @mul is the multiplier for a double
> value. No longer true after the patch, see below.
>
>> assert(mul >= 0);
>> @@ -238,23 +257,36 @@ static int do_strtosz(const char *nptr, const char **end,
>> retval = -EINVAL;
>> goto out;
>> }
>> - /*
>> - * Values near UINT64_MAX overflow to 2**64 when converting to double
>> - * precision. Compare against the maximum representable double precision
>> - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
>> - * the direction of 0".
>> - */
>> - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
>> - retval = -ERANGE;
>> - goto out;
>> +
>> + if (use_strtod) {
>> + /*
>> + * Values near UINT64_MAX overflow to 2**64 when converting to double
>> + * precision. Compare against the maximum representable double precision
>> + * value below 2**64, computed as "the next value after 2**64 (0x1p64)
>> + * in the direction of 0".
>> + */
>> + if ((vald * mul > nextafter(0x1p64, 0)) || vald < 0) {
>> + retval = -ERANGE;
>> + goto out;
>> + }
>> + *result = vald * mul;
>
> Here, @mul is a multiplier for double vald.
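For reference, a small sketch of why the double branch cannot simply
compare against UINT64_MAX. It assumes IEEE 754 doubles with the default
rounding mode; MAX_DOUBLE_BELOW_2_64 and double_fits_uint64() are
hypothetical names, and the hex literal is written out by hand instead
of calling nextafter(0x1p64, 0), which yields the same value.

```c
#include <stdint.h>

/*
 * Largest double below 2**64. With a 53-bit significand, doubles in
 * [2**63, 2**64) are spaced 2048 apart, so this is 2**64 - 2048.
 */
#define MAX_DOUBLE_BELOW_2_64 0x1.fffffffffffffp63

/* Hypothetical helper: true if @v can be converted to uint64_t safely. */
static int double_fits_uint64(double v)
{
    return v >= 0 && v <= MAX_DOUBLE_BELOW_2_64;
}
```

(double)UINT64_MAX rounds up to 2**64, so a check written as
"v <= UINT64_MAX" would accept 2**64 itself, which does not fit.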
>
>> + } else {
>> + /* Reject negative input and overflow output */
>> + while (qemu_isspace(*nptr)) {
>> + nptr++;
>> + }
>> + if (*nptr == '-' || UINT64_MAX / (uint64_t) mul < valu) {
>> + retval = -ERANGE;
>> + goto out;
>> + }
>> + *result = valu * (uint64_t) mul;
>
> Here, @mul is a multiplier for uint64_t valu.
>
> Please change @mul to int64_t to reduce conversions.
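A rough sketch of how the integer branch could look with an int64_t
multiplier. mul_u64_checked() is a hypothetical helper, not something in
util/cutils.c, and treating a non-positive multiplier as failure is an
assumption made only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: multiply @val by a positive int64_t @mul.
 * Returns false instead of wrapping around when the product would
 * exceed UINT64_MAX, or when @mul is not positive.
 */
static bool mul_u64_checked(uint64_t val, int64_t mul, uint64_t *result)
{
    assert(result);

    if (mul <= 0 || val > UINT64_MAX / (uint64_t)mul) {
        return false;
    }
    *result = val * (uint64_t)mul;
    return true;
}
```

For example, with a multiplier of 2**60, 15 still fits but 16 is
rejected, because 16 * 2**60 is 2**64.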
>
>> }
>> - *result = val * mul;
>> retval = 0;
>>
>> out:
>> if (end) {
>> - *end = endptr;
>> - } else if (*endptr) {
>> + *end = suffix;
>> + } else if (*suffix) {
>> retval = -EINVAL;
>> }
>