qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Tao Xu <tao3.xu@intel.com>
To: Markus Armbruster <armbru@redhat.com>,
	Christophe de Dinechin <dinechin@redhat.com>
Cc: "mdroth@linux.vnet.ibm.com" <mdroth@linux.vnet.ibm.com>,
	"ehabkost@redhat.com" <ehabkost@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits
Date: Wed, 18 Dec 2019 10:29:28 +0800	[thread overview]
Message-ID: <0b0767cc-2f97-e1a1-f479-295e36877f89@intel.com> (raw)
In-Reply-To: <87o8w7ate6.fsf@dusky.pond.sub.org>

On 12/17/2019 11:01 PM, Markus Armbruster wrote:
> Christophe de Dinechin <dinechin@redhat.com> writes:
> 
>>> On 17 Dec 2019, at 15:08, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Christophe de Dinechin <dinechin@redhat.com> writes:
>>>
>>>>> On 5 Dec 2019, at 16:29, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>
>>>>> Tao Xu <tao3.xu@intel.com> writes:
>>>>>
>>>>>> Parse input string both as a double and as a uint64_t, then use the
>>>>>> method which consumes more characters. Update the related test cases.
>>>>>>
>>>>>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
>>>>>> ---
>>>>> [...]
>>>>>> diff --git a/util/cutils.c b/util/cutils.c
>>>>>> index 77acadc70a..b08058c57c 100644
>>>>>> --- a/util/cutils.c
>>>>>> +++ b/util/cutils.c
>>>>>> @@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char **end,
>>>>>>                       const char default_suffix, int64_t unit,
>>>>>>                       uint64_t *result)
>>>>>> {
>>>>>> -    int retval;
>>>>>> -    const char *endptr;
>>>>>> +    int retval, retd, retu;
>>>>>> +    const char *suffix, *suffixd, *suffixu;
>>>>>>     unsigned char c;
>>>>>>     int mul_required = 0;
>>>>>> -    double val, mul, integral, fraction;
>>>>>> +    bool use_strtod;
>>>>>> +    uint64_t valu;
>>>>>> +    double vald, mul, integral, fraction;
>>>>>
>>>>> Note for later: @mul is double.
>>>>>
>>>>>> +
>>>>>> +    retd = qemu_strtod_finite(nptr, &suffixd, &vald);
>>>>>> +    retu = qemu_strtou64(nptr, &suffixu, 0, &valu);
>>>>>> +    use_strtod = strlen(suffixd) < strlen(suffixu);
>>>>>> +
>>>>>> +    /*
>>>>>> +     * Parse @nptr both as a double and as a uint64_t, then use the method
>>>>>> +     * which consumes more characters.
>>>>>> +     */
>>>>>
>>>>> The comment is in a funny place.  I'd put it right before the
>>>>> qemu_strtod_finite() line.
>>>>>
>>>>>> +    if (use_strtod) {
>>>>>> +        suffix = suffixd;
>>>>>> +        retval = retd;
>>>>>> +    } else {
>>>>>> +        suffix = suffixu;
>>>>>> +        retval = retu;
>>>>>> +    }
>>>>>>
>>>>>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>>>>>>     if (retval) {
>>>>>>         goto out;
>>>>>>     }
>>>>>
>>>>> This is even more subtle than it looks.
>>>>
>>>> But why it is even necessary?
>>>>
>>>> The “contract” for the function used to be that it returned rounded values
>>>> beyond 2^53, which in itself is curious.
>>>>
>>>> But now it’s a 6-dimensional matrix of hell with NaNs and barfnots, when the
>>>> name implies it’s simply doing a text to u64 conversion…
>>>>
>>>> There is certainly a reason, but I’m really curious what it is :-)
>>>
>>> It all goes back to commit 9f9b17a4f0 "Introduce strtosz() library
>>> function to convert a string to a byte count.".  To support "convenient"
>>> usage like "1.5G", it parses the number part with strtod().  This limits
>>> us to 53 bits of precision.  Larger sizes get rounded.
>>>
>>> I guess the excuse for this was that when you're dealing with sizes that
>>> large (petabytes!), your least significant bits are zero anyway.
>>>
>>> Regardless, the interface is *awful*.  We should've forced the author to
>>> spell it out in all its glory in a proper function contract.  That tends
>>> to cool the enthusiasm for "convenient" syntax amazingly fast.
>>>
>>> The awful interface has been confusing people for close to a decade now.
>>>
>>> What to do?
>>
>> I see. Thanks for the rationale. I knew it had to make sense :-)
> 
> For a value of "sense"...
> 
>> I’d probably avoid strtod even with the convenient syntax above.
>> Do you want 1.33e-6M to be allowed? Do we want to ever
>> accept or generate NaN or Inf values?
> 
> NaN or Inf definitely not.  That's why we use qemu_strtod_finite()
> before and after the patch.
> 
> No sane person should ever use 1.33e-6M.  Or even 1.1k (which yields
> 1126, rounded silently from machine number 1126.4000000000001, which
> approximates the true value 1126.4).
> 
> Certain fractions are actually sane.  1.5k denotes a perfectly fine
> integer, which the code manages not to screw up.  I'd recommend against
> using fractions regardless.
> 
> What usage are we prepared to break?  What kind of confusion are we
> willing to bear?  Those are the questions.
> 
>>> Tao Xu's patch tries to make the function do what its users expect,
>>> namely parse a bleepin' 64 bit integer, without breaking any of the
>>> "convenience" syntax.  Turns out that's amazingly subtle.  Are we making
>>> things less confusing or more?
> 
Thanks for your explanation. I think another reason is build-in 'size' 
is really commonly used. May be someone use '-m 1.5G' to boot QEMU or 
write it to a config file.


      reply	other threads:[~2019-12-18  2:30 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-05  2:14 [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits Tao Xu
2019-12-05 15:29 ` Markus Armbruster
2019-12-09  5:38   ` Tao Xu
2019-12-17 10:25     ` Markus Armbruster
2019-12-18  1:33       ` Tao Xu
2019-12-18  5:26         ` Tao Xu
2019-12-18 18:26           ` Markus Armbruster
2019-12-19  7:43             ` Tao Xu
2019-12-19 10:15               ` Markus Armbruster
2019-12-18 21:49         ` Eric Blake
2019-12-17 12:04   ` Christophe de Dinechin
2019-12-17 14:08     ` Markus Armbruster
2019-12-17 14:12       ` Christophe de Dinechin
2019-12-17 15:01         ` Markus Armbruster
2019-12-18  2:29           ` Tao Xu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0b0767cc-2f97-e1a1-f479-295e36877f89@intel.com \
    --to=tao3.xu@intel.com \
    --cc=armbru@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).