qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits


From: Tao Xu
Subject: Re: [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits
Date: Wed, 18 Dec 2019 10:29:28 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.1

On 12/17/2019 11:01 PM, Markus Armbruster wrote:
Christophe de Dinechin <address@hidden> writes:

On 17 Dec 2019, at 15:08, Markus Armbruster <address@hidden> wrote:

Christophe de Dinechin <address@hidden> writes:

On 5 Dec 2019, at 16:29, Markus Armbruster <address@hidden> wrote:

Tao Xu <address@hidden> writes:

Parse input string both as a double and as a uint64_t, then use the
method which consumes more characters. Update the related test cases.

Signed-off-by: Tao Xu <address@hidden>
---
[...]
diff --git a/util/cutils.c b/util/cutils.c
index 77acadc70a..b08058c57c 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char **end,
                      const char default_suffix, int64_t unit,
                      uint64_t *result)
{
-    int retval;
-    const char *endptr;
+    int retval, retd, retu;
+    const char *suffix, *suffixd, *suffixu;
    unsigned char c;
    int mul_required = 0;
-    double val, mul, integral, fraction;
+    bool use_strtod;
+    uint64_t valu;
+    double vald, mul, integral, fraction;

Note for later: @mul is double.

+
+    retd = qemu_strtod_finite(nptr, &suffixd, &vald);
+    retu = qemu_strtou64(nptr, &suffixu, 0, &valu);
+    use_strtod = strlen(suffixd) < strlen(suffixu);
+
+    /*
+     * Parse @nptr both as a double and as a uint64_t, then use the method
+     * which consumes more characters.
+     */

The comment is in a funny place.  I'd put it right before the
qemu_strtod_finite() line.

+    if (use_strtod) {
+        suffix = suffixd;
+        retval = retd;
+    } else {
+        suffix = suffixu;
+        retval = retu;
+    }

-    retval = qemu_strtod_finite(nptr, &endptr, &val);
    if (retval) {
        goto out;
    }

This is even more subtle than it looks.

But why it is even necessary?

The “contract” for the function used to be that it returned rounded values
beyond 2^53, which in itself is curious.

But now it’s a 6-dimensional matrix of hell with NaNs and barfnots, when the
name implies it’s simply doing a text to u64 conversion…

There is certainly a reason, but I’m really curious what it is :-)

It all goes back to commit 9f9b17a4f0 "Introduce strtosz() library
function to convert a string to a byte count.".  To support "convenient"
usage like "1.5G", it parses the number part with strtod().  This limits
us to 53 bits of precision.  Larger sizes get rounded.

I guess the excuse for this was that when you're dealing with sizes that
large (petabytes!), your least significant bits are zero anyway.

Regardless, the interface is *awful*.  We should've forced the author to
spell it out in all its glory in a proper function contract.  That tends
to cool the enthusiasm for "convenient" syntax amazingly fast.

The awful interface has been confusing people for close to a decade now.

What to do?

I see. Thanks for the rationale. I knew it had to make sense :-)

For a value of "sense"...

I’d probably avoid strtod even with the convenient syntax above.
Do you want 1.33e-6M to be allowed? Do we want to ever
accept or generate NaN or Inf values?

NaN or Inf definitely not.  That's why we use qemu_strtod_finite()
before and after the patch.

No sane person should ever use 1.33e-6M.  Or even 1.1k (which yields
1126, rounded silently from machine number 1126.4000000000001, which
approximates the true value 1126.4).

Certain fractions are actually sane.  1.5k denotes a perfectly fine
integer, which the code manages not to screw up.  I'd recommend against
using fractions regardless.

What usage are we prepared to break?  What kind of confusion are we
willing to bear?  Those are the questions.

Tao Xu's patch tries to make the function do what its users expect,
namely parse a bleepin' 64 bit integer, without breaking any of the
"convenience" syntax.  Turns out that's amazingly subtle.  Are we making
things less confusing or more?

Thanks for your explanation. I think another reason is build-in 'size' is really commonly used. May be someone use '-m 1.5G' to boot QEMU or write it to a config file.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]