linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests
@ 2015-10-27 14:06 Vitaly Kuznetsov
  2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-27 14:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Linux always lies about your storage size when it has 4k sectors and its
size is big enough. E.g. a device with 8192 4k sectors will be reported as
"32.7 MB/32 MiB" while "33.5 MB/32 MiB" is expected. This series is
supposed to fix the issue by fixing calculation precision in
string_get_size() for all possible inputs.

PATCH 1/3 is a preparatory change, PATCH 2/3 re-factors string_get_size()
fixing the issue, PATCH 3/3 introduces tests for string_get_size().

PATCH 3/3 was previously sent as part of "lib/string_helpers.c: fix
infinite loop in string_get_size()" series but it is still not merged
upstream. In this submission I improve it and add additional tests to it.

Changes since v1:
- Patch2: do not rename 'i' to 'order' [Andy Shevchenko]
- Patch2: check against blk_size == 0 [Rasmus Villemoes]
- Patch3: make test_string_get_size_one() check both STRING_UNITS_2 and
  STRING_UNITS_10 in one shot [Rasmus Villemoes]
- Patch3: test U64_MAX instead of some-very-big-number.

Vitaly Kuznetsov (3):
  lib/string_helpers: change blk_size to u32 for string_get_size()
    interface
  lib/string_helpers.c: don't lose precision in string_get_size()
  lib/test-string_helpers.c: add string_get_size() tests

 include/linux/string_helpers.h |  2 +-
 lib/string_helpers.c           | 39 ++++++++++++++++++----------
 lib/test-string_helpers.c      | 58 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 84 insertions(+), 15 deletions(-)

-- 
2.4.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
@ 2015-10-27 14:06 ` Vitaly Kuznetsov
  2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
  2015-10-27 14:06 ` [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
  2 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-27 14:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

string_get_size() can't really handle huge block sizes, especially
blk_size > U32_MAX but string_get_size() interface states the opposite.
Change blk_size from u64 to u32 to reflect the reality.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 include/linux/string_helpers.h | 2 +-
 lib/string_helpers.c           | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
index dabe643..1223e80 100644
--- a/include/linux/string_helpers.h
+++ b/include/linux/string_helpers.h
@@ -10,7 +10,7 @@ enum string_size_units {
 	STRING_UNITS_2,		/* use binary powers of 2^10 */
 };
 
-void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
+void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
 		     char *buf, int len);
 
 #define UNESCAPE_SPACE		0x01
diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..f6c27dc 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -26,7 +26,7 @@
  * at least 9 bytes and will always be zero terminated.
  *
  */
-void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
+void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 		     char *buf, int len)
 {
 	static const char *const units_10[] = {
@@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
+	exp = divisor[units] / blk_size;
 	/*
 	 * size must be strictly greater than exp here to ensure that remainder
 	 * is greater than divisor[units] coming out of the if below.
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
  2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
@ 2015-10-27 14:06 ` Vitaly Kuznetsov
  2015-10-27 15:25   ` Andy Shevchenko
  2015-10-27 21:02   ` Rasmus Villemoes
  2015-10-27 14:06 ` [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
  2 siblings, 2 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-27 14:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

string_get_size() loses precision when there is a remainder for
blk_size / divisor[units] and size is big enough. E.g
string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
while it is supposed to return "33.5 MB". For some artificial inputs
the result can be ridiculously wrong, e.g.
string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
when "5.70 MB" is expected.

The issues comes from the fact than we through away
blk_size / divisor[units] remainder when size is > exp. This can be fixed
by saving it and doing some non-trivial calculations later to fix the error
but that would make this function even more cumbersome. Slightly re-factor
the function to not lose the precision for all inputs.

The overall complexity of this function comes from the fact that size can
be huge and we don't want to do size * blk_size as it can overflow. Do the
math in two steps:
1) Reduce size to something < blk_size * divisor[units]
2) Multiply the result (and the remainder) by blk_size and do final
   calculations.

Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
Changes since v1:
- Check against blk_size == 0 [Rasmus Villemoes]
- Do not rename 'i' to 'order' [Andy Shevchenko]
---
 lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index f6c27dc..eba8e82 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 		[STRING_UNITS_2] = 1024,
 	};
 	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	u64 remainder = 0;
+	u32 sf_cap;
 	char tmp[8];
 	const char *unit;
 
@@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 	if (!size)
 		goto out;
 
-	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
-		i++;
+	if (!blk_size) {
+		WARN_ON(1);
+		size = 0;
+		goto out;
 	}
 
-	exp = divisor[units] / blk_size;
 	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
+	 * size can be huge and doing size * blk_size right away can overflow.
+	 * As a first step reduce huge size to something less than
+	 * blk_size * divisor[units].
 	 */
-	if (size > exp) {
+	while (size > (u64)blk_size * divisor[units]) {
 		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
 		i++;
-	} else {
-		remainder *= size;
 	}
 
+	/* Now we're OK with doing size * blk_size, it won't overflow. */
 	size *= blk_size;
+	remainder *= blk_size;
+	/*
+	 * We were doing partial multiplication by blk_size.
+	 * remainder >= divisor[units] here means size should be increased.
+	 */
 	size += remainder / divisor[units];
-	remainder %= divisor[units];
+	remainder -= (remainder / divisor[units]) * divisor[units];
 
+	/*
+	 * Normalize. size >= divisor[units] means we still have enough
+	 * precision and dropping remainder is fine.
+	 */
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
 		i++;
@@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 	if (j) {
 		remainder *= 1000;
 		remainder /= divisor[units];
-		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
+		/* remainder is < divisor[units] here, (u32) is legit */
+		snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
 		tmp[j+1] = '\0';
 	}
 
@@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 	else
 		unit = units_str[units][i];
 
+	/* size is < divisor[units] here, (u32) is legit */
 	snprintf(buf, len, "%u%s %s", (u32)size,
 		 tmp, unit);
 }
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests
  2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
  2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
  2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
@ 2015-10-27 14:06 ` Vitaly Kuznetsov
  2 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-27 14:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Add a couple of simple tests for string_get_size().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
Changes since v1:
- Make test_string_get_size_one() check both STRING_UNITS_2 and
  STRING_UNITS_10 in one shot [Rasmus Villemoes]
- Check U64_MAX instead of some-very-big-number.
---
 lib/test-string_helpers.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/lib/test-string_helpers.c b/lib/test-string_helpers.c
index 8e376ef..4c77b54 100644
--- a/lib/test-string_helpers.c
+++ b/lib/test-string_helpers.c
@@ -326,6 +326,61 @@ out:
 	kfree(out_test);
 }
 
+#define string_get_size_maxbuf 16
+#define test_string_get_size_one(size, blk_size, exp_result10, exp_result2)    \
+	do {                                                                   \
+		BUILD_BUG_ON(sizeof(exp_result10) >= string_get_size_maxbuf);  \
+		BUILD_BUG_ON(sizeof(exp_result2) >= string_get_size_maxbuf);   \
+		__test_string_get_size((size), (blk_size), (exp_result10),     \
+				       (exp_result2));                         \
+	} while (0)
+
+
+static __init void __test_string_get_size(const u64 size, const u32 blk_size,
+					  const char *exp_result10,
+					  const char *exp_result2)
+{
+	char buf10[string_get_size_maxbuf];
+	char buf2[string_get_size_maxbuf];
+
+	string_get_size(size, blk_size, STRING_UNITS_10, buf10, sizeof(buf10));
+	string_get_size(size, blk_size, STRING_UNITS_2, buf2, sizeof(buf2));
+
+	if (!memcmp(buf10, exp_result10, strlen(exp_result10) + 1))
+		goto check_stringunits_2;
+
+	buf10[sizeof(buf10) - 1] = '\0';
+
+	pr_warn("Test 'test_string_get_size' failed!\n");
+	pr_warn("string_get_size(size = %llu, blk_size = %u, units = %s)\n",
+		size, blk_size, "STRING_UNITS_10");
+	pr_warn("expected: '%s', got '%s'\n", exp_result10, buf10);
+
+check_stringunits_2:
+	if (!memcmp(buf2, exp_result2, strlen(exp_result2) + 1))
+		return;
+
+	buf2[sizeof(buf2) - 1] = '\0';
+
+	pr_warn("Test 'test_string_get_size' failed!\n");
+	pr_warn("string_get_size(size = %llu, blk_size = %u, units = %s)\n",
+		size, blk_size, "STRING_UNITS_2");
+	pr_warn("expected: '%s', got '%s'\n", exp_result2, buf2);
+}
+
+static __init void test_string_get_size(void)
+{
+	test_string_get_size_one(16384, 512, "8.38 MB", "8.00 MiB");
+	test_string_get_size_one(500118192, 512, "256 GB", "238 GiB");
+	test_string_get_size_one(8192, 4096, "33.5 MB", "32.0 MiB");
+	test_string_get_size_one(1100, 1, "1.10 kB", "1.07 KiB");
+	test_string_get_size_one(3000, 1900, "5.70 MB", "5.43 MiB");
+	test_string_get_size_one(U64_MAX, 4096, "75.5 ZB", "63.9 ZiB");
+	test_string_get_size_one(1999, U32_MAX, "8.58 TB", "7.80 TiB");
+	test_string_get_size_one(1, 512, "512 B", "512 B");
+	test_string_get_size_one(0, 512, "0 B", "0 B");
+}
+
 static int __init test_string_helpers_init(void)
 {
 	unsigned int i;
@@ -344,6 +399,9 @@ static int __init test_string_helpers_init(void)
 	for (i = 0; i < (ESCAPE_ANY_NP | ESCAPE_HEX) + 1; i++)
 		test_string_escape("escape 1", escape1, i, TEST_STRING_2_DICT_1);
 
+	/* Test string_get_size() */
+	test_string_get_size();
+
 	return -EINVAL;
 }
 module_init(test_string_helpers_init);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
@ 2015-10-27 15:25   ` Andy Shevchenko
  2015-10-27 16:16     ` Vitaly Kuznetsov
  2015-10-27 21:02   ` Rasmus Villemoes
  1 sibling, 1 reply; 9+ messages in thread
From: Andy Shevchenko @ 2015-10-27 15:25 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Andrew Morton
  Cc: Rasmus Villemoes, Ulf Hansson, James Bottomley, Kees Cook, linux-kernel

On Tue, 2015-10-27 at 15:06 +0100, Vitaly Kuznetsov wrote:
> string_get_size() loses precision when there is a remainder for
> blk_size / divisor[units] and size is big enough. E.g
> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
> while it is supposed to return "33.5 MB". For some artificial inputs
> the result can be ridiculously wrong, e.g.
> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
> when "5.70 MB" is expected.
> 
> The issues comes from the fact than we through away
> blk_size / divisor[units] remainder when size is > exp. This can be
> fixed
> by saving it and doing some non-trivial calculations later to fix the
> error
> but that would make this function even more cumbersome. Slightly re-
> factor
> the function to not lose the precision for all inputs.
> 
> The overall complexity of this function comes from the fact that size
> can
> be huge and we don't want to do size * blk_size as it can overflow.
> Do the
> math in two steps:
> 1) Reduce size to something < blk_size * divisor[units]
> 2) Multiply the result (and the remainder) by blk_size and do final
>    calculations.
> 
> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> Changes since v1:
> - Check against blk_size == 0 [Rasmus Villemoes]
> - Do not rename 'i' to 'order' [Andy Shevchenko]
> ---
>  lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>  1 file changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index f6c27dc..eba8e82 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const
> enum string_size_units units,
>  		[STRING_UNITS_2] = 1024,
>  	};
>  	int i, j;
> -	u32 remainder = 0, sf_cap, exp;
> +	u64 remainder = 0;
> +	u32 sf_cap;
>  	char tmp[8];
>  	const char *unit;
>  
> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size,
> const enum string_size_units units,
>  	if (!size)
>  		goto out;
>  
> -	while (blk_size >= divisor[units]) {
> -		remainder = do_div(blk_size, divisor[units]);
> -		i++;
> +	if (!blk_size) {
> +		WARN_ON(1);

Hmm... Isn't it too strong? WARN_ONCE() might reduce a noise. Or even
pr_warn_once/ratelimited().

> +		size = 0;
> +		goto out;
>  	}

What about doing it before if (!size) ?

Like 

if (!blk_size) {
 pr_warn_once(); /* or WARN_ONCE() ? */
 /* Override size to follow error path */
 size = 0;
}
 
if (!size)

Also, would it be separate patch?

>  
> -	exp = divisor[units] / blk_size;
>  	/*
> -	 * size must be strictly greater than exp here to ensure
> that remainder
> -	 * is greater than divisor[units] coming out of the if
> below.
> +	 * size can be huge and doing size * blk_size right away can
> overflow.
> +	 * As a first step reduce huge size to something less than
> +	 * blk_size * divisor[units].
>  	 */
> -	if (size > exp) {
> +	while (size > (u64)blk_size * divisor[units]) {
>  		remainder = do_div(size, divisor[units]);
> -		remainder *= blk_size;
>  		i++;
> -	} else {
> -		remainder *= size;
>  	}
>  
> +	/* Now we're OK with doing size * blk_size, it won't
> overflow. */
>  	size *= blk_size;
> +	remainder *= blk_size;
> +	/*
> +	 * We were doing partial multiplication by blk_size.
> +	 * remainder >= divisor[units] here means size should be
> increased.
> +	 */
>  	size += remainder / divisor[units];

> -	remainder %= divisor[units];
> +	remainder -= (remainder / divisor[units]) * divisor[units];

I'm sorry I didn't get what the purpose of change here.

(Yes, I was thinking about u64 on 32-bit architecture, but % and / are
working in the similar way aren't they?)

>  
> +	/*
> +	 * Normalize. size >= divisor[units] means we still have
> enough
> +	 * precision and dropping remainder is fine.
> +	 */
>  	while (size >= divisor[units]) {
>  		remainder = do_div(size, divisor[units]);
>  		i++;
> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const
> enum string_size_units units,
>  	if (j) {
>  		remainder *= 1000;
>  		remainder /= divisor[units];
> -		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
> +		/* remainder is < divisor[units] here, (u32) is
> legit */
> +		snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>  		tmp[j+1] = '\0';
>  	}
>  
> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const
> enum string_size_units units,
>  	else
>  		unit = units_str[units][i];
>  
> +	/* size is < divisor[units] here, (u32) is legit */
>  	snprintf(buf, len, "%u%s %s", (u32)size,
>  		 tmp, unit);
>  }

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-27 15:25   ` Andy Shevchenko
@ 2015-10-27 16:16     ` Vitaly Kuznetsov
  2015-10-27 16:24       ` Andy Shevchenko
  0 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-27 16:16 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Andrew Morton, Rasmus Villemoes, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Andy Shevchenko <andriy.shevchenko@linux.intel.com> writes:

> On Tue, 2015-10-27 at 15:06 +0100, Vitaly Kuznetsov wrote:
>> string_get_size() loses precision when there is a remainder for
>> blk_size / divisor[units] and size is big enough. E.g
>> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
>> while it is supposed to return "33.5 MB". For some artificial inputs
>> the result can be ridiculously wrong, e.g.
>> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
>> when "5.70 MB" is expected.
>> 
>> The issues comes from the fact than we through away
>> blk_size / divisor[units] remainder when size is > exp. This can be
>> fixed
>> by saving it and doing some non-trivial calculations later to fix the
>> error
>> but that would make this function even more cumbersome. Slightly re-
>> factor
>> the function to not lose the precision for all inputs.
>> 
>> The overall complexity of this function comes from the fact that size
>> can
>> be huge and we don't want to do size * blk_size as it can overflow.
>> Do the
>> math in two steps:
>> 1) Reduce size to something < blk_size * divisor[units]
>> 2) Multiply the result (and the remainder) by blk_size and do final
>>    calculations.
>> 
>> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> Changes since v1:
>> - Check against blk_size == 0 [Rasmus Villemoes]
>> - Do not rename 'i' to 'order' [Andy Shevchenko]
>> ---
>>  lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>>  1 file changed, 24 insertions(+), 13 deletions(-)
>> 
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index f6c27dc..eba8e82 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const
>> enum string_size_units units,
>>  		[STRING_UNITS_2] = 1024,
>>  	};
>>  	int i, j;
>> -	u32 remainder = 0, sf_cap, exp;
>> +	u64 remainder = 0;
>> +	u32 sf_cap;
>>  	char tmp[8];
>>  	const char *unit;
>>  
>> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size,
>> const enum string_size_units units,
>>  	if (!size)
>>  		goto out;
>>  
>> -	while (blk_size >= divisor[units]) {
>> -		remainder = do_div(blk_size, divisor[units]);
>> -		i++;
>> +	if (!blk_size) {
>> +		WARN_ON(1);
>
> Hmm... Isn't it too strong? WARN_ONCE() might reduce a noise. Or even
> pr_warn_once/ratelimited().

Nobody is supposed to call string_get_size() with blk_size = 0, if
someone does that - it is a bug and that's what WARN_ON is supposed to
report. I'm OK with changing it to WARN_ONCE() but I don't see a big
difference - nobody's calling string_get_size() in a loop, one/two calls
per one storage device is expected.

>
>> +		size = 0;
>> +		goto out;
>>  	}
>
> What about doing it before if (!size) ?
>
> Like 
>
> if (!blk_size) {
>  pr_warn_once(); /* or WARN_ONCE() ? */
>  /* Override size to follow error path */
>  size = 0;
> }
>  
> if (!size)

To be honest I don't see a big difference but I'm fine with the change :-)

>
> Also, would it be separate patch?
>

I'm significantly changing the algorithm here and I didn't want to
introduce an infinite loop with blk_size = 0 but I've just checked and
previous version had division by zero here so yes, I can make it a
separate patch.

>>  
>> -	exp = divisor[units] / blk_size;
>>  	/*
>> -	 * size must be strictly greater than exp here to ensure
>> that remainder
>> -	 * is greater than divisor[units] coming out of the if
>> below.
>> +	 * size can be huge and doing size * blk_size right away can
>> overflow.
>> +	 * As a first step reduce huge size to something less than
>> +	 * blk_size * divisor[units].
>>  	 */
>> -	if (size > exp) {
>> +	while (size > (u64)blk_size * divisor[units]) {
>>  		remainder = do_div(size, divisor[units]);
>> -		remainder *= blk_size;
>>  		i++;
>> -	} else {
>> -		remainder *= size;
>>  	}
>>  
>> +	/* Now we're OK with doing size * blk_size, it won't
>> overflow. */
>>  	size *= blk_size;
>> +	remainder *= blk_size;
>> +	/*
>> +	 * We were doing partial multiplication by blk_size.
>> +	 * remainder >= divisor[units] here means size should be
>> increased.
>> +	 */
>>  	size += remainder / divisor[units];
>
>> -	remainder %= divisor[units];
>> +	remainder -= (remainder / divisor[units]) * divisor[units];
>
> I'm sorry I didn't get what the purpose of change here.
>
> (Yes, I was thinking about u64 on 32-bit architecture, but % and / are
> working in the similar way aren't they?)

Thanks for noticing, there is no functional change here, it just made
the code easier to understand (for me only?). I'm OK with reverting it
to '%='.

>
>>  
>> +	/*
>> +	 * Normalize. size >= divisor[units] means we still have
>> enough
>> +	 * precision and dropping remainder is fine.
>> +	 */
>>  	while (size >= divisor[units]) {
>>  		remainder = do_div(size, divisor[units]);
>>  		i++;
>> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const
>> enum string_size_units units,
>>  	if (j) {
>>  		remainder *= 1000;
>>  		remainder /= divisor[units];
>> -		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
>> +		/* remainder is < divisor[units] here, (u32) is
>> legit */
>> +		snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>>  		tmp[j+1] = '\0';
>>  	}
>>  
>> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const
>> enum string_size_units units,
>>  	else
>>  		unit = units_str[units][i];
>>  
>> +	/* size is < divisor[units] here, (u32) is legit */
>>  	snprintf(buf, len, "%u%s %s", (u32)size,
>>  		 tmp, unit);
>>  }

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-27 16:16     ` Vitaly Kuznetsov
@ 2015-10-27 16:24       ` Andy Shevchenko
  0 siblings, 0 replies; 9+ messages in thread
From: Andy Shevchenko @ 2015-10-27 16:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Andrew Morton, Rasmus Villemoes, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

On Tue, 2015-10-27 at 17:16 +0100, Vitaly Kuznetsov wrote:
> Andy Shevchenko <andriy.shevchenko@linux.intel.com> writes:

[]

> > +	if (!blk_size) {
> > > +		WARN_ON(1);
> > 
> > Hmm... Isn't it too strong? WARN_ONCE() might reduce a noise. Or
> > even
> > pr_warn_once/ratelimited().
> 
> Nobody is supposed to call string_get_size() with blk_size = 0, if
> someone does that - it is a bug and that's what WARN_ON is supposed
> to
> report. I'm OK with changing it to WARN_ONCE() but I don't see a big
> difference - nobody's calling string_get_size() in a loop, one/two
> calls
> per one storage device is expected.

I'm fine with WARN_ONCE() if there is no objection.

> 
> > 
> > > +		size = 0;
> > > +		goto out;
> > >  	}
> > 
> > What about doing it before if (!size) ?
> > 
> > Like 
> > 
> > if (!blk_size) {
> >  pr_warn_once(); /* or WARN_ONCE() ? */
> >  /* Override size to follow error path */
> >  size = 0;
> > }
> >  
> > if (!size)
> 
> To be honest I don't see a big difference but I'm fine with the
> change :-)

Maybe Rasmus can judge me.

> > > -	remainder %= divisor[units];
> > > +	remainder -= (remainder / divisor[units]) *
> > > divisor[units];
> > 
> > I'm sorry I didn't get what the purpose of change here.
> > 
> > (Yes, I was thinking about u64 on 32-bit architecture, but % and /
> > are
> > working in the similar way aren't they?)
> 
> Thanks for noticing, there is no functional change here, it just made
> the code easier to understand (for me only?). I'm OK with reverting
> it
> to '%='.

For me is the opposite. remainder = remainder % divisor[units] will
work as well, but why change the code at all.

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
  2015-10-27 15:25   ` Andy Shevchenko
@ 2015-10-27 21:02   ` Rasmus Villemoes
  2015-10-29 13:51     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2015-10-27 21:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Andrew Morton, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

On Tue, Oct 27 2015, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> string_get_size() loses precision when there is a remainder for
> blk_size / divisor[units] and size is big enough. E.g
> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
> while it is supposed to return "33.5 MB". For some artificial inputs
> the result can be ridiculously wrong, e.g.
> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
> when "5.70 MB" is expected.
>
> The issues comes from the fact than we through away
> blk_size / divisor[units] remainder when size is > exp. This can be fixed
> by saving it and doing some non-trivial calculations later to fix the error
> but that would make this function even more cumbersome. Slightly re-factor
> the function to not lose the precision for all inputs.
>
> The overall complexity of this function comes from the fact that size can
> be huge and we don't want to do size * blk_size as it can overflow. Do the
> math in two steps:
> 1) Reduce size to something < blk_size * divisor[units]
> 2) Multiply the result (and the remainder) by blk_size and do final
>    calculations.
>
> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> Changes since v1:
> - Check against blk_size == 0 [Rasmus Villemoes]
> - Do not rename 'i' to 'order' [Andy Shevchenko]
> ---
>  lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>  1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index f6c27dc..eba8e82 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>  		[STRING_UNITS_2] = 1024,
>  	};
>  	int i, j;
> -	u32 remainder = 0, sf_cap, exp;
> +	u64 remainder = 0;
> +	u32 sf_cap;
>  	char tmp[8];
>  	const char *unit;
>  
> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>  	if (!size)
>  		goto out;
>  
> -	while (blk_size >= divisor[units]) {
> -		remainder = do_div(blk_size, divisor[units]);
> -		i++;
> +	if (!blk_size) {
> +		WARN_ON(1);
> +		size = 0;
> +		goto out;
>  	}

I don't think we need to handle that, but if you want, please put the
WARN inside the conditional (so say "if (WARN_ON(!blk_size)) {...}". And
don't make it _ONCE; the ordinary version is slightly cheaper (since
there's no static bool __warned and no code to manage that).

>  
> -	exp = divisor[units] / blk_size;
>  	/*
> -	 * size must be strictly greater than exp here to ensure that remainder
> -	 * is greater than divisor[units] coming out of the if below.
> +	 * size can be huge and doing size * blk_size right away can overflow.
> +	 * As a first step reduce huge size to something less than
> +	 * blk_size * divisor[units].
>  	 */
> -	if (size > exp) {
> +	while (size > (u64)blk_size * divisor[units]) {

It seems weird that we reduce _more_ for smaller block sizes - indeed,
for blk_size==1 we end up reducing size to <= 1000 (or 1024), which is
certainly a good way to lose some precision. Also, this relies on
blk_size being smaller than (roughly) sqrt(U64_MAX/divisor[units]),
which is of course true in practice, but a slightly annoying implicit
assumption.

I do think that my approach of reducing size till it's smaller than
U64_MAX/blk_size is simpler and better. There's much less fixup
code. It's simply

  while (size > div_u64(U64_MAX, blk_size) {
    do_div(size, divisor[units]);
    ++i;
  }
  size *= blk_size;
  while (size > divisor[units]) {
    remainder = do_div(size, divisor[units]);
    ++i;
  }

which is as self-explaining as it gets.

And yes, one needs to use the include/linux/math64.h functions/macros
for u64/u32 divisions - otherwise I'm pretty sure one will get friendly
mails from the build bot.

  while (size > )
>  		remainder = do_div(size, divisor[units]);
> -		remainder *= blk_size;
>  		i++;
> -	} else {
> -		remainder *= size;
>  	}
>  
> +	/* Now we're OK with doing size * blk_size, it won't overflow. */
>  	size *= blk_size;
> +	remainder *= blk_size;
> +	/*
> +	 * We were doing partial multiplication by blk_size.
> +	 * remainder >= divisor[units] here means size should be increased.
> +	 */
>  	size += remainder / divisor[units];
> -	remainder %= divisor[units];
> +	remainder -= (remainder / divisor[units]) * divisor[units];
>  
> +	/*
> +	 * Normalize. size >= divisor[units] means we still have enough
> +	 * precision and dropping remainder is fine.
> +	 */
>  	while (size >= divisor[units]) {
>  		remainder = do_div(size, divisor[units]);
>  		i++;
> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>  	if (j) {
>  		remainder *= 1000;
>  		remainder /= divisor[units];
> -		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
> +		/* remainder is < divisor[units] here, (u32) is legit */

What is actually important is that remainder is < 1000. remainder was
initially < divisor[units], but then the multiplication and division
transformed that into "1/1000s" of whatever unit we're using. 

> +		snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>  		tmp[j+1] = '\0';
>  	}
>  
> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>  	else
>  		unit = units_str[units][i];
>  
> +	/* size is < divisor[units] here, (u32) is legit */
>  	snprintf(buf, len, "%u%s %s", (u32)size,
>  		 tmp, unit);
>  }

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-27 21:02   ` Rasmus Villemoes
@ 2015-10-29 13:51     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-29 13:51 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Andrew Morton, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Rasmus Villemoes <linux@rasmusvillemoes.dk> writes:

> On Tue, Oct 27 2015, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> string_get_size() loses precision when there is a remainder for
>> blk_size / divisor[units] and size is big enough. E.g
>> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
>> while it is supposed to return "33.5 MB". For some artificial inputs
>> the result can be ridiculously wrong, e.g.
>> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
>> when "5.70 MB" is expected.
>>
>> The issues comes from the fact than we through away
>> blk_size / divisor[units] remainder when size is > exp. This can be fixed
>> by saving it and doing some non-trivial calculations later to fix the error
>> but that would make this function even more cumbersome. Slightly re-factor
>> the function to not lose the precision for all inputs.
>>
>> The overall complexity of this function comes from the fact that size can
>> be huge and we don't want to do size * blk_size as it can overflow. Do the
>> math in two steps:
>> 1) Reduce size to something < blk_size * divisor[units]
>> 2) Multiply the result (and the remainder) by blk_size and do final
>>    calculations.
>>
>> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> Changes since v1:
>> - Check against blk_size == 0 [Rasmus Villemoes]
>> - Do not rename 'i' to 'order' [Andy Shevchenko]
>> ---
>>  lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>>  1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index f6c27dc..eba8e82 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  		[STRING_UNITS_2] = 1024,
>>  	};
>>  	int i, j;
>> -	u32 remainder = 0, sf_cap, exp;
>> +	u64 remainder = 0;
>> +	u32 sf_cap;
>>  	char tmp[8];
>>  	const char *unit;
>>  
>> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  	if (!size)
>>  		goto out;
>>  
>> -	while (blk_size >= divisor[units]) {
>> -		remainder = do_div(blk_size, divisor[units]);
>> -		i++;
>> +	if (!blk_size) {
>> +		WARN_ON(1);
>> +		size = 0;
>> +		goto out;
>>  	}
>
> I don't think we need to handle that, but if you want, please put the
> WARN inside the conditional (so say "if (WARN_ON(!blk_size)) {...}". And
> don't make it _ONCE; the ordinary version is slightly cheaper (since
> there's no static bool __warned and no code to manage that).

Ok, I'll do that in a separate patch and hope Andy is not going to
complain about not adding _ONCE.

>
>>  
>> -	exp = divisor[units] / blk_size;
>>  	/*
>> -	 * size must be strictly greater than exp here to ensure that remainder
>> -	 * is greater than divisor[units] coming out of the if below.
>> +	 * size can be huge and doing size * blk_size right away can overflow.
>> +	 * As a first step reduce huge size to something less than
>> +	 * blk_size * divisor[units].
>>  	 */
>> -	if (size > exp) {
>> +	while (size > (u64)blk_size * divisor[units]) {
>
> It seems weird that we reduce _more_ for smaller block sizes - indeed,
> for blk_size==1 we end up reducing size to <= 1000 (or 1024), which is
> certainly a good way to lose some precision. Also, this relies on
> blk_size being smaller than (roughly) sqrt(U64_MAX/divisor[units]),
> which is of course true in practice, but a slightly annoying implicit
> assumption.
>
> I do think that my approach of reducing size till it's smaller than
> U64_MAX/blk_size is simpler and better. There's much less fixup
> code. It's simply
>
>   while (size > div_u64(U64_MAX, blk_size) {
>     do_div(size, divisor[units]);
>     ++i;
>   }
>   size *= blk_size;
>   while (size > divisor[units]) {
>     remainder = do_div(size, divisor[units]);
>     ++i;
>   }
>
> which is as self-explaining as it gets.
>

OK, let me borrow this idea from you and change Reported-by: tag with
Suggested-by:. I'll test and send v3.

> And yes, one needs to use the include/linux/math64.h functions/macros
> for u64/u32 divisions - otherwise I'm pretty sure one will get friendly
> mails from the build bot.
>
>   while (size > )
>>  		remainder = do_div(size, divisor[units]);
>> -		remainder *= blk_size;
>>  		i++;
>> -	} else {
>> -		remainder *= size;
>>  	}
>>  
>> +	/* Now we're OK with doing size * blk_size, it won't overflow. */
>>  	size *= blk_size;
>> +	remainder *= blk_size;
>> +	/*
>> +	 * We were doing partial multiplication by blk_size.
>> +	 * remainder >= divisor[units] here means size should be increased.
>> +	 */
>>  	size += remainder / divisor[units];
>> -	remainder %= divisor[units];
>> +	remainder -= (remainder / divisor[units]) * divisor[units];
>>  
>> +	/*
>> +	 * Normalize. size >= divisor[units] means we still have enough
>> +	 * precision and dropping remainder is fine.
>> +	 */
>>  	while (size >= divisor[units]) {
>>  		remainder = do_div(size, divisor[units]);
>>  		i++;
>> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  	if (j) {
>>  		remainder *= 1000;
>>  		remainder /= divisor[units];
>> -		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
>> +		/* remainder is < divisor[units] here, (u32) is legit */
>
> What is actually important is that remainder is < 1000. remainder was
> initially < divisor[units], but then the multiplication and division
> transformed that into "1/1000s" of whatever unit we're using. 
>
>> +		snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>>  		tmp[j+1] = '\0';
>>  	}
>>  
>> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  	else
>>  		unit = units_str[units][i];
>>  
>> +	/* size is < divisor[units] here, (u32) is legit */
>>  	snprintf(buf, len, "%u%s %s", (u32)size,
>>  		 tmp, unit);
>>  }

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-10-29 13:51 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
2015-10-27 15:25   ` Andy Shevchenko
2015-10-27 16:16     ` Vitaly Kuznetsov
2015-10-27 16:24       ` Andy Shevchenko
2015-10-27 21:02   ` Rasmus Villemoes
2015-10-29 13:51     ` Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).