linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] lib/string_helpers: fix precision issues and introduce tests
@ 2015-10-29 16:30 Vitaly Kuznetsov
  2015-10-29 16:30 ` [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-29 16:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Linux always lies about your storage size when it has 4k sectors and its
size is big enough. E.g. a device with 8192 4k sectors will be reported as
"32.7 MB/32 MiB" while "33.5 MB/32 MiB" is expected. This series is
supposed to fix the issue by fixing calculation precision in
string_get_size() for all possible inputs.

PATCH 1/4 is a preparatory change, PATCH 2/4 adds additional protection
against blk_size=0 (nobody is supposed to call string_get_size() with
with blk_size=0, but better safe than sorry), PATCH 3/4 re-factors
string_get_size() fixing the issue, PATCH 4/4 introduces tests for
string_get_size().

PATCH 4/4 was previously sent as part of "lib/string_helpers.c: fix
infinite loop in string_get_size()" series but it is still not merged
upstream. In this submission I improve it and add additional tests to it.

Changes since v2:
- Separate blk_size check from Patch 3/4 to new Patch 2/4 [Andy Shevchenko]
- Slightly change the algorithm in Patch 3/4 [Rasmus Villemoes]

Vitaly Kuznetsov (4):
  lib/string_helpers: change blk_size to u32 for string_get_size()
    interface
  lib/string_helpers.c: protect string_get_size() against blk_size=0
  lib/string_helpers.c: don't lose precision in string_get_size()
  lib/test-string_helpers.c: add string_get_size() tests

 include/linux/string_helpers.h |  2 +-
 lib/string_helpers.c           | 38 ++++++++++++++-------------
 lib/test-string_helpers.c      | 58 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+), 19 deletions(-)

-- 
2.4.3


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-29 16:30 [PATCH v3 0/4] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
@ 2015-10-29 16:30 ` Vitaly Kuznetsov
  2015-10-29 22:27   ` James Bottomley
  2015-10-29 16:30 ` [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0 Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-29 16:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

string_get_size() can't really handle huge block sizes, especially
blk_size > U32_MAX but string_get_size() interface states the opposite.
Change blk_size from u64 to u32 to reflect the reality.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 include/linux/string_helpers.h | 2 +-
 lib/string_helpers.c           | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
index dabe643..1223e80 100644
--- a/include/linux/string_helpers.h
+++ b/include/linux/string_helpers.h
@@ -10,7 +10,7 @@ enum string_size_units {
 	STRING_UNITS_2,		/* use binary powers of 2^10 */
 };
 
-void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
+void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
 		     char *buf, int len);
 
 #define UNESCAPE_SPACE		0x01
diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..f6c27dc 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -26,7 +26,7 @@
  * at least 9 bytes and will always be zero terminated.
  *
  */
-void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
+void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 		     char *buf, int len)
 {
 	static const char *const units_10[] = {
@@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
+	exp = divisor[units] / blk_size;
 	/*
 	 * size must be strictly greater than exp here to ensure that remainder
 	 * is greater than divisor[units] coming out of the if below.
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-29 16:30 [PATCH v3 0/4] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
  2015-10-29 16:30 ` [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
@ 2015-10-29 16:30 ` Vitaly Kuznetsov
  2015-10-29 21:24   ` Andy Shevchenko
  2015-10-29 23:00   ` James Bottomley
  2015-10-29 16:30 ` [PATCH v3 3/4] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
  2015-10-29 16:30 ` [PATCH v3 4/4] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
  3 siblings, 2 replies; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-29 16:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Division by zero happens if blk_size=0 is supplied to string_get_size().
Add WARN_ON() and set size to 0 to report '0 B'.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 lib/string_helpers.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index f6c27dc..ff3575b 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 
 	tmp[0] = '\0';
 	i = 0;
+
+	/* Calling string_get_size() with blk_size=0 is wrong! */
+	if (WARN_ON(!blk_size))
+		size = 0;
+
 	if (!size)
 		goto out;
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 3/4] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-29 16:30 [PATCH v3 0/4] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
  2015-10-29 16:30 ` [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
  2015-10-29 16:30 ` [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0 Vitaly Kuznetsov
@ 2015-10-29 16:30 ` Vitaly Kuznetsov
  2015-10-29 21:22   ` Andy Shevchenko
  2015-10-29 16:30 ` [PATCH v3 4/4] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
  3 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-29 16:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

string_get_size() loses precision when there is a remainder for
blk_size / divisor[units] and size is big enough. E.g
string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
while it is supposed to return "33.5 MB". For some artificial inputs
the result can be ridiculously wrong, e.g.
string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
when "5.70 MB" is expected.

The issues comes from the fact than we through away
blk_size / divisor[units] remainder when size is > exp. This can be fixed
by saving it and doing some non-trivial calculations later to fix the error
but that would make this function even more cumbersome. Slightly re-factor
the function to not lose the precision for all inputs.

The overall complexity of this function comes from the fact that size can
be huge and we don't want to do size * blk_size as it can overflow. Do the
math in two steps:
1) Reduce size to something < U64_MAX / blk_size.
2) Multiply the result by blk_size and do final calculations.

Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
Changes since v2:
- Reduce size to something < U64_MAX / blk_size as a first step instead of
  blk_size * divisor[units], remove fixup code [Rasmus Villemoes]
- Separate !blk_size check into a separate patch [Andy Shevchenko]
---
 lib/string_helpers.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index ff3575b..29263c1 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -44,7 +44,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 		[STRING_UNITS_2] = 1024,
 	};
 	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	u32 remainder = 0, sf_cap;
 	char tmp[8];
 	const char *unit;
 
@@ -58,27 +58,23 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 	if (!size)
 		goto out;
 
-	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
-		i++;
-	}
-
-	exp = divisor[units] / blk_size;
 	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
+	 * size can be huge and doing size * blk_size right away can overflow.
+	 * As a first step reduce huge size to something less than
+	 * U64_MAX / blk_size.
 	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
-		i++;
-	} else {
-		remainder *= size;
+	while (size > div_u64(U64_MAX, blk_size)) {
+		/*
+		 * We do not need to keep the remainder as blk_size is capped
+		 * by U32_MAX and we'll still have enough precision after the
+		 * loop.
+		 */
+		do_div(size, divisor[units]);
+		++i;
 	}
 
+	/* Now we're OK with doing size * blk_size, it won't overflow. */
 	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
 
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
@@ -102,6 +98,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
 	else
 		unit = units_str[units][i];
 
+	/* size is < divisor[units] here, (u32) is legit */
 	snprintf(buf, len, "%u%s %s", (u32)size,
 		 tmp, unit);
 }
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 4/4] lib/test-string_helpers.c: add string_get_size() tests
  2015-10-29 16:30 [PATCH v3 0/4] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2015-10-29 16:30 ` [PATCH v3 3/4] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
@ 2015-10-29 16:30 ` Vitaly Kuznetsov
  2015-10-29 21:33   ` Andy Shevchenko
  3 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-29 16:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, Andy Shevchenko, Ulf Hansson, James Bottomley,
	Kees Cook, linux-kernel

Add a couple of simple tests for string_get_size().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 lib/test-string_helpers.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/lib/test-string_helpers.c b/lib/test-string_helpers.c
index 8e376ef..4c77b54 100644
--- a/lib/test-string_helpers.c
+++ b/lib/test-string_helpers.c
@@ -326,6 +326,61 @@ out:
 	kfree(out_test);
 }
 
+#define string_get_size_maxbuf 16
+#define test_string_get_size_one(size, blk_size, exp_result10, exp_result2)    \
+	do {                                                                   \
+		BUILD_BUG_ON(sizeof(exp_result10) >= string_get_size_maxbuf);  \
+		BUILD_BUG_ON(sizeof(exp_result2) >= string_get_size_maxbuf);   \
+		__test_string_get_size((size), (blk_size), (exp_result10),     \
+				       (exp_result2));                         \
+	} while (0)
+
+
+static __init void __test_string_get_size(const u64 size, const u32 blk_size,
+					  const char *exp_result10,
+					  const char *exp_result2)
+{
+	char buf10[string_get_size_maxbuf];
+	char buf2[string_get_size_maxbuf];
+
+	string_get_size(size, blk_size, STRING_UNITS_10, buf10, sizeof(buf10));
+	string_get_size(size, blk_size, STRING_UNITS_2, buf2, sizeof(buf2));
+
+	if (!memcmp(buf10, exp_result10, strlen(exp_result10) + 1))
+		goto check_stringunits_2;
+
+	buf10[sizeof(buf10) - 1] = '\0';
+
+	pr_warn("Test 'test_string_get_size' failed!\n");
+	pr_warn("string_get_size(size = %llu, blk_size = %u, units = %s)\n",
+		size, blk_size, "STRING_UNITS_10");
+	pr_warn("expected: '%s', got '%s'\n", exp_result10, buf10);
+
+check_stringunits_2:
+	if (!memcmp(buf2, exp_result2, strlen(exp_result2) + 1))
+		return;
+
+	buf2[sizeof(buf2) - 1] = '\0';
+
+	pr_warn("Test 'test_string_get_size' failed!\n");
+	pr_warn("string_get_size(size = %llu, blk_size = %u, units = %s)\n",
+		size, blk_size, "STRING_UNITS_2");
+	pr_warn("expected: '%s', got '%s'\n", exp_result2, buf2);
+}
+
+static __init void test_string_get_size(void)
+{
+	test_string_get_size_one(16384, 512, "8.38 MB", "8.00 MiB");
+	test_string_get_size_one(500118192, 512, "256 GB", "238 GiB");
+	test_string_get_size_one(8192, 4096, "33.5 MB", "32.0 MiB");
+	test_string_get_size_one(1100, 1, "1.10 kB", "1.07 KiB");
+	test_string_get_size_one(3000, 1900, "5.70 MB", "5.43 MiB");
+	test_string_get_size_one(U64_MAX, 4096, "75.5 ZB", "63.9 ZiB");
+	test_string_get_size_one(1999, U32_MAX, "8.58 TB", "7.80 TiB");
+	test_string_get_size_one(1, 512, "512 B", "512 B");
+	test_string_get_size_one(0, 512, "0 B", "0 B");
+}
+
 static int __init test_string_helpers_init(void)
 {
 	unsigned int i;
@@ -344,6 +399,9 @@ static int __init test_string_helpers_init(void)
 	for (i = 0; i < (ESCAPE_ANY_NP | ESCAPE_HEX) + 1; i++)
 		test_string_escape("escape 1", escape1, i, TEST_STRING_2_DICT_1);
 
+	/* Test string_get_size() */
+	test_string_get_size();
+
 	return -EINVAL;
 }
 module_init(test_string_helpers_init);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/4] lib/string_helpers.c: don't lose precision in string_get_size()
  2015-10-29 16:30 ` [PATCH v3 3/4] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
@ 2015-10-29 21:22   ` Andy Shevchenko
  0 siblings, 0 replies; 25+ messages in thread
From: Andy Shevchenko @ 2015-10-29 21:22 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Andrew Morton, Rasmus Villemoes, Andy Shevchenko, Ulf Hansson,
	James Bottomley, Kees Cook, linux-kernel

On Thu, Oct 29, 2015 at 6:30 PM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> string_get_size() loses precision when there is a remainder for
> blk_size / divisor[units] and size is big enough. E.g
> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
> while it is supposed to return "33.5 MB". For some artificial inputs
> the result can be ridiculously wrong, e.g.
> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
> when "5.70 MB" is expected.
>
> The issues comes from the fact than we through away
> blk_size / divisor[units] remainder when size is > exp. This can be fixed
> by saving it and doing some non-trivial calculations later to fix the error
> but that would make this function even more cumbersome. Slightly re-factor
> the function to not lose the precision for all inputs.
>
> The overall complexity of this function comes from the fact that size can
> be huge and we don't want to do size * blk_size as it can overflow. Do the
> math in two steps:
> 1) Reduce size to something < U64_MAX / blk_size.
> 2) Multiply the result by blk_size and do final calculations.
>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

One nitpick below.

>         /*
> -        * size must be strictly greater than exp here to ensure that remainder
> -        * is greater than divisor[units] coming out of the if below.
> +        * size can be huge and doing size * blk_size right away can overflow.
> +        * As a first step reduce huge size to something less than
> +        * U64_MAX / blk_size.
>          */
> -       if (size > exp) {
> -               remainder = do_div(size, divisor[units]);
> -               remainder *= blk_size;
> -               i++;
> -       } else {
> -               remainder *= size;
> +       while (size > div_u64(U64_MAX, blk_size)) {
> +               /*
> +                * We do not need to keep the remainder as blk_size is capped
> +                * by U32_MAX and we'll still have enough precision after the
> +                * loop.
> +                */
> +               do_div(size, divisor[units]);

> +               ++i;

You may leave i++; here. Any reason not to do that?

>         }


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-29 16:30 ` [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0 Vitaly Kuznetsov
@ 2015-10-29 21:24   ` Andy Shevchenko
  2015-10-29 23:00   ` James Bottomley
  1 sibling, 0 replies; 25+ messages in thread
From: Andy Shevchenko @ 2015-10-29 21:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Andrew Morton, Rasmus Villemoes, Andy Shevchenko, Ulf Hansson,
	James Bottomley, Kees Cook, linux-kernel

On Thu, Oct 29, 2015 at 6:30 PM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> Division by zero happens if blk_size=0 is supplied to string_get_size().
> Add WARN_ON() and set size to 0 to report '0 B'.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>

> ---
>  lib/string_helpers.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index f6c27dc..ff3575b 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>
>         tmp[0] = '\0';
>         i = 0;
> +
> +       /* Calling string_get_size() with blk_size=0 is wrong! */
> +       if (WARN_ON(!blk_size))
> +               size = 0;
> +
>         if (!size)
>                 goto out;
>
> --
> 2.4.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/4] lib/test-string_helpers.c: add string_get_size() tests
  2015-10-29 16:30 ` [PATCH v3 4/4] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
@ 2015-10-29 21:33   ` Andy Shevchenko
  0 siblings, 0 replies; 25+ messages in thread
From: Andy Shevchenko @ 2015-10-29 21:33 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Andrew Morton, Rasmus Villemoes, Andy Shevchenko, Ulf Hansson,
	James Bottomley, Kees Cook, linux-kernel

On Thu, Oct 29, 2015 at 6:30 PM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> Add a couple of simple tests for string_get_size().
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  lib/test-string_helpers.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 58 insertions(+)
>
> diff --git a/lib/test-string_helpers.c b/lib/test-string_helpers.c
> index 8e376ef..4c77b54 100644
> --- a/lib/test-string_helpers.c
> +++ b/lib/test-string_helpers.c
> @@ -326,6 +326,61 @@ out:
>         kfree(out_test);
>  }
>
> +#define string_get_size_maxbuf 16
> +#define test_string_get_size_one(size, blk_size, exp_result10, exp_result2)    \
> +       do {                                                                   \
> +               BUILD_BUG_ON(sizeof(exp_result10) >= string_get_size_maxbuf);  \
> +               BUILD_BUG_ON(sizeof(exp_result2) >= string_get_size_maxbuf);   \
> +               __test_string_get_size((size), (blk_size), (exp_result10),     \
> +                                      (exp_result2));                         \
> +       } while (0)
> +
> +
> +static __init void __test_string_get_size(const u64 size, const u32 blk_size,
> +                                         const char *exp_result10,
> +                                         const char *exp_result2)
> +{
> +       char buf10[string_get_size_maxbuf];
> +       char buf2[string_get_size_maxbuf];
> +
> +       string_get_size(size, blk_size, STRING_UNITS_10, buf10, sizeof(buf10));
> +       string_get_size(size, blk_size, STRING_UNITS_2, buf2, sizeof(buf2));
> +
> +       if (!memcmp(buf10, exp_result10, strlen(exp_result10) + 1))
> +               goto check_stringunits_2;
> +

> +       buf10[sizeof(buf10) - 1] = '\0';
> +
> +       pr_warn("Test 'test_string_get_size' failed!\n");
> +       pr_warn("string_get_size(size = %llu, blk_size = %u, units = %s)\n",
> +               size, blk_size, "STRING_UNITS_10");
> +       pr_warn("expected: '%s', got '%s'\n", exp_result10, buf10);

Looks to me as a helper function

test_string_get_size_pr_err(size, blk_size, units, exp_result, buf, buflen) {}

if (memcmp(buf10, exp_result10, strlen(exp_result10) + 1))
 _pr_err(...);

> +
> +check_stringunits_2:
> +       if (!memcmp(buf2, exp_result2, strlen(exp_result2) + 1))
> +               return;
> +
> +       buf2[sizeof(buf2) - 1] = '\0';
> +
> +       pr_warn("Test 'test_string_get_size' failed!\n");
> +       pr_warn("string_get_size(size = %llu, blk_size = %u, units = %s)\n",
> +               size, blk_size, "STRING_UNITS_2");
> +       pr_warn("expected: '%s', got '%s'\n", exp_result2, buf2);
> +}
> +
> +static __init void test_string_get_size(void)
> +{
> +       test_string_get_size_one(16384, 512, "8.38 MB", "8.00 MiB");
> +       test_string_get_size_one(500118192, 512, "256 GB", "238 GiB");
> +       test_string_get_size_one(8192, 4096, "33.5 MB", "32.0 MiB");
> +       test_string_get_size_one(1100, 1, "1.10 kB", "1.07 KiB");
> +       test_string_get_size_one(3000, 1900, "5.70 MB", "5.43 MiB");
> +       test_string_get_size_one(U64_MAX, 4096, "75.5 ZB", "63.9 ZiB");
> +       test_string_get_size_one(1999, U32_MAX, "8.58 TB", "7.80 TiB");
> +       test_string_get_size_one(1, 512, "512 B", "512 B");
> +       test_string_get_size_one(0, 512, "0 B", "0 B");

> +}
> +
>  static int __init test_string_helpers_init(void)
>  {
>         unsigned int i;
> @@ -344,6 +399,9 @@ static int __init test_string_helpers_init(void)
>         for (i = 0; i < (ESCAPE_ANY_NP | ESCAPE_HEX) + 1; i++)
>                 test_string_escape("escape 1", escape1, i, TEST_STRING_2_DICT_1);
>
> +       /* Test string_get_size() */
> +       test_string_get_size();
> +
>         return -EINVAL;
>  }
>  module_init(test_string_helpers_init);
> --
> 2.4.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-29 16:30 ` [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
@ 2015-10-29 22:27   ` James Bottomley
  2015-10-29 23:19     ` Rasmus Villemoes
  2015-10-30 10:46     ` Vitaly Kuznetsov
  0 siblings, 2 replies; 25+ messages in thread
From: James Bottomley @ 2015-10-29 22:27 UTC (permalink / raw)
  To: vkuznets
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2158 bytes --]

On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> string_get_size() can't really handle huge block sizes, especially
> blk_size > U32_MAX but string_get_size() interface states the opposite.
> Change blk_size from u64 to u32 to reflect the reality.

What is the actual evidence for this?  The calculation is designed to be
a symmetric 128 bit multiply.  When I wrote and tested it, it worked
fine for huge block sizes.

James

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  include/linux/string_helpers.h | 2 +-
>  lib/string_helpers.c           | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
> index dabe643..1223e80 100644
> --- a/include/linux/string_helpers.h
> +++ b/include/linux/string_helpers.h
> @@ -10,7 +10,7 @@ enum string_size_units {
>  	STRING_UNITS_2,		/* use binary powers of 2^10 */
>  };
>  
> -void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
> +void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
>  		     char *buf, int len);
>  
>  #define UNESCAPE_SPACE		0x01
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index 5939f63..f6c27dc 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -26,7 +26,7 @@
>   * at least 9 bytes and will always be zero terminated.
>   *
>   */
> -void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> +void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>  		     char *buf, int len)
>  {
>  	static const char *const units_10[] = {
> @@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>  		i++;
>  	}
>  
> -	exp = divisor[units] / (u32)blk_size;
> +	exp = divisor[units] / blk_size;
>  	/*
>  	 * size must be strictly greater than exp here to ensure that remainder
>  	 * is greater than divisor[units] coming out of the if below.


ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-29 16:30 ` [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0 Vitaly Kuznetsov
  2015-10-29 21:24   ` Andy Shevchenko
@ 2015-10-29 23:00   ` James Bottomley
  2015-10-29 23:32     ` Andy Shevchenko
  1 sibling, 1 reply; 25+ messages in thread
From: James Bottomley @ 2015-10-29 23:00 UTC (permalink / raw)
  To: vkuznets
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1068 bytes --]

On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> Division by zero happens if blk_size=0 is supplied to string_get_size().
> Add WARN_ON() and set size to 0 to report '0 B'.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  lib/string_helpers.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index f6c27dc..ff3575b 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>  
>  	tmp[0] = '\0';
>  	i = 0;
> +
> +	/* Calling string_get_size() with blk_size=0 is wrong! */
> +	if (WARN_ON(!blk_size))

Get rid of the WARN_ON; it's the standard thing to do for a partially
connected device.  Seeing zero is standard in a whole variety of
situations.  SCSI shims the zero but most other drivers don't.

James

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-29 22:27   ` James Bottomley
@ 2015-10-29 23:19     ` Rasmus Villemoes
  2015-10-29 23:24       ` Rasmus Villemoes
  2015-10-30  3:33       ` James Bottomley
  2015-10-30 10:46     ` Vitaly Kuznetsov
  1 sibling, 2 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2015-10-29 23:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: vkuznets, ulf.hansson, andriy.shevchenko, keescook, linux-kernel, akpm

On Thu, Oct 29 2015, James Bottomley <jbottomley@odin.com> wrote:

> On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> string_get_size() can't really handle huge block sizes, especially
>> blk_size > U32_MAX but string_get_size() interface states the opposite.
>> Change blk_size from u64 to u32 to reflect the reality.
>
> What is the actual evidence for this?  The calculation is designed to be
> a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> fine for huge block sizes.
>

May I politely ask how you tested it, and what you mean by "worked"? The
bug I reported last week was particularly concerning block sizes >= 1024
(e.g. the 32768, 1024 pair giving 32.7 MB where the correct output would
be 33.5 MB). Now it turns out that it was actually broken for smaller
block sizes as well. For ~13000 semirandom size,blk_size pairs, the
current code produces the wrong result in ~2100 cases. The new code
reduces that to 122 cases, all of which are off by one in the last
digit.

And I don't buy the symmetry argument either. Mathematically, it should
give the same, but your algorithm produces 2.04 MB for 512,4096 and 2.09
MB for 4096,512.

Maybe the commit message could be better, but I think it makes a lot of
sense to make blk_size u32. Breaking the symmetry between size and
blk_size is good (less likely that the arguments get swapped). It
allows a simpler implementation. It makes the generated code
smaller.

Rasmus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-29 23:19     ` Rasmus Villemoes
@ 2015-10-29 23:24       ` Rasmus Villemoes
  2015-10-30  3:33       ` James Bottomley
  1 sibling, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2015-10-29 23:24 UTC (permalink / raw)
  To: James Bottomley
  Cc: vkuznets, ulf.hansson, andriy.shevchenko, keescook, linux-kernel, akpm

On Fri, Oct 30 2015, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:

> block sizes as well. For ~13000 semirandom size,blk_size pairs,

Sorry, that should have been ~20000.

Rasmus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-29 23:00   ` James Bottomley
@ 2015-10-29 23:32     ` Andy Shevchenko
  2015-10-30  3:34       ` James Bottomley
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Shevchenko @ 2015-10-29 23:32 UTC (permalink / raw)
  To: James Bottomley
  Cc: vkuznets, ulf.hansson, linux, andriy.shevchenko, keescook,
	linux-kernel, akpm

On Fri, Oct 30, 2015 at 1:00 AM, James Bottomley <jbottomley@odin.com> wrote:
> On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> Division by zero happens if blk_size=0 is supplied to string_get_size().
>> Add WARN_ON() and set size to 0 to report '0 B'.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  lib/string_helpers.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index f6c27dc..ff3575b 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>
>>       tmp[0] = '\0';
>>       i = 0;
>> +
>> +     /* Calling string_get_size() with blk_size=0 is wrong! */
>> +     if (WARN_ON(!blk_size))
>
> Get rid of the WARN_ON; it's the standard thing to do for a partially
> connected device.  Seeing zero is standard in a whole variety of
> situations.  SCSI shims the zero but most other drivers don't.

For *block* size? It will crash the kernel. I've checked, it wasn't
changed from the beginning (b9f28d863594).

+       exp = divisor[units] / (u32)blk_size;

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-29 23:19     ` Rasmus Villemoes
  2015-10-29 23:24       ` Rasmus Villemoes
@ 2015-10-30  3:33       ` James Bottomley
  1 sibling, 0 replies; 25+ messages in thread
From: James Bottomley @ 2015-10-30  3:33 UTC (permalink / raw)
  To: linux
  Cc: ulf.hansson, keescook, andriy.shevchenko, vkuznets, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2621 bytes --]

On Fri, 2015-10-30 at 00:19 +0100, Rasmus Villemoes wrote:
> On Thu, Oct 29 2015, James Bottomley <jbottomley@odin.com> wrote:
> 
> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> string_get_size() can't really handle huge block sizes, especially
> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
> >> Change blk_size from u64 to u32 to reflect the reality.
> >
> > What is the actual evidence for this?  The calculation is designed to be
> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> > fine for huge block sizes.
> >
> 
> May I politely ask how you tested it, and what you mean by "worked"? The
> bug I reported last week was particularly concerning block sizes >= 1024
> (e.g. the 32768, 1024 pair giving 32.7 MB where the correct output would
> be 33.5 MB).

The test was basically a userspace version reversing the large size
smaller block size numbers and verifying they produce the same output.

>  Now it turns out that it was actually broken for smaller
> block sizes as well. For ~13000 semirandom size,blk_size pairs, the
> current code produces the wrong result in ~2100 cases. The new code
> reduces that to 122 cases, all of which are off by one in the last
> digit.

I wasn't making the point that there isn't a potential off by a couple
of percent problem in the algorithm I was making the point that it
should work as a multiplier of two u64 numbers, so I can't understand
the rational basis for reducing the block size to u32.

> And I don't buy the symmetry argument either. Mathematically, it should
> give the same, but your algorithm produces 2.04 MB for 512,4096 and 2.09
> MB for 4096,512.

That's an off by 2.5%; it means there's a slight error in one of the
carries it doesn't mean there's a fundamental problem in the algorithm.

> Maybe the commit message could be better, but I think it makes a lot of
> sense to make blk_size u32. Breaking the symmetry between size and
> blk_size is good (less likely that the arguments get swapped). It
> allows a simpler implementation. It makes the generated code
> smaller.

The drive vendors are already pushing huge block size systems for ZBC.
They're already talking about 2GB sectors, which is 31 bits ... they'll
be over the 32 bit limit fairly shortly, I predict, so it makes no sense
to have to have the storage layer do silly bit shifting because we were
short sighted enough to cap block size to a u32.

James

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-29 23:32     ` Andy Shevchenko
@ 2015-10-30  3:34       ` James Bottomley
  2015-10-30 10:41         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 25+ messages in thread
From: James Bottomley @ 2015-10-30  3:34 UTC (permalink / raw)
  To: andy.shevchenko
  Cc: ulf.hansson, linux, andriy.shevchenko, vkuznets, linux-kernel,
	akpm, keescook

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1569 bytes --]

On Fri, 2015-10-30 at 01:32 +0200, Andy Shevchenko wrote:
> On Fri, Oct 30, 2015 at 1:00 AM, James Bottomley <jbottomley@odin.com> wrote:
> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> Division by zero happens if blk_size=0 is supplied to string_get_size().
> >> Add WARN_ON() and set size to 0 to report '0 B'.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  lib/string_helpers.c | 5 +++++
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> >> index f6c27dc..ff3575b 100644
> >> --- a/lib/string_helpers.c
> >> +++ b/lib/string_helpers.c
> >> @@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
> >>
> >>       tmp[0] = '\0';
> >>       i = 0;
> >> +
> >> +     /* Calling string_get_size() with blk_size=0 is wrong! */
> >> +     if (WARN_ON(!blk_size))
> >
> > Get rid of the WARN_ON; it's the standard thing to do for a partially
> > connected device.  Seeing zero is standard in a whole variety of
> > situations.  SCSI shims the zero but most other drivers don't.
> 
> For *block* size? It will crash the kernel. I've checked, it wasn't
> changed from the beginning (b9f28d863594).

The standard signal for a drive error in capacity is zero size and zero
block size.  We have to take that case as standard without emitting
scary warnings.

James

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-30  3:34       ` James Bottomley
@ 2015-10-30 10:41         ` Vitaly Kuznetsov
  2015-10-31  0:07           ` James Bottomley
  0 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-30 10:41 UTC (permalink / raw)
  To: James Bottomley
  Cc: andy.shevchenko, ulf.hansson, linux, andriy.shevchenko,
	linux-kernel, akpm, keescook

James Bottomley <jbottomley@odin.com> writes:

> On Fri, 2015-10-30 at 01:32 +0200, Andy Shevchenko wrote:
>> On Fri, Oct 30, 2015 at 1:00 AM, James Bottomley <jbottomley@odin.com> wrote:
>> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> >> Division by zero happens if blk_size=0 is supplied to string_get_size().
>> >> Add WARN_ON() and set size to 0 to report '0 B'.
>> >>
>> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> >> ---
>> >>  lib/string_helpers.c | 5 +++++
>> >>  1 file changed, 5 insertions(+)
>> >>
>> >> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> >> index f6c27dc..ff3575b 100644
>> >> --- a/lib/string_helpers.c
>> >> +++ b/lib/string_helpers.c
>> >> @@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> >>
>> >>       tmp[0] = '\0';
>> >>       i = 0;
>> >> +
>> >> +     /* Calling string_get_size() with blk_size=0 is wrong! */
>> >> +     if (WARN_ON(!blk_size))
>> >
>> > Get rid of the WARN_ON; it's the standard thing to do for a partially
>> > connected device.  Seeing zero is standard in a whole variety of
>> > situations.  SCSI shims the zero but most other drivers don't.
>> 
>> For *block* size? It will crash the kernel. I've checked, it wasn't
>> changed from the beginning (b9f28d863594).
>
> The standard signal for a drive error in capacity is zero size and zero
> block size.  We have to take that case as standard without emitting
> scary warnings.

Ok, but what if size != 0? Is WARN_ON() justified in this case?

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-29 22:27   ` James Bottomley
  2015-10-29 23:19     ` Rasmus Villemoes
@ 2015-10-30 10:46     ` Vitaly Kuznetsov
  2015-10-31  0:21       ` James Bottomley
  1 sibling, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-10-30 10:46 UTC (permalink / raw)
  To: James Bottomley
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

James Bottomley <jbottomley@odin.com> writes:

> On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> string_get_size() can't really handle huge block sizes, especially
>> blk_size > U32_MAX but string_get_size() interface states the opposite.
>> Change blk_size from u64 to u32 to reflect the reality.
>
> What is the actual evidence for this?  The calculation is designed to be
> a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> fine for huge block sizes.

We have 'u32 remainder' and then we do:

exp = divisor[units] / (u32)blk_size;
...
remainder = do_div(size, divisor[units]);
remainder *= blk_size;

I'm pretty sure it will overflow for some inputs.

>
> James
>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  include/linux/string_helpers.h | 2 +-
>>  lib/string_helpers.c           | 4 ++--
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
>> index dabe643..1223e80 100644
>> --- a/include/linux/string_helpers.h
>> +++ b/include/linux/string_helpers.h
>> @@ -10,7 +10,7 @@ enum string_size_units {
>>  	STRING_UNITS_2,		/* use binary powers of 2^10 */
>>  };
>>  
>> -void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
>> +void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
>>  		     char *buf, int len);
>>  
>>  #define UNESCAPE_SPACE		0x01
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index 5939f63..f6c27dc 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -26,7 +26,7 @@
>>   * at least 9 bytes and will always be zero terminated.
>>   *
>>   */
>> -void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> +void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  		     char *buf, int len)
>>  {
>>  	static const char *const units_10[] = {
>> @@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>>  		i++;
>>  	}
>>  
>> -	exp = divisor[units] / (u32)blk_size;
>> +	exp = divisor[units] / blk_size;
>>  	/*
>>  	 * size must be strictly greater than exp here to ensure that remainder
>>  	 * is greater than divisor[units] coming out of the if below.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0
  2015-10-30 10:41         ` Vitaly Kuznetsov
@ 2015-10-31  0:07           ` James Bottomley
  0 siblings, 0 replies; 25+ messages in thread
From: James Bottomley @ 2015-10-31  0:07 UTC (permalink / raw)
  To: vkuznets
  Cc: andy.shevchenko, ulf.hansson, linux, andriy.shevchenko,
	linux-kernel, akpm, keescook

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2258 bytes --]

On Fri, 2015-10-30 at 11:41 +0100, Vitaly Kuznetsov wrote:
> James Bottomley <jbottomley@odin.com> writes:
> 
> > On Fri, 2015-10-30 at 01:32 +0200, Andy Shevchenko wrote:
> >> On Fri, Oct 30, 2015 at 1:00 AM, James Bottomley <jbottomley@odin.com> wrote:
> >> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> >> Division by zero happens if blk_size=0 is supplied to string_get_size().
> >> >> Add WARN_ON() and set size to 0 to report '0 B'.
> >> >>
> >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> >> ---
> >> >>  lib/string_helpers.c | 5 +++++
> >> >>  1 file changed, 5 insertions(+)
> >> >>
> >> >> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> >> >> index f6c27dc..ff3575b 100644
> >> >> --- a/lib/string_helpers.c
> >> >> +++ b/lib/string_helpers.c
> >> >> @@ -50,6 +50,11 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
> >> >>
> >> >>       tmp[0] = '\0';
> >> >>       i = 0;
> >> >> +
> >> >> +     /* Calling string_get_size() with blk_size=0 is wrong! */
> >> >> +     if (WARN_ON(!blk_size))
> >> >
> >> > Get rid of the WARN_ON; it's the standard thing to do for a partially
> >> > connected device.  Seeing zero is standard in a whole variety of
> >> > situations.  SCSI shims the zero but most other drivers don't.
> >> 
> >> For *block* size? It will crash the kernel. I've checked, it wasn't
> >> changed from the beginning (b9f28d863594).
> >
> > The standard signal for a drive error in capacity is zero size and zero
> > block size.  We have to take that case as standard without emitting
> > scary warnings.
> 
> Ok, but what if size != 0? Is WARN_ON() justified in this case?

It's an arithmentic routine whose job is to multiply two numbers, not
second guess the subsystem that gave it the numbers.  Just on general
architectural principles the only time it's allowed to dump a stack
trace without confusing people is when the arithmetic operation it has
been asked to do would produce an illegal result (which for two sixty
four bit numbers multiplying to a 128 bit one is never).

James

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-30 10:46     ` Vitaly Kuznetsov
@ 2015-10-31  0:21       ` James Bottomley
  2015-11-02 15:58         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 25+ messages in thread
From: James Bottomley @ 2015-10-31  0:21 UTC (permalink / raw)
  To: vkuznets
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 4205 bytes --]

On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
> James Bottomley <jbottomley@odin.com> writes:
> 
> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> string_get_size() can't really handle huge block sizes, especially
> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
> >> Change blk_size from u64 to u32 to reflect the reality.
> >
> > What is the actual evidence for this?  The calculation is designed to be
> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> > fine for huge block sizes.
> 
> We have 'u32 remainder' and then we do:
> 
> exp = divisor[units] / (u32)blk_size;
> ...
> remainder = do_div(size, divisor[units]);
> remainder *= blk_size;
> 
> I'm pretty sure it will overflow for some inputs.

It shouldn't; the full code snippet does this:

        	while (blk_size >= divisor[units]) {
        		remainder = do_div(blk_size, divisor[units]);
        		i++;
        	}
        
        	exp = divisor[units] / (u32)blk_size;

So by the time it reaches the statement you complain about, blk_size is
already less than or equal to the divisor (which is 1000 or 1024) so
truncating to 32 bits is always correct.

I'm sort of getting the impression you don't quite understand the
mathematics:  i is the logarithm to the base divisor[units].  We reduce
both operands to exponents of the logarithm base (adding the two bases
together in i), which means they are by definition in a range between
zero and the base and then multiply the remaining exponents correcting
the result for a base overflow (so the result is always a correct
exponent and i is the logarithm to the base).  It's actually simply
Napier's algorithm.

The reason we're getting the up to 2.5% rounding errors you complain
about is because at each logarithm until the last one, we throw away the
remainder (it's legitimate because it's always 1000x smaller than the
exponent), but in the case of a large remainder it provides a small
correction to the final operation which we don't account for.  If you
want to make a true correction, you save the penultimate residue in each
case, multiply each by the *other* exponent add them together, divide by
the base and increment the final result by the remainder.

However, for 2.5% the physicist in me says the above is way overkill.

James

> >
> > James
> >
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  include/linux/string_helpers.h | 2 +-
> >>  lib/string_helpers.c           | 4 ++--
> >>  2 files changed, 3 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
> >> index dabe643..1223e80 100644
> >> --- a/include/linux/string_helpers.h
> >> +++ b/include/linux/string_helpers.h
> >> @@ -10,7 +10,7 @@ enum string_size_units {
> >>  	STRING_UNITS_2,		/* use binary powers of 2^10 */
> >>  };
> >>  
> >> -void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
> >> +void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
> >>  		     char *buf, int len);
> >>  
> >>  #define UNESCAPE_SPACE		0x01
> >> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> >> index 5939f63..f6c27dc 100644
> >> --- a/lib/string_helpers.c
> >> +++ b/lib/string_helpers.c
> >> @@ -26,7 +26,7 @@
> >>   * at least 9 bytes and will always be zero terminated.
> >>   *
> >>   */
> >> -void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >> +void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
> >>  		     char *buf, int len)
> >>  {
> >>  	static const char *const units_10[] = {
> >> @@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >>  		i++;
> >>  	}
> >>  
> >> -	exp = divisor[units] / (u32)blk_size;
> >> +	exp = divisor[units] / blk_size;
> >>  	/*
> >>  	 * size must be strictly greater than exp here to ensure that remainder
> >>  	 * is greater than divisor[units] coming out of the if below.
> 


ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-10-31  0:21       ` James Bottomley
@ 2015-11-02 15:58         ` Vitaly Kuznetsov
  2015-11-03  3:40           ` James Bottomley
  0 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-11-02 15:58 UTC (permalink / raw)
  To: James Bottomley
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

James Bottomley <jbottomley@odin.com> writes:

> On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
>> James Bottomley <jbottomley@odin.com> writes:
>> 
>> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> >> string_get_size() can't really handle huge block sizes, especially
>> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
>> >> Change blk_size from u64 to u32 to reflect the reality.
>> >
>> > What is the actual evidence for this?  The calculation is designed to be
>> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
>> > fine for huge block sizes.
>> 
>> We have 'u32 remainder' and then we do:
>> 
>> exp = divisor[units] / (u32)blk_size;
>> ...
>> remainder = do_div(size, divisor[units]);
>> remainder *= blk_size;
>> 
>> I'm pretty sure it will overflow for some inputs.
>
> It shouldn't; the full code snippet does this:
>
>         	while (blk_size >= divisor[units]) {
>         		remainder = do_div(blk_size, divisor[units]);
>         		i++;
>         	}
>
>         	exp = divisor[units] / (u32)blk_size;
>
> So by the time it reaches the statement you complain about, blk_size is
> already less than or equal to the divisor (which is 1000 or 1024) so
> truncating to 32 bits is always correct.
>

I overlooked, sorry!

> I'm sort of getting the impression you don't quite understand the
> mathematics:  i is the logarithm to the base divisor[units].  We reduce
> both operands to exponents of the logarithm base (adding the two bases
> together in i), which means they are by definition in a range between
> zero and the base and then multiply the remaining exponents correcting
> the result for a base overflow (so the result is always a correct
> exponent and i is the logarithm to the base).  It's actually simply
> Napier's algorithm.
>
> The reason we're getting the up to 2.5% rounding errors you complain
> about is because at each logarithm until the last one, we throw away the
> remainder (it's legitimate because it's always 1000x smaller than the
> exponent), but in the case of a large remainder it provides a small
> correction to the final operation which we don't account for.  If you
> want to make a true correction, you save the penultimate residue in each
> case, multiply each by the *other* exponent add them together, divide by
> the base and increment the final result by the remainder.

My assumption was that we don't really need to support blk_sizes >
U32_MAX and we can simplify string_get_size() instead of adding
additional complexity. Apparently, the assumption was wrong.

>
> However, for 2.5% the physicist in me says the above is way overkill.
>

It is getting was over 2.5% if blk_size is not a power of 2. While it is
probably never the case for block subsystem the function is in lib and
pretends to be general-enough. I'll try to make proper correction and
let's see if it's worth the effort. 

Thanks,

> James
>
>> >
>> > James
>> >
>> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> >> ---
>> >>  include/linux/string_helpers.h | 2 +-
>> >>  lib/string_helpers.c           | 4 ++--
>> >>  2 files changed, 3 insertions(+), 3 deletions(-)
>> >> 
>> >> diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
>> >> index dabe643..1223e80 100644
>> >> --- a/include/linux/string_helpers.h
>> >> +++ b/include/linux/string_helpers.h
>> >> @@ -10,7 +10,7 @@ enum string_size_units {
>> >>  	STRING_UNITS_2,		/* use binary powers of 2^10 */
>> >>  };
>> >>  
>> >> -void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
>> >> +void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
>> >>  		     char *buf, int len);
>> >>  
>> >>  #define UNESCAPE_SPACE		0x01
>> >> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> >> index 5939f63..f6c27dc 100644
>> >> --- a/lib/string_helpers.c
>> >> +++ b/lib/string_helpers.c
>> >> @@ -26,7 +26,7 @@
>> >>   * at least 9 bytes and will always be zero terminated.
>> >>   *
>> >>   */
>> >> -void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> >> +void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> >>  		     char *buf, int len)
>> >>  {
>> >>  	static const char *const units_10[] = {
>> >> @@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> >>  		i++;
>> >>  	}
>> >>  
>> >> -	exp = divisor[units] / (u32)blk_size;
>> >> +	exp = divisor[units] / blk_size;
>> >>  	/*
>> >>  	 * size must be strictly greater than exp here to ensure that remainder
>> >>  	 * is greater than divisor[units] coming out of the if below.
>> 

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-11-02 15:58         ` Vitaly Kuznetsov
@ 2015-11-03  3:40           ` James Bottomley
  2015-11-03 13:13             ` Vitaly Kuznetsov
  0 siblings, 1 reply; 25+ messages in thread
From: James Bottomley @ 2015-11-03  3:40 UTC (permalink / raw)
  To: vkuznets
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 6352 bytes --]

On Mon, 2015-11-02 at 16:58 +0100, Vitaly Kuznetsov wrote:
> James Bottomley <jbottomley@odin.com> writes:
> 
> > On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
> >> James Bottomley <jbottomley@odin.com> writes:
> >> 
> >> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> >> string_get_size() can't really handle huge block sizes, especially
> >> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
> >> >> Change blk_size from u64 to u32 to reflect the reality.
> >> >
> >> > What is the actual evidence for this?  The calculation is designed to be
> >> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> >> > fine for huge block sizes.
> >> 
> >> We have 'u32 remainder' and then we do:
> >> 
> >> exp = divisor[units] / (u32)blk_size;
> >> ...
> >> remainder = do_div(size, divisor[units]);
> >> remainder *= blk_size;
> >> 
> >> I'm pretty sure it will overflow for some inputs.
> >
> > It shouldn't; the full code snippet does this:
> >
> >         	while (blk_size >= divisor[units]) {
> >         		remainder = do_div(blk_size, divisor[units]);
> >         		i++;
> >         	}
> >
> >         	exp = divisor[units] / (u32)blk_size;
> >
> > So by the time it reaches the statement you complain about, blk_size is
> > already less than or equal to the divisor (which is 1000 or 1024) so
> > truncating to 32 bits is always correct.
> >
> 
> I overlooked, sorry!
> 
> > I'm sort of getting the impression you don't quite understand the
> > mathematics:  i is the logarithm to the base divisor[units].  We reduce
> > both operands to exponents of the logarithm base (adding the two bases
> > together in i), which means they are by definition in a range between
> > zero and the base and then multiply the remaining exponents correcting
> > the result for a base overflow (so the result is always a correct
> > exponent and i is the logarithm to the base).  It's actually simply
> > Napier's algorithm.
> >
> > The reason we're getting the up to 2.5% rounding errors you complain
> > about is because at each logarithm until the last one, we throw away the
> > remainder (it's legitimate because it's always 1000x smaller than the
> > exponent), but in the case of a large remainder it provides a small
> > correction to the final operation which we don't account for.  If you
> > want to make a true correction, you save the penultimate residue in each
> > case, multiply each by the *other* exponent add them together, divide by
> > the base and increment the final result by the remainder.
> 
> My assumption was that we don't really need to support blk_sizes >
> U32_MAX and we can simplify string_get_size() instead of adding
> additional complexity. Apparently, the assumption was wrong.
> 
> >
> > However, for 2.5% the physicist in me says the above is way overkill.
> >
> 
> It is getting was over 2.5% if blk_size is not a power of 2. While it is
> probably never the case for block subsystem the function is in lib and
> pretends to be general-enough. I'll try to make proper correction and
> let's see if it's worth the effort. 

OK, this is the full calculation.  It also includes an arithmetic
rounding to the final figure print.  I suppose it's not that much more
complexity than the original, and it does make the algorithm easier to
understand.

We could do with running the comments by some other non-mathematician,
now I've explained it in detail to you two, to see if they actually give
an understanding of the algorithm.

James

---

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..1ec7e77a 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -44,7 +44,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		[STRING_UNITS_2] = 1024,
 	};
 	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	u32 remainder = 0, sf_cap, r1 = 0, r2 = 0, round;
 	char tmp[8];
 	const char *unit;
 
@@ -53,27 +53,46 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	if (!size)
 		goto out;
 
+	/* This is napier's algorithm.  Reduce the original block size to
+	 *
+	 * co * divisor[units]^i
+	 *
+	 * where co = blk_size + r1/divisor[units];
+	 *
+	 * and the same for size.  We simply add to the exponent i, because
+	 * the final calculation we're looking for is
+	 *
+	 * (co1 * co2) * divisor[units]^i
+	 */
+
+
 	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
+		r1 = do_div(blk_size, divisor[units]);
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
-	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
-	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
+	while (size >= divisor[units]) {
+		r2 = do_div(size, divisor[units]);
 		i++;
-	} else {
-		remainder *= size;
 	}
 
-	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
+	/* here's the magic.  co1 * co2 may be > divisor[i], so correct for
+	 * that in the exponent and make sure that the additional corrections
+	 * from the remainders is added in.
+	 *
+	 * co1 *co2 = (blk_size + r1/divisor[units])*(size + r2/divisor[units])
+	 *
+	 * therefore
+	 *
+	 * co1*co2*divisor[units] = blk_size*size*divisor[units] +
+	 *          r1*size + r2*size + r1*r2/divisor[units]
+	 *
+	 * drop the last term because it's too small and perform the
+	 * calculation cleverly by decremeting i to be automatically dealing
+	 * with everything multiplied by divisor[units] */
+
+	--i;
+	size = size * blk_size * divisor[units] + r1 * size + r2 * blk_size;
 
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
@@ -81,8 +100,15 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	}
 
 	sf_cap = size;
-	for (j = 0; sf_cap*10 < 1000; j++)
+	round = 500;
+	for (j = 0; sf_cap*10 < 1000; j++) {
 		sf_cap *= 10;
+		round /= 10;
+	}
+
+	/* add a 5 to the digit below what will be printed to ensure
+	 * an arithmetical round up */
+	remainder += round;
 
 	if (j) {
 		remainder *= 1000;

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-11-03  3:40           ` James Bottomley
@ 2015-11-03 13:13             ` Vitaly Kuznetsov
  2015-11-03 17:02               ` James Bottomley
  0 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2015-11-03 13:13 UTC (permalink / raw)
  To: James Bottomley
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

James Bottomley <jbottomley@odin.com> writes:

> On Mon, 2015-11-02 at 16:58 +0100, Vitaly Kuznetsov wrote:
>> James Bottomley <jbottomley@odin.com> writes:
>> 
>> > On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
>> >> James Bottomley <jbottomley@odin.com> writes:
>> >> 
>> >> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> >> >> string_get_size() can't really handle huge block sizes, especially
>> >> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
>> >> >> Change blk_size from u64 to u32 to reflect the reality.
>> >> >
>> >> > What is the actual evidence for this?  The calculation is designed to be
>> >> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
>> >> > fine for huge block sizes.
>> >> 
>> >> We have 'u32 remainder' and then we do:
>> >> 
>> >> exp = divisor[units] / (u32)blk_size;
>> >> ...
>> >> remainder = do_div(size, divisor[units]);
>> >> remainder *= blk_size;
>> >> 
>> >> I'm pretty sure it will overflow for some inputs.
>> >
>> > It shouldn't; the full code snippet does this:
>> >
>> >         	while (blk_size >= divisor[units]) {
>> >         		remainder = do_div(blk_size, divisor[units]);
>> >         		i++;
>> >         	}
>> >
>> >         	exp = divisor[units] / (u32)blk_size;
>> >
>> > So by the time it reaches the statement you complain about, blk_size is
>> > already less than or equal to the divisor (which is 1000 or 1024) so
>> > truncating to 32 bits is always correct.
>> >
>> 
>> I overlooked, sorry!
>> 
>> > I'm sort of getting the impression you don't quite understand the
>> > mathematics:  i is the logarithm to the base divisor[units].  We reduce
>> > both operands to exponents of the logarithm base (adding the two bases
>> > together in i), which means they are by definition in a range between
>> > zero and the base and then multiply the remaining exponents correcting
>> > the result for a base overflow (so the result is always a correct
>> > exponent and i is the logarithm to the base).  It's actually simply
>> > Napier's algorithm.
>> >
>> > The reason we're getting the up to 2.5% rounding errors you complain
>> > about is because at each logarithm until the last one, we throw away the
>> > remainder (it's legitimate because it's always 1000x smaller than the
>> > exponent), but in the case of a large remainder it provides a small
>> > correction to the final operation which we don't account for.  If you
>> > want to make a true correction, you save the penultimate residue in each
>> > case, multiply each by the *other* exponent add them together, divide by
>> > the base and increment the final result by the remainder.
>> 
>> My assumption was that we don't really need to support blk_sizes >
>> U32_MAX and we can simplify string_get_size() instead of adding
>> additional complexity. Apparently, the assumption was wrong.
>> 
>> >
>> > However, for 2.5% the physicist in me says the above is way overkill.
>> >
>> 
>> It is getting was over 2.5% if blk_size is not a power of 2. While it is
>> probably never the case for block subsystem the function is in lib and
>> pretends to be general-enough. I'll try to make proper correction and
>> let's see if it's worth the effort. 
>
> OK, this is the full calculation.  It also includes an arithmetic
> rounding to the final figure print.  I suppose it's not that much more
> complexity than the original, and it does make the algorithm easier to
> understand.
>
> We could do with running the comments by some other non-mathematician,
> now I've explained it in detail to you two, to see if they actually give
> an understanding of the algorithm.

Thanks, to me they look great! One nitpick below ...

>
> James
>
> ---
>
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index 5939f63..1ec7e77a 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -44,7 +44,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>  		[STRING_UNITS_2] = 1024,
>  	};
>  	int i, j;
> -	u32 remainder = 0, sf_cap, exp;
> +	u32 remainder = 0, sf_cap, r1 = 0, r2 = 0, round;
>  	char tmp[8];
>  	const char *unit;
>
> @@ -53,27 +53,46 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>  	if (!size)
>  		goto out;
>
> +	/* This is napier's algorithm.  Reduce the original block size to
> +	 *
> +	 * co * divisor[units]^i
> +	 *
> +	 * where co = blk_size + r1/divisor[units];
> +	 *
> +	 * and the same for size.  We simply add to the exponent i, because
> +	 * the final calculation we're looking for is
> +	 *
> +	 * (co1 * co2) * divisor[units]^i
> +	 */
> +
> +
>  	while (blk_size >= divisor[units]) {
> -		remainder = do_div(blk_size, divisor[units]);
> +		r1 = do_div(blk_size, divisor[units]);
>  		i++;
>  	}
>
> -	exp = divisor[units] / (u32)blk_size;
> -	/*
> -	 * size must be strictly greater than exp here to ensure that remainder
> -	 * is greater than divisor[units] coming out of the if below.
> -	 */
> -	if (size > exp) {
> -		remainder = do_div(size, divisor[units]);
> -		remainder *= blk_size;
> +	while (size >= divisor[units]) {
> +		r2 = do_div(size, divisor[units]);
>  		i++;
> -	} else {
> -		remainder *= size;
>  	}
>
> -	size *= blk_size;
> -	size += remainder / divisor[units];
> -	remainder %= divisor[units];
> +	/* here's the magic.  co1 * co2 may be > divisor[i], so correct for
> +	 * that in the exponent and make sure that the additional corrections
> +	 * from the remainders is added in.
> +	 *
> +	 * co1 *co2 = (blk_size + r1/divisor[units])*(size + r2/divisor[units])
> +	 *
> +	 * therefore
> +	 *
> +	 * co1*co2*divisor[units] = blk_size*size*divisor[units] +
> +	 *          r1*size + r2*size + r1*r2/divisor[units]
> +	 *
> +	 * drop the last term because it's too small and perform the
> +	 * calculation cleverly by decremeting i to be automatically dealing
> +	 * with everything multiplied by divisor[units] */
> +
> +	--i;
> +	size = size * blk_size * divisor[units] + r1 * size + r2 *
> blk_size;

The last term is actually not that small. Here is an example:

size = 8192  blk_size = 1024

'As is' the algorithm gives us '8.38 MB', and if we add "+ r1 * r1 /
divisor[units]" we get '8.39 MB' (the correct answer is 8192 * 1024 =
8388608 which is 8.39).

Both r1 and r2 are < divisor[units] here so r1 * r2 won't overflow u32,
I suggest we add this term.

>
>  	while (size >= divisor[units]) {
>  		remainder = do_div(size, divisor[units]);
> @@ -81,8 +100,15 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>  	}
>
>  	sf_cap = size;
> -	for (j = 0; sf_cap*10 < 1000; j++)
> +	round = 500;
> +	for (j = 0; sf_cap*10 < 1000; j++) {
>  		sf_cap *= 10;
> +		round /= 10;
> +	}
> +
> +	/* add a 5 to the digit below what will be printed to ensure
> +	 * an arithmetical round up */
> +	remainder += round;
>
>  	if (j) {
>  		remainder *= 1000;

Can I post this solution with your Suggested-by or do you plan to do it
yourself?

Thanks,

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-11-03 13:13             ` Vitaly Kuznetsov
@ 2015-11-03 17:02               ` James Bottomley
  2015-11-03 20:57                 ` Rasmus Villemoes
  0 siblings, 1 reply; 25+ messages in thread
From: James Bottomley @ 2015-11-03 17:02 UTC (permalink / raw)
  To: vkuznets
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 8162 bytes --]

On Tue, 2015-11-03 at 14:13 +0100, Vitaly Kuznetsov wrote:
> James Bottomley <jbottomley@odin.com> writes:
> 
> > On Mon, 2015-11-02 at 16:58 +0100, Vitaly Kuznetsov wrote:
> >> James Bottomley <jbottomley@odin.com> writes:
> >> 
> >> > On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
> >> >> James Bottomley <jbottomley@odin.com> writes:
> >> >> 
> >> >> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> >> >> string_get_size() can't really handle huge block sizes, especially
> >> >> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
> >> >> >> Change blk_size from u64 to u32 to reflect the reality.
> >> >> >
> >> >> > What is the actual evidence for this?  The calculation is designed to be
> >> >> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> >> >> > fine for huge block sizes.
> >> >> 
> >> >> We have 'u32 remainder' and then we do:
> >> >> 
> >> >> exp = divisor[units] / (u32)blk_size;
> >> >> ...
> >> >> remainder = do_div(size, divisor[units]);
> >> >> remainder *= blk_size;
> >> >> 
> >> >> I'm pretty sure it will overflow for some inputs.
> >> >
> >> > It shouldn't; the full code snippet does this:
> >> >
> >> >         	while (blk_size >= divisor[units]) {
> >> >         		remainder = do_div(blk_size, divisor[units]);
> >> >         		i++;
> >> >         	}
> >> >
> >> >         	exp = divisor[units] / (u32)blk_size;
> >> >
> >> > So by the time it reaches the statement you complain about, blk_size is
> >> > already less than or equal to the divisor (which is 1000 or 1024) so
> >> > truncating to 32 bits is always correct.
> >> >
> >> 
> >> I overlooked, sorry!
> >> 
> >> > I'm sort of getting the impression you don't quite understand the
> >> > mathematics:  i is the logarithm to the base divisor[units].  We reduce
> >> > both operands to exponents of the logarithm base (adding the two bases
> >> > together in i), which means they are by definition in a range between
> >> > zero and the base and then multiply the remaining exponents correcting
> >> > the result for a base overflow (so the result is always a correct
> >> > exponent and i is the logarithm to the base).  It's actually simply
> >> > Napier's algorithm.
> >> >
> >> > The reason we're getting the up to 2.5% rounding errors you complain
> >> > about is because at each logarithm until the last one, we throw away the
> >> > remainder (it's legitimate because it's always 1000x smaller than the
> >> > exponent), but in the case of a large remainder it provides a small
> >> > correction to the final operation which we don't account for.  If you
> >> > want to make a true correction, you save the penultimate residue in each
> >> > case, multiply each by the *other* exponent add them together, divide by
> >> > the base and increment the final result by the remainder.
> >> 
> >> My assumption was that we don't really need to support blk_sizes >
> >> U32_MAX and we can simplify string_get_size() instead of adding
> >> additional complexity. Apparently, the assumption was wrong.
> >> 
> >> >
> >> > However, for 2.5% the physicist in me says the above is way overkill.
> >> >
> >> 
> >> It is getting was over 2.5% if blk_size is not a power of 2. While it is
> >> probably never the case for block subsystem the function is in lib and
> >> pretends to be general-enough. I'll try to make proper correction and
> >> let's see if it's worth the effort. 
> >
> > OK, this is the full calculation.  It also includes an arithmetic
> > rounding to the final figure print.  I suppose it's not that much more
> > complexity than the original, and it does make the algorithm easier to
> > understand.
> >
> > We could do with running the comments by some other non-mathematician,
> > now I've explained it in detail to you two, to see if they actually give
> > an understanding of the algorithm.
> 
> Thanks, to me they look great! One nitpick below ...
> 
> >
> > James
> >
> > ---
> >
> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> > index 5939f63..1ec7e77a 100644
> > --- a/lib/string_helpers.c
> > +++ b/lib/string_helpers.c
> > @@ -44,7 +44,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >  		[STRING_UNITS_2] = 1024,
> >  	};
> >  	int i, j;
> > -	u32 remainder = 0, sf_cap, exp;
> > +	u32 remainder = 0, sf_cap, r1 = 0, r2 = 0, round;
> >  	char tmp[8];
> >  	const char *unit;
> >
> > @@ -53,27 +53,46 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >  	if (!size)
> >  		goto out;
> >
> > +	/* This is napier's algorithm.  Reduce the original block size to
> > +	 *
> > +	 * co * divisor[units]^i
> > +	 *
> > +	 * where co = blk_size + r1/divisor[units];
> > +	 *
> > +	 * and the same for size.  We simply add to the exponent i, because
> > +	 * the final calculation we're looking for is
> > +	 *
> > +	 * (co1 * co2) * divisor[units]^i
> > +	 */
> > +
> > +
> >  	while (blk_size >= divisor[units]) {
> > -		remainder = do_div(blk_size, divisor[units]);
> > +		r1 = do_div(blk_size, divisor[units]);
> >  		i++;
> >  	}
> >
> > -	exp = divisor[units] / (u32)blk_size;
> > -	/*
> > -	 * size must be strictly greater than exp here to ensure that remainder
> > -	 * is greater than divisor[units] coming out of the if below.
> > -	 */
> > -	if (size > exp) {
> > -		remainder = do_div(size, divisor[units]);
> > -		remainder *= blk_size;
> > +	while (size >= divisor[units]) {
> > +		r2 = do_div(size, divisor[units]);
> >  		i++;
> > -	} else {
> > -		remainder *= size;
> >  	}
> >
> > -	size *= blk_size;
> > -	size += remainder / divisor[units];
> > -	remainder %= divisor[units];
> > +	/* here's the magic.  co1 * co2 may be > divisor[i], so correct for
> > +	 * that in the exponent and make sure that the additional corrections
> > +	 * from the remainders is added in.
> > +	 *
> > +	 * co1 *co2 = (blk_size + r1/divisor[units])*(size + r2/divisor[units])
> > +	 *
> > +	 * therefore
> > +	 *
> > +	 * co1*co2*divisor[units] = blk_size*size*divisor[units] +
> > +	 *          r1*size + r2*size + r1*r2/divisor[units]
> > +	 *
> > +	 * drop the last term because it's too small and perform the
> > +	 * calculation cleverly by decremeting i to be automatically dealing
> > +	 * with everything multiplied by divisor[units] */
> > +
> > +	--i;
> > +	size = size * blk_size * divisor[units] + r1 * size + r2 *
> > blk_size;
> 
> The last term is actually not that small. Here is an example:

It's always < 1 in the final equation because it's divided by the square
of divisor[units].  However, that can make a contribution to the round
up, I suppose leading to the truncation you see

> size = 8192  blk_size = 1024
> 
> 'As is' the algorithm gives us '8.38 MB', and if we add "+ r1 * r1 /
> divisor[units]" we get '8.39 MB' (the correct answer is 8192 * 1024 =
> 8388608 which is 8.39).
> 
> Both r1 and r2 are < divisor[units] here so r1 * r2 won't overflow u32,
> I suggest we add this term.
> 
> >
> >  	while (size >= divisor[units]) {
> >  		remainder = do_div(size, divisor[units]);
> > @@ -81,8 +100,15 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >  	}
> >
> >  	sf_cap = size;
> > -	for (j = 0; sf_cap*10 < 1000; j++)
> > +	round = 500;
> > +	for (j = 0; sf_cap*10 < 1000; j++) {
> >  		sf_cap *= 10;
> > +		round /= 10;
> > +	}
> > +
> > +	/* add a 5 to the digit below what will be printed to ensure
> > +	 * an arithmetical round up */
> > +	remainder += round;
> >
> >  	if (j) {
> >  		remainder *= 1000;
> 
> Can I post this solution with your Suggested-by or do you plan to do it
> yourself?

It was a suggestion when I explained what the missing sources of
precision were, I don't think it's really a suggestion when it comes
with an exemplary patch.  However, the whole thing needs polishing
because all the divisions need to be eliminated, since they're a huge
source of problems for most 32 bit CPUs, so I can fix it all the way and
repost.

James

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-11-03 17:02               ` James Bottomley
@ 2015-11-03 20:57                 ` Rasmus Villemoes
  2015-11-03 21:19                   ` James Bottomley
  0 siblings, 1 reply; 25+ messages in thread
From: Rasmus Villemoes @ 2015-11-03 20:57 UTC (permalink / raw)
  To: James Bottomley
  Cc: vkuznets, ulf.hansson, andriy.shevchenko, keescook, linux-kernel, akpm

On Tue, Nov 03 2015, James Bottomley <jbottomley@odin.com> wrote:

>
> It was a suggestion when I explained what the missing sources of
> precision were, I don't think it's really a suggestion when it comes
> with an exemplary patch.

ex·em·pla·ry
adjective

    1.
    serving as a desirable model; representing the best of its kind.

Said exemplary patch produces "1.10 KiB" for size=2047,
blk_size=1. (This is caused by the introduction of rounding, and is
probably fixable.)

James, I do understand the algorithm you're trying to use. What I don't
understand is why you insist on using the approach of reducing size and
blk_size all the way before multiplying them. It seems much simpler to
just reduce them till they're below U32_MAX (not keeping track of any
remainders at that point), multiply them, and then proceed as usual,
This avoids having to deal with weird cross-multiplication terms, gives
more accurate results (yes, I tested that) and avoids the extra 64/32
division you introduce by decrementing i.

Rasmus

To be precise, the body I suggest is

	while (blk_size > U32_MAX) {
		do_div(blk_size, divisor[units]);
		i++;
	}
	while (size > U32_MAX) {
		do_div(size, divisor[units]);
		i++;
	}
	size *= blk_size;
	while (size > divisor[units]) {
		remainder = do_div(size, divisor[units]);
		i++;
	}

whether the last one should be > or >= is debatable; I think 1024 KiB is
better than 1.00 MiB.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface
  2015-11-03 20:57                 ` Rasmus Villemoes
@ 2015-11-03 21:19                   ` James Bottomley
  0 siblings, 0 replies; 25+ messages in thread
From: James Bottomley @ 2015-11-03 21:19 UTC (permalink / raw)
  To: linux
  Cc: ulf.hansson, keescook, andriy.shevchenko, vkuznets, linux-kernel, akpm

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1468 bytes --]

On Tue, 2015-11-03 at 21:57 +0100, Rasmus Villemoes wrote:
> On Tue, Nov 03 2015, James Bottomley <jbottomley@odin.com> wrote:
> 
> >
> > It was a suggestion when I explained what the missing sources of
> > precision were, I don't think it's really a suggestion when it comes
> > with an exemplary patch.
> 
> ex·em·pla·ry
> adjective
> 
>     1.
>     serving as a desirable model; representing the best of its kind.
> 
> Said exemplary patch produces "1.10 KiB" for size=2047,
> blk_size=1. (This is caused by the introduction of rounding, and is
> probably fixable.)
> 
> James, I do understand the algorithm you're trying to use. What I don't
> understand is why you insist on using the approach of reducing size and
> blk_size all the way before multiplying them. It seems much simpler to
> just reduce them till they're below U32_MAX (not keeping track of any
> remainders at that point), multiply them, and then proceed as usual,
> This avoids having to deal with weird cross-multiplication terms, gives
> more accurate results (yes, I tested that) and avoids the extra 64/32
> division you introduce by decrementing i.

Well, wood and trees, I think.  I don't believe there's any more
accuracy with the second order term, but it is a lot simpler for anyone
to understand.  I'll post a v2.

James

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-11-03 21:19 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 16:30 [PATCH v3 0/4] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
2015-10-29 16:30 ` [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
2015-10-29 22:27   ` James Bottomley
2015-10-29 23:19     ` Rasmus Villemoes
2015-10-29 23:24       ` Rasmus Villemoes
2015-10-30  3:33       ` James Bottomley
2015-10-30 10:46     ` Vitaly Kuznetsov
2015-10-31  0:21       ` James Bottomley
2015-11-02 15:58         ` Vitaly Kuznetsov
2015-11-03  3:40           ` James Bottomley
2015-11-03 13:13             ` Vitaly Kuznetsov
2015-11-03 17:02               ` James Bottomley
2015-11-03 20:57                 ` Rasmus Villemoes
2015-11-03 21:19                   ` James Bottomley
2015-10-29 16:30 ` [PATCH v3 2/4] lib/string_helpers.c: protect string_get_size() against blk_size=0 Vitaly Kuznetsov
2015-10-29 21:24   ` Andy Shevchenko
2015-10-29 23:00   ` James Bottomley
2015-10-29 23:32     ` Andy Shevchenko
2015-10-30  3:34       ` James Bottomley
2015-10-30 10:41         ` Vitaly Kuznetsov
2015-10-31  0:07           ` James Bottomley
2015-10-29 16:30 ` [PATCH v3 3/4] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
2015-10-29 21:22   ` Andy Shevchenko
2015-10-29 16:30 ` [PATCH v3 4/4] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
2015-10-29 21:33   ` Andy Shevchenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).