linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] string_helpers: fix precision loss for some inputs
@ 2015-11-03 20:33 James Bottomley
  2015-11-03 21:21 ` [PATCH v2] " James Bottomley
  0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2015-11-03 20:33 UTC (permalink / raw)
  To: Vitaly Kuznetsov, linux-scsi
  Cc: ulf.hansson, linux, andriy.shevchenko, keescook, linux-kernel, akpm

From: James Bottomley <JBottomley@Odin.com> 

It was noticed that we lose precision in the final calculation for some
inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
current algorithm doesn't correctly account for all the remainders in the
logarithms.  Fix this by doing a correct calculation in the remainders based
on napier's algorithm.  Additionally, now we have the correct result, we have
to account for arithmetic rounding because we're printing 3 digits of
precision.  This means that if the fourth digit is five or greater, we have to
round up, so add a section to ensure correct rounding.  Finally account for
all possible inputs correctly, including zero for block size.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org	# delay backport by two months for testing
Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBottomley@Odin.com>

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..2ac77bf 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -43,37 +43,67 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		[STRING_UNITS_10] = 1000,
 		[STRING_UNITS_2] = 1024,
 	};
-	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	static const unsigned int rounding[] = { 500, 50, 5, 0};
+	int i = 0, j;
+	u32 remainder = 0, sf_cap, r1 = 0, r2 = 0;
 	char tmp[8];
 	const char *unit;
 
 	tmp[0] = '\0';
-	i = 0;
-	if (!size)
+
+	if (blk_size == 0)
+		size = 0;
+	if (size == 0)
 		goto out;
 
+	/* This is napier's algorithm.  Reduce the original block size to
+	 *
+	 * co * divisor[units]^i
+	 *
+	 * where co = blk_size + r1/divisor[units];
+	 *
+	 * and the same for size.  We simply add to the exponent i, because
+	 * the final calculation we're looking for is
+	 *
+	 * (co1 * co2) * divisor[units]^(i1 +i2)
+	 */
+
+
 	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
+		r1 = do_div(blk_size, divisor[units]);
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
-	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
-	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
+	while (size >= divisor[units]) {
+		r2 = do_div(size, divisor[units]);
 		i++;
-	} else {
-		remainder *= size;
 	}
 
-	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
+	/*
+	 * We've already added the exponents in i, now multiply the
+	 * coefficients:
+	 *
+	 * co1*co2 = (blk_size + r1/divisor[units])*(size + r2/divisor[units])
+	 *
+	 * therefore
+	 *
+	 * co1*co2 = blk_size*size
+	 *           + (r1*size + r2*size)/divisor[units]
+	 *           + r1*r2/divisor[units]/divisor[units]
+	 *
+	 * reduce the exponent by 2 (it can now go negative) to perform this
+	 * calculation without divisions (64 bit divisions are very painful on
+	 * 32 bit architectures), then redo the logarithm adjustment to bring
+	 * us back to the correct coefficient and exponent.  This calculation
+	 * can still not overflow because the largest term must be less than
+	 * divisor[units]^4, which is 40 bits
+	 *
+	 */
+
+	i -= 2;
+	size = size * blk_size * divisor[units] * divisor[units]
+	  + (r1 * size + r2 * blk_size) * divisor[units]
+	  + r1*r2;
 
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
@@ -84,9 +114,20 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	for (j = 0; sf_cap*10 < 1000; j++)
 		sf_cap *= 10;
 
+	/* express the remainder as a decimal (it's currently the numerator of
+	 * a fraction whose denominator is divisor[units]) */
+	remainder *= 1000;
+	remainder /= divisor[units];
+
+	/* add a 5 to the digit below what will be printed to ensure
+	 * an arithmetical round up and carry it through to size */
+	remainder += rounding[j];
+	if (remainder >= 1000) {
+		remainder -= 1000;
+		size += 1;
+	}
+
 	if (j) {
-		remainder *= 1000;
-		remainder /= divisor[units];
 		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
 		tmp[j+1] = '\0';
 	}



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] string_helpers: fix precision loss for some inputs
  2015-11-03 20:33 [PATCH] string_helpers: fix precision loss for some inputs James Bottomley
@ 2015-11-03 21:21 ` James Bottomley
  2015-11-03 22:13   ` Rasmus Villemoes
  2015-11-03 23:12   ` [PATCH v3] " James Bottomley
  0 siblings, 2 replies; 9+ messages in thread
From: James Bottomley @ 2015-11-03 21:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-scsi, ulf.hansson, linux, andriy.shevchenko, keescook,
	linux-kernel, akpm

From: James Bottomley <JBottomley@Odin.com>

It was noticed that we lose precision in the final calculation for some
inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
current algorithm doesn't correctly account for all the remainders in the
logarithms.  Fix this by doing a correct calculation in the remainders based
on napier's algorithm.  Additionally, now we have the correct result, we have
to account for arithmetic rounding because we're printing 3 digits of
precision.  This means that if the fourth digit is five or greater, we have to
round up, so add a section to ensure correct rounding.  Finally account for
all possible inputs correctly, including zero for block size.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org	# delay backport by two months for testing
Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBottomley@Odin.com>

--

v2: updated with a recommendation from Rasmus Villemoes to truncate the
initial precision at just under 32 bits

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..363faca 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		[STRING_UNITS_10] = 1000,
 		[STRING_UNITS_2] = 1024,
 	};
-	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	static const unsigned int rounding[] = { 500, 50, 5, 0};
+	int i = 0, j;
+	u32 remainder = 0, sf_cap;
 	char tmp[8];
 	const char *unit;
 
 	tmp[0] = '\0';
-	i = 0;
-	if (!size)
+
+	if (blk_size == 0)
+		size = 0;
+	if (size == 0)
 		goto out;
 
-	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
+	/* This is napier's algorithm.  Reduce the original block size to
+	 *
+	 * coefficient * divisor[units]^i
+	 *
+	 * we do the reduction so both coefficients are just under 32 bits so
+	 * that multiplying them together won't overflow 64 bits and we keep
+	 * as much precision as possible in the numbers
+	 */
+
+	while (blk_size >= UINT_MAX)
 		i++;
-	}
 
-	exp = divisor[units] / (u32)blk_size;
-	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
-	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
+	while (size >= UINT_MAX)
 		i++;
-	} else {
-		remainder *= size;
-	}
+
+	/* now perform the actual multiplication keeping i as the sum of the
+	 * two logarithms */
 
 	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
 
+	/* and logarithmically reduce it until it's just under the divisor */
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
 		i++;
@@ -84,9 +86,20 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	for (j = 0; sf_cap*10 < 1000; j++)
 		sf_cap *= 10;
 
+	/* express the remainder as a decimal (it's currently the numerator of
+	 * a fraction whose denominator is divisor[units]) */
+	remainder *= 1000;
+	remainder /= divisor[units];
+
+	/* add a 5 to the digit below what will be printed to ensure
+	 * an arithmetical round up and carry it through to size */
+	remainder += rounding[j];
+	if (remainder >= 1000) {
+		remainder -= 1000;
+		size += 1;
+	}
+
 	if (j) {
-		remainder *= 1000;
-		remainder /= divisor[units];
 		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
 		tmp[j+1] = '\0';
 	}



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] string_helpers: fix precision loss for some inputs
  2015-11-03 21:21 ` [PATCH v2] " James Bottomley
@ 2015-11-03 22:13   ` Rasmus Villemoes
  2015-11-03 22:54     ` James Bottomley
  2015-11-03 23:12   ` [PATCH v3] " James Bottomley
  1 sibling, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2015-11-03 22:13 UTC (permalink / raw)
  To: James Bottomley
  Cc: Vitaly Kuznetsov, linux-scsi, ulf.hansson, andriy.shevchenko,
	keescook, linux-kernel, akpm

On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> From: James Bottomley <JBottomley@Odin.com>
>
> It was noticed that we lose precision in the final calculation for some
> inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
> should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
> current algorithm doesn't correctly account for all the remainders in the
> logarithms.  Fix this by doing a correct calculation in the remainders based
> on napier's algorithm.  Additionally, now we have the correct result, we have
> to account for arithmetic rounding because we're printing 3 digits of
> precision.  This means that if the fourth digit is five or greater, we have to
> round up, so add a section to ensure correct rounding.  Finally account for
> all possible inputs correctly, including zero for block size.
>
> Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: stable@vger.kernel.org	# delay backport by two months for testing
> Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
> Signed-off-by: James Bottomley <JBottomley@Odin.com>
>
> --
>
> v2: updated with a recommendation from Rasmus Villemoes to truncate the
> initial precision at just under 32 bits
>
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index 5939f63..363faca 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>  		[STRING_UNITS_10] = 1000,
>  		[STRING_UNITS_2] = 1024,
>  	};
> -	int i, j;
> -	u32 remainder = 0, sf_cap, exp;
> +	static const unsigned int rounding[] = { 500, 50, 5, 0};

j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?

> +
> +	while (blk_size >= UINT_MAX)
>  		i++;
> -	}
>  
> -	exp = divisor[units] / (u32)blk_size;
> -	/*
> -	 * size must be strictly greater than exp here to ensure that remainder
> -	 * is greater than divisor[units] coming out of the if below.
> -	 */
> -	if (size > exp) {
> -		remainder = do_div(size, divisor[units]);
> -		remainder *= blk_size;
> +	while (size >= UINT_MAX)
>  		i++;

Please spell it U32_MAX. Also, it's not clear why you left out the
do_divs ;-)

Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] string_helpers: fix precision loss for some inputs
  2015-11-03 22:13   ` Rasmus Villemoes
@ 2015-11-03 22:54     ` James Bottomley
  2015-11-03 23:26       ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2015-11-03 22:54 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Vitaly Kuznetsov, linux-scsi, ulf.hansson, andriy.shevchenko,
	keescook, linux-kernel, akpm

On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
> On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > From: James Bottomley <JBottomley@Odin.com>
> >
> > It was noticed that we lose precision in the final calculation for some
> > inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
> > current algorithm doesn't correctly account for all the remainders in the
> > logarithms.  Fix this by doing a correct calculation in the remainders based
> > on napier's algorithm.  Additionally, now we have the correct result, we have
> > to account for arithmetic rounding because we're printing 3 digits of
> > precision.  This means that if the fourth digit is five or greater, we have to
> > round up, so add a section to ensure correct rounding.  Finally account for
> > all possible inputs correctly, including zero for block size.
> >
> > Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Cc: stable@vger.kernel.org	# delay backport by two months for testing
> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
> > Signed-off-by: James Bottomley <JBottomley@Odin.com>
> >
> > --
> >
> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
> > initial precision at just under 32 bits
> >
> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> > index 5939f63..363faca 100644
> > --- a/lib/string_helpers.c
> > +++ b/lib/string_helpers.c
> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >  		[STRING_UNITS_10] = 1000,
> >  		[STRING_UNITS_2] = 1024,
> >  	};
> > -	int i, j;
> > -	u32 remainder = 0, sf_cap, exp;
> > +	static const unsigned int rounding[] = { 500, 50, 5, 0};
> 
> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?

No reason beyond a vague worry someone might try to increase the printed
precision by one digit.

> > +
> > +	while (blk_size >= UINT_MAX)
> >  		i++;
> > -	}
> >  
> > -	exp = divisor[units] / (u32)blk_size;
> > -	/*
> > -	 * size must be strictly greater than exp here to ensure that remainder
> > -	 * is greater than divisor[units] coming out of the if below.
> > -	 */
> > -	if (size > exp) {
> > -		remainder = do_div(size, divisor[units]);
> > -		remainder *= blk_size;
> > +	while (size >= UINT_MAX)
> >  		i++;
> 
> Please spell it U32_MAX

Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
works, of course but UINT_MAX is standard.

> . Also, it's not clear why you left out the
> do_divs ;-)

Over reduction.

James


> Rasmus
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 




^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3] string_helpers: fix precision loss for some inputs
  2015-11-03 21:21 ` [PATCH v2] " James Bottomley
  2015-11-03 22:13   ` Rasmus Villemoes
@ 2015-11-03 23:12   ` James Bottomley
  2015-11-07  0:50     ` [PATCH v4] " James Bottomley
  1 sibling, 1 reply; 9+ messages in thread
From: James Bottomley @ 2015-11-03 23:12 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-scsi, ulf.hansson, linux, andriy.shevchenko, keescook,
	linux-kernel, akpm

From: James Bottomley <JBottomley@Odin.com>

It was noticed that we lose precision in the final calculation for some
inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
current algorithm doesn't correctly account for all the remainders in the
logarithms.  Fix this by doing a correct calculation in the remainders based
on napier's algorithm.  Additionally, now we have the correct result, we have
to account for arithmetic rounding because we're printing 3 digits of
precision.  This means that if the fourth digit is five or greater, we have to
round up, so add a section to ensure correct rounding.  Finally account for
all possible inputs correctly, including zero for block size.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org	# delay backport by two months for testing
Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBottomley@Odin.com>

---

v2: updated with a recommendation from Rasmus Villemoes to truncate the
initial precision at just under 32 bits

v3: put back the missing do_divs

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..c2ffb73 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -43,38 +43,45 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		[STRING_UNITS_10] = 1000,
 		[STRING_UNITS_2] = 1024,
 	};
-	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	static const unsigned int rounding[] = { 500, 50, 5, 0};
+	int i = 0, j;
+	u32 remainder = 0, sf_cap;
 	char tmp[8];
 	const char *unit;
 
 	tmp[0] = '\0';
-	i = 0;
-	if (!size)
+
+	if (blk_size == 0)
+		size = 0;
+	if (size == 0)
 		goto out;
 
-	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
+	/* This is napier's algorithm.  Reduce the original block size to
+	 *
+	 * coefficient * divisor[units]^i
+	 *
+	 * we do the reduction so both coefficients are just under 32 bits so
+	 * that multiplying them together won't overflow 64 bits and we keep
+	 * as much precision as possible in the numbers.
+	 *
+	 * Note: it's safe to throw away the remainders here because all the
+	 * precision is in the coefficients.
+	 */
+	while (blk_size >= UINT_MAX) {
+		do_div(blk_size, divisor[units]);
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
-	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
-	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
+	while (size >= UINT_MAX) {
+		do_div(size, divisor[units]);
 		i++;
-	} else {
-		remainder *= size;
 	}
 
+	/* now perform the actual multiplication keeping i as the sum of the
+	 * two logarithms */
 	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
 
+	/* and logarithmically reduce it until it's just under the divisor */
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
 		i++;
@@ -84,9 +91,20 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	for (j = 0; sf_cap*10 < 1000; j++)
 		sf_cap *= 10;
 
+	/* express the remainder as a decimal (it's currently the numerator of
+	 * a fraction whose denominator is divisor[units]) */
+	remainder *= 1000;
+	remainder /= divisor[units];
+
+	/* add a 5 to the digit below what will be printed to ensure
+	 * an arithmetical round up and carry it through to size */
+	remainder += rounding[j];
+	if (remainder >= 1000) {
+		remainder -= 1000;
+		size += 1;
+	}
+
 	if (j) {
-		remainder *= 1000;
-		remainder /= divisor[units];
 		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
 		tmp[j+1] = '\0';
 	}



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] string_helpers: fix precision loss for some inputs
  2015-11-03 22:54     ` James Bottomley
@ 2015-11-03 23:26       ` Rasmus Villemoes
  2015-11-03 23:42         ` James Bottomley
  0 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2015-11-03 23:26 UTC (permalink / raw)
  To: James Bottomley
  Cc: Vitaly Kuznetsov, linux-scsi, ulf.hansson, andriy.shevchenko,
	keescook, linux-kernel, akpm

On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
>> On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>> 
>> > From: James Bottomley <JBottomley@Odin.com>
>> >
>> > It was noticed that we lose precision in the final calculation for some
>> > inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
>> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
>> > current algorithm doesn't correctly account for all the remainders in the
>> > logarithms.  Fix this by doing a correct calculation in the remainders based
>> > on napier's algorithm.  Additionally, now we have the correct result, we have
>> > to account for arithmetic rounding because we're printing 3 digits of
>> > precision.  This means that if the fourth digit is five or greater, we have to
>> > round up, so add a section to ensure correct rounding.  Finally account for
>> > all possible inputs correctly, including zero for block size.
>> >
>> > Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> > Cc: stable@vger.kernel.org	# delay backport by two months for testing
>> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
>> > Signed-off-by: James Bottomley <JBottomley@Odin.com>
>> >
>> > --
>> >
>> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
>> > initial precision at just under 32 bits
>> >
>> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> > index 5939f63..363faca 100644
>> > --- a/lib/string_helpers.c
>> > +++ b/lib/string_helpers.c
>> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> >  		[STRING_UNITS_10] = 1000,
>> >  		[STRING_UNITS_2] = 1024,
>> >  	};
>> > -	int i, j;
>> > -	u32 remainder = 0, sf_cap, exp;
>> > +	static const unsigned int rounding[] = { 500, 50, 5, 0};
>> 
>> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?
>
> No reason beyond a vague worry someone might try to increase the printed
> precision by one digit.

But that would seem to require prepending 5000 to that array and
changing various constants below to 10000 (aside from checking all
callers to see if they pass a sufficient buffer size) - the 0 doesn't
serve any purpose in that scenario either.

>> > +
>> > +	while (blk_size >= UINT_MAX)
>> >  		i++;
>> > -	}
>> >  
>> > -	exp = divisor[units] / (u32)blk_size;
>> > -	/*
>> > -	 * size must be strictly greater than exp here to ensure that remainder
>> > -	 * is greater than divisor[units] coming out of the if below.
>> > -	 */
>> > -	if (size > exp) {
>> > -		remainder = do_div(size, divisor[units]);
>> > -		remainder *= blk_size;
>> > +	while (size >= UINT_MAX)
>> >  		i++;
>> 
>> Please spell it U32_MAX
>
> Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
> works, of course but UINT_MAX is standard.

We're dealing with explicitly sized integers, and the comment even says
that we're reducing till we fit in 32 bits, so that we can do a
32x32->64 multiplication. U32_MAX is the natural name for the
appropriate constant.

Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
to make any difference (well, except it might generate slightly better
code, since it would allow gcc to just test the upper half for being 0,
which might be cheaper on some architectures than comparing to a
literal).

Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] string_helpers: fix precision loss for some inputs
  2015-11-03 23:26       ` Rasmus Villemoes
@ 2015-11-03 23:42         ` James Bottomley
  2015-11-04  9:02           ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2015-11-03 23:42 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Vitaly Kuznetsov, linux-scsi, ulf.hansson, andriy.shevchenko,
	keescook, linux-kernel, akpm

On Wed, 2015-11-04 at 00:26 +0100, Rasmus Villemoes wrote:
> On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
> >> On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> >> 
> >> > From: James Bottomley <JBottomley@Odin.com>
> >> >
> >> > It was noticed that we lose precision in the final calculation for some
> >> > inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
> >> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
> >> > current algorithm doesn't correctly account for all the remainders in the
> >> > logarithms.  Fix this by doing a correct calculation in the remainders based
> >> > on napier's algorithm.  Additionally, now we have the correct result, we have
> >> > to account for arithmetic rounding because we're printing 3 digits of
> >> > precision.  This means that if the fourth digit is five or greater, we have to
> >> > round up, so add a section to ensure correct rounding.  Finally account for
> >> > all possible inputs correctly, including zero for block size.
> >> >
> >> > Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> > Cc: stable@vger.kernel.org	# delay backport by two months for testing
> >> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
> >> > Signed-off-by: James Bottomley <JBottomley@Odin.com>
> >> >
> >> > --
> >> >
> >> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
> >> > initial precision at just under 32 bits
> >> >
> >> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> >> > index 5939f63..363faca 100644
> >> > --- a/lib/string_helpers.c
> >> > +++ b/lib/string_helpers.c
> >> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
> >> >  		[STRING_UNITS_10] = 1000,
> >> >  		[STRING_UNITS_2] = 1024,
> >> >  	};
> >> > -	int i, j;
> >> > -	u32 remainder = 0, sf_cap, exp;
> >> > +	static const unsigned int rounding[] = { 500, 50, 5, 0};
> >> 
> >> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?
> >
> > No reason beyond a vague worry someone might try to increase the printed
> > precision by one digit.
> 
> But that would seem to require prepending 5000 to that array and
> changing various constants below to 10000 (aside from checking all
> callers to see if they pass a sufficient buffer size) - the 0 doesn't
> serve any purpose in that scenario either.
> 
> >> > +
> >> > +	while (blk_size >= UINT_MAX)
> >> >  		i++;
> >> > -	}
> >> >  
> >> > -	exp = divisor[units] / (u32)blk_size;
> >> > -	/*
> >> > -	 * size must be strictly greater than exp here to ensure that remainder
> >> > -	 * is greater than divisor[units] coming out of the if below.
> >> > -	 */
> >> > -	if (size > exp) {
> >> > -		remainder = do_div(size, divisor[units]);
> >> > -		remainder *= blk_size;
> >> > +	while (size >= UINT_MAX)
> >> >  		i++;
> >> 
> >> Please spell it U32_MAX
> >
> > Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
> > works, of course but UINT_MAX is standard.
> 
> We're dealing with explicitly sized integers

An integer is explicitly sized: it's 32 bits.  That's why UINT_MAX is a
universal constant.

>  and the comment even says
> that we're reducing till we fit in 32 bits, so that we can do a
> 32x32->64 multiplication. U32_MAX is the natural name for the
> appropriate constant.
> 
> Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
> to make any difference (well, except it might generate slightly better
> code, since it would allow gcc to just test the upper half for being 0,
> which might be cheaper on some architectures than comparing to a
> literal).

Heh if we're going to be that concerned about the code generation, then
we should just tell gcc exactly how to do it instead of hoping it can
work it out for itself, so

while (blk_size >> 32) {
...

James

> Rasmus
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] string_helpers: fix precision loss for some inputs
  2015-11-03 23:42         ` James Bottomley
@ 2015-11-04  9:02           ` Rasmus Villemoes
  0 siblings, 0 replies; 9+ messages in thread
From: Rasmus Villemoes @ 2015-11-04  9:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Vitaly Kuznetsov, linux-scsi, ulf.hansson, andriy.shevchenko,
	keescook, linux-kernel, akpm

On Wed, Nov 04 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Wed, 2015-11-04 at 00:26 +0100, Rasmus Villemoes wrote:
>> On Tue, Nov 03 2015, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>> >> Please spell it U32_MAX
>> >
>> > Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
>> > works, of course but UINT_MAX is standard.
>> 
>> We're dealing with explicitly sized integers
>
> An integer is explicitly sized: it's 32 bits. That's why UINT_MAX is
> a universal constant.

In the Linux universe, yes. It's kind of amusing how you try to argue
based on the UINT_MAX name being (defined in a) standard while at the same
time very much rely on sizeof(int) having a value which is not specified
by the standard. I repeat:

>> U32_MAX is the natural name for the appropriate constant.

(and it's defined right next to UINT_MAX in kernel.h, so it's not like
you'd have to introduce that macro).

>> Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
>> to make any difference (well, except it might generate slightly better
>> code, since it would allow gcc to just test the upper half for being 0,
>> which might be cheaper on some architectures than comparing to a
>> literal).
>
> Heh if we're going to be that concerned about the code generation, then
> we should just tell gcc exactly how to do it instead of hoping it can
> work it out for itself, so
>
> while (blk_size >> 32) {
> ...

Nah, that would still require the compiler to be able to transform that
to the other form, which apparently it isn't. On x86_64, the simplest
is to load U32_MAX once into a register and then do r/r comparisons, but
when it's possible to directly test the upper half (e.g. when the 64 bit
value is represented in a pair of 32 bit registers) that's much
simpler. gcc generates good code for 'blk_size > U32_MAX' on both x86_64
and x86_32, but ends up doing an extra cmp on x86_32 for >=, and ends up
doing mov,shift,test inside the loop on x86_64 for 'blk_size >> 32'.

Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4] string_helpers: fix precision loss for some inputs
  2015-11-03 23:12   ` [PATCH v3] " James Bottomley
@ 2015-11-07  0:50     ` James Bottomley
  0 siblings, 0 replies; 9+ messages in thread
From: James Bottomley @ 2015-11-07  0:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-scsi, ulf.hansson, linux, andriy.shevchenko, keescook,
	linux-kernel, akpm

From: James Bottomley <JBottomley@Odin.com>

It was noticed that we lose precision in the final calculation for some
inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
current algorithm doesn't correctly account for all the remainders in the
logarithms.  Fix this by doing a correct calculation in the remainders based
on napier's algorithm.  Additionally, now we have the correct result, we have
to account for arithmetic rounding because we're printing 3 digits of
precision.  This means that if the fourth digit is five or greater, we have to
round up, so add a section to ensure correct rounding.  Finally account for
all possible inputs correctly, including zero for block size.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org	# delay backport by two months for testing
Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBottomley@Odin.com>

---

v2: updated with a recommendation from Rasmus Villemoes to truncate the
initial precision at just under 32 bits

v3: put back the missing do_divs

v4: use explicit top bit checking in logarithmic reductions.  Only adjust
remainder precision where necessary

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..5c88204 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -43,50 +43,73 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		[STRING_UNITS_10] = 1000,
 		[STRING_UNITS_2] = 1024,
 	};
-	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	static const unsigned int rounding[] = { 500, 50, 5 };
+	int i = 0, j;
+	u32 remainder = 0, sf_cap;
 	char tmp[8];
 	const char *unit;
 
 	tmp[0] = '\0';
-	i = 0;
-	if (!size)
+
+	if (blk_size == 0)
+		size = 0;
+	if (size == 0)
 		goto out;
 
-	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
+	/* This is Napier's algorithm.  Reduce the original block size to
+	 *
+	 * coefficient * divisor[units]^i
+	 *
+	 * we do the reduction so both coefficients are just under 32 bits so
+	 * that multiplying them together won't overflow 64 bits and we keep
+	 * as much precision as possible in the numbers.
+	 *
+	 * Note: it's safe to throw away the remainders here because all the
+	 * precision is in the coefficients.
+	 */
+	while (blk_size >> 32) {
+		do_div(blk_size, divisor[units]);
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
-	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
-	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
+	while (size >> 32) {
+		do_div(size, divisor[units]);
 		i++;
-	} else {
-		remainder *= size;
 	}
 
+	/* now perform the actual multiplication keeping i as the sum of the
+	 * two logarithms */
 	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
 
+	/* and logarithmically reduce it until it's just under the divisor */
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
 		i++;
 	}
 
+	/* work out in j how many digits of precision we need from the
+	 * remainder */
 	sf_cap = size;
 	for (j = 0; sf_cap*10 < 1000; j++)
 		sf_cap *= 10;
 
-	if (j) {
+	if (units == STRING_UNITS_2) {
+		/* express the remainder as a decimal.  It's currently the
+		 * numerator of a fraction whose denominator is
+		 * divisor[units], which is 1 << 10 for STRING_UNITS_2 */
 		remainder *= 1000;
-		remainder /= divisor[units];
+		remainder >>= 10;
+	}
+
+	/* add a 5 to the digit below what will be printed to ensure
+	 * an arithmetical round up and carry it through to size */
+	remainder += rounding[j];
+	if (remainder >= 1000) {
+		remainder -= 1000;
+		size += 1;
+	}
+
+	if (j) {
 		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
 		tmp[j+1] = '\0';
 	}



^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-11-07  0:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-03 20:33 [PATCH] string_helpers: fix precision loss for some inputs James Bottomley
2015-11-03 21:21 ` [PATCH v2] " James Bottomley
2015-11-03 22:13   ` Rasmus Villemoes
2015-11-03 22:54     ` James Bottomley
2015-11-03 23:26       ` Rasmus Villemoes
2015-11-03 23:42         ` James Bottomley
2015-11-04  9:02           ` Rasmus Villemoes
2015-11-03 23:12   ` [PATCH v3] " James Bottomley
2015-11-07  0:50     ` [PATCH v4] " James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).