linux-kernel.vger.kernel.org archive mirror
* [PATCH] decompressors: fix "no limit" output buffer length
@ 2013-07-22  6:56 Alexandre Courbot
  2013-07-22 18:08 ` Jon Medhurst (Tixy)
  0 siblings, 1 reply; 5+ messages in thread
From: Alexandre Courbot @ 2013-07-22  6:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-arm-kernel, gnurou, Alexandre Courbot

When decompressing into memory, the output buffer length is set to some
arbitrarily high value (0x7fffffff) to indicate the output is,
virtually, unlimited in size.

The problem is that some platforms place their physical memory at high
physical addresses (0x80000000 or above), so the output buffer address
and its "unlimited" length cannot be added without overflowing. An
example of this can be found in inflate_fast():

/* next_out is the output buffer address */
out = strm->next_out - OFF;
/* avail_out is the output buffer size. end will overflow if the output
 * address is >= 0x80000104 */
end = out + (strm->avail_out - 257);

This has severe consequences for the performance of kernel decompression,
since the following exit condition of inflate_fast() will always be
true:

} while (in < last && out < end);

Indeed, "end" has overflowed and is now always lower than "out". As a
result, inflate_fast() returns after processing a single byte of input
data, and thus needs to be called an unreasonably large number of
times. This probably went unnoticed because kernel decompression is
fast enough even with this issue.

Nonetheless, adjusting the output buffer length in such a way that the
above pointer arithmetic never overflows results in a kernel
decompression that is about 3 times faster on affected machines.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 lib/decompress_inflate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
index 19ff89e..d619b28 100644
--- a/lib/decompress_inflate.c
+++ b/lib/decompress_inflate.c
@@ -48,7 +48,7 @@ STATIC int INIT gunzip(unsigned char *buf, int len,
 		out_len = 0x8000; /* 32 K */
 		out_buf = malloc(out_len);
 	} else {
-		out_len = 0x7fffffff; /* no limit */
+		out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
 	}
 	if (!out_buf) {
 		error("Out of memory while allocating output buffer");
-- 
1.8.3.2



* Re: [PATCH] decompressors: fix "no limit" output buffer length
  2013-07-22  6:56 [PATCH] decompressors: fix "no limit" output buffer length Alexandre Courbot
@ 2013-07-22 18:08 ` Jon Medhurst (Tixy)
  2013-07-23  2:15   ` Alex Courbot
  0 siblings, 1 reply; 5+ messages in thread
From: Jon Medhurst (Tixy) @ 2013-07-22 18:08 UTC (permalink / raw)
  To: Alexandre Courbot; +Cc: Andrew Morton, gnurou, linux-kernel, linux-arm-kernel

On Mon, 2013-07-22 at 15:56 +0900, Alexandre Courbot wrote:
> [...]
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>

This speeds up booting of my Versatile Express TC2 by 15 seconds when
starting on the A7 cluster :-)

Tested-by: Jon Medhurst <tixy@linaro.org>


* Re: [PATCH] decompressors: fix "no limit" output buffer length
  2013-07-22 18:08 ` Jon Medhurst (Tixy)
@ 2013-07-23  2:15   ` Alex Courbot
  2013-07-23  3:32     ` Stephen Warren
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Courbot @ 2013-07-23  2:15 UTC (permalink / raw)
  To: Jon Medhurst (Tixy)
  Cc: Andrew Morton, gnurou, linux-kernel, linux-arm-kernel, linux-tegra

On 07/23/2013 03:08 AM, Jon Medhurst (Tixy) wrote:
> On Mon, 2013-07-22 at 15:56 +0900, Alexandre Courbot wrote:
>> [...]
>
> This speeds up booting of my Versatile Express TC2 by 15 seconds when
> starting on the A7 cluster :-)
>
> Tested-by: Jon Medhurst <tixy@linaro.org>

Good to hear! Thanks for taking the time to test this.

Although the patch seems ok to me in its current form, there are two 
points for which I still have small doubts:

1) Whether size_t and pointers have the same size on all platforms. If 
not, we might end up with some funny behavior. My limited research on 
the topic did not turn up evidence that their sizes can differ, but 
neither did it produce a definite guarantee that they cannot.
2) Whether all platforms have their address space ending at (~0). I do 
not have a concrete example in mind, but can imagine, say, a platform 
which represents its addresses as 32-bit pointers but has a smaller 
physical bus. In that case the current calculation could cause 
overflows again.

If either (or both) of these points is a real concern, there may be 
macros I am not aware of that better represent the actual limits of 
pointers in the kernel.

Thanks,
Alex.


* Re: [PATCH] decompressors: fix "no limit" output buffer length
  2013-07-23  2:15   ` Alex Courbot
@ 2013-07-23  3:32     ` Stephen Warren
  2013-07-23  5:01       ` Alex Courbot
  0 siblings, 1 reply; 5+ messages in thread
From: Stephen Warren @ 2013-07-23  3:32 UTC (permalink / raw)
  To: Alex Courbot
  Cc: Jon Medhurst (Tixy),
	Andrew Morton, gnurou, linux-kernel, linux-arm-kernel,
	linux-tegra

On 07/22/2013 07:15 PM, Alex Courbot wrote:
...
> Although the patch seems ok to me in its current form, there are two
> points for which I still have small doubts:
> 
> 1) Whether size_t and pointers will have the same size on all platforms.

ptrsize_t?


* Re: [PATCH] decompressors: fix "no limit" output buffer length
  2013-07-23  3:32     ` Stephen Warren
@ 2013-07-23  5:01       ` Alex Courbot
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Courbot @ 2013-07-23  5:01 UTC (permalink / raw)
  To: Stephen Warren, Andrew Morton
  Cc: Jon Medhurst (Tixy), gnurou, linux-kernel, linux-arm-kernel, linux-tegra

On 07/23/2013 12:32 PM, Stephen Warren wrote:
> On 07/22/2013 07:15 PM, Alex Courbot wrote:
> ...
>> Although the patch seems ok to me in its current form, there are two
>> points for which I still have small doubts:
>>
>> 1) Whether size_t and pointers will have the same size on all platforms.
>
> ptrsize_t?
>

Do you mean ptrdiff_t? (I cannot find ptrsize_t anywhere in the kernel)

Looking further into the uses of size_t and ptrdiff_t, it seems that 
size_t is designed to store the size of the largest possible object 
(such as the maximum index of an array), whereas ptrdiff_t is meant to 
store the result of subtracting two pointers. In effect, they are the 
unsigned (size_t) and signed (ptrdiff_t) variants of the same 
underlying type.

But since we know here that the result of the subtraction will always 
be positive and potentially large (for devices with memory in the lower 
half of the address space), using size_t sounds safer for avoiding 
overflow and sign-conversion issues (strm->avail_out, where the value 
of out_len eventually ends up, is an unsigned int).

So point 1) at least seems to be handled correctly with size_t. Point 2) 
might still be of concern, but if your uncompressed kernel image ends up 
overflowing your addressable memory, I guess you have a bigger problem 
to start with. :)

Andrew, do you think you can merge this as-is? Sorry if you are not the 
right person to ask, but there is no clear maintainer for this part of 
the code and you appear to have handled the latest patches that affect 
the same file.

Thanks,
Alex.


