linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
@ 2020-12-20 21:10 Randy Dunlap
  2020-12-23  1:35 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Randy Dunlap @ 2020-12-20 21:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: Randy Dunlap, Jens Axboe, Andrew Morton, Toralf Förster, linux-mm

Add a test to detect if the input ra request size has its high order
bit set (is negative when tested as a signed long). This would be a
really Huge readahead.

If so, WARN() with the value and a stack trace so that we can see
where this is happening and then make further corrections later.
Then adjust the size value so that it is not so Huge (although
this may not be needed).

Also correct a comment: the return value is not squared for
small size.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: linux-mm@kvack.org
---
Notes:
- Look for "WARNING:.*get_init_ra_size"
- If panic_on_warn is set, this will cause a kernel panic().

 mm/readahead.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- linux-5.10.1.orig/mm/readahead.c
+++ linux-5.10.1/mm/readahead.c
@@ -14,6 +14,7 @@
 #include <linux/export.h>
 #include <linux/blkdev.h>
 #include <linux/backing-dev.h>
+#include <linux/bug.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/pagevec.h>
 #include <linux/pagemap.h>
@@ -303,14 +304,21 @@ void force_page_cache_ra(struct readahea
 }
 
 /*
- * Set the initial window size, round to next power of 2 and square
+ * Set the initial window size, round to next power of 2
  * for small size, x 4 for medium, and x 2 for large
  * for 128k (32 page) max ra
  * 1-8 page = 32k initial, > 8 page = 128k initial
  */
 static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
 {
-	unsigned long newsize = roundup_pow_of_two(size);
+	unsigned long newsize;
+
+	if ((signed long)size < 0) { /* high bit is set: ultra-large ra req */
+		WARN_ONCE(1, "%s: size=0x%lx\n", __func__, size);
+		size = -size;	/* really only need to flip the high/sign bit */
+	}
+
+	newsize = roundup_pow_of_two(size);
 
 	if (newsize <= max / 32)
 		newsize = newsize * 4;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
  2020-12-20 21:10 [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size() Randy Dunlap
@ 2020-12-23  1:35 ` Andrew Morton
  2020-12-23  1:50   ` Randy Dunlap
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2020-12-23  1:35 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: linux-kernel, Jens Axboe, Toralf Förster, linux-mm

On Sun, 20 Dec 2020 13:10:51 -0800 Randy Dunlap <rdunlap@infradead.org> wrote:

> Add a test to detect if the input ra request size has its high order
> bit set (is negative when tested as a signed long). This would be a
> really Huge readahead.
> 
> If so, WARN() with the value and a stack trace so that we can see
> where this is happening and then make further corrections later.
> Then adjust the size value so that it is not so Huge (although
> this may not be needed).

What motivates this change?  Is there any reason to think this can
happen?

Also, everything in there *should* be unsigned, because a negative
readahead is semantically nonsensical.  Is our handling of this
inherently unsigned quantity incorrect somewhere?

> --- linux-5.10.1.orig/mm/readahead.c
> +++ linux-5.10.1/mm/readahead.c
> 
> ...
>
> @@ -303,14 +304,21 @@ void force_page_cache_ra(struct readahea
>  }
>  
>  /*
> - * Set the initial window size, round to next power of 2 and square
> + * Set the initial window size, round to next power of 2
>   * for small size, x 4 for medium, and x 2 for large
>   * for 128k (32 page) max ra
>   * 1-8 page = 32k initial, > 8 page = 128k initial
>   */
>  static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
>  {
> -	unsigned long newsize = roundup_pow_of_two(size);
> +	unsigned long newsize;
> +
> +	if ((signed long)size < 0) { /* high bit is set: ultra-large ra req */
> +		WARN_ONCE(1, "%s: size=0x%lx\n", __func__, size);
> +		size = -size;	/* really only need to flip the high/sign bit */
> +	}
> +
> +	newsize = roundup_pow_of_two(size);

Is there any way in which userspace can deliberately trigger warning?
Via sys_readadhead() or procfs tuning or whatever?

I guess that permitting a user-triggerable WARN_ONCE() isn't a huuuuge
problem - it isn't a DoS if it only triggers a single time.  It does
permit the malicious user to disable future valid warnings, but I don't
see what incentive there would be for this.  But still, it seems
desirable to avoid it.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
  2020-12-23  1:35 ` Andrew Morton
@ 2020-12-23  1:50   ` Randy Dunlap
  2020-12-29 18:01     ` Toralf Förster
  0 siblings, 1 reply; 7+ messages in thread
From: Randy Dunlap @ 2020-12-23  1:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Jens Axboe, Toralf Förster, linux-mm

On 12/22/20 5:35 PM, Andrew Morton wrote:
> On Sun, 20 Dec 2020 13:10:51 -0800 Randy Dunlap <rdunlap@infradead.org> wrote:
> 
>> Add a test to detect if the input ra request size has its high order
>> bit set (is negative when tested as a signed long). This would be a
>> really Huge readahead.
>>
>> If so, WARN() with the value and a stack trace so that we can see
>> where this is happening and then make further corrections later.
>> Then adjust the size value so that it is not so Huge (although
>> this may not be needed).
> 
> What motivates this change?  Is there any reason to think this can
> happen?

Spotted in the wild:

mr-fox kernel: [ 1974.206977] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
mr-fox kernel: [ 1974.206980] shift exponent 64 is too large for 64-bit type 'long unsigned int'

Original report:
https://lore.kernel.org/lkml/c6e5eb81-680f-dd5c-8a81-62041a5ce50c@gmx.de/


Willy suggested that get_init_ra_size() was being called with a size of 0,
which would cause this (instead of some Huge value), so I made a follow-up
patch that only checks size for 0 and if 0, defaults it to 32 (pages).

---
 mm/readahead.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- linux-5.10.1.orig/mm/readahead.c
+++ linux-5.10.1/mm/readahead.c
@@ -310,7 +310,11 @@ void force_page_cache_ra(struct readahea
  */
 static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
 {
-	unsigned long newsize = roundup_pow_of_two(size);
+	unsigned long newsize;
+
+	if (!size)
+		size = 32;
+	newsize = roundup_pow_of_two(size);
 
 	if (newsize <= max / 32)
 		newsize = newsize * 4;


Toralf has only seen this problem one time.


> Also, everything in there *should* be unsigned, because a negative
> readahead is semantically nonsensical.  Is our handling of this
> inherently unsigned quantity incorrect somewhere?
> 
>> --- linux-5.10.1.orig/mm/readahead.c
>> +++ linux-5.10.1/mm/readahead.c
>>
>> ...
>>
>> @@ -303,14 +304,21 @@ void force_page_cache_ra(struct readahea
>>  }
>>  
>>  /*
>> - * Set the initial window size, round to next power of 2 and square
>> + * Set the initial window size, round to next power of 2
>>   * for small size, x 4 for medium, and x 2 for large
>>   * for 128k (32 page) max ra
>>   * 1-8 page = 32k initial, > 8 page = 128k initial
>>   */
>>  static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
>>  {
>> -	unsigned long newsize = roundup_pow_of_two(size);
>> +	unsigned long newsize;
>> +
>> +	if ((signed long)size < 0) { /* high bit is set: ultra-large ra req */
>> +		WARN_ONCE(1, "%s: size=0x%lx\n", __func__, size);
>> +		size = -size;	/* really only need to flip the high/sign bit */
>> +	}
>> +
>> +	newsize = roundup_pow_of_two(size);
> 
> Is there any way in which userspace can deliberately trigger warning?
> Via sys_readadhead() or procfs tuning or whatever?
> 
> I guess that permitting a user-triggerable WARN_ONCE() isn't a huuuuge
> problem - it isn't a DoS if it only triggers a single time.  It does
> permit the malicious user to disable future valid warnings, but I don't
> see what incentive there would be for this.  But still, it seems
> desirable to avoid it.

Sure. I think that we can drop RFC patches 1/2 and 2/2 and just consider the
other one above.


-- 
~Randy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
  2020-12-23  1:50   ` Randy Dunlap
@ 2020-12-29 18:01     ` Toralf Förster
  2020-12-29 18:11       ` Randy Dunlap
  0 siblings, 1 reply; 7+ messages in thread
From: Toralf Förster @ 2020-12-29 18:01 UTC (permalink / raw)
  To: Randy Dunlap, Andrew Morton; +Cc: linux-kernel, Jens Axboe, linux-mm

On 12/23/20 2:50 AM, Randy Dunlap wrote:
>> What motivates this change?  Is there any reason to think this can
>> happen?
> Spotted in the wild:
I run 2 hardened Gentoo systems, a server and a desktop.

I patched the server with this:

mr-fox ~ # cat ubsan.patch
--- linux-5.10.1.orig/mm/readahead.c
+++ linux-5.10.1/mm/readahead.c
@@ -310,7 +310,11 @@ void force_page_cache_ra(struct readahea
   */
  static unsigned long get_init_ra_size(unsigned long size, unsigned
long max)
  {
-       unsigned long newsize = roundup_pow_of_two(size);
+       unsigned long newsize;
+
+       if (!size)
+               size = 32;
+       newsize = roundup_pow_of_two(size);

         if (newsize <= max / 32)
                 newsize = newsize * 4;



and the issue did no longer occurred at the server (5.10.2).

I did not patched the desktop system and the issue occurred still 3
times since 21th of december (5.10.2/3)

--
Toralf

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
  2020-12-29 18:01     ` Toralf Förster
@ 2020-12-29 18:11       ` Randy Dunlap
  2020-12-29 20:00         ` Randy Dunlap
  0 siblings, 1 reply; 7+ messages in thread
From: Randy Dunlap @ 2020-12-29 18:11 UTC (permalink / raw)
  To: Toralf Förster, Andrew Morton; +Cc: linux-kernel, Jens Axboe, linux-mm

On 12/29/20 10:01 AM, Toralf Förster wrote:
> On 12/23/20 2:50 AM, Randy Dunlap wrote:
>>> What motivates this change?  Is there any reason to think this can
>>> happen?
>> Spotted in the wild:
> I run 2 hardened Gentoo systems, a server and a desktop.
> 
> I patched the server with this:
> 
> mr-fox ~ # cat ubsan.patch
> --- linux-5.10.1.orig/mm/readahead.c
> +++ linux-5.10.1/mm/readahead.c
> @@ -310,7 +310,11 @@ void force_page_cache_ra(struct readahea
>   */
>  static unsigned long get_init_ra_size(unsigned long size, unsigned
> long max)
>  {
> -       unsigned long newsize = roundup_pow_of_two(size);
> +       unsigned long newsize;
> +
> +       if (!size)
> +               size = 32;
> +       newsize = roundup_pow_of_two(size);
> 
>         if (newsize <= max / 32)
>                 newsize = newsize * 4;
> 
> 
> 
> and the issue did no longer occurred at the server (5.10.2).
> 
> I did not patched the desktop system and the issue occurred still 3
> times since 21th of december (5.10.2/3)

Yes, that's the patch that I posted on 2020-DEC-22.

Looks like I should submit a real patch for that.

thanks.
-- 
~Randy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
  2020-12-29 18:11       ` Randy Dunlap
@ 2020-12-29 20:00         ` Randy Dunlap
  2020-12-29 20:46           ` Toralf Förster
  0 siblings, 1 reply; 7+ messages in thread
From: Randy Dunlap @ 2020-12-29 20:00 UTC (permalink / raw)
  To: Toralf Förster, Andrew Morton; +Cc: linux-kernel, Jens Axboe, linux-mm

On 12/29/20 10:11 AM, Randy Dunlap wrote:
> On 12/29/20 10:01 AM, Toralf Förster wrote:
>> On 12/23/20 2:50 AM, Randy Dunlap wrote:
>>>> What motivates this change?  Is there any reason to think this can
>>>> happen?
>>> Spotted in the wild:
>> I run 2 hardened Gentoo systems, a server and a desktop.
>>
>> I patched the server with this:
>>
>> mr-fox ~ # cat ubsan.patch
>> --- linux-5.10.1.orig/mm/readahead.c
>> +++ linux-5.10.1/mm/readahead.c
>> @@ -310,7 +310,11 @@ void force_page_cache_ra(struct readahea
>>   */
>>  static unsigned long get_init_ra_size(unsigned long size, unsigned
>> long max)
>>  {
>> -       unsigned long newsize = roundup_pow_of_two(size);
>> +       unsigned long newsize;
>> +
>> +       if (!size)
>> +               size = 32;
>> +       newsize = roundup_pow_of_two(size);
>>
>>         if (newsize <= max / 32)
>>                 newsize = newsize * 4;
>>
>>
>>
>> and the issue did no longer occurred at the server (5.10.2).
>>
>> I did not patched the desktop system and the issue occurred still 3
>> times since 21th of december (5.10.2/3)
> 
> Yes, that's the patch that I posted on 2020-DEC-22.
> 
> Looks like I should submit a real patch for that.
> 
> thanks.
> 

Hi Toralf,

Do you want either or both of your
Reported-by: and Tested-by: on the patch?

thanks.
-- 
~Randy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size()
  2020-12-29 20:00         ` Randy Dunlap
@ 2020-12-29 20:46           ` Toralf Förster
  0 siblings, 0 replies; 7+ messages in thread
From: Toralf Förster @ 2020-12-29 20:46 UTC (permalink / raw)
  To: Randy Dunlap, Andrew Morton; +Cc: linux-kernel, Jens Axboe, linux-mm

On 12/29/20 9:00 PM, Randy Dunlap wrote:
> Hi Toralf,
>
> Do you want either or both of your
> Reported-by: and Tested-by: on the patch?
>
> thanks.
> -- ~Randy
The first is enough, for a Tested-by: was the time frame too short IMO.

--
Toralf

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-12-29 20:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-20 21:10 [RFC PATCH 2/2] mm: readahead: handle LARGE input to get_init_ra_size() Randy Dunlap
2020-12-23  1:35 ` Andrew Morton
2020-12-23  1:50   ` Randy Dunlap
2020-12-29 18:01     ` Toralf Förster
2020-12-29 18:11       ` Randy Dunlap
2020-12-29 20:00         ` Randy Dunlap
2020-12-29 20:46           ` Toralf Förster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).