* [PATCH] mm: slub: reinitialize random sequence cache on slab object update
@ 2020-01-30 10:18 vjitta
  2020-01-30 11:17 ` vjitta
  2020-01-30 18:28 ` Christopher Lameter
  0 siblings, 2 replies; 9+ messages in thread
From: vjitta @ 2020-01-30 10:18 UTC
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm
  Cc: linux-kernel, vinmenon, kernel-team, Vijayanand Jitta

From: Vijayanand Jitta <vjitta@codeaurora.org>

The random sequence cache is precomputed at slab creation, based on the
object size and the number of objects per slab. Both can change when flags
like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
uses the updated object count to index the stale precomputed random
sequence cache. The resulting out-of-range accesses can corrupt the slab.
Fix this by reinitializing the random sequence cache whenever the slab
object layout is updated.
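In outline, the interaction looks like this (a simplified sketch of the
mm/slub.c paths involved; helper names follow the kernel loosely and error
handling is elided):

    /* Simplified sketch -- not the exact kernel code. */
    static int init_cache_random_seq(struct kmem_cache *s)
    {
        unsigned int i, count = oo_objects(s->oo); /* objects per slab */
        int err = cache_random_seq_create(s, count, GFP_KERNEL);

        if (err)
            return err;

        /* Scale each random index into a byte offset, using the
         * object size that is current right now. */
        for (i = 0; i < count; i++)
            s->random_seq[i] *= s->size;
        return 0;
    }

    /*
     * A later sysfs write (store_user, poison, red_zone) makes
     * calculate_sizes() change s->size and s->oo, but s->random_seq
     * still holds offsets scaled by the old size for the old count.
     * shuffle_freelist() then indexes that stale table with the new
     * count, and set_freepointer() writes through offsets that can
     * land outside the slab page.
     */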

A sample panic trace when a write to the store_user sysfs attribute was
attempted:

Call trace:
 exception
 set_freepointer(inline)
 shuffle_freelist(inline)
 new_slab+0x688/0x690
 ___slab_alloc+0x548/0x6f8
 kmem_cache_alloc+0x3dc/0x418
 zs_malloc+0x60/0x578
 zram_bvec_rw+0x66c/0xaa0
 zram_make_request+0x190/0x2c8
 generic_make_request+0x1f8/0x420
 submit_bio+0x140/0x1d8
 submit_bh_wbc+0x1a0/0x1e0
 __block_write_full_page+0x3a0/0x5e8
 block_write_full_page+0xec/0x108
 blkdev_writepage+0x2c/0x38
 __writepage+0x34/0x98
 write_cache_pages+0x33c/0x598
 generic_writepages+0x54/0x98
 blkdev_writepages+0x24/0x30
 do_writepages+0x90/0x138
 __filemap_fdatawrite_range+0xc0/0x128
 file_write_and_wait_range+0x44/0xa0
 blkdev_fsync+0x38/0x68
 __arm64_sys_fsync+0x6c/0xb8

Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
---
 mm/slub.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 0ab92ec..88abac5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1533,6 +1533,24 @@ static int init_cache_random_seq(struct kmem_cache *s)
 	return 0;
 }
 
+/* re-initialize the random sequence cache */
+static int reinit_cache_random_seq(struct kmem_cache *s)
+{
+	int err;
+
+	if (s->random_seq) {
+		cache_random_seq_destroy(s);
+		err = init_cache_random_seq(s);
+
+		if (err) {
+			pr_err("SLUB: Unable to re-initialize random sequence cache for %s\n",
+				s->name);
+			return err;
+		}
+	}
+
+	return 0;
+}
 /* Initialize each random sequence freelist per cache */
 static void __init init_freelist_randomization(void)
 {
@@ -1607,6 +1625,10 @@ static inline int init_cache_random_seq(struct kmem_cache *s)
 {
 	return 0;
 }
+static inline int reinit_cache_random_seq(struct kmem_cache *s)
+{
+	return 0;
+}
 static inline void init_freelist_randomization(void) { }
 static inline bool shuffle_freelist(struct kmem_cache *s, struct page *page)
 {
@@ -5192,6 +5214,7 @@ static ssize_t red_zone_store(struct kmem_cache *s,
 		s->flags |= SLAB_RED_ZONE;
 	}
 	calculate_sizes(s, -1);
+	reinit_cache_random_seq(s);
 	return length;
 }
 SLAB_ATTR(red_zone);
@@ -5212,6 +5235,7 @@ static ssize_t poison_store(struct kmem_cache *s,
 		s->flags |= SLAB_POISON;
 	}
 	calculate_sizes(s, -1);
+	reinit_cache_random_seq(s);
 	return length;
 }
 SLAB_ATTR(poison);
@@ -5233,6 +5257,7 @@ static ssize_t store_user_store(struct kmem_cache *s,
 		s->flags |= SLAB_STORE_USER;
 	}
 	calculate_sizes(s, -1);
+	reinit_cache_random_seq(s);
 	return length;
 }
 SLAB_ATTR(store_user);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
1.9.1



* [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-01-30 10:18 [PATCH] mm: slub: reinitialize random sequence cache on slab object update vjitta
@ 2020-01-30 11:17 ` vjitta
  2020-02-27 16:53   ` Vlastimil Babka
  2020-01-30 18:28 ` Christopher Lameter
  1 sibling, 1 reply; 9+ messages in thread
From: vjitta @ 2020-01-30 11:17 UTC
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm
  Cc: linux-kernel, vinmenon, kernel-team, Vijayanand Jitta

From: Vijayanand Jitta <vjitta@codeaurora.org>

The random sequence cache is precomputed at slab creation, based on the
object size and the number of objects per slab. Both can change when flags
like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
uses the updated object count to index the stale precomputed random
sequence cache. The resulting out-of-range accesses can corrupt the slab.
Fix this by reinitializing the random sequence cache whenever the slab
object layout is updated.

A sample panic trace when a write to the store_user sysfs attribute was
attempted:

Call trace:
 exception
 set_freepointer(inline)
 shuffle_freelist(inline)
 new_slab+0x688/0x690
 ___slab_alloc+0x548/0x6f8
 kmem_cache_alloc+0x3dc/0x418
 zs_malloc+0x60/0x578
 zram_bvec_rw+0x66c/0xaa0
 zram_make_request+0x190/0x2c8
 generic_make_request+0x1f8/0x420
 submit_bio+0x140/0x1d8
 submit_bh_wbc+0x1a0/0x1e0
 __block_write_full_page+0x3a0/0x5e8
 block_write_full_page+0xec/0x108
 blkdev_writepage+0x2c/0x38
 __writepage+0x34/0x98
 write_cache_pages+0x33c/0x598
 generic_writepages+0x54/0x98
 blkdev_writepages+0x24/0x30
 do_writepages+0x90/0x138
 __filemap_fdatawrite_range+0xc0/0x128
 file_write_and_wait_range+0x44/0xa0
 blkdev_fsync+0x38/0x68
 __arm64_sys_fsync+0x6c/0xb8

Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
---
 mm/slub.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 0ab92ec..b88dd0f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1533,6 +1533,24 @@ static int init_cache_random_seq(struct kmem_cache *s)
 	return 0;
 }
 
+/* re-initialize the random sequence cache */
+static int reinit_cache_random_seq(struct kmem_cache *s)
+{
+	int err;
+
+	if (s->random_seq) {
+		cache_random_seq_destroy(s);
+		err = init_cache_random_seq(s);
+
+		if (err) {
+			pr_err("SLUB: Unable to re-initialize random sequence cache for %s\n",
+				s->name);
+			return err;
+		}
+	}
+
+	return 0;
+}
 /* Initialize each random sequence freelist per cache */
 static void __init init_freelist_randomization(void)
 {
@@ -1607,6 +1625,10 @@ static inline int init_cache_random_seq(struct kmem_cache *s)
 {
 	return 0;
 }
+static inline int reinit_cache_random_seq(struct kmem_cache *s)
+{
+	return 0;
+}
 static inline void init_freelist_randomization(void) { }
 static inline bool shuffle_freelist(struct kmem_cache *s, struct page *page)
 {
@@ -5192,6 +5214,7 @@ static ssize_t red_zone_store(struct kmem_cache *s,
 		s->flags |= SLAB_RED_ZONE;
 	}
 	calculate_sizes(s, -1);
+	reinit_cache_random_seq(s);
 	return length;
 }
 SLAB_ATTR(red_zone);
@@ -5212,6 +5235,7 @@ static ssize_t poison_store(struct kmem_cache *s,
 		s->flags |= SLAB_POISON;
 	}
 	calculate_sizes(s, -1);
+	reinit_cache_random_seq(s);
 	return length;
 }
 SLAB_ATTR(poison);
@@ -5233,6 +5257,7 @@ static ssize_t store_user_store(struct kmem_cache *s,
 		s->flags |= SLAB_STORE_USER;
 	}
 	calculate_sizes(s, -1);
+	reinit_cache_random_seq(s);
 	return length;
 }
 SLAB_ATTR(store_user);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
1.9.1


* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-01-30 10:18 [PATCH] mm: slub: reinitialize random sequence cache on slab object update vjitta
  2020-01-30 11:17 ` vjitta
@ 2020-01-30 18:28 ` Christopher Lameter
  2020-01-31  6:39   ` Vijayanand Jitta
  2020-02-03  6:57   ` Vijayanand Jitta
  1 sibling, 2 replies; 9+ messages in thread
From: Christopher Lameter @ 2020-01-30 18:28 UTC
  To: Vijayanand Jitta
  Cc: penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel,
	vinmenon, kernel-team

On Thu, 30 Jan 2020, vjitta@codeaurora.org wrote:

> The random sequence cache is precomputed at slab creation, based on the
> object size and the number of objects per slab. Both can change when flags
> like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
> is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it

Sorry, no. That cannot happen. Changing the size of the slab is only
possible if no slab pages are allocated. Any sysfs changes that affect the
object size must fail if objects and slab pages are already allocated.

If you were able to change the object size then we need to prevent that
from happening.
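For reference, the guard being referred to looks roughly like this in
mm/slub.c of that era (paraphrased; the debug-flag stores bail out while
any objects are live):

    static ssize_t store_user_store(struct kmem_cache *s,
                    const char *buf, size_t length)
    {
        /* Refuse the change while any objects are allocated. */
        if (any_slab_objects(s))
            return -EBUSY;

        s->flags &= ~SLAB_STORE_USER;
        if (buf[0] == '1') {
            s->flags &= ~__CMPXCHG_DOUBLE;
            s->flags |= SLAB_STORE_USER;
        }
        calculate_sizes(s, -1);
        return length;
    }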




* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-01-30 18:28 ` Christopher Lameter
@ 2020-01-31  6:39   ` Vijayanand Jitta
  2020-02-03  6:57   ` Vijayanand Jitta
  1 sibling, 0 replies; 9+ messages in thread
From: Vijayanand Jitta @ 2020-01-31  6:39 UTC
  To: Christopher Lameter
  Cc: penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel,
	vinmenon, kernel-team



On 1/30/2020 11:58 PM, Christopher Lameter wrote:
> On Thu, 30 Jan 2020, vjitta@codeaurora.org wrote:
> 
>> The random sequence cache is precomputed at slab creation, based on the
>> object size and the number of objects per slab. Both can change when flags
>> like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
>> is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
> 
> Sorry, no. That cannot happen. Changing the size of the slab is only
> possible if no slab pages are allocated. Any sysfs changes that affect the
> object size must fail if objects and slab pages are already allocated.
> 
> If you were able to change the object size then we need to prevent that
> from happening.
> 

Yes, the size of a slab can't be changed after objects are allocated; that
holds true even with this change. Let me explain the use case here a bit
more.

ZRAM compression uses the slub allocator. Enabling slub debug flags like
SLAB_STORE_USER increases memory consumption, which defeats the purpose of
ZRAM compression, so such flags have to be disabled before any allocations
happen. Disabling them requires updating the random sequence cache, because
the object size and the number of objects per slab change once the flags
are cleared.

So the sequence will be (step 2 is sketched after the list):

1. Slab creation (this precomputes the random sequence cache)
2. Remove the debug flags via sysfs
3. Update the random sequence cache
4. Mount zram and then start using it for allocations.
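Concretely, step 2 might look like this userspace sketch (hypothetical:
the cache name "zspage" is an assumption that depends on the zsmalloc
setup, and each write fails with EBUSY if objects are already allocated):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Clear one slub debug attribute under /sys/kernel/slab/. */
    static int slab_attr_clear(const char *attr)
    {
        char path[128];
        ssize_t n;
        int fd;

        snprintf(path, sizeof(path), "/sys/kernel/slab/zspage/%s", attr);
        fd = open(path, O_WRONLY);
        if (fd < 0)
            return -1;
        n = write(fd, "0", 1);
        close(fd);
        return n == 1 ? 0 : -1;
    }

    int main(void)
    {
        /* Step 2: drop the debug flags before any zram allocations. */
        slab_attr_clear("store_user");
        slab_attr_clear("poison");
        slab_attr_clear("red_zone");
        return 0;
    }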

Thanks,
Vijay




* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-01-30 18:28 ` Christopher Lameter
  2020-01-31  6:39   ` Vijayanand Jitta
@ 2020-02-03  6:57   ` Vijayanand Jitta
  2020-02-20  5:12     ` vjitta
  1 sibling, 1 reply; 9+ messages in thread
From: Vijayanand Jitta @ 2020-02-03  6:57 UTC
  To: Christopher Lameter
  Cc: penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel,
	vinmenon, kernel-team



On 1/30/2020 11:58 PM, Christopher Lameter wrote:
> On Thu, 30 Jan 2020, vjitta@codeaurora.org wrote:
> 
>> The random sequence cache is precomputed at slab creation, based on the
>> object size and the number of objects per slab. Both can change when flags
>> like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
>> is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
> 
> Sorry, no. That cannot happen. Changing the size of the slab is only
> possible if no slab pages are allocated. Any sysfs changes that affect the
> object size must fail if objects and slab pages are already allocated.
> 
> If you were able to change the object size then we need to prevent that
> from happening.
> 

Yes, the size of a slab can't be changed after objects are allocated; that
holds true even with this change. Let me explain the use case here a bit
more.

ZRAM compression uses the slub allocator. Enabling slub debug flags like
SLAB_STORE_USER increases memory consumption, which defeats the purpose of
ZRAM compression, so such flags have to be disabled before any allocations
happen. Disabling them requires updating the random sequence cache, because
the object size and the number of objects per slab change once the flags
are cleared.

So the sequence will be:

1. Slab creation (this precomputes the random sequence cache)
2. Remove the debug flags via sysfs
3. Update the random sequence cache
4. Mount zram and then start using it for allocations.

Thanks,
Vijay


* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-02-03  6:57   ` Vijayanand Jitta
@ 2020-02-20  5:12     ` vjitta
  0 siblings, 0 replies; 9+ messages in thread
From: vjitta @ 2020-02-20  5:12 UTC
  To: Christopher Lameter
  Cc: penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel,
	vinmenon, kernel-team, linux-kernel-owner

On 2020-02-03 12:27, Vijayanand Jitta wrote:
> On 1/30/2020 11:58 PM, Christopher Lameter wrote:
>> On Thu, 30 Jan 2020, vjitta@codeaurora.org wrote:
>> 
>>> The random sequence cache is precomputed at slab creation, based on the
>>> object size and the number of objects per slab. Both can change when flags
>>> like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
>>> is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
>> 
>> Sorry, no. That cannot happen. Changing the size of the slab is only
>> possible if no slab pages are allocated. Any sysfs changes that affect
>> the object size must fail if objects and slab pages are already
>> allocated.
>> 
>> If you were able to change the object size then we need to prevent
>> that from happening.
>> 
> 
> Yes, the size of a slab can't be changed after objects are allocated;
> that holds true even with this change. Let me explain the use case here
> a bit more.
> 
> ZRAM compression uses the slub allocator. Enabling slub debug flags like
> SLAB_STORE_USER increases memory consumption, which defeats the purpose
> of ZRAM compression, so such flags have to be disabled before any
> allocations happen. Disabling them requires updating the random sequence
> cache, because the object size and the number of objects per slab change
> once the flags are cleared.
> 
> So the sequence will be:
> 
> 1. Slab creation (this precomputes the random sequence cache)
> 2. Remove the debug flags via sysfs
> 3. Update the random sequence cache
> 4. Mount zram and then start using it for allocations.
> 
> Thanks,
> Vijay

Waiting for your response.

Thanks,
Vijay


* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-01-30 11:17 ` vjitta
@ 2020-02-27 16:53   ` Vlastimil Babka
  2020-03-05  5:48     ` vjitta
  0 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2020-02-27 16:53 UTC
  To: vjitta, cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm
  Cc: linux-kernel, vinmenon, kernel-team

On 1/30/20 12:17 PM, vjitta@codeaurora.org wrote:
> From: Vijayanand Jitta <vjitta@codeaurora.org>
> 
> The random sequence cache is precomputed at slab creation, based on the
> object size and the number of objects per slab. Both can change when flags
> like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
> is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
> uses the updated object count to index the stale precomputed random
> sequence cache. The resulting out-of-range accesses can corrupt the slab.
> Fix this by reinitializing the random sequence cache whenever the slab
> object layout is updated.
> 
> A sample panic trace when a write to the store_user sysfs attribute was
> attempted:

A more complete oops report would have been better, e.g. so that anyone
googling it would find this patch.

Also I was checking where else calculate_sizes() is called and found
order_store(). So if somebody changes (especially increases) the order,
shouldn't the reinitialization also be done?
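The order_store() path in question, with the same fix applied, would look
roughly like this (a sketch that follows the patch's pattern; the
surrounding code is paraphrased from mm/slub.c of that era):

    static ssize_t order_store(struct kmem_cache *s,
                   const char *buf, size_t length)
    {
        unsigned int order;
        int err;

        err = kstrtouint(buf, 10, &order);
        if (err)
            return err;

        if (order > slub_max_order || order < slub_min_order)
            return -EINVAL;

        calculate_sizes(s, order);
        reinit_cache_random_seq(s);    /* proposed addition */
        return length;
    }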

This is even more nasty as it doesn't seem to require that no objects exist.
Also there is no synchronization against concurrent allocations/frees? Gasp.


* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-02-27 16:53   ` Vlastimil Babka
@ 2020-03-05  5:48     ` vjitta
  2020-03-05 12:40       ` Vlastimil Babka
  0 siblings, 1 reply; 9+ messages in thread
From: vjitta @ 2020-03-05  5:48 UTC
  To: Vlastimil Babka
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm,
	linux-kernel, vinmenon, kernel-team

On 2020-02-27 22:23, Vlastimil Babka wrote:
> On 1/30/20 12:17 PM, vjitta@codeaurora.org wrote:
>> From: Vijayanand Jitta <vjitta@codeaurora.org>
>> 
>> The random sequence cache is precomputed at slab creation, based on the
>> object size and the number of objects per slab. Both can change when flags
>> like SLAB_STORE_USER or SLAB_POISON are updated from sysfs, but the cache
>> is not rebuilt. So when shuffle_freelist() is called during slab_alloc() it
>> uses the updated object count to index the stale precomputed random
>> sequence cache. The resulting out-of-range accesses can corrupt the slab.
>> Fix this by reinitializing the random sequence cache whenever the slab
>> object layout is updated.
>> 
>> A sample panic trace when a write to the store_user sysfs attribute was
>> attempted:
> 
> A more complete oops report would have been better, e.g. so that anyone
> googling it would find this patch.
> 
> Also I was checking where else calculate_sizes() is called and found
> order_store(). So if somebody changes (especially increases) the order,
> shouldn't the reinitialization also be done?

Yes, reinitialization must be done here as well; I will update the patch.

> 
> This is even more nasty as it doesn't seem to require that no objects
> exist. Also there is no synchronization against concurrent
> allocations/frees? Gasp.

Since the random sequence cache is only used to update the freelist in
shuffle_freelist(), which runs only when a new slab is created, I think
both cases are covered. If object allocations are satisfied without
creating a new slab, they use the existing freelist, which is fine because
the object size doesn't change after order_store(); and if a new slab is
created, it gets the updated freelist. So in both cases I think it should
be fine. (The only consumer is sketched below, for reference.)
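For reference, the consumer described above, trimmed down (paraphrased
from mm/slub.c; setup_object() calls and other details elided):

    /* Trimmed sketch of shuffle_freelist(), the only reader of
     * s->random_seq; it runs from new_slab() only and falls back to a
     * linear freelist when no sequence exists. */
    static bool shuffle_freelist(struct kmem_cache *s, struct page *page)
    {
        unsigned long idx, pos, page_limit, freelist_count;
        void *start, *cur, *next;

        if (page->objects < 2 || !s->random_seq)
            return false;        /* keep the linear freelist */

        freelist_count = oo_objects(s->oo);    /* the current count */
        pos = get_random_int() % freelist_count;
        page_limit = page->objects * s->size;
        start = fixup_red_left(s, page_address(page));

        cur = next_freelist_entry(s, page, &pos, start, page_limit,
                      freelist_count);
        page->freelist = cur;
        for (idx = 1; idx < page->objects; idx++) {
            next = next_freelist_entry(s, page, &pos, start,
                           page_limit, freelist_count);
            /* Stale s->random_seq offsets corrupt memory here. */
            set_freepointer(s, cur, next);
            cur = next;
        }
        set_freepointer(s, cur, NULL);
        return true;
    }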


* Re: [PATCH] mm: slub: reinitialize random sequence cache on slab object update
  2020-03-05  5:48     ` vjitta
@ 2020-03-05 12:40       ` Vlastimil Babka
  0 siblings, 0 replies; 9+ messages in thread
From: Vlastimil Babka @ 2020-03-05 12:40 UTC
  To: vjitta
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm,
	linux-kernel, vinmenon, kernel-team, Jann Horn

On 3/5/20 6:48 AM, vjitta@codeaurora.org wrote:
> On 2020-02-27 22:23, Vlastimil Babka wrote:
>> 
>> This is even more nasty as it doesn't seem to require that no objects 
>> exist.
>> Also there is no synchronization against concurrent allocations/frees? 
>> Gasp.
> 
> Since the random sequence cache is only used to update the freelist in
> shuffle_freelist(), which runs only when a new slab is created, I think
> both cases are covered. If object allocations are satisfied without
> creating a new slab, they use the existing freelist, which is fine
> because the object size doesn't change after order_store(); and if a new
> slab is created, it gets the updated freelist. So in both cases I think
> it should be fine.

I have some doubts. With reinit_cache_random_seq() for SLUB, s->random_seq
will in turn:
cache_random_seq_destroy()
- point to an object that's been kfree'd
- point to NULL
init_cache_random_seq()
  cache_random_seq_create()
  - point to a freshly allocated, zeroed-out object
    freelist_randomize()
    - the object is gradually initialized
- the indices are gradually transformed to page offsets

At any point during this, a new slab can be allocated in parallel and
observe s->random_seq in shuffle_freelist(), and that is only OK if the
pointer is currently NULL.

Could it be fixed? In the reinit part you would need to
- atomically update a valid s->random_seq to another valid s->random_seq
(perhaps with NULL in between, which means some freelists won't be randomized)
- write barrier
- call calculate_sizes() with the updated flags / new order, and make sure
all the fields of s are updated in a safe order and with write barriers
(i.e. updating s->oo and s->flags would probably come last, but maybe that's
not all) so that anyone allocating a new slab always gets something valid
(maybe that path would also need new read barriers?)

No, I don't think it's worth the trouble.
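For illustration only, the "atomically update" step could be shaped like
this (a hypothetical sketch, not proposed code; the function name is made
up, and the comments spell out what it still does not solve):

    static int reinit_cache_random_seq_sketch(struct kmem_cache *s)
    {
        unsigned int *old = s->random_seq;

        /* Publish NULL first, with release ordering, so a racing
         * shuffle_freelist() that sees NULL simply skips randomization
         * for that slab. */
        smp_store_release(&s->random_seq, NULL);

        /* There is no grace period here: a reader that fetched the old
         * pointer just before the store above may still use it after
         * the kfree(). */
        kfree(old);

        /* Rebuild.  Note cache_random_seq_create() publishes
         * s->random_seq first and fills it in place afterwards --
         * exactly the half-built window described above.  And nothing
         * orders any of this against calculate_sizes() updating
         * s->size and s->oo, which is the harder half of the problem. */
        return init_cache_random_seq(s);
    }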


