All of lore.kernel.org
 help / color / mirror / Atom feed
* A racy reading spot on n->free_objects in slab.c
@ 2021-04-13 22:06 Gong, Sishuai
  2021-04-14  7:32 ` Christoph Lameter
  0 siblings, 1 reply; 4+ messages in thread
From: Gong, Sishuai @ 2021-04-13 22:06 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka; +Cc: linux-mm

Hi,

We found a racy reading spot on shared variable n->free_objects in slab.c and it can be data-racing with several writers that update this variable. As shown below, in function cache_alloc_refill(), n->free_objects will be read without any protection. It could be possible that the read value immediately becomes out-of-date when another writer is changing it (e.g. free_block())

Currently, we haven’t found any explicit errors due to this data race but we noticed 1) most of the reading spots on n->free_objects in slab.c have been protected by locks and 2) the racy result can affect the control flow, thus we want to point out this reader.

------------------------------------------
Execution interleaving

Thread 1 (reader)						Thread 2 (writer)

cache_alloc_refill()						free_block()

shared = READ_ONCE(n->shared);
if (!n->free_objects && (!shared || !shared->avail))
								n->free_objects += nr_objects;
							// lock protected
goto direct_grow;
spin_lock(&n->list_lock);



Thanks,
Sishuai


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: A racy reading spot on n->free_objects in slab.c
  2021-04-13 22:06 A racy reading spot on n->free_objects in slab.c Gong, Sishuai
@ 2021-04-14  7:32 ` Christoph Lameter
  2021-04-14 11:06   ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Lameter @ 2021-04-14  7:32 UTC (permalink / raw)
  To: Gong, Sishuai; +Cc: penberg, rientjes, iamjoonsoo.kim, akpm, vbabka, linux-mm

On Tue, 13 Apr 2021, Gong, Sishuai wrote:

> We found a racy reading spot on shared variable n->free_objects in
> slab.c and it can be data-racing with several writers that update this
> variable. As shown below, in function cache_alloc_refill(),
> n->free_objects will be read without any protection. It could be
> possible that the read value immediately becomes out-of-date when
> another writer is changing it (e.g. free_block())

Ok that is fine. If we mistakenly fill up the per cpu cache with new
objects to the slab then so be it.

If we mistakenly take the lock and fail to get an object then we can still
reverse that decision and do the other thing.

Maybe we need to add a comment there?



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: A racy reading spot on n->free_objects in slab.c
  2021-04-14  7:32 ` Christoph Lameter
@ 2021-04-14 11:06   ` Vlastimil Babka
  2021-04-14 15:09     ` Gong, Sishuai
  0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2021-04-14 11:06 UTC (permalink / raw)
  To: Christoph Lameter, Gong, Sishuai
  Cc: penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm

On 4/14/21 9:32 AM, Christoph Lameter wrote:
> On Tue, 13 Apr 2021, Gong, Sishuai wrote:
> 
>> We found a racy reading spot on shared variable n->free_objects in

I'm assuming this was found with some research tool you're developing?
Did it also flag the line "shared = READ_ONCE(n->shared);" as that's basically
the same thing.

>> slab.c and it can be data-racing with several writers that update this
>> variable. As shown below, in function cache_alloc_refill(),
>> n->free_objects will be read without any protection. It could be
>> possible that the read value immediately becomes out-of-date when
>> another writer is changing it (e.g. free_block())
> 
> Ok that is fine. If we mistakenly fill up the per cpu cache with new
> objects to the slab then so be it.

> If we mistakenly take the lock and fail to get an object then we can still
> reverse that decision and do the other thing.

Agreed. It's common in the kernel to do optimistic reads outside of lock to
decide what to do and avoid locking at all in some cases. This may sacrifice
some precision of these decisions. but not correctness, as locks are taken later
for the critical parts.

> Maybe we need to add a comment there?

Or maybe some construct that makes no difference for the compiler, but does for
the tool? READ_ONCE() maybe?



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: A racy reading spot on n->free_objects in slab.c
  2021-04-14 11:06   ` Vlastimil Babka
@ 2021-04-14 15:09     ` Gong, Sishuai
  0 siblings, 0 replies; 4+ messages in thread
From: Gong, Sishuai @ 2021-04-14 15:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm


> On Apr 14, 2021, at 7:06 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> 
> On 4/14/21 9:32 AM, Christoph Lameter wrote:
>> On Tue, 13 Apr 2021, Gong, Sishuai wrote:
>> 
>>> We found a racy reading spot on shared variable n->free_objects in
> 
> I'm assuming this was found with some research tool you're developing?
> Did it also flag the line "shared = READ_ONCE(n->shared);" as that's basically
> the same thing.
Yes, the read of n->shared is also racy but we didn’t observe any data races on it. 

We saw cache_alloc_refill() could run in parallel with multi writers (free_block(), cache_grow_end() and even itself cache_alloc_refill()), but none of these writers would change n->shared.
> 
>>> slab.c and it can be data-racing with several writers that update this
>>> variable. As shown below, in function cache_alloc_refill(),
>>> n->free_objects will be read without any protection. It could be
>>> possible that the read value immediately becomes out-of-date when
>>> another writer is changing it (e.g. free_block())
>> 
>> Ok that is fine. If we mistakenly fill up the per cpu cache with new
>> objects to the slab then so be it.
> 
>> If we mistakenly take the lock and fail to get an object then we can still
>> reverse that decision and do the other thing.
> 
> Agreed. It's common in the kernel to do optimistic reads outside of lock to
> decide what to do and avoid locking at all in some cases. This may sacrifice
> some precision of these decisions. but not correctness, as locks are taken later
> for the critical parts.
> 
>> Maybe we need to add a comment there?
> 
> Or maybe some construct that makes no difference for the compiler, but does for
> the tool? READ_ONCE() maybe?


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-04-14 15:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 22:06 A racy reading spot on n->free_objects in slab.c Gong, Sishuai
2021-04-14  7:32 ` Christoph Lameter
2021-04-14 11:06   ` Vlastimil Babka
2021-04-14 15:09     ` Gong, Sishuai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.