All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Christoph Lameter <cl@gentwo.de>,
	Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: David Rientjes <rientjes@google.com>,
	songmuchun@bytedance.com, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	akpm@linux-foundation.org, roman.gushchin@linux.dev,
	iamjoonsoo.kim@lge.com, penberg@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] mm/slub: fix the race between validate_slab and slab_free
Date: Fri, 17 Jun 2022 11:40:51 +0200	[thread overview]
Message-ID: <faf416b9-f46c-8534-7fb7-557c046a564d@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2206081417370.465021@gentwo.de>

On 6/8/22 14:23, Christoph Lameter wrote:
> On Wed, 8 Jun 2022, Rongwei Wang wrote:
> 
>> If available, I think document the issue and warn this incorrect behavior is
>> OK. But it still prints a large amount of confusing messages, and disturbs us?
> 
> Correct it would be great if you could fix this in a way that does not
> impact performance.
> 
>> > are current operations on the slab being validated.
>> And I am trying to fix it in following way. In a short, these changes only
>> works under the slub debug mode, and not affects the normal mode (I'm not
>> sure). It looks not elegant enough. And if all approve of this way, I can
>> submit the next version.
> 
> 
>>
>> Anyway, thanks for your time:).
>> -wrw
>>
>> @@ -3304,7 +3300,7 @@ static void __slab_free(struct kmem_cache *s,
> struct
>> slab *slab,
>>
>>  {
>>         void *prior;
>> -       int was_frozen;
>> +       int was_frozen, to_take_off = 0;
>>         struct slab new;
> 
> to_take_off has the role of !n ? Why is that needed?
> 
>> -       do {
>> -               if (unlikely(n)) {
>> +               spin_lock_irqsave(&n->list_lock, flags);
>> +               ret = free_debug_processing(s, slab, head, tail, cnt, addr);
> 
> Ok so the idea is to take the lock only if kmem_cache_debug. That looks
> ok. But it still adds a number of new branches etc to the free loop.

It also further complicates the already tricky code. I wonder if we should
make more benefit from the fact that for kmem_cache_debug() caches we don't
leave any slabs on percpu or percpu partial lists, and also in
free_debug_processing() we aready take both list_lock and slab_lock. If we
just did the freeing immediately there under those locks, we would be
protected against other freeing cpus by that list_lock and don't need the
double cmpxchg tricks.

What about against allocating cpus? More tricky as those will currently end
up privatizing the freelist via get_partial(), only to deactivate it again,
so our list_lock+slab_lock in freeing path would not protect in the
meanwhile. But the allocation is currently very inefficient for debug
caches, as in get_partial() it will take the list_lock to take the slab from
partial list and then in most cases again in deactivate_slab() to return it.

If instead the allocation path for kmem_cache_debug() cache would take a
single object from the partial list (not whole freelist) under list_lock, it
would be ultimately more efficient, and protect against freeing using
list_lock. Sounds like an idea worth trying to me?
And of course we would stop creating the 'validate' sysfs files for
non-debug caches.

  parent reply	other threads:[~2022-06-17  9:40 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-29  8:15 [PATCH 1/3] mm/slub: fix the race between validate_slab and slab_free Rongwei Wang
2022-05-29  8:15 ` [PATCH 2/3] mm/slub: improve consistency of nr_slabs count Rongwei Wang
2022-05-29 12:26   ` Hyeonggon Yoo
2022-05-29  8:15 ` [PATCH 3/3] mm/slub: add nr_full count for debugging slub Rongwei Wang
2022-05-29 11:37 ` [PATCH 1/3] mm/slub: fix the race between validate_slab and slab_free Hyeonggon Yoo
2022-05-30 21:14   ` David Rientjes
2022-06-02 15:14     ` Christoph Lameter
2022-06-03  3:35       ` Rongwei Wang
2022-06-07 12:14         ` Christoph Lameter
2022-06-08  3:04           ` Rongwei Wang
2022-06-08 12:23             ` Christoph Lameter
2022-06-11  4:04               ` Rongwei Wang
2022-06-13 13:50                 ` Christoph Lameter
2022-06-14  2:38                   ` Rongwei Wang
2022-06-17  7:55                   ` Rongwei Wang
2022-06-17 14:19                     ` Christoph Lameter
2022-06-18  2:33                       ` Rongwei Wang
2022-06-20 11:57                         ` Christoph Lameter
2022-06-26 16:48                           ` Rongwei Wang
2022-06-17  9:40               ` Vlastimil Babka [this message]
2022-07-15  8:05                 ` Rongwei Wang
2022-07-15 10:33                   ` Vlastimil Babka
2022-07-15 10:51                     ` Rongwei Wang
2022-05-31  3:47   ` Muchun Song
2022-06-04 11:05     ` Hyeonggon Yoo
2022-05-31  8:50   ` Rongwei Wang
2022-07-18 11:09 ` Vlastimil Babka
2022-07-19 14:15   ` Rongwei Wang
2022-07-19 14:21     ` Vlastimil Babka
2022-07-19 14:43       ` Rongwei Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=faf416b9-f46c-8534-7fb7-557c046a564d@suse.cz \
    --to=vbabka@suse.cz \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@gentwo.de \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rongwei.wang@linux.alibaba.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.