linux-mm.kvack.org archive mirror
From: Waiman Long <longman@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 3/3] mm/slub: Fix potential deadlock problem in slab_attr_store()
Date: Wed, 12 Feb 2020 15:40:24 -0500
Message-ID: <fd1c1576-7524-ed1c-a886-852511d1f4cf@redhat.com>
In-Reply-To: <54380181-84d6-4611-fc5e-daed82b73743@redhat.com>

On 2/11/20 6:30 PM, Waiman Long wrote:
> On 2/10/20 6:10 PM, Andrew Morton wrote:
>> On Mon, 10 Feb 2020 17:14:31 -0500 Waiman Long <longman@redhat.com> wrote:
>>
>>>>> --- a/mm/slub.c
>>>>> +++ b/mm/slub.c
>>>>> @@ -5536,7 +5536,12 @@ static ssize_t slab_attr_store(struct kobject *kobj,
>>>>>  	if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
>>>>>  		struct kmem_cache *c;
>>>>>  
>>>>> -		mutex_lock(&slab_mutex);
>>>>> +		/*
>>>>> +		 * Timeout after 100ms
>>>>> +		 */
>>>>> +		if (mutex_timed_lock(&slab_mutex, 100) < 0)
>>>>> +			return -EBUSY;
>>>>> +
>>>> Oh dear.  Surely there's a better fix here.  Does slab really need to
>>>> hold slab_mutex while creating that sysfs file?  Why?
>>>>
>>>> If the issue is two threads trying to create the same sysfs file
>>>> (unlikely, given that both will need to have created the same cache)
>>>> then can we add a new mutex specifically for this purpose?
>>>>
>>>> Or something else.
>>>>
>>> Well, the current code iterates all the memory cgroups to set the same
>>> value in all of them. I believe the reason for holding the slab mutex is
>>> to make sure that memcg hierarchy is stable during this iteration
>>> process.
>> But that is unrelated to creation of the sysfs file?
>>
> OK, I will take a closer look at that.

During the creation of a sysfs file:

static int sysfs_slab_add(struct kmem_cache *s)
{
  :
        if (unmergeable) {
                /*
                 * Slabcache can never be merged so we can use the name proper.
                 * This is typically the case for debug situations. In that
                 * case we can catch duplicate names easily.
                 */
                sysfs_remove_link(&slab_kset->kobj, s->name);
                name = s->name;

The code is trying to remove the sysfs files of a cache with a
conflicting name, so it seems that kmem_cache_create() was called with a
name that had been used before. If a write to one of the sysfs files
being removed happens at the same time, a deadlock can occur.
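
To illustrate the back-off idea from the patch, here is a rough userspace
sketch, not kernel code. It assumes pthread_mutex_timedlock() as a stand-in
for the proposed mutex_timed_lock(); the names slab_mutex_sim,
timed_lock_ms(), attr_store_sim() and attr_store_thread() are made up for
this example. The writer gives up with -EBUSY instead of blocking forever
while another thread holds the lock:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for slab_mutex; purely illustrative. */
static pthread_mutex_t slab_mutex_sim = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical userspace analogue of the proposed mutex_timed_lock(). */
static int timed_lock_ms(pthread_mutex_t *m, long timeout_ms)
{
	struct timespec deadline;

	/* pthread_mutex_timedlock() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}
	/* Any failure (normally ETIMEDOUT) is reported as -EBUSY. */
	return pthread_mutex_timedlock(m, &deadline) ? -EBUSY : 0;
}

/* Models the sysfs write path, which needs the mutex to do its work. */
static int attr_store_sim(void)
{
	/* Time out after 100ms, as in the patch. */
	if (timed_lock_ms(&slab_mutex_sim, 100) < 0)
		return -EBUSY;	/* back off so the lock holder can make progress */

	/* ... the attribute would be propagated to the child caches here ... */

	pthread_mutex_unlock(&slab_mutex_sim);
	return 0;
}

/* Thread entry point so the writer can run while another thread holds the lock. */
static void *attr_store_thread(void *unused)
{
	(void)unused;
	return (void *)(intptr_t)attr_store_sim();
}
```

If one thread holds slab_mutex_sim while attr_store_thread() runs in
another, the writer should return -EBUSY after roughly 100ms rather than
hang. The cost is that the write can fail spuriously under contention.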

In this particular case, the kmem_cache_create() call comes from the
mlx5_core module.

        steering->fgs_cache = kmem_cache_create("mlx5_fs_fgs",
                                                sizeof(struct mlx5_flow_group),
                                                0, 0, NULL);

Perhaps the module was somehow unloaded and then loaded again.
Unfortunately, this lockdep error was seen only once, and it is hard to
work out a fix without an easy way to reproduce it.

So I will table this for now until there is a way to reproduce it.

Thanks,
Longman


