From: Dan Williams <dan.j.williams@intel.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: "Tejun Heo" <tj@kernel.org>,
"Greg KH" <gregkh@linuxfoundation.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Minchan Kim" <minchan@kernel.org>,
jeyu@kernel.org, shuah <shuah@kernel.org>,
"Randy Dunlap" <rdunlap@infradead.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Masahiro Yamada" <masahiroy@kernel.org>,
"Nick Desaulniers" <ndesaulniers@google.com>,
yzaikin@google.com, "Nathan Chancellor" <nathan@kernel.org>,
ojeda@kernel.org,
"Tetsuo Handa" <penguin-kernel@i-love.sakura.ne.jp>,
vitor@massaru.org, elver@google.com,
"Jarkko Sakkinen" <jarkko@kernel.org>,
"Alexander Potapenko" <glider@google.com>,
rf@opensource.cirrus.com,
"Stephen Hemminger" <stephen@networkplumber.org>,
"David Laight" <David.Laight@aculab.com>,
bvanassche@acm.org, jolsa@kernel.org,
"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
trishalfonso@google.com, andreyknvl@gmail.com,
"Jiri Kosina" <jikos@kernel.org>,
mbenes@suse.com, "Nitin Gupta" <ngupta@vflare.org>,
"Sergey Senozhatsky" <sergey.senozhatsky.work@gmail.com>,
"Reinette Chatre" <reinette.chatre@intel.com>,
"Fenghua Yu" <fenghua.yu@intel.com>,
"Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
"H. Peter Anvin" <hpa@zytor.com>,
lizefan.x@bytedance.com, "Johannes Weiner" <hannes@cmpxchg.org>,
"Daniel Vetter" <daniel.vetter@ffwll.ch>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Krzysztof Wilczyński" <kw@linux.com>,
senozhatsky@chromium.org, "Christoph Hellwig" <hch@lst.de>,
"Joe Perches" <joe@perches.com>,
hkallweit1@gmail.com, "Jens Axboe" <axboe@kernel.dk>,
"Josh Poimboeuf" <jpoimboe@redhat.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Kees Cook" <keescook@chromium.org>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Peter Zijlstra" <peterz@infradead.org>,
linux-spdx@vger.kernel.org,
"Linux Doc Mailing List" <linux-doc@vger.kernel.org>,
linux-block@vger.kernel.org,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-kselftest@vger.kernel.org, cgroups@vger.kernel.org,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
copyleft-next@lists.fedorahosted.org
Subject: Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal
Date: Mon, 20 Sep 2021 14:55:10 -0700 [thread overview]
Message-ID: <CAPcyv4jFaNTfpKq6hQsFrWXTwHz-wG+5K4m-hmsPeteLX4p5AQ@mail.gmail.com> (raw)
In-Reply-To: <YUj6TrGbqlNI0OGC@bombadil.infradead.org>
On Mon, Sep 20, 2021 at 2:17 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Mon, Sep 20, 2021 at 01:52:21PM -0700, Dan Williams wrote:
> > On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > > This deadlock was first reported with the zram driver, however the live
> > > patching folks have acknowledged they have observed this as well with
> > > live patching, when a live patch is removed. I was then able to
> > > reproduce easily by creating a dedicated selftests.
> > >
> > > A sketch of how this can happen follows:
> > >
> > > CPU A CPU B
> > > whatever_store()
> > > module_unload
> > > mutex_lock(foo)
> > > mutex_lock(foo)
> > > del_gendisk(zram->disk);
> > > device_del()
> > > device_remove_groups()
> >
> > This flow seems possible to trigger with:
> >
> > echo $dev > /sys/bus/$bus/drivers/$driver/unbind
> >
> > I am missing why module pinning
>
> The aspect of try_module_get() which comes to value to prevent the
> deadlock is it ensures kernfs ops do not run once exit is on the way.
>
> > is part of the solution when it's the
> > device_del() path that is racing?
>
> But its not, the device_del() path will yield until the kernfs op
> completes. It is fine to wait there.
>
> The deadlock happens if a module exit routine uses a lock which is
> also used on a sysfs op. If the lock was first held by module exit,
> and module exit is waiting for the kernfs op to complete, and the
> kernfs op is waiting to hold the same lock then the exit will wait
> forever.
>
> > Module removal is just a more coarse
> > grained way to trigger unbind => device_del().
>
> Right, but the device_del() path is not sharing a lock with the sysfs op.
The deadlock in the example comes from holding a lock over
device_del() that is also taken in a kernfs op. For example, the code
above looks like something that runs from driver.remove(), not
exclusively module_exit(). Yes, module_exit() may trigger
driver.remove() via driver_unregister(), but there's other ways to
trigger driver.remove() that do not involve module_exit().
> > Isn't the above a bug
> > in the driver, not missing synchronization in kernfs?
>
> We can certainly take the position as an alternative:
>
> "thou shalt not use a lock on exit which is also used on a syfs op"
>
> However that seems counter intuitive, specially if we can resolve the
> issue easily with a try_module_get().
Again, I don't see how try_module_get() can affect the ABBA failure
case of holding a lock over device_del() that is also held inside
sysfs op. I agree that the problem is subtle. Does lockdep not
complain about this case? If it's going to be avoided in the core it
seems try_module_get() does not completely cover the hole that
unsuspecting driver writers might fall into.
next prev parent reply other threads:[~2021-09-21 2:05 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20210918050430.3671227-1-mcgrof@kernel.org>
2021-09-18 5:04 ` [PATCH v7 01/12] LICENSES: Add the copyleft-next-0.3.1 license Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 02/12] testing: use the copyleft-next-0.3.1 SPDX tag Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 03/12] selftests: add tests_sysfs module Luis Chamberlain
2021-09-20 21:38 ` Bart Van Assche
2021-09-18 5:04 ` [PATCH v7 04/12] kernfs: add initial failure injection support Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 05/12] test_sysfs: add support to use kernfs failure injection Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 06/12] kernel/module: add documentation for try_module_get() Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 07/12] fs/kernfs/symlink.c: replace S_IRWXUGO with 0777 on kernfs_create_link() Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 08/12] fs/sysfs/dir.c: replace S_IRWXU|S_IRUGO|S_IXUGO with 0755 sysfs_create_dir_ns() Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 09/12] sysfs: fix deadlock race with module removal Luis Chamberlain
2021-09-20 17:53 ` Tejun Heo
2021-09-20 19:15 ` Luis Chamberlain
2021-09-20 19:22 ` Tejun Heo
2021-09-20 19:38 ` Luis Chamberlain
2021-09-20 20:52 ` Dan Williams
2021-09-20 21:17 ` Luis Chamberlain
2021-09-20 21:55 ` Dan Williams [this message]
2021-09-21 0:03 ` Luis Chamberlain
2021-09-20 21:36 ` Bart Van Assche
2021-09-20 21:43 ` Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 10/12] test_sysfs: enable deadlock tests by default Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 11/12] zram: fix crashes with cpu hotplug multistate Luis Chamberlain
2021-09-18 5:04 ` [PATCH v7 12/12] zram: use ATTRIBUTE_GROUPS to fix sysfs deadlock module removal Luis Chamberlain
2021-09-20 17:55 ` [PATCH v7 00/12 (RESEND)] syfs: generic deadlock fix with " Tejun Heo
2021-09-27 15:11 ` Luis Chamberlain
[not found] <20210917194709.3562413-1-mcgrof@kernel.org>
[not found] ` <20210917194709.3562413-10-mcgrof@kernel.org>
[not found] ` <c70dcb03e27e43c5b5311e184357df39@AcuMS.aculab.com>
2021-09-21 15:48 ` [PATCH v7 09/12] sysfs: fix deadlock race " Luis Chamberlain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAPcyv4jFaNTfpKq6hQsFrWXTwHz-wG+5K4m-hmsPeteLX4p5AQ@mail.gmail.com \
--to=dan.j.williams@intel.com \
--cc=David.Laight@aculab.com \
--cc=akpm@linux-foundation.org \
--cc=andreyknvl@gmail.com \
--cc=andriy.shevchenko@linux.intel.com \
--cc=axboe@kernel.dk \
--cc=bhelgaas@google.com \
--cc=bp@alien8.de \
--cc=bvanassche@acm.org \
--cc=cgroups@vger.kernel.org \
--cc=copyleft-next@lists.fedorahosted.org \
--cc=daniel.vetter@ffwll.ch \
--cc=elver@google.com \
--cc=fenghua.yu@intel.com \
--cc=glider@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=hannes@cmpxchg.org \
--cc=hch@lst.de \
--cc=hkallweit1@gmail.com \
--cc=hpa@zytor.com \
--cc=jarkko@kernel.org \
--cc=jeyu@kernel.org \
--cc=jikos@kernel.org \
--cc=joe@perches.com \
--cc=jolsa@kernel.org \
--cc=jpoimboe@redhat.com \
--cc=keescook@chromium.org \
--cc=kw@linux.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-spdx@vger.kernel.org \
--cc=lizefan.x@bytedance.com \
--cc=masahiroy@kernel.org \
--cc=mbenes@suse.com \
--cc=mcgrof@kernel.org \
--cc=minchan@kernel.org \
--cc=nathan@kernel.org \
--cc=ndesaulniers@google.com \
--cc=ngupta@vflare.org \
--cc=ojeda@kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=rdunlap@infradead.org \
--cc=reinette.chatre@intel.com \
--cc=rf@opensource.cirrus.com \
--cc=rostedt@goodmis.org \
--cc=senozhatsky@chromium.org \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=shuah@kernel.org \
--cc=stephen@networkplumber.org \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=trishalfonso@google.com \
--cc=vitor@massaru.org \
--cc=x86@kernel.org \
--cc=yzaikin@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).