linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: "Tejun Heo" <tj@kernel.org>,
	"Greg KH" <gregkh@linuxfoundation.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Minchan Kim" <minchan@kernel.org>,
	jeyu@kernel.org, shuah <shuah@kernel.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	"Masahiro Yamada" <masahiroy@kernel.org>,
	"Nick Desaulniers" <ndesaulniers@google.com>,
	yzaikin@google.com, "Nathan Chancellor" <nathan@kernel.org>,
	ojeda@kernel.org,
	"Tetsuo Handa" <penguin-kernel@i-love.sakura.ne.jp>,
	vitor@massaru.org, elver@google.com,
	"Jarkko Sakkinen" <jarkko@kernel.org>,
	"Alexander Potapenko" <glider@google.com>,
	rf@opensource.cirrus.com,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	"David Laight" <David.Laight@aculab.com>,
	bvanassche@acm.org, jolsa@kernel.org,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	trishalfonso@google.com, andreyknvl@gmail.com,
	"Jiri Kosina" <jikos@kernel.org>,
	mbenes@suse.com, "Nitin Gupta" <ngupta@vflare.org>,
	"Sergey Senozhatsky" <sergey.senozhatsky.work@gmail.com>,
	"Reinette Chatre" <reinette.chatre@intel.com>,
	"Fenghua Yu" <fenghua.yu@intel.com>,
	"Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	lizefan.x@bytedance.com, "Johannes Weiner" <hannes@cmpxchg.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	senozhatsky@chromium.org, "Christoph Hellwig" <hch@lst.de>,
	"Joe Perches" <joe@perches.com>,
	hkallweit1@gmail.com, "Jens Axboe" <axboe@kernel.dk>,
	"Josh Poimboeuf" <jpoimboe@redhat.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Kees Cook" <keescook@chromium.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	linux-spdx@vger.kernel.org,
	"Linux Doc Mailing List" <linux-doc@vger.kernel.org>,
	linux-block@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kselftest@vger.kernel.org, cgroups@vger.kernel.org,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	copyleft-next@lists.fedorahosted.org
Subject: Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal
Date: Mon, 20 Sep 2021 13:52:21 -0700	[thread overview]
Message-ID: <CAPcyv4i0xEwMQ5kSK-xGroV7aZr3j1YNrGMVLiLMr3U8nFCMKA@mail.gmail.com> (raw)
In-Reply-To: <20210918050430.3671227-10-mcgrof@kernel.org>

On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> When sysfs attributes use a lock also used on module removal we can
> race to deadlock. This happens when for instance a sysfs file on
> a driver is used, then at the same time we have module removal call
> trigger. The module removal call code holds a lock, and then the sysfs
> file entry waits for the same lock. While holding the lock the module
> removal tries to remove the sysfs entries, but these cannot be removed
> yet as one is waiting for a lock. This won't complete as the lock is
> already held. Likewise module removal cannot complete, and so we deadlock.
>
> This can now be easily reproducible with our sysfs selftest as follows:
>
> ./tools/testing/selftests/sysfs/sysfs.sh -t 0027
>
> To fix this we extend the struct kernfs_node with a module reference and
> use the try_module_get() after kernfs_get_active() is called which
> protects integrity and the existence of the kernfs node during the
> operation.
>
> So long as the kernfs node is protected with kernfs_get_active() we know
> we can rely on its contents. And, as now just documented in the previous
> patch, we also now know that once kernfs_get_active() is called the module
> is also guarded to exist and cannot be removed.
>
> If try_module_get() fails we fail the operation on the kernfs node.
>
> We use a try method as a full lock means we'd then make our sysfs
> attributes busy us out from possible module removal, and so userspace
> could force denying module removal, a silly form of "DOS" against module
> removal. A try lock on the module removal ensures we give priority to
> module removal and interacting with sysfs attributes only comes second.
> Using a full lock could mean for instance that if you don't stop poking
> at sysfs files you cannot remove a module.
>
> Races between removal of sysfs files and the module are not possible
> given sysfs files are created by the same module, and when a sysfs file
> is being used kernfs prevents removal of the sysfs file. So if module
> removal is actually happening the removal would have to wait until
> the sysfs file operation is complete.
>
> This deadlock was first reported with the zram driver, however the live
> patching folks have acknowledged they have observed this as well with
> live patching, when a live patch is removed. I was then able to
> reproduce easily by creating a dedicated selftests.
>
> A sketch of how this can happen follows:
>
> CPU A                              CPU B
>                                    whatever_store()
> module_unload
>   mutex_lock(foo)
>                                    mutex_lock(foo)
>    del_gendisk(zram->disk);
>      device_del()
>        device_remove_groups()

This flow seems possible to trigger with:

   echo $dev > /sys/bus/$bus/drivers/$driver/unbind

I am missing why module pinning is part of the solution when it's the
device_del() path that is racing? Module removal is just a more coarse
grained way to trigger unbind => device_del(). Isn't the above a bug
in the driver, not missing synchronization in kernfs? Forgive me if
the unbind question was asked and answered elsewhere, this is my first
time taking a look at this series.

  parent reply	other threads:[~2021-09-21  1:43 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20210918050430.3671227-1-mcgrof@kernel.org>
2021-09-18  5:04 ` [PATCH v7 01/12] LICENSES: Add the copyleft-next-0.3.1 license Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 02/12] testing: use the copyleft-next-0.3.1 SPDX tag Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 03/12] selftests: add tests_sysfs module Luis Chamberlain
2021-09-20 21:38   ` Bart Van Assche
2021-09-18  5:04 ` [PATCH v7 04/12] kernfs: add initial failure injection support Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 05/12] test_sysfs: add support to use kernfs failure injection Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 06/12] kernel/module: add documentation for try_module_get() Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 07/12] fs/kernfs/symlink.c: replace S_IRWXUGO with 0777 on kernfs_create_link() Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 08/12] fs/sysfs/dir.c: replace S_IRWXU|S_IRUGO|S_IXUGO with 0755 sysfs_create_dir_ns() Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 09/12] sysfs: fix deadlock race with module removal Luis Chamberlain
2021-09-20 17:53   ` Tejun Heo
2021-09-20 19:15     ` Luis Chamberlain
2021-09-20 19:22       ` Tejun Heo
2021-09-20 19:38         ` Luis Chamberlain
2021-09-20 20:52   ` Dan Williams [this message]
2021-09-20 21:17     ` Luis Chamberlain
2021-09-20 21:55       ` Dan Williams
2021-09-21  0:03         ` Luis Chamberlain
2021-09-20 21:36   ` Bart Van Assche
2021-09-20 21:43     ` Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 10/12] test_sysfs: enable deadlock tests by default Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 11/12] zram: fix crashes with cpu hotplug multistate Luis Chamberlain
2021-09-18  5:04 ` [PATCH v7 12/12] zram: use ATTRIBUTE_GROUPS to fix sysfs deadlock module removal Luis Chamberlain
2021-09-20 17:55 ` [PATCH v7 00/12 (RESEND)] syfs: generic deadlock fix with " Tejun Heo
2021-09-27 15:11   ` Luis Chamberlain
     [not found] <20210917194709.3562413-1-mcgrof@kernel.org>
     [not found] ` <20210917194709.3562413-10-mcgrof@kernel.org>
     [not found]   ` <c70dcb03e27e43c5b5311e184357df39@AcuMS.aculab.com>
2021-09-21 15:48     ` [PATCH v7 09/12] sysfs: fix deadlock race " Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4i0xEwMQ5kSK-xGroV7aZr3j1YNrGMVLiLMr3U8nFCMKA@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=David.Laight@aculab.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=axboe@kernel.dk \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=bvanassche@acm.org \
    --cc=cgroups@vger.kernel.org \
    --cc=copyleft-next@lists.fedorahosted.org \
    --cc=daniel.vetter@ffwll.ch \
    --cc=elver@google.com \
    --cc=fenghua.yu@intel.com \
    --cc=glider@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hch@lst.de \
    --cc=hkallweit1@gmail.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=jeyu@kernel.org \
    --cc=jikos@kernel.org \
    --cc=joe@perches.com \
    --cc=jolsa@kernel.org \
    --cc=jpoimboe@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kw@linux.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-spdx@vger.kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=masahiroy@kernel.org \
    --cc=mbenes@suse.com \
    --cc=mcgrof@kernel.org \
    --cc=minchan@kernel.org \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=ngupta@vflare.org \
    --cc=ojeda@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=rdunlap@infradead.org \
    --cc=reinette.chatre@intel.com \
    --cc=rf@opensource.cirrus.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=shuah@kernel.org \
    --cc=stephen@networkplumber.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=trishalfonso@google.com \
    --cc=vitor@massaru.org \
    --cc=x86@kernel.org \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).