linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Miroslav Benes <mbenes@suse.cz>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	tj@kernel.org, gregkh@linuxfoundation.org,
	akpm@linux-foundation.org, minchan@kernel.org, jeyu@kernel.org,
	shuah@kernel.org, bvanassche@acm.org, dan.j.williams@intel.com,
	joe@perches.com, tglx@linutronix.de, keescook@chromium.org,
	rostedt@goodmis.org, linux-spdx@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-block@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-kernel@vger.kernel.org, live-patching@vger.kernel.org
Subject: Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate
Date: Tue, 26 Oct 2021 10:48:18 +0200	[thread overview]
Message-ID: <YXfA0jfazCPDTEBw@alley> (raw)
In-Reply-To: <YW/q70dLyF+YudyF@T590>

On Wed 2021-10-20 18:09:51, Ming Lei wrote:
> On Wed, Oct 20, 2021 at 10:19:27AM +0200, Miroslav Benes wrote:
> > On Wed, 20 Oct 2021, Ming Lei wrote:
> > 
> > > On Wed, Oct 20, 2021 at 08:43:37AM +0200, Miroslav Benes wrote:
> > > > On Tue, 19 Oct 2021, Ming Lei wrote:
> > > > 
> > > > > On Tue, Oct 19, 2021 at 08:23:51AM +0200, Miroslav Benes wrote:
> > > > > > > > By you only addressing the deadlock as a requirement on approach a) you are
> > > > > > > > forgetting that there *may* already be present drivers which *do* implement
> > > > > > > > such patterns in the kernel. I worked on addressing the deadlock because
> > > > > > > > I was informed livepatching *did* have that issue as well and so very
> > > > > > > > likely a generic solution to the deadlock could be beneficial to other
> > > > > > > > random drivers.
> > > > > > > 
> > > > > > > In-tree zram doesn't have such deadlock, if livepatching has such AA deadlock,
> > > > > > > just fixed it, and seems it has been fixed by 3ec24776bfd0.
> > > > > > 
> > > > > > I would not call it a fix. It is a kind of ugly workaround because the 
> > > > > > generic infrastructure lacked (lacks) the proper support in my opinion. 
> > > > > > Luis is trying to fix that.
> > > > > 
> > > > > What is the proper support of the generic infrastructure? I am not
> > > > > familiar with livepatching's model(especially with module unload), you mean
> > > > > livepatching have to do the following way from sysfs:
> > > > > 
> > > > > 1) during module exit:
> > > > > 	
> > > > > 	mutex_lock(lp_lock);
> > > > > 	kobject_put(lp_kobj);
> > > > > 	mutex_unlock(lp_lock);
> > > > > 	
> > > > > 2) show()/store() method of attributes of lp_kobj
> > > > > 	
> > > > > 	mutex_lock(lp_lock)
> > > > > 	...
> > > > > 	mutex_unlock(lp_lock)
> > > > 
> > > > Yes, this was exactly the case. We then reworked it a lot (see 
> > > > 958ef1e39d24 ("livepatch: Simplify API by removing registration step"), so 
> > > > now the call sequence is different. kobject_put() is basically offloaded 
> > > > to a workqueue scheduled right from the store() method. Meaning that 
> > > > Luis's work would probably not help us currently, but on the other hand 
> > > > the issues with AA deadlock were one of the main drivers of the redesign 
> > > > (if I remember correctly). There were other reasons too as the changelog 
> > > > of the commit describes.
> > > > 
> > > > So, from my perspective, if there was a way to easily synchronize between 
> > > > a data cleanup from module_exit callback and sysfs/kernfs operations, it 
> > > > could spare people many headaches.
> > > 
> > > kobject_del() is supposed to do so, but you can't hold a shared lock
> > > which is required in show()/store() method. Once kobject_del() returns,
> > > no pending show()/store() any more.
> > > 
> > > The question is that why one shared lock is required for livepatching to
> > > delete the kobject. What are you protecting when you delete one kobject?
> > 
> > I think it boils down to the fact that we embed kobject statically to 
> > structures which livepatch uses to maintain data. That is discouraged 
> > generally, but all the attempts to implement it correctly were utter 
> > failures.
> 
> OK, then it isn't one common usage, in which kobject covers the release
> of the external object. What is the exact kobject in livepatching?

Below are more details about the livepatch code. I hope that it will
help you to see if zram has similar problems or not.

We have kobject in three structures: klp_func, klp_object, and
klp_patch, see include/linux/livepatch.h.

These structures have to be statically defined in the module sources
because they define what is livepatched, see
samples/livepatch/livepatch-sample.c

The kobject is used there to show information about the patch, patched
objects, and patched functions, in sysfs. And most importantly,
the sysfs interface can be used to disable the livepatch.

The problem with static structures is that the module must stay
in the memory as long as the sysfs interface exists. It can be
solved in module_exit() callback. It could wait until the sysfs
interface is destroyed.

kobject API does not support this scenario. The relase() callbacks
are called asynchronously. It expects that the structure is bundled
in a dynamically allocated structure.  As a result, the sysfs
interface can be removed even after the module removal.

The livepatching might create the dynamic structures by duplicating
the structures defined in the module statically. It might safe us
some headaches with kobject release. But it would also need an extra code
that would need to be maintained. The structure constrains strings
than need to be duplicated and later freed...


> But kobject_del() won't release the kobject, you shouldn't need the lock
> to delete kobject first. After the kobject is deleted, no any show() and
> store() any more, isn't such sync[1] you expected?

Livepatch code never called kobject_del() under a lock. It would cause
the obvious deadlock. The historic code only waited in the
module_exit() callback until the sysfs interface was removed.

It has changed in the commit 958ef1e39d24d6cb8bf2a740 ("livepatch:
Simplify API by removing registration step"). The livepatch could
never get enabled again after it was disabled now. The sysfs interface
is removed when the livepatch gets disabled. The module could
be removed only after the sysfs interface is destroyed, see
the module_put() in klp_free_patch_finish().

The livepatch code uses workqueue because the livepatch can be
disabled via sysfs interface. It obviously could not wait until
the sysfs interface is removed in the sysfs write() callback
that triggered the removal.

HTH,
Petr

  reply	other threads:[~2021-10-26  8:48 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-27 16:37 [PATCH v8 00/12] syfs: generic deadlock fix with module removal Luis Chamberlain
2021-09-27 16:37 ` [PATCH v8 01/12] LICENSES: Add the copyleft-next-0.3.1 license Luis Chamberlain
     [not found]   ` <202110050907.35FBD2A1@keescook>
     [not found]     ` <YWR2ZrtzChamY1y4@bombadil.infradead.org>
2021-10-11 17:57       ` Kees Cook
2021-09-27 16:37 ` [PATCH v8 02/12] testing: use the copyleft-next-0.3.1 SPDX tag Luis Chamberlain
2021-10-05 16:11   ` Kees Cook
2021-09-27 16:37 ` [PATCH v8 03/12] selftests: add tests_sysfs module Luis Chamberlain
2021-10-05 14:16   ` Greg KH
2021-10-05 16:57     ` Tim.Bird
2021-10-11 17:40       ` Luis Chamberlain
2021-10-11 17:38     ` Luis Chamberlain
2021-10-07 14:23   ` Miroslav Benes
2021-10-11 19:11     ` Luis Chamberlain
     [not found]   ` <202110050912.3DF681ED@keescook>
2021-10-11 19:03     ` Luis Chamberlain
2021-09-27 16:37 ` [PATCH v8 04/12] kernfs: add initial failure injection support Luis Chamberlain
2021-10-05 19:47   ` Kees Cook
2021-10-11 20:44     ` Luis Chamberlain
2021-09-27 16:37 ` [PATCH v8 05/12] test_sysfs: add support to use kernfs failure injection Luis Chamberlain
2021-10-05 19:51   ` Kees Cook
2021-10-11 20:56     ` Luis Chamberlain
2021-09-27 16:37 ` [PATCH v8 06/12] kernel/module: add documentation for try_module_get() Luis Chamberlain
2021-10-05 19:58   ` Kees Cook
2021-10-11 21:16     ` Luis Chamberlain
2021-09-27 16:38 ` [PATCH v8 07/12] fs/kernfs/symlink.c: replace S_IRWXUGO with 0777 on kernfs_create_link() Luis Chamberlain
2021-10-05 19:59   ` Kees Cook
2021-09-27 16:38 ` [PATCH v8 08/12] fs/sysfs/dir.c: replace S_IRWXU|S_IRUGO|S_IXUGO with 0755 sysfs_create_dir_ns() Luis Chamberlain
2021-10-05 16:05   ` Kees Cook
2021-09-27 16:38 ` [PATCH v8 09/12] sysfs: fix deadlock race with module removal Luis Chamberlain
2021-10-05  9:24   ` Ming Lei
2021-10-11 21:25     ` Luis Chamberlain
2021-10-12  0:20       ` Ming Lei
2021-10-12 21:18         ` Luis Chamberlain
2021-10-13  1:07           ` Ming Lei
2021-10-13 12:35             ` Luis Chamberlain
2021-10-13 15:04               ` Ming Lei
2021-10-13 21:16                 ` Luis Chamberlain
2021-10-05 20:50   ` Kees Cook
2021-10-11 22:26     ` Luis Chamberlain
2021-10-13 12:41       ` Luis Chamberlain
2021-09-27 16:38 ` [PATCH v8 10/12] test_sysfs: enable deadlock tests by default Luis Chamberlain
2021-09-27 16:38 ` [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate Luis Chamberlain
2021-10-05 20:55   ` Kees Cook
2021-10-11 18:27     ` Luis Chamberlain
2021-10-14  1:55   ` Ming Lei
2021-10-14  2:11     ` Ming Lei
2021-10-14 20:24       ` Luis Chamberlain
2021-10-14 23:52         ` Ming Lei
2021-10-15  0:22           ` Luis Chamberlain
2021-10-15  8:36             ` Ming Lei
2021-10-15  8:52               ` Greg KH
2021-10-15 17:31               ` Luis Chamberlain
2021-10-16 11:28                 ` Ming Lei
2021-10-18 19:32                   ` Luis Chamberlain
2021-10-19  2:34                     ` Ming Lei
2021-10-19  6:23                       ` Miroslav Benes
2021-10-19  9:23                         ` Ming Lei
2021-10-20  6:43                           ` Miroslav Benes
2021-10-20  7:49                             ` Ming Lei
2021-10-20  8:19                               ` Miroslav Benes
2021-10-20  8:28                                 ` Greg KH
2021-10-25  9:58                                   ` Miroslav Benes
2021-10-20 10:09                                 ` Ming Lei
2021-10-26  8:48                                   ` Petr Mladek [this message]
2021-10-26 15:37                                     ` Ming Lei
2021-10-26 17:01                                       ` Luis Chamberlain
2021-10-27 11:57                                         ` Miroslav Benes
2021-10-27 14:27                                           ` Luis Chamberlain
2021-11-02 15:24                                           ` Petr Mladek
2021-11-02 16:25                                             ` Luis Chamberlain
2021-11-03  0:01                                               ` Ming Lei
2021-11-03 12:44                                                 ` Luis Chamberlain
2021-10-27 11:42                                       ` Miroslav Benes
2021-11-02 14:15                                       ` Petr Mladek
2021-11-02 14:51                                         ` Petr Mladek
2021-11-02 15:17                                           ` Ming Lei
2021-11-02 14:56                                         ` Ming Lei
2021-10-19 15:28                       ` Luis Chamberlain
2021-10-19 16:29                         ` Ming Lei
2021-10-19 19:36                           ` Luis Chamberlain
2021-10-20  1:15                             ` Ming Lei
2021-10-20 15:48                               ` Luis Chamberlain
2021-10-21  0:39                                 ` Ming Lei
2021-10-21 17:18                                   ` Luis Chamberlain
2021-10-22  0:05                                     ` Ming Lei
2021-10-19 15:50                       ` Luis Chamberlain
2021-10-19 16:25                         ` Greg KH
2021-10-19 16:30                           ` Luis Chamberlain
2021-10-19 17:28                             ` Greg KH
2021-10-19 19:46                               ` Luis Chamberlain
2021-10-19 16:39                         ` Ming Lei
2021-10-19 19:38                           ` Luis Chamberlain
2021-10-20  0:55                             ` Ming Lei
2021-09-27 16:38 ` [PATCH v8 12/12] zram: use ATTRIBUTE_GROUPS to fix sysfs deadlock module removal Luis Chamberlain
2021-10-05 20:57   ` Kees Cook
2021-10-11 18:28     ` Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YXfA0jfazCPDTEBw@alley \
    --to=pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bvanassche@acm.org \
    --cc=dan.j.williams@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jeyu@kernel.org \
    --cc=joe@perches.com \
    --cc=keescook@chromium.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-spdx@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mbenes@suse.cz \
    --cc=mcgrof@kernel.org \
    --cc=minchan@kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=paulus@samba.org \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).