linux-block.vger.kernel.org archive mirror
From: Luis Chamberlain <mcgrof@kernel.org>
To: Minchan Kim <minchan@kernel.org>
Cc: gregkh@linuxfoundation.org, ngupta@vflare.org,
	sergey.senozhatsky.work@gmail.com, axboe@kernel.dk,
	mbenes@suse.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Fri, 12 Mar 2021 18:32:38 +0000
Message-ID: <20210312183238.GW4332@42.do-not-panic.com>
In-Reply-To: <YErOkGrvtQODXtB0@google.com>

On Thu, Mar 11, 2021 at 06:14:40PM -0800, Minchan Kim wrote:
> On Wed, Mar 10, 2021 at 09:21:28PM +0000, Luis Chamberlain wrote:
> > On Mon, Mar 08, 2021 at 06:55:30PM -0800, Minchan Kim wrote:
> > > If I understand correctly, the bugs you found were related to a
> > > module unloading race while zram is still working.
> > 
> > No, that is a simplification of the issue. The issue consists of
> > two separate issues:
> > 
> >  a) race against module unloading in light of incorrect racy use of
> >     cpu hotplug multistate support
> 
> 
> Could you add some pseudo code sequence to show the race clearly?

Let us deal with each issue one at a time. First, let's make sure we
understand the kernel warning. It can be reproduced easily by
running zram02.sh from LTP twice:

kernel: ------------[ cut here ]------------
kernel: Error: Removing state 63 which has instances left.
kernel: WARNING: CPU: 7 PID: 70457 at kernel/cpu.c:2069 __cpuhp_remove_state_cpuslocked+0xf9/0x100
kernel: Modules linked in: zram(E-) zsmalloc(E) <etc>

The first patch prevents this race. This race is possible because on
module init we associate callbacks for CPU hotplug add / remove:

static int __init zram_init(void)
{
	...
	ret = cpuhp_setup_state_multi(CPUHP_ZCOMP_PREPARE, "block/zram:prepare",
				      zcomp_cpu_up_prepare, zcomp_cpu_dead);
	...
}
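
To make the warning above easier to picture, here is a rough userspace
model of a multi-instance state, not kernel code. Every model_* name is
made up for illustration and the real cpuhp core is far more involved.
It only shows the condition being warned about: the state is removed
while an instance is still attached to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: none of these names are kernel APIs. */
struct model_node {
	struct model_node *next;
};

struct model_state {
	int registered;			/* set by model_setup_state_multi() */
	struct model_node *head;	/* instances currently attached */
};

static void model_setup_state_multi(struct model_state *st)
{
	st->registered = 1;
	st->head = NULL;
}

static void model_add_instance(struct model_state *st, struct model_node *n)
{
	assert(st->registered);
	n->next = st->head;
	st->head = n;
}

/* Returns 0 on success, -1 if the node was not attached. */
static int model_remove_instance(struct model_state *st, struct model_node *n)
{
	struct model_node **pp;

	for (pp = &st->head; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return 0;
		}
	}
	return -1;
}

/*
 * Returns how many instances were still attached when the state was
 * removed. A non-zero result is the situation the warning is about.
 */
static int model_remove_state(struct model_state *st)
{
	struct model_node *n;
	int left = 0;

	for (n = st->head; n; n = n->next)
		left++;
	if (left)
		fprintf(stderr, "model: removing state which has %d instance(s) left\n",
			left);
	st->registered = 0;
	return left;
}
```

In this model a caller that attaches two instances, detaches only one
and then calls model_remove_state() gets 1 back, which is the analogue
of the module exit path running while a device still holds a zcomp.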

zcomp_cpu_dead() dereferences the struct zcomp it is handed
(comp->stream). If that zcomp has already been removed when this
function is called, clearly we'll be accessing some random data here
and can easily crash afterwards:

int zcomp_cpu_dead(unsigned int cpu, struct hlist_node *node)
{
	struct zcomp *comp = hlist_entry(node, struct zcomp, node);
	struct zcomp_strm *zstrm;

	zstrm = per_cpu_ptr(comp->stream, cpu);
	zcomp_strm_free(zstrm);
	return 0;
}
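
For anyone not used to the hlist_entry() idiom, here is a small
userspace sketch of the pointer arithmetic behind it. The model_* types
are hypothetical stand-ins and the field layout is only illustrative.
The point is that the callback receives nothing but the embedded node,
so the struct pointer it computes is only as valid as the memory the
node lives in.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace equivalent of the kernel's container_of(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_hlist_node {
	struct model_hlist_node *next;
};

/* Hypothetical stand-in for struct zcomp; the layout is illustrative. */
struct model_zcomp {
	void *stream;
	const char *name;
	struct model_hlist_node node;
};

/* Recover the enclosing structure from a pointer to its embedded node. */
static struct model_zcomp *model_node_to_zcomp(struct model_hlist_node *node)
{
	return model_container_of(node, struct model_zcomp, node);
}
```

If the enclosing structure is freed while its node is still reachable
by whoever invokes the callback, the result of this computation is a
dangling pointer.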

And zram's sysfs reset_store() lets userspace call zram_reset_device(),
which calls zcomp_destroy():

void zcomp_destroy(struct zcomp *comp)
{
	cpuhp_state_remove_instance(CPUHP_ZCOMP_PREPARE, &comp->node);
	free_percpu(comp->stream);
	kfree(comp);
}
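
The ordering in zcomp_destroy() can be sketched in userspace as well.
This is a simplified model under my own assumptions: the model_* names
are invented, the instance table is a plain array, and "CPU dead" is
just a loop. It shows that once an instance is detached, the teardown
callback no longer touches it, which is why detaching has to come
before the memory is released.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_INSTANCES 4

/* Hypothetical stand-in for a per-device compression backend. */
struct model_comp {
	int callbacks_seen;	/* how often the teardown callback ran */
};

/* Instances the model "hotplug core" will call back for. */
static struct model_comp *model_instances[MODEL_MAX_INSTANCES];

/* Returns 0 on success, -1 if the table is full. */
static int model_instance_add(struct model_comp *comp)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_INSTANCES; i++) {
		if (!model_instances[i]) {
			model_instances[i] = comp;
			return 0;
		}
	}
	return -1;
}

static void model_instance_remove(struct model_comp *comp)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_INSTANCES; i++) {
		if (model_instances[i] == comp)
			model_instances[i] = NULL;
	}
}

/*
 * Model of a CPU going offline: run the teardown callback for every
 * attached instance. Returns the number of callbacks that ran.
 */
static int model_cpu_dead(void)
{
	size_t i;
	int calls = 0;

	for (i = 0; i < MODEL_MAX_INSTANCES; i++) {
		if (model_instances[i]) {
			model_instances[i]->callbacks_seen++;
			calls++;
		}
	}
	return calls;
}

/* Same order as zcomp_destroy(): detach first, release afterwards. */
static void model_comp_destroy(struct model_comp *comp)
{
	model_instance_remove(comp);
	/* Only from here on would freeing comp be safe; the caller owns it. */
}
```

The model is single threaded, so it says nothing about the locking
needed to make the detach and the callback mutually exclusive. It only
illustrates the order of operations.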

> It would be great if it goes in the description, too since it's
> more clear to show the problem.

Does the above do it?

> 
> >  b) module unload race with sysfs attribute race on *any* driver which
> >     has sysfs attributes which also shares the same lock as used during
> >     module unload
> 
> Yup, that part I missed. Maybe, we need some wrapper to zram sysfs
> to get try_module_get in the wrapper function and then call sub
> routine only if it got the refcount.
> 
> zram_sysfs_wrapper(func, A, B)
>     if (!try_module_get(THIS_MODULE))
>         return -ENODEV;
>     ret = func(A,B);
>     module_put(THIS_MODULE);
>     return ret;

I'd much prefer this be resolved in kernfs later. If you look at the
kernel, there are already some drivers which may have realized this
requirement the hard way. I think open coding this makes the race /
intent clearer.
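
To illustrate the semantics the quoted wrapper relies on, here is a
userspace sketch. It is a single-threaded model and the names
(model_module, model_try_get and so on) are hypothetical. The real
try_module_get() / module_put() work on struct module with proper
atomics. The idea it shows is that a reference can only be taken while
the module is not going away, and unload cannot start while a reference
is held.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for module state; not struct module. */
struct model_module {
	int refcount;
	bool going;	/* set once unload has started */
};

static bool model_try_get(struct model_module *m)
{
	if (m->going)
		return false;
	m->refcount++;
	return true;
}

static void model_put(struct model_module *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

/* Unload may only begin when nobody holds a reference. */
static int model_begin_unload(struct model_module *m)
{
	if (m->refcount > 0)
		return -EBUSY;
	m->going = true;
	return 0;
}

typedef int (*model_attr_fn)(int a, int b);

/* Same shape as the quoted pseudo code: get, call, put. */
static int model_sysfs_wrapper(struct model_module *m, model_attr_fn fn,
			       int a, int b)
{
	int ret;

	if (!model_try_get(m))
		return -ENODEV;
	ret = fn(a, b);
	model_put(m);
	return ret;
}

/* Trivial attribute body used only to exercise the wrapper. */
static int model_add(int a, int b)
{
	return a + b;
}
```

With this model, a wrapped call made after model_begin_unload() has
succeeded returns -ENODEV without running the attribute body, and
model_begin_unload() returns -EBUSY while a reference is held.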

Right now we have no semantics in place for a generic solution, but I
can work on one later.

  Luis
