From: Minchan Kim <minchan@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: keescook@chromium.org, dhowells@redhat.com, hch@infradead.org,
	mbenes@suse.com, gregkh@linuxfoundation.org, ngupta@vflare.org,
	sergey.senozhatsky.work@gmail.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Mon, 5 Apr 2021 10:07:24 -0700	[thread overview]
Message-ID: <YGtDzH0dEfEngCij@google.com> (raw)
In-Reply-To: <20210401235925.GR4332@42.do-not-panic.com>

On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> On Mon, Mar 22, 2021 at 03:12:01PM -0700, Minchan Kim wrote:
> > On Mon, Mar 22, 2021 at 08:41:56PM +0000, Luis Chamberlain wrote:
> > > 
> > > I would not call it *every* sysfs knob, as not all of them deal with
> > > things related to CPU hotplug multistate, right? Note that using just
> > > try_module_get() alone (that is, the second patch only) does not fix
> > > the race I am describing above.
> > 
> > It wouldn't be a CPU hotplug multistate issue; I'd rather call it
> > a "zram instance race" bug.
> > What happens in this case?
> > 
> >         CPU 1                            CPU 2
> > 
> > destroy_devices
> > ..
> >                                  compact_store()
> >                                  zram = dev_to_zram(dev);
> > idr_for_each(zram_remove_cb
> >   zram_remove
> >   ..
> >   kfree(zram)
> >                                  down_read(&zram->init_lock);
> > 
> > 
> >         CPU 1                            CPU 2
> > 
> > hot_remove_store
> >                                  compact_store()
> >                                  zram = dev_to_zram(dev);
> >   zram_remove
> >     kfree(zram)
> >                                  down_read(&zram->init_lock);
> > 
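> > For reference, the reader side looks roughly like this (a sketch of
> > compact_store() from drivers/block/zram/zram_drv.c; note that
> > dev_to_zram() hands back a pointer which zram_remove() may already
> > have freed by the time we take init_lock):
> > 
> > static ssize_t compact_store(struct device *dev,
> > 		struct device_attribute *attr, const char *buf, size_t len)
> > {
> > 	struct zram *zram = dev_to_zram(dev);
> > 
> > 	/* If zram_remove() won the race, zram was already kfree()d. */
> > 	down_read(&zram->init_lock);
> > 	if (!init_done(zram)) {
> > 		up_read(&zram->init_lock);
> > 		return -EINVAL;
> > 	}
> > 
> > 	zs_compact(zram->mem_pool);
> > 	up_read(&zram->init_lock);
> > 
> > 	return len;
> > }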
> > So, to me, we need to close the race between zram instance
> > creation/removal and sysfs access rather than focusing on the CPU
> > hotplug issue.
> 
> Sure, that's a good point.
> 
> The issue I noted, the race which ends up in a deadlock, is only
> possible if a shared lock is used on removal and also on the sysfs
> knobs.
> 
> At first glance, the issue you describe above *seems* to be just the
> proper care driver developers must take with the structures they use.
> It is certainly yet another issue we need to address, and if we can
> generalize a solution, even better. I now recall I *think* I spotted
> that race a while ago and mentioned it to Kees and David Howells, but
> I did not have a solution for it yet. More on this below.
> 
> The issue you point out is real; however, you cannot disregard the
> possible CPU hotplug race either. It is a design consideration that
> the CPU hotplug multistate support warns about -- consider driver
> removal. I agree that perhaps solving this "zram instance race" can
> fix the hotplug race as well. If we can solve all 3 issues in one
> shot, even better. But let's evaluate that prospect...
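> 
> As a concrete reminder of the hotplug angle, zram's compression
> backend registers a CPU hotplug multistate callback pair, roughly
> like this (a sketch based on drivers/block/zram/zcomp.c and
> zram_drv.c, trimmed down):
> 
> /* Module init: register the multistate callbacks once. */
> ret = cpuhp_setup_state_multi(CPUHP_ZCOMP_PREPARE,
> 			      "block/zram/comp:prepare",
> 			      zcomp_cpu_up_prepare, zcomp_cpu_dead);
> 
> /* Per zram instance: hook this instance's node into that state. */
> ret = cpuhp_state_add_instance(CPUHP_ZCOMP_PREPARE, &comp->node);
> 
> /* Module exit: must not run while instances are still registered,
>  * which is exactly the ordering the multistate API warns about. */
> cpuhp_remove_multi_state(CPUHP_ZCOMP_PREPARE);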
> 
> > Maybe we could reuse zram_index_mutex, converting it to a
> > rw_semaphore. What do you think?
> 
> Although ideal, given it would knock 3 birds with 1 stone, it ends up
> actually making the sysfs attributes rather useless in light of the
> requirements for each of the races. Namely, the sysfs deadlock race
> *must* use a try-lock approach, just as in the try_module_get() case
> (a sketch of that pattern appears further below). It must use this
> approach so as to bail out immediately if our module is being removed,
> and so on our __exit path. By trying to repurpose zram_index_mutex we
> end up doing too much with it and making the sysfs attributes rather
> fragile for most uses:
> 
> Consider disksize_show(), which would have to become:
> 
> static ssize_t disksize_show(struct device *dev, struct device_attribute *attr, char *buf)
> {
> 	struct zram *zram = dev_to_zram(dev);
> +	u64 disksize;
> 
> +	down_read(&zram_index_rwlock);
> +	disksize = zram->disksize;
> +	up_read(&zram_index_rwlock);
> -	return scnprintf(buf, PAGE_SIZE, "%llu\n", zram->disksize);
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", disksize);
> }
> 
> What's wrong with this?
> 
> It can block during a write, yes, but there is a type of write which
> will make this crash after the read lock is acquired: when the instance
> is removed. What if we try down_read_trylock()?
> 
> static ssize_t disksize_show(struct device *dev, struct device_attribute *attr, char *buf)
> {
> 	struct zram *zram = dev_to_zram(dev);
> +	u64 disksize;
> 
> +	if (!down_read_trylock(&zram_index_rwlock))
> +		return -ENODEV;
> +
> +	disksize = zram->disksize;
> +	up_read(&zram_index_rwlock);
> -	return scnprintf(buf, PAGE_SIZE, "%llu\n", zram->disksize);
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", disksize);
> }
> 
> What's wrong with this?
> 
> If it got the lock, it should be OK, as it prevents writers from
> taking the lock for a bit. But then this just becomes pretty fragile:
> it will fail whenever a writer holds or is waiting for the lock,
> perhaps triggering quite a few test regressions.
> 
> And if we use down_write_trylock() we end up with the same fragile
> nature of failing the read with -ENODEV for any silly thing going on
> with the driver.
> 
> And come to think of it, the last patch I sent, with its new
> DECLARE_RWSEM(zram_unload), has this same issue, making most
> sysfs attributes rather fragile.
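> 
> For completeness, the try-lock shape I keep referring to is the
> try_module_get() guard, something like this (a sketch only;
> zram_do_store() is a placeholder, not real zram code):
> 
> static ssize_t some_store(struct device *dev,
> 		struct device_attribute *attr, const char *buf, size_t len)
> {
> 	ssize_t ret;
> 
> 	/* Bail out immediately if the module is on its way out. */
> 	if (!try_module_get(THIS_MODULE))
> 		return -ENODEV;
> 
> 	ret = zram_do_store(dev, buf, len);
> 
> 	module_put(THIS_MODULE);
> 	return ret;
> }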

Thanks for looking into it. I agree the single zram_index_rwlock is
not the right approach to fix it. However, I still hope we find a more
generic solution to fix them all at once, since I see this as a zram
instance racing problem.

An approach I am considering is to make struct zram embed a kobject
and then have the zram sysfs attributes auto-populated under that
kobject, so the zram/sysfs lifetime is tied to the kobject. With it,
the sysfs race problem I mentioned above should be gone. Furthermore,
zram_remove should fail if any zram object is still alive
(i.e., zram->kobj's refcount > 1), so module_exit will fail, too.
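
Roughly, the shape I have in mind is (a sketch only; the names and the
exact refcount check are placeholders, not a tested implementation):

struct zram {
	struct kobject kobj;	/* owns the zram/sysfs lifetime */
	struct rw_semaphore init_lock;
	...
};

static void zram_kobj_release(struct kobject *kobj)
{
	struct zram *zram = container_of(kobj, struct zram, kobj);

	/* Last reference is gone: no sysfs reader can reach us now. */
	kfree(zram);
}

static struct kobj_type zram_ktype = {
	.release	= zram_kobj_release,
	.sysfs_ops	= &kobj_sysfs_ops,
};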

One problem I see is how to create the new zram object's attribute
group for the zram knobs under /sys/block/zram0, since the block
layer has already created the zram0 kobject via device_add_disk().
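
For reference, today the knobs are just bolted onto the disk's kobject
after device_add_disk(), roughly (as in the current zram_add()):

	ret = sysfs_create_group(&disk_to_dev(zram->disk)->kobj,
				 &zram_disk_attr_group);

so an embedded kobject of our own would naturally end up as a child
directory under /sys/block/zram0 rather than owning the existing
attributes there; that is the part I have not solved yet.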

Thread overview: 44+ messages
2021-03-06  2:20 [PATCH 0/2] zram: fix few ltp zram02.sh crashes Luis Chamberlain
2021-03-06  2:20 ` [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate Luis Chamberlain
2021-03-09  2:55   ` Minchan Kim
2021-03-10 13:11     ` Luis Chamberlain
2021-03-10 21:25       ` Luis Chamberlain
2021-03-12  2:08       ` Minchan Kim
2021-03-10 21:21     ` Luis Chamberlain
2021-03-12  2:14       ` Minchan Kim
2021-03-12 18:32         ` Luis Chamberlain
2021-03-12 19:28           ` Minchan Kim
2021-03-19 19:09             ` Luis Chamberlain
2021-03-22 16:37               ` Minchan Kim
2021-03-22 20:41                 ` Luis Chamberlain
2021-03-22 22:12                   ` Minchan Kim
2021-04-01 23:59                     ` Luis Chamberlain
2021-04-02  7:54                       ` Greg KH
2021-04-02 18:30                         ` Luis Chamberlain
2021-04-03  6:13                           ` Greg KH
     [not found]                             ` <20210406003152.GZ4332@42.do-not-panic.com>
2021-04-06 12:00                               ` Miroslav Benes
2021-04-06 15:54                                 ` Josh Poimboeuf
2021-04-07 14:09                                   ` Peter Zijlstra
2021-04-07 15:30                                     ` Josh Poimboeuf
2021-04-07 16:48                                       ` Peter Zijlstra
2021-04-07 20:17                         ` Josh Poimboeuf
2021-04-08  6:18                           ` Greg KH
2021-04-08 13:16                             ` Steven Rostedt
2021-04-08 13:37                             ` Josh Poimboeuf
2021-04-08  1:37                         ` Thomas Gleixner
2021-04-08  6:16                           ` Greg KH
2021-04-08  8:01                             ` Jiri Kosina
2021-04-08  8:09                               ` Greg KH
2021-04-08  8:35                                 ` Jiri Kosina
2021-04-08  8:55                                   ` Greg KH
2021-04-08 18:40                                     ` Luis Chamberlain
2021-04-09  3:01                                     ` Kees Cook
2021-04-05 17:07                       ` Minchan Kim [this message]
2021-04-05 19:00                         ` Luis Chamberlain
2021-04-05 19:58                           ` Minchan Kim
2021-04-06  0:29                             ` Luis Chamberlain
2021-04-07  1:23                               ` Minchan Kim
2021-04-07  1:38                                 ` Minchan Kim
2021-04-07 14:52                                   ` Luis Chamberlain
2021-04-07 14:50                                 ` Luis Chamberlain
2021-03-06  2:20 ` [PATCH 2/2] zram: fix races of sysfs attribute removal and usage Luis Chamberlain
