From: Minchan Kim <minchan@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: keescook@chromium.org, dhowells@redhat.com, hch@infradead.org,
	mbenes@suse.com, gregkh@linuxfoundation.org, ngupta@vflare.org,
	sergey.senozhatsky.work@gmail.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Mon, 5 Apr 2021 10:07:24 -0700
Message-ID: <YGtDzH0dEfEngCij@google.com>
In-Reply-To: <20210401235925.GR4332@42.do-not-panic.com>

On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> On Mon, Mar 22, 2021 at 03:12:01PM -0700, Minchan Kim wrote:
> > On Mon, Mar 22, 2021 at 08:41:56PM +0000, Luis Chamberlain wrote:
> > > 
> > > I would not call it *every* sysfs knob, as not all of them deal with
> > > things related to CPU hotplug multistate, right? Note that using just
> > > try_module_get() alone (that is, the second patch only) does not fix
> > > the race I am describing above.
> > 
> > It wouldn't be a CPU hotplug multistate issue; I'd rather call it a
> > "zram instance race" bug.
> > What happens in this case?
> > 
> >         CPU 1                            CPU 2
> > 
> > destroy_devices
> > ..
> >                                  compact_store()
> >                                  zram = dev_to_zram(dev);
> > idr_for_each(zram_remove_cb
> >   zram_remove
> >   ..
> >   kfree(zram)
> >                                  down_read(&zram->init_lock);
> > 
> > 
> >         CPU 1                            CPU 2
> > 
> > hot_remove_store
> >                                  compact_store()
> >                                  zram = dev_to_zram(dev);
> >   zram_remove
> >     kfree(zram)
> >                                  down_read(&zram->init_lock);
> > 
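> > For reference, an abbreviated look at the two paths racing above, as
> > the driver stands today (error handling trimmed):
> > 
> > static ssize_t compact_store(struct device *dev,
> > 		struct device_attribute *attr, const char *buf, size_t len)
> > {
> > 	struct zram *zram = dev_to_zram(dev);
> > 
> > 	/* zram may already have been freed by the racing removal */
> > 	down_read(&zram->init_lock);
> > 	if (!init_done(zram)) {
> > 		up_read(&zram->init_lock);
> > 		return -EINVAL;
> > 	}
> > 
> > 	zs_compact(zram->mem_pool);
> > 	up_read(&zram->init_lock);
> > 
> > 	return len;
> > }
> > 
> > static void destroy_devices(void)
> > {
> > 	class_unregister(&zram_control_class);
> > 	/* zram_remove_cb() -> zram_remove(), which ends in kfree(zram) */
> > 	idr_for_each(&zram_index_idr, &zram_remove_cb, NULL);
> > 	idr_destroy(&zram_index_idr);
> > 	unregister_blkdev(zram_major, DEVICE_NAME);
> > 	cpuhp_remove_multi_state(CPUHP_ZRAM_PREPARE);
> > }
> > 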
> > So, for me, we need to close the race between zram instance
> > creation/removal and sysfs access, rather than focusing on the CPU
> > hotplug issue.
> 
> Sure, that's a good point.
> 
> The race I noted, which ends up in a deadlock, is only possible if a
> shared lock is used both on removal and in the sysfs knobs.
> 
> At first glance, the issue you describe above *seems* to be just the
> proper care driver developers must take with the structures they use.
> It is certainly yet another issue we need to address, and if we can
> generalize a solution, even better. I now recall I *think* I spotted
> that race a while ago and mentioned it to Kees and David Howells, but
> I didn't have a solution for it yet. More on this below.
> 
> The issue you point out is real; however, you cannot disregard the
> possible CPU hotplug race either, as it is a design consideration that
> the CPU hotplug multistate support warns about -- consider driver removal.
> I agree that solving this "zram instance race" can perhaps fix the
> hotplug race as well. If we can solve all 3 issues in one shot, even
> better. But let's evaluate that prospect...
> 
> > Maybe we could reuse zram_index_mutex, converting it to a
> > rw_semaphore. What do you think?
> 
> Although ideal, given it would knock 3 birds with 1 stone, it ends up
> actually making the sysfs attributes rather useless in light of the
> requirements for each of the races. Namely, the sysfs deadlock race
> *must* use a trylock approach, just as in the try_module_get() case.
> It must use this approach so as to immediately bail out if our module
> is being removed, i.e. on our __exit path. By trying to repurpose
> zram_index_mutex we end up doing too much with it and making the
> sysfs attributes rather fragile for most uses:
> 
> Consider disksize_show(), which would have to become:
> 
> static ssize_t disksize_show(struct device *dev, struct device_attribute *attr, char *buf)
> {
> 	struct zram *zram = dev_to_zram(dev);
> +	u64 disksize;
> 
> +	down_read(&zram_index_rwlock);
> +	disksize = zram->disksize;
> +	up_read(&zram_index_rwlock);
> -	return scnprintf(buf, PAGE_SIZE, "%llu\n", zram->disksize);
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", disksize);
> }
> 
> What's wrong with this?
> 
> It can block during a write, yes, but there is a type of write which
> will make this crash after the read lock is acquired: when the
> instance is removed. What if we try down_read_trylock()?
> 
> static ssize_t disksize_show(struct device *dev, struct device_attribute *attr, char *buf)
> {
> 	struct zram *zram = dev_to_zram(dev);
> +	u64 disksize;
> 
> +	if (!down_read_trylock(&zram_index_rwlock))
> +		return -ENODEV;
> +
> +	disksize = zram->disksize;
> +	up_read(&zram_index_rwlock);
> -	return scnprintf(buf, PAGE_SIZE, "%llu\n", zram->disksize);
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", disksize);
> }
> 
> What's wrong with this?
> 
> If it got the lock, it should be OK, as it prevents writes from taking
> the lock for a bit. But then this just becomes pretty fragile: the
> read will fail whenever a writer holds or is waiting for the lock,
> perhaps triggering quite a few test regressions.
> 
> And if we use down_write_trylock() instead, we end up with the same
> fragile nature of failing the read with -ENODEV for any silly thing
> going on with the driver.
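> 
> For completeness, a sketch of that down_write_trylock() variant, using
> the same hypothetical zram_index_rwlock as above:
> 
> static ssize_t disksize_show(struct device *dev, struct device_attribute *attr, char *buf)
> {
> 	struct zram *zram = dev_to_zram(dev);
> 	u64 disksize;
> 
> 	/* fails if *any* reader or writer currently holds the lock */
> 	if (!down_write_trylock(&zram_index_rwlock))
> 		return -ENODEV;
> 
> 	disksize = zram->disksize;
> 	up_write(&zram_index_rwlock);
> 
> 	return scnprintf(buf, PAGE_SIZE, "%llu\n", disksize);
> }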
> 
> And come to think of it, the last patch I sent, with a new
> DECLARE_RWSEM(zram_unload), also has this same issue, making most
> sysfs attributes rather fragile.

Thanks for looking into it. I agree the single zram_index_rwlock is
not the right approach to fix it. However, I still hope we find a more
generic solution to fix them all at once, since I see it as a zram
instance racing problem.

An approach I am considering is to make struct zram include a kobject
and then have the zram sysfs attributes auto-populated under that
kobject, so the zram/sysfs lifetime is tied to the kobject. With that,
the sysfs race problem I mentioned above should be gone. Furthermore,
zram_remove should fail if any zram object is still alive
(i.e., zram->kobject->refcount > 1), so module_exit will fail, too.
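
A minimal sketch of the idea (all names here are hypothetical; this is
not an actual patch):

struct zram {
	struct kobject kobj;		/* owns the zram/sysfs lifetime */
	struct rw_semaphore init_lock;
	u64 disksize;
	/* ... other fields ... */
};

static void zram_kobj_release(struct kobject *kobj)
{
	struct zram *zram = container_of(kobj, struct zram, kobj);

	/* runs only once the last sysfs user has dropped its reference */
	kfree(zram);
}

static struct kobj_type zram_ktype = {
	.release	= zram_kobj_release,
	.sysfs_ops	= &kobj_sysfs_ops,
};

static int zram_remove(struct zram *zram)
{
	/* fail removal while some sysfs file still holds a reference */
	if (kref_read(&zram->kobj.kref) > 1)
		return -EBUSY;

	kobject_put(&zram->kobj);
	return 0;
}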

One of the problems I see is how to create the new zram object's
attribute group for the zram knobs under /sys/block/zram0, since the
block layer has already created the zram0 kobject via device_add_disk().
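
For reference, this is how the knobs are wired up today: the attribute
group is handed to the block layer when the disk is created, so the
files hang off the disk's device kobject rather than anything zram
owns:

static const struct attribute_group zram_disk_attr_group = {
	.attrs = zram_disk_attrs,
};

static const struct attribute_group *zram_disk_attr_groups[] = {
	&zram_disk_attr_group,
	NULL,
};

	/* in zram_add(), once the disk is set up */
	device_add_disk(NULL, zram->disk, zram_disk_attr_groups);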

