linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: keescook@chromium.org, dhowells@redhat.com, hch@infradead.org,
	mbenes@suse.com, gregkh@linuxfoundation.org, ngupta@vflare.org,
	sergey.senozhatsky.work@gmail.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Tue, 6 Apr 2021 18:23:46 -0700	[thread overview]
Message-ID: <YG0JouWqrJPHbpqz@google.com> (raw)
In-Reply-To: <20210406002909.GY4332@42.do-not-panic.com>

On Tue, Apr 06, 2021 at 12:29:09AM +0000, Luis Chamberlain wrote:
> On Mon, Apr 05, 2021 at 12:58:05PM -0700, Minchan Kim wrote:
> > On Mon, Apr 05, 2021 at 07:00:23PM +0000, Luis Chamberlain wrote:
> > > On Mon, Apr 05, 2021 at 10:07:24AM -0700, Minchan Kim wrote:
> > > > On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> > > > > And come to think of it the last patch I had sent with a new
> > > > > DECLARE_RWSEM(zram_unload) also has this same issue making most
> > > > > sysfs attributes rather fragile.
> > > > 
> > > > Thanks for looking the way. I agree the single zram_index_rwlock is
> > > > not the right approach to fix it. However, I still hope we find more
> > > > generic solution to fix them at once since I see it's zram instance
> > > > racing problem.
> > > 
> > > They are 3 separate different problems. Related, but different.
> > 
> > What are 3 different problems? I am asking since I remember only two:
> > one for CPU multistate and the other one for sysfs during rmmod.
> 
> The third one is the race to use sysfs attributes and those routines
> then derefernece th egendisk private_data.

First of all, thanks for keeping discussion, Luis.

That was the one I thought race between sysfs and during rmmod.

> 
> > > If the idea then is to busy out rmmod if a sysfs attribute is being
> > > read, that could then mean rmmod can sometimes never complete. Hogging
> > > up / busying out sysfs attributes means the module cannto be removed.
> > 
> > It's true but is it a big problem? There are many cases that system
> > just return error if it's busy and rely on the admin. IMHO, rmmod should
> > be part of them.
> 
> It depends on existing userspace scripts which are used to test and
> expectations set. Consider existing tests, you would know better, and
> since you are the maintainer you decide.
> 
> I at least know for many other types of device drivers an rmmod is
> a sledge hammer.
> 
> You decide. I just thought it would be good to highlight the effect now
> rather than us considering it later.

To me, the rmmod faillure is not a big problem for zram since it's
common cases in the system with -EBUSY(Having said, I agree that's the
best if we could avoid the fail-and-retrial. IOW, -EBUSY should be
last resort unless we have nicer way.)

> 
> > > Which is why the *try_module_get()* I think is much more suitable, as
> > > it will always fails if we're already going down.
> > 
> > How does the try_module_get solved the problem?
> 
> The try stuff only resolves the deadlock. The bget() / bdput() resolves
> the race to access to the gendisk private_data.

That's the one I missed in this discussion. Now I am reading your [2/2]
in original patch. I thought it was just zram instance was destroyed
by sysfs race problem so you had seen the deadlock. I might miss the
point here, too. 

Hmm, we are discussing several problems all at once. I feel it's time
to jump v2 with your way in this point. You said three different
problems. As I asked, please write it down with more detail with
code sequence as we discussed other thread. If you mean a deadlock,
please write what specific locks was deadlock with it.
It would make discussion much easier. Let's discuss the issue
one by one in each thread.

> 
> > > > I see one of the problems is how I could make new zram object's
> > > > attribute group for zram knobs under /sys/block/zram0 since block
> > > > layer already made zram0 kobject via device_add_disk.
> > > 
> > > Right.. well the syfs attribute races uncovered here actually do
> > > apply to any block driver as well. And which is why I was aiming
> > > for something generic if possible.
> > 
> > It would be great but that's not the one we have atm so want to
> > proceed to fix anyway.
> 
> What is not the one we have atm? I *do* have a proposed generic solution
> for 2/3 issues we have been disussing:
> 
>  a) deadlock on sysfs access

This is the one I didn't understand.

>  b) gendisk private_data race

Yub.

> 
> But so far Greg does not see enough justification for a), so we can either
> show how wider this issue is (which I can do using coccinelle), or we
> just open code the try_module_get() / put on each driver that needs it
> for now. Either way it would resolve the issue.

I second if it's general problem for drivers, I agree it's worth to
addresss in the core unless the driver introduce the mess. I have
no idea here since I didn't understand the problem, yet.

> 
> As for b), given that I think even you had missed my attempt to
> generialzie the bdget/bdput solution for any attribute type (did you see
> my dev_type_get() and dev_type_put() proposed changes?), I don't think
> this problem is yet well defined in a generic way for us to rule it out.
> It is however easier to simply open code this per driver that needs it
> for now given that I don't think Greg is yet convinced the deadlock is
> yet a widespread issue. I however am pretty sure both races races *do*
> exist outside of zram in many places.

It would be good sign to propose general solution.

> 
> > > I am not sure if you missed the last hunks of the generic solution,
> > > but that would resolve the issue you noted. Here is the same approach
> > > but in a non-generic solution, specific to just one attribute so far
> > > and to zram:
> > 
> > So idea is refcount of the block_device's inode 
> 
> Yes that itself prevents races against the gendisk private_data from
> being invalid. Why because a bdget() would not be successful after
> del_gendisk():
> 
> del_gendisk() --> invalidate_partition() --> __invalidate_device() --> invalidate_inodes()
> 
> > and module_exit path
> > checks also the inode refcount to make rmmod failure?
> 
> They try_module_get() approach resolves the deadlock race, but it does
> so in a lazy way. I mean lazy in that then rmmod wins over sysfs knobs.
> So touching sysfs knobs won't make an rmmod fail. I think that's more
> typical expected behaviour. Why? Because I find it odd that looping
> forever touching sysfs attributes should prevent a module removal. But
> that's a personal preference.

I agree with you that would be better but let's see how it could go
clean.

FYI, please look at hot_remove_store which also can remove zram
instance on demand. I am looking forward to seeing your v2.

Thanks for your patience, Luis.

  reply	other threads:[~2021-04-07  1:23 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-06  2:20 [PATCH 0/2] zram: fix few ltp zram02.sh crashes Luis Chamberlain
2021-03-06  2:20 ` [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate Luis Chamberlain
2021-03-09  2:55   ` Minchan Kim
2021-03-10 13:11     ` Luis Chamberlain
2021-03-10 21:25       ` Luis Chamberlain
2021-03-12  2:08       ` Minchan Kim
2021-03-10 21:21     ` Luis Chamberlain
2021-03-12  2:14       ` Minchan Kim
2021-03-12 18:32         ` Luis Chamberlain
2021-03-12 19:28           ` Minchan Kim
2021-03-19 19:09             ` Luis Chamberlain
2021-03-22 16:37               ` Minchan Kim
2021-03-22 20:41                 ` Luis Chamberlain
2021-03-22 22:12                   ` Minchan Kim
2021-04-01 23:59                     ` Luis Chamberlain
2021-04-02  7:54                       ` Greg KH
2021-04-02 18:30                         ` Luis Chamberlain
2021-04-03  6:13                           ` Greg KH
     [not found]                             ` <20210406003152.GZ4332@42.do-not-panic.com>
2021-04-06 12:00                               ` Miroslav Benes
2021-04-06 15:54                                 ` Josh Poimboeuf
2021-04-07 14:09                                   ` Peter Zijlstra
2021-04-07 15:30                                     ` Josh Poimboeuf
2021-04-07 16:48                                       ` Peter Zijlstra
2021-04-07 20:17                         ` Josh Poimboeuf
2021-04-08  6:18                           ` Greg KH
2021-04-08 13:16                             ` Steven Rostedt
2021-04-08 13:37                             ` Josh Poimboeuf
2021-04-08  1:37                         ` Thomas Gleixner
2021-04-08  6:16                           ` Greg KH
2021-04-08  8:01                             ` Jiri Kosina
2021-04-08  8:09                               ` Greg KH
2021-04-08  8:35                                 ` Jiri Kosina
2021-04-08  8:55                                   ` Greg KH
2021-04-08 18:40                                     ` Luis Chamberlain
2021-04-09  3:01                                     ` Kees Cook
2021-04-05 17:07                       ` Minchan Kim
2021-04-05 19:00                         ` Luis Chamberlain
2021-04-05 19:58                           ` Minchan Kim
2021-04-06  0:29                             ` Luis Chamberlain
2021-04-07  1:23                               ` Minchan Kim [this message]
2021-04-07  1:38                                 ` Minchan Kim
2021-04-07 14:52                                   ` Luis Chamberlain
2021-04-07 14:50                                 ` Luis Chamberlain
2021-03-06  2:20 ` [PATCH 2/2] zram: fix races of sysfs attribute removal and usage Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YG0JouWqrJPHbpqz@google.com \
    --to=minchan@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=dhowells@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=keescook@chromium.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbenes@suse.com \
    --cc=mcgrof@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).