All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: keescook@chromium.org, dhowells@redhat.com, hch@infradead.org,
	mbenes@suse.com, gregkh@linuxfoundation.org, ngupta@vflare.org,
	sergey.senozhatsky.work@gmail.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Tue, 6 Apr 2021 18:23:46 -0700	[thread overview]
Message-ID: <YG0JouWqrJPHbpqz@google.com> (raw)
In-Reply-To: <20210406002909.GY4332@42.do-not-panic.com>

On Tue, Apr 06, 2021 at 12:29:09AM +0000, Luis Chamberlain wrote:
> On Mon, Apr 05, 2021 at 12:58:05PM -0700, Minchan Kim wrote:
> > On Mon, Apr 05, 2021 at 07:00:23PM +0000, Luis Chamberlain wrote:
> > > On Mon, Apr 05, 2021 at 10:07:24AM -0700, Minchan Kim wrote:
> > > > On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> > > > > And come to think of it the last patch I had sent with a new
> > > > > DECLARE_RWSEM(zram_unload) also has this same issue making most
> > > > > sysfs attributes rather fragile.
> > > > 
> > > > Thanks for looking the way. I agree the single zram_index_rwlock is
> > > > not the right approach to fix it. However, I still hope we find more
> > > > generic solution to fix them at once since I see it's zram instance
> > > > racing problem.
> > > 
> > > They are 3 separate different problems. Related, but different.
> > 
> > What are 3 different problems? I am asking since I remember only two:
> > one for CPU multistate and the other one for sysfs during rmmod.
> 
> The third one is the race to use sysfs attributes and those routines
> then derefernece th egendisk private_data.

First of all, thanks for keeping discussion, Luis.

That was the one I thought race between sysfs and during rmmod.

> 
> > > If the idea then is to busy out rmmod if a sysfs attribute is being
> > > read, that could then mean rmmod can sometimes never complete. Hogging
> > > up / busying out sysfs attributes means the module cannto be removed.
> > 
> > It's true but is it a big problem? There are many cases that system
> > just return error if it's busy and rely on the admin. IMHO, rmmod should
> > be part of them.
> 
> It depends on existing userspace scripts which are used to test and
> expectations set. Consider existing tests, you would know better, and
> since you are the maintainer you decide.
> 
> I at least know for many other types of device drivers an rmmod is
> a sledge hammer.
> 
> You decide. I just thought it would be good to highlight the effect now
> rather than us considering it later.

To me, the rmmod faillure is not a big problem for zram since it's
common cases in the system with -EBUSY(Having said, I agree that's the
best if we could avoid the fail-and-retrial. IOW, -EBUSY should be
last resort unless we have nicer way.)

> 
> > > Which is why the *try_module_get()* I think is much more suitable, as
> > > it will always fails if we're already going down.
> > 
> > How does the try_module_get solved the problem?
> 
> The try stuff only resolves the deadlock. The bget() / bdput() resolves
> the race to access to the gendisk private_data.

That's the one I missed in this discussion. Now I am reading your [2/2]
in original patch. I thought it was just zram instance was destroyed
by sysfs race problem so you had seen the deadlock. I might miss the
point here, too. 

Hmm, we are discussing several problems all at once. I feel it's time
to jump v2 with your way in this point. You said three different
problems. As I asked, please write it down with more detail with
code sequence as we discussed other thread. If you mean a deadlock,
please write what specific locks was deadlock with it.
It would make discussion much easier. Let's discuss the issue
one by one in each thread.

> 
> > > > I see one of the problems is how I could make new zram object's
> > > > attribute group for zram knobs under /sys/block/zram0 since block
> > > > layer already made zram0 kobject via device_add_disk.
> > > 
> > > Right.. well the syfs attribute races uncovered here actually do
> > > apply to any block driver as well. And which is why I was aiming
> > > for something generic if possible.
> > 
> > It would be great but that's not the one we have atm so want to
> > proceed to fix anyway.
> 
> What is not the one we have atm? I *do* have a proposed generic solution
> for 2/3 issues we have been disussing:
> 
>  a) deadlock on sysfs access

This is the one I didn't understand.

>  b) gendisk private_data race

Yub.

> 
> But so far Greg does not see enough justification for a), so we can either
> show how wider this issue is (which I can do using coccinelle), or we
> just open code the try_module_get() / put on each driver that needs it
> for now. Either way it would resolve the issue.

I second if it's general problem for drivers, I agree it's worth to
addresss in the core unless the driver introduce the mess. I have
no idea here since I didn't understand the problem, yet.

> 
> As for b), given that I think even you had missed my attempt to
> generialzie the bdget/bdput solution for any attribute type (did you see
> my dev_type_get() and dev_type_put() proposed changes?), I don't think
> this problem is yet well defined in a generic way for us to rule it out.
> It is however easier to simply open code this per driver that needs it
> for now given that I don't think Greg is yet convinced the deadlock is
> yet a widespread issue. I however am pretty sure both races races *do*
> exist outside of zram in many places.

It would be good sign to propose general solution.

> 
> > > I am not sure if you missed the last hunks of the generic solution,
> > > but that would resolve the issue you noted. Here is the same approach
> > > but in a non-generic solution, specific to just one attribute so far
> > > and to zram:
> > 
> > So idea is refcount of the block_device's inode 
> 
> Yes that itself prevents races against the gendisk private_data from
> being invalid. Why because a bdget() would not be successful after
> del_gendisk():
> 
> del_gendisk() --> invalidate_partition() --> __invalidate_device() --> invalidate_inodes()
> 
> > and module_exit path
> > checks also the inode refcount to make rmmod failure?
> 
> They try_module_get() approach resolves the deadlock race, but it does
> so in a lazy way. I mean lazy in that then rmmod wins over sysfs knobs.
> So touching sysfs knobs won't make an rmmod fail. I think that's more
> typical expected behaviour. Why? Because I find it odd that looping
> forever touching sysfs attributes should prevent a module removal. But
> that's a personal preference.

I agree with you that would be better but let's see how it could go
clean.

FYI, please look at hot_remove_store which also can remove zram
instance on demand. I am looking forward to seeing your v2.

Thanks for your patience, Luis.

  reply	other threads:[~2021-04-07  1:23 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-06  2:20 [PATCH 0/2] zram: fix few ltp zram02.sh crashes Luis Chamberlain
2021-03-06  2:20 ` [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate Luis Chamberlain
2021-03-09  2:55   ` Minchan Kim
2021-03-10 13:11     ` Luis Chamberlain
2021-03-10 21:25       ` Luis Chamberlain
2021-03-12  2:08       ` Minchan Kim
2021-03-10 21:21     ` Luis Chamberlain
2021-03-12  2:14       ` Minchan Kim
2021-03-12 18:32         ` Luis Chamberlain
2021-03-12 19:28           ` Minchan Kim
2021-03-19 19:09             ` Luis Chamberlain
2021-03-22 16:37               ` Minchan Kim
2021-03-22 20:41                 ` Luis Chamberlain
2021-03-22 22:12                   ` Minchan Kim
2021-04-01 23:59                     ` Luis Chamberlain
2021-04-02  7:54                       ` Greg KH
2021-04-02 18:30                         ` Luis Chamberlain
2021-04-03  6:13                           ` Greg KH
     [not found]                             ` <20210406003152.GZ4332@42.do-not-panic.com>
2021-04-06 12:00                               ` Miroslav Benes
2021-04-06 15:54                                 ` Josh Poimboeuf
2021-04-07 14:09                                   ` Peter Zijlstra
2021-04-07 15:30                                     ` Josh Poimboeuf
2021-04-07 16:48                                       ` Peter Zijlstra
2021-04-07 20:17                         ` Josh Poimboeuf
2021-04-08  6:18                           ` Greg KH
2021-04-08 13:16                             ` Steven Rostedt
2021-04-08 13:37                             ` Josh Poimboeuf
2021-04-08  1:37                         ` Thomas Gleixner
2021-04-08  6:16                           ` Greg KH
2021-04-08  8:01                             ` Jiri Kosina
2021-04-08  8:09                               ` Greg KH
2021-04-08  8:35                                 ` Jiri Kosina
2021-04-08  8:55                                   ` Greg KH
2021-04-08 18:40                                     ` Luis Chamberlain
2021-04-09  3:01                                     ` Kees Cook
2021-04-05 17:07                       ` Minchan Kim
2021-04-05 19:00                         ` Luis Chamberlain
2021-04-05 19:58                           ` Minchan Kim
2021-04-06  0:29                             ` Luis Chamberlain
2021-04-07  1:23                               ` Minchan Kim [this message]
2021-04-07  1:38                                 ` Minchan Kim
2021-04-07 14:52                                   ` Luis Chamberlain
2021-04-07 14:50                                 ` Luis Chamberlain
2021-03-06  2:20 ` [PATCH 2/2] zram: fix races of sysfs attribute removal and usage Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YG0JouWqrJPHbpqz@google.com \
    --to=minchan@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=dhowells@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=keescook@chromium.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbenes@suse.com \
    --cc=mcgrof@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.