linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: ngupta@vflare.org, sergey.senozhatsky.work@gmail.com,
	axboe@kernel.dk, mbenes@suse.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Thu, 11 Mar 2021 18:08:35 -0800	[thread overview]
Message-ID: <YErNI4mv5sK613ed@google.com> (raw)
In-Reply-To: <20210310131115.GQ4332@42.do-not-panic.com>

On Wed, Mar 10, 2021 at 01:11:15PM +0000, Luis Chamberlain wrote:
> On Mon, Mar 08, 2021 at 06:55:30PM -0800, Minchan Kim wrote:
> > On Sat, Mar 06, 2021 at 02:20:34AM +0000, Luis Chamberlain wrote:
> > > The zram driver makes use of cpu hotplug multistate support,
> > > whereby it associates a zram compression stream per CPU. To
> > > support CPU hotplug multistate a callback enabled to allow
> > > the driver to do what it needs when a CPU hotplugs.
> > > 
> > > It is however currently possible to end up removing the
> > > zram driver callback prior to removing the zram compression
> > > streams per CPU. This would leave these compression streams
> > > hanging.
> > > 
> > > We need to fix ordering for driver load / removal, zram
> > > device additions, in light of the driver's use of cpu
> > > hotplug multistate. Since the zram driver exposes many
> > > sysfs attribute which can also muck with the comrpession
> > > streams this also means we can hit page faults today easily.
> > 
> > Hi Luis,
> > 
> > First of all, thanks for reporting the ancient bugs.
> > 
> > Looks like you found several bugs and I am trying to digest them
> > from your description to understand more clear. I need to ask
> > stupid questions, first.
> > 
> > If I understand correctly, bugs you found were related to module
> > unloading race while the zram are still working.
> > If so, couldn't we fix the bug like this(it's not a perfect
> > patch but just wanted to show close module unloading race)?
> > (I might miss other pieces here. Let me know. Thanks!)
> > 
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index a711a2e2a794..646ae9e0b710 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -1696,6 +1696,8 @@ static void zram_reset_device(struct zram *zram)
> >                 return;
> >         }
> > 
> > +       module_put(THIS_MODULE);
> > +
> >         comp = zram->comp;
> >         disksize = zram->disksize;
> >         zram->disksize = 0;
> > @@ -1744,13 +1746,19 @@ static ssize_t disksize_store(struct device *dev,
> >                 goto out_free_meta;
> >         }
> > 
> > +       if (!try_module_get(THIS_MODULE))
> > +               goto out_free_zcomp;
> > +
> >         zram->comp = comp;
> >         zram->disksize = disksize;
> > +
> >         set_capacity_and_notify(zram->disk, zram->disksize >> SECTOR_SHIFT);
> >         up_write(&zram->init_lock);
> > 
> >         return len;
> > 
> > +out_free_zcomp:
> > +       zcomp_destroy(comp);
> >  out_free_meta:
> >         zram_meta_free(zram, disksize);
> >  out_unlock:
> 
> This still allows for a crash on the driver by running the zram02.sh script twice.
> 
> Mar 09 14:52:19 kdevops-blktests-block-dev kernel: zram0: detected capacity change from 209715200 to 0
> Mar 09 14:52:19 kdevops-blktests-block-dev kernel: BUG: unable to handle page fault for address: ffffba7db7904008
> Mar 09 14:52:19 kdevops-blktests-block-dev kernel: #PF: supervisor read access in kernel mode
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: #PF: error_code(0x0000) - not-present page
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: PGD 100000067 P4D 100000067 PUD 100311067 PMD 14cd2f067 PTE 0
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Oops: 0000 [#1] SMP NOPTI
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CPU: 0 PID: 2137 Comm: zram02.sh Tainted: G            E     5.12.0-rc1-next-20210304+ #4
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RIP: 0010:zram_free_page+0x1b/0xf0 [zram]
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Code: 1f 44 00 00 48 89 c8 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 41 54 49 89 f4 55 89 f5 53 48 8b 17 48 c1 e5 04 48 89 f>
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RSP:0018:ffffba7d808f3d88 EFLAGS: 00010286
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RAX: 0000000000000000 RBX: ffff9eee5317d200 RCX: 0000000000000000
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RDX: ffffba7db7904000 RSI: 0000000000000000 RDI: ffff9eee5317d200
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RBP: 0000000000000000 R08: 00000008f78bb1d3 R09: 0000000000000000
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000000
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R13: ffff9eee53d4cb00 R14: ffff9eee5317d220 R15: ffff9eee70508b80
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: FS: 00007f4bb1482580(0000) GS:ffff9eef77c00000(0000) knlGS:0000000000000000
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CR2: ffffba7db7904008 CR3: 0000000107e9c000 CR4: 0000000000350ef0
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Call Trace:
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: zram_reset_device+0xe9/0x150 [zram]
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: reset_store+0x9a/0x100 [zram]
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: kernfs_fop_write_iter+0x124/0x1b0
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: new_sync_write+0x11c/0x1b0
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: vfs_write+0x1c2/0x260
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel:  ksys_write+0x5f/0xe0
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: do_syscall_64+0x33/0x80
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RIP: 0033:0x7f4bb13aaf33
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Code: 8b 15 61 ef 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00
> 00 00 85 c0 75 14 b8 01 00 0>
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RSP: 002b:00007ffce0090d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RAX: ffffffffffffffda RBX: 000055a17c4a14b0 RCX: 00007f4bb13aaf33
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RDX: 0000000000000002 RSI: 000055a17c4a14b0 RDI: 0000000000000001
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RBP: 0000000000000002 R08: 000055a17c48a1d0 R09: 0000000000000000
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R10: 000055a17c48a1d1 R11: 0000000000000246 R12: 0000000000000001
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R13: 0000000000000002 R14: 7fffffffffffffff R15: 0000000000000000
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Modules linked in: zram(E) zsmalloc(E) xfs(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) joydev(E) evdev(>
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: zram0: detected capacity change from 0 to 209715200
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CR2: ffffba7db7904008
> Mar 09 14:52:20 kdevops-blktests-block-dev kernel: ---[ end trace 534ee1d0b880dd90 ]---
> 
> I can try to modify it to include second patch first, as that is
> required. There are two separate bugs here. Or was your goalt to
> try to address both with only one patch?

I am trying to understand problem first.

  parent reply	other threads:[~2021-03-12  2:09 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-06  2:20 [PATCH 0/2] zram: fix few ltp zram02.sh crashes Luis Chamberlain
2021-03-06  2:20 ` [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate Luis Chamberlain
2021-03-09  2:55   ` Minchan Kim
2021-03-10 13:11     ` Luis Chamberlain
2021-03-10 21:25       ` Luis Chamberlain
2021-03-12  2:08       ` Minchan Kim [this message]
2021-03-10 21:21     ` Luis Chamberlain
2021-03-12  2:14       ` Minchan Kim
2021-03-12 18:32         ` Luis Chamberlain
2021-03-12 19:28           ` Minchan Kim
2021-03-19 19:09             ` Luis Chamberlain
2021-03-22 16:37               ` Minchan Kim
2021-03-22 20:41                 ` Luis Chamberlain
2021-03-22 22:12                   ` Minchan Kim
2021-04-01 23:59                     ` Luis Chamberlain
2021-04-02  7:54                       ` Greg KH
2021-04-02 18:30                         ` Luis Chamberlain
2021-04-03  6:13                           ` Greg KH
     [not found]                             ` <20210406003152.GZ4332@42.do-not-panic.com>
2021-04-06 12:00                               ` Miroslav Benes
2021-04-06 15:54                                 ` Josh Poimboeuf
2021-04-07 14:09                                   ` Peter Zijlstra
2021-04-07 15:30                                     ` Josh Poimboeuf
2021-04-07 16:48                                       ` Peter Zijlstra
2021-04-07 20:17                         ` Josh Poimboeuf
2021-04-08  6:18                           ` Greg KH
2021-04-08 13:16                             ` Steven Rostedt
2021-04-08 13:37                             ` Josh Poimboeuf
2021-04-08  1:37                         ` Thomas Gleixner
2021-04-08  6:16                           ` Greg KH
2021-04-08  8:01                             ` Jiri Kosina
2021-04-08  8:09                               ` Greg KH
2021-04-08  8:35                                 ` Jiri Kosina
2021-04-08  8:55                                   ` Greg KH
2021-04-08 18:40                                     ` Luis Chamberlain
2021-04-09  3:01                                     ` Kees Cook
2021-04-05 17:07                       ` Minchan Kim
2021-04-05 19:00                         ` Luis Chamberlain
2021-04-05 19:58                           ` Minchan Kim
2021-04-06  0:29                             ` Luis Chamberlain
2021-04-07  1:23                               ` Minchan Kim
2021-04-07  1:38                                 ` Minchan Kim
2021-04-07 14:52                                   ` Luis Chamberlain
2021-04-07 14:50                                 ` Luis Chamberlain
2021-03-06  2:20 ` [PATCH 2/2] zram: fix races of sysfs attribute removal and usage Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YErNI4mv5sK613ed@google.com \
    --to=minchan@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbenes@suse.com \
    --cc=mcgrof@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).