All of lore.kernel.org
 help / color / mirror / Atom feed
From: Luis Chamberlain <mcgrof@kernel.org>
To: Minchan Kim <minchan@kernel.org>
Cc: ngupta@vflare.org, sergey.senozhatsky.work@gmail.com,
	axboe@kernel.dk, mbenes@suse.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
Date: Wed, 10 Mar 2021 13:11:15 +0000	[thread overview]
Message-ID: <20210310131115.GQ4332@42.do-not-panic.com> (raw)
In-Reply-To: <YEbjom8FIclEgRYv@google.com>

On Mon, Mar 08, 2021 at 06:55:30PM -0800, Minchan Kim wrote:
> On Sat, Mar 06, 2021 at 02:20:34AM +0000, Luis Chamberlain wrote:
> > The zram driver makes use of cpu hotplug multistate support,
> > whereby it associates a zram compression stream per CPU. To
> > support CPU hotplug multistate a callback enabled to allow
> > the driver to do what it needs when a CPU hotplugs.
> > 
> > It is however currently possible to end up removing the
> > zram driver callback prior to removing the zram compression
> > streams per CPU. This would leave these compression streams
> > hanging.
> > 
> > We need to fix ordering for driver load / removal, zram
> > device additions, in light of the driver's use of cpu
> > hotplug multistate. Since the zram driver exposes many
> > sysfs attribute which can also muck with the comrpession
> > streams this also means we can hit page faults today easily.
> 
> Hi Luis,
> 
> First of all, thanks for reporting the ancient bugs.
> 
> Looks like you found several bugs and I am trying to digest them
> from your description to understand more clear. I need to ask
> stupid questions, first.
> 
> If I understand correctly, bugs you found were related to module
> unloading race while the zram are still working.
> If so, couldn't we fix the bug like this(it's not a perfect
> patch but just wanted to show close module unloading race)?
> (I might miss other pieces here. Let me know. Thanks!)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index a711a2e2a794..646ae9e0b710 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1696,6 +1696,8 @@ static void zram_reset_device(struct zram *zram)
>                 return;
>         }
> 
> +       module_put(THIS_MODULE);
> +
>         comp = zram->comp;
>         disksize = zram->disksize;
>         zram->disksize = 0;
> @@ -1744,13 +1746,19 @@ static ssize_t disksize_store(struct device *dev,
>                 goto out_free_meta;
>         }
> 
> +       if (!try_module_get(THIS_MODULE))
> +               goto out_free_zcomp;
> +
>         zram->comp = comp;
>         zram->disksize = disksize;
> +
>         set_capacity_and_notify(zram->disk, zram->disksize >> SECTOR_SHIFT);
>         up_write(&zram->init_lock);
> 
>         return len;
> 
> +out_free_zcomp:
> +       zcomp_destroy(comp);
>  out_free_meta:
>         zram_meta_free(zram, disksize);
>  out_unlock:

This still allows for a crash on the driver by running the zram02.sh script twice.

Mar 09 14:52:19 kdevops-blktests-block-dev kernel: zram0: detected capacity change from 209715200 to 0
Mar 09 14:52:19 kdevops-blktests-block-dev kernel: BUG: unable to handle page fault for address: ffffba7db7904008
Mar 09 14:52:19 kdevops-blktests-block-dev kernel: #PF: supervisor read access in kernel mode
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: #PF: error_code(0x0000) - not-present page
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: PGD 100000067 P4D 100000067 PUD 100311067 PMD 14cd2f067 PTE 0
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Oops: 0000 [#1] SMP NOPTI
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CPU: 0 PID: 2137 Comm: zram02.sh Tainted: G            E     5.12.0-rc1-next-20210304+ #4
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RIP: 0010:zram_free_page+0x1b/0xf0 [zram]
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Code: 1f 44 00 00 48 89 c8 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 41 54 49 89 f4 55 89 f5 53 48 8b 17 48 c1 e5 04 48 89 f>
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RSP:0018:ffffba7d808f3d88 EFLAGS: 00010286
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RAX: 0000000000000000 RBX: ffff9eee5317d200 RCX: 0000000000000000
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RDX: ffffba7db7904000 RSI: 0000000000000000 RDI: ffff9eee5317d200
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RBP: 0000000000000000 R08: 00000008f78bb1d3 R09: 0000000000000000
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000000
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R13: ffff9eee53d4cb00 R14: ffff9eee5317d220 R15: ffff9eee70508b80
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: FS: 00007f4bb1482580(0000) GS:ffff9eef77c00000(0000) knlGS:0000000000000000
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CR2: ffffba7db7904008 CR3: 0000000107e9c000 CR4: 0000000000350ef0
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Call Trace:
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: zram_reset_device+0xe9/0x150 [zram]
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: reset_store+0x9a/0x100 [zram]
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: kernfs_fop_write_iter+0x124/0x1b0
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: new_sync_write+0x11c/0x1b0
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: vfs_write+0x1c2/0x260
Mar 09 14:52:20 kdevops-blktests-block-dev kernel:  ksys_write+0x5f/0xe0
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: do_syscall_64+0x33/0x80
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RIP: 0033:0x7f4bb13aaf33
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Code: 8b 15 61 ef 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00
00 00 85 c0 75 14 b8 01 00 0>
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RSP: 002b:00007ffce0090d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RAX: ffffffffffffffda RBX: 000055a17c4a14b0 RCX: 00007f4bb13aaf33
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RDX: 0000000000000002 RSI: 000055a17c4a14b0 RDI: 0000000000000001
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: RBP: 0000000000000002 R08: 000055a17c48a1d0 R09: 0000000000000000
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R10: 000055a17c48a1d1 R11: 0000000000000246 R12: 0000000000000001
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: R13: 0000000000000002 R14: 7fffffffffffffff R15: 0000000000000000
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: Modules linked in: zram(E) zsmalloc(E) xfs(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) joydev(E) evdev(>
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: zram0: detected capacity change from 0 to 209715200
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: CR2: ffffba7db7904008
Mar 09 14:52:20 kdevops-blktests-block-dev kernel: ---[ end trace 534ee1d0b880dd90 ]---

I can try to modify it to include second patch first, as that is
required. There are two separate bugs here. Or was your goalt to
try to address both with only one patch?

  Luis

  reply	other threads:[~2021-03-10 13:12 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-06  2:20 [PATCH 0/2] zram: fix few ltp zram02.sh crashes Luis Chamberlain
2021-03-06  2:20 ` [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate Luis Chamberlain
2021-03-09  2:55   ` Minchan Kim
2021-03-10 13:11     ` Luis Chamberlain [this message]
2021-03-10 21:25       ` Luis Chamberlain
2021-03-12  2:08       ` Minchan Kim
2021-03-10 21:21     ` Luis Chamberlain
2021-03-12  2:14       ` Minchan Kim
2021-03-12 18:32         ` Luis Chamberlain
2021-03-12 19:28           ` Minchan Kim
2021-03-19 19:09             ` Luis Chamberlain
2021-03-22 16:37               ` Minchan Kim
2021-03-22 20:41                 ` Luis Chamberlain
2021-03-22 22:12                   ` Minchan Kim
2021-04-01 23:59                     ` Luis Chamberlain
2021-04-02  7:54                       ` Greg KH
2021-04-02 18:30                         ` Luis Chamberlain
2021-04-03  6:13                           ` Greg KH
     [not found]                             ` <20210406003152.GZ4332@42.do-not-panic.com>
2021-04-06 12:00                               ` Miroslav Benes
2021-04-06 15:54                                 ` Josh Poimboeuf
2021-04-07 14:09                                   ` Peter Zijlstra
2021-04-07 15:30                                     ` Josh Poimboeuf
2021-04-07 16:48                                       ` Peter Zijlstra
2021-04-07 20:17                         ` Josh Poimboeuf
2021-04-08  6:18                           ` Greg KH
2021-04-08 13:16                             ` Steven Rostedt
2021-04-08 13:37                             ` Josh Poimboeuf
2021-04-08  1:37                         ` Thomas Gleixner
2021-04-08  6:16                           ` Greg KH
2021-04-08  8:01                             ` Jiri Kosina
2021-04-08  8:09                               ` Greg KH
2021-04-08  8:35                                 ` Jiri Kosina
2021-04-08  8:55                                   ` Greg KH
2021-04-08 18:40                                     ` Luis Chamberlain
2021-04-09  3:01                                     ` Kees Cook
2021-04-05 17:07                       ` Minchan Kim
2021-04-05 19:00                         ` Luis Chamberlain
2021-04-05 19:58                           ` Minchan Kim
2021-04-06  0:29                             ` Luis Chamberlain
2021-04-07  1:23                               ` Minchan Kim
2021-04-07  1:38                                 ` Minchan Kim
2021-04-07 14:52                                   ` Luis Chamberlain
2021-04-07 14:50                                 ` Luis Chamberlain
2021-03-06  2:20 ` [PATCH 2/2] zram: fix races of sysfs attribute removal and usage Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210310131115.GQ4332@42.do-not-panic.com \
    --to=mcgrof@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbenes@suse.com \
    --cc=minchan@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.