linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] writeback, cgroup: remove wb from offline list before releasing refcnt
@ 2021-07-16 20:10 Roman Gushchin
  2021-07-23 18:33 ` Adam Williamson
  0 siblings, 1 reply; 4+ messages in thread
From: Roman Gushchin @ 2021-07-16 20:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-fsdevel, linux-kernel, Will Deacon, Jan Kara,
	Darrick J . Wong, Boyang Xue, Dave Chinner, Murphy Zhou,
	Roman Gushchin

Boyang reported that the commit c22d70a162d3 ("writeback, cgroup:
release dying cgwbs by switching attached inodes") causes the kernel
to crash while running xfstests generic/256 on ext4 on aarch64 and
ppc64le.

  [ 4366.380974] run fstests generic/256 at 2021-07-12 05:41:40
  [ 4368.337078] EXT4-fs (vda3): mounted filesystem with ordered data
  mode. Opts: . Quota mode: none.
  [ 4371.275986] Unable to handle kernel NULL pointer dereference at
  virtual address 0000000000000000
  [ 4371.278210] Mem abort info:
  [ 4371.278880]   ESR = 0x96000005
  [ 4371.279603]   EC = 0x25: DABT (current EL), IL = 32 bits
  [ 4371.280878]   SET = 0, FnV = 0
  [ 4371.281621]   EA = 0, S1PTW = 0
  [ 4371.282396]   FSC = 0x05: level 1 translation fault
  [ 4371.283635] Data abort info:
  [ 4371.284333]   ISV = 0, ISS = 0x00000005
  [ 4371.285246]   CM = 0, WnR = 0
  [ 4371.285975] user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
  [ 4371.287640] [0000000000000000] pgd=0000000000000000,
  p4d=0000000000000000, pud=0000000000000000
  [ 4371.290016] Internal error: Oops: 96000005 [#1] SMP
  [ 4371.291251] Modules linked in: dm_flakey dm_snapshot dm_bufio
  dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
  nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2
  drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64
  sha1_ce virtio_blk virtio_net net_failover virtio_console failover
  virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
  [ 4371.300059] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G
         X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
  [ 4371.303009] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  [ 4371.304685] Workqueue: events_unbound cleanup_offline_cgwbs_workfn
  [ 4371.306329] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
  [ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
  [ 4371.309254] lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
  [ 4371.310597] sp : ffff80001554fd10
  [ 4371.311443] x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
  [ 4371.313320] x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
  [ 4371.315159] x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
  [ 4371.316945] x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
  [ 4371.318690] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
  [ 4371.320437] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
  [ 4371.322444] x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
  [ 4371.324243] x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
  [ 4371.326049] x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
  [ 4371.327898] x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
  [ 4371.329748] Call trace:
  [ 4371.330372]  cleanup_offline_cgwbs_workfn+0x320/0x394
  [ 4371.331694]  process_one_work+0x1f4/0x4b0
  [ 4371.332767]  worker_thread+0x184/0x540
  [ 4371.333732]  kthread+0x114/0x120
  [ 4371.334535]  ret_from_fork+0x10/0x18
  [ 4371.335440] Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
  [ 4371.337174] ---[ end trace e250fe289272792a ]---
  [ 4371.338365] Kernel panic - not syncing: Oops: Fatal exception
  [ 4371.339884] SMP: stopping secondary CPUs
  [ 4372.424137] SMP: failed to stop secondary CPUs 0-2
  [ 4372.436894] Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
  [ 4372.438408] PHYS_OFFSET: 0xfff0defca0000000
  [ 4372.439496] CPU features: 0x00200251,23200840
  [ 4372.440603] Memory Limit: none
  [ 4372.441374] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

The problem happens when cgwb_release_workfn() races with
cleanup_offline_cgwbs_workfn(): wb_tryget() in
cleanup_offline_cgwbs_workfn() can be called after percpu_ref_exit()
is cgwb_release_workfn(), which is basically a use-after-free error.

Fix the problem by making removing the writeback structure from the
offline list before releasing the percpu reference counter. It will
guarantee that cleanup_offline_cgwbs_workfn() will not see and not
access writeback structures which are about to be released.

Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Boyang Xue <bxue@redhat.com>
Suggested-by: Jan Kara <jack@suse.cz>
Tested-by: Darrick J. Wong <djwong@kernel.org>
---
 mm/backing-dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 271f2ca862c8..f5561ea7d90a 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work)
 	blkcg_unpin_online(blkcg);
 
 	fprop_local_destroy_percpu(&wb->memcg_completions);
-	percpu_ref_exit(&wb->refcnt);
 
 	spin_lock_irq(&cgwb_lock);
 	list_del(&wb->offline_node);
 	spin_unlock_irq(&cgwb_lock);
 
+	percpu_ref_exit(&wb->refcnt);
 	wb_exit(wb);
 	WARN_ON_ONCE(!list_empty(&wb->b_attached));
 	kfree_rcu(wb, rcu);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] writeback, cgroup: remove wb from offline list before releasing refcnt
  2021-07-16 20:10 [PATCH] writeback, cgroup: remove wb from offline list before releasing refcnt Roman Gushchin
@ 2021-07-23 18:33 ` Adam Williamson
  2021-07-23 19:13   ` Roman Gushchin
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Williamson @ 2021-07-23 18:33 UTC (permalink / raw)
  To: guro; +Cc: linux-kernel

Hi folks! I'm not subscribed to the list and am trying to send this
using git-send-email, apologies for any weirdness related to that.

Just wanted to confirm the issue that this patch attempts to address,
and ask if it can be reviewed/moved along. I look after Fedora's
openQA automated test instance, and in tests of recent Fedora Rawhide
composes, we're seeing at least one failure per compose that's caused
by this crash (it usually prevents the system shutting down or
rebooting correctly in a test which requires that to happen).

I can't actually confirm the fix as I can't really easily change our
tests to run on a non-official kernel build, and our kernel maintainer
(reasonably) says he won't backport the patch until it's at least got
some review. But I'm definitely seeing the problem!

Thanks folks.
-- 
Adam Williamson
Fedora QA
IRC: adamw | Twitter: adamw_ha
https://www.happyassassin.net


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] writeback, cgroup: remove wb from offline list before releasing refcnt
  2021-07-23 18:33 ` Adam Williamson
@ 2021-07-23 19:13   ` Roman Gushchin
  2021-07-23 19:15     ` Adam Williamson
  0 siblings, 1 reply; 4+ messages in thread
From: Roman Gushchin @ 2021-07-23 19:13 UTC (permalink / raw)
  To: Adam Williamson; +Cc: linux-kernel

On Fri, Jul 23, 2021 at 11:33:43AM -0700, Adam Williamson wrote:
> Hi folks! I'm not subscribed to the list and am trying to send this
> using git-send-email, apologies for any weirdness related to that.
> 
> Just wanted to confirm the issue that this patch attempts to address,
> and ask if it can be reviewed/moved along. I look after Fedora's
> openQA automated test instance, and in tests of recent Fedora Rawhide
> composes, we're seeing at least one failure per compose that's caused
> by this crash (it usually prevents the system shutting down or
> rebooting correctly in a test which requires that to happen).
> 
> I can't actually confirm the fix as I can't really easily change our
> tests to run on a non-official kernel build, and our kernel maintainer
> (reasonably) says he won't backport the patch until it's at least got
> some review. But I'm definitely seeing the problem!
> 
> Thanks folks.
> -- 
> Adam Williamson
> Fedora QA
> IRC: adamw | Twitter: adamw_ha
> https://www.happyassassin.net 
> 

Hello, Adam!

The patch was picked up by Andrew Morton and will be soon merged into the Linus's tree.
Currently you can find it in the "next" tree (can't provide a hash because it's not stable
there).

Please, note that there it another fix called "writeback, cgroup: do not reparent dax inodes",
which you likely want to backport too.

Both fixes got some testing and reviews.

Thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] writeback, cgroup: remove wb from offline list before releasing refcnt
  2021-07-23 19:13   ` Roman Gushchin
@ 2021-07-23 19:15     ` Adam Williamson
  0 siblings, 0 replies; 4+ messages in thread
From: Adam Williamson @ 2021-07-23 19:15 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: linux-kernel

On Fri, 2021-07-23 at 12:13 -0700, Roman Gushchin wrote:
> On Fri, Jul 23, 2021 at 11:33:43AM -0700, Adam Williamson wrote:
> > Hi folks! I'm not subscribed to the list and am trying to send this
> > using git-send-email, apologies for any weirdness related to that.
> > 
> > Just wanted to confirm the issue that this patch attempts to address,
> > and ask if it can be reviewed/moved along. I look after Fedora's
> > openQA automated test instance, and in tests of recent Fedora Rawhide
> > composes, we're seeing at least one failure per compose that's caused
> > by this crash (it usually prevents the system shutting down or
> > rebooting correctly in a test which requires that to happen).
> > 
> > I can't actually confirm the fix as I can't really easily change our
> > tests to run on a non-official kernel build, and our kernel maintainer
> > (reasonably) says he won't backport the patch until it's at least got
> > some review. But I'm definitely seeing the problem!
> > 
> > Thanks folks.
> > -- 
> > Adam Williamson
> > Fedora QA
> > IRC: adamw | Twitter: adamw_ha
> > https://www.happyassassin.net 
> > 
> 
> Hello, Adam!
> 
> The patch was picked up by Andrew Morton and will be soon merged into the Linus's tree.
> Currently you can find it in the "next" tree (can't provide a hash because it's not stable
> there).
> 
> Please, note that there it another fix called "writeback, cgroup: do not reparent dax inodes",
> which you likely want to backport too.
> 
> Both fixes got some testing and reviews.

Hi Roman! Aha, great, thanks for the info. I didn't see any follow-up
on LKML so I figured it was in limbo. I'll let our kernel maintainer
know. Thanks again!
-- 
Adam Williamson
Fedora QA
IRC: adamw | Twitter: adamw_ha
https://www.happyassassin.net



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-07-23 19:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-16 20:10 [PATCH] writeback, cgroup: remove wb from offline list before releasing refcnt Roman Gushchin
2021-07-23 18:33 ` Adam Williamson
2021-07-23 19:13   ` Roman Gushchin
2021-07-23 19:15     ` Adam Williamson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).