From: akpm@linux-foundation.org
To: dchinner@redhat.com, djwong@kernel.org, guro@fb.com,
jack@suse.cz, jencce.kernel@gmail.com,
mm-commits@vger.kernel.org, willy@infradead.org
Subject: [merged] writeback-cgroup-do-not-reparent-dax-inodes.patch removed from -mm tree
Date: Tue, 27 Jul 2021 12:35:26 -0700
Message-ID: <20210727193526.vUxqw3Z9a%akpm@linux-foundation.org>
The patch titled
Subject: writeback, cgroup: do not reparent dax inodes
has been removed from the -mm tree. Its filename was
writeback-cgroup-do-not-reparent-dax-inodes.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: writeback, cgroup: do not reparent dax inodes
The inode switching code is not suited for dax inodes. An attempt to
switch a dax inode to a parent writeback structure (as a part of a
writeback cleanup procedure) results in a panic like this:
[ 987.071651] run fstests generic/270 at 2021-07-15 05:54:02
[ 988.704940] XFS (pmem0p2): EXPERIMENTAL big timestamp feature in
use. Use at your own risk!
[ 988.746847] XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use
at your own risk
[ 988.786070] XFS (pmem0p2): EXPERIMENTAL inode btree counters
feature in use. Use at your own risk!
[ 988.828639] XFS (pmem0p2): Mounting V5 Filesystem
[ 988.854019] XFS (pmem0p2): Ending clean mount
[ 988.874550] XFS (pmem0p2): Quotacheck needed: Please wait.
[ 988.900618] XFS (pmem0p2): Quotacheck: Done.
[ 989.090783] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[ 989.092751] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[ 989.092962] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[ 1010.105586] BUG: unable to handle page fault for address: 0000000005b0f669
[ 1010.141817] #PF: supervisor read access in kernel mode
[ 1010.167824] #PF: error_code(0x0000) - not-present page
[ 1010.191499] PGD 0 P4D 0
[ 1010.203346] Oops: 0000 [#1] SMP PTI
[ 1010.219596] CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted
5.14.0-rc1-master-8096acd7442e+ #8
[ 1010.260441] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360
Gen9, BIOS P89 09/13/2016
[ 1010.297792] Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
[ 1010.324832] RIP: 0010:inode_do_switch_wbs+0xaf/0x470
[ 1010.347261] Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48
c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff
ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08
0f 85
[ 1010.434307] RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
[ 1010.457795] RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
[ 1010.489922] RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
[ 1010.522085] RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
[ 1010.554234] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
[ 1010.586414] R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
[ 1010.619394] FS: 0000000000000000(0000) GS:ffff89ee5fb40000(0000)
knlGS:0000000000000000
[ 1010.658874] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1010.688085] CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
[ 1010.722129] Call Trace:
[ 1010.733132] inode_switch_wbs_work_fn+0xb6/0x2a0
[ 1010.754121] process_one_work+0x1e6/0x380
[ 1010.772512] worker_thread+0x53/0x3d0
[ 1010.789221] ? process_one_work+0x380/0x380
[ 1010.807964] kthread+0x10f/0x130
[ 1010.822043] ? set_kthread_struct+0x40/0x40
[ 1010.840818] ret_from_fork+0x22/0x30
[ 1010.856851] Modules linked in: xt_CHECKSUM xt_MASQUERADE
xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables
nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr
intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt
irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl
syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma
dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr
intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad
lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod
t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3
ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi
dm_mirror dm_region_hash dm_log dm_mod
[ 1011.200864] CR2: 0000000005b0f669
[ 1011.215700] ---[ end trace ed2105faff8384f3 ]---
[ 1011.241727] RIP: 0010:inode_do_switch_wbs+0xaf/0x470
[ 1011.264306] Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48
c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff
ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08
0f 85
[ 1011.348821] RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
[ 1011.372734] RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
[ 1011.405826] RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
[ 1011.437852] RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
[ 1011.469926] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
[ 1011.502179] R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
[ 1011.534233] FS: 0000000000000000(0000) GS:ffff89ee5fb40000(0000)
knlGS:0000000000000000
[ 1011.571247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1011.597063] CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
[ 1011.629160] Kernel panic - not syncing: Fatal exception
[ 1011.653802] Kernel Offset: 0x15200000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1011.713723] ---[ end Kernel panic - not syncing: Fatal exception ]---
The crash happens on an attempt to iterate over attached pagecache pages
and check the dirty flag: a dax inode's xarray contains pfns instead of
generic struct page pointers.
This happens for DAX and not for other kinds of non-page entries in the
inodes because it's a tagged iteration, and shadow/swap entries are never
tagged; only DAX entries get tagged.
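The register dump above can be read against that explanation. What follows
is a minimal userspace sketch, not kernel code: the helpers mirror the
XArray value-entry encoding (bit 0 set, payload shifted left by one), and
load_address() is a hypothetical helper introduced only for illustration.
Under the assumption that the code treated the entry as a struct page
pointer and read a field at offset 8, the odd value in RAX (0x5b0f661) and
the faulting address in CR2 (0x5b0f669) are consistent with a value entry
being dereferenced.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace mirror of the XArray value-entry test: an entry with
 * bit 0 set encodes an integer payload, not a pointer.
 */
static inline bool entry_is_value(uintptr_t entry)
{
	return (entry & 1) != 0;
}

/* Mirror of the value-entry constructor: shift the payload, set bit 0. */
static inline uintptr_t mk_value(uintptr_t v)
{
	return (v << 1) | 1;
}

/* Mirror of the value-entry accessor: drop the tag bit. */
static inline uintptr_t to_value(uintptr_t entry)
{
	return entry >> 1;
}

/*
 * Hypothetical helper (illustration only): the address touched when
 * an entry is wrongly used as a pointer and a field at 'offset' is read.
 */
static inline uintptr_t load_address(uintptr_t entry, uintptr_t offset)
{
	return entry + offset;
}
```

Feeding the RAX value from the oops into entry_is_value() reports a value
entry, and load_address() with an offset of 8 reproduces the CR2 address.
The offset of 8 is inferred from the two register values only; the sketch
does not claim which struct page field was being read.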
Fix the problem by bailing out (with the false return value) of
inode_prepare_wbs_switch() if a dax inode is passed.
[willy@infradead.org: changelog addition]
Link: https://lkml.kernel.org/r/20210719171350.3876830-1-guro@fb.com
Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Murphy Zhou <jencce.kernel@gmail.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/fs-writeback.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/fs-writeback.c~writeback-cgroup-do-not-reparent-dax-inodes
+++ a/fs/fs-writeback.c
@@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(str
 	 */
 	smp_mb();
 
+	if (IS_DAX(inode))
+		return false;
+
 	/* while holding I_WB_SWITCH, no one else can update the association */
 	spin_lock(&inode->i_lock);
 	if (!(inode->i_sb->s_flags & SB_ACTIVE) ||
_
Patches currently in -mm which might be from guro@fb.com are