All of lore.kernel.org
 help / color / mirror / Atom feed
* [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
@ 2021-07-15 10:10 Murphy Zhou
  2021-07-15 16:06 ` Roman Gushchin
  0 siblings, 1 reply; 11+ messages in thread
From: Murphy Zhou @ 2021-07-15 10:10 UTC (permalink / raw)
  To: Linux-Fsdevel, Roman Gushchin; +Cc: Linux MM

Hi,

#Looping generic/270 of xfstests[1] on pmem ramdisk with
mount option:  -o dax=always
mkfs.xfs option: -f -b size=4096 -m reflink=0
can hit this panic now.

#It's not reproducible on ext4.
#It's not reproducible without dax=always.

#Bisecting points to this
  commit c22d70a162d3cc177282c4487be4d54876ca55c8
  Author: Roman Gushchin <guro@fb.com>
  Date:   Mon Jun 28 19:36:03 2021 -0700

      writeback, cgroup: release dying cgwbs by switching attached inodes

#With this commit reverted, no panic.

#xfs_info output:

meta-data=/dev/pmem0p2           isize=512    agcount=4, agsize=532416
blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1,
rmapbt=0
         =                       reflink=0    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=2129664, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

#call trace:

[  987.071651] run fstests generic/270 at 2021-07-15 05:54:02
[  988.704940] XFS (pmem0p2): EXPERIMENTAL big timestamp feature in
use.  Use at your own risk!
[  988.746847] XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use
at your own risk
[  988.786070] XFS (pmem0p2): EXPERIMENTAL inode btree counters
feature in use. Use at your own risk!
[  988.828639] XFS (pmem0p2): Mounting V5 Filesystem
[  988.854019] XFS (pmem0p2): Ending clean mount
[  988.874550] XFS (pmem0p2): Quotacheck needed: Please wait.
[  988.900618] XFS (pmem0p2): Quotacheck: Done.
[  989.090783] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[  989.092751] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[  989.092962] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[ 1010.105586] BUG: unable to handle page fault for address: 0000000005b0f669
[ 1010.141817] #PF: supervisor read access in kernel mode
[ 1010.167824] #PF: error_code(0x0000) - not-present page
[ 1010.191499] PGD 0 P4D 0
[ 1010.203346] Oops: 0000 [#1] SMP PTI
[ 1010.219596] CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted
5.14.0-rc1-master-8096acd7442e+ #8
[ 1010.260441] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360
Gen9, BIOS P89 09/13/2016
[ 1010.297792] Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
[ 1010.324832] RIP: 0010:inode_do_switch_wbs+0xaf/0x470
[ 1010.347261] Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48
c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff
ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08
0f 85
[ 1010.434307] RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
[ 1010.457795] RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
[ 1010.489922] RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
[ 1010.522085] RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
[ 1010.554234] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
[ 1010.586414] R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
[ 1010.619394] FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000)
knlGS:0000000000000000
[ 1010.658874] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1010.688085] CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
[ 1010.722129] Call Trace:
[ 1010.733132]  inode_switch_wbs_work_fn+0xb6/0x2a0
[ 1010.754121]  process_one_work+0x1e6/0x380
[ 1010.772512]  worker_thread+0x53/0x3d0
[ 1010.789221]  ? process_one_work+0x380/0x380
[ 1010.807964]  kthread+0x10f/0x130
[ 1010.822043]  ? set_kthread_struct+0x40/0x40
[ 1010.840818]  ret_from_fork+0x22/0x30
[ 1010.856851] Modules linked in: xt_CHECKSUM xt_MASQUERADE
xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables
nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr
intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt
irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl
syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma
dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr
intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad
lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod
t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3
ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi
dm_mirror dm_region_hash dm_log dm_mod
[ 1011.200864] CR2: 0000000005b0f669
[ 1011.215700] ---[ end trace ed2105faff8384f3 ]---
[ 1011.241727] RIP: 0010:inode_do_switch_wbs+0xaf/0x470
[ 1011.264306] Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48
c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff
ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08
0f 85
[ 1011.348821] RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
[ 1011.372734] RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
[ 1011.405826] RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
[ 1011.437852] RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
[ 1011.469926] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
[ 1011.502179] R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
[ 1011.534233] FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000)
knlGS:0000000000000000
[ 1011.571247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1011.597063] CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
[ 1011.629160] Kernel panic - not syncing: Fatal exception
[ 1011.653802] Kernel Offset: 0x15200000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1011.713723] ---[ end Kernel panic - not syncing: Fatal exception ]---


[1] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/
-- 
Murphy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-15 10:10 [fsdax xfs] Regression panic at inode_switch_wbs_work_fn Murphy Zhou
@ 2021-07-15 16:06 ` Roman Gushchin
  2021-07-16  5:57   ` Murphy Zhou
  0 siblings, 1 reply; 11+ messages in thread
From: Roman Gushchin @ 2021-07-15 16:06 UTC (permalink / raw)
  To: Murphy Zhou; +Cc: Linux-Fsdevel, Linux MM

On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> Hi,
> 
> #Looping generic/270 of xfstests[1] on pmem ramdisk with
> mount option:  -o dax=always
> mkfs.xfs option: -f -b size=4096 -m reflink=0
> can hit this panic now.
> 
> #It's not reproducible on ext4.
> #It's not reproducible without dax=always.

Hi Murphy!

Thank you for the report!

Can you, please, check if the following patch fixes the problem?

Thank you!

--

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 271f2ca862c8..f5561ea7d90a 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work)
        blkcg_unpin_online(blkcg);
 
        fprop_local_destroy_percpu(&wb->memcg_completions);
-       percpu_ref_exit(&wb->refcnt);
 
        spin_lock_irq(&cgwb_lock);
        list_del(&wb->offline_node);
        spin_unlock_irq(&cgwb_lock);
 
+       percpu_ref_exit(&wb->refcnt);
        wb_exit(wb);
        WARN_ON_ONCE(!list_empty(&wb->b_attached));
        kfree_rcu(wb, rcu);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-15 16:06 ` Roman Gushchin
@ 2021-07-16  5:57   ` Murphy Zhou
  2021-07-16 20:13     ` Roman Gushchin
  0 siblings, 1 reply; 11+ messages in thread
From: Murphy Zhou @ 2021-07-16  5:57 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Linux-Fsdevel, Linux MM

Hi,

On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > Hi,
> >
> > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > mount option:  -o dax=always
> > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > can hit this panic now.
> >
> > #It's not reproducible on ext4.
> > #It's not reproducible without dax=always.
>
> Hi Murphy!
>
> Thank you for the report!
>
> Can you, please, check if the following patch fixes the problem?

No. Still the same panic.

>
> Thank you!
>
> --
>
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 271f2ca862c8..f5561ea7d90a 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work)
>         blkcg_unpin_online(blkcg);
>
>         fprop_local_destroy_percpu(&wb->memcg_completions);
> -       percpu_ref_exit(&wb->refcnt);
>
>         spin_lock_irq(&cgwb_lock);
>         list_del(&wb->offline_node);
>         spin_unlock_irq(&cgwb_lock);
>
> +       percpu_ref_exit(&wb->refcnt);
>         wb_exit(wb);
>         WARN_ON_ONCE(!list_empty(&wb->b_attached));
>         kfree_rcu(wb, rcu);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-16  5:57   ` Murphy Zhou
@ 2021-07-16 20:13     ` Roman Gushchin
  2021-07-17 17:17       ` Darrick J. Wong
  0 siblings, 1 reply; 11+ messages in thread
From: Roman Gushchin @ 2021-07-16 20:13 UTC (permalink / raw)
  To: Murphy Zhou; +Cc: Linux-Fsdevel, Linux MM

On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> Hi,
> 
> On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > Hi,
> > >
> > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > mount option:  -o dax=always
> > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > can hit this panic now.
> > >
> > > #It's not reproducible on ext4.
> > > #It's not reproducible without dax=always.
> >
> > Hi Murphy!
> >
> > Thank you for the report!
> >
> > Can you, please, check if the following patch fixes the problem?
> 
> No. Still the same panic.

Hm, can you, please, double check this? It seems that the patch fixes the
problem for others (of course, it can be a different problem).
CCed you on the proper patch, just sent to the list.

Otherwise, can you, please, say on which line of code the panic happens?
(using addr2line utility, for example)

Thank you!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-16 20:13     ` Roman Gushchin
@ 2021-07-17 17:17       ` Darrick J. Wong
  2021-07-17 17:40         ` Matthew Wilcox
  2021-07-17 22:08         ` Roman Gushchin
  0 siblings, 2 replies; 11+ messages in thread
From: Darrick J. Wong @ 2021-07-17 17:17 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Murphy Zhou, Linux-Fsdevel, Linux MM

On Fri, Jul 16, 2021 at 01:13:05PM -0700, Roman Gushchin wrote:
> On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> > Hi,
> > 
> > On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > > Hi,
> > > >
> > > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > > mount option:  -o dax=always
> > > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > > can hit this panic now.
> > > >
> > > > #It's not reproducible on ext4.
> > > > #It's not reproducible without dax=always.
> > >
> > > Hi Murphy!
> > >
> > > Thank you for the report!
> > >
> > > Can you, please, check if the following patch fixes the problem?
> > 
> > No. Still the same panic.
> 
> Hm, can you, please, double check this? It seems that the patch fixes the
> problem for others (of course, it can be a different problem).
> CCed you on the proper patch, just sent to the list.
> 
> Otherwise, can you, please, say on which line of code the panic happens?
> (using addr2line utility, for example)

I experience the same problem that Murphy does, and I tracked it down
to this chunk of inode_do_switch_wbs:

	/*
	 * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
	 * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
	 * pages actually under writeback.
	 */
	xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
here >>>>>>>>>> if (PageDirty(page)) {
			dec_wb_stat(old_wb, WB_RECLAIMABLE);
			inc_wb_stat(new_wb, WB_RECLAIMABLE);
		}
	}

I suspect that "page" is really a pfn to a pmem mapping and not a real
struct page.

--D

> 
> Thank you!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-17 17:17       ` Darrick J. Wong
@ 2021-07-17 17:40         ` Matthew Wilcox
  2021-07-17 22:08         ` Roman Gushchin
  1 sibling, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2021-07-17 17:40 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Roman Gushchin, Murphy Zhou, Linux-Fsdevel, Linux MM

On Sat, Jul 17, 2021 at 10:17:13AM -0700, Darrick J. Wong wrote:
> I experience the same problem that Murphy does, and I tracked it down
> to this chunk of inode_do_switch_wbs:
> 
> 	/*
> 	 * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
> 	 * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
> 	 * pages actually under writeback.
> 	 */
> 	xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
> here >>>>>>>>>> if (PageDirty(page)) {
> 			dec_wb_stat(old_wb, WB_RECLAIMABLE);
> 			inc_wb_stat(new_wb, WB_RECLAIMABLE);
> 		}
> 	}
> 
> I suspect that "page" is really a pfn to a pmem mapping and not a real
> struct page.

I think you're right.

Running scripts/decodecode on the original report, that's:

   0:	48 8b 50 08          	mov    0x8(%rax),%rdx

RAX: 0000000005b0f661

so rax is not a struct page, it's a PFN (/DAX) entry.

We shouldn't even be calling inode_do_switch_wbs() for DAX inodes because
we don't do writeback for DAX inodes (at least as far as the kernel's
writeback infrastructure is concerned; obviously the CPU does writeback
from its caches to PMEM).

Maybe some check at a higher level would be appropriate?  I don't know
much about this part of the kernel.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-17 17:17       ` Darrick J. Wong
  2021-07-17 17:40         ` Matthew Wilcox
@ 2021-07-17 22:08         ` Roman Gushchin
  2021-07-17 23:26           ` Matthew Wilcox
  2021-07-19  5:56             ` Murphy Zhou
  1 sibling, 2 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-07-17 22:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Murphy Zhou, Linux-Fsdevel, Linux MM

On Sat, Jul 17, 2021 at 10:17:13AM -0700, Darrick J. Wong wrote:
> On Fri, Jul 16, 2021 at 01:13:05PM -0700, Roman Gushchin wrote:
> > On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> > > Hi,
> > > 
> > > On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > > > Hi,
> > > > >
> > > > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > > > mount option:  -o dax=always
> > > > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > > > can hit this panic now.
> > > > >
> > > > > #It's not reproducible on ext4.
> > > > > #It's not reproducible without dax=always.
> > > >
> > > > Hi Murphy!
> > > >
> > > > Thank you for the report!
> > > >
> > > > Can you, please, check if the following patch fixes the problem?
> > > 
> > > No. Still the same panic.
> > 
> > Hm, can you, please, double check this? It seems that the patch fixes the
> > problem for others (of course, it can be a different problem).
> > CCed you on the proper patch, just sent to the list.
> > 
> > Otherwise, can you, please, say on which line of code the panic happens?
> > (using addr2line utility, for example)
> 
> I experience the same problem that Murphy does, and I tracked it down
> to this chunk of inode_do_switch_wbs:
> 
> 	/*
> 	 * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
> 	 * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
> 	 * pages actually under writeback.
> 	 */
> 	xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
> here >>>>>>>>>> if (PageDirty(page)) {
> 			dec_wb_stat(old_wb, WB_RECLAIMABLE);
> 			inc_wb_stat(new_wb, WB_RECLAIMABLE);
> 		}
> 	}
> 
> I suspect that "page" is really a pfn to a pmem mapping and not a real
> struct page.

Good catch! Now it's clear that it's a different issue.

I think as now the best option is to ignore dax inodes completely.
Can you, please, confirm, that the following patch solves the problem?

Thanks!

--

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 06d04a74ab6c..4c3370548982 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
         */
        smp_mb();
 
+       if (IS_DAX(inode))
+               return false;
+
        /* while holding I_WB_SWITCH, no one else can update the association */
        spin_lock(&inode->i_lock);
        if (!(inode->i_sb->s_flags & SB_ACTIVE) ||

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-17 22:08         ` Roman Gushchin
@ 2021-07-17 23:26           ` Matthew Wilcox
  2021-07-18 16:07             ` Roman Gushchin
  2021-07-19  5:56             ` Murphy Zhou
  1 sibling, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2021-07-17 23:26 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Darrick J. Wong, Murphy Zhou, Linux-Fsdevel, Linux MM

On Sat, Jul 17, 2021 at 03:08:28PM -0700, Roman Gushchin wrote:
> On Sat, Jul 17, 2021 at 10:17:13AM -0700, Darrick J. Wong wrote:
> > On Fri, Jul 16, 2021 at 01:13:05PM -0700, Roman Gushchin wrote:
> > > On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> > > > Hi,
> > > > 
> > > > On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > > > > Hi,
> > > > > >
> > > > > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > > > > mount option:  -o dax=always
> > > > > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > > > > can hit this panic now.
> > > > > >
> > > > > > #It's not reproducible on ext4.
> > > > > > #It's not reproducible without dax=always.
> > > > >
> > > > > Hi Murphy!
> > > > >
> > > > > Thank you for the report!
> > > > >
> > > > > Can you, please, check if the following patch fixes the problem?
> > > > 
> > > > No. Still the same panic.
> > > 
> > > Hm, can you, please, double check this? It seems that the patch fixes the
> > > problem for others (of course, it can be a different problem).
> > > CCed you on the proper patch, just sent to the list.
> > > 
> > > Otherwise, can you, please, say on which line of code the panic happens?
> > > (using addr2line utility, for example)
> > 
> > I experience the same problem that Murphy does, and I tracked it down
> > to this chunk of inode_do_switch_wbs:
> > 
> > 	/*
> > 	 * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
> > 	 * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
> > 	 * pages actually under writeback.
> > 	 */
> > 	xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
> > here >>>>>>>>>> if (PageDirty(page)) {
> > 			dec_wb_stat(old_wb, WB_RECLAIMABLE);
> > 			inc_wb_stat(new_wb, WB_RECLAIMABLE);
> > 		}
> > 	}
> > 
> > I suspect that "page" is really a pfn to a pmem mapping and not a real
> > struct page.
> 
> Good catch! Now it's clear that it's a different issue.
> 
> I think as now the best option is to ignore dax inodes completely.
> Can you, please, confirm, that the following patch solves the problem?
> 
> Thanks!
> 
> --
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 06d04a74ab6c..4c3370548982 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
>          */
>         smp_mb();
>  
> +       if (IS_DAX(inode))
> +               return false;
> +
>         /* while holding I_WB_SWITCH, no one else can update the association */
>         spin_lock(&inode->i_lock);
>         if (!(inode->i_sb->s_flags & SB_ACTIVE) ||

That should work, but wouldn't it be better to test that at the top of
inode_switch_wbs()?  Or even earlier?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-17 23:26           ` Matthew Wilcox
@ 2021-07-18 16:07             ` Roman Gushchin
  0 siblings, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-07-18 16:07 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Darrick J. Wong, Murphy Zhou, Linux-Fsdevel, Linux MM

On Sun, Jul 18, 2021 at 12:26:00AM +0100, Matthew Wilcox wrote:
> On Sat, Jul 17, 2021 at 03:08:28PM -0700, Roman Gushchin wrote:
> > On Sat, Jul 17, 2021 at 10:17:13AM -0700, Darrick J. Wong wrote:
> > > On Fri, Jul 16, 2021 at 01:13:05PM -0700, Roman Gushchin wrote:
> > > > On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> > > > > Hi,
> > > > > 
> > > > > On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> > > > > >
> > > > > > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > > > > > mount option:  -o dax=always
> > > > > > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > > > > > can hit this panic now.
> > > > > > >
> > > > > > > #It's not reproducible on ext4.
> > > > > > > #It's not reproducible without dax=always.
> > > > > >
> > > > > > Hi Murphy!
> > > > > >
> > > > > > Thank you for the report!
> > > > > >
> > > > > > Can you, please, check if the following patch fixes the problem?
> > > > > 
> > > > > No. Still the same panic.
> > > > 
> > > > Hm, can you, please, double check this? It seems that the patch fixes the
> > > > problem for others (of course, it can be a different problem).
> > > > CCed you on the proper patch, just sent to the list.
> > > > 
> > > > Otherwise, can you, please, say on which line of code the panic happens?
> > > > (using addr2line utility, for example)
> > > 
> > > I experience the same problem that Murphy does, and I tracked it down
> > > to this chunk of inode_do_switch_wbs:
> > > 
> > > 	/*
> > > 	 * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
> > > 	 * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
> > > 	 * pages actually under writeback.
> > > 	 */
> > > 	xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
> > > here >>>>>>>>>> if (PageDirty(page)) {
> > > 			dec_wb_stat(old_wb, WB_RECLAIMABLE);
> > > 			inc_wb_stat(new_wb, WB_RECLAIMABLE);
> > > 		}
> > > 	}
> > > 
> > > I suspect that "page" is really a pfn to a pmem mapping and not a real
> > > struct page.
> > 
> > Good catch! Now it's clear that it's a different issue.
> > 
> > I think as now the best option is to ignore dax inodes completely.
> > Can you, please, confirm, that the following patch solves the problem?
> > 
> > Thanks!
> > 
> > --
> > 
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 06d04a74ab6c..4c3370548982 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
> >          */
> >         smp_mb();
> >  
> > +       if (IS_DAX(inode))
> > +               return false;
> > +
> >         /* while holding I_WB_SWITCH, no one else can update the association */
> >         spin_lock(&inode->i_lock);
> >         if (!(inode->i_sb->s_flags & SB_ACTIVE) ||
> 
> That should work, but wouldn't it be better to test that at the top of
> inode_switch_wbs()?  Or even earlier?
> 

Hm, inode_switch_wbs() is not called from the cleanup path.
The cleanup path works like this:
  cleanup_offline_cgwbs_workfn()
    cleanup_offline_cgwb()
      inode_prepare_wbs_switch()
      inode_switch_wbs_work_fn()

While the generic switching path:
  inode_switch_wbs()
    inode_prepare_wbs_switch()
    inode_switch_wbs_work_fn()

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
  2021-07-17 22:08         ` Roman Gushchin
@ 2021-07-19  5:56             ` Murphy Zhou
  2021-07-19  5:56             ` Murphy Zhou
  1 sibling, 0 replies; 11+ messages in thread
From: Murphy Zhou @ 2021-07-19  5:56 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Darrick J. Wong, Linux-Fsdevel, Linux MM

On Sun, Jul 18, 2021 at 6:08 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Sat, Jul 17, 2021 at 10:17:13AM -0700, Darrick J. Wong wrote:
> > On Fri, Jul 16, 2021 at 01:13:05PM -0700, Roman Gushchin wrote:
> > > On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> > > > Hi,
> > > >
> > > > On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > > > > Hi,
> > > > > >
> > > > > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > > > > mount option:  -o dax=always
> > > > > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > > > > can hit this panic now.
> > > > > >
> > > > > > #It's not reproducible on ext4.
> > > > > > #It's not reproducible without dax=always.
> > > > >
> > > > > Hi Murphy!
> > > > >
> > > > > Thank you for the report!
> > > > >
> > > > > Can you, please, check if the following patch fixes the problem?
> > > >
> > > > No. Still the same panic.
> > >
> > > Hm, can you, please, double check this? It seems that the patch fixes the
> > > problem for others (of course, it can be a different problem).
> > > CCed you on the proper patch, just sent to the list.
> > >
> > > Otherwise, can you, please, say on which line of code the panic happens?
> > > (using addr2line utility, for example)
> >
> > I experience the same problem that Murphy does, and I tracked it down
> > to this chunk of inode_do_switch_wbs:
> >
> >       /*
> >        * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
> >        * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
> >        * pages actually under writeback.
> >        */
> >       xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
> > here >>>>>>>>>> if (PageDirty(page)) {
> >                       dec_wb_stat(old_wb, WB_RECLAIMABLE);
> >                       inc_wb_stat(new_wb, WB_RECLAIMABLE);
> >               }
> >       }
> >
> > I suspect that "page" is really a pfn to a pmem mapping and not a real
> > struct page.
>
> Good catch! Now it's clear that it's a different issue.
>
> I think as now the best option is to ignore dax inodes completely.
> Can you, please, confirm, that the following patch solves the problem?

This one works for me. Thanks.


>
> Thanks!
>
> --
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 06d04a74ab6c..4c3370548982 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
>          */
>         smp_mb();
>
> +       if (IS_DAX(inode))
> +               return false;
> +
>         /* while holding I_WB_SWITCH, no one else can update the association */
>         spin_lock(&inode->i_lock);
>         if (!(inode->i_sb->s_flags & SB_ACTIVE) ||

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fsdax xfs] Regression panic at inode_switch_wbs_work_fn
@ 2021-07-19  5:56             ` Murphy Zhou
  0 siblings, 0 replies; 11+ messages in thread
From: Murphy Zhou @ 2021-07-19  5:56 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Darrick J. Wong, Linux-Fsdevel, Linux MM

On Sun, Jul 18, 2021 at 6:08 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Sat, Jul 17, 2021 at 10:17:13AM -0700, Darrick J. Wong wrote:
> > On Fri, Jul 16, 2021 at 01:13:05PM -0700, Roman Gushchin wrote:
> > > On Fri, Jul 16, 2021 at 01:57:55PM +0800, Murphy Zhou wrote:
> > > > Hi,
> > > >
> > > > On Fri, Jul 16, 2021 at 12:07 AM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > On Thu, Jul 15, 2021 at 06:10:22PM +0800, Murphy Zhou wrote:
> > > > > > Hi,
> > > > > >
> > > > > > #Looping generic/270 of xfstests[1] on pmem ramdisk with
> > > > > > mount option:  -o dax=always
> > > > > > mkfs.xfs option: -f -b size=4096 -m reflink=0
> > > > > > can hit this panic now.
> > > > > >
> > > > > > #It's not reproducible on ext4.
> > > > > > #It's not reproducible without dax=always.
> > > > >
> > > > > Hi Murphy!
> > > > >
> > > > > Thank you for the report!
> > > > >
> > > > > Can you, please, check if the following patch fixes the problem?
> > > >
> > > > No. Still the same panic.
> > >
> > > Hm, can you, please, double check this? It seems that the patch fixes the
> > > problem for others (of course, it can be a different problem).
> > > CCed you on the proper patch, just sent to the list.
> > >
> > > Otherwise, can you, please, say on which line of code the panic happens?
> > > (using addr2line utility, for example)
> >
> > I experience the same problem that Murphy does, and I tracked it down
> > to this chunk of inode_do_switch_wbs:
> >
> >       /*
> >        * Count and transfer stats.  Note that PAGECACHE_TAG_DIRTY points
> >        * to possibly dirty pages while PAGECACHE_TAG_WRITEBACK points to
> >        * pages actually under writeback.
> >        */
> >       xas_for_each_marked(&xas, page, ULONG_MAX, PAGECACHE_TAG_DIRTY) {
> > here >>>>>>>>>> if (PageDirty(page)) {
> >                       dec_wb_stat(old_wb, WB_RECLAIMABLE);
> >                       inc_wb_stat(new_wb, WB_RECLAIMABLE);
> >               }
> >       }
> >
> > I suspect that "page" is really a pfn to a pmem mapping and not a real
> > struct page.
>
> Good catch! Now it's clear that it's a different issue.
>
> I think as now the best option is to ignore dax inodes completely.
> Can you, please, confirm, that the following patch solves the problem?

This one works for me. Thanks.


>
> Thanks!
>
> --
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 06d04a74ab6c..4c3370548982 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
>          */
>         smp_mb();
>
> +       if (IS_DAX(inode))
> +               return false;
> +
>         /* while holding I_WB_SWITCH, no one else can update the association */
>         spin_lock(&inode->i_lock);
>         if (!(inode->i_sb->s_flags & SB_ACTIVE) ||


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-07-19  5:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-15 10:10 [fsdax xfs] Regression panic at inode_switch_wbs_work_fn Murphy Zhou
2021-07-15 16:06 ` Roman Gushchin
2021-07-16  5:57   ` Murphy Zhou
2021-07-16 20:13     ` Roman Gushchin
2021-07-17 17:17       ` Darrick J. Wong
2021-07-17 17:40         ` Matthew Wilcox
2021-07-17 22:08         ` Roman Gushchin
2021-07-17 23:26           ` Matthew Wilcox
2021-07-18 16:07             ` Roman Gushchin
2021-07-19  5:56           ` Murphy Zhou
2021-07-19  5:56             ` Murphy Zhou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.