linux-kernel.vger.kernel.org archive mirror
* fuse: kernel BUG at mm/truncate.c:763!
@ 2021-03-12  8:52 Luis Henriques
  2021-03-12  9:48 ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-12  8:52 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

Hi Miklos,

I've seen a bug report (5.10.16 kernel splat below) that seems to be
reproducible in kernels as early as 5.4.

The commit that caught my attention when looking at what was merged in 5.4
was e4648309b85a ("fuse: truncate pending writes on O_TRUNC") but I didn't
dig too deep into that -- I was wondering if you have seen something
similar before.

There's another splat in the bug report[1] for a 5.4.14 kernel (which may
be for a different bug, but the traces don't look as reliable as the one
below).

[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1182929

[97604.721590] kernel BUG at mm/truncate.c:763!
[97604.721601] invalid opcode: 0000 [#1] SMP PTI
[97604.721613] CPU: 18 PID: 1584438 Comm: g++ Tainted: P           O 
 5.10.16-1-default #1 openSUSE Tumbleweed
[97604.721618] Hardware name: Supermicro X11DPi-N(T)/X11DPi-N, BIOS 3.1a
10/16/2019
[97604.721631] RIP: 0010:invalidate_inode_pages2_range+0x366/0x4e0
[97604.721637] Code: 0f 48 f0 e9 19 ff ff ff 31 c9 4c 89 e7 ba 01 00 00 00
48 89 ee e8 1a c5 02 00 4c 89 ff e8 02 1b 01 00 84 c0 0f 84 ca fe ff ff <0f>
0b 49 8b 57 18 49 39 d4 0f 85 e2 fe ff ff 49 f7 07 00 60 00 00
[97604.721645] RSP: 0018:ffffa613aa54ba40 EFLAGS: 00010202
[97604.721651] RAX: 0000000000000001 RBX: 000000000000000a RCX:
0000000000000200
[97604.721656] RDX: 0000000000000090 RSI: 00affff800010037 RDI:
ffffd880718e0000
[97604.721660] RBP: 0000000000001400 R08: 0000000000001400 R09:
0000000000001a73
[97604.721664] R10: 0000000000000000 R11: 0000000004a684da R12:
ffff8a28d4549d78
[97604.721669] R13: ffffffffffffffff R14: 0000000000000000 R15:
ffffd880718e0000
[97604.721674] FS:  00007f9cdd7fb740(0000) GS:ffff8a5c7f980000(0000)
knlGS:0000000000000000
[97604.721679] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[97604.721683] CR2: 00007f89d3d78d80 CR3: 0000004d8a14e005 CR4:
00000000007706e0
[97604.721688] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[97604.721692] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[97604.721696] PKRU: 55555554
[97604.721699] Call Trace:
[97604.721719]  ? request_wait_answer+0x11a/0x210 [fuse]
[97604.721729]  ? fuse_dentry_delete+0xb/0x20 [fuse]
[97604.721740]  fuse_finish_open+0x85/0x150 [fuse]
[97604.721750]  fuse_open_common+0x1a8/0x1b0 [fuse]
[97604.721759]  ? fuse_open_common+0x1b0/0x1b0 [fuse]
[97604.721766]  do_dentry_open+0x14e/0x380
[97604.721775]  path_openat+0x600/0x10d0
[97604.721782]  ? handle_mm_fault+0x103c/0x1a00
[97604.721791]  ? follow_page_pte+0x314/0x5f0
[97604.721795]  do_filp_open+0x88/0x130
[97604.721803]  ? security_prepare_creds+0x6d/0x90
[97604.721808]  ? __kmalloc+0x11d/0x2a0
[97604.721814]  do_open_execat+0x6d/0x1a0
[97604.721819]  bprm_execve+0x190/0x6b0
[97604.721825]  do_execveat_common+0x192/0x1c0
[97604.721830]  __x64_sys_execve+0x39/0x50
[97604.721836]  do_syscall_64+0x33/0x80
[97604.721843]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[97604.721848] RIP: 0033:0x7f9cdcfe2c37
[97604.721853] Code: ff ff 76 df 89 c6 f7 de 64 41 89 32 eb d5 89 c6 f7 de
64 41 89 32 eb db 66 2e 0f 1f 84 00 00 00 00 00 90 b8 3b 00 00 00 0f 05 <48>
3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 08 12 30 00 f7 d8 64 89 02
[97604.721862] RSP: 002b:00007ffe444f5758 EFLAGS: 00000202 ORIG_RAX:
000000000000003b
[97604.721867] RAX: ffffffffffffffda RBX: 00007f9cdd7fb6a0 RCX:
00007f9cdcfe2c37
[97604.721872] RDX: 00000000020f5300 RSI: 00000000020f3bf8 RDI:
00000000020f36a0
[97604.721876] RBP: 0000000000000001 R08: 0000000000000000 R09:
0000000000000000
[97604.721880] R10: 00007ffe444f4b60 R11: 0000000000000202 R12:
0000000000000000
[97604.721884] R13: 0000000000000001 R14: 00000000020f36a0 R15:
0000000000000000
[97604.721890] Modules linked in: overlay rpcsec_gss_krb5 nfsv4 dns_resolver
nfsv3 nfs fscache libafs(PO) iscsi_ibft iscsi_boot_sysfs rfkill
vboxnetadp(O) vboxnetflt(O) vboxdrv(O) dmi_sysfs intel_rapl_msr
intel_rapl_common isst_if_common joydev ipmi_ssif i40iw ib_uverbs iTCO_wdt
intel_pmc_bxt ib_core hid_generic iTCO_vendor_support skx_edac nfit
libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel acpi_ipmi
usbhid kvm i40e ipmi_si ioatdma mei_me i2c_i801 irqbypass ipmi_devintf mei
i2c_smbus lpc_ich dca efi_pstore pcspkr ipmi_msghandler tiny_power_button
acpi_pad button nls_iso8859_1 nls_cp437 vfat fat nfsd nfs_acl lockd
auth_rpcgss grace sunrpc fuse configfs nfs_ssc ast i2c_algo_bit
drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
cec rc_core drm_ttm_helper xhci_pci ttm xhci_pci_renesas xhci_hcd
crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
drm glue_helper crypto_simd cryptd usbcore wmi sg br_netfilter bridge stp
llc
[97604.721991]  dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
msr efivarfs
[97604.722031] ---[ end trace edcabaccd35272e2 ]---
[97604.727773] RIP: 0010:invalidate_inode_pages2_range+0x366/0x4e0

Cheers,
--
Luís



* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-12  8:52 fuse: kernel BUG at mm/truncate.c:763! Luis Henriques
@ 2021-03-12  9:48 ` Miklos Szeredi
  2021-03-12 12:21   ` Luis Henriques
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2021-03-12  9:48 UTC (permalink / raw)
  To: Luis Henriques; +Cc: linux-fsdevel, linux-kernel

On Fri, Mar 12, 2021 at 9:51 AM Luis Henriques <lhenriques@suse.de> wrote:
>
> Hi Miklos,
>
> I've seen a bug report (5.10.16 kernel splat below) that seems to be
> reproducible in kernels as early as 5.4.
>
> The commit that caught my attention when looking at what was merged in 5.4
> was e4648309b85a ("fuse: truncate pending writes on O_TRUNC") but I didn't
> dig too deep into that -- I was wondering if you have seen something
> similar before.

Don't remember seeing this.

Excerpt from invalidate_inode_pages2_range():

        lock_page(page);
        [...]
        if (page_mapped(page)) {
             [...]
                        unmap_mapping_pages(mapping, index,
                                                1, false);
                }
        }
        BUG_ON(page_mapped(page));

Page fault locks the page before installing a new pte, at least
AFAICS, so the BUG looks impossible.  The referenced commits only
touch very high level control of writeback, so they may well increase
the chance of a bug triggering, but very unlikely to be the actual
cause of the bug.   I'm guessing this to be an MM issue.

Is this reproducible on vanilla, or just openSUSE kernels?

Thanks,
Miklos





* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-12  9:48 ` Miklos Szeredi
@ 2021-03-12 12:21   ` Luis Henriques
  2021-03-12 13:11     ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-12 12:21 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Kirill A. Shutemov, Andrew Morton, linux-fsdevel, linux-kernel

On Fri, Mar 12, 2021 at 10:48:40AM +0100, Miklos Szeredi wrote:
> On Fri, Mar 12, 2021 at 9:51 AM Luis Henriques <lhenriques@suse.de> wrote:
> >
> > Hi Miklos,
> >
> > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > reproducible in kernels as early as 5.4.
> >
> > The commit that caught my attention when looking at what was merged in 5.4
> > was e4648309b85a ("fuse: truncate pending writes on O_TRUNC") but I didn't
> > dig too deep into that -- I was wondering if you have seen something
> > similar before.
> 
> Don't remember seeing this.
> 
> Excerpt from invalidate_inode_pages2_range():
> 
>         lock_page(page);
>         [...]
>         if (page_mapped(page)) {
>              [...]
>                         unmap_mapping_pages(mapping, index,
>                                                 1, false);
>                 }
>         }
>         BUG_ON(page_mapped(page));
> 
> Page fault locks the page before installing a new pte, at least
> AFAICS, so the BUG looks impossible.  The referenced commits only
> touch very high level control of writeback, so they may well increase
> the chance of a bug triggering, but very unlikely to be the actual
> cause of the bug.   I'm guessing this to be an MM issue.

Ok, thank you for having a look at it.

Interestingly, there's a single commit to mm/truncate.c in 5.4:
ef18a1ca847b ("mm/thp: allow dropping THP from page cache").  I'm Cc'ing
Andrew and Kirill, maybe they have some ideas.

> Is this reproducible on vanilla, or just openSUSE kernels?

Well, this is on a Tumbleweed kernel, which is pretty much the stable
kernel with a few patches that AFAIK touch mostly drivers.  But I'll see
if I can get the reporter to try to reproduce it on a vanilla kernel.

Cheers,
--
Luís



* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-12 12:21   ` Luis Henriques
@ 2021-03-12 13:11     ` Matthew Wilcox
  2021-03-15  9:47       ` Luis Henriques
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2021-03-12 13:11 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Miklos Szeredi, Kirill A. Shutemov, Andrew Morton, linux-fsdevel,
	linux-kernel

On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > reproducible in kernels as early as 5.4.

If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
so we know what kind of problem we're dealing with?  Assuming the SUSE
tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.

> > Page fault locks the page before installing a new pte, at least
> > AFAICS, so the BUG looks impossible.  The referenced commits only
> > touch very high level control of writeback, so they may well increase
> > the chance of a bug triggering, but very unlikely to be the actual
> > cause of the bug.   I'm guessing this to be an MM issue.
> 
> Ok, thank you for having a look at it.
> 
> Interestingly, there's a single commit to mm/truncate.c in 5.4:
> ef18a1ca847b ("mm/thp: allow dropping THP from page cache").  I'm Cc'ing
> Andrew and Kirill, maybe they have some ideas.

That's probably not it; unless FUSE has developed the ability to insert
compound pages into the page cache without me noticing.

(if it had, that would absolutely explain it -- i have a fix in my thp
tree for this case, but it doesn't affect any existing filesystem
because only shmem uses compound pages and it doesn't call
invalidate_inode_pages2_range)
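
To illustrate why a compound page in the page cache could trip that check, here is a minimal userspace sketch. It is a toy model, not kernel code: `compound_model`, `model_page_mapped()` and `model_unmap_range()` are hypothetical names, and the assumption (not confirmed in this thread) is that the "is it mapped" test covers every subpage while the unmap call in the excerpt covers only one page-cache index.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ORDER 9                         /* assumed order of the compound page */
#define MODEL_NR_SUBPAGES (1u << MODEL_ORDER) /* 512 subpages */

/* Hypothetical stand-in for a compound page: one "mapped" flag per subpage. */
struct compound_model {
	unsigned long head_index;             /* page-cache index of the head page */
	bool subpage_mapped[MODEL_NR_SUBPAGES];
};

/* Model of a mapped check on a compound page: true if any subpage is mapped. */
static bool model_page_mapped(const struct compound_model *c)
{
	for (size_t i = 0; i < MODEL_NR_SUBPAGES; i++)
		if (c->subpage_mapped[i])
			return true;
	return false;
}

/* Model of unmapping nr page-cache indices starting at index. */
static void model_unmap_range(struct compound_model *c,
			      unsigned long index, unsigned long nr)
{
	for (unsigned long i = 0; i < nr; i++) {
		unsigned long idx = index + i;

		if (idx >= c->head_index &&
		    idx < c->head_index + MODEL_NR_SUBPAGES)
			c->subpage_mapped[idx - c->head_index] = false;
	}
}

/*
 * Returns true if a BUG_ON(page_mapped(page))-style check would still fire
 * after unmapping nr_to_unmap indices starting at the head index.
 */
static bool model_bug_would_fire(unsigned long nr_to_unmap)
{
	struct compound_model c = { .head_index = 512 };

	c.subpage_mapped[0] = true;           /* head subpage mapped */
	c.subpage_mapped[7] = true;           /* some tail subpage mapped too */
	model_unmap_range(&c, c.head_index, nr_to_unmap);
	return model_page_mapped(&c);
}
```

In this model, unmapping a single index leaves the tail subpage mapped, so the check still fires; unmapping all `MODEL_NR_SUBPAGES` indices clears it.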


* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-12 13:11     ` Matthew Wilcox
@ 2021-03-15  9:47       ` Luis Henriques
  2021-03-15 11:06         ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-15  9:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Miklos Szeredi, Kirill A. Shutemov, Andrew Morton, linux-fsdevel,
	linux-kernel

On Fri, Mar 12, 2021 at 01:11:23PM +0000, Matthew Wilcox wrote:
> On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > > reproducible in kernels as early as 5.4.
> 
> If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
> so we know what kind of problem we're dealing with?  Assuming the SUSE
> tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.

Just to make sure I got this right, you want to test something like this:

 				}
 			}
-			BUG_ON(page_mapped(page));
+			VM_BUG_ON_PAGE(page_mapped(page), page);
 			ret2 = do_launder_page(mapping, page);
 			if (ret2 == 0) {
 				if (!invalidate_complete_page2(mapping, page))
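
As a rough illustration of what the change buys, the following userspace sketch models the difference between the two checks. It is not the kernel implementation: `page_model`, `model_dump_page()`, `MODEL_VM_BUG_ON_PAGE` and `MODEL_DEBUG_VM` are made-up names, and the model returns `true` instead of crashing. The assumption is only that, with CONFIG_DEBUG_VM enabled, VM_BUG_ON_PAGE() dumps the page state before hitting the BUG, and compiles to nothing otherwise.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical page descriptor; the fields are chosen for illustration only. */
struct page_model {
	unsigned long index;
	unsigned int order;
	int mapcount;
};

/* Define as 0 to model a build without CONFIG_DEBUG_VM. */
#ifndef MODEL_DEBUG_VM
#define MODEL_DEBUG_VM 1
#endif

/* Formats a short description of the page plus the reason into buf. */
static int model_dump_page(const struct page_model *p, const char *reason,
			   char *buf, size_t len)
{
	return snprintf(buf, len,
			"page index:0x%lx order:%u mapcount:%d\n"
			"page dumped because: %s",
			p->index, p->order, p->mapcount, reason);
}

#if MODEL_DEBUG_VM
/* Evaluates to true if the check fired; buf then holds the page dump. */
#define MODEL_VM_BUG_ON_PAGE(cond, page, buf, len)			\
	((cond) ? (model_dump_page((page),				\
				   "VM_BUG_ON_PAGE(" #cond ")",		\
				   (buf), (len)), true)			\
		: false)
#else
/* Without the debug option the check disappears; cond is not evaluated. */
#define MODEL_VM_BUG_ON_PAGE(cond, page, buf, len)			\
	((void)sizeof(cond), false)
#endif
```

A plain BUG_ON() only reports the file and line, whereas the dump carries fields such as the page order, which is what helps identify the kind of page involved.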

Cheers,
--
Luís



* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-15  9:47       ` Luis Henriques
@ 2021-03-15 11:06         ` Matthew Wilcox
  2021-03-18  9:26           ` Luis Henriques
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2021-03-15 11:06 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Miklos Szeredi, Kirill A. Shutemov, Andrew Morton, linux-fsdevel,
	linux-kernel

On Mon, Mar 15, 2021 at 09:47:45AM +0000, Luis Henriques wrote:
> On Fri, Mar 12, 2021 at 01:11:23PM +0000, Matthew Wilcox wrote:
> > On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > > > reproducible in kernels as early as 5.4.
> > 
> > If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
> > so we know what kind of problem we're dealing with?  Assuming the SUSE
> > tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.
> 
> Just to make sure I got this right, you want to test something like this:
> 
>  				}
>  			}
> -			BUG_ON(page_mapped(page));
> +			VM_BUG_ON_PAGE(page_mapped(page), page);
>  			ret2 = do_launder_page(mapping, page);
>  			if (ret2 == 0) {
>  				if (!invalidate_complete_page2(mapping, page))

Yes, exactly.


* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-15 11:06         ` Matthew Wilcox
@ 2021-03-18  9:26           ` Luis Henriques
  2021-03-18 10:59             ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-18  9:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Vlastimil Babka, Miklos Szeredi, Kirill A. Shutemov,
	Andrew Morton, linux-fsdevel, linux-kernel

(I thought Vlastimil was already on CC...)

On Mon, Mar 15, 2021 at 11:06:59AM +0000, Matthew Wilcox wrote:
> On Mon, Mar 15, 2021 at 09:47:45AM +0000, Luis Henriques wrote:
> > On Fri, Mar 12, 2021 at 01:11:23PM +0000, Matthew Wilcox wrote:
> > > On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > > > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > > > > reproducible in kernels as early as 5.4.
> > > 
> > > If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
> > > so we know what kind of problem we're dealing with?  Assuming the SUSE
> > > tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.
> > 
> > Just to make sure I got this right, you want to test something like this:
> > 
> >  				}
> >  			}
> > -			BUG_ON(page_mapped(page));
> > +			VM_BUG_ON_PAGE(page_mapped(page), page);
> >  			ret2 = do_launder_page(mapping, page);
> >  			if (ret2 == 0) {
> >  				if (!invalidate_complete_page2(mapping, page))
> 
> Yes, exactly.

Ok, finally I got some feedback from the bug reporter.  Please see below
the kernel log with the VM_BUG_ON_PAGE() in place.  Also note that this is
on a vanilla 5.12-rc3.

Cheers,
--
Luís

[16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
[16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
[16247.536361] memcg:ffff8e730012b000
[16247.536364] aops:fuse_file_aops [fuse] ino:8b8 dentry name:"cc1plus"
[16247.536379] flags: 0xaffff800010037(locked|referenced|uptodate|lru|active|head)
[16247.536385] raw: 00affff800010037 ffffd6519ed9c448 ffffd651abea5b08 ffff8eb2f9a02ef8
[16247.536388] raw: 0000000000001400 0000000000000000 000002a1ffffffff ffff8e730012b000
[16247.536389] page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
[16247.536399] ------------[ cut here ]------------
[16247.536400] kernel BUG at mm/truncate.c:678!
[16247.536406] invalid opcode: 0000 [#1] SMP PTI
[16247.536416] CPU: 42 PID: 2063761 Comm: g++ Not tainted 5.12.0-rc3-1.g008d601-default #1 openSUSE Tumbleweed (unreleased)
[16247.536423] Hardware name: Supermicro X11DPi-N(T)/X11DPi-N, BIOS 3.1a 10/16/2019
[16247.536427] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
[16247.536436] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
[16247.536444] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
[16247.536450] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
[16247.536455] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
[16247.536460] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
[16247.536464] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
[16247.536468] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
[16247.536473] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
[16247.536478] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16247.536483] CR2: 00007fd48a25a7c0 CR3: 00000040aa3ac006 CR4: 00000000007706e0
[16247.536487] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[16247.536491] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[16247.536495] PKRU: 55555554
[16247.536498] Call Trace:
[16247.536506]  fuse_finish_open+0x82/0x150 [fuse]
[16247.536520]  fuse_open_common+0x1a8/0x1b0 [fuse]
[16247.536530]  ? fuse_open_common+0x1b0/0x1b0 [fuse]
[16247.536540]  do_dentry_open+0x14e/0x380
[16247.536547]  path_openat+0xaf6/0x10a0
[16247.536555]  do_filp_open+0x88/0x130
[16247.536560]  ? security_prepare_creds+0x6d/0x90
[16247.536566]  ? __kmalloc+0x157/0x2e0
[16247.536575]  do_open_execat+0x6d/0x1a0
[16247.536581]  bprm_execve+0x128/0x660
[16247.536587]  do_execveat_common+0x192/0x1c0
[16247.536593]  __x64_sys_execve+0x39/0x50
[16247.536599]  do_syscall_64+0x33/0x80
[16247.536606]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[16247.536614] RIP: 0033:0x7f97c0efec37
[16247.536621] Code: Unable to access opcode bytes at RIP 0x7f97c0efec0d.
[16247.536625] RSP: 002b:00007ffdc2fdea68 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
[16247.536631] RAX: ffffffffffffffda RBX: 00007f97c17176a0 RCX: 00007f97c0efec37
[16247.536635] RDX: 0000000000ea42c0 RSI: 0000000000ea5848 RDI: 0000000000ea5d00
[16247.536639] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[16247.536643] R10: 00007ffdc2fdde60 R11: 0000000000000202 R12: 0000000000000000
[16247.536647] R13: 0000000000000001 R14: 0000000000ea5d00 R15: 0000000000000000
[16247.536653] Modules linked in: overlay rpcsec_gss_krb5 nfsv4 dns_resolver nfsv3 nfs fscache iscsi_ibft iscsi_boot_sysfs rfkill dmi_sysfs intel_rapl_msr intel_rapl_common joydev isst_if_common ipmi_ssif i40iw ib_uverbs iTCO_wdt intel_pmc_bxt skx_edac ib_core hid_generic iTCO_vendor_support nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel acpi_ipmi kvm usbhid i2c_i801 mei_me i40e irqbypass efi_pstore pcspkr ipmi_si ioatdma i2c_smbus lpc_ich mei intel_pch_thermal dca ipmi_devintf ipmi_msghandler tiny_power_button acpi_pad button nls_iso8859_1 nls_cp437 vfat fat nfsd nfs_acl auth_rpcgss lockd grace sunrpc fuse configfs nfs_ssc ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm_ttm_helper ttm xhci_pci xhci_pci_renesas drm xhci_hcd crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd usbcore wmi sg br_netfilter bridge stp llc dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
[16247.536758]  scsi_dh_alua msr efivarfs
[16247.536800] ---[ end trace e1493f55bf5b3a34 ]---
[16247.544126] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
[16247.544140] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
[16247.544148] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
[16247.544153] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
[16247.544158] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
[16247.544162] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
[16247.544166] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
[16247.544170] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
[16247.544175] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
[16247.544180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16247.544184] CR2: 00007f97c0efec0d CR3: 00000040aa3ac006 CR4: 00000000007706e0
[16247.544188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[16247.544191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[16247.544194] PKRU: 55555554
[16247.546763] BUG: Bad rss-counter state mm:00000000060c94f4 type:MM_ANONPAGES val:8




* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-18  9:26           ` Luis Henriques
@ 2021-03-18 10:59             ` Miklos Szeredi
  2021-03-18 11:03               ` Kirill A. Shutemov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2021-03-18 10:59 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Matthew Wilcox, Vlastimil Babka, Kirill A. Shutemov,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

[CC linux-mm]

On Thu, Mar 18, 2021 at 10:25 AM Luis Henriques <lhenriques@suse.de> wrote:
>
> (I thought Vlastimil was already on CC...)
>
> On Mon, Mar 15, 2021 at 11:06:59AM +0000, Matthew Wilcox wrote:
> > On Mon, Mar 15, 2021 at 09:47:45AM +0000, Luis Henriques wrote:
> > > On Fri, Mar 12, 2021 at 01:11:23PM +0000, Matthew Wilcox wrote:
> > > > On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > > > > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > > > > > reproducible in kernels as early as 5.4.
> > > >
> > > > If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
> > > > so we know what kind of problem we're dealing with?  Assuming the SUSE
> > > > tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.
> > >
> > > Just to make sure I got this right, you want to test something like this:
> > >
> > >                             }
> > >                     }
> > > -                   BUG_ON(page_mapped(page));
> > > +                   VM_BUG_ON_PAGE(page_mapped(page), page);
> > >                     ret2 = do_launder_page(mapping, page);
> > >                     if (ret2 == 0) {
> > >                             if (!invalidate_complete_page2(mapping, page))
> >
> > Yes, exactly.
>
> Ok, finally I got some feedback from the bug reporter.  Please see below
> the kernel log with the VM_BUG_ON_PAGE() in place.  Also note that this is
> on a 5.12-rc3, vanilla.
>
> Cheers,
> --
> Luís
>
> [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0

This is a compound page alright.   Have no idea how it got into fuse's
pagecache.
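
For readers following the dump, the two quoted lines above can be unpacked with a little arithmetic.  The sketch below is an annotation rather than part of the original exchange.  It assumes 4 KiB base pages (a PAGE_SHIFT of 12, as on x86-64), and the helper name describe_compound_page is made up for illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB base pages, as on x86-64


def describe_compound_page(order, index, pfn, page_shift=PAGE_SHIFT):
    """Derive sizes and alignment from the fields printed in the page dump."""
    nr_pages = 1 << order  # a compound page of order N spans 2**N base pages
    return {
        "nr_pages": nr_pages,
        "size_bytes": nr_pages << page_shift,
        "file_offset_bytes": index << page_shift,  # index is in units of base pages
        "index_aligned": index % nr_pages == 0,
        "pfn_aligned": pfn % nr_pages == 0,
    }


# Values copied from the quoted dump: order:9 index:0x1400 pfn:0x4c65e00
info = describe_compound_page(order=9, index=0x1400, pfn=0x4C65E00)
for key, value in info.items():
    print(f"{key}: {value}")
```

Under those assumptions, order:9 works out to 512 base pages (2 MiB), the size of a PMD-mapped huge page on x86-64, and both the file index and the pfn are multiples of 512.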


> [16247.536361] memcg:ffff8e730012b000
> [16247.536364] aops:fuse_file_aops [fuse] ino:8b8 dentry name:"cc1plus"
> [16247.536379] flags: 0xaffff800010037(locked|referenced|uptodate|lru|active|head)
> [16247.536385] raw: 00affff800010037 ffffd6519ed9c448 ffffd651abea5b08 ffff8eb2f9a02ef8
> [16247.536388] raw: 0000000000001400 0000000000000000 000002a1ffffffff ffff8e730012b000
> [16247.536389] page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
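
As a cross-check on the flags line above, the low bits of the raw value can be decoded by hand.  This is only a sketch: the bit positions in PAGE_FLAG_BITS are an assumption (they match the names the kernel itself printed in this dump, but the layout can differ between kernel versions and configurations), the table lists only a handful of flags, and the upper bits of the word, which carry other fields, are ignored.

```python
# Assumed bit positions for a few page flags; not a complete table.
PAGE_FLAG_BITS = {
    0: "locked",
    1: "referenced",
    2: "uptodate",
    3: "dirty",
    4: "lru",
    5: "active",
    16: "head",
}


def decode_page_flags(raw):
    """Return the names of the known flag bits that are set in raw."""
    return [name for bit, name in sorted(PAGE_FLAG_BITS.items()) if (raw >> bit) & 1]


# First word of the "raw:" line in the quoted dump.
names = decode_page_flags(0x00AFFFF800010037)
print("|".join(names))
```

The "head" bit is what marks this page as the head of a compound page; "dirty" is absent, so the page had no pending modifications according to this dump.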
> [16247.536399] ------------[ cut here ]------------
> [16247.536400] kernel BUG at mm/truncate.c:678!
> [16247.536406] invalid opcode: 0000 [#1] SMP PTI
> [16247.536416] CPU: 42 PID: 2063761 Comm: g++ Not tainted 5.12.0-rc3-1.g008d601-default #1 openSUSE Tumbleweed (unreleased)
> [16247.536423] Hardware name: Supermicro X11DPi-N(T)/X11DPi-N, BIOS 3.1a 10/16/2019
> [16247.536427] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
> [16247.536436] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
> [16247.536444] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
> [16247.536450] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
> [16247.536455] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
> [16247.536460] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
> [16247.536464] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
> [16247.536468] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
> [16247.536473] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
> [16247.536478] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [16247.536483] CR2: 00007fd48a25a7c0 CR3: 00000040aa3ac006 CR4: 00000000007706e0
> [16247.536487] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [16247.536491] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [16247.536495] PKRU: 55555554
> [16247.536498] Call Trace:
> [16247.536506]  fuse_finish_open+0x82/0x150 [fuse]
> [16247.536520]  fuse_open_common+0x1a8/0x1b0 [fuse]
> [16247.536530]  ? fuse_open_common+0x1b0/0x1b0 [fuse]
> [16247.536540]  do_dentry_open+0x14e/0x380
> [16247.536547]  path_openat+0xaf6/0x10a0
> [16247.536555]  do_filp_open+0x88/0x130
> [16247.536560]  ? security_prepare_creds+0x6d/0x90
> [16247.536566]  ? __kmalloc+0x157/0x2e0
> [16247.536575]  do_open_execat+0x6d/0x1a0
> [16247.536581]  bprm_execve+0x128/0x660
> [16247.536587]  do_execveat_common+0x192/0x1c0
> [16247.536593]  __x64_sys_execve+0x39/0x50
> [16247.536599]  do_syscall_64+0x33/0x80
> [16247.536606]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [16247.536614] RIP: 0033:0x7f97c0efec37
> [16247.536621] Code: Unable to access opcode bytes at RIP 0x7f97c0efec0d.
> [16247.536625] RSP: 002b:00007ffdc2fdea68 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
> [16247.536631] RAX: ffffffffffffffda RBX: 00007f97c17176a0 RCX: 00007f97c0efec37
> [16247.536635] RDX: 0000000000ea42c0 RSI: 0000000000ea5848 RDI: 0000000000ea5d00
> [16247.536639] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> [16247.536643] R10: 00007ffdc2fdde60 R11: 0000000000000202 R12: 0000000000000000
> [16247.536647] R13: 0000000000000001 R14: 0000000000ea5d00 R15: 0000000000000000
> [16247.536653] Modules linked in: overlay rpcsec_gss_krb5 nfsv4 dns_resolver nfsv3 nfs fscache iscsi_ibft iscsi_boot_sysfs rfkill dmi_sysfs intel_rapl_msr intel_rapl_common joydev isst_if_common ipmi_ssif i40iw ib_uverbs iTCO_wdt intel_pmc_bxt skx_edac ib_core hid_generic iTCO_vendor_support nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel acpi_ipmi kvm usbhid i2c_i801 mei_me i40e irqbypass efi_pstore pcspkr ipmi_si ioatdma i2c_smbus lpc_ich mei intel_pch_thermal dca ipmi_devintf ipmi_msghandler tiny_power_button acpi_pad button nls_iso8859_1 nls_cp437 vfat fat nfsd nfs_acl auth_rpcgss lockd grace sunrpc fuse configfs nfs_ssc ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm_ttm_helper ttm xhci_pci xhci_pci_renesas drm xhci_hcd crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd usbcore wmi sg br_netfilter bridge stp llc dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
> [16247.536758]  scsi_dh_alua msr efivarfs
> [16247.536800] ---[ end trace e1493f55bf5b3a34 ]---
> [16247.544126] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
> [16247.544140] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
> [16247.544148] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
> [16247.544153] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
> [16247.544158] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
> [16247.544162] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
> [16247.544166] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
> [16247.544170] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
> [16247.544175] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
> [16247.544180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [16247.544184] CR2: 00007f97c0efec0d CR3: 00000040aa3ac006 CR4: 00000000007706e0
> [16247.544188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [16247.544191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [16247.544194] PKRU: 55555554
> [16247.546763] BUG: Bad rss-counter state mm:00000000060c94f4 type:MM_ANONPAGES val:8
>
>

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-18 10:59             ` Miklos Szeredi
@ 2021-03-18 11:03               ` Kirill A. Shutemov
  2021-03-18 11:29                 ` Luis Henriques
  0 siblings, 1 reply; 16+ messages in thread
From: Kirill A. Shutemov @ 2021-03-18 11:03 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Miklos Szeredi, Matthew Wilcox, Vlastimil Babka, Andrew Morton,
	linux-fsdevel, linux-kernel, linux-mm

On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> [CC linux-mm]
> 
> On Thu, Mar 18, 2021 at 10:25 AM Luis Henriques <lhenriques@suse.de> wrote:
> >
> > (I thought Vlastimil was already on CC...)
> >
> > On Mon, Mar 15, 2021 at 11:06:59AM +0000, Matthew Wilcox wrote:
> > > On Mon, Mar 15, 2021 at 09:47:45AM +0000, Luis Henriques wrote:
> > > > On Fri, Mar 12, 2021 at 01:11:23PM +0000, Matthew Wilcox wrote:
> > > > > On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > > > > > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > > > > > > reproducible in kernels as early as 5.4.
> > > > >
> > > > > If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
> > > > > so we know what kind of problem we're dealing with?  Assuming the SUSE
> > > > > tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.
> > > >
> > > > Just to make sure I got this right, you want to test something like this:
> > > >
> > > >                             }
> > > >                     }
> > > > -                   BUG_ON(page_mapped(page));
> > > > +                   VM_BUG_ON_PAGE(page_mapped(page), page);
> > > >                     ret2 = do_launder_page(mapping, page);
> > > >                     if (ret2 == 0) {
> > > >                             if (!invalidate_complete_page2(mapping, page))
> > >
> > > Yes, exactly.
> >
> > Ok, finally I got some feedback from the bug reporter.  Please see below
> > the kernel log with the VM_BUG_ON_PAGE() in place.  Also note that this is
> > on a 5.12-rc3, vanilla.
> >
> > Cheers,
> > --
> > Luís
> >
> > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> 
> This is a compound page alright.   Have no idea how it got into fuse's
> pagecache.


Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?

> > [16247.536361] memcg:ffff8e730012b000
> > [16247.536364] aops:fuse_file_aops [fuse] ino:8b8 dentry name:"cc1plus"
> > [16247.536379] flags: 0xaffff800010037(locked|referenced|uptodate|lru|active|head)
> > [16247.536385] raw: 00affff800010037 ffffd6519ed9c448 ffffd651abea5b08 ffff8eb2f9a02ef8
> > [16247.536388] raw: 0000000000001400 0000000000000000 000002a1ffffffff ffff8e730012b000
> > [16247.536389] page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
> > [16247.536399] ------------[ cut here ]------------
> > [16247.536400] kernel BUG at mm/truncate.c:678!
> > [16247.536406] invalid opcode: 0000 [#1] SMP PTI
> > [16247.536416] CPU: 42 PID: 2063761 Comm: g++ Not tainted 5.12.0-rc3-1.g008d601-default #1 openSUSE Tumbleweed (unreleased)
> > [16247.536423] Hardware name: Supermicro X11DPi-N(T)/X11DPi-N, BIOS 3.1a 10/16/2019
> > [16247.536427] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
> > [16247.536436] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
> > [16247.536444] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
> > [16247.536450] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
> > [16247.536455] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
> > [16247.536460] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
> > [16247.536464] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
> > [16247.536468] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
> > [16247.536473] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
> > [16247.536478] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [16247.536483] CR2: 00007fd48a25a7c0 CR3: 00000040aa3ac006 CR4: 00000000007706e0
> > [16247.536487] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [16247.536491] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [16247.536495] PKRU: 55555554
> > [16247.536498] Call Trace:
> > [16247.536506]  fuse_finish_open+0x82/0x150 [fuse]
> > [16247.536520]  fuse_open_common+0x1a8/0x1b0 [fuse]
> > [16247.536530]  ? fuse_open_common+0x1b0/0x1b0 [fuse]
> > [16247.536540]  do_dentry_open+0x14e/0x380
> > [16247.536547]  path_openat+0xaf6/0x10a0
> > [16247.536555]  do_filp_open+0x88/0x130
> > [16247.536560]  ? security_prepare_creds+0x6d/0x90
> > [16247.536566]  ? __kmalloc+0x157/0x2e0
> > [16247.536575]  do_open_execat+0x6d/0x1a0
> > [16247.536581]  bprm_execve+0x128/0x660
> > [16247.536587]  do_execveat_common+0x192/0x1c0
> > [16247.536593]  __x64_sys_execve+0x39/0x50
> > [16247.536599]  do_syscall_64+0x33/0x80
> > [16247.536606]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [16247.536614] RIP: 0033:0x7f97c0efec37
> > [16247.536621] Code: Unable to access opcode bytes at RIP 0x7f97c0efec0d.
> > [16247.536625] RSP: 002b:00007ffdc2fdea68 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
> > [16247.536631] RAX: ffffffffffffffda RBX: 00007f97c17176a0 RCX: 00007f97c0efec37
> > [16247.536635] RDX: 0000000000ea42c0 RSI: 0000000000ea5848 RDI: 0000000000ea5d00
> > [16247.536639] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> > [16247.536643] R10: 00007ffdc2fdde60 R11: 0000000000000202 R12: 0000000000000000
> > [16247.536647] R13: 0000000000000001 R14: 0000000000ea5d00 R15: 0000000000000000
> > [16247.536653] Modules linked in: overlay rpcsec_gss_krb5 nfsv4 dns_resolver nfsv3 nfs fscache iscsi_ibft iscsi_boot_sysfs rfkill dmi_sysfs intel_rapl_msr intel_rapl_common joydev isst_if_common ipmi_ssif i40iw ib_uverbs iTCO_wdt intel_pmc_bxt skx_edac ib_core hid_generic iTCO_vendor_support nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel acpi_ipmi kvm usbhid i2c_i801 mei_me i40e irqbypass efi_pstore pcspkr ipmi_si ioatdma i2c_smbus lpc_ich mei intel_pch_thermal dca ipmi_devintf ipmi_msghandler tiny_power_button acpi_pad button nls_iso8859_1 nls_cp437 vfat fat nfsd nfs_acl auth_rpcgss lockd grace sunrpc fuse configfs nfs_ssc ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm_ttm_helper ttm xhci_pci xhci_pci_renesas drm xhci_hcd crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd usbcore wmi sg br_netfilter bridge stp llc dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
> > [16247.536758]  scsi_dh_alua msr efivarfs
> > [16247.536800] ---[ end trace e1493f55bf5b3a34 ]---
> > [16247.544126] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
> > [16247.544140] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
> > [16247.544148] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
> > [16247.544153] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
> > [16247.544158] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
> > [16247.544162] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
> > [16247.544166] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
> > [16247.544170] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
> > [16247.544175] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
> > [16247.544180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [16247.544184] CR2: 00007f97c0efec0d CR3: 00000040aa3ac006 CR4: 00000000007706e0
> > [16247.544188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [16247.544191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [16247.544194] PKRU: 55555554
> > [16247.546763] BUG: Bad rss-counter state mm:00000000060c94f4 type:MM_ANONPAGES val:8
> >
> >

-- 
 Kirill A. Shutemov

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-18 11:03               ` Kirill A. Shutemov
@ 2021-03-18 11:29                 ` Luis Henriques
  2021-03-18 11:55                   ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-18 11:29 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Miklos Szeredi, Matthew Wilcox, Vlastimil Babka, Andrew Morton,
	linux-fsdevel, linux-kernel, linux-mm

On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
> On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> > [CC linux-mm]
> > 
> > On Thu, Mar 18, 2021 at 10:25 AM Luis Henriques <lhenriques@suse.de> wrote:
> > >
> > > (I thought Vlastimil was already on CC...)
> > >
> > > On Mon, Mar 15, 2021 at 11:06:59AM +0000, Matthew Wilcox wrote:
> > > > On Mon, Mar 15, 2021 at 09:47:45AM +0000, Luis Henriques wrote:
> > > > > On Fri, Mar 12, 2021 at 01:11:23PM +0000, Matthew Wilcox wrote:
> > > > > > On Fri, Mar 12, 2021 at 12:21:59PM +0000, Luis Henriques wrote:
> > > > > > > > > I've seen a bug report (5.10.16 kernel splat below) that seems to be
> > > > > > > > > reproducible in kernels as early as 5.4.
> > > > > >
> > > > > > If this is reproducible, can you turn this BUG_ON into a VM_BUG_ON_PAGE()
> > > > > > so we know what kind of problem we're dealing with?  Assuming the SUSE
> > > > > > tumbleweed kernels enable CONFIG_DEBUG_VM, which I'm sure they do.
> > > > >
> > > > > Just to make sure I got this right, you want to test something like this:
> > > > >
> > > > >                             }
> > > > >                     }
> > > > > -                   BUG_ON(page_mapped(page));
> > > > > +                   VM_BUG_ON_PAGE(page_mapped(page), page);
> > > > >                     ret2 = do_launder_page(mapping, page);
> > > > >                     if (ret2 == 0) {
> > > > >                             if (!invalidate_complete_page2(mapping, page))
> > > >
> > > > Yes, exactly.
> > >
> > > Ok, finally I got some feedback from the bug reporter.  Please see below
> > > the kernel log with the VM_BUG_ON_PAGE() in place.  Also note that this is
> > > on a 5.12-rc3, vanilla.
> > >
> > > Cheers,
> > > --
> > > Luís
> > >
> > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> > 
> > This is a compound page alright.   Have no idea how it got into fuse's
> > pagecache.
> 
> 
> Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?

Yes, it looks like Tumbleweed kernels have that config option enabled by
default.  And this feature was introduced in 5.4 (the bug doesn't seem
to be reproducible in 5.3).

Cheers,
--
Luís
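
For anyone wanting to answer the same question on their own system, a kernel config file can be scanned for the option.  The snippet below is a minimal sketch and not something posted in the thread: kconfig_value is a hypothetical helper, and SAMPLE_CONFIG is made-up text for illustration, not the reporter's actual configuration.  On many distributions the real file lives at /boot/config-<release>, or at /proc/config.gz when the kernel was built with CONFIG_IKCONFIG_PROC.

```python
def kconfig_value(config_text, option):
    """Return the value of a kconfig option, "n" if explicitly unset, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return "n"
    return None


# Illustrative input only; CONFIG_HYPOTHETICAL_OPTION is not a real option.
SAMPLE_CONFIG = """\
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_READ_ONLY_THP_FOR_FS=y
# CONFIG_HYPOTHETICAL_OPTION is not set
"""

print(kconfig_value(SAMPLE_CONFIG, "CONFIG_READ_ONLY_THP_FOR_FS"))
```

To check a real system, read the text of the config file and pass it in place of SAMPLE_CONFIG.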


> > > [16247.536361] memcg:ffff8e730012b000
> > > [16247.536364] aops:fuse_file_aops [fuse] ino:8b8 dentry name:"cc1plus"
> > > [16247.536379] flags: 0xaffff800010037(locked|referenced|uptodate|lru|active|head)
> > > [16247.536385] raw: 00affff800010037 ffffd6519ed9c448 ffffd651abea5b08 ffff8eb2f9a02ef8
> > > [16247.536388] raw: 0000000000001400 0000000000000000 000002a1ffffffff ffff8e730012b000
> > > [16247.536389] page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
> > > [16247.536399] ------------[ cut here ]------------
> > > [16247.536400] kernel BUG at mm/truncate.c:678!
> > > [16247.536406] invalid opcode: 0000 [#1] SMP PTI
> > > [16247.536416] CPU: 42 PID: 2063761 Comm: g++ Not tainted 5.12.0-rc3-1.g008d601-default #1 openSUSE Tumbleweed (unreleased)
> > > [16247.536423] Hardware name: Supermicro X11DPi-N(T)/X11DPi-N, BIOS 3.1a 10/16/2019
> > > [16247.536427] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
> > > [16247.536436] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
> > > [16247.536444] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
> > > [16247.536450] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
> > > [16247.536455] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
> > > [16247.536460] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
> > > [16247.536464] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
> > > [16247.536468] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
> > > [16247.536473] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
> > > [16247.536478] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [16247.536483] CR2: 00007fd48a25a7c0 CR3: 00000040aa3ac006 CR4: 00000000007706e0
> > > [16247.536487] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [16247.536491] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [16247.536495] PKRU: 55555554
> > > [16247.536498] Call Trace:
> > > [16247.536506]  fuse_finish_open+0x82/0x150 [fuse]
> > > [16247.536520]  fuse_open_common+0x1a8/0x1b0 [fuse]
> > > [16247.536530]  ? fuse_open_common+0x1b0/0x1b0 [fuse]
> > > [16247.536540]  do_dentry_open+0x14e/0x380
> > > [16247.536547]  path_openat+0xaf6/0x10a0
> > > [16247.536555]  do_filp_open+0x88/0x130
> > > [16247.536560]  ? security_prepare_creds+0x6d/0x90
> > > [16247.536566]  ? __kmalloc+0x157/0x2e0
> > > [16247.536575]  do_open_execat+0x6d/0x1a0
> > > [16247.536581]  bprm_execve+0x128/0x660
> > > [16247.536587]  do_execveat_common+0x192/0x1c0
> > > [16247.536593]  __x64_sys_execve+0x39/0x50
> > > [16247.536599]  do_syscall_64+0x33/0x80
> > > [16247.536606]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > [16247.536614] RIP: 0033:0x7f97c0efec37
> > > [16247.536621] Code: Unable to access opcode bytes at RIP 0x7f97c0efec0d.
> > > [16247.536625] RSP: 002b:00007ffdc2fdea68 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
> > > [16247.536631] RAX: ffffffffffffffda RBX: 00007f97c17176a0 RCX: 00007f97c0efec37
> > > [16247.536635] RDX: 0000000000ea42c0 RSI: 0000000000ea5848 RDI: 0000000000ea5d00
> > > [16247.536639] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> > > [16247.536643] R10: 00007ffdc2fdde60 R11: 0000000000000202 R12: 0000000000000000
> > > [16247.536647] R13: 0000000000000001 R14: 0000000000ea5d00 R15: 0000000000000000
> > > [16247.536653] Modules linked in: overlay rpcsec_gss_krb5 nfsv4 dns_resolver nfsv3 nfs fscache iscsi_ibft iscsi_boot_sysfs rfkill dmi_sysfs intel_rapl_msr intel_rapl_common joydev isst_if_common ipmi_ssif i40iw ib_uverbs iTCO_wdt intel_pmc_bxt skx_edac ib_core hid_generic iTCO_vendor_support nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel acpi_ipmi kvm usbhid i2c_i801 mei_me i40e irqbypass efi_pstore pcspkr ipmi_si ioatdma i2c_smbus lpc_ich mei intel_pch_thermal dca ipmi_devintf ipmi_msghandler tiny_power_button acpi_pad button nls_iso8859_1 nls_cp437 vfat fat nfsd nfs_acl auth_rpcgss lockd grace sunrpc fuse configfs nfs_ssc ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm_ttm_helper ttm xhci_pci xhci_pci_renesas drm xhci_hcd crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd usbcore wmi sg br_netfilter bridge stp llc dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
> > > [16247.536758]  scsi_dh_alua msr efivarfs
> > > [16247.536800] ---[ end trace e1493f55bf5b3a34 ]---
> > > [16247.544126] RIP: 0010:invalidate_inode_pages2_range+0x3b4/0x550
> > > [16247.544140] Code: 00 00 00 4c 89 e6 e8 eb 0f 03 00 4c 89 ff e8 63 40 01 00 84 c0 0f 84 23 fe ff ff 48 c7 c6 d0 1d f4 b1 4c 89 ff e8 ec 82 02 00 <0f> 0b 48 8b 45 78 48 8b 80 80 00 00 00 48 85 c0 0f 84 fb fe ff ff
> > > [16247.544148] RSP: 0000:ffffa18cb0af7a40 EFLAGS: 00010246
> > > [16247.544153] RAX: 0000000000000036 RBX: 000000000000000d RCX: ffff8ef13fc9a748
> > > [16247.544158] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8ef13fc9a740
> > > [16247.544162] RBP: ffff8eb2f9a02ef8 R08: ffff8ef23ffb48a8 R09: 000000000004fffb
> > > [16247.544166] R10: 00000000ffff0000 R11: 3fffffffffffffff R12: 0000000000001400
> > > [16247.544170] R13: ffff8eb2f9a02f00 R14: 0000000000000000 R15: ffffd651b1978000
> > > [16247.544175] FS:  00007f97c1717740(0000) GS:ffff8ef13fc80000(0000) knlGS:0000000000000000
> > > [16247.544180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [16247.544184] CR2: 00007f97c0efec0d CR3: 00000040aa3ac006 CR4: 00000000007706e0
> > > [16247.544188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [16247.544191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [16247.544194] PKRU: 55555554
> > > [16247.546763] BUG: Bad rss-counter state mm:00000000060c94f4 type:MM_ANONPAGES val:8
> > >
> > >
> 
> -- 
>  Kirill A. Shutemov

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-18 11:29                 ` Luis Henriques
@ 2021-03-18 11:55                   ` Matthew Wilcox
  2021-03-18 12:16                     ` Luis Henriques
  2021-03-19  9:02                     ` Luis Henriques
  0 siblings, 2 replies; 16+ messages in thread
From: Matthew Wilcox @ 2021-03-18 11:55 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Kirill A. Shutemov, Miklos Szeredi, Vlastimil Babka,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

On Thu, Mar 18, 2021 at 11:29:28AM +0000, Luis Henriques wrote:
> On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> > > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> > > 
> > > This is a compound page alright.   Have no idea how it got into fuse's
> > > pagecache.
> > 
> > 
> > Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?
> 
> Yes, it looks like Tumbleweed kernels have that config option enabled by
> default.  And this feature was introduced in 5.4 (the bug doesn't seem
> to be reproducible in 5.3).

Can you try adding this patch?

https://git.infradead.org/users/willy/pagecache.git/commitdiff/369a4fcd78369b7a026bdef465af9669bde98ef4

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-18 11:55                   ` Matthew Wilcox
@ 2021-03-18 12:16                     ` Luis Henriques
  2021-03-19  9:02                     ` Luis Henriques
  1 sibling, 0 replies; 16+ messages in thread
From: Luis Henriques @ 2021-03-18 12:16 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kirill A. Shutemov, Miklos Szeredi, Vlastimil Babka,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

On Thu, Mar 18, 2021 at 11:55:43AM +0000, Matthew Wilcox wrote:
> On Thu, Mar 18, 2021 at 11:29:28AM +0000, Luis Henriques wrote:
> > On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
> > > On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> > > > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > > > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> > > > 
> > > > This is a compound page alright.   Have no idea how it got into fuse's
> > > > pagecache.
> > > 
> > > 
> > > Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?
> > 
> > Yes, it looks like Tumbleweed kernels have that config option enabled by
> > default.  And this feature was introduced in 5.4 (the bug doesn't seem
> > to be reproducible in 5.3).
> 
> Can you try adding this patch?
> 
> https://git.infradead.org/users/willy/pagecache.git/commitdiff/369a4fcd78369b7a026bdef465af9669bde98ef4

Yep, sure.  Unfortunately, the testing round-trip can be a bit long.  I'll
push a new kernel build and ask the reporter to give it a try.

[ I'll add this patch on top of the s/BUG_ON/VM_BUG_ON_PAGE change. ]

Cheers,
--
Luís

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-18 11:55                   ` Matthew Wilcox
  2021-03-18 12:16                     ` Luis Henriques
@ 2021-03-19  9:02                     ` Luis Henriques
  2021-03-29  9:01                       ` Luis Henriques
  1 sibling, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-19  9:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kirill A. Shutemov, Miklos Szeredi, Vlastimil Babka,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

On Thu, Mar 18, 2021 at 11:55:43AM +0000, Matthew Wilcox wrote:
> On Thu, Mar 18, 2021 at 11:29:28AM +0000, Luis Henriques wrote:
> > On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
> > > On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> > > > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > > > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> > > > 
> > > > This is a compound page alright.   Have no idea how it got into fuse's
> > > > pagecache.
> > > 
> > > 
> > > Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?
> > 
> > Yes, it looks like Tumbleweed kernels have that config option enabled by
> > default.  And this feature was introduced in 5.4 (the bug doesn't seem
> > to be reproducible in 5.3).
> 
> Can you try adding this patch?
> 
> https://git.infradead.org/users/willy/pagecache.git/commitdiff/369a4fcd78369b7a026bdef465af9669bde98ef4

Good news, looks like this patch fixes the issue[1].  Thanks a lot
everyone.  Is this already queued somewhere for 5.12?  Also, it would be
nice to have it Cc'ed for stable kernels >= 5.4.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1182929#c24

Cheers,
--
Luís

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-19  9:02                     ` Luis Henriques
@ 2021-03-29  9:01                       ` Luis Henriques
  2021-03-29 12:05                         ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Henriques @ 2021-03-29  9:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kirill A. Shutemov, Miklos Szeredi, Vlastimil Babka,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

On Fri, Mar 19, 2021 at 09:02:33AM +0000, Luis Henriques wrote:
> On Thu, Mar 18, 2021 at 11:55:43AM +0000, Matthew Wilcox wrote:
> > On Thu, Mar 18, 2021 at 11:29:28AM +0000, Luis Henriques wrote:
> > > On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
> > > > On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> > > > > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > > > > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> > > > > 
> > > > > This is a compound page alright.   Have no idea how it got into fuse's
> > > > > pagecache.
> > > > 
> > > > 
> > > > Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?
> > > 
> > > Yes, it looks like Tumbleweed kernels have that config option enabled by
> > > default.  And this feature was introduced in 5.4 (the bug doesn't seem
> > > to be reproducible in 5.3).
> > 
> > Can you try adding this patch?
> > 
> > https://git.infradead.org/users/willy/pagecache.git/commitdiff/369a4fcd78369b7a026bdef465af9669bde98ef4
> 
> Good news, looks like this patch fixes the issue[1].  Thanks a lot
> everyone.  Is this already queued somewhere for 5.12?  Also, it would be
> nice to have it Cc'ed for stable kernels >= 5.4.

Ping.  Are you planning to push this for 5.12, or is that queued for the
5.13 merge window?  Or "none of the above"? :)

Cheers,
--
Luís

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-29  9:01                       ` Luis Henriques
@ 2021-03-29 12:05                         ` Matthew Wilcox
  2021-05-03  8:52                           ` Luis Henriques
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2021-03-29 12:05 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Kirill A. Shutemov, Miklos Szeredi, Vlastimil Babka,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

On Mon, Mar 29, 2021 at 10:01:58AM +0100, Luis Henriques wrote:
> On Fri, Mar 19, 2021 at 09:02:33AM +0000, Luis Henriques wrote:
> > On Thu, Mar 18, 2021 at 11:55:43AM +0000, Matthew Wilcox wrote:
> > > On Thu, Mar 18, 2021 at 11:29:28AM +0000, Luis Henriques wrote:
> > > > On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
> > > > > On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
> > > > > > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
> > > > > > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
> > > > > > 
> > > > > > This is a compound page alright.   Have no idea how it got into fuse's
> > > > > > pagecache.
> > > > > 
> > > > > 
> > > > > Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?
> > > > 
> > > > Yes, it looks like Tumbleweed kernels have that config option enabled by
> > > > default.  And this feature was introduced in 5.4 (the bug doesn't seem
> > > > to be reproducible in 5.3).
> > > 
> > > Can you try adding this patch?
> > > 
> > > https://git.infradead.org/users/willy/pagecache.git/commitdiff/369a4fcd78369b7a026bdef465af9669bde98ef4
> > 
> > Good news, looks like this patch fixes the issue[1].  Thanks a lot
> > everyone.  Is this already queued somewhere for 5.12?  Also, it would be
> > nice to have it Cc'ed for stable kernels >= 5.4.
> 
> Ping.  Are you planning to push this for 5.12, or is that queued for the
> 5.13 merge window?  Or "none of the above"? :)

Sorry, dropped the ball on this one.  This patch is good for that point
in the patch series, but I'm not sure it works against upstream in all
cases.  I need to spend some time evaluating it.  Thanks for the reminder.

* Re: fuse: kernel BUG at mm/truncate.c:763!
  2021-03-29 12:05                         ` Matthew Wilcox
@ 2021-05-03  8:52                           ` Luis Henriques
  0 siblings, 0 replies; 16+ messages in thread
From: Luis Henriques @ 2021-05-03  8:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kirill A. Shutemov, Miklos Szeredi, Vlastimil Babka,
	Andrew Morton, linux-fsdevel, linux-kernel, linux-mm

Matthew Wilcox <willy@infradead.org> writes:

> On Mon, Mar 29, 2021 at 10:01:58AM +0100, Luis Henriques wrote:
>> On Fri, Mar 19, 2021 at 09:02:33AM +0000, Luis Henriques wrote:
>> > On Thu, Mar 18, 2021 at 11:55:43AM +0000, Matthew Wilcox wrote:
>> > > On Thu, Mar 18, 2021 at 11:29:28AM +0000, Luis Henriques wrote:
>> > > > On Thu, Mar 18, 2021 at 02:03:02PM +0300, Kirill A. Shutemov wrote:
>> > > > > On Thu, Mar 18, 2021 at 11:59:59AM +0100, Miklos Szeredi wrote:
>> > > > > > > [16247.536348] page:00000000dfe36ab1 refcount:673 mapcount:0 mapping:00000000f982a7f8 index:0x1400 pfn:0x4c65e00
>> > > > > > > [16247.536359] head:00000000dfe36ab1 order:9 compound_mapcount:0 compound_pincount:0
>> > > > > > 
>> > > > > > This is a compound page alright.   Have no idea how it got into fuse's
>> > > > > > pagecache.
>> > > > > 
>> > > > > 
>> > > > > Luis, do you have CONFIG_READ_ONLY_THP_FOR_FS enabled?
>> > > > 
>> > > > Yes, it looks like Tumbleweed kernels have that config option enabled by
>> > > > default.  And this feature was introduced in 5.4 (the bug doesn't seem
>> > > > to be reproducible in 5.3).
>> > > 
>> > > Can you try adding this patch?
>> > > 
>> > > https://git.infradead.org/users/willy/pagecache.git/commitdiff/369a4fcd78369b7a026bdef465af9669bde98ef4
>> > 
>> > Good news, looks like this patch fixes the issue[1].  Thanks a lot
>> > everyone.  Is this already queued somewhere for 5.12?  Also, it would be
>> > nice to have it Cc'ed for stable kernels >= 5.4.
>> 
>> Ping.  Are you planning to push this for 5.12, or is that queued for the
>> 5.13 merge window?  Or "none of the above"? :)
>
> Sorry, dropped the ball on this one.  This patch is good for that point
> in the patch series, but I'm not sure it works against upstream in all
> cases.  I need to spend some time evaluating it.  Thanks for the reminder.

Gentle ping :-)

Any chance of getting this into 5.13?  (And tagged for stable kernels.)

Cheers,
-- 
Luis

Thread overview: 16+ messages
2021-03-12  8:52 fuse: kernel BUG at mm/truncate.c:763! Luis Henriques
2021-03-12  9:48 ` Miklos Szeredi
2021-03-12 12:21   ` Luis Henriques
2021-03-12 13:11     ` Matthew Wilcox
2021-03-15  9:47       ` Luis Henriques
2021-03-15 11:06         ` Matthew Wilcox
2021-03-18  9:26           ` Luis Henriques
2021-03-18 10:59             ` Miklos Szeredi
2021-03-18 11:03               ` Kirill A. Shutemov
2021-03-18 11:29                 ` Luis Henriques
2021-03-18 11:55                   ` Matthew Wilcox
2021-03-18 12:16                     ` Luis Henriques
2021-03-19  9:02                     ` Luis Henriques
2021-03-29  9:01                       ` Luis Henriques
2021-03-29 12:05                         ` Matthew Wilcox
2021-05-03  8:52                           ` Luis Henriques