From: Wang Yugui <wangyugui@e16-tech.com>
To: Filipe Manana <fdmanana@kernel.org>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] btrfs: fix deadlock due to page faults during direct IO reads and writes
Date: Sat, 23 Oct 2021 11:58:52 +0800
Message-ID: <20211023115852.2517.409509F4@e16-tech.com>
In-Reply-To: <CAL3q7H5t6mL8G8-8QuUBOZDR-oniSosPHZCNo81dFQTcZXigQw@mail.gmail.com>
Hi,
With this new patch, xfstests generic/475 and generic/650 both work well.
Thanks a lot.
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/10/23
> On Fri, Oct 22, 2021 at 1:12 PM Wang Yugui <wangyugui@e16-tech.com> wrote:
> >
> > Hi,
> >
> > > On Fri, Oct 22, 2021 at 6:59 AM Wang Yugui <wangyugui@e16-tech.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I noticed an infinite loop in fstests/generic/475 when I applied this patch
> > > > and "[PATCH v9 00/17] gfs2: Fix mmap + page fault deadlocks"
> > >
> > > You mean v8? I can't find v9 anywhere.
> >
> > Yes. It is v8.
> >
> >
> > > >
> > > > Reproduction frequency: about 50%.
> > >
> > > with v8, on top of current misc-next, I couldn't trigger any issues
> > > after running g/475 for 50+ times.
> > >
> > > >
> > > > Call Trace 1:
> > > > Oct 22 06:13:06 T7610 kernel: task:fsstress state:R running task stack: 0 pid:2652125 ppid: 1 flags:0x00004006
> > > > Oct 22 06:13:06 T7610 kernel: Call Trace:
> > > > Oct 22 06:13:06 T7610 kernel: ? iomap_dio_rw+0xa/0x30
> > > > Oct 22 06:13:06 T7610 kernel: ? btrfs_file_read_iter+0x157/0x1c0 [btrfs]
> > > > Oct 22 06:13:06 T7610 kernel: ? new_sync_read+0x118/0x1a0
> > > > Oct 22 06:13:06 T7610 kernel: ? vfs_read+0xf1/0x190
> > > > Oct 22 06:13:06 T7610 kernel: ? ksys_read+0x59/0xd0
> > > > Oct 22 06:13:06 T7610 kernel: ? do_syscall_64+0x37/0x80
> > > > Oct 22 06:13:06 T7610 kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > >
> > > >
> > > > Call Trace 2:
> > > > Oct 22 07:45:37 T7610 kernel: task:fsstress state:R running task stack: 0 pid: 9584 ppid: 1 flags:0x00004006
> > > > Oct 22 07:45:37 T7610 kernel: Call Trace:
> > > > Oct 22 07:45:37 T7610 kernel: ? iomap_dio_complete+0x9e/0x140
> > > > Oct 22 07:45:37 T7610 kernel: ? btrfs_file_read_iter+0x124/0x1c0 [btrfs]
> > > > Oct 22 07:45:37 T7610 kernel: ? new_sync_read+0x118/0x1a0
> > > > Oct 22 07:45:37 T7610 kernel: ? vfs_read+0xf1/0x190
> > > > Oct 22 07:45:37 T7610 kernel: ? ksys_read+0x59/0xd0
> > > > Oct 22 07:45:37 T7610 kernel: ? do_syscall_64+0x37/0x80
> > > > Oct 22 07:45:37 T7610 kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > >
> > >
> > > Are those the complete traces?
> > > It looks like too little, and inexact (the prefix ?).
> >
> > Yes, these are the complete traces. I do not know the reason for the '?' prefix.
> >
> > I ran fstests/generic/475 2 more times.
> > - failed to reproduce on SSD/SAS
> > - succeeded in reproducing on SSD/NVMe
> >
> > Then I gathered 'sysrq -t' output 3 times.
> >
> > [ 909.099550] task:fsstress state:R running task stack: 0 pid: 9269 ppid: 1 flags:0x00004006
> > [ 909.100594] Call Trace:
> > [ 909.101633] ? __clear_user+0x40/0x70
> > [ 909.102675] ? lock_release+0x1c6/0x270
> > [ 909.103717] ? alloc_extent_state+0xc1/0x190 [btrfs]
> > [ 909.104822] ? fixup_exception+0x41/0x60
> > [ 909.105881] ? rcu_read_lock_held_common+0xe/0x40
> > [ 909.106924] ? rcu_read_lock_sched_held+0x23/0x80
> > [ 909.107947] ? rcu_read_lock_sched_held+0x23/0x80
> > [ 909.108966] ? slab_post_alloc_hook+0x50/0x340
> > [ 909.109993] ? trace_hardirqs_on+0x1a/0xd0
> > [ 909.111039] ? lock_extent_bits+0x64/0x90 [btrfs]
> > [ 909.112202] ? __clear_extent_bit+0x37a/0x530 [btrfs]
> > [ 909.113366] ? filemap_write_and_wait_range+0x87/0xd0
> > [ 909.114455] ? clear_extent_bit+0x15/0x20 [btrfs]
> > [ 909.115628] ? __iomap_dio_rw+0x284/0x830
> > [ 909.116741] ? find_vma+0x32/0xb0
> > [ 909.117868] ? __get_user_pages+0xba/0x740
> > [ 909.118971] ? iomap_dio_rw+0xa/0x30
> > [ 909.120069] ? btrfs_file_read_iter+0x157/0x1c0 [btrfs]
> > [ 909.121219] ? new_sync_read+0x11b/0x1b0
> > [ 909.122301] ? vfs_read+0xf7/0x190
> > [ 909.123373] ? ksys_read+0x5f/0xe0
> > [ 909.124451] ? do_syscall_64+0x37/0x80
> > [ 909.125556] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > [ 1066.293028] task:fsstress state:R running task stack: 0 pid: 9269 ppid: 1 flags:0x00004006
> > [ 1066.294069] Call Trace:
> > [ 1066.295111] ? btrfs_file_read_iter+0x157/0x1c0 [btrfs]
> > [ 1066.296213] ? new_sync_read+0x11b/0x1b0
> > [ 1066.297268] ? vfs_read+0xf7/0x190
> > [ 1066.298314] ? ksys_read+0x5f/0xe0
> > [ 1066.299359] ? do_syscall_64+0x37/0x80
> > [ 1066.300394] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> >
> > [ 1201.027178] task:fsstress state:R running task stack: 0 pid: 9269 ppid: 1 flags:0x00004006
> > [ 1201.028233] Call Trace:
> > [ 1201.029298] ? iomap_dio_rw+0xa/0x30
> > [ 1201.030352] ? btrfs_file_read_iter+0x157/0x1c0 [btrfs]
> > [ 1201.031465] ? new_sync_read+0x11b/0x1b0
> > [ 1201.032534] ? vfs_read+0xf7/0x190
> > [ 1201.033592] ? ksys_read+0x5f/0xe0
> > [ 1201.034633] ? do_syscall_64+0x37/0x80
> > [ 1201.035661] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > By the way, I enable ' -O no-holes -R free-space-tree' for mkfs.btrfs by
> > default.
>
> Those mkfs options/fs features should be irrelevant.
>
> Can you try with the attached patch applied on top of those patches?
>
> Thanks.
>
> >
> >
> > > >
> > > > We noticed some errors in dmesg before this infinite loop.
> > > > [15590.720909] BTRFS: error (device dm-0) in __btrfs_free_extent:3069: errno=-5 IO failure
> > > > [15590.723014] BTRFS info (device dm-0): forced readonly
> > > > [15590.725115] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2150: errno=-5 IO failure
> > >
> > > Yes, that's expected, the test intentionally triggers simulated IO
> > > failures, which can happen anywhere, not just when running delayed
> > > references.
> >
> > Best Regards
> > Wang Yugui (wangyugui@e16-tech.com)
> > 2021/10/22
> >
> >