From: Daire Byrne <daire@dneg.com>
To: NeilBrown <neilb@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Chuck Lever <chuck.lever@oracle.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC 00/12] Allow concurrent directory updates.
Date: Thu, 16 Jun 2022 11:48:28 +0100	[thread overview]
Message-ID: <CAPt2mGOw_PS-5KY-9WFzGOT=ax6PFhVYSTQG-dpXzV5MeGieYg@mail.gmail.com> (raw)
In-Reply-To: <165534094600.26404.4349155093299535793@noble.neil.brown.name>

On Thu, 16 Jun 2022 at 01:56, NeilBrown <neilb@suse.de> wrote:
>
> On Wed, 15 Jun 2022, Daire Byrne wrote:
> ..
> > However, it is at this point that I started to experience some
> > stability issues with the re-export server that are not present with
> > the vanilla unpatched v5.19-rc2 kernel. In particular the knfsd
> > threads start to lock up with stack traces like this:
> >
> > [ 1234.460696] INFO: task nfsd:5514 blocked for more than 123 seconds.
> > [ 1234.461481]       Tainted: G        W   E     5.19.0-1.dneg.x86_64 #1
> > [ 1234.462289] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 1234.463227] task:nfsd            state:D stack:    0 pid: 5514 ppid:     2 flags:0x00004000
> > [ 1234.464212] Call Trace:
> > [ 1234.464677]  <TASK>
> > [ 1234.465104]  __schedule+0x2a9/0x8a0
> > [ 1234.465663]  schedule+0x55/0xc0
> > [ 1234.466183]  ? nfs_lookup_revalidate_dentry+0x3a0/0x3a0 [nfs]
> > [ 1234.466995]  __nfs_lookup_revalidate+0xdf/0x120 [nfs]
>
> I can see the cause of this - I forgot a wakeup.  This patch should fix
> it, though I hope to find a better solution.
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 54c2c7adcd56..072130d000c4 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -2483,17 +2483,16 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry)
>         if (!(dentry->d_flags & DCACHE_PAR_UPDATE)) {
>                 /* Must have exclusive lock on parent */
>                 did_set_par_update = true;
> +               lock_acquire_exclusive(&dentry->d_update_map, 0,
> +                                      0, NULL, _THIS_IP_);
>                 dentry->d_flags |= DCACHE_PAR_UPDATE;
>         }
>
>         spin_unlock(&dentry->d_lock);
>         error = nfs_safe_remove(dentry);
>         nfs_dentry_remove_handle_error(dir, dentry, error);
> -       if (did_set_par_update) {
> -               spin_lock(&dentry->d_lock);
> -               dentry->d_flags &= ~DCACHE_PAR_UPDATE;
> -               spin_unlock(&dentry->d_lock);
> -       }
> +       if (did_set_par_update)
> +               d_unlock_update(dentry);
>  out:
>         trace_nfs_unlink_exit(dir, dentry, error);
>         return error;
>
> >
> > So all in all, the performance improvements in the knfsd re-export
> > case are looking great and we have real-world use cases that this helps
> > with (batch processing workloads with latencies >10ms). If we can
> > figure out the hanging knfsd threads, then I can test it more heavily.
>
> Hopefully the above patch will allow the more heavy testing to continue.
> In any case, thanks a lot for the testing so far,

Patch applied, but unfortunately I'm still getting the same trace. This
time I also captured a preceding stack for a hung process local to the
re-export server - I wonder if the hang is happening somewhere in the
VFS changes rather than in nfsd, which then exports the path?
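
For context on why a missed wakeup would present exactly like this: my
(possibly wrong) mental model of the series is that waiters in
__nfs_lookup_revalidate() sleep until DCACHE_PAR_UPDATE clears, and the
updater's d_unlock_update() must both clear the flag and wake them.
A minimal sketch of that pairing, assuming a wait_var_event()-style
implementation - the helper names below, other than d_unlock_update(),
are my invention, not the patch's:

/*
 * Sketch only -- not the actual series code.  Clearing
 * DCACHE_PAR_UPDATE directly, as the old nfs_unlink() exit path did,
 * would skip the wakeup and leave revalidate waiters in D state.
 */
static void d_unlock_update_sketch(struct dentry *dentry)
{
	spin_lock(&dentry->d_lock);
	dentry->d_flags &= ~DCACHE_PAR_UPDATE;
	spin_unlock(&dentry->d_lock);
	wake_up_var(&dentry->d_flags);	/* the step that must not be skipped */
}

static void d_wait_update_sketch(struct dentry *dentry)
{
	/* e.g. from __nfs_lookup_revalidate(): wait for a concurrent
	 * update on this dentry to finish */
	wait_var_event(&dentry->d_flags,
		       !(dentry->d_flags & DCACHE_PAR_UPDATE));
}

That would match the traces below, where everything is parked in
__nfs_lookup_revalidate().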

[  373.930506] INFO: task XXXX:5072 blocked for more than 122 seconds.
[  373.931410]       Tainted: G        W   E     5.19.0-3.dneg.x86_64 #1
[  373.932313] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  373.933442] task:XXXX            state:D stack:    0 pid: 5072 ppid:     1 flags:0x00000000
[  373.934639] Call Trace:
[  373.935007]  <TASK>
[  373.935306]  __schedule+0x2a9/0x8a0
[  373.935844]  schedule+0x55/0xc0
[  373.936294]  ? nfs_lookup_revalidate_dentry+0x3a0/0x3a0 [nfs]
[  373.937137]  __nfs_lookup_revalidate+0xdf/0x120 [nfs]
[  373.937875]  ? put_prev_task_stop+0x170/0x170
[  373.938525]  nfs_lookup_revalidate+0x15/0x20 [nfs]
[  373.939226]  lookup_fast+0xda/0x150
[  373.939756]  path_openat+0x12a/0x1090
[  373.940293]  ? __filemap_fdatawrite_range+0x54/0x70
[  373.941100]  do_filp_open+0xb2/0x120
[  373.941635]  ? hashlen_string+0xd0/0xd0
[  373.942190]  ? _raw_spin_unlock+0xe/0x30
[  373.942766]  do_sys_openat2+0x245/0x320
[  373.943305]  do_sys_open+0x46/0x80
[  373.943839]  __x64_sys_open+0x21/0x30
[  373.944428]  do_syscall_64+0x3b/0x90
[  373.944979]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  373.945688] RIP: 0033:0x7fcd80ceeeb0
[  373.946226] RSP: 002b:00007fff90fd8298 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  373.947330] RAX: ffffffffffffffda RBX: 00007fcd81d6e981 RCX: 00007fcd80ceeeb0
[  373.947333] RDX: 00000000000001b6 RSI: 0000000000000000 RDI: 00007fff90fd8360
[  373.947334] RBP: 00007fff90fd82f0 R08: 00007fcd81d6e986 R09: 0000000000000000
[  373.947335] R10: 0000000000000024 R11: 0000000000000246 R12: 0000000000cd6110
[  373.947337] R13: 0000000000000008 R14: 00007fff90fd8360 R15: 00007fff90fdb580
[  373.947339]  </TASK>
[  373.947421] INFO: task nfsd:5696 blocked for more than 122 seconds.
[  373.947423]       Tainted: G        W   E     5.19.0-3.dneg.x86_64 #1
[  373.947424] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  373.947425] task:nfsd            state:D stack:    0 pid: 5696 ppid:     2 flags:0x00004000
[  373.947428] Call Trace:
[  373.947429]  <TASK>
[  373.947430]  __schedule+0x2a9/0x8a0
[  373.947434]  schedule+0x55/0xc0
[  373.947436]  ? nfs_lookup_revalidate_dentry+0x3a0/0x3a0 [nfs]
[  373.947451]  __nfs_lookup_revalidate+0xdf/0x120 [nfs]
[  373.947464]  ? put_prev_task_stop+0x170/0x170
[  373.947466]  nfs_lookup_revalidate+0x15/0x20 [nfs]
[  373.947478]  lookup_dcache+0x5a/0x80
[  373.947481]  lookup_one_unlocked+0x59/0xa0
[  373.947484]  lookup_one_len_unlocked+0x1d/0x20
[  373.947487]  nfsd_lookup_dentry+0x190/0x470 [nfsd]
[  373.947509]  nfsd_lookup+0x88/0x1b0 [nfsd]
[  373.947522]  nfsd3_proc_lookup+0xb4/0x100 [nfsd]
[  373.947537]  nfsd_dispatch+0x161/0x290 [nfsd]
[  373.947551]  svc_process_common+0x48a/0x620 [sunrpc]
[  373.947589]  ? nfsd_svc+0x330/0x330 [nfsd]
[  373.947602]  ? nfsd_shutdown_threads+0xa0/0xa0 [nfsd]
[  373.947621]  svc_process+0xbc/0xf0 [sunrpc]
[  373.951088]  nfsd+0xda/0x190 [nfsd]
[  373.951136]  kthread+0xf0/0x120
[  373.951138]  ? kthread_complete_and_exit+0x20/0x20
[  373.951140]  ret_from_fork+0x22/0x30
[  373.951149]  </TASK>

I double-checked that the patch had been applied and that I hadn't made
a mistake with the installation.

I could perhaps try running with just the VFS patches to see if I can
still reproduce the "local" VFS hang without the nfsd patches? Your
previous VFS-only patchset was stable for me.
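
If it helps anyone else reproduce the "local" hang, the load I'd start
with is just many processes doing create/unlink in one directory on the
client mount. A hypothetical minimal reproducer - the path and process
count are made up for illustration, not our actual production workload:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	/* 32 children hammering one NFS-client directory with parallel
	 * create/unlink - the operations whose locking the
	 * parallel-update patches change */
	for (int i = 0; i < 32; i++) {
		if (fork() == 0) {
			char path[64];

			snprintf(path, sizeof(path), "/mnt/nfs/dir/f%d", i);
			for (;;) {
				int fd = open(path, O_CREAT | O_WRONLY, 0644);

				if (fd >= 0)
					close(fd);
				unlink(path);
			}
		}
	}
	while (wait(NULL) > 0)
		;
	return 0;
}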

Daire

Thread overview: 34+ messages
2022-06-13 23:18 [PATCH RFC 00/12] Allow concurrent directory updates NeilBrown
2022-06-13 23:18 ` [PATCH 04/12] VFS: move dput() and mnt_drop_write() into done_path_update() NeilBrown
2022-06-13 23:18 ` [PATCH 03/12] VFS: move want_write checks into lookup_hash_update() NeilBrown
2022-06-13 23:18 ` [PATCH 02/12] VFS: move EEXIST and ENOENT tests " NeilBrown
2022-06-13 23:18 ` [PATCH 01/12] VFS: support parallel updates in the one directory NeilBrown
2022-06-13 23:18 ` [PATCH 05/12] VFS: export done_path_update() NeilBrown
2022-06-13 23:18 ` [PATCH 08/12] nfsd: allow parallel creates from nfsd NeilBrown
2022-06-24 14:43   ` Chuck Lever III
2022-06-28 22:35   ` Chuck Lever III
2022-06-28 23:09     ` NeilBrown
2022-07-04 17:17       ` Chuck Lever III
2022-06-13 23:18 ` [PATCH 07/12] NFS: support parallel updates in the one directory NeilBrown
2022-06-13 23:18 ` [PATCH 11/12] nfsd: use (un)lock_inode instead of fh_(un)lock NeilBrown
2022-06-24 14:43   ` Chuck Lever III
2022-06-13 23:18 ` [PATCH 06/12] VFS: support concurrent renames NeilBrown
2022-06-14  4:35   ` kernel test robot
2022-06-14 12:37   ` kernel test robot
2022-06-14 13:28   ` kernel test robot
2022-06-26 13:07   ` [VFS] 46a2afd9f6: ltp.rename10.fail kernel test robot
2022-06-26 13:07     ` kernel test robot
2022-06-26 13:07     ` [LTP] " kernel test robot
2022-06-13 23:18 ` [PATCH 12/12] nfsd: discard fh_locked flag and fh_lock/fh_unlock NeilBrown
2022-06-24 14:43   ` Chuck Lever III
2022-06-13 23:18 ` [PATCH 10/12] nfsd: reduce locking in nfsd_lookup() NeilBrown
2022-06-24 14:43   ` Chuck Lever III
2022-06-13 23:18 ` [PATCH 09/12] nfsd: support concurrent renames NeilBrown
2022-06-24 14:43   ` Chuck Lever III
2022-06-15 13:46 ` [PATCH RFC 00/12] Allow concurrent directory updates Daire Byrne
2022-06-16  0:55   ` NeilBrown
2022-06-16 10:48     ` Daire Byrne [this message]
2022-06-17  5:49       ` NeilBrown
2022-06-17 15:27         ` Daire Byrne
2022-06-20 10:18           ` Daire Byrne
2022-06-16 13:49     ` Anna Schumaker
