All of lore.kernel.org
 help / color / mirror / Atom feed
From: Scott Mayhew <smayhew@redhat.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "bfields@fieldses.org" <bfields@fieldses.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"tpearson@raptorengineering.com" <tpearson@raptorengineering.com>,
	"neilb@suse.de" <neilb@suse.de>,
	"chuck.lever@oracle.com" <chuck.lever@oracle.com>
Subject: Re: CPU stall, eventual host hang with BTRFS + NFS under heavy load
Date: Wed, 1 Dec 2021 15:13:31 -0500	[thread overview]
Message-ID: <YafXa0hSxfOLhPDp@aion.usersys.redhat.com> (raw)
In-Reply-To: <f9b305425209e99aa73417823e82cf7ce56b0141.camel@hammerspace.com>

On Wed, 01 Dec 2021, Trond Myklebust wrote:

> On Wed, 2021-12-01 at 13:36 -0500, Scott Mayhew wrote:
> > On Fri, 15 Oct 2021, Trond Myklebust wrote:
> > 
> > > On Fri, 2021-10-15 at 08:03 +0000, Trond Myklebust wrote:
> > > > On Fri, 2021-10-15 at 09:51 +1100, NeilBrown wrote:
> > > > > On Fri, 15 Oct 2021, Trond Myklebust wrote:
> > > > > > On Tue, 2021-10-12 at 08:57 +1100, NeilBrown wrote:
> > > > > > > On Tue, 12 Oct 2021, Chuck Lever III wrote:
> > > > > > > > 
> > > > > > > > Scott seems well positioned to identify a reproducer.
> > > > > > > > Maybe
> > > > > > > > we
> > > > > > > > can give him some likely candidates for possible bugs to
> > > > > > > > explore
> > > > > > > > first.
> > > > > > > 
> > > > > > > Has this patch been tried?
> > > > > > > 
> > > > > > > NeilBrown
> > > > > > > 
> > > > > > > 
> > > > > > > diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> > > > > > > index c045f63d11fa..308f5961cb78 100644
> > > > > > > --- a/net/sunrpc/sched.c
> > > > > > > +++ b/net/sunrpc/sched.c
> > > > > > > @@ -814,6 +814,7 @@ rpc_reset_task_statistics(struct
> > > > > > > rpc_task
> > > > > > > *task)
> > > > > > >  {
> > > > > > >         task->tk_timeouts = 0;
> > > > > > >         task->tk_flags &=
> > > > > > > ~(RPC_CALL_MAJORSEEN|RPC_TASK_SENT);
> > > > > > > +       clear_bit(RPC_TASK_SIGNALLED, &task->tk_runstate);
> > > > > > >         rpc_init_task_st
> > > > > > 
> > > > > > We shouldn't automatically "unsignal" a task once it has been
> > > > > > told
> > > > > > to
> > > > > > die. The correct thing to do here should rather be to change
> > > > > > rpc_restart_call() to exit early if the task was signalled.
> > > > > > 
> > > > > 
> > > > > Maybe.  It depends on exactly what the signal meant
> > > > > (rpc_killall_tasks()
> > > > > is a bit different from getting a SIGKILL), and exactly what
> > > > > the
> > > > > task
> > > > > is
> > > > > trying to achieve.
> > > > > 
> > > > > Before Commit ae67bd3821bb ("SUNRPC: Fix up task signalling")
> > > > > that is exactly what we did.
> > > > > If we want to change the behaviour of a task responding to
> > > > > rpc_killall_tasks(), we should clearly justify it in a patch
> > > > > doing
> > > > > exactly that.
> > > > > 
> > > > 
> > > > The intention behind rpc_killall_tasks() never changed, which is
> > > > why
> > > > it
> > > 
> > > ("it" being the error ERESTARTSYS)
> > > 
> > > > is listed in nfs_error_is_fatal(). I'm not aware of any case
> > > > where we
> > > > deliberately override in order to restart the RPC call on an
> > > > ERESTARTSYS error.
> > > > 
> > Update: I'm not able to reproduce this with an upstream kernel.  I
> > bisected it down to commit 2ba5acfb3495 "SUNRPC: fix sign error
> > causing
> > rpcsec_gss drops" as the commit that "fixed" the issue (but really
> > just
> > makes the issue less likely to occur, I think).
> > 
> > I also tested commit 10b9d99a3dbb "SUNRPC: Augment server-side rpcgss
> > tracepoints" (the commit in the Fixes: tag of 2ba5acfb3495) as well
> > as
> > commit 0e885e846d96 "nfsd: add fattr support for user extended
> > attributes"
> > (the parent of commit 10b9d99a3dbb) and verified that commit
> > 10b9d99a3dbb is where the issue started occurring.
> > 
> > I think what is happening is that the NFS server gets a request that
> > it
> > thinks is outside of the GSS sequence window and drops the request,
> > closes the connection and calls nfsd4_conn_lost(), which calls
> > nfsd4_probe_callback() which sets NFSD4_CLIENT_CB_UPDATE in
> > clp->cl_flags.  Then the client reestablishes the connection on that
> > port, sends another request which receives
> > NFS4ERR_CONN_NOT_BOUND_TO_SESSION.  The client runs the state manager
> > which calls nfs4_bind_conn_to_session(), which calls
> > nfs4_begin_drain_session(), which sets NFS4_SLOT_TBL_DRAINING in
> > tbl->slot_tbl_state.  Meanwhile a conflicting request comes in that
> > causes the server to recall the delegation.  Since
> > NFS4_SLOT_TBL_DRAINING is set, the client responds to the CB_SEQUENCE
> > with NFS4ERR_DELAY.  At the same time, the BIND_CONN_TO_SESSION
> > requests
> > from the client are causing the server to call
> > nfsd4_process_cb_update(), since NFSD4_CLIENT_CB_UPDATE flag is set.
> > nfsd4_process_cb_update() calls rpc_shutdown_client() which signals
> > the
> > CB_RECALL task, which the server is trying re-send due to the
> > NFS4ERR_DELAY, and we get into the soft-lockup.
> > 
> 
> I'm a little lost with the above explantion. How can the server send a
> callback on a connection that isn't bound? If it isn't bound, then it
> can't be used as a back channel.

The callback is on port 787 and the client sent a BIND_CONN_TO_SESSION
on port 787 right before that:

$ tshark -2 -r nfstest_delegation_20211123_154510_001.cap -Y '(nfs||nfs.cb||tcp.flags.fin==1) && (frame.number >= 447 && frame.number <= 491)'
  447   2.146535 10.16.225.113 → 10.16.225.28 NFS 877 2049 V4 Call (Reply In 448) OPEN DH: 0x7c6b0f9b/nfstest_delegation_20211123_154510_f_001
  448   2.162160 10.16.225.28 → 10.16.225.113 NFS 2049 877 V4 Reply (Call In 447) OPEN StateID: 0xcb0c
  455   2.180240 10.16.225.113 → 10.16.225.28 NFS 850 2049 V4 Call (Reply In 461) READ StateID: 0xa0f6 Offset: 0 Len: 4096
  456   2.180301 10.16.225.113 → 10.16.225.28 NFS 846 2049 V4 Call (Reply In 466) READ StateID: 0xa0f6 Offset: 8192 Len: 4096
  457   2.180357 10.16.225.113 → 10.16.225.28 NFS 775 2049 V4 Call READ StateID: 0xa0f6 Offset: 4096 Len: 4096
  458   2.180382 10.16.225.113 → 10.16.225.28 NFS 879 2049 V4 Call (Reply In 470) READ StateID: 0xa0f6 Offset: 12288 Len: 4096
  461   2.195961 10.16.225.28 → 10.16.225.113 NFS 2049 850 V4 Reply (Call In 455) READ
  463   2.195989 10.16.225.28 → 10.16.225.113 TCP 2049 775 2049 → 775 [FIN, ACK] Seq=557 Ack=689 Win=32256 Len=0 TSval=3414832656 TSecr=3469334230
  466   2.196039 10.16.225.28 → 10.16.225.113 NFS 2049 846 V4 Reply (Call In 456) READ
  470   2.196088 10.16.225.28 → 10.16.225.113 NFS 2049 879 V4 Reply (Call In 458) READ
  472   2.196171 10.16.225.113 → 10.16.225.28 TCP 775 2049 775 → 2049 [FIN, ACK] Seq=689 Ack=558 Win=31104 Len=0 TSval=3469334246 TSecr=3414832656
  479   2.227386 10.16.225.113 → 10.16.225.28 NFS 775 2049 V4 Call (Reply In 482) READ StateID: 0xa0f6 Offset: 4096 Len: 4096
  482   2.242931 10.16.225.28 → 10.16.225.113 NFS 2049 775 V4 Reply (Call In 479) SEQUENCE Status: NFS4ERR_CONN_NOT_BOUND_TO_SESSION
  484   2.243537 10.16.225.113 → 10.16.225.28 NFS 787 2049 V4 Call (Reply In 486) BIND_CONN_TO_SESSION
  486   2.259358 10.16.225.28 → 10.16.225.113 NFS 2049 787 V4 Reply (Call In 484) BIND_CONN_TO_SESSION
  488   2.259755 10.16.225.113 → 10.16.225.28 NFS 802 2049 V4 Call (Reply In 492) BIND_CONN_TO_SESSION
  489   2.260037 10.16.225.28 → 10.16.225.113 NFS CB 2049 787 V1 CB_COMPOUND Call (Reply In 491) <EMPTY> CB_SEQUENCE;CB_RECALL
  491   2.260139 10.16.225.113 → 10.16.225.28 NFS CB 787 2049 V1 CB_COMPOUND Reply (Call In 489) <EMPTY> CB_SEQUENCE

Come to think of it though, I'm not sure I understand why the CB_RECALL
is happening because I don't see the conflicting open until later in the
capture.

> 
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
> 


      reply	other threads:[~2021-12-01 20:16 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-05  9:44 CPU stall, eventual host hang with BTRFS + NFS under heavy load Timothy Pearson
2021-07-05  9:47 ` Timothy Pearson
2021-07-23 21:01   ` J. Bruce Fields
     [not found]   ` <B4D8C4B7-EE8C-456C-A6C5-D25FF1F3608E@rutgers.edu>
     [not found]     ` <3A4DF3BB-955C-4301-BBED-4D5F02959F71@rutgers.edu>
2021-08-09 17:06       ` Timothy Pearson
2021-08-09 17:15         ` hedrick
2021-08-09 17:25           ` Timothy Pearson
2021-08-09 17:37           ` Chuck Lever III
     [not found]             ` <F5179A41-FB9A-4AB1-BE58-C2859DB7EC06@rutgers.edu>
2021-08-09 18:30               ` Timothy Pearson
2021-08-09 18:38                 ` hedrick
2021-08-09 18:44                   ` Timothy Pearson
2021-08-09 18:49                   ` J. Bruce Fields
     [not found]                     ` <15AD846A-4638-4ACF-B47C-8EF655AD6E85@rutgers.edu>
2021-08-09 18:56                       ` Timothy Pearson
2021-08-09 20:54                         ` Charles Hedrick
2021-08-09 21:49                           ` Timothy Pearson
2021-08-09 22:01                             ` Charles Hedrick
     [not found]             ` <1119B476-171F-4C5A-9DEF-184F211A6A98@rutgers.edu>
2021-08-10 16:22               ` Timothy Pearson
2021-08-16 14:43                 ` hedrick
2021-08-09 18:30           ` J. Bruce Fields
2021-08-09 18:34             ` hedrick
     [not found]             ` <413163A6-8484-4170-9877-C0C2D50B13C0@rutgers.edu>
2021-08-10 14:58               ` J. Bruce Fields
2021-07-23 21:00 ` J. Bruce Fields
2021-07-23 21:22   ` Timothy Pearson
2021-07-28 19:51     ` Timothy Pearson
2021-08-02 19:28       ` J. Bruce Fields
2021-08-10  0:43 ` NeilBrown
2021-08-10  0:54   ` J.  Bruce Fields
2021-08-12 14:44   ` J.  Bruce Fields
2021-08-12 21:36     ` NeilBrown
2021-10-08 20:27       ` Scott Mayhew
2021-10-08 20:53         ` Timothy Pearson
2021-10-08 21:11         ` J.  Bruce Fields
2021-10-09 17:33         ` Chuck Lever III
2021-10-11 14:30           ` Bruce Fields
2021-10-11 16:36             ` Chuck Lever III
2021-10-11 21:57               ` NeilBrown
2021-10-14 22:36                 ` Trond Myklebust
2021-10-14 22:51                   ` NeilBrown
2021-10-15  8:03                     ` Trond Myklebust
2021-10-15  8:05                       ` Trond Myklebust
2021-12-01 18:36                         ` Scott Mayhew
2021-12-01 19:35                           ` Trond Myklebust
2021-12-01 20:13                             ` Scott Mayhew [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YafXa0hSxfOLhPDp@aion.usersys.redhat.com \
    --to=smayhew@redhat.com \
    --cc=bfields@fieldses.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=tpearson@raptorengineering.com \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.