All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Wysochanski <dwysocha@redhat.com>
To: Trond Myklebust <trondmy@hammerspace.com>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 1/1] SUNRPC: Ensure XPRT_CONNECTED is cleared while handling TCP RST
Date: Wed, 12 Dec 2018 14:56:21 -0500	[thread overview]
Message-ID: <1544644581.3750.4.camel@redhat.com> (raw)
In-Reply-To: <99a88b32d140228d5382eb176d0f19df27483654.camel@hammerspace.com>

On Wed, 2018-12-12 at 18:02 +0000, Trond Myklebust wrote:
> On Wed, 2018-12-12 at 12:47 -0500, Dave Wysochanski wrote:
> > On Wed, 2018-12-12 at 16:56 +0000, Trond Myklebust wrote:
> > > On Wed, 2018-12-12 at 08:51 -0500, Dave Wysochanski wrote:
> > > > Commit 9b30889c548a changed the handling of TCP_CLOSE inside
> > > > xs_tcp_state_change.  Prior to this change, the XPRT_CONNECTED
> > > > bit
> > > > was cleared unconditionally inside xprt_disconnect_done,
> > > > similar
> > > > to the handling of TCP_CLOSE_WAIT.  After the change the
> > > > clearing
> > > > of XPRT_CONNECTED depends on successfully queueing a work based
> > > > xprt_autoclose which depends on XPRT_LOCKED and may not happen.
> > > > This is significant in the case of an unexpected RST from the
> > > > server, as the client will only see xs_tcp_state_change called
> > > > with
> > > > sk_state == TCP_CLOSE.  Restore the unconditional clear_bit on
> > > > XPRT_CONNECTED while handling TCP_CLOSE and make it consistent
> > > > with handling TCP_CLOSE_WAIT.
> > > > 
> > > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > > ---
> > > >  net/sunrpc/xprtsock.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > > index 8a5e823e0b33..b9789036051d 100644
> > > > --- a/net/sunrpc/xprtsock.c
> > > > +++ b/net/sunrpc/xprtsock.c
> > > > @@ -1492,6 +1492,7 @@ static void xs_tcp_state_change(struct
> > > > sock
> > > > *sk)
> > > >  		if (sk->sk_err)
> > > >  			xprt_wake_pending_tasks(xprt, -sk-
> > > > > sk_err);
> > > > 
> > > >  		/* Trigger the socket release */
> > > > +		clear_bit(XPRT_CONNECTED, &xprt->state);
> > > >  		xs_tcp_force_close(xprt);
> > > >  	}
> > > >   out:
> > > 
> > > Hi Dave,
> > > 
> > > This isn't needed for 4.20 or newer because call_transmit() will
> > > now
> > > always call xprt_end_transmit(). I suggest that a stable fix do
> > > something similar (perhaps conditional on the error returned by
> > > xprt_transmit()?).
> > > 
> > 
> > Can you explain the handling in xs_tcp_state_change - why
> > XPRT_CONNECTED would need to remain set longer in the case of
> > handling
> > TCP_CLOSE case rather than TCP_CLOSE_WAIT?  It seems like if we get
> > an
> > RST we'll go directly to TCP_CLOSE and why would the XPRT_CONNECTED
> > bit
> > getting cleared need to be delayed in that case?
> > 
> > I will look into the approach you mention though at the moment I do
> > not
> > see how it helps because the underlying issue seems to be clearing
> > of
> > the XPRT_CONNECTED bit.
> > 
> 
> See xprt_clear_locked(). Dropping the XPRT_LOCK will automatically
> trigger autoclose if  XPRT_CLOSE_WAIT is set, regardless of whether
> or
> not there are other tasks waiting for the lock.
> 
> 

Ok thanks for pointing me in that direction I will investigate.

I am actually seeing a hang now with the reproducer on 4.20-rc6 but
it's not CPU spinning.  Investigating as it was not easy to reproduce
but there must still be a race somewhere.



  reply	other threads:[~2018-12-12 19:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 13:51 [PATCH 0/1] SUNRPC: Ensure XPRT_CONNECTED is cleared while handling TCP RST Dave Wysochanski
2018-12-12 13:51 ` [PATCH 1/1] " Dave Wysochanski
2018-12-12 16:56   ` Trond Myklebust
2018-12-12 17:47     ` Dave Wysochanski
2018-12-12 18:02       ` Trond Myklebust
2018-12-12 19:56         ` Dave Wysochanski [this message]
2019-01-08 12:46         ` Benjamin Coddington
2018-12-14 13:48     ` Dave Wysochanski
2018-12-14 18:29     ` Scott Mayhew

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1544644581.3750.4.camel@redhat.com \
    --to=dwysocha@redhat.com \
    --cc=anna.schumaker@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.