From: Olga Kornievskaia <olga.kornievskaia@gmail.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH 1/1] SUNRPC dont update timeout value on connection reset
Date: Fri, 10 Jul 2020 13:35:55 -0400
Message-ID: <CAN-5tyHrd4dZEkO3CRrvcLv82Gy0fXERCPYSq0Snj3zBHh3gxw@mail.gmail.com>
In-Reply-To: <CAN-5tyFD6XuZAZ3HAvfxyr7xLsvb-04hhe4PYvO_594ZQ0TuNw@mail.gmail.com>

On Thu, Jul 9, 2020 at 5:07 PM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Thu, Jul 9, 2020 at 1:19 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> >
> > On Thu, 2020-07-09 at 11:43 -0400, Olga Kornievskaia wrote:
> > > On Thu, Jul 9, 2020 at 8:08 AM Trond Myklebust <
> > > trondmy@hammerspace.com> wrote:
> > > > Hi Olga
> > > >
> > > > On Wed, 2020-07-08 at 17:05 -0400, Olga Kornievskaia wrote:
> > > > > Current behaviour: every time a v3 operation is re-sent to the
> > > > > server, we update (double) the timeout. There is no distinction
> > > > > between whether or not the previous timer had expired before the
> > > > > resend happened.
> > > > >
> > > > > Here's the scenario:
> > > > > 1. Client sends a v3 operation
> > > > > 2. Server RSTs the connection prior to the timeout (e.g., the
> > > > > connection is immediately reset)
> > > > > 3. Client re-sends the v3 operation but the timeout is now 120sec.
> > > > >
> > > > > As a result, an application sees a 2min pause before a retry in
> > > > > case the server again does not reply.
> > > > >
> > > > > Instead, this patch proposes to keep track of when the minor
> > > > > timeout should happen and, if it hasn't happened yet, to leave
> > > > > the timeout value unchanged.
> > > > >
> > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > > > ---
> > > > >  include/linux/sunrpc/xprt.h |  1 +
> > > > >  net/sunrpc/xprt.c           | 11 +++++++++++
> > > > >  2 files changed, 12 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> > > > > index e64bd82..a603d48 100644
> > > > > --- a/include/linux/sunrpc/xprt.h
> > > > > +++ b/include/linux/sunrpc/xprt.h
> > > > > @@ -101,6 +101,7 @@ struct rpc_rqst {
> > > > >                                                        * used in the softirq.
> > > > >                                                        */
> > > > >       unsigned long           rq_majortimeo;  /* major timeout alarm */
> > > > > +     unsigned long           rq_minortimeo;  /* minor timeout alarm */
> > > > >       unsigned long           rq_timeout;     /* Current timeout value */
> > > > >       ktime_t                 rq_rtt;         /* round-trip time */
> > > > >       unsigned int            rq_retries;     /* # of retries */
> > > > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > > > index d5cc5db..c0ce232 100644
> > > > > --- a/net/sunrpc/xprt.c
> > > > > +++ b/net/sunrpc/xprt.c
> > > > > @@ -607,6 +607,11 @@ static void xprt_reset_majortimeo(struct rpc_rqst *req)
> > > > >       req->rq_majortimeo += xprt_calc_majortimeo(req);
> > > > >  }
> > > > >
> > > > > +static void xprt_reset_minortimeo(struct rpc_rqst *req)
> > > > > +{
> > > > > +     req->rq_minortimeo = jiffies + req->rq_timeout;
> > > > > +}
> > > > > +
> > > > >  static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req)
> > > > >  {
> > > > >       unsigned long time_init;
> > > > > @@ -618,6 +623,7 @@ static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req)
> > > > >               time_init = xprt_abs_ktime_to_jiffies(task->tk_start);
> > > > >       req->rq_timeout = task->tk_client->cl_timeout->to_initval;
> > > > >       req->rq_majortimeo = time_init + xprt_calc_majortimeo(req);
> > > > > +     req->rq_minortimeo = time_init + req->rq_timeout;
> > > > >  }
> > > > >
> > > > >  /**
> > > > > @@ -631,6 +637,10 @@ int xprt_adjust_timeout(struct rpc_rqst *req)
> > > > >       const struct rpc_timeout *to = req->rq_task->tk_client->cl_timeout;
> > > > >       int status = 0;
> > > > >
> > > > > +     if (time_before(jiffies, req->rq_minortimeo)) {
> > > > > +             xprt_reset_minortimeo(req);
> > > > > +             return status;
> > > >
> > > > Shouldn't this case be just returning without updating the timeout?
> > > > After all, this is the case where nothing has expired yet.
> > >
> > > I think we should perhaps readjust the minor timeout every time
> > > here, but I can't figure out what the desired behaviour should be.
> > > When should we consider it appropriate to double the timer? Consider
> > > the following:
> > >
> > > time1: v3 op sent
> > > time1+50s: server RSTs
> > > We check that it's not yet the minor timeout (time1+60s)
> > > time1+50s: v3 op re-sent (say we don't reset the minor timeout to be
> > > current time+60s)
> > > time1+60s: server RSTs
> > > Client will resend the op but now it's past the initial minor timeout,
> > > so the timeout will be doubled. Is that what we really want? Maybe it
> > > is.
> > > Say now the server RSTs the connection again (shortly after, or in
> > > less than 60s). Since we are not updating the minor timeout value,
> > > the client will again modify the timeout before resending. Is that OK?
> > >
> > > That's why my reasoning was that at every re-evaluation of the
> > > timeout value, we set the minor timeout to current time+60s, and if
> > > we get an RST within it then we don't modify the timeout value.
> >
> > So a couple of issues with that:
> >
> > The first is that a series of RST calls could cause the timeout to get
> > pushed to the max value fairly quickly (btw, xprt_reset_minortimeo()
> > does not enforce a limit right now).
> >
> > The second is that we end up pushing out the major timeout value, since
> > the major timeout cannot occur unless the value of jiffies is after the
> > minor timeout (which keeps changing on each pass).
>
> But don't we want to push out the major timeout?
>
> Actually, I think, back in my example of getting the RST at
> (time1+50s), shouldn't minor_timeo and majortimeo be reset to
> current time + the appropriate minor/major value? If we are evaluating
> the timer and the time difference between when the operation was sent
> and now is less than 60s, we shouldn't say a timeout has occurred
> (it's a premature timeout), and thus its value shouldn't be modified.
>
> Thoughts?

Do you feel that the following approach is incorrect? Sorry, it's just
cut-and-paste, but the logic is there. Thank you.

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index e64bd82..a603d48 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -101,6 +101,7 @@ struct rpc_rqst {
 						 * used in the softirq.
 						 */
 	unsigned long		rq_majortimeo;	/* major timeout alarm */
+	unsigned long		rq_minortimeo;	/* minor timeout alarm */
 	unsigned long		rq_timeout;	/* Current timeout value */
 	ktime_t			rq_rtt;		/* round-trip time */
 	unsigned int		rq_retries;	/* # of retries */
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index d5cc5db..66d412b 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -607,6 +607,11 @@ static void xprt_reset_majortimeo(struct rpc_rqst *req)
 	req->rq_majortimeo += xprt_calc_majortimeo(req);
 }

+static void xprt_reset_minortimeo(struct rpc_rqst *req)
+{
+	req->rq_minortimeo = jiffies + req->rq_timeout;
+}
+
 static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req)
 {
 	unsigned long time_init;
@@ -618,6 +623,7 @@ static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req)
 		time_init = xprt_abs_ktime_to_jiffies(task->tk_start);
 	req->rq_timeout = task->tk_client->cl_timeout->to_initval;
 	req->rq_majortimeo = time_init + xprt_calc_majortimeo(req);
+	req->rq_minortimeo = time_init + req->rq_timeout;
 }

 /**
@@ -631,6 +637,11 @@ int xprt_adjust_timeout(struct rpc_rqst *req)
 	const struct rpc_timeout *to = req->rq_task->tk_client->cl_timeout;
 	int status = 0;

+	if (time_before(jiffies, req->rq_minortimeo)) {
+		req->rq_majortimeo = jiffies + xprt_calc_majortimeo(req);
+		req->rq_minortimeo = jiffies + req->rq_timeout;
+		return status;
+	}
 	if (time_before(jiffies, req->rq_majortimeo)) {
 		if (to->to_exponential)
 			req->rq_timeout <<= 1;
@@ -649,6 +660,7 @@ int xprt_adjust_timeout(struct rpc_rqst *req)
 		spin_unlock(&xprt->transport_lock);
 		status = -ETIMEDOUT;
 	}
+	xprt_reset_minortimeo(req);

 	if (req->rq_timeout == 0) {
 		printk(KERN_WARNING "xprt_adjust_timeout: rq_timeout = 0!\n");
-- 

> > > > > +     }
> > > > >       if (time_before(jiffies, req->rq_majortimeo)) {
> > > > >               if (to->to_exponential)
> > > > >                       req->rq_timeout <<= 1;
> > > > > @@ -638,6 +648,7 @@ int xprt_adjust_timeout(struct rpc_rqst *req)
> > > > >                       req->rq_timeout += to->to_increment;
> > > > >               if (to->to_maxval && req->rq_timeout >= to->to_maxval)
> > > > >                       req->rq_timeout = to->to_maxval;
> > > > > +             xprt_reset_minortimeo(req);
> > > >
> > > > ...and then perhaps this can just be moved out of the time_before()
> > > > condition, since it looks to me as if we also want to reset
> > > > req->rq_minortimeo when a major timeout occurs.
> > > > >               req->rq_retries++;
> > > > >       } else {
> > > > >               req->rq_timeout = to->to_initval;
> >
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
> >
> >
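
For anyone following the thread, here is a minimal user-space sketch of the
revised xprt_adjust_timeout() logic from the diff above. It is only a model,
not kernel code: time is counted in plain seconds rather than jiffies (so a
plain '<' stands in for time_before()), the RTT reset and locking are left
out, and the struct, the function names, and the TO_* values (60s initial
timeout, 600s cap, 2 retries, exponential backoff) are assumptions chosen to
match the 60s/120s example discussed in the thread.

```c
#include <assert.h>

/* Assumed timeout policy for this model; not taken from the kernel. */
#define TO_INITVAL 60UL   /* initial minor timeout, seconds */
#define TO_MAXVAL  600UL  /* upper bound on any timeout, seconds */
#define TO_RETRIES 2U     /* retries folded into the major timeout */

/* Hypothetical stand-in for the three rpc_rqst fields the patch touches. */
struct model_req {
	unsigned long majortimeo; /* absolute major deadline */
	unsigned long minortimeo; /* absolute minor deadline */
	unsigned long timeout;    /* current timeout value */
};

/* Major timeout span derived from the current timeout, capped at the max. */
static unsigned long calc_majortimeo(const struct model_req *req)
{
	unsigned long span = req->timeout << TO_RETRIES;

	return (span > TO_MAXVAL || span == 0) ? TO_MAXVAL : span;
}

static void init_req(struct model_req *req, unsigned long now)
{
	req->timeout = TO_INITVAL;
	req->majortimeo = now + calc_majortimeo(req);
	req->minortimeo = now + req->timeout;
}

/* Returns -1 when a major timeout is reported, 0 otherwise. */
static int adjust_timeout(struct model_req *req, unsigned long now)
{
	int status = 0;

	if (now < req->minortimeo) {
		/* Premature resend (e.g. after an RST): restart both
		 * deadlines from now and leave the timeout value alone. */
		req->majortimeo = now + calc_majortimeo(req);
		req->minortimeo = now + req->timeout;
		return status;
	}
	if (now < req->majortimeo) {
		/* Genuine minor timeout: back off, bounded by the cap. */
		req->timeout <<= 1;
		if (req->timeout >= TO_MAXVAL)
			req->timeout = TO_MAXVAL;
	} else {
		/* Major timeout: start over from the initial value. */
		req->timeout = TO_INITVAL;
		req->majortimeo += calc_majortimeo(req);
		status = -1;
	}
	req->minortimeo = now + req->timeout;
	assert(req->timeout != 0);
	return status;
}
```

Calling init_req() at time 0 and then adjust_timeout() at 50s (the RST case)
leaves the timeout at 60s and moves the minor and major deadlines to 110s and
290s. A later call at 110s is treated as a real minor timeout and doubles the
timeout to 120s. The first case also shows the effect Trond describes: every
early reset pushes the major deadline further out.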
