From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Artem Savkov <asavkov@redhat.com>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	netdev@vger.kernel.org, davem@davemloft.net,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/2] timer: add a function to adjust timeouts to be upper bound
Date: Thu, 7 Apr 2022 22:39:14 -0700
Message-ID: <20220408053914.csbplfylubrlkads@treble>
In-Reply-To: <87zgkwjtq2.ffs@tglx>

On Fri, Apr 08, 2022 at 02:37:25AM +0200, Thomas Gleixner wrote:
>  "Make sure TCP keepalive timer does not expire late. Switching to upper
>   bound timers means it can fire off early but in case of keepalive
>   tcp_keepalive_timer() handler checks elapsed time and resets the timer
>   if it was triggered early. This results in timer "cascading" to a
>   higher precision and being just a couple of milliseconds off its
>   original mark."
> 
> Which reinvents the cascading effect of the original timer wheel just
> with more overhead. Where is the justification for this?
> 
> Is this really true for all the reasons where the keep alive timers are
> armed? I seriously doubt that. Why?
> 
> On the end which waits for the keep alive packet to arrive in time it
> does not matter at all, whether the cutoff is a bit later than defined.
> 
>      So why do you want to let the timer fire early just to rearm it? 
> 
> But it matters a lot on the sender side. If that is late and the other
> end is strict about the timeout then you lost. But does it matter
> whether you send the packet too early? No, it does not matter at all
> because the important point is that you send it _before_ the other side
> decides to give up.
> 
>      So why do you want to let the timer fire precise?
> 
> You are solving the sender side problem by introducing a receiver side
> problem and both suffer from the overhead for no reason.
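
For reference, the "fire early, check elapsed time, re-arm" pattern that
the quoted changelog describes can be sketched roughly as below.  This is
a userspace illustration only, not the actual tcp_keepalive_timer() code:
struct ka_state and ka_timer_fired() are hypothetical names, and times
are plain jiffies-like counters.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-connection state; not a kernel structure. */
struct ka_state {
	unsigned long last_activity;	/* time of last activity, in jiffies */
	unsigned long keepalive_time;	/* idle time before the first probe */
};

/*
 * Called when the (possibly early) timer fires at time 'now'.
 * Returns true if the idle interval has fully elapsed and a probe is due.
 * Otherwise returns false and stores the remaining time in *rearm_delta,
 * which the caller would use to re-arm the timer.
 */
static bool ka_timer_fired(const struct ka_state *s, unsigned long now,
			   unsigned long *rearm_delta)
{
	unsigned long elapsed = now - s->last_activity;

	if (elapsed < s->keepalive_time) {
		*rearm_delta = s->keepalive_time - elapsed;
		return false;
	}

	*rearm_delta = 0;
	return true;
}
```

Each re-arm uses a shorter remaining delta, which is what lets the timer
land progressively closer to the intended expiry, at the cost of extra
timer callbacks.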

Here are my thoughts.  Maybe some networking folks can chime in to
keep us honest.

I get most of what you're saying, though my understanding is that
keepalive is only involved in sending packets, not receiving them.  I do
think there would be two opposing use cases:

  1) Client sending packets to prevent server disconnects

  2) Server sending packets to detect client disconnects

For #1, it's ok for the timer to pop early.  For #2, it's ok for it to
pop late.  So my conclusion is about the same as your sender/receiver
scenario: there are two sides to the same coin.

If we assume both use cases are valid (which I'm not entirely convinced
of), doesn't that mean that the keepalive timer needs to be precise?

Otherwise we're going to have broken expectations in one direction or
the other, depending on the use case.

> Aside from the theoretical issue of why this matters at all, I have yet to see
> a reasonable argument what the practical problem is. If this would be a
> real problem in the wild then why haven't we seen a reasonable bug
> report within 6 years?

Good question.  At least part of the answer *might* be that enterprise
kernels tend to be adopted very slowly.  This issue was reported on RHEL
8.3 which is a 4.18 based kernel:

  The time that the 1st TCP keepalive probe is sent can be configured by
  the "net.ipv4.tcp_keepalive_time" sysctl or by setsockopt(). 
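
  For illustration, the per-socket setsockopt() route mentioned above
  might look like the sketch below.  SO_KEEPALIVE and TCP_KEEPIDLE are
  the standard Linux socket options; the helper name
  set_keepalive_idle() is made up for this example.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable keepalive on 'fd' and set the idle time (in seconds) before the
 * first probe is sent.  Returns 0 on success, -1 on error (errno is set
 * by setsockopt()).
 */
static int set_keepalive_idle(int fd, int idle_secs)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;

	return setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
			  &idle_secs, sizeof(idle_secs));
}
```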

  We observe that if that value is set to 300 seconds, the timer
  actually fires around 15-20 seconds later. So ~317 seconds. The larger
  the expiration time the greater the delay. So for the default of 2
  hours it can be delayed by minutes. This is causing problems for some
  customers that rely on the TCP keepalive timer to keep entries active
  in firewalls and expect it to be accurate as TCP keepalive values have
  to correspond to the firewall settings. 
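
The delays in that report look consistent with timer wheel granularity.
Below is a simplified sketch of the level selection, based on my reading
of kernel/time/timer.c.  The constants (LVL_CLK_SHIFT of 3, LVL_BITS of 6,
nine levels) and HZ=1000 are assumptions here, and the helper names are
made up; the real calc_wheel_index() handles more cases.

```c
/* Assumed constants; check kernel/time/timer.c for the real definitions. */
#define LVL_CLK_SHIFT	3
#define LVL_BITS	6
#define LVL_SIZE	(1UL << LVL_BITS)
#define LVL_DEPTH	9
#define HZ		1000UL	/* assumption: 1 jiffy == 1 ms */

/* Smallest delta (in jiffies) that no longer fits in level n - 1. */
static unsigned long lvl_start(unsigned int n)
{
	return (LVL_SIZE - 1) << ((n - 1) * LVL_CLK_SHIFT);
}

/* Wheel level a timer with the given delta (in jiffies) is queued in. */
static unsigned int wheel_level(unsigned long delta)
{
	unsigned int lvl;

	for (lvl = 0; lvl < LVL_DEPTH - 1; lvl++)
		if (delta < lvl_start(lvl + 1))
			return lvl;

	return LVL_DEPTH - 1;
}

/* Granularity of a level, in jiffies: 8^lvl. */
static unsigned long lvl_gran(unsigned int lvl)
{
	return 1UL << (lvl * LVL_CLK_SHIFT);
}
```

Under those assumptions, wheel_level(300 * HZ) is 5 with a granularity
of 32768 jiffies (about 32 seconds), and wheel_level(7200 * HZ) is 6
with a granularity of 262144 jiffies (a bit over 4 minutes).  That is in
the same range as the 15-20 second and "minutes" delays reported.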

-- 
Josh

