From: Florian Weimer <fweimer@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>,
	netdev@vger.kernel.org, Eric Dumazet <eric.dumazet@gmail.com>
Cc: xiyou.wangcong@gmail.com
Subject: Re: [net-next PATCH 2/3] net: reduce cycles spend on ICMP replies that gets rate limited
Date: Sun, 4 Jun 2017 09:11:53 +0200
Message-ID: <7d432179-5e3a-febe-ced7-39ea33ba4906@redhat.com>
In-Reply-To: <20170109150409.30215.34612.stgit@firesoul>

On 01/09/2017 04:04 PM, Jesper Dangaard Brouer wrote:
> This patch split the global and per (inet)peer ICMP-reply limiter
> code, and moves the global limit check to earlier in the packet
> processing path.  Thus, avoid spending cycles on ICMP replies that
> gets limited/suppressed anyhow.
> 
> The global ICMP rate limiter icmp_global_allow() is a good solution,
> it just happens too late in the process.  The kernel goes through the
> full route lookup (return path) for the ICMP message, before taking
> the rate limit decision of not sending the ICMP reply.
> 
> Details: The kernels global rate limiter for ICMP messages got added
> in commit 4cdf507d5452 ("icmp: add a global rate limitation").  It is
> a token bucket limiter with a global lock.  It brilliantly avoids
> locking congestion by only updating when 20ms (HZ/50) were elapsed. It
> can then avoids taking lock when credit is exhausted (when under
> pressure) and time constraint for refill is not yet meet.

This patch removed the rate limit bypass for localhost.  As a result, it
is impossible to write deterministic UDP client tests which exercise
failover behavior in response to unreachable servers.

H.J. Lu noted that a glibc test started failing on kernel 4.11 and
identified the regression:

  https://sourceware.org/ml/libc-alpha/2017-06/msg00167.html

(I have more tests which are afflicted by this, but are not yet in glibc
upstream.)

This is particularly annoying because we already run such tests in a
network namespace for isolation, but the rate limit counter is global,
so that doesn't help here.

I'm attaching a self-contained test case.  It fails for me with:

localhost-icmp: iteration 50: no ICMP message (poll timeout)

On kernel 4.10, it passes and runs within just a few milliseconds.

Would you please fix this in some way?  Thanks.

Florian

[-- Attachment #2: localhost-icmp.c --]
[-- Type: text/x-csrc; name="localhost-icmp.c", Size: 1728 bytes --]

#include <arpa/inet.h>
#include <err.h>
#include <netinet/in.h>
#include <stdio.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* How many UDP packets to send to a non-responding port.  */
enum { ITERATIONS = 1000 };

int
main (void)
{
  /* Pick a port number which is likely unused.  */
  unsigned short port;
  {
    int sock = socket (AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
      err (1, "socket");
    struct sockaddr_in sin = { .sin_family = AF_INET };
    if (bind (sock, (struct sockaddr *) &sin, sizeof (sin)) < 0)
      err (1, "bind");
    socklen_t sinlen = sizeof (sin);
    if (getsockname (sock, (struct sockaddr *) &sin, &sinlen))
      err (1, "getsockname");
    if (sinlen != sizeof (sin) || sin.sin_family != AF_INET)
      errx (1, "wrong address information for socket");
    if (close (sock) < 0)
      err (1, "close");
    port = sin.sin_port;
  }

  for (int i = 0; i < ITERATIONS; ++i)
    {
      int sock = socket (AF_INET, SOCK_DGRAM, 0);
      if (sock < 0)
        err (1, "socket");
      struct sockaddr_in sin =
        {
          .sin_family = AF_INET,
          .sin_addr = { htonl (INADDR_LOOPBACK) },
          .sin_port = port,
        };
      if (connect (sock, (struct sockaddr *) &sin, sizeof (sin)) < 0)
        err (1, "connect");
      if (sendto (sock, "", 1, 0, NULL, 0) < 0)
        err (1, "sendto");
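      struct pollfd fd = { .fd = sock, .events = POLLIN };
      /* The ICMP error is reported as a pending socket error, which
         wakes up poll even though no data arrives.  */
      int ret = poll (&fd, 1, 5000);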
      if (ret < 0)
        err (1, "poll");
      if (ret == 0)
        errx (1, "iteration %d: no ICMP message (poll timeout)", i);
      if (close (sock) < 0)
        err (1, "close");
    }
}
