From: Martin KaFai Lau <kafai@fb.com>
To: Jakub Sitnicki <jakub@cloudflare.com>
Cc: <netdev@vger.kernel.org>, <bpf@vger.kernel.org>,
	<dccp@vger.kernel.org>, <kernel-team@cloudflare.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Gerrit Renker <gerrit@erg.abdn.ac.uk>,
	Jakub Kicinski <kuba@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Subject: Re: [PATCH bpf-next v2 00/17] Run a BPF program on socket lookup
Date: Mon, 11 May 2020 12:45:20 -0700
Message-ID: <20200511194520.pr5d74ao34jigvof@kafai-mbp.dhcp.thefacebook.com>
In-Reply-To: <20200511185218.1422406-1-jakub@cloudflare.com>

On Mon, May 11, 2020 at 08:52:01PM +0200, Jakub Sitnicki wrote:

[ ... ]

> Performance considerations
> ==========================
> 
> Patch set adds new code on receive hot path. This comes with a cost,
> especially in a scenario of a SYN flood or small UDP packet flood.
> 
> Measuring the performance penalty turned out to be harder than expected
> because socket lookup is fast. For CPUs to spend >= 1% of time in socket
> lookup we had to modify our setup by unloading iptables and reducing the
> number of routes.
> 
> The receiver machine is a Cloudflare Gen 9 server covered in detail at [0].
> In short:
> 
>  - 24 core Intel custom off-roadmap 1.9GHz 150W (Skylake) CPU
>  - dual-port 25G Mellanox ConnectX-4 NIC
>  - 256G DDR4 2666MHz RAM
> 
> Flood traffic pattern:
> 
>  - source: 1 IP, 10k ports
>  - destination: 1 IP, 1 port
>  - TCP - SYN packet
>  - UDP - Len=0 packet
> 
> Receiver setup:
> 
>  - ingress traffic spread over 4 RX queues,
>  - RX/TX pause and autoneg disabled,
>  - Intel Turbo Boost disabled,
>  - TCP SYN cookies always on.
> 
> For TCP test there is a receiver process with single listening socket
> open. Receiver is not accept()'ing connections.
> 
> For UDP the receiver process has a single UDP socket with a filter
> installed, dropping the packets.
> 
> With such setup in place, we record RX pps and cpu-cycles events under
> flood for 60 seconds in 3 configurations:
> 
>  1. 5.6.3 kernel w/o this patch series (baseline),
>  2. 5.6.3 kernel with patches applied, but no SK_LOOKUP program attached,
>  3. 5.6.3 kernel with patches applied, and SK_LOOKUP program attached;
>     BPF program [1] is doing a lookup in an LPM_TRIE map with 200 entries.

Is the link in [1] up-to-date?  I don't see it calling bpf_sk_assign().

> 
> RX pps measured with `ifpps -d <dev> -t 1000 --csv --loop` for 60 seconds.
> 
> | tcp4 SYN flood               | rx pps (mean ± sstdev) | Δ rx pps |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     | 939,616 ± 0.5%         |        - |
> | no SK_LOOKUP prog attached   | 929,275 ± 1.2%         |    -1.1% |
> | with SK_LOOKUP prog attached | 918,582 ± 0.4%         |    -2.2% |
> 
> | tcp6 SYN flood               | rx pps (mean ± sstdev) | Δ rx pps |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     | 875,838 ± 0.5%         |        - |
> | no SK_LOOKUP prog attached   | 872,005 ± 0.3%         |    -0.4% |
> | with SK_LOOKUP prog attached | 856,250 ± 0.5%         |    -2.2% |
> 
> | udp4 0-len flood             | rx pps (mean ± sstdev) | Δ rx pps |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     | 2,738,662 ± 1.5%       |        - |
> | no SK_LOOKUP prog attached   | 2,576,893 ± 1.0%       |    -5.9% |
> | with SK_LOOKUP prog attached | 2,530,698 ± 1.0%       |    -7.6% |
> 
> | udp6 0-len flood             | rx pps (mean ± sstdev) | Δ rx pps |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     | 2,867,885 ± 1.4%       |        - |
> | no SK_LOOKUP prog attached   | 2,646,875 ± 1.0%       |    -7.7% |

What is causing this regression?

> | with SK_LOOKUP prog attached | 2,520,474 ± 0.7%       |   -12.1% |

This also looks very different from udp4.
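
To make the udp4 vs udp6 comparison concrete, here is a minimal sketch that recomputes the Δ rx pps column from the mean values quoted above, and additionally compares "with prog" against "no prog" to isolate the incremental cost of running the program. It uses the means only and ignores the quoted standard deviations; the names `RX_PPS`, `rel_change`, `summarize` and `SUMMARY` are made up for this illustration.

```python
# Mean rx pps values copied from the quoted tables (means only, stdev ignored).
RX_PPS = {
    "tcp4": {"baseline": 939_616, "no_prog": 929_275, "with_prog": 918_582},
    "tcp6": {"baseline": 875_838, "no_prog": 872_005, "with_prog": 856_250},
    "udp4": {"baseline": 2_738_662, "no_prog": 2_576_893, "with_prog": 2_530_698},
    "udp6": {"baseline": 2_867_885, "no_prog": 2_646_875, "with_prog": 2_520_474},
}


def rel_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def summarize(row: dict) -> dict:
    """Deltas against the baseline, plus the delta between the two patched cases."""
    return {
        "no_prog_vs_baseline": round(rel_change(row["no_prog"], row["baseline"]), 1),
        "with_prog_vs_baseline": round(rel_change(row["with_prog"], row["baseline"]), 1),
        # Not in the quoted tables: cost of the attached program alone.
        "with_prog_vs_no_prog": round(rel_change(row["with_prog"], row["no_prog"]), 1),
    }


SUMMARY = {flood: summarize(row) for flood, row in RX_PPS.items()}

if __name__ == "__main__":
    for flood, deltas in SUMMARY.items():
        print(flood, deltas)
```

The first two values per flood type reproduce the quoted Δ rx pps column. The third is where udp4 and udp6 diverge: attaching the program costs about 1.8% on top of the patched kernel for udp4, but about 4.8% for udp6.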

> 
> Also visualized on bpf-sk-lookup-v1-rx-pps.png chart [2].
> 
> cpu-cycles measured with `perf record -F 999 --cpu 1-4 -g -- sleep 60`.
> 
> |                              |      cpu-cycles events |          |
> | tcp4 SYN flood               | __inet_lookup_listener | Δ events |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     |                  1.12% |        - |
> | no SK_LOOKUP prog attached   |                  1.31% |    0.19% |
> | with SK_LOOKUP prog attached |                  3.05% |    1.93% |
> 
> |                              |      cpu-cycles events |          |
> | tcp6 SYN flood               |  inet6_lookup_listener | Δ events |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     |                  1.05% |        - |
> | no SK_LOOKUP prog attached   |                  1.68% |    0.63% |
> | with SK_LOOKUP prog attached |                  3.15% |    2.10% |
> 
> |                              |      cpu-cycles events |          |
> | udp4 0-len flood             |      __udp4_lib_lookup | Δ events |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     |                  3.81% |        - |
> | no SK_LOOKUP prog attached   |                  5.22% |    1.41% |
> | with SK_LOOKUP prog attached |                  8.20% |    4.39% |
> 
> |                              |      cpu-cycles events |          |
> | udp6 0-len flood             |      __udp6_lib_lookup | Δ events |
> |------------------------------+------------------------+----------|
> | 5.6.3 vanilla (baseline)     |                  5.51% |        - |
> | no SK_LOOKUP prog attached   |                  6.51% |    1.00% |
> | with SK_LOOKUP prog attached |                 10.14% |    4.63% |
> 
> Also visualized on bpf-sk-lookup-v1-cpu-cycles.png chart [3].
> 
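
Note that the "Δ events" column in the cpu-cycles tables is a difference in percentage points, not a relative change like "Δ rx pps". A small sketch that recomputes it from the quoted shares, and also shows the relative growth factor for comparison, could look as follows. `CPU_CYCLES`, `pp_delta` and `growth` are names invented for this illustration.

```python
# Share of cpu-cycles events (percent) spent in each lookup function,
# copied from the quoted tables: (baseline, no prog attached, prog attached).
CPU_CYCLES = {
    "__inet_lookup_listener": (1.12, 1.31, 3.05),
    "inet6_lookup_listener": (1.05, 1.68, 3.15),
    "__udp4_lib_lookup": (3.81, 5.22, 8.20),
    "__udp6_lib_lookup": (5.51, 6.51, 10.14),
}


def pp_delta(share: float, baseline: float) -> float:
    """Difference in percentage points, as in the quoted 'Δ events' column."""
    return round(share - baseline, 2)


def growth(share: float, baseline: float) -> float:
    """Relative growth factor of the share (not part of the quoted tables)."""
    return round(share / baseline, 2)


if __name__ == "__main__":
    for func, (base, no_prog, with_prog) in CPU_CYCLES.items():
        print(
            f"{func}: no prog +{pp_delta(no_prog, base)} pp, "
            f"with prog +{pp_delta(with_prog, base)} pp "
            f"({growth(with_prog, base)}x baseline)"
        )
```

Read this way, the lookup share grows by between 1.93 and 4.63 percentage points with a program attached, which is roughly 1.8x to 3x the baseline share depending on the protocol.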

[ ... ]

> 
> [0] https://blog.cloudflare.com/a-tour-inside-cloudflares-g9-servers/
> [1] https://github.com/majek/inet-tool/blob/master/ebpf/inet-kern.c
> [2] https://drive.google.com/file/d/1HrrjWhQoVlqiqT73_eLtWMPhuGPKhGFX/
> [3] https://drive.google.com/file/d/1cYPPOlGg7M-bkzI4RW1SOm49goI4LYbb/
> [RFCv1] https://lore.kernel.org/bpf/20190618130050.8344-1-jakub@cloudflare.com/
> [RFCv2] https://lore.kernel.org/bpf/20190828072250.29828-1-jakub@cloudflare.com/

