bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <edumazet@google.com>
To: Martin KaFai Lau <kafai@fb.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	kernel-team <kernel-team@fb.com>, Lawrence Brakmo <brakmo@fb.com>,
	Neal Cardwell <ncardwell@google.com>,
	netdev <netdev@vger.kernel.org>,
	Yuchung Cheng <ycheng@google.com>
Subject: Re: [PATCH bpf-next 01/10] tcp: Use a struct to represent a saved_syn
Date: Sat, 27 Jun 2020 10:41:32 -0700	[thread overview]
Message-ID: <CANn89iLJNWh6bkH7DNhy_kmcAexuUCccqERqe7z2QsvPhGrYPQ@mail.gmail.com> (raw)
In-Reply-To: <20200626175508.1460345-1-kafai@fb.com>

On Fri, Jun 26, 2020 at 10:55 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
> The total length of the saved syn packet is currently stored in
> the first 4 bytes (u32) and the actual packet data is stored after that.
>
> A latter patch will also want to store an offset (bpf_hdr_opt_off) to
> a TCP header option which the bpf program will be interested in parsing.
> Instead of anonymously storing this offset into the second 4 bytes,
> this patch creates a struct for the existing saved_syn.
> It can give a readable name to the stored lengths instead of implicitly
> using the first few u32(s) to do that.
>
> The new TCP bpf header offset (bpf_hdr_opt_off) added in a latter patch is
> an offset from the tcp header instead of from the network header.
> It will make the bpf programming side easier.  Thus, this patch stores
> the network header length instead of the total length of the syn
> header.  The total length can be obtained by the
> "network header len + tcp_hdrlen".  The latter patch can
> then also gets the offset to the TCP bpf header option by
> "network header len + bpf_hdr_opt_off".
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---
>  include/linux/tcp.h        | 11 ++++++++++-
>  include/net/request_sock.h |  7 ++++++-
>  net/core/filter.c          |  4 ++--
>  net/ipv4/tcp.c             |  9 +++++----
>  net/ipv4/tcp_input.c       | 12 ++++++------
>  5 files changed, 29 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 3bdec31ce8f4..9d50132d95e6 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -404,7 +404,7 @@ struct tcp_sock {
>          * socket. Used to retransmit SYNACKs etc.
>          */
>         struct request_sock __rcu *fastopen_rsk;
> -       u32     *saved_syn;
> +       struct saved_syn *saved_syn;
>  };
>
>  enum tsq_enum {
> @@ -482,6 +482,15 @@ static inline void tcp_saved_syn_free(struct tcp_sock *tp)
>         tp->saved_syn = NULL;
>  }
>
> +static inline u32 tcp_saved_syn_len(const struct saved_syn *saved_syn)
> +{
> +       const struct tcphdr *th;
> +
> +       th = (void *)saved_syn->data + saved_syn->network_hdrlen;
> +
> +       return saved_syn->network_hdrlen + __tcp_hdrlen(th);
> +}


Ah... We have a patch extending TCP_SAVE_SYN to save all headers, so
keeping the length in a proper field would be better than going back
to TCP header to get __tcp_hdrlen(th)

I am not sure why trying to save 4 bytes in this saved_syn would matter ;)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index c9fcfa4ec43f3f0d75763e2bc6773e15bd38d68f..8ecdc5f87788439c7a08d3b72f9567e6369e7c4e
100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -258,7 +258,7 @@ struct tcp_sock {
                fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
                fastopen_no_cookie:1, /* Allow send/recv SYN+data
without a cookie */
                is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
-               unused:2;
+               save_syn:2;     /* Save headers of SYN packet */
        u8      nonagle     : 4,/* Disable Nagle algorithm?             */
                thin_lto    : 1,/* Use linear timeouts for thin streams */
                recvmsg_inq : 1,/* Indicate # of bytes in queue upon recvmsg */
@@ -270,7 +270,7 @@ struct tcp_sock {
                syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
                syn_fastopen_ch:1, /* Active TFO re-enabling probe */
                syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
-               save_syn:1,     /* Save headers of SYN packet */
+               unused_save_syn:1,      /* Moved above */
                is_cwnd_limited:1,/* forward progress limited by snd_cwnd? */
                syn_smc:1;      /* SYN includes SMC */
        u32     tlp_high_seq;   /* snd_nxt at the time of TLP retransmit. */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fa62baf509c8075cb7da30ee0f65059ac35c1c60..7e108de07fb4e45a994d3d75331489ad82f9deb7
100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3097,7 +3097,8 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
                break;

        case TCP_SAVE_SYN:
-               if (val < 0 || val > 1)
+               /* 0: disable, 1: enable, 2: start from ether_header */
+               if (val < 0 || val > 2)
                        err = -EINVAL;
                else
                        tp->save_syn = val;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 7ce5bad2308134954133f612ec129cf56946d9a1..5513f8aaae9f6c0303fac4d2c590ead1a6076502
100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6708,11 +6708,19 @@ static void tcp_reqsk_record_syn(const struct sock *sk,
        if (tcp_sk(sk)->save_syn) {
                u32 len = skb_network_header_len(skb) + tcp_hdrlen(skb);
                u32 *copy;
+               void *base;
+
+               if (tcp_sk(sk)->save_syn == 2) {  /* Save full header. */
+                       len += skb->network_header - skb->mac_header;
+                       base = skb_mac_header(skb);
+               } else {
+                       base = skb_network_header(skb);
+               }

                copy = kmalloc(len + sizeof(u32), GFP_ATOMIC);
                if (copy) {
                        copy[0] = len;
-                       memcpy(&copy[1], skb_network_header(skb), len);
+                       memcpy(&copy[1], base, len);
                        req->saved_syn = copy;
                }
        }

  reply	other threads:[~2020-06-27 17:41 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-26 17:55 [PATCH bpf-next 00/10] BPF TCP header options Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 01/10] tcp: Use a struct to represent a saved_syn Martin KaFai Lau
2020-06-27 17:41   ` Eric Dumazet [this message]
2020-06-30 23:24     ` Martin KaFai Lau
2020-06-30 23:35       ` Eric Dumazet
2020-06-26 17:55 ` [PATCH bpf-next 02/10] tcp: bpf: Parse BPF experimental header option Martin KaFai Lau
2020-06-27 16:44   ` Eric Dumazet
2020-06-27 17:17   ` Eric Dumazet
2020-06-28 23:44     ` Martin KaFai Lau
2020-06-29  0:45     ` Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 03/10] bpf: sock_ops: Change some members of sock_ops_kern from u32 to u8 Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 04/10] bpf: tcp: Allow bpf prog to write and parse BPF TCP header option Martin KaFai Lau
2020-06-28 18:24   ` Alexei Starovoitov
2020-06-29  0:34     ` Martin KaFai Lau
2020-07-02  5:31       ` Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 05/10] bpf: selftests: A few improvements to network_helpers.c Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 06/10] bpf: selftests: Add fastopen_connect to network_helpers Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 07/10] bpf: selftests: Restore netns after each test Martin KaFai Lau
2020-06-26 22:45   ` Andrii Nakryiko
2020-06-27  0:23     ` Martin KaFai Lau
2020-06-27 20:31       ` Andrii Nakryiko
2020-06-29 18:00         ` Martin KaFai Lau
2020-06-29 18:13           ` Andrii Nakryiko
2020-06-29 18:24             ` Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 08/10] bpf: selftests: tcp header options Martin KaFai Lau
2020-06-26 17:55 ` [PATCH bpf-next 09/10] tcp: bpf: Add TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN to bpf_setsockopt Martin KaFai Lau
2020-06-27 17:30   ` Eric Dumazet
2020-06-26 17:56 ` [PATCH bpf-next 10/10] bpf: selftest: Add test for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN Martin KaFai Lau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANn89iLJNWh6bkH7DNhy_kmcAexuUCccqERqe7z2QsvPhGrYPQ@mail.gmail.com \
    --to=edumazet@google.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brakmo@fb.com \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).