bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Stanislav Fomichev <sdf@google.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>,
	Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>,
	Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>,
	David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [xdp-hints] Re: [PATCH bpf-next 05/11] veth: Support rx timestamp metadata for xdp
Date: Thu, 17 Nov 2022 12:32:31 +0100	[thread overview]
Message-ID: <87wn7t4y0g.fsf@toke.dk> (raw)
In-Reply-To: <CAKH8qBvTdnyRYT+ocNS_ZmOfoN+nBEJ5jcBcKcqZ1hx0a5WrSw@mail.gmail.com>

Stanislav Fomichev <sdf@google.com> writes:

>> > Doesn't look like the descriptors are as nice as you're trying to
>> > paint them (with clear hash/csum fields) :-) So not sure how much
>> > CO-RE would help.
>> > At least looking at mlx4 rx_csum, the driver consults three different
>> > sets of flags to figure out the hash_type. Or am I just unlucky with
>> > mlx4?
>>
>> Which part are you talking about ?
>>         hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
>> is trivial enough for bpf prog to do if it has access to 'cqe' pointer
>> which is what John is proposing (I think).
>
> I'm talking about mlx4_en_process_rx_cq, the caller of that check_csum.
> In particular: if (likely(dev->features & NETIF_F_RXCSUM)) branch
> I'm assuming we want to have hash_type available to the progs?

I agree we should expose the hash_type, but that doesn't actually look
to be that complicated, see below.

> But also, check_csum handles other corner cases:
> - short_frame: we simply force all those small frames to skip checksum complete
> - get_fixed_ipv6_csum: In IPv6 packets, hw_checksum lacks 6 bytes from
> IPv6 header
> - get_fixed_ipv4_csum: Although the stack expects checksum which
> doesn't include the pseudo header, the HW adds it
>
> So it doesn't look like we can just unconditionally use cqe->checksum?
> The driver does a lot of massaging around that field to make it
> palatable.

Poking around a bit in the other drivers, AFAICT it's only a subset of
drivers that support CSUM_COMPLETE at all; for instance, the Intel
drivers just set CHECKSUM_UNNECESSARY for TCP/UDP/SCTP. I think the
CHECKSUM_UNNECESSARY is actually the most important bit we'd want to
propagate?

AFAICT, the drivers actually implementing CHECKSUM_COMPLETE need access
to other data structures than the rx descriptor to determine the status
of the checksum (mlx4 looks at priv->flags, mlx5 checks rq->state), so
just exposing the rx descriptor to BPF as John is suggesting does not
actually give the XDP program enough information to act on the checksum
field on its own. We could still have a separate kfunc to just expose
the hw checksum value (see below), but I think it probably needs to be
paired with other kfuncs to be useful.

Looking at the mlx4 code, I think the following mapping to kfuncs (in
pseudo-C) would give the flexibility for XDP to access all the bits it
needs, while inlining everything except getting the full checksum for
non-TCP/UDP traffic. An (admittedly cursory) glance at some of the other
drivers (mlx5, ice, i40e) indicates that this would work for those
drivers as well.


bpf_xdp_metadata_rx_hash_supported() {
  return dev->features & NETIF_F_RXHASH;
}

bpf_xdp_metadata_rx_hash() {
  return be32_to_cpu(cqe->immed_rss_invalid);
}

bpf_xdp_metdata_rx_hash_type() {
  if (likely(dev->features & NETIF_F_RXCSUM) &&
      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP | MLX4_CQE_STATUS_UDP)) &&
	(cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
	  cqe->checksum == cpu_to_be16(0xffff))
     return PKT_HASH_TYPE_L4;

   return PKT_HASH_TYPE_L3;
}

bpf_xdp_metadata_rx_csum_supported() {
  return dev->features & NETIF_F_RXCSUM;
}

bpf_xdp_metadata_rx_csum_level() {
	if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
				       MLX4_CQE_STATUS_UDP)) &&
	    (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
	    cqe->checksum == cpu_to_be16(0xffff))
            return CHECKSUM_UNNECESSARY;
            
	if (!(priv->flags & MLX4_EN_FLAG_RX_CSUM_NON_TCP_UDP &&
	      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IP_ANY))) &&
              !short_frame(len))
            return CHECKSUM_COMPLETE; /* we could also omit this case entirely */

        return CHECKSUM_NONE;
}

/* this one could be called by the metadata_to_skb code */
bpf_xdp_metadata_rx_csum_full() {
  return check_csum() /* BPF_CALL this after refactoring so it is skb-agnostic */
}

/* this one would be for people like John who want to re-implement
 * check_csum() themselves */
bpf_xdp_metdata_rx_csum_raw() {
  return cqe->checksum;
}


-Toke


  parent reply	other threads:[~2022-11-17 11:33 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-15  3:01 [PATCH bpf-next 00/11] xdp: hints via kfuncs Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 01/11] bpf: Document XDP RX metadata Stanislav Fomichev
2022-11-15 22:31   ` Zvi Effron
2022-11-15 22:43     ` Stanislav Fomichev
2022-11-15 23:34       ` Zvi Effron
2022-11-16  3:50         ` Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 02/11] bpf: Introduce bpf_patch Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 03/11] bpf: Support inlined/unrolled kfuncs for xdp metadata Stanislav Fomichev
2022-11-15 16:16   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-15 18:37     ` Stanislav Fomichev
2022-11-16 20:42   ` Jakub Kicinski
2022-11-16 20:53     ` Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 04/11] bpf: Implement hidden BPF_PUSH64 and BPF_POP64 instructions Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 05/11] veth: Support rx timestamp metadata for xdp Stanislav Fomichev
2022-11-15 16:17   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-15 18:37     ` Stanislav Fomichev
2022-11-15 22:46       ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-16  4:09         ` Stanislav Fomichev
2022-11-16  6:38           ` John Fastabend
2022-11-16  7:47             ` Martin KaFai Lau
2022-11-16 10:08               ` Toke Høiland-Jørgensen
2022-11-16 18:20                 ` Martin KaFai Lau
2022-11-16 19:03                 ` John Fastabend
2022-11-16 20:50                   ` Stanislav Fomichev
2022-11-16 23:47                     ` John Fastabend
2022-11-17  0:19                       ` Stanislav Fomichev
2022-11-17  2:17                         ` Alexei Starovoitov
2022-11-17  2:53                           ` Stanislav Fomichev
2022-11-17  2:59                             ` Alexei Starovoitov
2022-11-17  4:18                               ` Stanislav Fomichev
2022-11-17  6:55                                 ` John Fastabend
2022-11-17 17:51                                   ` Stanislav Fomichev
2022-11-17 19:47                                     ` John Fastabend
2022-11-17 20:17                                       ` Alexei Starovoitov
2022-11-17 11:32                             ` Toke Høiland-Jørgensen [this message]
2022-11-17 16:59                               ` Alexei Starovoitov
2022-11-17 17:52                                 ` Stanislav Fomichev
2022-11-17 23:46                                   ` Toke Høiland-Jørgensen
2022-11-18  0:02                                     ` Alexei Starovoitov
2022-11-18  0:29                                       ` Toke Høiland-Jørgensen
2022-11-17 10:27                       ` Toke Høiland-Jørgensen
2022-11-15  3:02 ` [PATCH bpf-next 06/11] xdp: Carry over xdp metadata into skb context Stanislav Fomichev
2022-11-15 23:20   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-16  3:49     ` Stanislav Fomichev
2022-11-16  9:30       ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-16  7:04   ` Martin KaFai Lau
2022-11-16  9:48     ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-16 20:51       ` Stanislav Fomichev
2022-11-16 20:51     ` Stanislav Fomichev
2022-11-16 21:12   ` Jakub Kicinski
2022-11-16 21:49     ` Martin KaFai Lau
2022-11-18 14:05   ` Jesper Dangaard Brouer
2022-11-18 18:18     ` Stanislav Fomichev
2022-11-19 12:31       ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-21 17:53         ` Stanislav Fomichev
2022-11-21 18:47           ` Jakub Kicinski
2022-11-21 19:41             ` Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 07/11] selftests/bpf: Verify xdp_metadata xdp->af_xdp path Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 08/11] selftests/bpf: Verify xdp_metadata xdp->skb path Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 09/11] mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 10/11] mxl4: Support rx timestamp metadata for xdp Stanislav Fomichev
2022-11-15  3:02 ` [PATCH bpf-next 11/11] selftests/bpf: Simple program to dump XDP RX metadata Stanislav Fomichev
2022-11-15 15:54 ` [xdp-hints] [PATCH bpf-next 00/11] xdp: hints via kfuncs Toke Høiland-Jørgensen
2022-11-15 18:37   ` Stanislav Fomichev
2022-11-15 22:31     ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-15 22:54     ` [xdp-hints] " Alexei Starovoitov
2022-11-15 23:13       ` [xdp-hints] " Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wn7t4y0g.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).