All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: "Maciej Żenczykowski" <maze@google.com>
Cc: "Tyler Wear (QUIC)" <quic_twear@quicinc.com>,
	"Network Development" <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, "Yonghong Song" <yhs@fb.com>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Song Liu" <song@kernel.org>
Subject: Re: [PATCH bpf-next v6 1/2] Add skb_store_bytes() for BPF_PROG_TYPE_CGROUP_SKB
Date: Tue, 18 Jan 2022 12:37:52 -0800	[thread overview]
Message-ID: <CAADnVQ+5YbkVOHqVGgusGYYYc0sB0uLKNJC+JKZu5Hs07=dgvw@mail.gmail.com> (raw)
In-Reply-To: <CANP3RGfJ2G8P40hN2F=PGDYUc3pm84=SNppHp_J0V+YiDkLM_A@mail.gmail.com>

On Fri, Jan 14, 2022 at 1:18 PM Maciej Żenczykowski <maze@google.com> wrote:
>
> > > > > This is wrong.
> > > > > CGROUP_INET_EGRESS bpf prog cannot arbitrary change packet data.
>
> I agree with this sentiment, which is why the original proposal was
> simply to add a helper which is only capable of modifying the
> tos/tclass/dscp field, and not any arbitrary bytes.  (note: there
> already is such a helper to set the ECN congestion notification bits,
> so there's somewhat of a precedent)

True. bpf_skb_ecn_set_ce() is available to cg_skb progs.
An arbitrary tos rewriting helper would screw it up.

> > > > > The networking stack populated the IP header at that point.
> > > > > If the prog changes it to something else it will be confusing other
> > > > > layers of stack. neigh(L2) will be wrong, etc.
> > > > > We can still change certain things in the packet, but not arbitrary bytes.
> > > > >
> > > > > We cannot change the DS field directly in the packet either.
>
> This part I won't agree with.  In most cases there is no DSCP based
> routing decision, in which case it seems perfectly reasonable to
> change the DSCP bits here.  Indeed last I checked (though this was a
> few years ago) the ipv4 tos routing code wasn't even capable of making
> sane decisions, because it looks at the bottom 4 bits of the TOS
> field, instead of the top 6 bits, ie. you can route on ECN bits, but
> you can't route on the full DSCP field.  Additionally afaik the ipv6
> tclass routing simply wasn't implemented.  However, I last had to deal
> with this probably half a decade ago, on even older kernels, so
> perhaps the situation has changed.
>
> Additionally DSCP bits may affect transmit queue selection (for
> something like wifi qos / traffic prioritization across multiple
> transmit queues with different air-time behaviours - which can use
> dscp), so ideally we need dscp to be set *before* the mq qdisc /
> dispatch.  I think this implies it needs to happen before tc (though
> again, I'm not too certain of the ordering here).

and that's where tc bpf progs are running. Right before qdiscs.
They can change any byte of the packet.
So I still don't see a use case of adding a specific
tos/tclass/dscp rewriting helper to cg_skb when tc bpf prog
are already capable.

  reply	other threads:[~2022-01-18 20:38 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-13  0:06 [PATCH bpf-next v6 1/2] Add skb_store_bytes() for BPF_PROG_TYPE_CGROUP_SKB Tyler Wear
2022-01-13  0:06 ` [PATCH bpf-next v6 2/2] selftests/bpf: CGROUP_SKB test for skb_store_bytes() Tyler Wear
2022-01-13  0:55   ` Song Liu
2022-01-13  0:43 ` [PATCH bpf-next v6 1/2] Add skb_store_bytes() for BPF_PROG_TYPE_CGROUP_SKB Song Liu
2022-01-13  2:14 ` Alexei Starovoitov
2022-01-13  5:12   ` Tyler Wear
2022-01-14  6:49     ` Tyler Wear (QUIC)
2022-01-14  6:55       ` Alexei Starovoitov
2022-01-14 21:18         ` Maciej Żenczykowski
2022-01-18 20:37           ` Alexei Starovoitov [this message]
2022-01-18 22:23             ` Tyler Wear
2022-01-18 22:31               ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAADnVQ+5YbkVOHqVGgusGYYYc0sB0uLKNJC+JKZu5Hs07=dgvw@mail.gmail.com' \
    --to=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=quic_twear@quicinc.com \
    --cc=song@kernel.org \
    --cc=toke@redhat.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.