bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Jesper Dangaard Brouer" <hawk@kernel.org>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	"Björn Töpel" <bjorn@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Hao Luo" <haoluo@google.com>, "Jakub Kicinski" <kuba@kernel.org>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"KP Singh" <kpsingh@kernel.org>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Song Liu" <song@kernel.org>,
	"Stanislav Fomichev" <sdf@google.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Yonghong Song" <yonghong.song@linux.dev>
Subject: Re: [PATCH RFC net-next 1/2] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.
Date: Thu, 15 Feb 2024 21:23:23 +0100	[thread overview]
Message-ID: <87jzn5cw90.fsf@toke.dk> (raw)
In-Reply-To: <20240214163607.RjjT5bO_@linutronix.de>

Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2024-02-14 17:08:44 [+0100], Toke Høiland-Jørgensen wrote:
>> > During testing I forgot a spot in egress and the test module. You could
>> > argue that the warning is enough since it should pop up in testing and
>> > not production because the code is always missed and not by chance (go
>> > boom, send a report). I *think* I covered all spots, at least the test
>> > suite didn't point anything out to me.
>> 
>> Well, I would prefer if we could make sure we covered everything and not
>> have this odd failure mode where redirect just mysteriously stops
>> working. At the very least, if we keep the check we should have a
>> WARN_ON in there to make it really obvious that something needs to be
>> fixed.
>
> Agree.
>
>> This brings me to another thing I was going to point out separately, but
>> may as well mention it here: It would be good if we could keep the
>> difference between the RT and !RT versions as small as possible to avoid
>> having subtle bugs that only appear in one configuration.
>
> Yes. I do so, too.
>
>> I agree with Jesper that the concept of a stack-allocated "run context"
>> for the XDP program makes sense in general (and I have some vague ideas
>> about other things that may be useful to stick in there). So I'm
>> wondering if it makes sense to do that even in the !RT case? We can't
>> stick the pointer to it into 'current' when running in softirq, but we
>> could change the per-cpu variable to just be a pointer that gets
>> populated by xdp_storage_set()?
>
> I *think* that it could be added to current. The assignment currently
> allows nesting so that is not a problem. Only the outer most set/clear
> would do something. If you run in softirq, you would hijack a random
> task_struct. If the pointer is already assigned then the list and so one
> must be empty because access is only allowed in BH-disabled sections.
>
> However, using per-CPU for the pointer (instead of task_struct) would
> have the advantage that it is always CPU/node local memory while the
> random task_struct could have been allocated on a different NUMA node.

Ah yes, good point, that's probably desirable :)

>> I'm not really sure if this would be performance neutral (it's just
>> moving around a few bits of memory, but we do gain an extra pointer
>> deref), but it should be simple enough to benchmark.
>
> My guess is that we remain with one per-CPU dereference and an
> additional "add offset". That is why I kept the !RT bits as they are
> before getting yelled at.
>
> I could prepare something and run a specific test if you have one.

The test itself is simple enough: Simply run xdp-bench (from
xdp-tools[0]) in drop mode, serve some traffic to the machine and
observe the difference in PPS before and after the patch.

The tricky part is that the traffic actually has to stress the CPU,
which means that the offered load has to be higher than what the CPU can
handle. Which generally means running on high-speed NICs with small
packets: a modern server CPU has no problem keeping up with a 10G link
even at 64-byte packet size, so a 100G NIC is needed, or the test needs
to be run on a low-powered machine.

As a traffic generator, the xdp-trafficgen utility also in xdp-tools can
be used, or the in-kernel pktgen, or something like T-rex or Moongen.
Generally serving UDP traffic with 64-byte packets on a single port
is enough to make sure the traffic is serviced by a single CPU, although
some configuration may be needed to steer IRQs as well.

-Toke

[0] https://github.com/xdp-project/xdp-tools


  reply	other threads:[~2024-02-15 20:23 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13 14:58 [PATCH RFC net-next 0/2] Use per-task storage for XDP-redirects on PREEMPT_RT Sebastian Andrzej Siewior
2024-02-13 14:58 ` [PATCH RFC net-next 1/2] net: Reference bpf_redirect_info via task_struct " Sebastian Andrzej Siewior
2024-02-13 20:50   ` Jesper Dangaard Brouer
2024-02-14 12:19     ` Sebastian Andrzej Siewior
2024-02-14 13:23       ` Toke Høiland-Jørgensen
2024-02-14 14:28         ` Sebastian Andrzej Siewior
2024-02-14 16:08           ` Toke Høiland-Jørgensen
2024-02-14 16:36             ` Sebastian Andrzej Siewior
2024-02-15 20:23               ` Toke Høiland-Jørgensen [this message]
2024-02-16 16:57                 ` Sebastian Andrzej Siewior
2024-02-19 19:01                   ` Toke Høiland-Jørgensen
2024-02-20  9:17                     ` Jesper Dangaard Brouer
2024-02-20 10:17                       ` Sebastian Andrzej Siewior
2024-02-20 10:42                         ` Jesper Dangaard Brouer
2024-02-20 12:08                           ` Sebastian Andrzej Siewior
2024-02-20 12:57                             ` Jesper Dangaard Brouer
2024-02-20 15:32                               ` Sebastian Andrzej Siewior
2024-02-22  9:22                                 ` Sebastian Andrzej Siewior
2024-02-22 10:10                                   ` Jesper Dangaard Brouer
2024-02-22 10:58                                     ` Sebastian Andrzej Siewior
2024-02-20 12:10                           ` Dave Taht
2024-02-14 16:13   ` Toke Høiland-Jørgensen
2024-02-15  9:04     ` Sebastian Andrzej Siewior
2024-02-15 12:11       ` Toke Høiland-Jørgensen
2024-02-13 14:58 ` [PATCH RFC net-next 2/2] net: Move per-CPU flush-lists to bpf_xdp_storage " Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzn5cw90.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=bjorn@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=jonathan.lemon@gmail.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).