xdp-newbies.vger.kernel.org archive mirror
* cpumap infinite loop
@ 2024-03-01 23:04 Tobias Böhm
  2024-03-05 17:08 ` Toke Høiland-Jørgensen
From: Tobias Böhm @ 2024-03-01 23:04 UTC
  To: xdp-newbies

Hello,

I was playing around a bit with cpumaps and wondered what happens when
the attached program just does another CPU redirect to itself.

I ended up having an infinite loop. The working example can be found
here: https://github.com/aibor/cpumap-loop
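
For illustration, a cpumap program that redirects every frame back to the
map entry it is attached to might look roughly like the sketch below. It is
not the code from the linked repository: the map name `cpu_map`, the program
name `cpumap_loop` and the `REDIRECT_TO_CPU0` macro are made up for this
example, and it assumes user space has filled entry 0 with a queue size and
this program's fd. The `#else` branch only provides stand-ins so the file
also compiles outside a BPF toolchain.

```c
#ifdef __bpf__
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* One-entry cpumap. Assumption: user space stores a struct bpf_cpumap_val
 * in entry 0 whose bpf_prog.fd refers to cpumap_loop below. */
struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(struct bpf_cpumap_val));
	__uint(max_entries, 1);
} cpu_map SEC(".maps");

#define REDIRECT_TO_CPU0() bpf_redirect_map(&cpu_map, 0, 0)
#else
/* Host-side stand-ins (not kernel definitions), for compile checks only. */
#define SEC(name)
struct xdp_md { unsigned int rx_queue_index; };
enum { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };
#define REDIRECT_TO_CPU0() XDP_REDIRECT
#endif

/* Runs for each frame dequeued from the cpumap entry and immediately
 * queues the frame to the same entry again, so it never terminates. */
SEC("xdp/cpumap")
int cpumap_loop(struct xdp_md *ctx)
{
	(void)ctx;
	return REDIRECT_TO_CPU0();
}

char LICENSE[] SEC("license") = "GPL";
```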

Now, I wonder if there is a way to detect and break this loop. I took a
look at the xdp_md->rx_queue_index values. When executed by a NIC event,
the value is the NIC queue ID, so a fairly low number. After CPU redirection
the values I saw were far above the range of NIC queue IDs. I couldn't
figure out if it is just a random memory value or if this value still 
has a (maybe different) meaning after CPU redirection. Maybe somehow
related to the CPU queue?

If the field is set to a meaningful value I can make assumptions about
it and would be able to detect previous CPU redirection, I guess.

I'd appreciate any pointers and tips how I could detect such a loop. Or
maybe there is a way to prevent it in the first place other than "just
being careful"?

Thanks in advance,
Tobias


* Re: cpumap infinite loop
  2024-03-01 23:04 cpumap infinite loop Tobias Böhm
@ 2024-03-05 17:08 ` Toke Høiland-Jørgensen
  2024-03-06  9:09   ` Tobias Böhm
From: Toke Høiland-Jørgensen @ 2024-03-05 17:08 UTC
  To: Tobias Böhm, xdp-newbies; +Cc: Lorenzo Bianconi

Tobias Böhm <tobias@aibor.de> writes:

> Hello,
>
> I was playing around a bit with cpumaps and wondered what happens when
> the attached program just does another CPU redirect to itself.
>
> I ended up having an infinite loop. The working example can be found
> here: https://github.com/aibor/cpumap-loop
>
> Now, I wonder if there is a way to detect and break this loop. I took a
> look at the xdp_md->rx_queue_index values. When executed by a NIC event,
> the value is the NIC queue ID, so a fairly low number. After CPU redirection
> the values I saw were far above the range of NIC queue IDs. I couldn't
> figure out if it is just a random memory value or if this value still 
> has a (maybe different) meaning after CPU redirection. Maybe somehow
> related to the CPU queue?

It's random. The rxq data structure is not initialised on the stack, so
it's basically whatever was in that memory. Interestingly, there's a
TODO comment in there to fix this:

https://elixir.bootlin.com/linux/latest/source/kernel/bpf/cpumap.c#L195

Not sure what the intention was here. +Lorenzo, who wrote that code.
Returning the contents of a random uninitialised stack variable is
probably not a good idea, though, we should zero out the data structure.
I'll send a patch for that.
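
As a rough host-side illustration of that fix (not the kernel code
itself): a struct declared on the stack without an initialiser holds
indeterminate values, so any field that is never assigned reads back as
leftover memory. Zero-initialising at the declaration gives every
unassigned field a defined value of 0. `demo_rxq_info` and
`demo_queue_index_seen_by_prog` are hypothetical names that only model
the one field discussed above.

```c
/* Hypothetical, simplified stand-in for the kernel's struct xdp_rxq_info. */
struct demo_rxq_info {
	const void *dev;          /* filled in from the redirected frame */
	unsigned int queue_index; /* never assigned on this path */
};

/* Models what a program would read as its queue index when the
 * structure is zeroed before the known fields are filled in. */
unsigned int demo_queue_index_seen_by_prog(const void *dev_rx)
{
	/* Zero-initialise first; without "= {0}" the fields below that are
	 * not assigned would contain whatever was on the stack. */
	struct demo_rxq_info rxq = {0};

	rxq.dev = dev_rx;

	return rxq.queue_index;
}
```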

> If the field is set to a meaningful value I can make assumptions about
> it and would be able to detect previous CPU redirection, I guess.
>
> I'd appreciate any pointers and tips how I could detect such a loop. Or
> maybe there is a way to prevent it in the first place other than "just
> being careful"?

Well, you kinda have to go out of your way to construct a loop like
this. How are you envisioning this would happen accidentally? :)

-Toke



* Re: cpumap infinite loop
  2024-03-05 17:08 ` Toke Høiland-Jørgensen
@ 2024-03-06  9:09   ` Tobias Böhm
  2024-03-06 10:20     ` Toke Høiland-Jørgensen
From: Tobias Böhm @ 2024-03-06  9:09 UTC
  To: Toke Høiland-Jørgensen; +Cc: xdp-newbies, Lorenzo Bianconi

On Tue, 05 Mar 2024 at 06:08 PM, Toke Høiland-Jørgensen wrote:
> Tobias Böhm <tobias@aibor.de> writes:
> 
> > Hello,
> >
> > I was playing around a bit with cpumaps and wondered what happens when
> > the attached program just does another CPU redirect to itself.
> >
> > I ended up having an infinite loop. The working example can be found
> > here: https://github.com/aibor/cpumap-loop
> >
> > Now, I wonder if there is a way to detect and break this loop. I took a
> > look at the xdp_md->rx_queue_index values. When executed by a NIC event,
> > the value is the NIC queue ID, so a fairly low number. After CPU redirection
> > the values I saw were far above the range of NIC queue IDs. I couldn't
> > figure out if it is just a random memory value or if this value still 
> > has a (maybe different) meaning after CPU redirection. Maybe somehow
> > related to the CPU queue?
> 
> It's random. The rxq data structure is not initialised on the stack, so
> it's basically whatever was in that memory. Interestingly, there's a
> TODO comment in there to fix this:
> 
> https://elixir.bootlin.com/linux/latest/source/kernel/bpf/cpumap.c#L195
> 
> Not sure what the intention was here. +Lorenzo, who wrote that code.
> Returning the contents of a random uninitialised stack variable is
> probably not a good idea, though, we should zero out the data structure.
> I'll send a patch for that.

Thank you for the explanation and the patch. :)

> > If the field is set to a meaningful value I can make assumptions about
> > it and would be able to detect previous CPU redirection, I guess.
> >
> > I'd appreciate any pointers and tips how I could detect such a loop. Or
> > maybe there is a way to prevent it in the first place other than "just
> > being careful"?
> 
> Well, you kinda have to go out of your way to construct a loop like
> this. How are you envisioning this would happen accidentally? :)

I totally agree that it is pretty unlikely to create such a loop by accident,
especially since the map programs are usually rather simple.
My example was driven by pure curiosity, exploring the possibilities of
redirect map programs. And since I saw infinite recursion is possible I
was looking for options for reliable termination conditions. This made me
wonder if I can detect if the program was invoked by the device or by map
redirection.


* Re: cpumap infinite loop
  2024-03-06  9:09   ` Tobias Böhm
@ 2024-03-06 10:20     ` Toke Høiland-Jørgensen
From: Toke Høiland-Jørgensen @ 2024-03-06 10:20 UTC
  To: Tobias Böhm; +Cc: xdp-newbies, Lorenzo Bianconi

Tobias Böhm <tobias@aibor.de> writes:

> On Tue, 05 Mar 2024 at 06:08 PM, Toke Høiland-Jørgensen wrote:
>> Tobias Böhm <tobias@aibor.de> writes:
>> 
>> > Hello,
>> >
>> > I was playing around a bit with cpumaps and wondered what happens when
>> > the attached program just does another CPU redirect to itself.
>> >
>> > I ended up having an infinite loop. The working example can be found
>> > here: https://github.com/aibor/cpumap-loop
>> >
>> > Now, I wonder if there is a way to detect and break this loop. I took a
>> > look at the xdp_md->rx_queue_index values. When executed by a NIC event,
>> > the value is the NIC queue ID, so a fairly low number. After CPU redirection
>> > the values I saw were far above the range of NIC queue IDs. I couldn't
>> > figure out if it is just a random memory value or if this value still 
>> > has a (maybe different) meaning after CPU redirection. Maybe somehow
>> > related to the CPU queue?
>> 
>> It's random. The rxq data structure is not initialised on the stack, so
>> it's basically whatever was in that memory. Interestingly, there's a
>> TODO comment in there to fix this:
>> 
>> https://elixir.bootlin.com/linux/latest/source/kernel/bpf/cpumap.c#L195
>> 
>> Not sure what the intention was here. +Lorenzo, who wrote that code.
>> Returning the contents of a random uninitialised stack variable is
>> probably not a good idea, though, we should zero out the data structure.
>> I'll send a patch for that.
>
> Thank you for the explanation and the patch. :)

You're welcome :)

>> > If the field is set to a meaningful value I can make assumptions about
>> > it and would be able to detect previous CPU redirection, I guess.
>> >
>> > I'd appreciate any pointers and tips how I could detect such a loop. Or
>> > maybe there is a way to prevent it in the first place other than "just
>> > being careful"?
>> 
>> Well, you kinda have to go out of your way to construct a loop like
>> this. How are you envisioning this would happen accidentally? :)
>
> I totally agree that it is pretty unlikely to create such a loop by accident,
> especially since the map programs are usually rather simple.
> My example was driven by pure curiosity, exploring the possibilities of
> redirect map programs. And since I saw infinite recursion is possible I
> was looking for options for reliable termination conditions. This made me
> wonder if I can detect if the program was invoked by the device or by map
> redirection.

I don't think there's any direct way to detect this; after the patch I
sent you could look for queue index == 0, but a NIC's first RX queue
also reports index 0, so that check is not reliable on its own. The most
reliable option is probably to resort to the kind of wrapping you're
doing: you know where you attach each program, so the XDP and cpumap
entry points can each pass an additional parameter to the function that
contains the main logic, like:

SEC("xdp")
int my_xdp_prog(struct xdp_md *ctx)
{
  return my_real_prog(ctx, false);
}

SEC("xdp/cpumap")
int my_cpumap_prog(struct xdp_md *ctx)
{
  return my_real_prog(ctx, true);
}
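
A slightly fuller sketch of the same wrapping idea, with the shared
function filled in, could look like the following. The body of
`my_real_prog`, the map name `cpu_map` and the `REDIRECT_TO_CPU0` macro
are assumptions made for this example, not part of the suggestion above;
the `#else` branch only provides stand-ins so the file also compiles
outside a BPF toolchain.

```c
#include <stdbool.h>

#ifdef __bpf__
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* One-entry cpumap. Assumption: user space fills entry 0 with a queue
 * size and the fd of my_cpumap_prog. */
struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(struct bpf_cpumap_val));
	__uint(max_entries, 1);
} cpu_map SEC(".maps");

#define REDIRECT_TO_CPU0() bpf_redirect_map(&cpu_map, 0, 0)
#else
/* Host-side stand-ins (not kernel definitions), for compile checks only. */
#define SEC(name)
struct xdp_md { unsigned int rx_queue_index; };
enum { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };
#define REDIRECT_TO_CPU0() XDP_REDIRECT
#endif

/* Shared logic. from_cpumap is a compile-time fact supplied by each
 * entry point, so nothing has to be inferred from the context. */
static inline int my_real_prog(struct xdp_md *ctx, bool from_cpumap)
{
	(void)ctx;

	if (from_cpumap)
		return XDP_PASS;	/* already redirected once: stop here */

	return REDIRECT_TO_CPU0();	/* first pass: hand off to the cpumap */
}

SEC("xdp")
int my_xdp_prog(struct xdp_md *ctx)
{
	return my_real_prog(ctx, false);
}

SEC("xdp/cpumap")
int my_cpumap_prog(struct xdp_md *ctx)
{
	return my_real_prog(ctx, true);
}

char LICENSE[] SEC("license") = "GPL";
```

With this split, only the path entered from the device ever issues the
redirect, so a frame crosses the cpumap at most once.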


-Toke

