bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Running JITed and interpreted programs simultaneously
@ 2020-10-09 18:40 Juraj Vijtiuk
  2020-10-13 22:05 ` Andrii Nakryiko
  0 siblings, 1 reply; 7+ messages in thread
From: Juraj Vijtiuk @ 2020-10-09 18:40 UTC (permalink / raw)
  To: bpf; +Cc: Luka Perkov, David Marcinkovic

It would be great to hear if anyone has any thoughts on running a set
of BPF programs JITed while other programs are run by the interpreter.

Something like that would be useful on 32-bit architectures, as the
JIT compiler there doesn't support some instructions, primarily
instructions that work with 64-bit data. As far as I can tell, it is
unlikely that support will be coming soon as it is a general issue for
all 32-bit architectures. Atomic operations like BPF_XADD look
especially problematic regarding support on 32 bit platforms. From
what I managed to see such a conclusion appeared in a few patches
where support for 32-bit JITs was added, for example [0].
That results in some programs being runnable with BPF JIT enabled, and
some failing during load time, but running successfully without JIT on
32-bit platforms.

The only way to run some programs with JIT and some without, that
seems possible right now, is to manually change
/proc/sys/net/core/bpf_jit_enable every time a program is loaded.
Although I've managed to do that and it seems to be working, it seems
pretty hacky and looks like it could cause race conditions if multiple
programs were loaded, especially by independent loaders.

At first glance it seems that if something like this was to be added
to a loader, it would have to either somehow be aware of other BPF
programs being loaded or possibly implement some sort of locking
mechanism which also seems hacky. From what I understand, doing it in
the kernel looks even less promising as bpf_jit_enable is a system
wide setting, and I imagine that changing it to work on a per program
basis would pretty much require a rework of the current design, so
that looks even less promising.

It looks like the best option right now is to just run everything in
interpreted mode, but I want to make sure that I am not missing
something. If someone has tried doing something similar, it would be
great to know about that.

Thanks,
Juraj Vijtiuk

[0] https://lore.kernel.org/netdev/20200305050207.4159-3-luke.r.nels@gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Running JITed and interpreted programs simultaneously
  2020-10-09 18:40 Running JITed and interpreted programs simultaneously Juraj Vijtiuk
@ 2020-10-13 22:05 ` Andrii Nakryiko
  2020-10-19 10:20   ` Juraj Vijtiuk
  0 siblings, 1 reply; 7+ messages in thread
From: Andrii Nakryiko @ 2020-10-13 22:05 UTC (permalink / raw)
  To: Juraj Vijtiuk; +Cc: bpf, Luka Perkov, David Marcinkovic

On Fri, Oct 9, 2020 at 12:58 PM Juraj Vijtiuk <juraj.vijtiuk@sartura.hr> wrote:
>
> It would be great to hear if anyone has any thoughts on running a set
> of BPF programs JITed while other programs are run by the interpreter.
>
> Something like that would be useful on 32-bit architectures, as the
> JIT compiler there doesn't support some instructions, primarily
> instructions that work with 64-bit data. As far as I can tell, it is
> unlikely that support will be coming soon as it is a general issue for
> all 32-bit architectures. Atomic operations like BPF_XADD look
> especially problematic regarding support on 32 bit platforms. From
> what I managed to see such a conclusion appeared in a few patches
> where support for 32-bit JITs was added, for example [0].
> That results in some programs being runnable with BPF JIT enabled, and
> some failing during load time, but running successfully without JIT on
> 32-bit platforms.
>
> The only way to run some programs with JIT and some without, that
> seems possible right now, is to manually change
> /proc/sys/net/core/bpf_jit_enable every time a program is loaded.
> Although I've managed to do that and it seems to be working, it seems
> pretty hacky and looks like it could cause race conditions if multiple
> programs were loaded, especially by independent loaders.

I agree, the global file is not flexible enough and can cause problems
in production environment.

I don't see any reason why we shouldn't allow to decide interpreted vs
jitted mode per program during BPF_PROG_LOAD.

See kernel/bpf/core.c, bpf_prog's jit_requested field determines
whether a program is going to be jitted or not. It should be trivial
to allow overriding that during BPF_PROG_LOAD command.

We can probably also generalize this to allow to "force-jit" or
"force-interpret" by users, which would fail if kernel didn't support
requested mode.

>
> At first glance it seems that if something like this was to be added
> to a loader, it would have to either somehow be aware of other BPF
> programs being loaded or possibly implement some sort of locking
> mechanism which also seems hacky. From what I understand, doing it in
> the kernel looks even less promising as bpf_jit_enable is a system
> wide setting, and I imagine that changing it to work on a per program
> basis would pretty much require a rework of the current design, so
> that looks even less promising.
>
> It looks like the best option right now is to just run everything in
> interpreted mode, but I want to make sure that I am not missing
> something. If someone has tried doing something similar, it would be
> great to know about that.
>
> Thanks,
> Juraj Vijtiuk
>
> [0] https://lore.kernel.org/netdev/20200305050207.4159-3-luke.r.nels@gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Running JITed and interpreted programs simultaneously
  2020-10-13 22:05 ` Andrii Nakryiko
@ 2020-10-19 10:20   ` Juraj Vijtiuk
  2020-10-19 12:58     ` Daniel Borkmann
  0 siblings, 1 reply; 7+ messages in thread
From: Juraj Vijtiuk @ 2020-10-19 10:20 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, Luka Perkov, David Marcinkovic

On Wed, Oct 14, 2020 at 12:05 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Oct 9, 2020 at 12:58 PM Juraj Vijtiuk <juraj.vijtiuk@sartura.hr> wrote:
> >
> > It would be great to hear if anyone has any thoughts on running a set
> > of BPF programs JITed while other programs are run by the interpreter.
> >
> > Something like that would be useful on 32-bit architectures, as the
> > JIT compiler there doesn't support some instructions, primarily
> > instructions that work with 64-bit data. As far as I can tell, it is
> > unlikely that support will be coming soon as it is a general issue for
> > all 32-bit architectures. Atomic operations like BPF_XADD look
> > especially problematic regarding support on 32 bit platforms. From
> > what I managed to see such a conclusion appeared in a few patches
> > where support for 32-bit JITs was added, for example [0].
> > That results in some programs being runnable with BPF JIT enabled, and
> > some failing during load time, but running successfully without JIT on
> > 32-bit platforms.
> >
> > The only way to run some programs with JIT and some without, that
> > seems possible right now, is to manually change
> > /proc/sys/net/core/bpf_jit_enable every time a program is loaded.
> > Although I've managed to do that and it seems to be working, it seems
> > pretty hacky and looks like it could cause race conditions if multiple
> > programs were loaded, especially by independent loaders.
>
> I agree, the global file is not flexible enough and can cause problems
> in production environment.
>
> I don't see any reason why we shouldn't allow to decide interpreted vs
> jitted mode per program during BPF_PROG_LOAD.
>
> See kernel/bpf/core.c, bpf_prog's jit_requested field determines
> whether a program is going to be jitted or not. It should be trivial
> to allow overriding that during BPF_PROG_LOAD command.
>
> We can probably also generalize this to allow to "force-jit" or
> "force-interpret" by users, which would fail if kernel didn't support
> requested mode.
>

Thanks for the suggestion, that makes sense. I've started working on a
patch today.
I'll post again when I get something working and test it.

> >
> > At first glance it seems that if something like this was to be added
> > to a loader, it would have to either somehow be aware of other BPF
> > programs being loaded or possibly implement some sort of locking
> > mechanism which also seems hacky. From what I understand, doing it in
> > the kernel looks even less promising as bpf_jit_enable is a system
> > wide setting, and I imagine that changing it to work on a per program
> > basis would pretty much require a rework of the current design, so
> > that looks even less promising.
> >
> > It looks like the best option right now is to just run everything in
> > interpreted mode, but I want to make sure that I am not missing
> > something. If someone has tried doing something similar, it would be
> > great to know about that.
> >
> > Thanks,
> > Juraj Vijtiuk
> >
> > [0] https://lore.kernel.org/netdev/20200305050207.4159-3-luke.r.nels@gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Running JITed and interpreted programs simultaneously
  2020-10-19 10:20   ` Juraj Vijtiuk
@ 2020-10-19 12:58     ` Daniel Borkmann
  2020-10-19 18:26       ` Andrii Nakryiko
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2020-10-19 12:58 UTC (permalink / raw)
  To: Juraj Vijtiuk, Andrii Nakryiko
  Cc: bpf, Luka Perkov, David Marcinkovic, alexei.starovoitov

On 10/19/20 12:20 PM, Juraj Vijtiuk wrote:
> On Wed, Oct 14, 2020 at 12:05 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>> On Fri, Oct 9, 2020 at 12:58 PM Juraj Vijtiuk <juraj.vijtiuk@sartura.hr> wrote:
>>>
>>> It would be great to hear if anyone has any thoughts on running a set
>>> of BPF programs JITed while other programs are run by the interpreter.
>>>
>>> Something like that would be useful on 32-bit architectures, as the
>>> JIT compiler there doesn't support some instructions, primarily
>>> instructions that work with 64-bit data. As far as I can tell, it is
>>> unlikely that support will be coming soon as it is a general issue for
>>> all 32-bit architectures. Atomic operations like BPF_XADD look
>>> especially problematic regarding support on 32 bit platforms. From
>>> what I managed to see such a conclusion appeared in a few patches
>>> where support for 32-bit JITs was added, for example [0].
>>> That results in some programs being runnable with BPF JIT enabled, and
>>> some failing during load time, but running successfully without JIT on
>>> 32-bit platforms.
>>>
>>> The only way to run some programs with JIT and some without, that
>>> seems possible right now, is to manually change
>>> /proc/sys/net/core/bpf_jit_enable every time a program is loaded.
>>> Although I've managed to do that and it seems to be working, it seems
>>> pretty hacky and looks like it could cause race conditions if multiple
>>> programs were loaded, especially by independent loaders.
>>
>> I agree, the global file is not flexible enough and can cause problems
>> in production environment.
>>
>> I don't see any reason why we shouldn't allow to decide interpreted vs
>> jitted mode per program during BPF_PROG_LOAD.
>>
>> See kernel/bpf/core.c, bpf_prog's jit_requested field determines
>> whether a program is going to be jitted or not. It should be trivial
>> to allow overriding that during BPF_PROG_LOAD command.
>>
>> We can probably also generalize this to allow to "force-jit" or
>> "force-interpret" by users, which would fail if kernel didn't support
>> requested mode.
> 
> Thanks for the suggestion, that makes sense. I've started working on a
> patch today.
> I'll post again when I get something working and test it.

Hmm, I'm probably missing some context, but why is it not enough to just set the
bpf_jit_enable to 1, and if 32 bit JITs don't support specific instructions like
BPF_XADD then they should transparently fall back to interpreter if you have
the latter compiled in. That is what it /should/ do today and user loading the
prog shouldn't have to care about it. Juraj, you are suggesting that this is not
happening in your case? Or is the issue tail calls?

Wrt force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more concerned that this
decision will then be pushed to the user who should not have to care about these
internals. And how would generic loaders try to react if force-jit fails? They would
then fallback to force-interpret same way as kernel does?

Wrt BPF_XADD, maybe 32 bit platforms should just implement a function call to the
atomic64_add() internally, it will be slow but otoh the rest can then be JITed, so
most likely this still ends up being faster than using interpreter for everything
anyway.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Running JITed and interpreted programs simultaneously
  2020-10-19 12:58     ` Daniel Borkmann
@ 2020-10-19 18:26       ` Andrii Nakryiko
  2020-10-19 22:02         ` Alexei Starovoitov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrii Nakryiko @ 2020-10-19 18:26 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Juraj Vijtiuk, bpf, Luka Perkov, David Marcinkovic, Alexei Starovoitov

On Mon, Oct 19, 2020 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/19/20 12:20 PM, Juraj Vijtiuk wrote:
> > On Wed, Oct 14, 2020 at 12:05 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >> On Fri, Oct 9, 2020 at 12:58 PM Juraj Vijtiuk <juraj.vijtiuk@sartura.hr> wrote:
> >>>
> >>> It would be great to hear if anyone has any thoughts on running a set
> >>> of BPF programs JITed while other programs are run by the interpreter.
> >>>
> >>> Something like that would be useful on 32-bit architectures, as the
> >>> JIT compiler there doesn't support some instructions, primarily
> >>> instructions that work with 64-bit data. As far as I can tell, it is
> >>> unlikely that support will be coming soon as it is a general issue for
> >>> all 32-bit architectures. Atomic operations like BPF_XADD look
> >>> especially problematic regarding support on 32 bit platforms. From
> >>> what I managed to see such a conclusion appeared in a few patches
> >>> where support for 32-bit JITs was added, for example [0].
> >>> That results in some programs being runnable with BPF JIT enabled, and
> >>> some failing during load time, but running successfully without JIT on
> >>> 32-bit platforms.
> >>>
> >>> The only way to run some programs with JIT and some without, that
> >>> seems possible right now, is to manually change
> >>> /proc/sys/net/core/bpf_jit_enable every time a program is loaded.
> >>> Although I've managed to do that and it seems to be working, it seems
> >>> pretty hacky and looks like it could cause race conditions if multiple
> >>> programs were loaded, especially by independent loaders.
> >>
> >> I agree, the global file is not flexible enough and can cause problems
> >> in production environment.
> >>
> >> I don't see any reason why we shouldn't allow to decide interpreted vs
> >> jitted mode per program during BPF_PROG_LOAD.
> >>
> >> See kernel/bpf/core.c, bpf_prog's jit_requested field determines
> >> whether a program is going to be jitted or not. It should be trivial
> >> to allow overriding that during BPF_PROG_LOAD command.
> >>
> >> We can probably also generalize this to allow to "force-jit" or
> >> "force-interpret" by users, which would fail if kernel didn't support
> >> requested mode.
> >
> > Thanks for the suggestion, that makes sense. I've started working on a
> > patch today.
> > I'll post again when I get something working and test it.
>
> Hmm, I'm probably missing some context, but why is it not enough to just set the
> bpf_jit_enable to 1, and if 32 bit JITs don't support specific instructions like
> BPF_XADD then they should transparently fall back to interpreter if you have
> the latter compiled in. That is what it /should/ do today and user loading the
> prog shouldn't have to care about it. Juraj, you are suggesting that this is not
> happening in your case? Or is the issue tail calls?

That wasn't happening last time people reported this on ARM32.
BPF_XADD was causing load failure, no fail back to interpreter mode.

>
> Wrt force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more concerned that this
> decision will then be pushed to the user who should not have to care about these
> internals. And how would generic loaders try to react if force-jit fails? They would
> then fallback to force-interpret same way as kernel does?

The way I imagined this was if the user wants to force the mode and
the kernel doesn't support it (or the program can't be loaded in that
mode), then it's a fail-stop, no fall back. And it's strictly an
opt-in flag, if nothing is specified then it's current behavior with
fallback (which apparently doesn't always work).

>
> Wrt BPF_XADD, maybe 32 bit platforms should just implement a function call to the
> atomic64_add() internally, it will be slow but otoh the rest can then be JITed, so
> most likely this still ends up being faster than using interpreter for everything
> anyway.
>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Running JITed and interpreted programs simultaneously
  2020-10-19 18:26       ` Andrii Nakryiko
@ 2020-10-19 22:02         ` Alexei Starovoitov
  2020-10-20 20:56           ` Juraj Vijtiuk
  0 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2020-10-19 22:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Daniel Borkmann, Juraj Vijtiuk, bpf, Luka Perkov, David Marcinkovic

On Mon, Oct 19, 2020 at 11:26 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> That wasn't happening last time people reported this on ARM32.
> BPF_XADD was causing load failure, no fail back to interpreter mode.
>
> >
> > Wrt force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more concerned that this
> > decision will then be pushed to the user who should not have to care about these
> > internals. And how would generic loaders try to react if force-jit fails? They would
> > then fallback to force-interpret same way as kernel does?
>
> The way I imagined this was if the user wants to force the mode and
> the kernel doesn't support it (or the program can't be loaded in that
> mode), then it's a fail-stop, no fall back. And it's strictly an
> opt-in flag, if nothing is specified then it's current behavior with
> fallback (which apparently doesn't always work).

That doesn't sound right.
Fallback to interpreter should always work unless features like
trampoline are used.
But that's not the case for arm32. Missing xadd support shouldn't cause
load failure.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Running JITed and interpreted programs simultaneously
  2020-10-19 22:02         ` Alexei Starovoitov
@ 2020-10-20 20:56           ` Juraj Vijtiuk
  0 siblings, 0 replies; 7+ messages in thread
From: Juraj Vijtiuk @ 2020-10-20 20:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Daniel Borkmann, bpf, Luka Perkov, David Marcinkovic

On Tue, Oct 20, 2020 at 12:02 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Oct 19, 2020 at 11:26 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > That wasn't happening last time people reported this on ARM32.
> > BPF_XADD was causing load failure, no fail back to interpreter mode.
> >
> > >
> > > Wrt force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more concerned that this
> > > decision will then be pushed to the user who should not have to care about these
> > > internals. And how would generic loaders try to react if force-jit fails? They would
> > > then fallback to force-interpret same way as kernel does?
> >
> > The way I imagined this was if the user wants to force the mode and
> > the kernel doesn't support it (or the program can't be loaded in that
> > mode), then it's a fail-stop, no fall back. And it's strictly an
> > opt-in flag, if nothing is specified then it's current behavior with
> > fallback (which apparently doesn't always work).
>
> That doesn't sound right.
> Fallback to interpreter should always work unless features like
> trampoline are used.
> But that's not the case for arm32. Missing xadd support shouldn't cause
> load failure.

After some retesting, it turns out that everything is working as it is
supposed to. I'm sorry for the confusion this caused.

My colleagues and I originally ran into the XADD issue on a device
that had CONFIG_BPF_JIT_ALWAYS_ON [0].
That resulted in libbpf reporting the following error:
libbpf: load bpf program failed:
ERROR: strerror_r(524)=22

Other than that the log was mostly empty, except for the number of
processed instructions and other similar info.
After the suggestion to try running the program without JIT, we
recompiled the image without JIT_ALWAYS_ON, but wrongly assumed that
/proc/sys/net/core/bpf_jit_enable has to be set to 0 for the program
to work, so we have never tested with bpf_jit_enable set to 1.

We have now tested on a device with JIT_ALWAYS_ON turned off, and the
program works with bpf_jit_enable set to both 1 or 0, while running on
a device with JIT_ALWAYS_ON still causes the same error that we
originally encountered.

Thank you for the help everyone.

[0] https://lore.kernel.org/bpf/CAO__=G6kqajLdP_cWJiAUjXMRdJe2xBy2FJGiM1v4h6YquD3kg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-10-20 20:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-09 18:40 Running JITed and interpreted programs simultaneously Juraj Vijtiuk
2020-10-13 22:05 ` Andrii Nakryiko
2020-10-19 10:20   ` Juraj Vijtiuk
2020-10-19 12:58     ` Daniel Borkmann
2020-10-19 18:26       ` Andrii Nakryiko
2020-10-19 22:02         ` Alexei Starovoitov
2020-10-20 20:56           ` Juraj Vijtiuk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).