* Running JITed and interpreted programs simultaneously
From: Juraj Vijtiuk @ 2020-10-09 18:40 UTC
To: bpf; +Cc: Luka Perkov, David Marcinkovic

It would be great to hear if anyone has any thoughts on running a set
of BPF programs JITed while other programs are run by the interpreter.

Something like that would be useful on 32-bit architectures, as the
JIT compiler there doesn't support some instructions, primarily
instructions that work with 64-bit data. As far as I can tell, it is
unlikely that support will be coming soon, as it is a general issue for
all 32-bit architectures. Atomic operations like BPF_XADD look
especially problematic to support on 32-bit platforms. From what I
managed to see, that conclusion appeared in a few patches where support
for 32-bit JITs was added, for example [0].
That results in some programs being runnable with BPF JIT enabled, and
some failing at load time, but running successfully without JIT on
32-bit platforms.

The only way to run some programs with JIT and some without, that
seems possible right now, is to manually change
/proc/sys/net/core/bpf_jit_enable every time a program is loaded.
Although I've managed to do that and it seems to be working, it seems
pretty hacky and looks like it could cause race conditions if multiple
programs were loaded, especially by independent loaders.

At first glance it seems that if something like this were to be added
to a loader, it would have to either somehow be aware of other BPF
programs being loaded or possibly implement some sort of locking
mechanism, which also seems hacky. From what I understand, doing it in
the kernel looks even less promising, as bpf_jit_enable is a
system-wide setting, and I imagine that changing it to work on a
per-program basis would pretty much require a rework of the current
design.

It looks like the best option right now is to just run everything in
interpreted mode, but I want to make sure that I am not missing
something. If someone has tried doing something similar, it would be
great to know about that.

Thanks,
Juraj Vijtiuk

[0] https://lore.kernel.org/netdev/20200305050207.4159-3-luke.r.nels@gmail.com/

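For illustration, the "toggle the sysctl around each load, with some sort of locking" workaround described above could look roughly like the sketch below. It is only a sketch under stated assumptions: the lock file path and the jit_mode() helper are hypothetical names, the lock is advisory and only protects against loaders that cooperate by taking the same lock, and writing the real sysctl needs root.

```python
import fcntl
from contextlib import contextmanager

JIT_SYSCTL = "/proc/sys/net/core/bpf_jit_enable"
LOCK_PATH = "/run/lock/bpf_jit_enable.lock"  # hypothetical path, an assumption


@contextmanager
def jit_mode(value, sysctl_path=JIT_SYSCTL, lock_path=LOCK_PATH):
    """Temporarily set bpf_jit_enable to `value` while holding an exclusive lock."""
    with open(lock_path, "w") as lock:
        # Blocks until every other cooperating loader has released the lock.
        fcntl.flock(lock, fcntl.LOCK_EX)
        with open(sysctl_path) as f:
            previous = f.read().strip()
        with open(sysctl_path, "w") as f:
            f.write(str(value))
        try:
            # The caller loads its BPF program inside the `with` body.
            yield previous
        finally:
            # Restore the old setting before the lock is dropped.
            with open(sysctl_path, "w") as f:
                f.write(previous)
    # Closing the lock file releases the flock.
```

A loader would wrap its load call as `with jit_mode(0): ...`. Any loader that does not take the same lock can still race with this one, which is the weakness the message points out.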
* Re: Running JITed and interpreted programs simultaneously
From: Andrii Nakryiko @ 2020-10-13 22:05 UTC
To: Juraj Vijtiuk; +Cc: bpf, Luka Perkov, David Marcinkovic

On Fri, Oct 9, 2020 at 12:58 PM Juraj Vijtiuk <juraj.vijtiuk@sartura.hr> wrote:
> [...]
> The only way to run some programs with JIT and some without, that
> seems possible right now, is to manually change
> /proc/sys/net/core/bpf_jit_enable every time a program is loaded.
> Although I've managed to do that and it seems to be working, it seems
> pretty hacky and looks like it could cause race conditions if multiple
> programs were loaded, especially by independent loaders.

I agree, the global file is not flexible enough and can cause problems
in production environments.

I don't see any reason why we shouldn't allow choosing interpreted vs
jitted mode per program during BPF_PROG_LOAD.

See kernel/bpf/core.c: bpf_prog's jit_requested field determines
whether a program is going to be jitted or not. It should be trivial
to allow overriding that during the BPF_PROG_LOAD command.

We can probably also generalize this to let users "force-jit" or
"force-interpret", which would fail if the kernel didn't support the
requested mode.

> [...]

* Re: Running JITed and interpreted programs simultaneously
From: Juraj Vijtiuk @ 2020-10-19 10:20 UTC
To: Andrii Nakryiko; +Cc: bpf, Luka Perkov, David Marcinkovic

On Wed, Oct 14, 2020 at 12:05 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> [...]
> I agree, the global file is not flexible enough and can cause problems
> in production environment.
>
> I don't see any reason why we shouldn't allow to decide interpreted vs
> jitted mode per program during BPF_PROG_LOAD.
>
> See kernel/bpf/core.c, bpf_prog's jit_requested field determines
> whether a program is going to be jitted or not. It should be trivial
> to allow overriding that during BPF_PROG_LOAD command.
>
> We can probably also generalize this to allow to "force-jit" or
> "force-interpret" by users, which would fail if kernel didn't support
> requested mode.

Thanks for the suggestion, that makes sense. I've started working on a
patch today. I'll post again when I get something working and test it.

> [...]

* Re: Running JITed and interpreted programs simultaneously
From: Daniel Borkmann @ 2020-10-19 12:58 UTC
To: Juraj Vijtiuk, Andrii Nakryiko
Cc: bpf, Luka Perkov, David Marcinkovic, alexei.starovoitov

On 10/19/20 12:20 PM, Juraj Vijtiuk wrote:
> On Wed, Oct 14, 2020 at 12:05 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>> [...]
>> I don't see any reason why we shouldn't allow to decide interpreted vs
>> jitted mode per program during BPF_PROG_LOAD.
>>
>> See kernel/bpf/core.c, bpf_prog's jit_requested field determines
>> whether a program is going to be jitted or not. It should be trivial
>> to allow overriding that during BPF_PROG_LOAD command.
>>
>> We can probably also generalize this to allow to "force-jit" or
>> "force-interpret" by users, which would fail if kernel didn't support
>> requested mode.
>
> Thanks for the suggestion, that makes sense. I've started working on a
> patch today.
> I'll post again when I get something working and test it.

Hmm, I'm probably missing some context, but why is it not enough to
just set bpf_jit_enable to 1? If 32-bit JITs don't support specific
instructions like BPF_XADD, then they should transparently fall back to
the interpreter if you have the latter compiled in. That is what it
/should/ do today, and the user loading the prog shouldn't have to care
about it. Juraj, are you suggesting that this is not happening in your
case? Or is the issue tail calls?

Wrt a force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more
concerned that this decision will then be pushed to the user, who
should not have to care about these internals. And how would generic
loaders react if force-jit fails? Would they then fall back to
force-interpret the same way the kernel does?

Wrt BPF_XADD, maybe 32-bit platforms should just implement a function
call to atomic64_add() internally. It will be slow, but otoh the rest
can then be JITed, so most likely this still ends up being faster than
using the interpreter for everything anyway.

Thanks,
Daniel

* Re: Running JITed and interpreted programs simultaneously
From: Andrii Nakryiko @ 2020-10-19 18:26 UTC
To: Daniel Borkmann
Cc: Juraj Vijtiuk, bpf, Luka Perkov, David Marcinkovic, Alexei Starovoitov

On Mon, Oct 19, 2020 at 5:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> [...]
> Hmm, I'm probably missing some context, but why is it not enough to just set the
> bpf_jit_enable to 1, and if 32 bit JITs don't support specific instructions like
> BPF_XADD then they should transparently fall back to interpreter if you have
> the latter compiled in. That is what it /should/ do today and user loading the
> prog shouldn't have to care about it. Juraj, you are suggesting that this is not
> happening in your case? Or is the issue tail calls?

That wasn't happening last time people reported this on ARM32.
BPF_XADD was causing a load failure, with no fallback to interpreter
mode.

> Wrt force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more concerned that this
> decision will then be pushed to the user who should not have to care about these
> internals. And how would generic loaders try to react if force-jit fails? They would
> then fallback to force-interpret same way as kernel does?

The way I imagined this was: if the user wants to force the mode and
the kernel doesn't support it (or the program can't be loaded in that
mode), then it's a fail-stop, with no fallback. And it's strictly an
opt-in flag; if nothing is specified, then it's the current behavior
with fallback (which apparently doesn't always work).

> [...]

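The semantics proposed in this message (an opt-in forced mode that fails hard, and an unchanged default with fallback) can be sketched as a small decision function. Everything here is hypothetical: no such BPF_PROG_LOAD flag exists in the thread beyond the proposal, and the names Mode, LoadError and select_runtime are made up for illustration only.

```python
from enum import Enum


class Mode(Enum):
    DEFAULT = "default"                  # nothing specified by the user
    FORCE_JIT = "force-jit"              # hypothetical opt-in flag
    FORCE_INTERPRET = "force-interpret"  # hypothetical opt-in flag


class LoadError(Exception):
    """Stands in for BPF_PROG_LOAD returning an error."""


def select_runtime(mode, jit_can_compile, interpreter_built_in, jit_enabled=True):
    """Pick 'jit' or 'interpreter' for one program, or raise LoadError."""
    if mode is Mode.FORCE_JIT:
        # Fail-stop: no silent fallback when the user asked for JIT.
        if not jit_can_compile:
            raise LoadError("force-jit requested but the JIT cannot compile this program")
        return "jit"
    if mode is Mode.FORCE_INTERPRET:
        if not interpreter_built_in:
            raise LoadError("force-interpret requested but no interpreter is built in")
        return "interpreter"
    # Default: try the JIT first, then fall back to the interpreter.
    if jit_enabled and jit_can_compile:
        return "jit"
    if interpreter_built_in:
        return "interpreter"
    raise LoadError("program can be neither JITed nor interpreted")
```

Under this model a generic loader that never sets the flag sees no change in behavior, which is the point of making the flag strictly opt-in.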
* Re: Running JITed and interpreted programs simultaneously
From: Alexei Starovoitov @ 2020-10-19 22:02 UTC
To: Andrii Nakryiko
Cc: Daniel Borkmann, Juraj Vijtiuk, bpf, Luka Perkov, David Marcinkovic

On Mon, Oct 19, 2020 at 11:26 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> That wasn't happening last time people reported this on ARM32.
> BPF_XADD was causing load failure, no fail back to interpreter mode.
>
> > Wrt force-interpret vs force-jit BPF_PROG_LOAD flag, I'm more concerned that this
> > decision will then be pushed to the user who should not have to care about these
> > internals. And how would generic loaders try to react if force-jit fails? They would
> > then fallback to force-interpret same way as kernel does?
>
> The way I imagined this was if the user wants to force the mode and
> the kernel doesn't support it (or the program can't be loaded in that
> mode), then it's a fail-stop, no fall back. And it's strictly an
> opt-in flag, if nothing is specified then it's current behavior with
> fallback (which apparently doesn't always work).

That doesn't sound right.
Fallback to interpreter should always work unless features like
trampoline are used.
But that's not the case for arm32. Missing xadd support shouldn't cause
load failure.

* Re: Running JITed and interpreted programs simultaneously
From: Juraj Vijtiuk @ 2020-10-20 20:56 UTC
To: Alexei Starovoitov
Cc: Andrii Nakryiko, Daniel Borkmann, bpf, Luka Perkov, David Marcinkovic

On Tue, Oct 20, 2020 at 12:02 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> [...]
> That doesn't sound right.
> Fallback to interpreter should always work unless features like
> trampoline are used.
> But that's not the case for arm32. Missing xadd support shouldn't cause
> load failure.

After some retesting, it turns out that everything is working as it is
supposed to. I'm sorry for the confusion this caused.

My colleagues and I originally ran into the XADD issue on a device that
had CONFIG_BPF_JIT_ALWAYS_ON enabled [0]. That resulted in libbpf
reporting the following error:

    libbpf: load bpf program failed: ERROR: strerror_r(524)=22

Other than that, the log was mostly empty, except for the number of
processed instructions and other similar info.

After the suggestion to try running the program without JIT, we
recompiled the image without JIT_ALWAYS_ON, but wrongly assumed that
/proc/sys/net/core/bpf_jit_enable had to be set to 0 for the program to
work, so we never tested with bpf_jit_enable set to 1.

We have now tested on a device with JIT_ALWAYS_ON turned off, and the
program works with bpf_jit_enable set to either 1 or 0, while running
on a device with JIT_ALWAYS_ON still causes the same error that we
originally encountered.

Thank you for the help everyone.

[0] https://lore.kernel.org/bpf/CAO__=G6kqajLdP_cWJiAUjXMRdJe2xBy2FJGiM1v4h6YquD3kg@mail.gmail.com/

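The observations in this message can be summarized as a small model, together with a decoder for the error line. This is a sketch of the reported behavior, not kernel code, and load_outcome() and describe_errno() are made-up helper names. It assumes that 524 is the kernel-internal ENOTSUPP value, which userspace errno tables do not define, so strerror_r() itself appears to fail with 22 (EINVAL) and libbpf prints that instead of a message.

```python
import errno
import os

ENOTSUPP = 524  # kernel-internal errno; not part of the userspace errno table


def describe_errno(code):
    """Give a readable name for an errno value, including kernel-only ENOTSUPP."""
    if code == ENOTSUPP:
        return "ENOTSUPP (kernel-internal): operation is not supported"
    name = errno.errorcode.get(code)
    if name is None:
        return f"unknown errno {code}"
    return f"{name}: {os.strerror(code)}"


def load_outcome(jit_always_on, bpf_jit_enable, jit_handles_prog):
    """Model how a program ends up running, per the tests reported above."""
    jit_active = jit_always_on or bpf_jit_enable != 0
    if jit_active and jit_handles_prog:
        return "jit"
    if jit_always_on:
        # No interpreter to fall back to, so the load is rejected.
        return f"load fails with errno {ENOTSUPP}"
    return "interpreter"


if __name__ == "__main__":
    # A program using an instruction the 32-bit JIT cannot handle (e.g. BPF_XADD).
    for always_on, enable in [(True, 1), (False, 1), (False, 0)]:
        print(always_on, enable, load_outcome(always_on, enable, jit_handles_prog=False))
    print(describe_errno(524))
    print(describe_errno(22))
```

Read this way, only the JIT_ALWAYS_ON configuration rejects the program; with the interpreter compiled in, the value of bpf_jit_enable does not matter for a program the JIT cannot handle.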