* [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare
@ 2021-05-05 16:23 Florent Revest
  2021-05-05 18:55 ` Andrii Nakryiko
  0 siblings, 1 reply; 10+ messages in thread
From: Florent Revest @ 2021-05-05 16:23 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, kpsingh, jackmanb, sdf, linux-kernel, Florent Revest, syzbot

The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one
per-cpu buffer that they use to store temporary data (arguments to
bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it
by the end of their scope with bpf_bprintf_cleanup.

If one of these helpers gets called within the scope of one of these
helpers, for example: a first bpf program gets called, uses
bpf_trace_printk which calls raw_spin_lock_irqsave which is traced by
another bpf program that calls bpf_trace_printk again, then the second
"get" fails. Essentially, these helpers are not re-entrant, and it's not
that bad because they would simply return -EBUSY and recover gracefully.

However, when this happens, the code hits a WARN_ON_ONCE. The guidelines
in include/asm-generic/bug.h say "Do not use these macros [...] on
transient conditions like ENOMEM or EAGAIN." This condition qualifies as
transient, for example, the next raw_spin_lock_irqsave probe is likely
to succeed, so it does not deserve a WARN_ON_ONCE.

The guidelines also say "Do not use these macros when checking for
invalid external inputs (e.g. invalid system call arguments" and, in a
way, this can be seen as an invalid input because syzkaller triggered
it.
Signed-off-by: Florent Revest <revest@chromium.org>
Reported-by: syzbot <syzbot@syzkaller.appspotmail.com>
Fixes: d9c9e4db186a ("bpf: Factorize bpf_trace_printk and bpf_seq_printf")
---
 kernel/bpf/helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 544773970dbc..007fa26eb3f5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -709,7 +709,7 @@ static int try_get_fmt_tmp_buf(char **tmp_buf)

 	preempt_disable();
 	used = this_cpu_inc_return(bpf_printf_buf_used);
-	if (WARN_ON_ONCE(used > 1)) {
+	if (used > 1) {
 		this_cpu_dec(bpf_printf_buf_used);
 		preempt_enable();
 		return -EBUSY;
--
2.31.1.527.g47e6f16901-goog
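[Editor's note: the following is a minimal user-space sketch of the get/put pattern the patch modifies, not the kernel code itself. It models the per-cpu `bpf_printf_buf_used` counter on a single CPU; after the patch, a nested "get" simply returns -EBUSY instead of also firing WARN_ON_ONCE.]

```c
#include <assert.h>
#include <errno.h>

#define BUF_SZ 512

/* Single-CPU stand-in for the per-cpu bpf_printf_buf. */
static char printf_buf[BUF_SZ];
static int printf_buf_used;	/* models this_cpu_inc_return()/this_cpu_dec() */

/* Models try_get_fmt_tmp_buf(): a re-entrant call recovers gracefully
 * with -EBUSY; no warning is emitted for this transient condition. */
static int try_get_fmt_tmp_buf(char **tmp_buf)
{
	int used = ++printf_buf_used;

	if (used > 1) {
		printf_buf_used--;
		return -EBUSY;
	}
	*tmp_buf = printf_buf;
	return 0;
}

/* Models bpf_bprintf_cleanup(): the "put" at end of helper scope. */
static void bpf_bprintf_cleanup(void)
{
	printf_buf_used--;
}
```

The transient nature of the failure is visible here: once the outer scope "puts" the buffer, the next "get" succeeds again.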
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare
  2021-05-05 16:23 [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare Florent Revest
@ 2021-05-05 18:55 ` Andrii Nakryiko
  2021-05-05 20:00   ` Daniel Borkmann
  0 siblings, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2021-05-05 18:55 UTC (permalink / raw)
To: Florent Revest
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot

On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote:
>
> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one
> per-cpu buffer that they use to store temporary data (arguments to
> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it
> by the end of their scope with bpf_bprintf_cleanup.
>
> If one of these helpers gets called within the scope of one of these
> helpers, for example: a first bpf program gets called, uses

Can we afford having few struct bpf_printf_bufs? They are just 512
bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the
only situation where this can occur, right? If someone is doing
bpf_snprintf() and interrupt occurs and we run another BPF program, it
will be impossible to do bpf_snprintf() or bpf_trace_printk() from the
second BPF program, etc. We can't eliminate the probability, but
having a small stack of buffers would make the probability so
miniscule as to not worry about it at all.

Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so
the changes are minimal. Nestedness property is preserved for
non-sleepable BPF programs, right? If we want this to work for
sleepable we'd need to either: 1) disable migration or 2) instead of
assuming a stack of buffers, do a loop to find unused one. Should be
acceptable performance-wise, as it's not the fastest code anyway
(printf'ing in general).

In any case, re-using the same buffer for sort-of-optional-to-work
bpf_trace_printk() and probably-important-to-work bpf_snprintf() is
suboptimal, so seems worth fixing this.

Thoughts?

> bpf_trace_printk which calls raw_spin_lock_irqsave which is traced by
> another bpf program that calls bpf_trace_printk again, then the second
> "get" fails. Essentially, these helpers are not re-entrant, and it's not
> that bad because they would simply return -EBUSY and recover gracefully.
>
> However, when this happens, the code hits a WARN_ON_ONCE. The guidelines
> in include/asm-generic/bug.h say "Do not use these macros [...] on
> transient conditions like ENOMEM or EAGAIN."
>
> This condition qualifies as transient, for example, the next
> raw_spin_lock_irqsave probe is likely to succeed, so it does not deserve
> a WARN_ON_ONCE.
>
> The guidelines also say "Do not use these macros when checking for
> invalid external inputs (e.g. invalid system call arguments" and, in a
> way, this can be seen as an invalid input because syzkaller triggered
> it.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> Reported-by: syzbot <syzbot@syzkaller.appspotmail.com>
> Fixes: d9c9e4db186a ("bpf: Factorize bpf_trace_printk and bpf_seq_printf")
> ---
>  kernel/bpf/helpers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 544773970dbc..007fa26eb3f5 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -709,7 +709,7 @@ static int try_get_fmt_tmp_buf(char **tmp_buf)
>
>  	preempt_disable();
>  	used = this_cpu_inc_return(bpf_printf_buf_used);
> -	if (WARN_ON_ONCE(used > 1)) {
> +	if (used > 1) {
>  		this_cpu_dec(bpf_printf_buf_used);
>  		preempt_enable();
>  		return -EBUSY;
> --
> 2.31.1.527.g47e6f16901-goog
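[Editor's note: a hypothetical sketch of the "small stack of buffers" idea above, as a single-CPU user-space stand-in. The nesting level indexes into a few buffers, so a bounded depth of nested calls each gets its own buffer; `MAX_NEST`, `get_buf` and `put_buf` are invented names, not kernel APIs.]

```c
#include <assert.h>
#include <errno.h>

#define BUF_SZ   512
#define MAX_NEST 3	/* "3-5 of them"; 3 here for the sketch */

/* Single-CPU stand-in for a per-cpu array of printf buffers. */
static char bufs[MAX_NEST][BUF_SZ];
static int nest_level;

/* Each nested caller gets the buffer at its own nesting depth;
 * only nesting deeper than MAX_NEST still fails with -EBUSY. */
static int get_buf(char **buf)
{
	int level = nest_level++;

	if (level >= MAX_NEST) {
		nest_level--;
		return -EBUSY;
	}
	*buf = bufs[level];
	return 0;
}

static void put_buf(void)
{
	nest_level--;
}
```

With this scheme, the raw_spin_lock_irqsave example from the patch description would succeed: the outer bpf_trace_printk holds bufs[0] and the nested one gets bufs[1].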
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-05 18:55 ` Andrii Nakryiko @ 2021-05-05 20:00 ` Daniel Borkmann 2021-05-05 20:48 ` Andrii Nakryiko 0 siblings, 1 reply; 10+ messages in thread From: Daniel Borkmann @ 2021-05-05 20:00 UTC (permalink / raw) To: Andrii Nakryiko, Florent Revest Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On 5/5/21 8:55 PM, Andrii Nakryiko wrote: > On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: >> >> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one >> per-cpu buffer that they use to store temporary data (arguments to >> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it >> by the end of their scope with bpf_bprintf_cleanup. >> >> If one of these helpers gets called within the scope of one of these >> helpers, for example: a first bpf program gets called, uses > > Can we afford having few struct bpf_printf_bufs? They are just 512 > bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the > only situation where this can occur, right? If someone is doing > bpf_snprintf() and interrupt occurs and we run another BPF program, it > will be impossible to do bpf_snprintf() or bpf_trace_printk() from the > second BPF program, etc. We can't eliminate the probability, but > having a small stack of buffers would make the probability so > miniscule as to not worry about it at all. > > Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so > the changes are minimal. Nestedness property is preserved for > non-sleepable BPF programs, right? If we want this to work for > sleepable we'd need to either: 1) disable migration or 2) instead of > assuming a stack of buffers, do a loop to find unused one. Should be > acceptable performance-wise, as it's not the fastest code anyway > (printf'ing in general). 
> > In any case, re-using the same buffer for sort-of-optional-to-work > bpf_trace_printk() and probably-important-to-work bpf_snprintf() is > suboptimal, so seems worth fixing this. > > Thoughts? Yes, agree, it would otherwise be really hard to debug. I had the same thought on why not allowing nesting here given users very likely expect these helpers to just work for all the contexts. Thanks, Daniel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-05 20:00 ` Daniel Borkmann @ 2021-05-05 20:48 ` Andrii Nakryiko 2021-05-05 20:52 ` Andrii Nakryiko 0 siblings, 1 reply; 10+ messages in thread From: Andrii Nakryiko @ 2021-05-05 20:48 UTC (permalink / raw) To: Daniel Borkmann Cc: Florent Revest, bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 5/5/21 8:55 PM, Andrii Nakryiko wrote: > > On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: > >> > >> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one > >> per-cpu buffer that they use to store temporary data (arguments to > >> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it > >> by the end of their scope with bpf_bprintf_cleanup. > >> > >> If one of these helpers gets called within the scope of one of these > >> helpers, for example: a first bpf program gets called, uses > > > > Can we afford having few struct bpf_printf_bufs? They are just 512 > > bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the > > only situation where this can occur, right? If someone is doing > > bpf_snprintf() and interrupt occurs and we run another BPF program, it > > will be impossible to do bpf_snprintf() or bpf_trace_printk() from the > > second BPF program, etc. We can't eliminate the probability, but > > having a small stack of buffers would make the probability so > > miniscule as to not worry about it at all. > > > > Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so > > the changes are minimal. Nestedness property is preserved for > > non-sleepable BPF programs, right? 
If we want this to work for > > sleepable we'd need to either: 1) disable migration or 2) instead of oh wait, we already disable migration for sleepable BPF progs, so it should be good to do nestedness level only > > assuming a stack of buffers, do a loop to find unused one. Should be > > acceptable performance-wise, as it's not the fastest code anyway > > (printf'ing in general). > > > > In any case, re-using the same buffer for sort-of-optional-to-work > > bpf_trace_printk() and probably-important-to-work bpf_snprintf() is > > suboptimal, so seems worth fixing this. > > > > Thoughts? > > Yes, agree, it would otherwise be really hard to debug. I had the same > thought on why not allowing nesting here given users very likely expect > these helpers to just work for all the contexts. > > Thanks, > Daniel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-05 20:48 ` Andrii Nakryiko @ 2021-05-05 20:52 ` Andrii Nakryiko 2021-05-05 22:29 ` Florent Revest 0 siblings, 1 reply; 10+ messages in thread From: Andrii Nakryiko @ 2021-05-05 20:52 UTC (permalink / raw) To: Daniel Borkmann Cc: Florent Revest, bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On Wed, May 5, 2021 at 1:48 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > On 5/5/21 8:55 PM, Andrii Nakryiko wrote: > > > On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: > > >> > > >> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one > > >> per-cpu buffer that they use to store temporary data (arguments to > > >> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it > > >> by the end of their scope with bpf_bprintf_cleanup. > > >> > > >> If one of these helpers gets called within the scope of one of these > > >> helpers, for example: a first bpf program gets called, uses > > > > > > Can we afford having few struct bpf_printf_bufs? They are just 512 > > > bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the > > > only situation where this can occur, right? If someone is doing > > > bpf_snprintf() and interrupt occurs and we run another BPF program, it > > > will be impossible to do bpf_snprintf() or bpf_trace_printk() from the > > > second BPF program, etc. We can't eliminate the probability, but > > > having a small stack of buffers would make the probability so > > > miniscule as to not worry about it at all. > > > > > > Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so > > > the changes are minimal. Nestedness property is preserved for > > > non-sleepable BPF programs, right? 
If we want this to work for > > > sleepable we'd need to either: 1) disable migration or 2) instead of > > oh wait, we already disable migration for sleepable BPF progs, so it > should be good to do nestedness level only actually, migrate_disable() might not be enough. Unless it is impossible for some reason I miss, worst case it could be that two sleepable programs (A and B) can be intermixed on the same CPU: A starts&sleeps - B starts&sleeps - A continues&returns - B continues and nestedness doesn't work anymore. So something like "reserving a slot" would work better. > > > > assuming a stack of buffers, do a loop to find unused one. Should be > > > acceptable performance-wise, as it's not the fastest code anyway > > > (printf'ing in general). > > > > > > In any case, re-using the same buffer for sort-of-optional-to-work > > > bpf_trace_printk() and probably-important-to-work bpf_snprintf() is > > > suboptimal, so seems worth fixing this. > > > > > > Thoughts? > > > > Yes, agree, it would otherwise be really hard to debug. I had the same > > thought on why not allowing nesting here given users very likely expect > > these helpers to just work for all the contexts. > > > > Thanks, > > Daniel ^ permalink raw reply [flat|nested] 10+ messages in thread
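[Editor's note: a hypothetical user-space sketch of the "reserving a slot" variant described above, using C11 atomics. Each call scans for a free buffer and claims it with an atomic exchange, so it keeps working even when two sleepable programs are intermixed on one CPU (A starts & sleeps, B starts & sleeps, A returns, B returns). Names are invented for illustration.]

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define BUF_SZ  512
#define NR_BUFS 3

static char bufs[NR_BUFS][BUF_SZ];
static atomic_int buf_busy[NR_BUFS];	/* 0 = free, 1 = in use */

/* Claim any free buffer; return its slot index, or -EBUSY if all are
 * taken. Because the claim is an atomic test-and-set rather than a
 * nesting counter, out-of-order release by interleaved callers is fine. */
static int get_buf(char **buf)
{
	for (int i = 0; i < NR_BUFS; i++) {
		if (!atomic_exchange(&buf_busy[i], 1)) {
			*buf = bufs[i];
			return i;
		}
	}
	return -EBUSY;
}

static void put_buf(int slot)
{
	atomic_store(&buf_busy[slot], 0);
}
```

The caller has to remember its slot index for the "put", which is the extra bookkeeping this variant trades for robustness against intermixed sleepable programs.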
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-05 20:52 ` Andrii Nakryiko @ 2021-05-05 22:29 ` Florent Revest 2021-05-06 18:52 ` Andrii Nakryiko 0 siblings, 1 reply; 10+ messages in thread From: Florent Revest @ 2021-05-05 22:29 UTC (permalink / raw) To: Andrii Nakryiko Cc: Daniel Borkmann, bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On Wed, May 5, 2021 at 10:52 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, May 5, 2021 at 1:48 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > > > On 5/5/21 8:55 PM, Andrii Nakryiko wrote: > > > > On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: > > > >> > > > >> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one > > > >> per-cpu buffer that they use to store temporary data (arguments to > > > >> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it > > > >> by the end of their scope with bpf_bprintf_cleanup. > > > >> > > > >> If one of these helpers gets called within the scope of one of these > > > >> helpers, for example: a first bpf program gets called, uses > > > > > > > > Can we afford having few struct bpf_printf_bufs? They are just 512 > > > > bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the > > > > only situation where this can occur, right? If someone is doing > > > > bpf_snprintf() and interrupt occurs and we run another BPF program, it > > > > will be impossible to do bpf_snprintf() or bpf_trace_printk() from the > > > > second BPF program, etc. We can't eliminate the probability, but > > > > having a small stack of buffers would make the probability so > > > > miniscule as to not worry about it at all. > > > > > > > > Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so > > > > the changes are minimal. 
Nestedness property is preserved for > > > > non-sleepable BPF programs, right? If we want this to work for > > > > sleepable we'd need to either: 1) disable migration or 2) instead of > > > > oh wait, we already disable migration for sleepable BPF progs, so it > > should be good to do nestedness level only > > actually, migrate_disable() might not be enough. Unless it is > impossible for some reason I miss, worst case it could be that two > sleepable programs (A and B) can be intermixed on the same CPU: A > starts&sleeps - B starts&sleeps - A continues&returns - B continues > and nestedness doesn't work anymore. So something like "reserving a > slot" would work better. Iiuc try_get_fmt_tmp_buf does preempt_enable to avoid that situation ? > > > > > > assuming a stack of buffers, do a loop to find unused one. Should be > > > > acceptable performance-wise, as it's not the fastest code anyway > > > > (printf'ing in general). > > > > > > > > In any case, re-using the same buffer for sort-of-optional-to-work > > > > bpf_trace_printk() and probably-important-to-work bpf_snprintf() is > > > > suboptimal, so seems worth fixing this. > > > > > > > > Thoughts? > > > > > > Yes, agree, it would otherwise be really hard to debug. I had the same > > > thought on why not allowing nesting here given users very likely expect > > > these helpers to just work for all the contexts. > > > > > > Thanks, > > > Daniel What would you think of just letting the helpers own these 512 bytes buffers as local variables on their stacks ? Then bpf_prepare_bprintf would only need to write there, there would be no acquire semantic (like try_get_fmt_tmp_buf) and the stack frame would just be freed on the helper return so there would be no bpf_printf_cleanup either. We would also not pre-reserve static memory for all CPUs and it becomes trivial to handle re-entrant helper calls. I inherited this per-cpu buffer from the pre-existing bpf_seq_printf code but I've not been convinced of its necessity. 
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-05 22:29 ` Florent Revest @ 2021-05-06 18:52 ` Andrii Nakryiko 2021-05-06 20:17 ` Florent Revest 0 siblings, 1 reply; 10+ messages in thread From: Andrii Nakryiko @ 2021-05-06 18:52 UTC (permalink / raw) To: Florent Revest Cc: Daniel Borkmann, bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On Wed, May 5, 2021 at 3:29 PM Florent Revest <revest@chromium.org> wrote: > > On Wed, May 5, 2021 at 10:52 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, May 5, 2021 at 1:48 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > > > > > > On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > > > > > On 5/5/21 8:55 PM, Andrii Nakryiko wrote: > > > > > On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: > > > > >> > > > > >> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one > > > > >> per-cpu buffer that they use to store temporary data (arguments to > > > > >> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it > > > > >> by the end of their scope with bpf_bprintf_cleanup. > > > > >> > > > > >> If one of these helpers gets called within the scope of one of these > > > > >> helpers, for example: a first bpf program gets called, uses > > > > > > > > > > Can we afford having few struct bpf_printf_bufs? They are just 512 > > > > > bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the > > > > > only situation where this can occur, right? If someone is doing > > > > > bpf_snprintf() and interrupt occurs and we run another BPF program, it > > > > > will be impossible to do bpf_snprintf() or bpf_trace_printk() from the > > > > > second BPF program, etc. We can't eliminate the probability, but > > > > > having a small stack of buffers would make the probability so > > > > > miniscule as to not worry about it at all. 
> > > > > > > > > > Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so > > > > > the changes are minimal. Nestedness property is preserved for > > > > > non-sleepable BPF programs, right? If we want this to work for > > > > > sleepable we'd need to either: 1) disable migration or 2) instead of > > > > > > oh wait, we already disable migration for sleepable BPF progs, so it > > > should be good to do nestedness level only > > > > actually, migrate_disable() might not be enough. Unless it is > > impossible for some reason I miss, worst case it could be that two > > sleepable programs (A and B) can be intermixed on the same CPU: A > > starts&sleeps - B starts&sleeps - A continues&returns - B continues > > and nestedness doesn't work anymore. So something like "reserving a > > slot" would work better. > > Iiuc try_get_fmt_tmp_buf does preempt_enable to avoid that situation ? > > > > > > > > > assuming a stack of buffers, do a loop to find unused one. Should be > > > > > acceptable performance-wise, as it's not the fastest code anyway > > > > > (printf'ing in general). > > > > > > > > > > In any case, re-using the same buffer for sort-of-optional-to-work > > > > > bpf_trace_printk() and probably-important-to-work bpf_snprintf() is > > > > > suboptimal, so seems worth fixing this. > > > > > > > > > > Thoughts? > > > > > > > > Yes, agree, it would otherwise be really hard to debug. I had the same > > > > thought on why not allowing nesting here given users very likely expect > > > > these helpers to just work for all the contexts. > > > > > > > > Thanks, > > > > Daniel > > What would you think of just letting the helpers own these 512 bytes > buffers as local variables on their stacks ? Then bpf_prepare_bprintf > would only need to write there, there would be no acquire semantic > (like try_get_fmt_tmp_buf) and the stack frame would just be freed on > the helper return so there would be no bpf_printf_cleanup either. 
We > would also not pre-reserve static memory for all CPUs and it becomes > trivial to handle re-entrant helper calls. > > I inherited this per-cpu buffer from the pre-existing bpf_seq_printf > code but I've not been convinced of its necessity. I got the impression that extra 512 bytes on the kernel stack is quite a lot and that's why we have per-cpu buffers. Especially that bpf_trace_printk() can be called from any context, including NMI. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-06 18:52 ` Andrii Nakryiko @ 2021-05-06 20:17 ` Florent Revest 2021-05-06 21:38 ` Daniel Borkmann 0 siblings, 1 reply; 10+ messages in thread From: Florent Revest @ 2021-05-06 20:17 UTC (permalink / raw) To: Andrii Nakryiko Cc: Daniel Borkmann, bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On Thu, May 6, 2021 at 8:52 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, May 5, 2021 at 3:29 PM Florent Revest <revest@chromium.org> wrote: > > > > On Wed, May 5, 2021 at 10:52 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > > > > > > On Wed, May 5, 2021 at 1:48 PM Andrii Nakryiko > > > <andrii.nakryiko@gmail.com> wrote: > > > > > > > > On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > > > > > > > On 5/5/21 8:55 PM, Andrii Nakryiko wrote: > > > > > > On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: > > > > > >> > > > > > >> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one > > > > > >> per-cpu buffer that they use to store temporary data (arguments to > > > > > >> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it > > > > > >> by the end of their scope with bpf_bprintf_cleanup. > > > > > >> > > > > > >> If one of these helpers gets called within the scope of one of these > > > > > >> helpers, for example: a first bpf program gets called, uses > > > > > > > > > > > > Can we afford having few struct bpf_printf_bufs? They are just 512 > > > > > > bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the > > > > > > only situation where this can occur, right? If someone is doing > > > > > > bpf_snprintf() and interrupt occurs and we run another BPF program, it > > > > > > will be impossible to do bpf_snprintf() or bpf_trace_printk() from the > > > > > > second BPF program, etc. 
We can't eliminate the probability, but > > > > > > having a small stack of buffers would make the probability so > > > > > > miniscule as to not worry about it at all. > > > > > > > > > > > > Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so > > > > > > the changes are minimal. Nestedness property is preserved for > > > > > > non-sleepable BPF programs, right? If we want this to work for > > > > > > sleepable we'd need to either: 1) disable migration or 2) instead of > > > > > > > > oh wait, we already disable migration for sleepable BPF progs, so it > > > > should be good to do nestedness level only > > > > > > actually, migrate_disable() might not be enough. Unless it is > > > impossible for some reason I miss, worst case it could be that two > > > sleepable programs (A and B) can be intermixed on the same CPU: A > > > starts&sleeps - B starts&sleeps - A continues&returns - B continues > > > and nestedness doesn't work anymore. So something like "reserving a > > > slot" would work better. > > > > Iiuc try_get_fmt_tmp_buf does preempt_enable to avoid that situation ? > > > > > > > > > > > > assuming a stack of buffers, do a loop to find unused one. Should be > > > > > > acceptable performance-wise, as it's not the fastest code anyway > > > > > > (printf'ing in general). > > > > > > > > > > > > In any case, re-using the same buffer for sort-of-optional-to-work > > > > > > bpf_trace_printk() and probably-important-to-work bpf_snprintf() is > > > > > > suboptimal, so seems worth fixing this. > > > > > > > > > > > > Thoughts? > > > > > > > > > > Yes, agree, it would otherwise be really hard to debug. I had the same > > > > > thought on why not allowing nesting here given users very likely expect > > > > > these helpers to just work for all the contexts. > > > > > > > > > > Thanks, > > > > > Daniel > > > > What would you think of just letting the helpers own these 512 bytes > > buffers as local variables on their stacks ? 
Then bpf_prepare_bprintf > > would only need to write there, there would be no acquire semantic > > (like try_get_fmt_tmp_buf) and the stack frame would just be freed on > > the helper return so there would be no bpf_printf_cleanup either. We > > would also not pre-reserve static memory for all CPUs and it becomes > > trivial to handle re-entrant helper calls. > > > > I inherited this per-cpu buffer from the pre-existing bpf_seq_printf > > code but I've not been convinced of its necessity. > > I got the impression that extra 512 bytes on the kernel stack is quite > a lot and that's why we have per-cpu buffers. Especially that > bpf_trace_printk() can be called from any context, including NMI. Ok, I understand. What about having one buffer per helper, synchronized with a spinlock? Actually, bpf_trace_printk already has that, not for the bprintf arguments but for the bprintf output so this wouldn't change much to the performance of the helpers anyway: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/trace/bpf_trace.c?id=9d31d2338950293ec19d9b095fbaa9030899dcb4#n385 These helpers are not performance sensitive so a per-cpu stack of buffers feels over-engineered to me (and is also complexity I feel a bit uncomfortable with). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare 2021-05-06 20:17 ` Florent Revest @ 2021-05-06 21:38 ` Daniel Borkmann 2021-05-07 10:39 ` Florent Revest 0 siblings, 1 reply; 10+ messages in thread From: Daniel Borkmann @ 2021-05-06 21:38 UTC (permalink / raw) To: Florent Revest, Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot On 5/6/21 10:17 PM, Florent Revest wrote: > On Thu, May 6, 2021 at 8:52 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: >> On Wed, May 5, 2021 at 3:29 PM Florent Revest <revest@chromium.org> wrote: >>> On Wed, May 5, 2021 at 10:52 PM Andrii Nakryiko >>> <andrii.nakryiko@gmail.com> wrote: >>>> On Wed, May 5, 2021 at 1:48 PM Andrii Nakryiko >>>> <andrii.nakryiko@gmail.com> wrote: >>>>> On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote: >>>>>> On 5/5/21 8:55 PM, Andrii Nakryiko wrote: >>>>>>> On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote: >>>>>>>> >>>>>>>> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one >>>>>>>> per-cpu buffer that they use to store temporary data (arguments to >>>>>>>> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it >>>>>>>> by the end of their scope with bpf_bprintf_cleanup. >>>>>>>> >>>>>>>> If one of these helpers gets called within the scope of one of these >>>>>>>> helpers, for example: a first bpf program gets called, uses >>>>>>> >>>>>>> Can we afford having few struct bpf_printf_bufs? They are just 512 >>>>>>> bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the >>>>>>> only situation where this can occur, right? If someone is doing >>>>>>> bpf_snprintf() and interrupt occurs and we run another BPF program, it >>>>>>> will be impossible to do bpf_snprintf() or bpf_trace_printk() from the >>>>>>> second BPF program, etc. 
We can't eliminate the probability, but >>>>>>> having a small stack of buffers would make the probability so >>>>>>> miniscule as to not worry about it at all. >>>>>>> >>>>>>> Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so >>>>>>> the changes are minimal. Nestedness property is preserved for >>>>>>> non-sleepable BPF programs, right? If we want this to work for >>>>>>> sleepable we'd need to either: 1) disable migration or 2) instead of >>>>> >>>>> oh wait, we already disable migration for sleepable BPF progs, so it >>>>> should be good to do nestedness level only >>>> >>>> actually, migrate_disable() might not be enough. Unless it is >>>> impossible for some reason I miss, worst case it could be that two >>>> sleepable programs (A and B) can be intermixed on the same CPU: A >>>> starts&sleeps - B starts&sleeps - A continues&returns - B continues >>>> and nestedness doesn't work anymore. So something like "reserving a >>>> slot" would work better. >>> >>> Iiuc try_get_fmt_tmp_buf does preempt_enable to avoid that situation ? >>> >>>>>>> assuming a stack of buffers, do a loop to find unused one. Should be >>>>>>> acceptable performance-wise, as it's not the fastest code anyway >>>>>>> (printf'ing in general). >>>>>>> >>>>>>> In any case, re-using the same buffer for sort-of-optional-to-work >>>>>>> bpf_trace_printk() and probably-important-to-work bpf_snprintf() is >>>>>>> suboptimal, so seems worth fixing this. >>>>>>> >>>>>>> Thoughts? >>>>>> >>>>>> Yes, agree, it would otherwise be really hard to debug. I had the same >>>>>> thought on why not allowing nesting here given users very likely expect >>>>>> these helpers to just work for all the contexts. >>>>>> >>>>>> Thanks, >>>>>> Daniel >>> >>> What would you think of just letting the helpers own these 512 bytes >>> buffers as local variables on their stacks ? 
Then bpf_prepare_bprintf >>> would only need to write there, there would be no acquire semantic >>> (like try_get_fmt_tmp_buf) and the stack frame would just be freed on >>> the helper return so there would be no bpf_printf_cleanup either. We >>> would also not pre-reserve static memory for all CPUs and it becomes >>> trivial to handle re-entrant helper calls. >>> >>> I inherited this per-cpu buffer from the pre-existing bpf_seq_printf >>> code but I've not been convinced of its necessity. >> >> I got the impression that extra 512 bytes on the kernel stack is quite >> a lot and that's why we have per-cpu buffers. Especially that >> bpf_trace_printk() can be called from any context, including NMI. > > Ok, I understand. > > What about having one buffer per helper, synchronized with a spinlock? > Actually, bpf_trace_printk already has that, not for the bprintf > arguments but for the bprintf output so this wouldn't change much to > the performance of the helpers anyway: > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/trace/bpf_trace.c?id=9d31d2338950293ec19d9b095fbaa9030899dcb4#n385 > > These helpers are not performance sensitive so a per-cpu stack of > buffers feels over-engineered to me (and is also complexity I feel a > bit uncomfortable with). But wouldn't this have same potential of causing a deadlock? Simple example would be if you have a tracing prog attached to bstr_printf(), and one of the other helpers using the same lock called from a non-tracing prog. If it can be avoided fairly easily, I'd also opt for per-cpu buffers as Andrii mentioned earlier. We've had few prior examples with similar issues [0]. [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9594dc3c7e71b9f52bee1d7852eb3d4e3aea9e99 ^ permalink raw reply [flat|nested] 10+ messages in thread
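[Editor's note: one way to keep a single shared buffer while avoiding the deadlock Daniel describes would be a trylock rather than a blocking lock. Below is a hypothetical user-space sketch using an atomic flag as the lock; in the kernel this would correspond to something like raw_spin_trylock. This is an illustration of the design trade-off, not code from the thread.]

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define BUF_SZ 512

static char shared_buf[BUF_SZ];
static atomic_int buf_lock;	/* 0 = free, 1 = held; stand-in for a spinlock */

/* A *blocking* lock here could deadlock: a tracing program attached
 * inside the locked region (e.g. on bstr_printf) would spin on a lock
 * its own CPU already holds. A trylock degrades that case into a
 * recoverable -EBUSY instead. */
static int get_buf(char **buf)
{
	if (atomic_exchange(&buf_lock, 1))	/* already held: don't spin */
		return -EBUSY;
	*buf = shared_buf;
	return 0;
}

static void put_buf(void)
{
	atomic_store(&buf_lock, 0);
}
```

Even with trylock semantics, a single shared buffer serializes all CPUs, which is part of why the per-cpu approach is preferred below.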
* Re: [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare
  2021-05-06 21:38             ` Daniel Borkmann
@ 2021-05-07 10:39               ` Florent Revest
  0 siblings, 0 replies; 10+ messages in thread
From: Florent Revest @ 2021-05-07 10:39 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Brendan Jackman, Stanislav Fomichev, open list, syzbot

On Thu, May 6, 2021 at 11:38 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 5/6/21 10:17 PM, Florent Revest wrote:
> > On Thu, May 6, 2021 at 8:52 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >> On Wed, May 5, 2021 at 3:29 PM Florent Revest <revest@chromium.org> wrote:
> >>> On Wed, May 5, 2021 at 10:52 PM Andrii Nakryiko
> >>> <andrii.nakryiko@gmail.com> wrote:
> >>>> On Wed, May 5, 2021 at 1:48 PM Andrii Nakryiko
> >>>> <andrii.nakryiko@gmail.com> wrote:
> >>>>> On Wed, May 5, 2021 at 1:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>>> On 5/5/21 8:55 PM, Andrii Nakryiko wrote:
> >>>>>>> On Wed, May 5, 2021 at 9:23 AM Florent Revest <revest@chromium.org> wrote:
> >>>>>>>>
> >>>>>>>> The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one
> >>>>>>>> per-cpu buffer that they use to store temporary data (arguments to
> >>>>>>>> bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it
> >>>>>>>> by the end of their scope with bpf_bprintf_cleanup.
> >>>>>>>>
> >>>>>>>> If one of these helpers gets called within the scope of one of these
> >>>>>>>> helpers, for example: a first bpf program gets called, uses
> >>>>>>>
> >>>>>>> Can we afford having few struct bpf_printf_bufs? They are just 512
> >>>>>>> bytes, so can we have 3-5 of them? Tracing low-level stuff isn't the
> >>>>>>> only situation where this can occur, right? If someone is doing
> >>>>>>> bpf_snprintf() and interrupt occurs and we run another BPF program, it
> >>>>>>> will be impossible to do bpf_snprintf() or bpf_trace_printk() from the
> >>>>>>> second BPF program, etc. We can't eliminate the probability, but
> >>>>>>> having a small stack of buffers would make the probability so
> >>>>>>> miniscule as to not worry about it at all.
> >>>>>>>
> >>>>>>> Good thing is that try_get_fmt_tmp_buf() abstracts all the details, so
> >>>>>>> the changes are minimal. Nestedness property is preserved for
> >>>>>>> non-sleepable BPF programs, right? If we want this to work for
> >>>>>>> sleepable we'd need to either: 1) disable migration or 2) instead of
> >>>>>
> >>>>> oh wait, we already disable migration for sleepable BPF progs, so it
> >>>>> should be good to do nestedness level only
> >>>>
> >>>> actually, migrate_disable() might not be enough. Unless it is
> >>>> impossible for some reason I miss, worst case it could be that two
> >>>> sleepable programs (A and B) can be intermixed on the same CPU: A
> >>>> starts&sleeps - B starts&sleeps - A continues&returns - B continues
> >>>> and nestedness doesn't work anymore. So something like "reserving a
> >>>> slot" would work better.
> >>>
> >>> Iiuc try_get_fmt_tmp_buf does preempt_enable to avoid that situation ?
> >>>
> >>>>>>> assuming a stack of buffers, do a loop to find unused one. Should be
> >>>>>>> acceptable performance-wise, as it's not the fastest code anyway
> >>>>>>> (printf'ing in general).
> >>>>>>>
> >>>>>>> In any case, re-using the same buffer for sort-of-optional-to-work
> >>>>>>> bpf_trace_printk() and probably-important-to-work bpf_snprintf() is
> >>>>>>> suboptimal, so seems worth fixing this.
> >>>>>>>
> >>>>>>> Thoughts?
> >>>>>>
> >>>>>> Yes, agree, it would otherwise be really hard to debug. I had the same
> >>>>>> thought on why not allowing nesting here given users very likely expect
> >>>>>> these helpers to just work for all the contexts.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Daniel
> >>>
> >>> What would you think of just letting the helpers own these 512 bytes
> >>> buffers as local variables on their stacks ? Then bpf_prepare_bprintf
> >>> would only need to write there, there would be no acquire semantic
> >>> (like try_get_fmt_tmp_buf) and the stack frame would just be freed on
> >>> the helper return so there would be no bpf_printf_cleanup either. We
> >>> would also not pre-reserve static memory for all CPUs and it becomes
> >>> trivial to handle re-entrant helper calls.
> >>>
> >>> I inherited this per-cpu buffer from the pre-existing bpf_seq_printf
> >>> code but I've not been convinced of its necessity.
> >>
> >> I got the impression that extra 512 bytes on the kernel stack is quite
> >> a lot and that's why we have per-cpu buffers. Especially that
> >> bpf_trace_printk() can be called from any context, including NMI.
> >
> > Ok, I understand.
> >
> > What about having one buffer per helper, synchronized with a spinlock?
> > Actually, bpf_trace_printk already has that, not for the bprintf
> > arguments but for the bprintf output so this wouldn't change much to
> > the performance of the helpers anyway:
> > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/trace/bpf_trace.c?id=9d31d2338950293ec19d9b095fbaa9030899dcb4#n385
> >
> > These helpers are not performance sensitive so a per-cpu stack of
> > buffers feels over-engineered to me (and is also complexity I feel a
> > bit uncomfortable with).
>
> But wouldn't this have same potential of causing a deadlock? Simple example
> would be if you have a tracing prog attached to bstr_printf(), and one of
> the other helpers using the same lock called from a non-tracing prog. If

Ah, right, I see :/

> it can be avoided fairly easily, I'd also opt for per-cpu buffers as Andrii
> mentioned earlier. We've had few prior examples with similar issues [0].
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9594dc3c7e71b9f52bee1d7852eb3d4e3aea9e99

Ok it's not as bad as I imagined, thank you Daniel :) I'll look into it
beginning of next week.
end of thread, other threads:[~2021-05-07 10:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-05 16:23 [PATCH bpf] bpf: Don't WARN_ON_ONCE in bpf_bprintf_prepare Florent Revest
2021-05-05 18:55 ` Andrii Nakryiko
2021-05-05 20:00   ` Daniel Borkmann
2021-05-05 20:48     ` Andrii Nakryiko
2021-05-05 20:52       ` Andrii Nakryiko
2021-05-05 22:29         ` Florent Revest
2021-05-06 18:52           ` Andrii Nakryiko
2021-05-06 20:17             ` Florent Revest
2021-05-06 21:38               ` Daniel Borkmann
2021-05-07 10:39                 ` Florent Revest