All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] libbpf: handle producer position overflow
@ 2023-07-24 13:24 Andrew Werner
  2023-08-22  5:15 ` Andrii Nakryiko
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Werner @ 2023-07-24 13:24 UTC (permalink / raw)
  To: bpf; +Cc: kernel-team, Andrew Werner

Before this patch, the producer position could overflow `unsigned
long`, in which case libbpf would forever stop processing new writes to
the ringbuf. This patch addresses that bug by avoiding ordered
comparison between the consumer and producer position. If the consumer
position is greater than the producer position, the assumption is that
the producer has overflowed.

A more defensive check could be to ensure that the delta is within
the allowed range, but such defensive checks are neither present in
the kernel side code nor in libbpf. The overflow that this patch
handles can occur while the producer and consumer follow a correct
protocol.

A selftest was written to demonstrate the bug, and indeed this patch
allows the test to continue to make progress past the overflow.
However, the author was unable to create a testing environment on a
32-bit machine, and the test requires substantial memory and over 4
hours to hit the overflow point on a 64-bit machine. Thus, the test
is not included in this patch because of the impracticality of running
it.

Additionally, this patch adds commentary around a separate point to note
that the modular arithmetic is valid in the face of overflows, as that
fact may not be obvious to future readers.

Signed-off-by: Andrew Werner <awerner32@gmail.com>
---
 tools/lib/bpf/ringbuf.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 02199364db13..6271757bc3d2 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
 	do {
 		got_new_data = false;
 		prod_pos = smp_load_acquire(r->producer_pos);
-		while (cons_pos < prod_pos) {
+
+		/* Avoid signed comparisons between the positions; the producer
+		 * position can overflow before the consumer position.
+		 */
+		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
 			len = smp_load_acquire(len_ptr);
 
@@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
 	prod_pos = smp_load_acquire(rb->producer_pos);
 
 	max_size = rb->mask + 1;
+
+	/* Note that this formulation in the face of overflow of prod_pos
+	 * so long as the delta between prod_pos and cons_pos remains no
+	 * greater than max_size.
+	 */
 	avail_size = max_size - (prod_pos - cons_pos);
 	/* Round up total size to a multiple of 8. */
 	total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-07-24 13:24 [PATCH] libbpf: handle producer position overflow Andrew Werner
@ 2023-08-22  5:15 ` Andrii Nakryiko
  2023-08-23  1:48   ` Hou Tao
  0 siblings, 1 reply; 8+ messages in thread
From: Andrii Nakryiko @ 2023-08-22  5:15 UTC (permalink / raw)
  To: Andrew Werner; +Cc: bpf, kernel-team

On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@gmail.com> wrote:
>
> Before this patch, the producer position could overflow `unsigned
> long`, in which case libbpf would forever stop processing new writes to
> the ringbuf. This patch addresses that bug by avoiding ordered
> comparison between the consumer and producer position. If the consumer
> position is greater than the producer position, the assumption is that
> the producer has overflowed.
>
> A more defensive check could be to ensure that the delta is within
> the allowed range, but such defensive checks are neither present in
> the kernel side code nor in libbpf. The overflow that this patch
> handles can occur while the producer and consumer follow a correct
> protocol.

Yep, great find!

>
> A selftest was written to demonstrate the bug, and indeed this patch
> allows the test to continue to make progress past the overflow.
> However, the author was unable to create a testing environment on a
> 32-bit machine, and the test requires substantial memory and over 4

2GB of memory for ringbuf, right? Perhaps it would be good to just
outline the repro, even if we won't have it implemented in selftests.
Something along the lines of: a) set up ringbuf of 2GB size and
reserve+commit maximum-sized record (UMAX/4) constantly as fast as
possible. With 1 million records per second repro time should be about
4.7 hours. Can you please update the commit with something like that
instead of a vague "there is repro, but I won't show it ;)" ? Thanks!

> hours to hit the overflow point on a 64-bit machine. Thus, the test
> is not included in this patch because of the impracticality of running
> it.
>
> Additionally, this patch adds commentary around a separate point to note
> that the modular arithmetic is valid in the face of overflows, as that
> fact may not be obvious to future readers.
>
> Signed-off-by: Andrew Werner <awerner32@gmail.com>
> ---
>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index 02199364db13..6271757bc3d2 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
>         do {
>                 got_new_data = false;
>                 prod_pos = smp_load_acquire(r->producer_pos);
> -               while (cons_pos < prod_pos) {
> +
> +               /* Avoid signed comparisons between the positions; the producer

"signed comparisons" is confusing and invalid, as cons_pos and
prod_pos are unsigned and comparison here is unsigned. What you meant
is inequality comparison, which is invalid when done naively (like it
is done in libbpf right now, sigh...), if counters can wrap around.

> +                * position can overflow before the consumer position.
> +                */
> +               while (cons_pos != prod_pos) {

I'm wondering if we should preserve the "consumer pos is before
producer pos" check just for clarity's sake (with a comment about
wrapping around of counters, of course) like:

if ((long)(cons_pos - prod_pos) < 0) ?

BTW, I think kernel code needs fixing as well in
__bpf_user_ringbuf_peek (we should compare consumer/producer positions
directly, only through subtraction and casting to signed long as
above), would you be able to fix it at the same time with libbpf?
Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
make sure we don't directly compare positions anywhere else.

>                         len_ptr = r->data + (cons_pos & r->mask);
>                         len = smp_load_acquire(len_ptr);
>
> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
>         prod_pos = smp_load_acquire(rb->producer_pos);
>
>         max_size = rb->mask + 1;
> +
> +       /* Note that this formulation in the face of overflow of prod_pos
> +        * so long as the delta between prod_pos and cons_pos remains no
> +        * greater than max_size.
> +        */
>         avail_size = max_size - (prod_pos - cons_pos);
>         /* Round up total size to a multiple of 8. */
>         total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
> --
> 2.39.2
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-08-22  5:15 ` Andrii Nakryiko
@ 2023-08-23  1:48   ` Hou Tao
  2023-08-23 16:52     ` Andrii Nakryiko
  0 siblings, 1 reply; 8+ messages in thread
From: Hou Tao @ 2023-08-23  1:48 UTC (permalink / raw)
  To: Andrii Nakryiko, Andrew Werner; +Cc: bpf, kernel-team

Hi,

On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@gmail.com> wrote:
>> Before this patch, the producer position could overflow `unsigned
>> long`, in which case libbpf would forever stop processing new writes to
>> the ringbuf. This patch addresses that bug by avoiding ordered
>> comparison between the consumer and producer position. If the consumer
>> position is greater than the producer position, the assumption is that
>> the producer has overflowed.
>>
>> A more defensive check could be to ensure that the delta is within
>> the allowed range, but such defensive checks are neither present in
>> the kernel side code nor in libbpf. The overflow that this patch
>> handles can occur while the producer and consumer follow a correct
>> protocol.
> Yep, great find!
>
>> A selftest was written to demonstrate the bug, and indeed this patch
>> allows the test to continue to make progress past the overflow.
>> However, the author was unable to create a testing environment on a
>> 32-bit machine, and the test requires substantial memory and over 4
> 2GB of memory for ringbuf, right? Perhaps it would be good to just
> outline the repro, even if we won't have it implemented in selftests.
> Something along the lines of: a) set up ringbuf of 2GB size and
> reserve+commit maximum-sized record (UMAX/4) constantly as fast as
> possible. With 1 million records per second repro time should be about
> 4.7 hours. Can you please update the commit with something like that
> instead of a vague "there is repro, but I won't show it ;)" ? Thanks!

I think it would be great that the commit message can elaborate about
the repo. Andrew had already posted an external link to the reproducer
in v2 [0]: https://github.com/ajwerner/bpf/commit/85e1240e7713

[0]:
https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@gmail.com/
>> hours to hit the overflow point on a 64-bit machine. Thus, the test
>> is not included in this patch because of the impracticality of running
>> it.
>>
>> Additionally, this patch adds commentary around a separate point to note
>> that the modular arithmetic is valid in the face of overflows, as that
>> fact may not be obvious to future readers.
>>
>> Signed-off-by: Andrew Werner <awerner32@gmail.com>
>> ---
>>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
>> index 02199364db13..6271757bc3d2 100644
>> --- a/tools/lib/bpf/ringbuf.c
>> +++ b/tools/lib/bpf/ringbuf.c
>> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
>>         do {
>>                 got_new_data = false;
>>                 prod_pos = smp_load_acquire(r->producer_pos);
>> -               while (cons_pos < prod_pos) {
>> +
>> +               /* Avoid signed comparisons between the positions; the producer
> "signed comparisons" is confusing and invalid, as cons_pos and
> prod_pos are unsigned and comparison here is unsigned. What you meant
> is inequality comparison, which is invalid when done naively (like it
> is done in libbpf right now, sigh...), if counters can wrap around.
>
>> +                * position can overflow before the consumer position.
>> +                */
>> +               while (cons_pos != prod_pos) {
> I'm wondering if we should preserve the "consumer pos is before
> producer pos" check just for clarity's sake (with a comment about
> wrapping around of counters, of course) like:
>
> if ((long)(cons_pos - prod_pos) < 0) ?
>
> BTW, I think kernel code needs fixing as well in
> __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> directly, only through subtraction and casting to signed long as
> above), would you be able to fix it at the same time with libbpf?
> Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> make sure we don't directly compare positions anywhere else.

I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
help to double-check kernel/bpf/ringbuf.c.
>
>>                         len_ptr = r->data + (cons_pos & r->mask);
>>                         len = smp_load_acquire(len_ptr);
>>
>> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
>>         prod_pos = smp_load_acquire(rb->producer_pos);
>>
>>         max_size = rb->mask + 1;
>> +
>> +       /* Note that this formulation in the face of overflow of prod_pos
>> +        * so long as the delta between prod_pos and cons_pos remains no
>> +        * greater than max_size.
>> +        */
>>         avail_size = max_size - (prod_pos - cons_pos);
>>         /* Round up total size to a multiple of 8. */
>>         total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
>> --
>> 2.39.2
>>
>>
> .


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-08-23  1:48   ` Hou Tao
@ 2023-08-23 16:52     ` Andrii Nakryiko
  2023-08-23 22:14       ` Andrew Werner
  0 siblings, 1 reply; 8+ messages in thread
From: Andrii Nakryiko @ 2023-08-23 16:52 UTC (permalink / raw)
  To: Hou Tao; +Cc: Andrew Werner, bpf, kernel-team

On Tue, Aug 22, 2023 at 6:49 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> > On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@gmail.com> wrote:
> >> Before this patch, the producer position could overflow `unsigned
> >> long`, in which case libbpf would forever stop processing new writes to
> >> the ringbuf. This patch addresses that bug by avoiding ordered
> >> comparison between the consumer and producer position. If the consumer
> >> position is greater than the producer position, the assumption is that
> >> the producer has overflowed.
> >>
> >> A more defensive check could be to ensure that the delta is within
> >> the allowed range, but such defensive checks are neither present in
> >> the kernel side code nor in libbpf. The overflow that this patch
> >> handles can occur while the producer and consumer follow a correct
> >> protocol.
> > Yep, great find!
> >
> >> A selftest was written to demonstrate the bug, and indeed this patch
> >> allows the test to continue to make progress past the overflow.
> >> However, the author was unable to create a testing environment on a
> >> 32-bit machine, and the test requires substantial memory and over 4
> > 2GB of memory for ringbuf, right? Perhaps it would be good to just
> > outline the repro, even if we won't have it implemented in selftests.
> > Something along the lines of: a) set up ringbuf of 2GB size and
> > reserve+commit maximum-sized record (UMAX/4) constantly as fast as
> > possible. With 1 million records per second repro time should be about
> > 4.7 hours. Can you please update the commit with something like that
> > instead of a vague "there is repro, but I won't show it ;)" ? Thanks!
>
> I think it would be great that the commit message can elaborate about
> the repo. Andrew had already posted an external link to the reproducer
> in v2 [0]: https://github.com/ajwerner/bpf/commit/85e1240e7713

Yeah, of course, I saw it. I didn't mean that he actually refuses to
show repro, it's just that the commit message is vague about it and we
should improve that. I think we are on the same page. Including the
link to full repro in addition to more detailed description is also
good.

>
> [0]:
> https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@gmail.com/
> >> hours to hit the overflow point on a 64-bit machine. Thus, the test
> >> is not included in this patch because of the impracticality of running
> >> it.
> >>
> >> Additionally, this patch adds commentary around a separate point to note
> >> that the modular arithmetic is valid in the face of overflows, as that
> >> fact may not be obvious to future readers.
> >>
> >> Signed-off-by: Andrew Werner <awerner32@gmail.com>
> >> ---
> >>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
> >>  1 file changed, 10 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> >> index 02199364db13..6271757bc3d2 100644
> >> --- a/tools/lib/bpf/ringbuf.c
> >> +++ b/tools/lib/bpf/ringbuf.c
> >> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
> >>         do {
> >>                 got_new_data = false;
> >>                 prod_pos = smp_load_acquire(r->producer_pos);
> >> -               while (cons_pos < prod_pos) {
> >> +
> >> +               /* Avoid signed comparisons between the positions; the producer
> > "signed comparisons" is confusing and invalid, as cons_pos and
> > prod_pos are unsigned and comparison here is unsigned. What you meant
> > is inequality comparison, which is invalid when done naively (like it
> > is done in libbpf right now, sigh...), if counters can wrap around.
> >
> >> +                * position can overflow before the consumer position.
> >> +                */
> >> +               while (cons_pos != prod_pos) {
> > I'm wondering if we should preserve the "consumer pos is before
> > producer pos" check just for clarity's sake (with a comment about
> > wrapping around of counters, of course) like:
> >
> > if ((long)(cons_pos - prod_pos) < 0) ?
> >
> > BTW, I think kernel code needs fixing as well in
> > __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> > directly, only through subtraction and casting to signed long as
> > above), would you be able to fix it at the same time with libbpf?
> > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> > make sure we don't directly compare positions anywhere else.
>
> I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
> help to double-check kernel/bpf/ringbuf.c.

great, thanks!

> >
> >>                         len_ptr = r->data + (cons_pos & r->mask);
> >>                         len = smp_load_acquire(len_ptr);
> >>
> >> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
> >>         prod_pos = smp_load_acquire(rb->producer_pos);
> >>
> >>         max_size = rb->mask + 1;
> >> +
> >> +       /* Note that this formulation in the face of overflow of prod_pos
> >> +        * so long as the delta between prod_pos and cons_pos remains no
> >> +        * greater than max_size.
> >> +        */
> >>         avail_size = max_size - (prod_pos - cons_pos);
> >>         /* Round up total size to a multiple of 8. */
> >>         total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
> >> --
> >> 2.39.2
> >>
> >>
> > .
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-08-23 16:52     ` Andrii Nakryiko
@ 2023-08-23 22:14       ` Andrew Werner
  2023-08-24  0:07         ` Andrii Nakryiko
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Werner @ 2023-08-23 22:14 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: Hou Tao, bpf, kernel-team

On Wed, Aug 23, 2023 at 12:52 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Aug 22, 2023 at 6:49 PM Hou Tao <houtao@huaweicloud.com> wrote:
> >
> > Hi,
> >
> > On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> > > On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@gmail.com> wrote:
> > >> Before this patch, the producer position could overflow `unsigned
> > >> long`, in which case libbpf would forever stop processing new writes to
> > >> the ringbuf. This patch addresses that bug by avoiding ordered
> > >> comparison between the consumer and producer position. If the consumer
> > >> position is greater than the producer position, the assumption is that
> > >> the producer has overflowed.
> > >>
> > >> A more defensive check could be to ensure that the delta is within
> > >> the allowed range, but such defensive checks are neither present in
> > >> the kernel side code nor in libbpf. The overflow that this patch
> > >> handles can occur while the producer and consumer follow a correct
> > >> protocol.
> > > Yep, great find!
> > >
> > >> A selftest was written to demonstrate the bug, and indeed this patch
> > >> allows the test to continue to make progress past the overflow.
> > >> However, the author was unable to create a testing environment on a
> > >> 32-bit machine, and the test requires substantial memory and over 4
> > > 2GB of memory for ringbuf, right? Perhaps it would be good to just
> > > outline the repro, even if we won't have it implemented in selftests.
> > > Something along the lines of: a) set up ringbuf of 2GB size and
> > > reserve+commit maximum-sized record (UMAX/4) constantly as fast as
> > > possible. With 1 million records per second repro time should be about
> > > 4.7 hours. Can you please update the commit with something like that
> > > instead of a vague "there is repro, but I won't show it ;)" ? Thanks!
> >
> > I think it would be great that the commit message can elaborate about
> > the repo. Andrew had already posted an external link to the reproducer
> > in v2 [0]: https://github.com/ajwerner/bpf/commit/85e1240e7713
>
> Yeah, of course, I saw it. I didn't mean that he actually refuses to
> show repro, it's just that the commit message is vague about it and we
> should improve that. I think we are on the same page. Including the
> link to full repro in addition to more detailed description is also
> good.

I'll update the message as you suggest in v3.

>
> >
> > [0]:
> > https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@gmail.com/
> > >> hours to hit the overflow point on a 64-bit machine. Thus, the test
> > >> is not included in this patch because of the impracticality of running
> > >> it.
> > >>
> > >> Additionally, this patch adds commentary around a separate point to note
> > >> that the modular arithmetic is valid in the face of overflows, as that
> > >> fact may not be obvious to future readers.
> > >>
> > >> Signed-off-by: Andrew Werner <awerner32@gmail.com>
> > >> ---
> > >>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
> > >>  1 file changed, 10 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > >> index 02199364db13..6271757bc3d2 100644
> > >> --- a/tools/lib/bpf/ringbuf.c
> > >> +++ b/tools/lib/bpf/ringbuf.c
> > >> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
> > >>         do {
> > >>                 got_new_data = false;
> > >>                 prod_pos = smp_load_acquire(r->producer_pos);
> > >> -               while (cons_pos < prod_pos) {
> > >> +
> > >> +               /* Avoid signed comparisons between the positions; the producer
> > > "signed comparisons" is confusing and invalid, as cons_pos and
> > > prod_pos are unsigned and comparison here is unsigned. What you meant
> > > is inequality comparison, which is invalid when done naively (like it
> > > is done in libbpf right now, sigh...), if counters can wrap around.
> > >
> > >> +                * position can overflow before the consumer position.
> > >> +                */
> > >> +               while (cons_pos != prod_pos) {
> > > I'm wondering if we should preserve the "consumer pos is before
> > > producer pos" check just for clarity's sake (with a comment about
> > > wrapping around of counters, of course) like:
> > >
> > > if ((long)(cons_pos - prod_pos) < 0) ?

Will do.

> > >
> > > BTW, I think kernel code needs fixing as well in
> > > __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> > > directly, only through subtraction and casting to signed long as
> > > above), would you be able to fix it at the same time with libbpf?
> > > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> > > make sure we don't directly compare positions anywhere else.
> >
> > I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
> > help to double-check kernel/bpf/ringbuf.c.
>
> great, thanks!

I'll update the code in __bpf_user_ringbuf_peek. One observation is
that the code there differs from the regular (non-user) ringbuf and
the libbpf code in that it uses u64 for the positions whereas
everywhere else (including in the struct definition) the types of
these positions are unsigned long. I get that on 64-bit architectures,
practically speaking, these are the same. What I'm less clear on is
whether there's anything that prevents this code from running (perhaps
with bugs) on 32-bit machines. I suppose that on little-endian
machines things will just work out until the positions overflow 32
bits.

Would it make sense for me to make the change to always use unsigned
long for these values in the same change?

>
> > >
> > >>                         len_ptr = r->data + (cons_pos & r->mask);
> > >>                         len = smp_load_acquire(len_ptr);
> > >>
> > >> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
> > >>         prod_pos = smp_load_acquire(rb->producer_pos);
> > >>
> > >>         max_size = rb->mask + 1;
> > >> +
> > >> +       /* Note that this formulation in the face of overflow of prod_pos
> > >> +        * so long as the delta between prod_pos and cons_pos remains no
> > >> +        * greater than max_size.
> > >> +        */
> > >>         avail_size = max_size - (prod_pos - cons_pos);
> > >>         /* Round up total size to a multiple of 8. */
> > >>         total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
> > >> --
> > >> 2.39.2
> > >>
> > >>
> > > .
> >
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-08-23 22:14       ` Andrew Werner
@ 2023-08-24  0:07         ` Andrii Nakryiko
  2023-08-24 21:09           ` David Vernet
  0 siblings, 1 reply; 8+ messages in thread
From: Andrii Nakryiko @ 2023-08-24  0:07 UTC (permalink / raw)
  To: Andrew Werner, David Vernet; +Cc: Hou Tao, bpf, kernel-team

On Wed, Aug 23, 2023 at 3:14 PM Andrew Werner <awerner32@gmail.com> wrote:
>
> On Wed, Aug 23, 2023 at 12:52 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Aug 22, 2023 at 6:49 PM Hou Tao <houtao@huaweicloud.com> wrote:
> > >
> > > Hi,
> > >
> > > On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> > > > On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@gmail.com> wrote:
> > > >> Before this patch, the producer position could overflow `unsigned
> > > >> long`, in which case libbpf would forever stop processing new writes to
> > > >> the ringbuf. This patch addresses that bug by avoiding ordered
> > > >> comparison between the consumer and producer position. If the consumer
> > > >> position is greater than the producer position, the assumption is that
> > > >> the producer has overflowed.
> > > >>
> > > >> A more defensive check could be to ensure that the delta is within
> > > >> the allowed range, but such defensive checks are neither present in
> > > >> the kernel side code nor in libbpf. The overflow that this patch
> > > >> handles can occur while the producer and consumer follow a correct
> > > >> protocol.
> > > > Yep, great find!
> > > >
> > > >> A selftest was written to demonstrate the bug, and indeed this patch
> > > >> allows the test to continue to make progress past the overflow.
> > > >> However, the author was unable to create a testing environment on a
> > > >> 32-bit machine, and the test requires substantial memory and over 4
> > > > 2GB of memory for ringbuf, right? Perhaps it would be good to just
> > > > outline the repro, even if we won't have it implemented in selftests.
> > > > Something along the lines of: a) set up ringbuf of 2GB size and
> > > > reserve+commit maximum-sized record (UMAX/4) constantly as fast as
> > > > possible. With 1 million records per second repro time should be about
> > > > 4.7 hours. Can you please update the commit with something like that
> > > > instead of a vague "there is repro, but I won't show it ;)" ? Thanks!
> > >
> > > I think it would be great that the commit message can elaborate about
> > > the repo. Andrew had already posted an external link to the reproducer
> > > in v2 [0]: https://github.com/ajwerner/bpf/commit/85e1240e7713
> >
> > Yeah, of course, I saw it. I didn't mean that he actually refuses to
> > show repro, it's just that the commit message is vague about it and we
> > should improve that. I think we are on the same page. Including the
> > link to full repro in addition to more detailed description is also
> > good.
>
> I'll update the message as you suggest in v3.
>
> >
> > >
> > > [0]:
> > > https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@gmail.com/
> > > >> hours to hit the overflow point on a 64-bit machine. Thus, the test
> > > >> is not included in this patch because of the impracticality of running
> > > >> it.
> > > >>
> > > >> Additionally, this patch adds commentary around a separate point to note
> > > >> that the modular arithmetic is valid in the face of overflows, as that
> > > >> fact may not be obvious to future readers.
> > > >>
> > > >> Signed-off-by: Andrew Werner <awerner32@gmail.com>
> > > >> ---
> > > >>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
> > > >>  1 file changed, 10 insertions(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > > >> index 02199364db13..6271757bc3d2 100644
> > > >> --- a/tools/lib/bpf/ringbuf.c
> > > >> +++ b/tools/lib/bpf/ringbuf.c
> > > >> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
> > > >>         do {
> > > >>                 got_new_data = false;
> > > >>                 prod_pos = smp_load_acquire(r->producer_pos);
> > > >> -               while (cons_pos < prod_pos) {
> > > >> +
> > > >> +               /* Avoid signed comparisons between the positions; the producer
> > > > "signed comparisons" is confusing and invalid, as cons_pos and
> > > > prod_pos are unsigned and comparison here is unsigned. What you meant
> > > > is inequality comparison, which is invalid when done naively (like it
> > > > is done in libbpf right now, sigh...), if counters can wrap around.
> > > >
> > > >> +                * position can overflow before the consumer position.
> > > >> +                */
> > > >> +               while (cons_pos != prod_pos) {
> > > > I'm wondering if we should preserve the "consumer pos is before
> > > > producer pos" check just for clarity's sake (with a comment about
> > > > wrapping around of counters, of course) like:
> > > >
> > > > if ((long)(cons_pos - prod_pos) < 0) ?
>
> Will do.
>
> > > >
> > > > BTW, I think kernel code needs fixing as well in
> > > > __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> > > > directly, only through subtraction and casting to signed long as
> > > > above), would you be able to fix it at the same time with libbpf?
> > > > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> > > > make sure we don't directly compare positions anywhere else.
> > >
> > > I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
> > > help to double-check kernel/bpf/ringbuf.c.
> >
> > great, thanks!
>
> I'll update the code in __bpf_user_ringbuf_peek. One observation is
> that the code there differs from the regular (non-user) ringbuf and
> the libbpf code in that it uses u64 for the positions whereas
> everywhere else (including in the struct definition) the types of
> these positions are unsigned long. I get that on 64-bit architectures,
> practically speaking, these are the same. What I'm less clear on is
> whether there's anything that prevents this code from running (perhaps
> with bugs) on 32-bit machines. I suppose that on little-endian
> machines things will just work out until the positions overflow 32
> bits.
>
> Would it make sense for me to make the change to always use unsigned
> long for these values in the same change?

I'm not sure, but I suspect that u64 usage is not intentional, so we
probably do want to use unsigned long consistently.

cc'ing David as well


>
> >
> > > >
> > > >>                         len_ptr = r->data + (cons_pos & r->mask);
> > > >>                         len = smp_load_acquire(len_ptr);
> > > >>
> > > >> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
> > > >>         prod_pos = smp_load_acquire(rb->producer_pos);
> > > >>
> > > >>         max_size = rb->mask + 1;
> > > >> +
> > > >> +       /* Note that this formulation in the face of overflow of prod_pos
> > > >> +        * so long as the delta between prod_pos and cons_pos remains no
> > > >> +        * greater than max_size.
> > > >> +        */
> > > >>         avail_size = max_size - (prod_pos - cons_pos);
> > > >>         /* Round up total size to a multiple of 8. */
> > > >>         total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
> > > >> --
> > > >> 2.39.2
> > > >>
> > > >>
> > > > .
> > >
> > >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-08-24  0:07         ` Andrii Nakryiko
@ 2023-08-24 21:09           ` David Vernet
  2023-08-24 22:22             ` Andrew Werner
  0 siblings, 1 reply; 8+ messages in thread
From: David Vernet @ 2023-08-24 21:09 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: Andrew Werner, Hou Tao, bpf, kernel-team

On Wed, Aug 23, 2023 at 05:07:42PM -0700, Andrii Nakryiko wrote:

[...]

> > > > > BTW, I think kernel code needs fixing as well in
> > > > > __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> > > > > directly, only through subtraction and casting to signed long as
> > > > > above), would you be able to fix it at the same time with libbpf?
> > > > > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> > > > > make sure we don't directly compare positions anywhere else.
> > > >
> > > > I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
> > > > help to double-check kernel/bpf/ringbuf.c.
> > >
> > > great, thanks!

Good catch, and thanks!

> >
> > I'll update the code in __bpf_user_ringbuf_peek. One observation is
> > that the code there differs from the regular (non-user) ringbuf and
> > the libbpf code in that it uses u64 for the positions whereas
> > everywhere else (including in the struct definition) the types of
> > these positions are unsigned long. I get that on 64-bit architectures,
> > practically speaking, these are the same. What I'm less clear on is
> > whether there's anything that prevents this code from running (perhaps
> > with bugs) on 32-bit machines. I suppose that on little-endian
> > machines things will just work out until the positions overflow 32
> > bits.
> >
> > Would it make sense for me to make the change to always use unsigned
> > long for these values in the same change?
> 
> I'm not sure, but I suspect that u64 usage is not intentional, so we
> probably do want to use unsigned long consistently.

Yeah, that was an oversight and should be fixed. Thanks for pointing it
out Andrew. FYI, we'd also need to fix user_ring_buffer__reserve() in
libbpf.

Andrew -- I'm happy to send a patch that fixes this. If you'd like to
send it out, please feel free to do so. Just let me know.

Thanks,
David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libbpf: handle producer position overflow
  2023-08-24 21:09           ` David Vernet
@ 2023-08-24 22:22             ` Andrew Werner
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Werner @ 2023-08-24 22:22 UTC (permalink / raw)
  To: void; +Cc: Andrii Nakryiko, Hou Tao, bpf, kernel-team

On Thu, Aug 24, 2023 at 5:09 PM David Vernet <void@manifault.com> wrote:
>
> On Wed, Aug 23, 2023 at 05:07:42PM -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > > > > BTW, I think kernel code needs fixing as well in
> > > > > > __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> > > > > > directly, only through subtraction and casting to signed long as
> > > > > > above), would you be able to fix it at the same time with libbpf?
> > > > > > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> > > > > > make sure we don't directly compare positions anywhere else.
> > > > >
> > > > > I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
> > > > > help to double-check kernel/bpf/ringbuf.c.
> > > >
> > > > great, thanks!
>
> Good catch, and thanks!
>
> > >
> > > I'll update the code in __bpf_user_ringbuf_peek. One observation is
> > > that the code there differs from the regular (non-user) ringbuf and
> > > the libbpf code in that it uses u64 for the positions whereas
> > > everywhere else (including in the struct definition) the types of
> > > these positions are unsigned long. I get that on 64-bit architectures,
> > > practically speaking, these are the same. What I'm less clear on is
> > > whether there's anything that prevents this code from running (perhaps
> > > with bugs) on 32-bit machines. I suppose that on little-endian
> > > machines things will just work out until the positions overflow 32
> > > bits.
> > >
> > > Would it make sense for me to make the change to always use unsigned
> > > long for these values in the same change?
> >
> > I'm not sure, but I suspect that u64 usage is not intentional, so we
> > probably do want to use unsigned long consistently.
>
> Yeah, that was an oversight and should be fixed. Thanks for pointing it
> out Andrew. FYI, we'd also need to fix user_ring_buffer__reserve() in
> libbpf.
>
> Andrew -- I'm happy to send a patch that fixes this. If you'd like to
> send it out, please feel free to do so. Just let me know.

I handled it in v3 which I just posted here:
https://lore.kernel.org/bpf/20230824220907.1172808-1-awerner32@gmail.com/T/#u

>
> Thanks,
> David

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-08-24 22:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-24 13:24 [PATCH] libbpf: handle producer position overflow Andrew Werner
2023-08-22  5:15 ` Andrii Nakryiko
2023-08-23  1:48   ` Hou Tao
2023-08-23 16:52     ` Andrii Nakryiko
2023-08-23 22:14       ` Andrew Werner
2023-08-24  0:07         ` Andrii Nakryiko
2023-08-24 21:09           ` David Vernet
2023-08-24 22:22             ` Andrew Werner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.