All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf 0/4] libbpf: Fixes for ring buffer
@ 2022-11-11  9:26 Hou Tao
  2022-11-11  9:26 ` [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map Hou Tao
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Hou Tao @ 2022-11-11  9:26 UTC (permalink / raw)
  To: bpf, Yonghong Song
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1

From: Hou Tao <houtao1@huawei.com>

Hi,

The patch set tries to fix the problems found when testing ringbuf by
using 4KB and 2GB size. Patch 1 fixes the probe of ring buffer map on
host with 64KB page (e.g., an ARM64 host). Patch 2 & 3 fix the overflow
of length when mmaping 2GB kernel ringbuf or user ringbuf on libbpf.
Patch 4 just reject the reservation with invalid size.

Please check individual patch for details. And comments are always
welcome.

Hou Tao (4):
  libbpf: Adjust ring buffer size when probing ring buffer map
  libbpf: Handle size overflow for ringbuf mmap
  libbpf: Handle size overflow for user ringbuf mmap
  libbpf: Check the validity of size in user_ring_buffer__reserve()

 tools/lib/bpf/libbpf.c          |  2 +-
 tools/lib/bpf/libbpf_internal.h |  2 ++
 tools/lib/bpf/libbpf_probes.c   |  2 +-
 tools/lib/bpf/ringbuf.c         | 26 ++++++++++++++++++++++----
 4 files changed, 26 insertions(+), 6 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map
  2022-11-11  9:26 [PATCH bpf 0/4] libbpf: Fixes for ring buffer Hou Tao
@ 2022-11-11  9:26 ` Hou Tao
  2022-11-11 17:47   ` sdf
  2022-11-11  9:26 ` [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap Hou Tao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Hou Tao @ 2022-11-11  9:26 UTC (permalink / raw)
  To: bpf, Yonghong Song
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1

From: Hou Tao <houtao1@huawei.com>

Adjusting the size of ring buffer when probing ring buffer map, else
the probe may fail on host with 64KB page size (e.g., an ARM64 host).

After the fix, the output of "bpftool feature" on above host will be
correct.

Before :
    eBPF map_type ringbuf is NOT available
    eBPF map_type user_ringbuf is NOT available

After :
    eBPF map_type ringbuf is available
    eBPF map_type user_ringbuf is available

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 tools/lib/bpf/libbpf.c          | 2 +-
 tools/lib/bpf/libbpf_internal.h | 2 ++
 tools/lib/bpf/libbpf_probes.c   | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 184ce1684dcd..907f735568ae 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2353,7 +2353,7 @@ int parse_btf_map_def(const char *map_name, struct btf *btf,
 	return 0;
 }
 
-static size_t adjust_ringbuf_sz(size_t sz)
+size_t adjust_ringbuf_sz(size_t sz)
 {
 	__u32 page_sz = sysconf(_SC_PAGE_SIZE);
 	__u32 mul;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 377642ff51fc..99dc4d6a19be 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -576,4 +576,6 @@ static inline bool is_pow_of_2(size_t x)
 #define PROG_LOAD_ATTEMPTS 5
 int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size, int attempts);
 
+size_t adjust_ringbuf_sz(size_t sz);
+
 #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index f3a8e8e74eb8..29a1db2645fd 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -234,7 +234,7 @@ static int probe_map_create(enum bpf_map_type map_type)
 	case BPF_MAP_TYPE_USER_RINGBUF:
 		key_size = 0;
 		value_size = 0;
-		max_entries = 4096;
+		max_entries = adjust_ringbuf_sz(4096);
 		break;
 	case BPF_MAP_TYPE_STRUCT_OPS:
 		/* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap
  2022-11-11  9:26 [PATCH bpf 0/4] libbpf: Fixes for ring buffer Hou Tao
  2022-11-11  9:26 ` [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map Hou Tao
@ 2022-11-11  9:26 ` Hou Tao
  2022-11-11 17:54   ` sdf
  2022-11-11  9:26 ` [PATCH bpf 3/4] libbpf: Handle size overflow for user " Hou Tao
  2022-11-11  9:26 ` [PATCH bpf 4/4] libbpf: Check the validity of size in user_ring_buffer__reserve() Hou Tao
  3 siblings, 1 reply; 13+ messages in thread
From: Hou Tao @ 2022-11-11  9:26 UTC (permalink / raw)
  To: bpf, Yonghong Song
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1

From: Hou Tao <houtao1@huawei.com>

The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
will overflow u32 when mapping producer page and data pages. Only
casting max_entries to size_t is not enough, because for 32-bits
application on 64-bits kernel the size of read-only mmap region
also could overflow size_t.

So fixing it by casting the size of read-only mmap region into a __u64
and checking whether or not there will be overflow during mmap.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 tools/lib/bpf/ringbuf.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index d285171d4b69..c4bdc88af672 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	__u32 len = sizeof(info);
 	struct epoll_event *e;
 	struct ring *r;
+	__u64 ro_size;
 	void *tmp;
 	int err;
 
@@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	 * data size to allow simple reading of samples that wrap around the
 	 * end of a ring buffer. See kernel implementation for details.
 	 * */
-	tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
-		   MAP_SHARED, map_fd, rb->page_size);
+	ro_size = rb->page_size + 2 * (__u64)info.max_entries;
+	if (ro_size != (__u64)(size_t)ro_size) {
+		pr_warn("ringbuf: ring buffer size (%u) is too big\n",
+			info.max_entries);
+		return libbpf_err(-E2BIG);
+	}
+	tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
+		   rb->page_size);
 	if (tmp == MAP_FAILED) {
 		err = -errno;
 		ringbuf_unmap_ring(rb, r);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH bpf 3/4] libbpf: Handle size overflow for user ringbuf mmap
  2022-11-11  9:26 [PATCH bpf 0/4] libbpf: Fixes for ring buffer Hou Tao
  2022-11-11  9:26 ` [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map Hou Tao
  2022-11-11  9:26 ` [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap Hou Tao
@ 2022-11-11  9:26 ` Hou Tao
  2022-11-11 20:57   ` Andrii Nakryiko
  2022-11-11  9:26 ` [PATCH bpf 4/4] libbpf: Check the validity of size in user_ring_buffer__reserve() Hou Tao
  3 siblings, 1 reply; 13+ messages in thread
From: Hou Tao @ 2022-11-11  9:26 UTC (permalink / raw)
  To: bpf, Yonghong Song
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1

From: Hou Tao <houtao1@huawei.com>

Similar with the overflow problem on ringbuf mmap, in user_ringbuf_map()
2 * max_entries may overflow u32 when mapping writeable region.

Fixing it by casting the size of writable mmap region into a __u64 and
checking whether or not there will be overflow during mmap.

Fixes: b66ccae01f1d ("bpf: Add libbpf logic for user-space ring buffer")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 tools/lib/bpf/ringbuf.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index c4bdc88af672..b34e61c538d7 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -355,6 +355,7 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
 {
 	struct bpf_map_info info;
 	__u32 len = sizeof(info);
+	__u64 wr_size;
 	void *tmp;
 	struct epoll_event *rb_epoll;
 	int err;
@@ -391,8 +392,14 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
 	 * simple reading and writing of samples that wrap around the end of
 	 * the buffer.  See the kernel implementation for details.
 	 */
-	tmp = mmap(NULL, rb->page_size + 2 * info.max_entries,
-		   PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, rb->page_size);
+	wr_size = rb->page_size + 2 * (__u64)info.max_entries;
+	if (wr_size != (__u64)(size_t)wr_size) {
+		pr_warn("user ringbuf: ring buf size (%u) is too big\n",
+			info.max_entries);
+		return -E2BIG;
+	}
+	tmp = mmap(NULL, (size_t)wr_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+		   map_fd, rb->page_size);
 	if (tmp == MAP_FAILED) {
 		err = -errno;
 		pr_warn("user ringbuf: failed to mmap data pages for map fd=%d: %d\n",
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH bpf 4/4] libbpf: Check the validity of size in user_ring_buffer__reserve()
  2022-11-11  9:26 [PATCH bpf 0/4] libbpf: Fixes for ring buffer Hou Tao
                   ` (2 preceding siblings ...)
  2022-11-11  9:26 ` [PATCH bpf 3/4] libbpf: Handle size overflow for user " Hou Tao
@ 2022-11-11  9:26 ` Hou Tao
  3 siblings, 0 replies; 13+ messages in thread
From: Hou Tao @ 2022-11-11  9:26 UTC (permalink / raw)
  To: bpf, Yonghong Song
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1

From: Hou Tao <houtao1@huawei.com>

The top two bits of size are used as busy and discard flags, so reject
the reservation that has any of these special bits in the size. With the
addition of validity check, these is also no need to check whether or
not total_size is overflowed.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 tools/lib/bpf/ringbuf.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index b34e61c538d7..91146562e18a 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -490,6 +490,10 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
 	__u64 cons_pos, prod_pos;
 	struct ringbuf_hdr *hdr;
 
+	/* The top two bits are used as special flags */
+	if (size & (BPF_RINGBUF_BUSY_BIT | BPF_RINGBUF_DISCARD_BIT))
+		return errno = E2BIG, NULL;
+
 	/* Synchronizes with smp_store_release() in __bpf_user_ringbuf_peek() in
 	 * the kernel.
 	 */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map
  2022-11-11  9:26 ` [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map Hou Tao
@ 2022-11-11 17:47   ` sdf
  2022-11-12  2:31     ` Hou Tao
  0 siblings, 1 reply; 13+ messages in thread
From: sdf @ 2022-11-11 17:47 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko, Song Liu,
	Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

On 11/11, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>

> Adjusting the size of ring buffer when probing ring buffer map, else
> the probe may fail on host with 64KB page size (e.g., an ARM64 host).

> After the fix, the output of "bpftool feature" on above host will be
> correct.

> Before :
>      eBPF map_type ringbuf is NOT available
>      eBPF map_type user_ringbuf is NOT available

> After :
>      eBPF map_type ringbuf is available
>      eBPF map_type user_ringbuf is available

> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>   tools/lib/bpf/libbpf.c          | 2 +-
>   tools/lib/bpf/libbpf_internal.h | 2 ++
>   tools/lib/bpf/libbpf_probes.c   | 2 +-
>   3 files changed, 4 insertions(+), 2 deletions(-)

> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 184ce1684dcd..907f735568ae 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -2353,7 +2353,7 @@ int parse_btf_map_def(const char *map_name, struct  
> btf *btf,
>   	return 0;
>   }

> -static size_t adjust_ringbuf_sz(size_t sz)
> +size_t adjust_ringbuf_sz(size_t sz)
>   {
>   	__u32 page_sz = sysconf(_SC_PAGE_SIZE);
>   	__u32 mul;
> diff --git a/tools/lib/bpf/libbpf_internal.h  
> b/tools/lib/bpf/libbpf_internal.h
> index 377642ff51fc..99dc4d6a19be 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -576,4 +576,6 @@ static inline bool is_pow_of_2(size_t x)
>   #define PROG_LOAD_ATTEMPTS 5
>   int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size, int  
> attempts);

> +size_t adjust_ringbuf_sz(size_t sz);
> +
>   #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
> index f3a8e8e74eb8..29a1db2645fd 100644
> --- a/tools/lib/bpf/libbpf_probes.c
> +++ b/tools/lib/bpf/libbpf_probes.c
> @@ -234,7 +234,7 @@ static int probe_map_create(enum bpf_map_type  
> map_type)
>   	case BPF_MAP_TYPE_USER_RINGBUF:
>   		key_size = 0;
>   		value_size = 0;
> -		max_entries = 4096;
> +		max_entries = adjust_ringbuf_sz(4096);

Why not pass PAGE_SIZE directly here? Something like:

max_entries = sysconf(_SC_PAGE_SIZE);

?

>   		break;
>   	case BPF_MAP_TYPE_STRUCT_OPS:
>   		/* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */
> --
> 2.29.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap
  2022-11-11  9:26 ` [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap Hou Tao
@ 2022-11-11 17:54   ` sdf
  2022-11-11 20:56     ` Andrii Nakryiko
  0 siblings, 1 reply; 13+ messages in thread
From: sdf @ 2022-11-11 17:54 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko, Song Liu,
	Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

On 11/11, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>

> The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
> will overflow u32 when mapping producer page and data pages. Only
> casting max_entries to size_t is not enough, because for 32-bits
> application on 64-bits kernel the size of read-only mmap region
> also could overflow size_t.

> So fixing it by casting the size of read-only mmap region into a __u64
> and checking whether or not there will be overflow during mmap.

> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index d285171d4b69..c4bdc88af672 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>   	__u32 len = sizeof(info);
>   	struct epoll_event *e;
>   	struct ring *r;
> +	__u64 ro_size;
>   	void *tmp;
>   	int err;

> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int  
> map_fd,
>   	 * data size to allow simple reading of samples that wrap around the
>   	 * end of a ring buffer. See kernel implementation for details.
>   	 * */
> -	tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> -		   MAP_SHARED, map_fd, rb->page_size);
> +	ro_size = rb->page_size + 2 * (__u64)info.max_entries;

[..]

> +	if (ro_size != (__u64)(size_t)ro_size) {
> +		pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> +			info.max_entries);
> +		return libbpf_err(-E2BIG);
> +	}

Why do we need this check at all? IIUC, the problem is that the expression
"rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
overflow. So why doing this part only isn't enough?

size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);

sizeof(size_t) should be 8, so no overflow is possible?


> +	tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> +		   rb->page_size);
>   	if (tmp == MAP_FAILED) {
>   		err = -errno;
>   		ringbuf_unmap_ring(rb, r);
> --
> 2.29.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap
  2022-11-11 17:54   ` sdf
@ 2022-11-11 20:56     ` Andrii Nakryiko
  2022-11-11 21:24       ` Stanislav Fomichev
  2022-11-12  3:34       ` Hou Tao
  0 siblings, 2 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2022-11-11 20:56 UTC (permalink / raw)
  To: sdf
  Cc: Hou Tao, bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko,
	Song Liu, Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
>
> On 11/11, Hou Tao wrote:
> > From: Hou Tao <houtao1@huawei.com>
>
> > The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
> > will overflow u32 when mapping producer page and data pages. Only
> > casting max_entries to size_t is not enough, because for 32-bits
> > application on 64-bits kernel the size of read-only mmap region
> > also could overflow size_t.
>
> > So fixing it by casting the size of read-only mmap region into a __u64
> > and checking whether or not there will be overflow during mmap.
>
> > Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
> > ---
> >   tools/lib/bpf/ringbuf.c | 11 +++++++++--
> >   1 file changed, 9 insertions(+), 2 deletions(-)
>
> > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > index d285171d4b69..c4bdc88af672 100644
> > --- a/tools/lib/bpf/ringbuf.c
> > +++ b/tools/lib/bpf/ringbuf.c
> > @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >       __u32 len = sizeof(info);
> >       struct epoll_event *e;
> >       struct ring *r;
> > +     __u64 ro_size;

I found ro_size quite a confusing name, let's call it mmap_sz?

> >       void *tmp;
> >       int err;
>
> > @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int
> > map_fd,
> >        * data size to allow simple reading of samples that wrap around the
> >        * end of a ring buffer. See kernel implementation for details.
> >        * */
> > -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> > -                MAP_SHARED, map_fd, rb->page_size);
> > +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
>
> [..]
>
> > +     if (ro_size != (__u64)(size_t)ro_size) {
> > +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> > +                     info.max_entries);
> > +             return libbpf_err(-E2BIG);
> > +     }
>
> Why do we need this check at all? IIUC, the problem is that the expression
> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> overflow. So why doing this part only isn't enough?
>
> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
>
> sizeof(size_t) should be 8, so no overflow is possible?

not on 32-bit arches, presumably?



>
>
> > +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> > +                rb->page_size);

should we split this mmap into two mmaps -- one for producer_pos page,
another for data area. That will presumably allow to mmap ringbuf with
max_entries = 1GB?

> >       if (tmp == MAP_FAILED) {
> >               err = -errno;
> >               ringbuf_unmap_ring(rb, r);
> > --
> > 2.29.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 3/4] libbpf: Handle size overflow for user ringbuf mmap
  2022-11-11  9:26 ` [PATCH bpf 3/4] libbpf: Handle size overflow for user " Hou Tao
@ 2022-11-11 20:57   ` Andrii Nakryiko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2022-11-11 20:57 UTC (permalink / raw)
  To: Hou Tao, David Vernet
  Cc: bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko, Song Liu,
	Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1

On Fri, Nov 11, 2022 at 1:01 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> From: Hou Tao <houtao1@huawei.com>
>
> Similar with the overflow problem on ringbuf mmap, in user_ringbuf_map()
> 2 * max_entries may overflow u32 when mapping writeable region.
>
> Fixing it by casting the size of writable mmap region into a __u64 and
> checking whether or not there will be overflow during mmap.
>
> Fixes: b66ccae01f1d ("bpf: Add libbpf logic for user-space ring buffer")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  tools/lib/bpf/ringbuf.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index c4bdc88af672..b34e61c538d7 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -355,6 +355,7 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
>  {
>         struct bpf_map_info info;
>         __u32 len = sizeof(info);
> +       __u64 wr_size;
>         void *tmp;
>         struct epoll_event *rb_epoll;
>         int err;
> @@ -391,8 +392,14 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
>          * simple reading and writing of samples that wrap around the end of
>          * the buffer.  See the kernel implementation for details.
>          */
> -       tmp = mmap(NULL, rb->page_size + 2 * info.max_entries,
> -                  PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, rb->page_size);
> +       wr_size = rb->page_size + 2 * (__u64)info.max_entries;
> +       if (wr_size != (__u64)(size_t)wr_size) {
> +               pr_warn("user ringbuf: ring buf size (%u) is too big\n",
> +                       info.max_entries);
> +               return -E2BIG;
> +       }
> +       tmp = mmap(NULL, (size_t)wr_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> +                  map_fd, rb->page_size);

same suggestions as in precious patch: s/wr_size/mmap_sz/ and let's
discuss if we should split one mmap into two?

cc'ed David as well

>         if (tmp == MAP_FAILED) {
>                 err = -errno;
>                 pr_warn("user ringbuf: failed to mmap data pages for map fd=%d: %d\n",
> --
> 2.29.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap
  2022-11-11 20:56     ` Andrii Nakryiko
@ 2022-11-11 21:24       ` Stanislav Fomichev
  2022-11-12  3:34       ` Hou Tao
  1 sibling, 0 replies; 13+ messages in thread
From: Stanislav Fomichev @ 2022-11-11 21:24 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Hou Tao, bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko,
	Song Liu, Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

On Fri, Nov 11, 2022 at 12:56 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
> >
> > On 11/11, Hou Tao wrote:
> > > From: Hou Tao <houtao1@huawei.com>
> >
> > > The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
> > > will overflow u32 when mapping producer page and data pages. Only
> > > casting max_entries to size_t is not enough, because for 32-bits
> > > application on 64-bits kernel the size of read-only mmap region
> > > also could overflow size_t.
> >
> > > So fixing it by casting the size of read-only mmap region into a __u64
> > > and checking whether or not there will be overflow during mmap.
> >
> > > Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> > > Signed-off-by: Hou Tao <houtao1@huawei.com>
> > > ---
> > >   tools/lib/bpf/ringbuf.c | 11 +++++++++--
> > >   1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > > index d285171d4b69..c4bdc88af672 100644
> > > --- a/tools/lib/bpf/ringbuf.c
> > > +++ b/tools/lib/bpf/ringbuf.c
> > > @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> > >       __u32 len = sizeof(info);
> > >       struct epoll_event *e;
> > >       struct ring *r;
> > > +     __u64 ro_size;
>
> I found ro_size quite a confusing name, let's call it mmap_sz?
>
> > >       void *tmp;
> > >       int err;
> >
> > > @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int
> > > map_fd,
> > >        * data size to allow simple reading of samples that wrap around the
> > >        * end of a ring buffer. See kernel implementation for details.
> > >        * */
> > > -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> > > -                MAP_SHARED, map_fd, rb->page_size);
> > > +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
> >
> > [..]
> >
> > > +     if (ro_size != (__u64)(size_t)ro_size) {
> > > +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> > > +                     info.max_entries);
> > > +             return libbpf_err(-E2BIG);
> > > +     }
> >
> > Why do we need this check at all? IIUC, the problem is that the expression
> > "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> > overflow. So why doing this part only isn't enough?
> >
> > size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> > mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
> >
> > sizeof(size_t) should be 8, so no overflow is possible?
>
> not on 32-bit arches, presumably?

Good point, he even mentions it in the description, I can't read apparently :-/

"Only casting max_entries to size_t is not enough"

>
>
> >
> >
> > > +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> > > +                rb->page_size);
>
> should we split this mmap into two mmaps -- one for producer_pos page,
> another for data area. That will presumably allow to mmap ringbuf with
> max_entries = 1GB?
>
> > >       if (tmp == MAP_FAILED) {
> > >               err = -errno;
> > >               ringbuf_unmap_ring(rb, r);
> > > --
> > > 2.29.2
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map
  2022-11-11 17:47   ` sdf
@ 2022-11-12  2:31     ` Hou Tao
  0 siblings, 0 replies; 13+ messages in thread
From: Hou Tao @ 2022-11-12  2:31 UTC (permalink / raw)
  To: sdf
  Cc: bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko, Song Liu,
	Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

Hi,

On 11/12/2022 1:47 AM, sdf@google.com wrote:
> On 11/11, Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>
>> Adjusting the size of ring buffer when probing ring buffer map, else
>> the probe may fail on host with 64KB page size (e.g., an ARM64 host).
>
>> After the fix, the output of "bpftool feature" on above host will be
>> correct.
>
>> Before :
>>      eBPF map_type ringbuf is NOT available
>>      eBPF map_type user_ringbuf is NOT available
>
>> After :
>>      eBPF map_type ringbuf is available
>>      eBPF map_type user_ringbuf is available
>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>> ---
>>   tools/lib/bpf/libbpf.c          | 2 +-
>>   tools/lib/bpf/libbpf_internal.h | 2 ++
>>   tools/lib/bpf/libbpf_probes.c   | 2 +-
>>   3 files changed, 4 insertions(+), 2 deletions(-)
SNIP
>
>>   #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
>> diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
>> index f3a8e8e74eb8..29a1db2645fd 100644
>> --- a/tools/lib/bpf/libbpf_probes.c
>> +++ b/tools/lib/bpf/libbpf_probes.c
>> @@ -234,7 +234,7 @@ static int probe_map_create(enum bpf_map_type map_type)
>>       case BPF_MAP_TYPE_USER_RINGBUF:
>>           key_size = 0;
>>           value_size = 0;
>> -        max_entries = 4096;
>> +        max_entries = adjust_ringbuf_sz(4096);
>
> Why not pass PAGE_SIZE directly here? Something like:
>
> max_entries = sysconf(_SC_PAGE_SIZE);
>
> ?
Good idea. Will do that in v2.
>
>>           break;
>>       case BPF_MAP_TYPE_STRUCT_OPS:
>>           /* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */
>> -- 
>> 2.29.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap
  2022-11-11 20:56     ` Andrii Nakryiko
  2022-11-11 21:24       ` Stanislav Fomichev
@ 2022-11-12  3:34       ` Hou Tao
  2022-11-14 19:51         ` Andrii Nakryiko
  1 sibling, 1 reply; 13+ messages in thread
From: Hou Tao @ 2022-11-12  3:34 UTC (permalink / raw)
  To: Andrii Nakryiko, sdf
  Cc: bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko, Song Liu,
	Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

Hi,

On 11/12/2022 4:56 AM, Andrii Nakryiko wrote:
> On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
>> On 11/11, Hou Tao wrote:
>>> From: Hou Tao <houtao1@huawei.com>
>>> The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
>>> will overflow u32 when mapping producer page and data pages. Only
>>> casting max_entries to size_t is not enough, because for 32-bits
>>> application on 64-bits kernel the size of read-only mmap region
>>> also could overflow size_t.
>>> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
>>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>> ---
>>>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
>>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
>>> index d285171d4b69..c4bdc88af672 100644
>>> --- a/tools/lib/bpf/ringbuf.c
>>> +++ b/tools/lib/bpf/ringbuf.c
>>> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>>>       __u32 len = sizeof(info);
>>>       struct epoll_event *e;
>>>       struct ring *r;
>>> +     __u64 ro_size;
> I found ro_size quite a confusing name, let's call it mmap_sz?
OK.
>
>>>       void *tmp;
>>>       int err;
>>> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int
>>> map_fd,
>>>        * data size to allow simple reading of samples that wrap around the
>>>        * end of a ring buffer. See kernel implementation for details.
>>>        * */
>>> -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
>>> -                MAP_SHARED, map_fd, rb->page_size);
>>> +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
>> [..]
>>
>>> +     if (ro_size != (__u64)(size_t)ro_size) {
>>> +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
>>> +                     info.max_entries);
>>> +             return libbpf_err(-E2BIG);
>>> +     }
>> Why do we need this check at all? IIUC, the problem is that the expression
>> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
>> overflow. So why doing this part only isn't enough?
>>
>> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
>> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
>>
>> sizeof(size_t) should be 8, so no overflow is possible?
> not on 32-bit arches, presumably?
Yes. For 32-bits kernel, the total size of virtual address space for user space
and kernel space is 4GB, so when map_entries is 2GB, the needed virtual address
space will be 2GB + 4GB, so the mapping of ring buffer will fail either in
kernel or in userspace. A extreme case is 32-bits userspace under 64-bits
kernel. The mapping of 2GB ring buffer in kernel is OK, but 4GB will overflow
size_t on 32-bits userspace.
>


>
>
>>
>>> +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
>>> +                rb->page_size);
> should we split this mmap into two mmaps -- one for producer_pos page,
> another for data area. That will presumably allow to mmap ringbuf with
> max_entries = 1GB?
I don't understand the reason for the splitting. Even without the splitting, in
theory ring buffer with max_entries = 1GB will be OK for 32-bits kernel, despite
in practice the mapping of 1GB ring buffer on 32-bits kernel will fail because
the most common size of kernel virtual address space is 1GB (although ARM could
use VMSPLIT_1G to increase the size of kernel virtual address to 3GB).
>
>>>       if (tmp == MAP_FAILED) {
>>>               err = -errno;
>>>               ringbuf_unmap_ring(rb, r);
>>> --
>>> 2.29.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap
  2022-11-12  3:34       ` Hou Tao
@ 2022-11-14 19:51         ` Andrii Nakryiko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2022-11-14 19:51 UTC (permalink / raw)
  To: Hou Tao
  Cc: sdf, bpf, Yonghong Song, Martin KaFai Lau, Andrii Nakryiko,
	Song Liu, Hao Luo, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	Jiri Olsa, John Fastabend, houtao1

On Fri, Nov 11, 2022 at 7:34 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 11/12/2022 4:56 AM, Andrii Nakryiko wrote:
> > On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
> >> On 11/11, Hou Tao wrote:
> >>> From: Hou Tao <houtao1@huawei.com>
> >>> The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
> >>> will overflow u32 when mapping producer page and data pages. Only
> >>> casting max_entries to size_t is not enough, because for 32-bits
> >>> application on 64-bits kernel the size of read-only mmap region
> >>> also could overflow size_t.
> >>> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> >>> Signed-off-by: Hou Tao <houtao1@huawei.com>
> >>> ---
> >>>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
> >>>   1 file changed, 9 insertions(+), 2 deletions(-)
> >>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> >>> index d285171d4b69..c4bdc88af672 100644
> >>> --- a/tools/lib/bpf/ringbuf.c
> >>> +++ b/tools/lib/bpf/ringbuf.c
> >>> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >>>       __u32 len = sizeof(info);
> >>>       struct epoll_event *e;
> >>>       struct ring *r;
> >>> +     __u64 ro_size;
> > I found ro_size quite a confusing name, let's call it mmap_sz?
> OK.
> >
> >>>       void *tmp;
> >>>       int err;
> >>> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int
> >>> map_fd,
> >>>        * data size to allow simple reading of samples that wrap around the
> >>>        * end of a ring buffer. See kernel implementation for details.
> >>>        * */
> >>> -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> >>> -                MAP_SHARED, map_fd, rb->page_size);
> >>> +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
> >> [..]
> >>
> >>> +     if (ro_size != (__u64)(size_t)ro_size) {
> >>> +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> >>> +                     info.max_entries);
> >>> +             return libbpf_err(-E2BIG);
> >>> +     }
> >> Why do we need this check at all? IIUC, the problem is that the expression
> >> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> >> overflow. So why doing this part only isn't enough?
> >>
> >> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> >> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
> >>
> >> sizeof(size_t) should be 8, so no overflow is possible?
> > not on 32-bit arches, presumably?
> Yes. For 32-bits kernel, the total size of virtual address space for user space
> and kernel space is 4GB, so when map_entries is 2GB, the needed virtual address
> space will be 2GB + 4GB, so the mapping of ring buffer will fail either in
> kernel or in userspace. A extreme case is 32-bits userspace under 64-bits
> kernel. The mapping of 2GB ring buffer in kernel is OK, but 4GB will overflow
> size_t on 32-bits userspace.
> >
>
>
> >
> >
> >>
> >>> +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> >>> +                rb->page_size);
> > should we split this mmap into two mmaps -- one for producer_pos page,
> > another for data area. That will presumably allow to mmap ringbuf with
> > max_entries = 1GB?
> I don't understand the reason for the splitting. Even without the splitting, in
> theory ring buffer with max_entries = 1GB will be OK for 32-bits kernel, despite
> in practice the mapping of 1GB ring buffer on 32-bits kernel will fail because
> the most common size of kernel virtual address space is 1GB (although ARM could
> use VMSPLIT_1G to increase the size of kernel virtual address to 3GB).

Yep, never mind. size_t is positive, so it can express up to 4GB, so
2GB + 4KB is fine as is already (even though it most probably will
fail).

> >
> >>>       if (tmp == MAP_FAILED) {
> >>>               err = -errno;
> >>>               ringbuf_unmap_ring(rb, r);
> >>> --
> >>> 2.29.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-11-14 19:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-11  9:26 [PATCH bpf 0/4] libbpf: Fixes for ring buffer Hou Tao
2022-11-11  9:26 ` [PATCH bpf 1/4] libbpf: Adjust ring buffer size when probing ring buffer map Hou Tao
2022-11-11 17:47   ` sdf
2022-11-12  2:31     ` Hou Tao
2022-11-11  9:26 ` [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap Hou Tao
2022-11-11 17:54   ` sdf
2022-11-11 20:56     ` Andrii Nakryiko
2022-11-11 21:24       ` Stanislav Fomichev
2022-11-12  3:34       ` Hou Tao
2022-11-14 19:51         ` Andrii Nakryiko
2022-11-11  9:26 ` [PATCH bpf 3/4] libbpf: Handle size overflow for user " Hou Tao
2022-11-11 20:57   ` Andrii Nakryiko
2022-11-11  9:26 ` [PATCH bpf 4/4] libbpf: Check the validity of size in user_ring_buffer__reserve() Hou Tao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.