bpf.vger.kernel.org archive mirror
* [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers
@ 2019-10-31  1:00 Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 1/8] uaccess: Add non-pagefault user-space write function Daniel Borkmann
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC
  To: bpf; +Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann

This set adds probe_read_{user,kernel}() and probe_read_{user,kernel}_str()
helpers, and fixes the probe_write_user() helper and selftests. For details,
please see the individual patches.

Thanks!

v1 -> v2:
  - standardize unsafe_ptr terminology in uapi header comment (Andrii)
  - probe_read_{user,kernel}[_str] naming scheme (Andrii)
  - use global data in last test case, remove relaxed_maps (Andrii)
  - add strict non-pagefault kernel read funcs to avoid warning in
    kernel probe read helpers (Alexei)

Daniel Borkmann (8):
  uaccess: Add non-pagefault user-space write function
  uaccess: Add strict non-pagefault kernel-space read function
  bpf: Make use of probe_user_write in probe write helper
  bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers
  bpf: Switch BPF probe insns to bpf_probe_read_kernel
  bpf, samples: Use bpf_probe_read_user where appropriate
  bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper
  bpf, testing: Add selftest to read/write sockaddr from user space

 arch/x86/mm/Makefile                          |   2 +-
 arch/x86/mm/maccess.c                         |  38 ++++
 include/linux/uaccess.h                       |  16 ++
 include/uapi/linux/bpf.h                      | 120 +++++++----
 kernel/bpf/core.c                             |   9 +-
 kernel/trace/bpf_trace.c                      | 187 +++++++++++++-----
 mm/maccess.c                                  |  70 ++++++-
 samples/bpf/map_perf_test_kern.c              |   4 +-
 samples/bpf/test_map_in_map_kern.c            |   4 +-
 samples/bpf/test_probe_write_user_kern.c      |   2 +-
 tools/include/uapi/linux/bpf.h                | 120 +++++++----
 .../selftests/bpf/prog_tests/probe_user.c     |  78 ++++++++
 tools/testing/selftests/bpf/progs/kfree_skb.c |   4 +-
 tools/testing/selftests/bpf/progs/pyperf.h    |  67 ++++---
 .../testing/selftests/bpf/progs/strobemeta.h  |  36 ++--
 .../selftests/bpf/progs/test_probe_user.c     |  26 +++
 .../selftests/bpf/progs/test_tcp_estats.c     |   2 +-
 17 files changed, 590 insertions(+), 195 deletions(-)
 create mode 100644 arch/x86/mm/maccess.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_user.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_probe_user.c

-- 
2.21.0



* [PATCH bpf-next v2 1/8] uaccess: Add non-pagefault user-space write function
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 2/8] uaccess: Add strict non-pagefault kernel-space read function Daniel Borkmann
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC
  To: bpf
  Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu

Commit 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions")
did not add a probe write function. Therefore, factor out a
probe_write_common() helper that carries most of the logic of
probe_kernel_write() except for setting KERNEL_DS, and add a new
probe_user_write() helper so that it can be used from the BPF side.

Again, on some archs, the user address space and kernel address space can
co-exist and overlap, so in such a case, setting KERNEL_DS would mean
that the given address is treated as being in kernel address space.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
---
 include/linux/uaccess.h | 12 +++++++++++
 mm/maccess.c            | 45 +++++++++++++++++++++++++++++++++++++----
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index d4ee6e942562..38555435a64a 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -337,6 +337,18 @@ extern long __probe_user_read(void *dst, const void __user *src, size_t size);
 extern long notrace probe_kernel_write(void *dst, const void *src, size_t size);
 extern long notrace __probe_kernel_write(void *dst, const void *src, size_t size);
 
+/*
+ * probe_user_write(): safely attempt to write to a location in user space
+ * @dst: address to write to
+ * @src: pointer to the data that shall be written
+ * @size: size of the data chunk
+ *
+ * Safely write to address @dst from the buffer at @src.  If a kernel fault
+ * happens, handle that and return -EFAULT.
+ */
+extern long notrace probe_user_write(void __user *dst, const void *src, size_t size);
+extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size);
+
 extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);
 extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr,
 				     long count);
diff --git a/mm/maccess.c b/mm/maccess.c
index d065736f6b87..2d3c3d01064c 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -18,6 +18,18 @@ probe_read_common(void *dst, const void __user *src, size_t size)
 	return ret ? -EFAULT : 0;
 }
 
+static __always_inline long
+probe_write_common(void __user *dst, const void *src, size_t size)
+{
+	long ret;
+
+	pagefault_disable();
+	ret = __copy_to_user_inatomic(dst, src, size);
+	pagefault_enable();
+
+	return ret ? -EFAULT : 0;
+}
+
 /**
  * probe_kernel_read(): safely attempt to read from a kernel-space location
  * @dst: pointer to the buffer that shall take the data
@@ -85,6 +97,7 @@ EXPORT_SYMBOL_GPL(probe_user_read);
  * Safely write to address @dst from the buffer at @src.  If a kernel fault
  * happens, handle that and return -EFAULT.
  */
+
 long __weak probe_kernel_write(void *dst, const void *src, size_t size)
     __attribute__((alias("__probe_kernel_write")));
 
@@ -94,15 +107,39 @@ long __probe_kernel_write(void *dst, const void *src, size_t size)
 	mm_segment_t old_fs = get_fs();
 
 	set_fs(KERNEL_DS);
-	pagefault_disable();
-	ret = __copy_to_user_inatomic((__force void __user *)dst, src, size);
-	pagefault_enable();
+	ret = probe_write_common((__force void __user *)dst, src, size);
 	set_fs(old_fs);
 
-	return ret ? -EFAULT : 0;
+	return ret;
 }
 EXPORT_SYMBOL_GPL(probe_kernel_write);
 
+/**
+ * probe_user_write(): safely attempt to write to a user-space location
+ * @dst: address to write to
+ * @src: pointer to the data that shall be written
+ * @size: size of the data chunk
+ *
+ * Safely write to address @dst from the buffer at @src.  If a kernel fault
+ * happens, handle that and return -EFAULT.
+ */
+
+long __weak probe_user_write(void __user *dst, const void *src, size_t size)
+    __attribute__((alias("__probe_user_write")));
+
+long __probe_user_write(void __user *dst, const void *src, size_t size)
+{
+	long ret = -EFAULT;
+	mm_segment_t old_fs = get_fs();
+
+	set_fs(USER_DS);
+	if (access_ok(dst, size))
+		ret = probe_write_common(dst, src, size);
+	set_fs(old_fs);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(probe_user_write);
 
 /**
  * strncpy_from_unsafe: - Copy a NUL terminated string from unsafe address.
-- 
2.21.0



* [PATCH bpf-next v2 2/8] uaccess: Add strict non-pagefault kernel-space read function
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 1/8] uaccess: Add non-pagefault user-space write function Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31 20:12   ` Andrii Nakryiko
  2019-10-31  1:00 ` [PATCH bpf-next v2 3/8] bpf: Make use of probe_user_write in probe write helper Daniel Borkmann
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC
  To: bpf
  Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann, Linus Torvalds,
	Masami Hiramatsu, x86

Add two new helpers, probe_kernel_read_strict() and
strncpy_from_unsafe_strict(), which by default alias to __probe_kernel_read()
and __strncpy_from_unsafe(), respectively, but can be overridden by archs
which have non-overlapping address ranges for kernel space and user
space in order to bail out with -EFAULT when attempting to probe user
memory, including non-canonical user access addresses [0].

The idea is that these helpers are complementary to the probe_user_read()
and strncpy_from_unsafe_user() which probe user-only memory. Both added
helpers here do the same, but for kernel-only addresses.

Both sets of helpers are going to be used for BPF tracing. They also
explicitly avoid throwing the splat for non-canonical user addresses from
00c42373d397 ("x86-64: add warning for non-canonical user access address
dereferences").

For compat, the current probe_kernel_read() and strncpy_from_unsafe() are
left as-is.

  [0] Documentation/x86/x86_64/mm.txt

      4-level page tables: 0x0000800000000000 - 0xffff7fffffffffff
      5-level page tables: 0x0100000000000000 - 0xfeffffffffffffff
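
For illustration only, the address check described above can be sketched in
plain user-space C. This is not the kernel code: strict_kernel_addr_check()
and its task_size_max / vaddr_bits parameters are hypothetical stand-ins for
TASK_SIZE_MAX and boot_cpu_data.x86_virt_bits, and the sketch assumes the
compiler performs an arithmetic right shift on signed values (true for the
usual x86-64 toolchains).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend bit (vaddr_bits - 1) into the upper bits.
 * Assumes 1 <= vaddr_bits <= 64 (e.g. 48 or 57 on x86-64). */
static uint64_t canonical_address(uint64_t vaddr, unsigned int vaddr_bits)
{
	unsigned int shift = 64 - vaddr_bits;

	/* Shift as unsigned to avoid undefined behavior, then rely on
	 * an arithmetic right shift of the signed value. */
	return (uint64_t)((int64_t)(vaddr << shift) >> shift);
}

/* An address is non-canonical if sign extension changes it. */
static bool non_canonical_address(uint64_t vaddr, unsigned int vaddr_bits)
{
	return canonical_address(vaddr, vaddr_bits) != vaddr;
}

/* Hypothetical gate mirroring the strict helpers: refuse user-range
 * and non-canonical addresses before any read is attempted. */
static int strict_kernel_addr_check(uint64_t addr, uint64_t task_size_max,
				    unsigned int vaddr_bits)
{
	if (addr < task_size_max || non_canonical_address(addr, vaddr_bits))
		return -EFAULT;
	return 0;
}
```

With 48 virtual address bits, both ends of the 4-level hole listed in [0]
fail the canonical check, while an address with the upper 17 bits all set
passes it.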

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: x86@kernel.org
---
 arch/x86/mm/Makefile    |  2 +-
 arch/x86/mm/maccess.c   | 38 ++++++++++++++++++++++++++++++++++++++
 include/linux/uaccess.h |  4 ++++
 mm/maccess.c            | 25 ++++++++++++++++++++++++-
 4 files changed, 67 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/mm/maccess.c

diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 84373dc9b341..bbc68a54795e 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -13,7 +13,7 @@ CFLAGS_REMOVE_mem_encrypt_identity.o	= -pg
 endif
 
 obj-y	:=  init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \
-	    pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o
+	    pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o
 
 # Make sure __phys_addr has no stackprotector
 nostackp := $(call cc-option, -fno-stack-protector)
diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
new file mode 100644
index 000000000000..fcc55a7cbde2
--- /dev/null
+++ b/arch/x86/mm/maccess.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/uaccess.h>
+#include <linux/kernel.h>
+
+static __always_inline u64 canonical_address(u64 vaddr, u8 vaddr_bits)
+{
+	return ((s64)vaddr << (64 - vaddr_bits)) >> (64 - vaddr_bits);
+}
+
+static __always_inline bool non_canonical_address(u64 vaddr)
+{
+#ifdef CONFIG_X86_64
+	return canonical_address(vaddr, boot_cpu_data.x86_virt_bits) != vaddr;
+#else
+	return false;
+#endif
+}
+
+long probe_kernel_read_strict(void *dst, const void *src, size_t size)
+{
+	u64 addr = (unsigned long)src;
+
+	if (unlikely(addr < TASK_SIZE_MAX || non_canonical_address(addr)))
+		return -EFAULT;
+
+	return __probe_kernel_read(dst, src, size);
+}
+
+long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count)
+{
+	u64 addr = (unsigned long)unsafe_addr;
+
+	if (unlikely(addr < TASK_SIZE_MAX || non_canonical_address(addr)))
+		return -EFAULT;
+
+	return __strncpy_from_unsafe(dst, unsafe_addr, count);
+}
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 38555435a64a..67f016010aad 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -311,6 +311,7 @@ copy_struct_from_user(void *dst, size_t ksize, const void __user *src,
  * happens, handle that and return -EFAULT.
  */
 extern long probe_kernel_read(void *dst, const void *src, size_t size);
+extern long probe_kernel_read_strict(void *dst, const void *src, size_t size);
 extern long __probe_kernel_read(void *dst, const void *src, size_t size);
 
 /*
@@ -350,6 +351,9 @@ extern long notrace probe_user_write(void __user *dst, const void *src, size_t s
 extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size);
 
 extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);
+extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr,
+				       long count);
+extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);
 extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr,
 				     long count);
 extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
diff --git a/mm/maccess.c b/mm/maccess.c
index 2d3c3d01064c..3ca8d97e5010 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -43,11 +43,20 @@ probe_write_common(void __user *dst, const void *src, size_t size)
  * do_page_fault() doesn't attempt to take mmap_sem.  This makes
  * probe_kernel_read() suitable for use within regions where the caller
  * already holds mmap_sem, or other locks which nest inside mmap_sem.
+ *
+ * probe_kernel_read_strict() is the same as probe_kernel_read() except for
+ * the case where architectures have non-overlapping user and kernel address
+ * ranges: probe_kernel_read_strict() will additionally return -EFAULT for
+ * probing memory on a user address range where probe_user_read() is supposed
+ * to be used instead.
  */
 
 long __weak probe_kernel_read(void *dst, const void *src, size_t size)
     __attribute__((alias("__probe_kernel_read")));
 
+long __weak probe_kernel_read_strict(void *dst, const void *src, size_t size)
+    __attribute__((alias("__probe_kernel_read")));
+
 long __probe_kernel_read(void *dst, const void *src, size_t size)
 {
 	long ret;
@@ -157,8 +166,22 @@ EXPORT_SYMBOL_GPL(probe_user_write);
  *
  * If @count is smaller than the length of the string, copies @count-1 bytes,
  * sets the last byte of @dst buffer to NUL and returns @count.
+ *
+ * strncpy_from_unsafe_strict() is the same as strncpy_from_unsafe() except
+ * for the case where architectures have non-overlapping user and kernel address
+ * ranges: strncpy_from_unsafe_strict() will additionally return -EFAULT for
+ * probing memory on a user address range where strncpy_from_unsafe_user() is
+ * supposed to be used instead.
  */
-long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)
+
+long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)
+    __attribute__((alias("__strncpy_from_unsafe")));
+
+long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr,
+				       long count)
+    __attribute__((alias("__strncpy_from_unsafe")));
+
+long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)
 {
 	mm_segment_t old_fs = get_fs();
 	const void *src = unsafe_addr;
-- 
2.21.0



* [PATCH bpf-next v2 3/8] bpf: Make use of probe_user_write in probe write helper
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 1/8] uaccess: Add non-pagefault user-space write function Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 2/8] uaccess: Add strict non-pagefault kernel-space read function Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 4/8] bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers Daniel Borkmann
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC
  To: bpf; +Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann, Andrii Nakryiko

Convert the bpf_probe_write_user() helper to probe_user_write() such that
writes are no longer attempted under KERNEL_DS, which is buggy since kernel
and user space pointers can have overlapping addresses. Also, given we have
the access_ok() check inside probe_user_write(), the helper doesn't need
to do it twice.

Fixes: 96ae52279594 ("bpf: Add bpf_probe_write_user BPF helper to be called in tracers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
---
 kernel/trace/bpf_trace.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 571c25d60710..91eb17ac4bb6 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -163,7 +163,7 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
-BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
+BPF_CALL_3(bpf_probe_write_user, void __user *, unsafe_ptr, const void *, src,
 	   u32, size)
 {
 	/*
@@ -186,10 +186,8 @@ BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
 		return -EPERM;
 	if (unlikely(!nmi_uaccess_okay()))
 		return -EPERM;
-	if (!access_ok(unsafe_ptr, size))
-		return -EPERM;
 
-	return probe_kernel_write(unsafe_ptr, src, size);
+	return probe_user_write(unsafe_ptr, src, size);
 }
 
 static const struct bpf_func_proto bpf_probe_write_user_proto = {
-- 
2.21.0



* [PATCH bpf-next v2 4/8] bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
                   ` (2 preceding siblings ...)
  2019-10-31  1:00 ` [PATCH bpf-next v2 3/8] bpf: Make use of probe_user_write in probe write helper Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31 20:07   ` Andrii Nakryiko
  2019-10-31  1:00 ` [PATCH bpf-next v2 5/8] bpf: Switch BPF probe insns to bpf_probe_read_kernel Daniel Borkmann
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC
  To: bpf; +Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann

The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
in that they assume they can be used for probing memory access for kernel
space addresses /as well as/ user space addresses.

However, plain use of probe_kernel_read() for both cases will always attempt
to access the kernel address space, given that the access is performed under
KERNEL_DS. Some archs in fact have overlapping address spaces where a kernel
pointer and a user pointer can have the /same/ address value, and therefore
accessing application memory via bpf_probe_read{,_str}() would read garbage
values.
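
As a rough user-space illustration of the overlap problem (a toy model, not
kernel code), consider two separate memories that share one numeric address
range. All names below (toy_space, toy_probe_read(), the two arrays) are
hypothetical; the point is only that the same address value yields different
bytes depending on which address space the access is performed under.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model: which address space an access is performed under. */
enum toy_space { TOY_KERNEL, TOY_USER };

#define TOY_SPACE_SIZE 16

/* Two distinct memories covering the same addresses 0..15. */
static const char toy_kernel_mem[TOY_SPACE_SIZE] = "KERNELKERNELKER";
static const char toy_user_mem[TOY_SPACE_SIZE]   = "useruseruseruse";

/* Read size bytes at addr from the selected address space.
 * Returns 0 on success, -EFAULT if the range is out of bounds. */
static int toy_probe_read(char *dst, uint32_t addr, size_t size,
			  enum toy_space space)
{
	const char *base = (space == TOY_KERNEL) ? toy_kernel_mem
						  : toy_user_mem;

	if (addr > TOY_SPACE_SIZE || size > TOY_SPACE_SIZE - addr)
		return -EFAULT;

	memcpy(dst, base + addr, size);
	return 0;
}
```

Reading address 0 under TOY_KERNEL returns "KERN", whereas the caller holding
a user pointer with value 0 expected "user". In this model, that is the
garbage read a single kernel-only helper produces on such archs.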

Let's fix the BPF side by making use of the recently added 3d7081822f7f
("uaccess: Add non-pagefault user-space read functions"). Unfortunately, the
only way to fix this status quo is to add dedicated
bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str() helpers.
The bpf_probe_read{,_str}() helpers are kept as-is to retain their current
behavior.

The two *_user() variants always attempt the access with USER_DS set, while
the two *_kernel() variants return -EFAULT when accessing user memory if the
underlying architecture has non-overlapping address ranges, which also avoids
throwing the kernel warning from 00c42373d397 ("x86-64: add warning for
non-canonical user access address dereferences").
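
The difference between the compat and the strict kernel read path, along with
the zeroing of the destination on failure, can be modeled in user space
roughly as follows. This is a simplified sketch under stated assumptions: the
toy memory, TOY_USER_LIMIT and toy_read_kernel_common() are hypothetical, and
the lockdown check of the real helper is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEM_SIZE   32u
#define TOY_USER_LIMIT 16u	/* toy addresses below this count as "user" */

static unsigned char toy_mem[TOY_MEM_SIZE];

/* Fill the toy memory with a recognizable pattern: byte i holds i + 1. */
static void toy_mem_fill(void)
{
	for (unsigned int i = 0; i < TOY_MEM_SIZE; i++)
		toy_mem[i] = (unsigned char)(i + 1);
}

/* Model of a kernel read with a compat switch: the compat path reads any
 * mapped toy address, the strict path refuses the "user" range. On any
 * failure the destination is zeroed, so callers that ignore the return
 * code do not consume stale bytes. */
static int toy_read_kernel_common(void *dst, size_t size, uint32_t addr,
				  bool compat)
{
	int ret = 0;

	if (addr > TOY_MEM_SIZE || size > TOY_MEM_SIZE - addr)
		ret = -EFAULT;		/* outside the toy memory */
	else if (!compat && addr < TOY_USER_LIMIT)
		ret = -EFAULT;		/* strict: user range refused */

	if (ret < 0) {
		memset(dst, 0, size);
		return ret;
	}

	memcpy(dst, toy_mem + addr, size);
	return 0;
}
```

A strict read at toy address 4 fails with -EFAULT and leaves a zeroed buffer,
whereas the same read with compat set succeeds, matching the intent of
keeping bpf_probe_read{,_str}() behavior unchanged.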

Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper")
Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 include/uapi/linux/bpf.h       | 120 +++++++++++++++-------
 kernel/trace/bpf_trace.c       | 181 ++++++++++++++++++++++++---------
 tools/include/uapi/linux/bpf.h | 120 +++++++++++++++-------
 3 files changed, 297 insertions(+), 124 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4af8b0819a32..41855b53e8ff 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -561,10 +561,13 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
- * int bpf_probe_read(void *dst, u32 size, const void *src)
+ * int bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
  * 	Description
  * 		For tracing programs, safely attempt to read *size* bytes from
- * 		address *src* and store the data in *dst*.
+ * 		kernel space address *unsafe_ptr* and store the data in *dst*.
+ *
+ * 		Generally, use bpf_probe_read_user() or bpf_probe_read_kernel()
+ * 		instead.
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
@@ -1428,43 +1431,12 @@ union bpf_attr {
  *
  * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
  * 	Description
- * 		Copy a NUL terminated string from an unsafe address
- * 		*unsafe_ptr* to *dst*. The *size* should include the
- * 		terminating NUL byte. In case the string length is smaller than
- * 		*size*, the target is not padded with further NUL bytes. If the
- * 		string length is larger than *size*, just *size*-1 bytes are
- * 		copied and the last byte is set to NUL.
- *
- * 		On success, the length of the copied string is returned. This
- * 		makes this helper useful in tracing programs for reading
- * 		strings, and more importantly to get its length at runtime. See
- * 		the following snippet:
- *
- * 		::
- *
- * 			SEC("kprobe/sys_open")
- * 			void bpf_sys_open(struct pt_regs *ctx)
- * 			{
- * 			        char buf[PATHLEN]; // PATHLEN is defined to 256
- * 			        int res = bpf_probe_read_str(buf, sizeof(buf),
- * 				                             ctx->di);
- *
- * 				// Consume buf, for example push it to
- * 				// userspace via bpf_perf_event_output(); we
- * 				// can use res (the string length) as event
- * 				// size, after checking its boundaries.
- * 			}
- *
- * 		In comparison, using **bpf_probe_read()** helper here instead
- * 		to read the string would require to estimate the length at
- * 		compile time, and would often result in copying more memory
- * 		than necessary.
+ * 		Copy a NUL terminated string from an unsafe kernel address
+ * 		*unsafe_ptr* to *dst*. See bpf_probe_read_kernel_str() for
+ * 		more details.
  *
- * 		Another useful use case is when parsing individual process
- * 		arguments or individual environment variables navigating
- * 		*current*\ **->mm->arg_start** and *current*\
- * 		**->mm->env_start**: using this helper and the return value,
- * 		one can quickly iterate at the right offset of the memory area.
+ * 		Generally, use bpf_probe_read_user_str() or bpf_probe_read_kernel_str()
+ * 		instead.
  * 	Return
  * 		On success, the strictly positive length of the string,
  * 		including the trailing NUL character. On error, a negative
@@ -2775,6 +2747,72 @@ union bpf_attr {
  * 		restricted to raw_tracepoint bpf programs.
  * 	Return
  * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr)
+ * 	Description
+ * 		Safely attempt to read *size* bytes from user space address
+ * 		*unsafe_ptr* and store the data in *dst*.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
+ * 	Description
+ * 		Safely attempt to read *size* bytes from kernel space address
+ * 		*unsafe_ptr* and store the data in *dst*.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_user_str(void *dst, int size, const void *unsafe_ptr)
+ * 	Description
+ * 		Copy a NUL terminated string from an unsafe user address
+ * 		*unsafe_ptr* to *dst*. The *size* should include the
+ * 		terminating NUL byte. In case the string length is smaller than
+ * 		*size*, the target is not padded with further NUL bytes. If the
+ * 		string length is larger than *size*, just *size*-1 bytes are
+ * 		copied and the last byte is set to NUL.
+ *
+ * 		On success, the length of the copied string is returned. This
+ * 		makes this helper useful in tracing programs for reading
+ * 		strings, and more importantly to get its length at runtime. See
+ * 		the following snippet:
+ *
+ * 		::
+ *
+ * 			SEC("kprobe/sys_open")
+ * 			void bpf_sys_open(struct pt_regs *ctx)
+ * 			{
+ * 			        char buf[PATHLEN]; // PATHLEN is defined to 256
+ * 			        int res = bpf_probe_read_user_str(buf, sizeof(buf),
+ * 				                                  ctx->di);
+ *
+ * 				// Consume buf, for example push it to
+ * 				// userspace via bpf_perf_event_output(); we
+ * 				// can use res (the string length) as event
+ * 				// size, after checking its boundaries.
+ * 			}
+ *
+ * 		In comparison, using **bpf_probe_read_user()** helper here
+ * 		instead to read the string would require to estimate the length
+ * 		at compile time, and would often result in copying more memory
+ * 		than necessary.
+ *
+ * 		Another useful use case is when parsing individual process
+ * 		arguments or individual environment variables navigating
+ * 		*current*\ **->mm->arg_start** and *current*\
+ * 		**->mm->env_start**: using this helper and the return value,
+ * 		one can quickly iterate at the right offset of the memory area.
+ * 	Return
+ * 		On success, the strictly positive length of the string,
+ * 		including the trailing NUL character. On error, a negative
+ * 		value.
+ *
+ * int bpf_probe_read_kernel_str(void *dst, int size, const void *unsafe_ptr)
+ * 	Description
+ * 		Copy a NUL terminated string from an unsafe kernel address *unsafe_ptr*
+ * 		to *dst*. Same semantics as with bpf_probe_read_user_str() apply.
+ * 	Return
+ * 		On success, the strictly positive length of the string,	including
+ * 		the trailing NUL character. On error, a negative value.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2888,7 +2926,11 @@ union bpf_attr {
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
 	FN(tcp_gen_syncookie),		\
-	FN(skb_output),
+	FN(skb_output),			\
+	FN(probe_read_user),		\
+	FN(probe_read_kernel),		\
+	FN(probe_read_user_str),	\
+	FN(probe_read_kernel_str),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 91eb17ac4bb6..9dbc5e610c4c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -138,24 +138,140 @@ static const struct bpf_func_proto bpf_override_return_proto = {
 };
 #endif
 
-BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
+BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size,
+	   const void __user *, unsafe_ptr)
 {
-	int ret;
+	int ret = probe_user_read(dst, unsafe_ptr, size);
 
-	ret = security_locked_down(LOCKDOWN_BPF_READ);
-	if (ret < 0)
-		goto out;
+	if (unlikely(ret < 0))
+		memset(dst, 0, size);
+
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_probe_read_user_proto = {
+	.func		= bpf_probe_read_user,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_ANYTHING,
+};
+
+BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size,
+	   const void __user *, unsafe_ptr)
+{
+	int ret = strncpy_from_unsafe_user(dst, unsafe_ptr, size);
+
+	if (unlikely(ret < 0))
+		memset(dst, 0, size);
+
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_probe_read_user_str_proto = {
+	.func		= bpf_probe_read_user_str,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_ANYTHING,
+};
 
-	ret = probe_kernel_read(dst, unsafe_ptr, size);
+static __always_inline int
+bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr,
+			     const bool compat)
+{
+	int ret = security_locked_down(LOCKDOWN_BPF_READ);
+
+	if (unlikely(ret < 0))
+		goto out;
+	ret = compat ? probe_kernel_read(dst, unsafe_ptr, size) :
+	      probe_kernel_read_strict(dst, unsafe_ptr, size);
 	if (unlikely(ret < 0))
 out:
 		memset(dst, 0, size);
+	return ret;
+}
+
+BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size,
+	   const void *, unsafe_ptr)
+{
+	return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, false);
+}
+
+static const struct bpf_func_proto bpf_probe_read_kernel_proto = {
+	.func		= bpf_probe_read_kernel,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_ANYTHING,
+};
+
+BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size,
+	   const void *, unsafe_ptr)
+{
+	return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, true);
+}
 
+static const struct bpf_func_proto bpf_probe_read_compat_proto = {
+	.func		= bpf_probe_read_compat,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_ANYTHING,
+};
+
+static __always_inline int
+bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr,
+				 const bool compat)
+{
+	int ret = security_locked_down(LOCKDOWN_BPF_READ);
+
+	if (unlikely(ret < 0))
+		goto out;
+	/*
+	 * The strncpy_from_unsafe_*() call will likely not fill the entire
+	 * buffer, but that's okay in this circumstance as we're probing
+	 * arbitrary memory anyway similar to bpf_probe_read_*() and might
+	 * as well probe the stack. Thus, memory is explicitly cleared
+	 * only in error case, so that improper users ignoring return
+	 * code altogether don't copy garbage; otherwise length of string
+	 * is returned that can be used for bpf_perf_event_output() et al.
+	 */
+	ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) :
+	      strncpy_from_unsafe_strict(dst, unsafe_ptr, size);
+	if (unlikely(ret < 0))
+out:
+		memset(dst, 0, size);
 	return ret;
 }
 
-static const struct bpf_func_proto bpf_probe_read_proto = {
-	.func		= bpf_probe_read,
+BPF_CALL_3(bpf_probe_read_kernel_str, void *, dst, u32, size,
+	   const void *, unsafe_ptr)
+{
+	return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, false);
+}
+
+static const struct bpf_func_proto bpf_probe_read_kernel_str_proto = {
+	.func		= bpf_probe_read_kernel_str,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_ANYTHING,
+};
+
+BPF_CALL_3(bpf_probe_read_compat_str, void *, dst, u32, size,
+	   const void *, unsafe_ptr)
+{
+	return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, true);
+}
+
+static const struct bpf_func_proto bpf_probe_read_compat_str_proto = {
+	.func		= bpf_probe_read_compat_str,
 	.gpl_only	= true,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
@@ -583,41 +699,6 @@ static const struct bpf_func_proto bpf_current_task_under_cgroup_proto = {
 	.arg2_type      = ARG_ANYTHING,
 };
 
-BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
-	   const void *, unsafe_ptr)
-{
-	int ret;
-
-	ret = security_locked_down(LOCKDOWN_BPF_READ);
-	if (ret < 0)
-		goto out;
-
-	/*
-	 * The strncpy_from_unsafe() call will likely not fill the entire
-	 * buffer, but that's okay in this circumstance as we're probing
-	 * arbitrary memory anyway similar to bpf_probe_read() and might
-	 * as well probe the stack. Thus, memory is explicitly cleared
-	 * only in error case, so that improper users ignoring return
-	 * code altogether don't copy garbage; otherwise length of string
-	 * is returned that can be used for bpf_perf_event_output() et al.
-	 */
-	ret = strncpy_from_unsafe(dst, unsafe_ptr, size);
-	if (unlikely(ret < 0))
-out:
-		memset(dst, 0, size);
-
-	return ret;
-}
-
-static const struct bpf_func_proto bpf_probe_read_str_proto = {
-	.func		= bpf_probe_read_str,
-	.gpl_only	= true,
-	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
-	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
-	.arg3_type	= ARG_ANYTHING,
-};
-
 struct send_signal_irq_work {
 	struct irq_work irq_work;
 	struct task_struct *task;
@@ -697,8 +778,6 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_map_pop_elem_proto;
 	case BPF_FUNC_map_peek_elem:
 		return &bpf_map_peek_elem_proto;
-	case BPF_FUNC_probe_read:
-		return &bpf_probe_read_proto;
 	case BPF_FUNC_ktime_get_ns:
 		return &bpf_ktime_get_ns_proto;
 	case BPF_FUNC_tail_call:
@@ -725,8 +804,18 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_current_task_under_cgroup_proto;
 	case BPF_FUNC_get_prandom_u32:
 		return &bpf_get_prandom_u32_proto;
+	case BPF_FUNC_probe_read_user:
+		return &bpf_probe_read_user_proto;
+	case BPF_FUNC_probe_read_kernel:
+		return &bpf_probe_read_kernel_proto;
+	case BPF_FUNC_probe_read:
+		return &bpf_probe_read_compat_proto;
+	case BPF_FUNC_probe_read_user_str:
+		return &bpf_probe_read_user_str_proto;
+	case BPF_FUNC_probe_read_kernel_str:
+		return &bpf_probe_read_kernel_str_proto;
 	case BPF_FUNC_probe_read_str:
-		return &bpf_probe_read_str_proto;
+		return &bpf_probe_read_compat_str_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_get_current_cgroup_id:
 		return &bpf_get_current_cgroup_id_proto;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4af8b0819a32..41855b53e8ff 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -561,10 +561,13 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
- * int bpf_probe_read(void *dst, u32 size, const void *src)
+ * int bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
  * 	Description
  * 		For tracing programs, safely attempt to read *size* bytes from
- * 		address *src* and store the data in *dst*.
+ * 		kernel space address *unsafe_ptr* and store the data in *dst*.
+ *
+ * 		Generally, use bpf_probe_read_user() or bpf_probe_read_kernel()
+ * 		instead.
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
@@ -1428,43 +1431,12 @@ union bpf_attr {
  *
  * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
  * 	Description
- * 		Copy a NUL terminated string from an unsafe address
- * 		*unsafe_ptr* to *dst*. The *size* should include the
- * 		terminating NUL byte. In case the string length is smaller than
- * 		*size*, the target is not padded with further NUL bytes. If the
- * 		string length is larger than *size*, just *size*-1 bytes are
- * 		copied and the last byte is set to NUL.
- *
- * 		On success, the length of the copied string is returned. This
- * 		makes this helper useful in tracing programs for reading
- * 		strings, and more importantly to get its length at runtime. See
- * 		the following snippet:
- *
- * 		::
- *
- * 			SEC("kprobe/sys_open")
- * 			void bpf_sys_open(struct pt_regs *ctx)
- * 			{
- * 			        char buf[PATHLEN]; // PATHLEN is defined to 256
- * 			        int res = bpf_probe_read_str(buf, sizeof(buf),
- * 				                             ctx->di);
- *
- * 				// Consume buf, for example push it to
- * 				// userspace via bpf_perf_event_output(); we
- * 				// can use res (the string length) as event
- * 				// size, after checking its boundaries.
- * 			}
- *
- * 		In comparison, using **bpf_probe_read()** helper here instead
- * 		to read the string would require to estimate the length at
- * 		compile time, and would often result in copying more memory
- * 		than necessary.
+ * 		Copy a NUL terminated string from an unsafe kernel address
+ * 		*unsafe_ptr* to *dst*. See bpf_probe_read_kernel_str() for
+ * 		more details.
  *
- * 		Another useful use case is when parsing individual process
- * 		arguments or individual environment variables navigating
- * 		*current*\ **->mm->arg_start** and *current*\
- * 		**->mm->env_start**: using this helper and the return value,
- * 		one can quickly iterate at the right offset of the memory area.
+ * 		Generally, use bpf_probe_read_user_str() or bpf_probe_read_kernel_str()
+ * 		instead.
  * 	Return
  * 		On success, the strictly positive length of the string,
  * 		including the trailing NUL character. On error, a negative
@@ -2775,6 +2747,72 @@ union bpf_attr {
  * 		restricted to raw_tracepoint bpf programs.
  * 	Return
  * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr)
+ * 	Description
+ * 		Safely attempt to read *size* bytes from user space address
+ * 		*unsafe_ptr* and store the data in *dst*.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
+ * 	Description
+ * 		Safely attempt to read *size* bytes from kernel space address
+ * 		*unsafe_ptr* and store the data in *dst*.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_user_str(void *dst, int size, const void *unsafe_ptr)
+ * 	Description
+ * 		Copy a NUL terminated string from an unsafe user address
+ * 		*unsafe_ptr* to *dst*. The *size* should include the
+ * 		terminating NUL byte. In case the string length is smaller than
+ * 		*size*, the target is not padded with further NUL bytes. If the
+ * 		string length is larger than *size*, just *size*-1 bytes are
+ * 		copied and the last byte is set to NUL.
+ *
+ * 		On success, the length of the copied string is returned. This
+ * 		makes this helper useful in tracing programs for reading
+ * 		strings, and more importantly to get its length at runtime. See
+ * 		the following snippet:
+ *
+ * 		::
+ *
+ * 			SEC("kprobe/sys_open")
+ * 			void bpf_sys_open(struct pt_regs *ctx)
+ * 			{
+ * 			        char buf[PATHLEN]; // PATHLEN is defined to 256
+ * 			        int res = bpf_probe_read_user_str(buf, sizeof(buf),
+ * 				                                  ctx->di);
+ *
+ * 				// Consume buf, for example push it to
+ * 				// userspace via bpf_perf_event_output(); we
+ * 				// can use res (the string length) as event
+ * 				// size, after checking its boundaries.
+ * 			}
+ *
+ * 		In comparison, using **bpf_probe_read_user()** helper here
+ * 		instead to read the string would require to estimate the length
+ * 		at compile time, and would often result in copying more memory
+ * 		than necessary.
+ *
+ * 		Another useful use case is when parsing individual process
+ * 		arguments or individual environment variables navigating
+ * 		*current*\ **->mm->arg_start** and *current*\
+ * 		**->mm->env_start**: using this helper and the return value,
+ * 		one can quickly iterate at the right offset of the memory area.
+ * 	Return
+ * 		On success, the strictly positive length of the string,
+ * 		including the trailing NUL character. On error, a negative
+ * 		value.
+ *
+ * int bpf_probe_read_kernel_str(void *dst, int size, const void *unsafe_ptr)
+ * 	Description
+ * 		Copy a NUL terminated string from an unsafe kernel address *unsafe_ptr*
+ * 		to *dst*. Same semantics as with bpf_probe_read_user_str() apply.
+ * 	Return
+ * 		On success, the strictly positive length of the string,	including
+ * 		the trailing NUL character. On error, a negative value.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2888,7 +2926,11 @@ union bpf_attr {
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
 	FN(tcp_gen_syncookie),		\
-	FN(skb_output),
+	FN(skb_output),			\
+	FN(probe_read_user),		\
+	FN(probe_read_kernel),		\
+	FN(probe_read_user_str),	\
+	FN(probe_read_kernel_str),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.21.0
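
For readers who want to check the return-value contract documented above for the *_str() helpers, a minimal user-space sketch may help. It is a model only, not the kernel implementation: the name model_probe_read_str is made up for illustration, and a NULL source stands in for a faulting address. It mirrors the documented behaviour: the return value counts the trailing NUL, the destination is not padded, an over-long source is truncated to size-1 bytes plus NUL, and the buffer is cleared on error (as the removed bpf_probe_read_str() did).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space model of the *_str() helper contract.
 * NOT kernel code: a NULL src simulates a fault on the unsafe pointer.
 */
static long model_probe_read_str(char *dst, size_t size, const char *src)
{
	size_t i;

	assert(dst != NULL);

	/* Nothing to copy; the model returns 0 without touching dst. */
	if (size == 0)
		return 0;

	/* Error path: clear the whole buffer so callers never see garbage. */
	if (src == NULL) {
		memset(dst, 0, size);
		return -EFAULT;
	}

	/* Copy at most size-1 bytes, then terminate; no padding beyond that. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* Length including the trailing NUL, hence strictly positive. */
	return (long)(i + 1);
}
```

A caller can use the positive return value directly as the event size, after bounding it, which is the use case the helper description mentions for bpf_perf_event_output().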



* [PATCH bpf-next v2 5/8] bpf: Switch BPF probe insns to bpf_probe_read_kernel
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
                   ` (3 preceding siblings ...)
  2019-10-31  1:00 ` [PATCH bpf-next v2 4/8] bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31 20:08   ` Andrii Nakryiko
  2019-10-31  1:00 ` [PATCH bpf-next v2 6/8] bpf, samples: Use bpf_probe_read_user where appropriate Daniel Borkmann
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC (permalink / raw)
  To: bpf; +Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann

Commit 2a02759ef5f8 ("bpf: Add support for BTF pointers to interpreter")
explicitly states that the pointer to a BTF object is a pointer to a kernel
object or NULL. Therefore we should also switch to the strict kernel probe
helper, which restricts reads to kernel addresses on architectures with
non-overlapping address spaces.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 kernel/bpf/core.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 673f5d40a93e..76452326fd8e 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1309,11 +1309,12 @@ bool bpf_opcode_in_insntable(u8 code)
 }
 
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
-u64 __weak bpf_probe_read(void * dst, u32 size, const void * unsafe_ptr)
+u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
 {
 	memset(dst, 0, size);
 	return -EFAULT;
 }
+
 /**
  *	__bpf_prog_run - run eBPF program on a given context
  *	@regs: is the array of MAX_BPF_EXT_REG eBPF pseudo-registers
@@ -1569,9 +1570,9 @@ static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u6
 	LDST(W,  u32)
 	LDST(DW, u64)
 #undef LDST
-#define LDX_PROBE(SIZEOP, SIZE)						\
-	LDX_PROBE_MEM_##SIZEOP:						\
-		bpf_probe_read(&DST, SIZE, (const void *)(long) SRC);	\
+#define LDX_PROBE(SIZEOP, SIZE)							\
+	LDX_PROBE_MEM_##SIZEOP:							\
+		bpf_probe_read_kernel(&DST, SIZE, (const void *)(long) SRC);	\
 		CONT;
 	LDX_PROBE(B,  1)
 	LDX_PROBE(H,  2)
-- 
2.21.0
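
The weak fallback in this patch follows a simple pattern: a default definition that zeroes the destination and fails, which a strong definition elsewhere can replace at link time. The sketch below models that pattern in user space under stated assumptions: model_probe_read_kernel and model_ldx_probe_dw are illustrative names, not kernel symbols, and the weak attribute is the GCC/Clang spelling of what the kernel writes as __weak.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the weak default: if no strong definition is
 * linked in, every probe read fails and leaves a zeroed destination.
 */
__attribute__((weak))
long model_probe_read_kernel(void *dst, size_t size, const void *unsafe_ptr)
{
	(void)unsafe_ptr;	/* the fallback never dereferences it */
	assert(dst != NULL);
	memset(dst, 0, size);
	return -EFAULT;
}

/*
 * Model of a 64-bit probing load: the destination register receives
 * whatever the helper stored, which is 0 when only the fallback exists.
 */
static unsigned long long model_ldx_probe_dw(const void *src)
{
	unsigned long long dst = ~0ULL;

	model_probe_read_kernel(&dst, sizeof(dst), src);
	return dst;
}
```

With only the weak definition present, a load through model_ldx_probe_dw() yields 0 rather than stale data, which matches the intent of the memset() in the stub.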



* [PATCH bpf-next v2 6/8] bpf, samples: Use bpf_probe_read_user where appropriate
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
                   ` (4 preceding siblings ...)
  2019-10-31  1:00 ` [PATCH bpf-next v2 5/8] bpf: Switch BPF probe insns to bpf_probe_read_kernel Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 7/8] bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper Daniel Borkmann
  2019-10-31  1:00 ` [PATCH bpf-next v2 8/8] bpf, testing: Add selftest to read/write sockaddr from user space Daniel Borkmann
  7 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC (permalink / raw)
  To: bpf; +Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann, Andrii Nakryiko

Use bpf_probe_read_user() helper instead of bpf_probe_read() for samples that
attach to kprobes probing on user addresses.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
---
 samples/bpf/map_perf_test_kern.c         | 4 ++--
 samples/bpf/test_map_in_map_kern.c       | 4 ++--
 samples/bpf/test_probe_write_user_kern.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index 5c11aefbc489..281bcdaee58e 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -181,8 +181,8 @@ int stress_lru_hmap_alloc(struct pt_regs *ctx)
 	if (addrlen != sizeof(*in6))
 		return 0;
 
-	ret = bpf_probe_read(test_params.dst6, sizeof(test_params.dst6),
-			     &in6->sin6_addr);
+	ret = bpf_probe_read_user(test_params.dst6, sizeof(test_params.dst6),
+				  &in6->sin6_addr);
 	if (ret)
 		goto done;
 
diff --git a/samples/bpf/test_map_in_map_kern.c b/samples/bpf/test_map_in_map_kern.c
index 4f80cbe74c72..32ee752f19df 100644
--- a/samples/bpf/test_map_in_map_kern.c
+++ b/samples/bpf/test_map_in_map_kern.c
@@ -118,7 +118,7 @@ int trace_sys_connect(struct pt_regs *ctx)
 	if (addrlen != sizeof(*in6))
 		return 0;
 
-	ret = bpf_probe_read(dst6, sizeof(dst6), &in6->sin6_addr);
+	ret = bpf_probe_read_user(dst6, sizeof(dst6), &in6->sin6_addr);
 	if (ret) {
 		inline_ret = ret;
 		goto done;
@@ -129,7 +129,7 @@ int trace_sys_connect(struct pt_regs *ctx)
 
 	test_case = dst6[7];
 
-	ret = bpf_probe_read(&port, sizeof(port), &in6->sin6_port);
+	ret = bpf_probe_read_user(&port, sizeof(port), &in6->sin6_port);
 	if (ret) {
 		inline_ret = ret;
 		goto done;
diff --git a/samples/bpf/test_probe_write_user_kern.c b/samples/bpf/test_probe_write_user_kern.c
index a543358218e6..b7c48f37132c 100644
--- a/samples/bpf/test_probe_write_user_kern.c
+++ b/samples/bpf/test_probe_write_user_kern.c
@@ -37,7 +37,7 @@ int bpf_prog1(struct pt_regs *ctx)
 	if (sockaddr_len > sizeof(orig_addr))
 		return 0;
 
-	if (bpf_probe_read(&orig_addr, sizeof(orig_addr), sockaddr_arg) != 0)
+	if (bpf_probe_read_user(&orig_addr, sizeof(orig_addr), sockaddr_arg) != 0)
 		return 0;
 
 	mapped_addr = bpf_map_lookup_elem(&dnat_map, &orig_addr);
-- 
2.21.0



* [PATCH bpf-next v2 7/8] bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
                   ` (5 preceding siblings ...)
  2019-10-31  1:00 ` [PATCH bpf-next v2 6/8] bpf, samples: Use bpf_probe_read_user where appropriate Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  2019-10-31 20:10   ` Andrii Nakryiko
  2019-10-31  1:00 ` [PATCH bpf-next v2 8/8] bpf, testing: Add selftest to read/write sockaddr from user space Daniel Borkmann
  7 siblings, 1 reply; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC (permalink / raw)
  To: bpf; +Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann

Use the bpf_probe_read_{kernel,user}{,_str}() helpers instead of
bpf_probe_read() or bpf_probe_read_str() in program tests where appropriate.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 tools/testing/selftests/bpf/progs/kfree_skb.c |  4 +-
 tools/testing/selftests/bpf/progs/pyperf.h    | 67 ++++++++++---------
 .../testing/selftests/bpf/progs/strobemeta.h  | 36 +++++-----
 .../selftests/bpf/progs/test_tcp_estats.c     |  2 +-
 4 files changed, 57 insertions(+), 52 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/kfree_skb.c b/tools/testing/selftests/bpf/progs/kfree_skb.c
index 89af8a921ee4..489319ea1d6a 100644
--- a/tools/testing/selftests/bpf/progs/kfree_skb.c
+++ b/tools/testing/selftests/bpf/progs/kfree_skb.c
@@ -79,11 +79,11 @@ int trace_kfree_skb(struct trace_kfree_skb *ctx)
 		func = ptr->func;
 	}));
 
-	bpf_probe_read(&pkt_type, sizeof(pkt_type), _(&skb->__pkt_type_offset));
+	bpf_probe_read_kernel(&pkt_type, sizeof(pkt_type), _(&skb->__pkt_type_offset));
 	pkt_type &= 7;
 
 	/* read eth proto */
-	bpf_probe_read(&pkt_data, sizeof(pkt_data), data + 12);
+	bpf_probe_read_kernel(&pkt_data, sizeof(pkt_data), data + 12);
 
 	bpf_printk("rcuhead.next %llx func %llx\n", ptr, func);
 	bpf_printk("skb->len %d users %d pkt_type %x\n",
diff --git a/tools/testing/selftests/bpf/progs/pyperf.h b/tools/testing/selftests/bpf/progs/pyperf.h
index 003fe106fc70..71d383cc9b85 100644
--- a/tools/testing/selftests/bpf/progs/pyperf.h
+++ b/tools/testing/selftests/bpf/progs/pyperf.h
@@ -72,9 +72,9 @@ static __always_inline void *get_thread_state(void *tls_base, PidData *pidData)
 	void* thread_state;
 	int key;
 
-	bpf_probe_read(&key, sizeof(key), (void*)(long)pidData->tls_key_addr);
-	bpf_probe_read(&thread_state, sizeof(thread_state),
-		       tls_base + 0x310 + key * 0x10 + 0x08);
+	bpf_probe_read_user(&key, sizeof(key), (void*)(long)pidData->tls_key_addr);
+	bpf_probe_read_user(&thread_state, sizeof(thread_state),
+			    tls_base + 0x310 + key * 0x10 + 0x08);
 	return thread_state;
 }
 
@@ -82,31 +82,33 @@ static __always_inline bool get_frame_data(void *frame_ptr, PidData *pidData,
 					   FrameData *frame, Symbol *symbol)
 {
 	// read data from PyFrameObject
-	bpf_probe_read(&frame->f_back,
-		       sizeof(frame->f_back),
-		       frame_ptr + pidData->offsets.PyFrameObject_back);
-	bpf_probe_read(&frame->f_code,
-		       sizeof(frame->f_code),
-		       frame_ptr + pidData->offsets.PyFrameObject_code);
+	bpf_probe_read_user(&frame->f_back,
+			    sizeof(frame->f_back),
+			    frame_ptr + pidData->offsets.PyFrameObject_back);
+	bpf_probe_read_user(&frame->f_code,
+			    sizeof(frame->f_code),
+			    frame_ptr + pidData->offsets.PyFrameObject_code);
 
 	// read data from PyCodeObject
 	if (!frame->f_code)
 		return false;
-	bpf_probe_read(&frame->co_filename,
-		       sizeof(frame->co_filename),
-		       frame->f_code + pidData->offsets.PyCodeObject_filename);
-	bpf_probe_read(&frame->co_name,
-		       sizeof(frame->co_name),
-		       frame->f_code + pidData->offsets.PyCodeObject_name);
+	bpf_probe_read_user(&frame->co_filename,
+			    sizeof(frame->co_filename),
+			    frame->f_code + pidData->offsets.PyCodeObject_filename);
+	bpf_probe_read_user(&frame->co_name,
+			    sizeof(frame->co_name),
+			    frame->f_code + pidData->offsets.PyCodeObject_name);
 	// read actual names into symbol
 	if (frame->co_filename)
-		bpf_probe_read_str(&symbol->file,
-				   sizeof(symbol->file),
-				   frame->co_filename + pidData->offsets.String_data);
+		bpf_probe_read_user_str(&symbol->file,
+					sizeof(symbol->file),
+					frame->co_filename +
+					pidData->offsets.String_data);
 	if (frame->co_name)
-		bpf_probe_read_str(&symbol->name,
-				   sizeof(symbol->name),
-				   frame->co_name + pidData->offsets.String_data);
+		bpf_probe_read_user_str(&symbol->name,
+					sizeof(symbol->name),
+					frame->co_name +
+					pidData->offsets.String_data);
 	return true;
 }
 
@@ -174,9 +176,9 @@ static __always_inline int __on_event(struct pt_regs *ctx)
 	event->kernel_stack_id = bpf_get_stackid(ctx, &stackmap, 0);
 
 	void* thread_state_current = (void*)0;
-	bpf_probe_read(&thread_state_current,
-		       sizeof(thread_state_current),
-		       (void*)(long)pidData->current_state_addr);
+	bpf_probe_read_user(&thread_state_current,
+			    sizeof(thread_state_current),
+			    (void*)(long)pidData->current_state_addr);
 
 	struct task_struct* task = (struct task_struct*)bpf_get_current_task();
 	void* tls_base = (void*)task;
@@ -188,11 +190,13 @@ static __always_inline int __on_event(struct pt_regs *ctx)
 	if (pidData->use_tls) {
 		uint64_t pthread_created;
 		uint64_t pthread_self;
-		bpf_probe_read(&pthread_self, sizeof(pthread_self), tls_base + 0x10);
+		bpf_probe_read_user(&pthread_self, sizeof(pthread_self),
+				    tls_base + 0x10);
 
-		bpf_probe_read(&pthread_created,
-			       sizeof(pthread_created),
-			       thread_state + pidData->offsets.PyThreadState_thread);
+		bpf_probe_read_user(&pthread_created,
+				    sizeof(pthread_created),
+				    thread_state +
+				    pidData->offsets.PyThreadState_thread);
 		event->pthread_match = pthread_created == pthread_self;
 	} else {
 		event->pthread_match = 1;
@@ -204,9 +208,10 @@ static __always_inline int __on_event(struct pt_regs *ctx)
 		Symbol sym = {};
 		int cur_cpu = bpf_get_smp_processor_id();
 
-		bpf_probe_read(&frame_ptr,
-			       sizeof(frame_ptr),
-			       thread_state + pidData->offsets.PyThreadState_frame);
+		bpf_probe_read_user(&frame_ptr,
+				    sizeof(frame_ptr),
+				    thread_state +
+				    pidData->offsets.PyThreadState_frame);
 
 		int32_t* symbol_counter = bpf_map_lookup_elem(&symbolmap, &sym);
 		if (symbol_counter == NULL)
diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
index 067eb625d01c..4bf16e0a1b0e 100644
--- a/tools/testing/selftests/bpf/progs/strobemeta.h
+++ b/tools/testing/selftests/bpf/progs/strobemeta.h
@@ -98,7 +98,7 @@ struct strobe_map_raw {
 	/*
 	 * having volatile doesn't change anything on BPF side, but clang
 	 * emits warnings for passing `volatile const char *` into
-	 * bpf_probe_read_str that expects just `const char *`
+	 * bpf_probe_read_user_str that expects just `const char *`
 	 */
 	const char* tag;
 	/*
@@ -309,18 +309,18 @@ static __always_inline void *calc_location(struct strobe_value_loc *loc,
 	dtv_t *dtv;
 	void *tls_ptr;
 
-	bpf_probe_read(&tls_index, sizeof(struct tls_index),
-		       (void *)loc->offset);
+	bpf_probe_read_user(&tls_index, sizeof(struct tls_index),
+			    (void *)loc->offset);
 	/* valid module index is always positive */
 	if (tls_index.module > 0) {
 		/* dtv = ((struct tcbhead *)tls_base)->dtv[tls_index.module] */
-		bpf_probe_read(&dtv, sizeof(dtv),
-			       &((struct tcbhead *)tls_base)->dtv);
+		bpf_probe_read_user(&dtv, sizeof(dtv),
+				    &((struct tcbhead *)tls_base)->dtv);
 		dtv += tls_index.module;
 	} else {
 		dtv = NULL;
 	}
-	bpf_probe_read(&tls_ptr, sizeof(void *), dtv);
+	bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
 	/* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
 	return tls_ptr && tls_ptr != (void *)-1
 		? tls_ptr + tls_index.offset
@@ -336,7 +336,7 @@ static __always_inline void read_int_var(struct strobemeta_cfg *cfg,
 	if (!location)
 		return;
 
-	bpf_probe_read(value, sizeof(struct strobe_value_generic), location);
+	bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
 	data->int_vals[idx] = value->val;
 	if (value->header.len)
 		data->int_vals_set_mask |= (1 << idx);
@@ -356,13 +356,13 @@ static __always_inline uint64_t read_str_var(struct strobemeta_cfg *cfg,
 	if (!location)
 		return 0;
 
-	bpf_probe_read(value, sizeof(struct strobe_value_generic), location);
-	len = bpf_probe_read_str(payload, STROBE_MAX_STR_LEN, value->ptr);
+	bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
+	len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN, value->ptr);
 	/*
-	 * if bpf_probe_read_str returns error (<0), due to casting to
+	 * if bpf_probe_read_user_str returns error (<0), due to casting to
 	 * unsinged int, it will become big number, so next check is
 	 * sufficient to check for errors AND prove to BPF verifier, that
-	 * bpf_probe_read_str won't return anything bigger than
+	 * bpf_probe_read_user_str won't return anything bigger than
 	 * STROBE_MAX_STR_LEN
 	 */
 	if (len > STROBE_MAX_STR_LEN)
@@ -391,8 +391,8 @@ static __always_inline void *read_map_var(struct strobemeta_cfg *cfg,
 	if (!location)
 		return payload;
 
-	bpf_probe_read(value, sizeof(struct strobe_value_generic), location);
-	if (bpf_probe_read(&map, sizeof(struct strobe_map_raw), value->ptr))
+	bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
+	if (bpf_probe_read_user(&map, sizeof(struct strobe_map_raw), value->ptr))
 		return payload;
 
 	descr->id = map.id;
@@ -402,7 +402,7 @@ static __always_inline void *read_map_var(struct strobemeta_cfg *cfg,
 		data->req_meta_valid = 1;
 	}
 
-	len = bpf_probe_read_str(payload, STROBE_MAX_STR_LEN, map.tag);
+	len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN, map.tag);
 	if (len <= STROBE_MAX_STR_LEN) {
 		descr->tag_len = len;
 		payload += len;
@@ -418,15 +418,15 @@ static __always_inline void *read_map_var(struct strobemeta_cfg *cfg,
 			break;
 
 		descr->key_lens[i] = 0;
-		len = bpf_probe_read_str(payload, STROBE_MAX_STR_LEN,
-					 map.entries[i].key);
+		len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN,
+					      map.entries[i].key);
 		if (len <= STROBE_MAX_STR_LEN) {
 			descr->key_lens[i] = len;
 			payload += len;
 		}
 		descr->val_lens[i] = 0;
-		len = bpf_probe_read_str(payload, STROBE_MAX_STR_LEN,
-					 map.entries[i].val);
+		len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN,
+					      map.entries[i].val);
 		if (len <= STROBE_MAX_STR_LEN) {
 			descr->val_lens[i] = len;
 			payload += len;
diff --git a/tools/testing/selftests/bpf/progs/test_tcp_estats.c b/tools/testing/selftests/bpf/progs/test_tcp_estats.c
index c8c595da38d4..87b7d934ce73 100644
--- a/tools/testing/selftests/bpf/progs/test_tcp_estats.c
+++ b/tools/testing/selftests/bpf/progs/test_tcp_estats.c
@@ -38,7 +38,7 @@
 #include <sys/socket.h>
 #include "bpf_helpers.h"
 
-#define _(P) ({typeof(P) val = 0; bpf_probe_read(&val, sizeof(val), &P); val;})
+#define _(P) ({typeof(P) val = 0; bpf_probe_read_kernel(&val, sizeof(val), &P); val;})
 #define TCP_ESTATS_MAGIC 0xBAADBEEF
 
 /* This test case needs "sock" and "pt_regs" data structure.
-- 
2.21.0



* [PATCH bpf-next v2 8/8] bpf, testing: Add selftest to read/write sockaddr from user space
  2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
                   ` (6 preceding siblings ...)
  2019-10-31  1:00 ` [PATCH bpf-next v2 7/8] bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper Daniel Borkmann
@ 2019-10-31  1:00 ` Daniel Borkmann
  7 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2019-10-31  1:00 UTC (permalink / raw)
  To: bpf
  Cc: netdev, ast, andrii.nakryiko, Daniel Borkmann, Andrii Nakryiko,
	Ilya Leoshkevich

Tested on x86-64, and Ilya was also kind enough to give it a spin on
s390x; both pass with probe_user:OK. The test uses the newly added
bpf_probe_read_user() to dump the sockaddr from the connect call into the
.bss BPF map and overrides the user buffer via bpf_probe_write_user():

  # ./test_progs
  [...]
  #17 pkt_md_access:OK
  #18 probe_user:OK
  #19 prog_run_xattr:OK
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 .../selftests/bpf/prog_tests/probe_user.c     | 78 +++++++++++++++++++
 .../selftests/bpf/progs/test_probe_user.c     | 26 +++++++
 2 files changed, 104 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_user.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_probe_user.c

diff --git a/tools/testing/selftests/bpf/prog_tests/probe_user.c b/tools/testing/selftests/bpf/prog_tests/probe_user.c
new file mode 100644
index 000000000000..6cc36c87bdbc
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/probe_user.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+
+void test_probe_user(void)
+{
+#define kprobe_name "__sys_connect"
+	const char *prog_name = "kprobe/" kprobe_name;
+	const char *obj_file = "./test_probe_user.o";
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts, );
+	int err, results_map_fd, sock_fd, duration;
+	struct sockaddr curr, orig, tmp;
+	struct sockaddr_in *in = (struct sockaddr_in *)&curr;
+	struct bpf_link *kprobe_link = NULL;
+	struct bpf_program *kprobe_prog;
+	struct bpf_object *obj;
+	static const int zero = 0;
+
+	obj = bpf_object__open_file(obj_file, &opts);
+	if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj)))
+		return;
+
+	kprobe_prog = bpf_object__find_program_by_title(obj, prog_name);
+	if (CHECK(!kprobe_prog, "find_probe",
+		  "prog '%s' not found\n", prog_name))
+		goto cleanup;
+
+	err = bpf_object__load(obj);
+	if (CHECK(err, "obj_load", "err %d\n", err))
+		goto cleanup;
+
+	results_map_fd = bpf_find_map(__func__, obj, "test_pro.bss");
+	if (CHECK(results_map_fd < 0, "find_bss_map",
+		  "err %d\n", results_map_fd))
+		goto cleanup;
+
+	kprobe_link = bpf_program__attach_kprobe(kprobe_prog, false,
+						 kprobe_name);
+	if (CHECK(IS_ERR(kprobe_link), "attach_kprobe",
+		  "err %ld\n", PTR_ERR(kprobe_link))) {
+		kprobe_link = NULL;
+		goto cleanup;
+	}
+
+	memset(&curr, 0, sizeof(curr));
+	in->sin_family = AF_INET;
+	in->sin_port = htons(5555);
+	in->sin_addr.s_addr = inet_addr("255.255.255.255");
+	memcpy(&orig, &curr, sizeof(curr));
+
+	sock_fd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK(sock_fd < 0, "create_sock_fd", "err %d\n", sock_fd))
+		goto cleanup;
+
+	connect(sock_fd, &curr, sizeof(curr));
+	close(sock_fd);
+
+	err = bpf_map_lookup_elem(results_map_fd, &zero, &tmp);
+	if (CHECK(err, "get_kprobe_res",
+		  "failed to get kprobe res: %d\n", err))
+		goto cleanup;
+
+	in = (struct sockaddr_in *)&tmp;
+	if (CHECK(memcmp(&tmp, &orig, sizeof(orig)), "check_kprobe_res",
+		  "wrong kprobe res from probe read: %s:%u\n",
+		  inet_ntoa(in->sin_addr), ntohs(in->sin_port)))
+		goto cleanup;
+
+	memset(&tmp, 0xab, sizeof(tmp));
+
+	in = (struct sockaddr_in *)&curr;
+	if (CHECK(memcmp(&curr, &tmp, sizeof(tmp)), "check_kprobe_res",
+		  "wrong kprobe res from probe write: %s:%u\n",
+		  inet_ntoa(in->sin_addr), ntohs(in->sin_port)))
+		goto cleanup;
+cleanup:
+	bpf_link__destroy(kprobe_link);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_probe_user.c b/tools/testing/selftests/bpf/progs/test_probe_user.c
new file mode 100644
index 000000000000..1871e2ece0c4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_probe_user.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/ptrace.h>
+#include <linux/bpf.h>
+
+#include <netinet/in.h>
+
+#include "bpf_helpers.h"
+#include "bpf_tracing.h"
+
+static struct sockaddr_in old;
+
+SEC("kprobe/__sys_connect")
+int handle_sys_connect(struct pt_regs *ctx)
+{
+	void *ptr = (void *)PT_REGS_PARM2(ctx);
+	struct sockaddr_in new;
+
+	bpf_probe_read_user(&old, sizeof(old), ptr);
+	__builtin_memset(&new, 0xab, sizeof(new));
+	bpf_probe_write_user(ptr, &new, sizeof(new));
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.21.0



* Re: [PATCH bpf-next v2 4/8] bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers
  2019-10-31  1:00 ` [PATCH bpf-next v2 4/8] bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers Daniel Borkmann
@ 2019-10-31 20:07   ` Andrii Nakryiko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2019-10-31 20:07 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: bpf, Networking, Alexei Starovoitov

On Wed, Oct 30, 2019 at 6:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
> in that they assume they can be used for probing memory access for kernel
> space addresses /as well as/ user space addresses.
>
> However, plain use of probe_kernel_read() for both cases will always
> attempt to access the kernel address space, given that the access is
> performed under KERNEL_DS, and some archs in fact have overlapping address
> spaces where a kernel pointer and a user pointer would have the /same/
> address value. Accessing application memory via bpf_probe_read{,_str}()
> would therefore read garbage values.
>
> Let's fix the BPF side by making use of the recently added 3d7081822f7f
> ("uaccess: Add non-pagefault user-space read functions"). Unfortunately,
> the only way to fix this status quo is to add dedicated
> bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str()
> helpers. The bpf_probe_read{,_str}() helpers are kept as-is to retain
> their current behavior.
>
> The two *_user() variants always attempt the access with USER_DS set. The
> two *_kernel() variants will -EFAULT when accessing user memory if the
> underlying architecture has non-overlapping address ranges, which also
> avoids triggering the kernel warning added in 00c42373d397 ("x86-64: add
> warning for non-canonical user access address dereferences").
>
> Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper")
> Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---

LGTM!

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  include/uapi/linux/bpf.h       | 120 +++++++++++++++-------
>  kernel/trace/bpf_trace.c       | 181 ++++++++++++++++++++++++---------
>  tools/include/uapi/linux/bpf.h | 120 +++++++++++++++-------
>  3 files changed, 297 insertions(+), 124 deletions(-)
>
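
To make the overlapping-address-space problem described above concrete,
here is a small user-space toy model. It is not kernel code: the two arrays
stand in for a kernel and a user address space that share the same address
values, and every toy_* name is made up for illustration. It only sketches
why a single read helper cannot serve both kinds of pointers, and why
dedicated *_user() and *_kernel() variants remove the ambiguity.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: two address spaces whose address values overlap. */
#define TOY_SPACE_SIZE 16

enum toy_space { TOY_KERNEL, TOY_USER };

static const char toy_kernel_mem[TOY_SPACE_SIZE] = "kernel-secret";
static const char toy_user_mem[TOY_SPACE_SIZE]   = "user-string";

/* Copy size bytes found at addr in the chosen space, or fail with -EFAULT. */
static inline int toy_read(enum toy_space space, void *dst, size_t size,
                           size_t addr)
{
    const char *mem = (space == TOY_USER) ? toy_user_mem : toy_kernel_mem;

    if (addr > TOY_SPACE_SIZE || size > TOY_SPACE_SIZE - addr)
        return -EFAULT;
    memcpy(dst, mem + addr, size);
    return 0;
}

/* Legacy-style helper: always resolves the address in the kernel space,
 * so a user pointer with the same value silently yields kernel bytes. */
static inline int toy_probe_read(void *dst, size_t size, size_t addr)
{
    return toy_read(TOY_KERNEL, dst, size, addr);
}

/* Dedicated variants: the caller states which space the pointer is from. */
static inline int toy_probe_read_user(void *dst, size_t size, size_t addr)
{
    return toy_read(TOY_USER, dst, size, addr);
}

static inline int toy_probe_read_kernel(void *dst, size_t size, size_t addr)
{
    return toy_read(TOY_KERNEL, dst, size, addr);
}
```

In this model, reading address 0 through toy_probe_read() returns bytes from
the kernel array even when the caller meant the user string at the same
address, whereas toy_probe_read_user() returns the user bytes.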


* Re: [PATCH bpf-next v2 5/8] bpf: Switch BPF probe insns to bpf_probe_read_kernel
  2019-10-31  1:00 ` [PATCH bpf-next v2 5/8] bpf: Switch BPF probe insns to bpf_probe_read_kernel Daniel Borkmann
@ 2019-10-31 20:08   ` Andrii Nakryiko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2019-10-31 20:08 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: bpf, Networking, Alexei Starovoitov

On Wed, Oct 30, 2019 at 6:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Commit 2a02759ef5f8 ("bpf: Add support for BTF pointers to interpreter")
> explicitly states that the pointer to BTF object is a pointer to a kernel
> object or NULL. Therefore we should also switch to using the strict kernel
> probe helper which is restricted to kernel addresses only when architectures
> have non-overlapping address spaces.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  kernel/bpf/core.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>

[...]


* Re: [PATCH bpf-next v2 7/8] bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper
  2019-10-31  1:00 ` [PATCH bpf-next v2 7/8] bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper Daniel Borkmann
@ 2019-10-31 20:10   ` Andrii Nakryiko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2019-10-31 20:10 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: bpf, Networking, Alexei Starovoitov

On Wed, Oct 30, 2019 at 6:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Use probe read *_{kernel,user}{,_str}() helpers instead of bpf_probe_read()
> or bpf_probe_read_user_str() for program tests where appropriate.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---

Thanks!

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/testing/selftests/bpf/progs/kfree_skb.c |  4 +-
>  tools/testing/selftests/bpf/progs/pyperf.h    | 67 ++++++++++---------
>  .../testing/selftests/bpf/progs/strobemeta.h  | 36 +++++-----
>  .../selftests/bpf/progs/test_tcp_estats.c     |  2 +-
>  4 files changed, 57 insertions(+), 52 deletions(-)
>

[...]


* Re: [PATCH bpf-next v2 2/8] uaccess: Add strict non-pagefault kernel-space read function
  2019-10-31  1:00 ` [PATCH bpf-next v2 2/8] uaccess: Add strict non-pagefault kernel-space read function Daniel Borkmann
@ 2019-10-31 20:12   ` Andrii Nakryiko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2019-10-31 20:12 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: bpf, Networking, Alexei Starovoitov, Linus Torvalds,
	Masami Hiramatsu, x86

On Wed, Oct 30, 2019 at 6:00 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Add two new probe_kernel_read_strict() and strncpy_from_unsafe_strict()
> helpers which by default alias to the __probe_kernel_read() and the
> __strncpy_from_unsafe(), respectively, but can be overridden by archs
> which have non-overlapping address ranges for kernel space and user
> space in order to bail out with -EFAULT when attempting to probe user
> memory including non-canonical user access addresses [0].
>
> The idea is that these helpers are complementary to the probe_user_read()
> and strncpy_from_unsafe_user() which probe user-only memory. Both added
> helpers here do the same, but for kernel-only addresses.
>
> Both sets of helpers are going to be used for BPF tracing. They also
> explicitly avoid throwing the splat for non-canonical user addresses from
> 00c42373d397 ("x86-64: add warning for non-canonical user access address
> dereferences").
>
> For compat, the current probe_kernel_read() and strncpy_from_unsafe() are
> left as-is.
>
>   [0] Documentation/x86/x86_64/mm.txt
>
>       4-level page tables: 0x0000800000000000 - 0xffff7fffffffffff
>       5-level page tables: 0x0100000000000000 - 0xfeffffffffffffff
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: x86@kernel.org
> ---

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  arch/x86/mm/Makefile    |  2 +-
>  arch/x86/mm/maccess.c   | 38 ++++++++++++++++++++++++++++++++++++++
>  include/linux/uaccess.h |  4 ++++
>  mm/maccess.c            | 25 ++++++++++++++++++++++++-
>  4 files changed, 67 insertions(+), 2 deletions(-)
>  create mode 100644 arch/x86/mm/maccess.c
>
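
As a rough illustration of the kind of check an arch override could perform,
the following user-space sketch classifies an address as "not probeable by a
strict kernel read" when it is either a user-space address or non-canonical.
It is only a sketch under stated assumptions: x86-64 with 4-level page
tables, 4 KiB pages, and 48 virtual address bits. All sketch_* and SKETCH_*
names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: x86-64, 4-level page tables, 4 KiB pages. */
#define SKETCH_PAGE_SIZE     0x1000ULL
#define SKETCH_VADDR_BITS    48u
#define SKETCH_TASK_SIZE_MAX ((1ULL << 47) - SKETCH_PAGE_SIZE)

/* Sign-extend the low vaddr_bits of vaddr to a full 64-bit value. */
static inline uint64_t sketch_canonical_address(uint64_t vaddr,
                                                unsigned vaddr_bits)
{
    assert(vaddr_bits >= 1 && vaddr_bits <= 64);
    if (vaddr_bits == 64)
        return vaddr;

    uint64_t sign = 1ULL << (vaddr_bits - 1);
    uint64_t low = vaddr & ((sign << 1) - 1);

    return (low ^ sign) - sign;
}

/* True for user-space addresses (plus the guard page right above them)
 * and for addresses in the non-canonical hole. */
static inline bool sketch_invalid_probe_range(uint64_t vaddr)
{
    return vaddr < SKETCH_TASK_SIZE_MAX + SKETCH_PAGE_SIZE ||
           sketch_canonical_address(vaddr, SKETCH_VADDR_BITS) != vaddr;
}

/* Models only the address check of a strict kernel read; copies nothing. */
static inline long sketch_strict_read_check(uint64_t unsafe_addr)
{
    return sketch_invalid_probe_range(unsafe_addr) ? -EFAULT : 0;
}
```

Under these assumptions, both ends of the 4-level non-canonical range quoted
in [0] (0x0000800000000000 and 0xffff7fffffffffff) are rejected with -EFAULT,
while 0xffff800000000000, the first canonical kernel-half address, passes the
check.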

[...]


end of thread, other threads:[~2019-10-31 20:12 UTC | newest]

Thread overview: 13+ messages
2019-10-31  1:00 [PATCH bpf-next v2 0/8] Fix BPF probe memory helpers Daniel Borkmann
2019-10-31  1:00 ` [PATCH bpf-next v2 1/8] uaccess: Add non-pagefault user-space write function Daniel Borkmann
2019-10-31  1:00 ` [PATCH bpf-next v2 2/8] uaccess: Add strict non-pagefault kernel-space read function Daniel Borkmann
2019-10-31 20:12   ` Andrii Nakryiko
2019-10-31  1:00 ` [PATCH bpf-next v2 3/8] bpf: Make use of probe_user_write in probe write helper Daniel Borkmann
2019-10-31  1:00 ` [PATCH bpf-next v2 4/8] bpf: Add probe_read_{user,kernel} and probe_read_{user,kernel}_str helpers Daniel Borkmann
2019-10-31 20:07   ` Andrii Nakryiko
2019-10-31  1:00 ` [PATCH bpf-next v2 5/8] bpf: Switch BPF probe insns to bpf_probe_read_kernel Daniel Borkmann
2019-10-31 20:08   ` Andrii Nakryiko
2019-10-31  1:00 ` [PATCH bpf-next v2 6/8] bpf, samples: Use bpf_probe_read_user where appropriate Daniel Borkmann
2019-10-31  1:00 ` [PATCH bpf-next v2 7/8] bpf, testing: Convert prog tests to probe_read_{user,kernel}{,_str} helper Daniel Borkmann
2019-10-31 20:10   ` Andrii Nakryiko
2019-10-31  1:00 ` [PATCH bpf-next v2 8/8] bpf, testing: Add selftest to read/write sockaddr from user space Daniel Borkmann
