All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf
@ 2024-02-02 10:39 Hou Tao
  2024-02-02 10:39 ` [PATCH bpf v3 1/3] x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h Hou Tao
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Hou Tao @ 2024-02-02 10:39 UTC (permalink / raw)
  To: x86, bpf
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H . Peter Anvin, linux-kernel,
	xingwei lee, Jann Horn, Sohil Mehta, Yonghong Song, houtao1

From: Hou Tao <houtao1@huawei.com>

Hi,

As reported by syzboot [1] and [2], when trying to read vsyscall page
by using bpf_probe_read_kernel() or bpf_probe_read(), oops may happen.

Thomas Gleixner had proposed a test patch [3], but it seems that no
formal patch is posted after about one month [4], so I post it instead
and add an Originally-by tag in patch #2.

Patch #1 makes is_vsyscall_vaddr() being a common helper. Patch #2 fixes
the problem by disallowing vsyscall page read for
copy_from_kernel_nofault(). Patch #3 adds one test case to ensure the
read of vsyscall page through bpf is rejected. Please see individual
patches for more details.

Comments are always welcome.

[1]: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com/
[2]: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com/
[3]: https://lore.kernel.org/bpf/87r0jwquhv.ffs@tglx/
[4]: https://lore.kernel.org/bpf/e24b125c-8ff4-9031-6c53-67ff2e01f316@huaweicloud.com/

Change Log:
v3:
 * rephrase commit message for patch #1 & #2 (Sohil)
 * reword comments in copy_from_kernel_nofault_allowed() (Sohil)
 * add Rvb tag for patch #1 and Acked-by tag for patch #3 (Sohil, Yonghong)

v2: https://lore.kernel.org/bpf/20240126115423.3943360-1-houtao@huaweicloud.com/
  * move is_vsyscall_vaddr to asm/vsyscall.h instead (Sohil)
  * elaborate on the reason for disallowing of vsyscall page read in
    copy_from_kernel_nofault_allowed() (Sohil)
  * update the commit message of patch #2 to more clearly explain how
    the oops occurs. (Sohil)
  * update the commit message of patch #3 to explain the expected return
    values of various bpf helpers (Yonghong)

v1: https://lore.kernel.org/bpf/20240119073019.1528573-1-houtao@huaweicloud.com/

Hou Tao (3):
  x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h
  x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
  selftest/bpf: Test the read of vsyscall page under x86-64

 arch/x86/include/asm/vsyscall.h               | 10 ++++
 arch/x86/mm/fault.c                           |  9 ---
 arch/x86/mm/maccess.c                         | 10 ++++
 .../selftests/bpf/prog_tests/read_vsyscall.c  | 57 +++++++++++++++++++
 .../selftests/bpf/progs/read_vsyscall.c       | 45 +++++++++++++++
 5 files changed, 122 insertions(+), 9 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/read_vsyscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/read_vsyscall.c

-- 
2.29.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf v3 1/3] x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h
  2024-02-02 10:39 [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf Hou Tao
@ 2024-02-02 10:39 ` Hou Tao
  2024-02-02 10:39 ` [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() Hou Tao
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Hou Tao @ 2024-02-02 10:39 UTC (permalink / raw)
  To: x86, bpf
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H . Peter Anvin, linux-kernel,
	xingwei lee, Jann Horn, Sohil Mehta, Yonghong Song, houtao1

From: Hou Tao <houtao1@huawei.com>

Move is_vsyscall_vaddr() into asm/vsyscall.h to make it available for
copy_from_kernel_nofault_allowed() in arch/x86/mm/maccess.c.

Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 arch/x86/include/asm/vsyscall.h | 10 ++++++++++
 arch/x86/mm/fault.c             |  9 ---------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index ab60a71a8dcb9..472f0263dbc61 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -4,6 +4,7 @@
 
 #include <linux/seqlock.h>
 #include <uapi/asm/vsyscall.h>
+#include <asm/page_types.h>
 
 #ifdef CONFIG_X86_VSYSCALL_EMULATION
 extern void map_vsyscall(void);
@@ -24,4 +25,13 @@ static inline bool emulate_vsyscall(unsigned long error_code,
 }
 #endif
 
+/*
+ * The (legacy) vsyscall page is the long page in the kernel portion
+ * of the address space that has user-accessible permissions.
+ */
+static inline bool is_vsyscall_vaddr(unsigned long vaddr)
+{
+	return unlikely((vaddr & PAGE_MASK) == VSYSCALL_ADDR);
+}
+
 #endif /* _ASM_X86_VSYSCALL_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 679b09cfe241c..d6375b3c633bc 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -798,15 +798,6 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 	show_opcodes(regs, loglvl);
 }
 
-/*
- * The (legacy) vsyscall page is the long page in the kernel portion
- * of the address space that has user-accessible permissions.
- */
-static bool is_vsyscall_vaddr(unsigned long vaddr)
-{
-	return unlikely((vaddr & PAGE_MASK) == VSYSCALL_ADDR);
-}
-
 static void
 __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		       unsigned long address, u32 pkey, int si_code)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
  2024-02-02 10:39 [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf Hou Tao
  2024-02-02 10:39 ` [PATCH bpf v3 1/3] x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h Hou Tao
@ 2024-02-02 10:39 ` Hou Tao
  2024-02-02 18:43   ` Sohil Mehta
  2024-02-15 11:13   ` Thomas Gleixner
  2024-02-02 10:39 ` [PATCH bpf v3 3/3] selftest/bpf: Test the read of vsyscall page under x86-64 Hou Tao
  2024-02-16  3:30 ` [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf patchwork-bot+netdevbpf
  3 siblings, 2 replies; 8+ messages in thread
From: Hou Tao @ 2024-02-02 10:39 UTC (permalink / raw)
  To: x86, bpf
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H . Peter Anvin, linux-kernel,
	xingwei lee, Jann Horn, Sohil Mehta, Yonghong Song, houtao1

From: Hou Tao <houtao1@huawei.com>

When trying to use copy_from_kernel_nofault() to read vsyscall page
through a bpf program, the following oops was reported:

  BUG: unable to handle page fault for address: ffffffffff600000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0
  Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
  RIP: 0010:copy_from_kernel_nofault+0x6f/0x110
  ......
  Call Trace:
   <TASK>
   ? copy_from_kernel_nofault+0x6f/0x110
   bpf_probe_read_kernel+0x1d/0x50
   bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d
   trace_call_bpf+0xc5/0x1c0
   perf_call_bpf_enter.isra.0+0x69/0xb0
   perf_syscall_enter+0x13e/0x200
   syscall_trace_enter+0x188/0x1c0
   do_syscall_64+0xb5/0xe0
   entry_SYSCALL_64_after_hwframe+0x6e/0x76
   </TASK>
  ......
  ---[ end trace 0000000000000000 ]---

The oops is triggered when:

1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall
page and invokes copy_from_kernel_nofault() which in turn calls
__get_user_asm().

2) Because the vsyscall page address is not readable from kernel space,
a page fault exception is triggered accordingly.

3) handle_page_fault() considers the vsyscall page address as a user
space address instead of a kernel space address. This results in the
fix-up setup by bpf not being applied and a page_fault_oops() is invoked
due to SMAP.

Considering handle_page_fault() has already considered the vsyscall page
address as a userspace address, fix the problem by disallowing vsyscall
page read for copy_from_kernel_nofault().

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com
Reported-by: xingwei lee <xrivendell7@gmail.com>
Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 arch/x86/mm/maccess.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
index 6993f026adec9..42115ac079cfe 100644
--- a/arch/x86/mm/maccess.c
+++ b/arch/x86/mm/maccess.c
@@ -3,6 +3,8 @@
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 
+#include <asm/vsyscall.h>
+
 #ifdef CONFIG_X86_64
 bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 {
@@ -15,6 +17,14 @@ bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 	if (vaddr < TASK_SIZE_MAX + PAGE_SIZE)
 		return false;
 
+	/*
+	 * Reading from the vsyscall page may cause an unhandled fault in
+	 * certain cases.  Though it is at an address above TASK_SIZE_MAX, it is
+	 * usually considered as a user space address.
+	 */
+	if (is_vsyscall_vaddr(vaddr))
+		return false;
+
 	/*
 	 * Allow everything during early boot before 'x86_virt_bits'
 	 * is initialized.  Needed for instruction decoding in early
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf v3 3/3] selftest/bpf: Test the read of vsyscall page under x86-64
  2024-02-02 10:39 [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf Hou Tao
  2024-02-02 10:39 ` [PATCH bpf v3 1/3] x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h Hou Tao
  2024-02-02 10:39 ` [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() Hou Tao
@ 2024-02-02 10:39 ` Hou Tao
  2024-02-16  3:30 ` [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf patchwork-bot+netdevbpf
  3 siblings, 0 replies; 8+ messages in thread
From: Hou Tao @ 2024-02-02 10:39 UTC (permalink / raw)
  To: x86, bpf
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H . Peter Anvin, linux-kernel,
	xingwei lee, Jann Horn, Sohil Mehta, Yonghong Song, houtao1

From: Hou Tao <houtao1@huawei.com>

Under x86-64, when using bpf_probe_read_kernel{_str}() or
bpf_probe_read{_str}() to read vsyscall page, the read may trigger oops,
so add one test case to ensure that the problem is fixed. Beside those
four bpf helpers mentioned above, testing the read of vsyscall page by
using bpf_probe_read_user{_str} and bpf_copy_from_user{_task}() as well.

The test case passes the address of vsyscall page to these six helpers
and checks whether the returned values are expected:

1) For bpf_probe_read_kernel{_str}()/bpf_probe_read{_str}(), the
   expected return value is -ERANGE as shown below:

bpf_probe_read_kernel_common
  copy_from_kernel_nofault
    // false, return -ERANGE
    copy_from_kernel_nofault_allowed

2) For bpf_probe_read_user{_str}(), the expected return value is -EFAULT
   as show below:

bpf_probe_read_user_common
  copy_from_user_nofault
    // false, return -EFAULT
    __access_ok

3) For bpf_copy_from_user(), the expected return value is -EFAULT:

// return -EFAULT
bpf_copy_from_user
  copy_from_user
    _copy_from_user
      // return false
      access_ok

4) For bpf_copy_from_user_task(), the expected return value is -EFAULT:

// return -EFAULT
bpf_copy_from_user_task
  access_process_vm
    // return 0
    vma_lookup()
    // return 0
    expand_stack()

The occurrence of oops depends on the availability of CPU SMAP [1]
feature and there are three possible configurations of vsyscall page in
the boot cmd-line: vsyscall={xonly|none|emulate}, so there are a total
of six possible combinations. Under all these combinations, the test
case runs successfully.

[1]: https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 .../selftests/bpf/prog_tests/read_vsyscall.c  | 57 +++++++++++++++++++
 .../selftests/bpf/progs/read_vsyscall.c       | 45 +++++++++++++++
 2 files changed, 102 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/read_vsyscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/read_vsyscall.c

diff --git a/tools/testing/selftests/bpf/prog_tests/read_vsyscall.c b/tools/testing/selftests/bpf/prog_tests/read_vsyscall.c
new file mode 100644
index 0000000000000..3405923fe4e65
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/read_vsyscall.c
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2024. Huawei Technologies Co., Ltd */
+#include "test_progs.h"
+#include "read_vsyscall.skel.h"
+
+#if defined(__x86_64__)
+/* For VSYSCALL_ADDR */
+#include <asm/vsyscall.h>
+#else
+/* To prevent build failure on non-x86 arch */
+#define VSYSCALL_ADDR 0UL
+#endif
+
+struct read_ret_desc {
+	const char *name;
+	int ret;
+} all_read[] = {
+	{ .name = "probe_read_kernel", .ret = -ERANGE },
+	{ .name = "probe_read_kernel_str", .ret = -ERANGE },
+	{ .name = "probe_read", .ret = -ERANGE },
+	{ .name = "probe_read_str", .ret = -ERANGE },
+	{ .name = "probe_read_user", .ret = -EFAULT },
+	{ .name = "probe_read_user_str", .ret = -EFAULT },
+	{ .name = "copy_from_user", .ret = -EFAULT },
+	{ .name = "copy_from_user_task", .ret = -EFAULT },
+};
+
+void test_read_vsyscall(void)
+{
+	struct read_vsyscall *skel;
+	unsigned int i;
+	int err;
+
+#if !defined(__x86_64__)
+	test__skip();
+	return;
+#endif
+	skel = read_vsyscall__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "read_vsyscall open_load"))
+		return;
+
+	skel->bss->target_pid = getpid();
+	err = read_vsyscall__attach(skel);
+	if (!ASSERT_EQ(err, 0, "read_vsyscall attach"))
+		goto out;
+
+	/* userspace may don't have vsyscall page due to LEGACY_VSYSCALL_NONE,
+	 * but it doesn't affect the returned error codes.
+	 */
+	skel->bss->user_ptr = (void *)VSYSCALL_ADDR;
+	usleep(1);
+
+	for (i = 0; i < ARRAY_SIZE(all_read); i++)
+		ASSERT_EQ(skel->bss->read_ret[i], all_read[i].ret, all_read[i].name);
+out:
+	read_vsyscall__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/read_vsyscall.c b/tools/testing/selftests/bpf/progs/read_vsyscall.c
new file mode 100644
index 0000000000000..986f96687ae15
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/read_vsyscall.c
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2024. Huawei Technologies Co., Ltd */
+#include <linux/types.h>
+#include <bpf/bpf_helpers.h>
+
+#include "bpf_misc.h"
+
+int target_pid = 0;
+void *user_ptr = 0;
+int read_ret[8];
+
+char _license[] SEC("license") = "GPL";
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int do_probe_read(void *ctx)
+{
+	char buf[8];
+
+	if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
+		return 0;
+
+	read_ret[0] = bpf_probe_read_kernel(buf, sizeof(buf), user_ptr);
+	read_ret[1] = bpf_probe_read_kernel_str(buf, sizeof(buf), user_ptr);
+	read_ret[2] = bpf_probe_read(buf, sizeof(buf), user_ptr);
+	read_ret[3] = bpf_probe_read_str(buf, sizeof(buf), user_ptr);
+	read_ret[4] = bpf_probe_read_user(buf, sizeof(buf), user_ptr);
+	read_ret[5] = bpf_probe_read_user_str(buf, sizeof(buf), user_ptr);
+
+	return 0;
+}
+
+SEC("fentry.s/" SYS_PREFIX "sys_nanosleep")
+int do_copy_from_user(void *ctx)
+{
+	char buf[8];
+
+	if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
+		return 0;
+
+	read_ret[6] = bpf_copy_from_user(buf, sizeof(buf), user_ptr);
+	read_ret[7] = bpf_copy_from_user_task(buf, sizeof(buf), user_ptr,
+					      bpf_get_current_task_btf(), 0);
+
+	return 0;
+}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
  2024-02-02 10:39 ` [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() Hou Tao
@ 2024-02-02 18:43   ` Sohil Mehta
  2024-02-14 22:51     ` Alexei Starovoitov
  2024-02-15 11:13   ` Thomas Gleixner
  1 sibling, 1 reply; 8+ messages in thread
From: Sohil Mehta @ 2024-02-02 18:43 UTC (permalink / raw)
  To: Hou Tao, x86, bpf
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H . Peter Anvin, linux-kernel,
	xingwei lee, Jann Horn, Yonghong Song, houtao1

On 2/2/2024 2:39 AM, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> When trying to use copy_from_kernel_nofault() to read vsyscall page
> through a bpf program, the following oops was reported:
> 
>   BUG: unable to handle page fault for address: ffffffffff600000
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0
>   Oops: 0000 [#1] PREEMPT SMP PTI
>   CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
>   RIP: 0010:copy_from_kernel_nofault+0x6f/0x110
>   ......
>   Call Trace:
>    <TASK>
>    ? copy_from_kernel_nofault+0x6f/0x110
>    bpf_probe_read_kernel+0x1d/0x50
>    bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d
>    trace_call_bpf+0xc5/0x1c0
>    perf_call_bpf_enter.isra.0+0x69/0xb0
>    perf_syscall_enter+0x13e/0x200
>    syscall_trace_enter+0x188/0x1c0
>    do_syscall_64+0xb5/0xe0
>    entry_SYSCALL_64_after_hwframe+0x6e/0x76
>    </TASK>
>   ......
>   ---[ end trace 0000000000000000 ]---
> 
> The oops is triggered when:
> 
> 1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall
> page and invokes copy_from_kernel_nofault() which in turn calls
> __get_user_asm().
> 
> 2) Because the vsyscall page address is not readable from kernel space,
> a page fault exception is triggered accordingly.
> 
> 3) handle_page_fault() considers the vsyscall page address as a user
> space address instead of a kernel space address. This results in the
> fix-up setup by bpf not being applied and a page_fault_oops() is invoked
> due to SMAP.
> 
> Considering handle_page_fault() has already considered the vsyscall page
> address as a userspace address, fix the problem by disallowing vsyscall
> page read for copy_from_kernel_nofault().
> 
> Originally-by: Thomas Gleixner <tglx@linutronix.de>
> Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com
> Reported-by: xingwei lee <xrivendell7@gmail.com>
> Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  arch/x86/mm/maccess.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 

Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
  2024-02-02 18:43   ` Sohil Mehta
@ 2024-02-14 22:51     ` Alexei Starovoitov
  0 siblings, 0 replies; 8+ messages in thread
From: Alexei Starovoitov @ 2024-02-14 22:51 UTC (permalink / raw)
  To: Sohil Mehta, Thomas Gleixner
  Cc: Hou Tao, X86 ML, bpf, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Ingo Molnar, Borislav Petkov, H . Peter Anvin,
	LKML, xingwei lee, Jann Horn, Yonghong Song, Hou Tao

On Fri, Feb 2, 2024 at 11:03 AM Sohil Mehta <sohil.mehta@intel.com> wrote:
>
> On 2/2/2024 2:39 AM, Hou Tao wrote:
> > From: Hou Tao <houtao1@huawei.com>
> >
> > When trying to use copy_from_kernel_nofault() to read vsyscall page
> > through a bpf program, the following oops was reported:
> >
> >   BUG: unable to handle page fault for address: ffffffffff600000
> >   #PF: supervisor read access in kernel mode
> >   #PF: error_code(0x0000) - not-present page
> >   PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0
> >   Oops: 0000 [#1] PREEMPT SMP PTI
> >   CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58
> >   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
> >   RIP: 0010:copy_from_kernel_nofault+0x6f/0x110
> >   ......
> >   Call Trace:
> >    <TASK>
> >    ? copy_from_kernel_nofault+0x6f/0x110
> >    bpf_probe_read_kernel+0x1d/0x50
> >    bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d
> >    trace_call_bpf+0xc5/0x1c0
> >    perf_call_bpf_enter.isra.0+0x69/0xb0
> >    perf_syscall_enter+0x13e/0x200
> >    syscall_trace_enter+0x188/0x1c0
> >    do_syscall_64+0xb5/0xe0
> >    entry_SYSCALL_64_after_hwframe+0x6e/0x76
> >    </TASK>
> >   ......
> >   ---[ end trace 0000000000000000 ]---
> >
> > The oops is triggered when:
> >
> > 1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall
> > page and invokes copy_from_kernel_nofault() which in turn calls
> > __get_user_asm().
> >
> > 2) Because the vsyscall page address is not readable from kernel space,
> > a page fault exception is triggered accordingly.
> >
> > 3) handle_page_fault() considers the vsyscall page address as a user
> > space address instead of a kernel space address. This results in the
> > fix-up setup by bpf not being applied and a page_fault_oops() is invoked
> > due to SMAP.
> >
> > Considering handle_page_fault() has already considered the vsyscall page
> > address as a userspace address, fix the problem by disallowing vsyscall
> > page read for copy_from_kernel_nofault().
> >
> > Originally-by: Thomas Gleixner <tglx@linutronix.de>

Thomas,

could you please Ack the patch if you're still ok with it,
so we can take through the bpf tree to Linus soon ?

Not only syzbot, but real users are hitting this bug.

Thanks!

> > Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com
> > Reported-by: xingwei lee <xrivendell7@gmail.com>
> > Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
> > ---
> >  arch/x86/mm/maccess.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
  2024-02-02 10:39 ` [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() Hou Tao
  2024-02-02 18:43   ` Sohil Mehta
@ 2024-02-15 11:13   ` Thomas Gleixner
  1 sibling, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2024-02-15 11:13 UTC (permalink / raw)
  To: Hou Tao, x86, bpf
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
	Borislav Petkov, H . Peter Anvin, linux-kernel, xingwei lee,
	Jann Horn, Sohil Mehta, Yonghong Song, houtao1

On Fri, Feb 02 2024 at 18:39, Hou Tao wrote:
>
> Originally-by: Thomas Gleixner <tglx@linutronix.de>
> Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com
> Reported-by: xingwei lee <xrivendell7@gmail.com>
> Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com
> Signed-off-by: Hou Tao <houtao1@huawei.com>

Acked-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf
  2024-02-02 10:39 [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf Hou Tao
                   ` (2 preceding siblings ...)
  2024-02-02 10:39 ` [PATCH bpf v3 3/3] selftest/bpf: Test the read of vsyscall page under x86-64 Hou Tao
@ 2024-02-16  3:30 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-02-16  3:30 UTC (permalink / raw)
  To: Hou Tao
  Cc: x86, bpf, dave.hansen, luto, peterz, tglx, mingo, bp, hpa,
	linux-kernel, xrivendell7, jannh, sohil.mehta, yonghong.song,
	houtao1

Hello:

This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Fri,  2 Feb 2024 18:39:32 +0800 you wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Hi,
> 
> As reported by syzboot [1] and [2], when trying to read vsyscall page
> by using bpf_probe_read_kernel() or bpf_probe_read(), oops may happen.
> 
> [...]

Here is the summary with links:
  - [bpf,v3,1/3] x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h
    https://git.kernel.org/bpf/bpf/c/ee0e39a63b78
  - [bpf,v3,2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
    https://git.kernel.org/bpf/bpf/c/32019c659ecf
  - [bpf,v3,3/3] selftest/bpf: Test the read of vsyscall page under x86-64
    https://git.kernel.org/bpf/bpf/c/be66d79189ec

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-02-16  3:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-02 10:39 [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf Hou Tao
2024-02-02 10:39 ` [PATCH bpf v3 1/3] x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h Hou Tao
2024-02-02 10:39 ` [PATCH bpf v3 2/3] x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() Hou Tao
2024-02-02 18:43   ` Sohil Mehta
2024-02-14 22:51     ` Alexei Starovoitov
2024-02-15 11:13   ` Thomas Gleixner
2024-02-02 10:39 ` [PATCH bpf v3 3/3] selftest/bpf: Test the read of vsyscall page under x86-64 Hou Tao
2024-02-16  3:30 ` [PATCH bpf v3 0/3] Fix the read of vsyscall page through bpf patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.