All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Yu, Yu-cheng" <yu-cheng.yu@intel.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: linux-arch <linux-arch@vger.kernel.org>, X86 ML <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux API <linux-api@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Balbir Singh <bsingharora@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Eugene Syromiatnikov <esyr@redhat.com>,
	Florian Weimer <fweimer@redhat.com>,
	"H.J. Lu" <hjl.tools@gmail.com>, Jann Horn <jannh@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <keescook@chromium.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>, Pavel Machek <pavel@ucw.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Vedvyas Shanbhogue <vedvyas.shanbhogue@intel.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Weijiang Yang <weijiang.yang@intel.com>,
	Pengfei Xu <pengfei.xu@intel.com>,
	Haitao Huang <haitao.huang@intel.com>
Subject: Re: extending ucontext (Re: [PATCH v26 25/30] x86/cet/shstk: Handle signals for shadow stack)
Date: Thu, 6 May 2021 15:05:15 -0700	[thread overview]
Message-ID: <bf16ab7e-bf27-68eb-efc9-c0468fb1c651@intel.com> (raw)
In-Reply-To: <5fc5dea4-0705-2aad-cf8f-7ff78a5e518a@intel.com>

On 5/4/2021 1:49 PM, Yu, Yu-cheng wrote:
> On 4/30/2021 11:32 AM, Yu, Yu-cheng wrote:
>> On 4/30/2021 10:47 AM, Andy Lutomirski wrote:
>>> On Fri, Apr 30, 2021 at 10:00 AM Yu, Yu-cheng <yu-cheng.yu@intel.com> 
>>> wrote:
>>>>
>>>> On 4/28/2021 4:03 PM, Andy Lutomirski wrote:
>>>>> On Tue, Apr 27, 2021 at 1:44 PM Yu-cheng Yu <yu-cheng.yu@intel.com> 
>>>>> wrote:
>>>>>>
>>>>>> When shadow stack is enabled, a task's shadow stack states must be 
>>>>>> saved
>>>>>> along with the signal context and later restored in sigreturn. 
>>>>>> However,
>>>>>> currently there is no systematic facility for extending a signal 
>>>>>> context.
>>>>>> There is some space left in the ucontext, but changing ucontext is 
>>>>>> likely
>>>>>> to create compatibility issues and there is not enough space for 
>>>>>> further
>>>>>> extensions.
>>>>>>
>>>>>> Introduce a signal context extension struct 'sc_ext', which is 
>>>>>> used to save
>>>>>> shadow stack restore token address.  The extension is located 
>>>>>> above the fpu
>>>>>> states, plus alignment.  The struct can be extended (such as the 
>>>>>> ibt's
>>>>>> wait_endbr status to be introduced later), and sc_ext.total_size 
>>>>>> field
>>>>>> keeps track of total size.
>>>>>
>>>>> I still don't like this.
>>>>>

[...]

>>>>>
>>>>> That's where we are right now upstream.  The kernel has a parser for
>>>>> the FPU state that is bugs piled upon bugs and is going to have to be
>>>>> rewritten sometime soon.  On top of all this, we have two upcoming
>>>>> features, both of which require different kinds of extensions:
>>>>>
>>>>> 1. AVX-512.  (Yeah, you thought this story was over a few years ago,
>>>>> but no.  And AMX makes it worse.)  To make a long story short, we
>>>>> promised user code many years ago that a signal frame fit in 2048
>>>>> bytes with some room to spare.  With AVX-512 this is false.  With AMX
>>>>> it's so wrong it's not even funny.  The only way out of the mess
>>>>> anyone has come up with involves making the length of the FPU state
>>>>> vary depending on which features are INIT, i.e. making it more compact
>>>>> than "compact" mode is.  This has a side effect: it's no longer
>>>>> possible to modify the state in place, because enabling a feature with
>>>>> no space allocated will make the structure bigger, and the stack won't
>>>>> have room.  Fortunately, one can relocate the entire FPU state, update
>>>>> the pointer in mcontext, and the kernel will happily follow the
>>>>> pointer.  So new code on a new kernel using a super-compact state
>>>>> could expand the state by allocating new memory (on the heap? very
>>>>> awkwardly on the stack?) and changing the pointer.  For all we know,
>>>>> some code already fiddles with the pointer.  This is great, except
>>>>> that your patch sticks more data at the end of the FPU block that no
>>>>> one is expecting, and your sigreturn code follows that pointer, and
>>>>> will read off into lala land.
>>>>>
>>>>
>>>> Then, what about we don't do that at all.  Is it possible from now 
>>>> on we
>>>> don't stick more data at the end, and take the relocating-fpu approach?
>>>>
>>>>> 2. CET.  CET wants us to find a few more bytes somewhere, and those
>>>>> bytes logically belong in ucontext, and here we are.
>>>>>
>>>>
>>>> Fortunately, we can spare CET the need of ucontext extension.  When the
>>>> kernel handles sigreturn, the user-mode shadow stack pointer is 
>>>> right at
>>>> the restore token.  There is no need to put that in ucontext.
>>>
>>> That seems entirely reasonable.  This might also avoid needing to
>>> teach CRIU about CET at all.
>>>
>>>>
>>>> However, the WAIT_ENDBR status needs to be saved/restored for signals.
>>>> Since IBT is now dependent on shadow stack, we can use a spare bit of
>>>> the shadow stack restore token for that.
>>>
>>> That seems like unnecessary ABI coupling.  We have plenty of bits in
>>> uc_flags, and we have an entire reserved word in sigcontext.  How
>>> about just sticking this bit in one of those places?
>>
>> Yes, I will make it UC_WAIT_ENDBR.
> 
> Personally, I think an explicit flag is cleaner than using a reserved 
> word somewhere.  However, there is a small issue: ia32 has no uc_flags.
> 
> This series can support legacy apps up to now.  But, instead of creating 
> too many special cases, perhaps we should drop CET support of ia32?
> 
> Thoughts?
> 

Once we have UC_WAIT_ENDBR, IBT signal handling becomes quite simple. 
Like the following:

diff --git a/arch/x86/include/uapi/asm/ucontext.h 
b/arch/x86/include/uapi/asm/ucontext.h
index 5657b7a49f03..96375d609e11 100644
--- a/arch/x86/include/uapi/asm/ucontext.h
+++ b/arch/x86/include/uapi/asm/ucontext.h
@@ -49,6 +49,11 @@
   */
  #define UC_SIGCONTEXT_SS	0x2
  #define UC_STRICT_RESTORE_SS	0x4
+
+/*
+ * UC_WAIT_ENDBR indicates the task is in wait-ENDBR status.
+ */
+#define UC_WAIT_ENDBR		0x08
  #endif

  #include <asm-generic/ucontext.h>
diff --git a/arch/x86/kernel/ibt.c b/arch/x86/kernel/ibt.c
index d2563dd4759f..da804314ddc4 100644
--- a/arch/x86/kernel/ibt.c
+++ b/arch/x86/kernel/ibt.c
@@ -66,3 +66,32 @@ void ibt_disable(void)
  	ibt_set_clear_msr_bits(0, CET_ENDBR_EN);
  	current->thread.cet.ibt = 0;
  }
+
+int ibt_get_clear_wait_endbr(void)
+{
+	u64 msr_val = 0;
+
+	if (!current->thread.cet.ibt)
+		return 0;
+
+	fpregs_lock();
+
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		__fpregs_load_activate();
+
+	if (!rdmsrl_safe(MSR_IA32_U_CET, &msr_val))
+		wrmsrl(MSR_IA32_U_CET, msr_val & ~CET_WAIT_ENDBR);
+
+	fpregs_unlock();
+
+	return msr_val & CET_WAIT_ENDBR;
+}
+
+int ibt_set_wait_endbr(void)
+{
+	if (!current->thread.cet.ibt)
+		return 0;
+
+
+	return ibt_set_clear_msr_bits(CET_WAIT_ENDBR, 0);
+}
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 66b662e57e19..5afd15419006 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -46,6 +46,7 @@
  #include <asm/syscall.h>
  #include <asm/sigframe.h>
  #include <asm/signal.h>
+#include <asm/cet.h>

  #ifdef CONFIG_X86_64
  /*
@@ -134,6 +135,9 @@ static int restore_sigcontext(struct pt_regs *regs,
  	 */
  	if (unlikely(!(uc_flags & UC_STRICT_RESTORE_SS) && 
user_64bit_mode(regs)))
  		force_valid_ss(regs);
+
+	if (uc_flags & UC_WAIT_ENDBR)
+		ibt_set_wait_endbr();
  #endif

  	return fpu__restore_sig((void __user *)sc.fpstate,
@@ -433,6 +437,9 @@ static unsigned long frame_uc_flags(struct pt_regs 
*regs)
  	if (likely(user_64bit_mode(regs)))
  		flags |= UC_STRICT_RESTORE_SS;

+	if (ibt_get_clear_wait_endbr())
+		flags |= UC_WAIT_ENDBR;
+
  	return flags;
  }


However, this cannot handle ia32 with no SA_SIGINFO.  For that, can we 
create a synthetic token on the shadow stack?

- The token points to itself with reserved bit[1] set, and cannot be 
used for RSTORSSP.
- The token only exists for ia32 with no SA_SIGINFO *AND* when the task 
is in wait-endbr.

The signal shadow stack will look like this:

--> ssp before signal
     synthetic IBT token (for ia32 no SA_SIGINFO)
     shadow stack restore token
     sigreturn address

The synthetic token is not valid in other situations.
How is that?

Thanks,
Yu-cheng

  reply	other threads:[~2021-05-06 22:05 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-27 20:42 [PATCH v26 00/30] Control-flow Enforcement: Shadow Stack Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 01/30] Documentation/x86: Add CET description Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 02/30] x86/cet/shstk: Add Kconfig option for Shadow Stack Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 03/30] x86/cpufeatures: Add CET CPU feature flags for Control-flow Enforcement Technology (CET) Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 04/30] x86/cpufeatures: Introduce CPU setup and option parsing for CET Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 05/30] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 06/30] x86/cet: Add control-protection fault handler Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 07/30] x86/mm: Remove _PAGE_DIRTY from kernel RO pages Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 08/30] x86/mm: Move pmd_write(), pud_write() up in the file Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 09/30] x86/mm: Introduce _PAGE_COW Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 10/30] drm/i915/gvt: Change _PAGE_DIRTY to _PAGE_DIRTY_BITS Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 11/30] x86/mm: Update pte_modify for _PAGE_COW Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 12/30] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 13/30] mm: Introduce VM_SHADOW_STACK for shadow stack memory Yu-cheng Yu
2021-04-27 20:42 ` [PATCH v26 14/30] x86/mm: Shadow Stack page fault error checking Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 15/30] x86/mm: Update maybe_mkwrite() for shadow stack Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 16/30] mm: Fixup places that call pte_mkwrite() directly Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 17/30] mm: Add guard pages around a shadow stack Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 18/30] mm/mmap: Add shadow stack pages to memory accounting Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 19/30] mm: Update can_follow_write_pte() for shadow stack Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 20/30] mm/mprotect: Exclude shadow stack from preserve_write Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 21/30] mm: Re-introduce vm_flags to do_mmap() Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 22/30] x86/cet/shstk: Add user-mode shadow stack support Yu-cheng Yu
2021-04-28 17:52   ` Borislav Petkov
2021-04-28 18:39     ` Yu, Yu-cheng
2021-04-29  9:12       ` Borislav Petkov
2021-04-29 16:17         ` Yu, Yu-cheng
2021-04-29 16:45           ` Borislav Petkov
2021-04-27 20:43 ` [PATCH v26 23/30] x86/cet/shstk: Handle thread shadow stack Yu-cheng Yu
2021-05-10 14:15   ` Borislav Petkov
2021-05-10 22:57     ` Yu, Yu-cheng
2021-05-11 17:09       ` Borislav Petkov
2021-05-12  8:12         ` David Laight
2021-05-11 18:35     ` Yu, Yu-cheng
2021-05-12 15:56       ` Borislav Petkov
2021-04-27 20:43 ` [PATCH v26 24/30] x86/cet/shstk: Introduce shadow stack token setup/verify routines Yu-cheng Yu
2021-05-17  7:45   ` Borislav Petkov
2021-05-17 20:55     ` Yu, Yu-cheng
2021-05-18  0:14       ` Eugene Syromiatnikov
2021-05-18 17:58         ` Borislav Petkov
2021-05-18 19:45           ` Yu, Yu-cheng
2021-05-18 18:05         ` Yu, Yu-cheng
2021-05-18  5:56       ` Borislav Petkov
2021-05-21 16:17     ` Yu, Yu-cheng
2021-05-21 18:40       ` Borislav Petkov
2021-04-27 20:43 ` [PATCH v26 25/30] x86/cet/shstk: Handle signals for shadow stack Yu-cheng Yu
2021-04-28 23:03   ` extending ucontext (Re: [PATCH v26 25/30] x86/cet/shstk: Handle signals for shadow stack) Andy Lutomirski
2021-04-28 23:03     ` Andy Lutomirski
2021-04-28 23:20     ` Yu, Yu-cheng
2021-04-29  7:28     ` Cyrill Gorcunov
2021-04-29 14:44       ` Andy Lutomirski
2021-04-29 14:44         ` Andy Lutomirski
2021-04-29 15:35         ` Cyrill Gorcunov
2021-04-30  6:45     ` Florian Weimer
2021-04-30  6:45       ` Florian Weimer
2021-04-30 17:00     ` Yu, Yu-cheng
2021-04-30 17:47       ` Andy Lutomirski
2021-04-30 17:47         ` Andy Lutomirski
2021-04-30 18:32         ` Yu, Yu-cheng
2021-05-04 20:49           ` Yu, Yu-cheng
2021-05-06 22:05             ` Yu, Yu-cheng [this message]
2021-05-06 23:31               ` Andy Lutomirski
2021-05-06 23:31                 ` Andy Lutomirski
2021-05-02 23:23         ` Andy Lutomirski
2021-05-02 23:23           ` Andy Lutomirski
2021-05-03  6:03           ` H. Peter Anvin
2021-05-03 15:13           ` Yu, Yu-cheng
2021-05-03 15:29             ` Andy Lutomirski
2021-05-03 20:25               ` Yu, Yu-cheng
2021-04-27 20:43 ` [PATCH v26 26/30] ELF: Introduce arch_setup_elf_property() Yu-cheng Yu
2021-05-19 18:10   ` Borislav Petkov
2021-05-19 22:14     ` Yu, Yu-cheng
2021-05-20  9:26       ` Borislav Petkov
2021-05-20 17:18         ` Yu, Yu-cheng
2021-05-20 17:35           ` Borislav Petkov
2021-05-20 17:51             ` Yu, Yu-cheng
2021-05-20 17:38       ` Matthew Wilcox
2021-05-20 17:52         ` Yu, Yu-cheng
2021-05-20 21:06           ` Yu, Yu-cheng
2021-04-27 20:43 ` [PATCH v26 27/30] x86/cet/shstk: Add arch_prctl functions for shadow stack Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 28/30] mm: Move arch_calc_vm_prot_bits() to arch/x86/include/asm/mman.h Yu-cheng Yu
2021-04-27 20:43 ` [PATCH v26 29/30] mm: Update arch_validate_flags() to test vma anonymous Yu-cheng Yu
2021-05-11 11:35   ` Kirill A. Shutemov
2021-04-27 20:43 ` [PATCH v26 30/30] mm: Introduce PROT_SHADOW_STACK for shadow stack Yu-cheng Yu
2021-05-11 11:48   ` Kirill A. Shutemov
2021-05-11 14:44     ` Yu, Yu-cheng
2021-04-29 17:13 ` [PATCH v26 00/30] Control-flow Enforcement: Shadow Stack Borislav Petkov
2021-04-29 17:32   ` Yu, Yu-cheng
2021-04-29 17:49     ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf16ab7e-bf27-68eb-efc9-c0468fb1c651@intel.com \
    --to=yu-cheng.yu@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=bsingharora@gmail.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=esyr@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=gorcunov@gmail.com \
    --cc=haitao.huang@intel.com \
    --cc=hjl.tools@gmail.com \
    --cc=hpa@zytor.com \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=oleg@redhat.com \
    --cc=pavel@ucw.cz \
    --cc=pengfei.xu@intel.com \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vedvyas.shanbhogue@intel.com \
    --cc=weijiang.yang@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.