xen-devel.lists.xenproject.org archive mirror
* [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()
@ 2020-10-14 18:47 Andrew Cooper
  2020-10-15  8:50 ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2020-10-14 18:47 UTC
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

cpu_smpboot_alloc() is designed to be idempotent with respect to partially
initialised state.  This occurs for S3 and CPU parking, where enough state to
handle NMIs/#MCs needs to remain valid for the entire lifetime of Xen, even
when we otherwise want to offline the CPU.

For simplicity across the various configurations, Xen always uses shadow stack
mappings (Read-only + Dirty) for the guard page, irrespective of whether
CET-SS is enabled.

Unfortunately, the CET-SS changes in memguard_guard_stack() broke idempotency
by first writing out the supervisor shadow stack tokens with plain writes,
then changing the mapping to being read-only.

This ordering is strictly necessary to configure the BSP, which cannot have
the supervisor tokens be written with WRSS.

Instead of calling memguard_guard_stack() unconditionally, call it only when
actually allocating a new stack.  Xenheap allocations are guaranteed to be
writeable, and the net result is idempotency WRT configuring stack_base[].

Fixes: 91d26ed304f ("x86/shstk: Create shadow stacks")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

This can more easily be demonstrated with CPU hotplug than S3, and the absence
of bug reports goes to show how rarely hotplug is used.

v2:
 * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
   turn the BSP shadow stack into regular mappings, and #DF as soon as the TLB
   shootdown completes.  For CPU Parking, it would invalidate the shadow stack
   of the parked CPUs, causing a #DF on the next NMI/#MC to hit the thread.
---
 xen/arch/x86/smpboot.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 5708573c41..67e727cebd 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -997,16 +997,18 @@ static int cpu_smpboot_alloc(unsigned int cpu)
         memflags = MEMF_node(node);
 
     if ( stack_base[cpu] == NULL )
+    {
         stack_base[cpu] = alloc_xenheap_pages(STACK_ORDER, memflags);
-    if ( stack_base[cpu] == NULL )
-        goto out;
+        if ( !stack_base[cpu] )
+            goto out;
+
+        memguard_guard_stack(stack_base[cpu]);
+    }
 
     info = get_cpu_info_from_stack((unsigned long)stack_base[cpu]);
     info->processor_id = cpu;
     info->per_cpu_offset = __per_cpu_offset[cpu];
 
-    memguard_guard_stack(stack_base[cpu]);
-
     gdt = per_cpu(gdt, cpu) ?: alloc_xenheap_pages(0, memflags);
     if ( gdt == NULL )
         goto out;
-- 
2.11.0
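
A minimal user-space sketch of the idempotency problem the commit message
describes; it is not Xen code.  The model_* names, the struct and the fault
counter are hypothetical stand-ins.  The real memguard_guard_stack() changes
page mappings, and a repeated plain write to a read-only page would fault
rather than increment a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of one per-CPU stack with a shadow stack page. */
struct model_stack {
    bool shstk_readonly;   /* models the Read-only + Dirty mapping */
    unsigned long token;   /* models the supervisor shadow stack token */
    unsigned int faults;   /* plain writes attempted while read-only */
};

#define MODEL_NR_CPUS 4
static struct model_stack *model_stack_base[MODEL_NR_CPUS];

/* Models the guard step: plain write of the token, then make it read-only. */
static void model_guard_stack(struct model_stack *s)
{
    assert(s != NULL);

    if ( s->shstk_readonly )
        s->faults++;       /* on real hardware this write would fault */
    else
        s->token = 0x1;    /* placeholder token value */

    s->shstk_readonly = true;
}

/* Pre-patch shape: the guard step runs on every call. */
static int model_alloc_unconditional(unsigned int cpu)
{
    if ( model_stack_base[cpu] == NULL )
        model_stack_base[cpu] = calloc(1, sizeof(struct model_stack));
    if ( model_stack_base[cpu] == NULL )
        return -1;

    model_guard_stack(model_stack_base[cpu]);
    return 0;
}

/* Post-patch shape: the guard step runs only for a freshly allocated stack. */
static int model_alloc_idempotent(unsigned int cpu)
{
    if ( model_stack_base[cpu] == NULL )
    {
        model_stack_base[cpu] = calloc(1, sizeof(struct model_stack));
        if ( !model_stack_base[cpu] )
            return -1;

        model_guard_stack(model_stack_base[cpu]);
    }
    return 0;
}
```

Calling model_alloc_unconditional() twice for the same CPU records one
simulated fault on the second call, while calling model_alloc_idempotent()
twice records none.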




* Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()
  2020-10-14 18:47 [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc() Andrew Cooper
@ 2020-10-15  8:50 ` Jan Beulich
  2020-10-15 14:02   ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2020-10-15  8:50 UTC
  To: Andrew Cooper; +Cc: Xen-devel, Roger Pau Monné, Wei Liu

On 14.10.2020 20:47, Andrew Cooper wrote:
> v2:
>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
>    turn the BSP shadow stack into regular mappings, and #DF as soon as the TLB
>    shootdown completes.

The code change looks correct to me, but since I don't understand
this part I'm afraid I may be overlooking something. I understand
the "turn the BSP shadow stack into regular mappings" relates to
cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
didn't think we come through cpu_smpboot_free() for the BSP upon
entering or leaving S3.

Jan



* Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()
  2020-10-15  8:50 ` Jan Beulich
@ 2020-10-15 14:02   ` Andrew Cooper
  2020-10-15 15:16     ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2020-10-15 14:02 UTC
  To: Jan Beulich; +Cc: Xen-devel, Roger Pau Monné, Wei Liu

On 15/10/2020 09:50, Jan Beulich wrote:
> On 14.10.2020 20:47, Andrew Cooper wrote:
>> v2:
>>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
>>    turn the BSP shadow stack into regular mappings, and #DF as soon as the TLB
>>    shootdown completes.
> The code change looks correct to me, but since I don't understand
> this part I'm afraid I may be overlooking something. I understand
> the "turn the BSP shadow stack into regular mappings" relates to
> cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
> didn't think we come through cpu_smpboot_free() for the BSP upon
> entering or leaving S3.

The v1 really did fix Marek's repro of the problem.

The only possible way this can occur is if, somewhere, there is a call
to cpu_smpboot_free() for CPU0 with remove=0 on the S3 path.

I have to admit that I can't actually spot where it is.


Either way - it doesn't impact the fix, which attempts to make "the
stack" into a single object.  I experimented with introducing
smpboot_{alloc,free}_stack(), but the result wasn't clean and I
abandoned that approach.

~Andrew
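
A hedged sketch of what treating "the stack" as a single object could look
like, with allocation and guarding paired in one helper and unguarding and
freeing paired in another.  This is a user-space model only.  The sketch_*
names are hypothetical and are not the smpboot_{alloc,free}_stack() helpers
mentioned above, which were abandoned; the remove flag is modelled on the
remove=0 case discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a per-CPU stack; 'guarded' stands in for the mapping. */
struct sketch_stack {
    bool guarded;
};

#define SKETCH_NR_CPUS 2
static struct sketch_stack *sketch_stack_base[SKETCH_NR_CPUS];

/* Allocate and guard as one step; a no-op if the stack already exists. */
static int sketch_alloc_stack(unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);

    if ( sketch_stack_base[cpu] != NULL )
        return 0;

    sketch_stack_base[cpu] = calloc(1, sizeof(*sketch_stack_base[cpu]));
    if ( sketch_stack_base[cpu] == NULL )
        return -1;

    sketch_stack_base[cpu]->guarded = true;
    return 0;
}

/* Unguard and free as one step, and only when the CPU is really removed. */
static void sketch_free_stack(unsigned int cpu, bool remove)
{
    assert(cpu < SKETCH_NR_CPUS);

    if ( !remove || sketch_stack_base[cpu] == NULL )
        return;            /* parked or S3: keep the guarded stack intact */

    sketch_stack_base[cpu]->guarded = false;
    free(sketch_stack_base[cpu]);
    sketch_stack_base[cpu] = NULL;
}
```

In this model, sketch_free_stack(cpu, false) followed by
sketch_alloc_stack(cpu) leaves the same guarded stack in place, which is the
property the fix relies on for parked CPUs and S3.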



* Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()
  2020-10-15 14:02   ` Andrew Cooper
@ 2020-10-15 15:16     ` Jan Beulich
  2020-10-15 16:38       ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2020-10-15 15:16 UTC
  To: Andrew Cooper; +Cc: Xen-devel, Roger Pau Monné, Wei Liu

On 15.10.2020 16:02, Andrew Cooper wrote:
> On 15/10/2020 09:50, Jan Beulich wrote:
>> On 14.10.2020 20:47, Andrew Cooper wrote:
>>> v2:
>>>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
>>>    turn the BSP shadow stack into regular mappings, and #DF as soon as the TLB
>>>    shootdown completes.
>> The code change looks correct to me, but since I don't understand
>> this part I'm afraid I may be overlooking something. I understand
>> the "turn the BSP shadow stack into regular mappings" relates to
>> cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
>> didn't think we come through cpu_smpboot_free() for the BSP upon
>> entering or leaving S3.
> 
> The v1 really did fix Marek's repro of the problem.
> 
> The only possible way this can occur is if, somewhere, there is a call
> to cpu_smpboot_free() for CPU0 with remove=0 on the S3 path

I didn't think it was the BSP's stack that got written to, but the
first AP's before letting it run.

Jan



* Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()
  2020-10-15 15:16     ` Jan Beulich
@ 2020-10-15 16:38       ` Andrew Cooper
  2020-10-16  6:45         ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2020-10-15 16:38 UTC
  To: Jan Beulich; +Cc: Xen-devel, Roger Pau Monné, Wei Liu

On 15/10/2020 16:16, Jan Beulich wrote:
> On 15.10.2020 16:02, Andrew Cooper wrote:
>> On 15/10/2020 09:50, Jan Beulich wrote:
>>> On 14.10.2020 20:47, Andrew Cooper wrote:
>>>> v2:
>>>>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
>>>>    turn the BSP shadow stack into regular mappings, and #DF as soon as the TLB
>>>>    shootdown completes.
>>> The code change looks correct to me, but since I don't understand
>>> this part I'm afraid I may be overlooking something. I understand
>>> the "turn the BSP shadow stack into regular mappings" relates to
>>> cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
>>> didn't think we come through cpu_smpboot_free() for the BSP upon
>>> entering or leaving S3.
>> The v1 really did fix Marek's repro of the problem.
>>
>> The only possible way this can occur is if, somewhere, there is a call
>> to cpu_smpboot_free() for CPU0 with remove=0 on the S3 path
> I didn't think it was the BSP's stack that got written to, but the
> first AP's before letting it run.

Oh yes - my analysis was wrong.  The CPU notifier for CPU 1 to come up
runs on CPU 0.

So only the --- text was wrong.  Are you happy with the fix now?

~Andrew



* Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()
  2020-10-15 16:38       ` Andrew Cooper
@ 2020-10-16  6:45         ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2020-10-16  6:45 UTC
  To: Andrew Cooper; +Cc: Xen-devel, Roger Pau Monné, Wei Liu

On 15.10.2020 18:38, Andrew Cooper wrote:
> On 15/10/2020 16:16, Jan Beulich wrote:
>> On 15.10.2020 16:02, Andrew Cooper wrote:
>>> On 15/10/2020 09:50, Jan Beulich wrote:
>>>> On 14.10.2020 20:47, Andrew Cooper wrote:
>>>>> v2:
>>>>>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
>>>>>    turn the BSP shadow stack into regular mappings, and #DF as soon as the TLB
>>>>>    shootdown completes.
>>>> The code change looks correct to me, but since I don't understand
>>>> this part I'm afraid I may be overlooking something. I understand
>>>> the "turn the BSP shadow stack into regular mappings" relates to
>>>> cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
>>>> didn't think we come through cpu_smpboot_free() for the BSP upon
>>>> entering or leaving S3.
>>> The v1 really did fix Marek's repro of the problem.
>>>
>>> The only possible way this can occur is if, somewhere, there is a call
>>> to cpu_smpboot_free() for CPU0 with remove=0 on the S3 path
>> I didn't think it was the BSP's stack that got written to, but the
>> first AP's before letting it run.
> 
> Oh yes - my analysis was wrong.  The CPU notifier for CPU 1 to come up
> runs on CPU 0.
> 
> So only the --- text was wrong.  Are you happy with the fix now?

Indeed I am:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan

