From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>, Juergen Gross <jgross@suse.com>
Cc: wei.liu2@citrix.com, George.Dunlap@eu.citrix.com,
	ian.jackson@eu.citrix.com, Dario Faggioli <dfaggioli@suse.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH RFC v2 00/12] xen/x86: use per-vcpu stacks for 64 bit pv domains
Date: Mon, 22 Jan 2018 18:39:06 +0000	[thread overview]
Message-ID: <1484b1c8-6140-323e-9933-e1120f1e2a0d@citrix.com> (raw)
In-Reply-To: <5A6624A602000078001A1375@prv-mh.provo.novell.com>

On 22/01/18 16:51, Jan Beulich wrote:
>>>> On 22.01.18 at 16:00, <jgross@suse.com> wrote:
>> On 22/01/18 15:48, Jan Beulich wrote:
>>>>>> On 22.01.18 at 15:38, <jgross@suse.com> wrote:
>>>> On 22/01/18 15:22, Jan Beulich wrote:
>>>>>>>> On 22.01.18 at 15:18, <jgross@suse.com> wrote:
>>>>>> On 22/01/18 13:50, Jan Beulich wrote:
>>>>>>>>>> On 22.01.18 at 13:32, <jgross@suse.com> wrote:
>>>>>>>> As a preparation for doing page table isolation in the Xen hypervisor
>>>>>>>> in order to mitigate "Meltdown" use dedicated stacks, GDT and TSS for
>>>>>>>> 64 bit PV domains mapped to the per-domain virtual area.
>>>>>>>>
>>>>>>>> The per-vcpu stacks are used for early interrupt handling only. After
>>>>>>>> saving the domain's registers, stacks are switched back to the normal
>>>>>>>> per physical cpu ones in order to be able to address on-stack data
>>>>>>>> from other cpus e.g. while handling IPIs.
>>>>>>>>
>>>>>>>> Adding %cr3 switching between saving of the registers and switching
>>>>>>>> the stacks will enable the possibility to run guest code without any
>>>>>>>> per physical cpu mapping, i.e. avoiding the threat of a guest being
>>>>>>>> able to access other domains data.
>>>>>>>>
>>>>>>>> Without any further measures it will still be possible for e.g. a
>>>>>>>> guest's user program to read stack data of another vcpu of the same
>>>>>>>> domain, but this can be easily avoided by a little PV-ABI modification
>>>>>>>> introducing per-cpu user address spaces.
>>>>>>>>
>>>>>>>> This series is meant as a replacement for Andrew's patch series:
>>>>>>>> "x86: Prerequisite work for a Xen KAISER solution".
>>>>>>> Considering in particular the two reverts, what I'm missing here
>>>>>>> is a clear description of the meaningful additional protection this
>>>>>>> approach provides over the band-aid. For context see also
>>>>>>> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg01735.html 
>>>>>> My approach supports mapping only the following data while the guest is
>>>>>> running (apart from the guest's own data, of course):
>>>>>>
>>>>>> - the per-vcpu entry stacks of the domain which will contain only the
>>>>>>   guest's registers saved when an interrupt occurs
>>>>>> - the per-vcpu GDTs and TSSs of the domain
>>>>>> - the IDT
>>>>>> - the interrupt handler code (arch/x86/x86_64/[compat/]entry.S)
>>>>>>
>>>>>> All other hypervisor data and code can be completely hidden from the
>>>>>> guests.
>>>>> I understand that. What I'm not clear about is: Which parts of
>>>>> the additionally hidden data are actually necessary (or at least
>>>>> very desirable) to hide?
>>>> Necessary:
>>>> - other guests' memory (e.g. physical memory 1:1 mapping)
>>>> - data from other guests e.g. in stack pages, debug buffers, I/O buffers,
>>>>   code emulator buffers
>>>> - other guests' register values e.g. in vcpu structure
>>> All of this is already being made invisible by the band-aid (with the
>>> exception of leftovers on the hypervisor stacks across context
>>> switches, which we've already said could be taken care of by
>>> memset()ing that area). I'm asking about the _additional_ benefits
>>> of your approach.
>> I'm quite sure the performance will be much better as it doesn't require
>> per physical cpu L4 page tables, but just a shadow L4 table for each
>> guest L4 table, similar to the Linux kernel KPTI approach.
> But isn't that model having the same synchronization issues upon
> guest L4 updates which Andrew was fighting with?

(Condensing a lot of threads down into one)

All the methods have L4 synchronisation update issues, until we have a
PV ABI which guarantees that L4s don't get reused.  Any improvements to
the shadowing/synchronisation algorithm will benefit all approaches.
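The other band-aid detail from the quoted exchange, memset()ing the
leftovers on the hypervisor stacks across context switches, is tiny in
comparison.  A toy model (the size is a stand-in; the real primary stack
is several pages):

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the real multi-page per-CPU hypervisor stack. */
#define STACK_BYTES 256

struct pcpu_stack {
    unsigned char bytes[STACK_BYTES];
};

/* Wipe whatever the outgoing vcpu may have left on the stack before the
 * incoming vcpu gets a chance to observe it. */
static void scrub_stack_on_switch(struct pcpu_stack *s)
{
    memset(s->bytes, 0, sizeof(s->bytes));
}
```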

Juergen: you're now adding an LTR into the context switch path, which
tends to be very slow.  I.e. as currently presented, this series
necessarily has a higher runtime overhead than Jan's XPTI.

One of my concerns is that this patch series moves further away from the
secondary goal of my KAISER series, which was to have the IDT and GDT
mapped at the same linear addresses on every CPU so a) SIDT/SGDT don't
leak which CPU you're currently scheduled on into PV guests and b) the
context switch code can drop a load of its slow instructions like LGDT
and the VMWRITEs to update the VMCS.
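The SIDT/SGDT point can be illustrated with made-up numbers (neither the
base address nor the stride below is Xen's real layout):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU GDT placement: a distinct linear address per CPU. */
#define PERCPU_GDT_BASE  0xffff830000000000ULL  /* made-up base */
#define GDT_STRIDE       0x1000ULL              /* one page per CPU */

/* Per-CPU layout: the address depends on the CPU number, so an SGDT
 * executed by a PV guest reveals which CPU it is scheduled on. */
static uint64_t gdt_addr_percpu(unsigned int cpu)
{
    return PERCPU_GDT_BASE + (uint64_t)cpu * GDT_STRIDE;
}

/* KAISER-series goal: the same linear address on every CPU, with each
 * CPU's page tables mapping that fixed address to its own GDT frame.
 * SGDT then reveals nothing, and the context switch code need not
 * reload GDTR with a slow LGDT. */
static uint64_t gdt_addr_fixed(unsigned int cpu)
{
    (void)cpu;
    return PERCPU_GDT_BASE;     /* identical everywhere */
}
```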

Jan: As to the things not covered by the current XPTI, hiding most of
the .text section is important to prevent fingerprinting or ROP
scanning.  This is a defence-in-depth argument, but a guest being easily
able to identify whether certain XSAs are fixed or not is quite bad. 
Also, a load of CPU 0's data structures, including the stack, is
visible in .data.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
