From: Wei Liu <wei.liu2@citrix.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "Martin Pohlack" <mpohlack@amazon.de>,
	"Julien Grall" <julien.grall@arm.com>,
	"Jan Beulich" <JBeulich@suse.com>,
	"Joao Martins" <joao.m.martins@oracle.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Daniel Kiper" <daniel.kiper@oracle.com>,
	"Marek Marczykowski" <marmarek@invisiblethingslab.com>,
	"Anthony Liguori" <aliguori@amazon.com>,
	"Dannowski, Uwe" <uwed@amazon.de>,
	"Lars Kurth" <lars.kurth@citrix.com>,
	"Konrad Wilk" <konrad.wilk@oracle.com>,
	"Ross Philipson" <ross.philipson@oracle.com>,
	"Dario Faggioli" <dfaggioli@suse.com>,
	"Matt Wilson" <msw@amazon.com>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Juergen Gross" <JGross@suse.com>,
	"Sergey Dyasli" <sergey.dyasli@citrix.com>,
	"Wei Liu" <wei.liu2@citrix.com>,
	"George Dunlap" <george.dunlap@eu.citrix.com>,
	"Xen-devel List" <xen-devel@lists.xen.org>,
	"Mihai Donțu" <mdontu@bitdefender.com>,
	"Woodhouse, David" <dwmw@amazon.co.uk>
Subject: Re: Ongoing/future speculative mitigation work
Date: Fri, 7 Dec 2018 18:40:52 +0000
Message-ID: <20181207184051.l6owpsjvecog6zhx@zion.uk.xensource.com>
In-Reply-To: <e3219697-0759-39fc-2486-715cdec1ca9e@citrix.com>

On Thu, Oct 18, 2018 at 06:46:22PM +0100, Andrew Cooper wrote:
> Hello,
> 
> This is an accumulation and summary of various tasks which have been
> discussed since the revelation of the speculative security issues in
> January, and also an invitation to discuss alternative ideas.  They are
> x86 specific, but a lot of the principles are architecture-agnostic.
> 
> 1) A secrets-free hypervisor.
> 
> Basically every hypercall can be (ab)used by a guest, and used as an
> arbitrary cache-load gadget.  Logically, this is the first half of a
> Spectre SP1 gadget, and is usually the first stepping stone to
> exploiting one of the speculative sidechannels.
> 
> Short of compiling Xen with LLVM's Speculative Load Hardening (which is
> still experimental, and comes with a ~30% perf hit in the common case),
> this is unavoidable.  Furthermore, throwing a few array_index_nospec()
> into the code isn't a viable solution to the problem.
> 
> An alternative option is to have less data mapped into Xen's virtual
> address space - if a piece of memory isn't mapped, it can't be loaded
> into the cache.
> 
> An easy first step here is to remove Xen's directmap, which will mean
> that guests' general RAM isn't mapped by default into Xen's address
> space.  This will come with some performance hit, as the
> map_domain_page() infrastructure will now have to actually
> create/destroy mappings, but removing the directmap will cause an
> improvement for non-speculative security as well (No possibility of
> ret2dir as an exploit technique).
> 
> Beyond the directmap, there are plenty of other interesting secrets in
> the Xen heap and other mappings, such as the stacks of the other pcpus. 
> Fixing this requires moving Xen to having a non-uniform memory layout,
> and this is much harder to change.  I already experimented with this as
> a meltdown mitigation around about a year ago, and posted the resulting
> series on Jan 4th,
> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00274.html,
> some trivial bits of which have already found their way upstream.
> 
> To have a non-uniform memory layout, Xen may not share L4 pagetables. 
> i.e. Xen must never have two pcpus which reference the same pagetable in
> %cr3.
> 
> This property already holds for 32bit PV guests, and all HVM guests, but
> 64bit PV guests are the sticking point.  Because Linux has a flat memory
> layout, when a 64bit PV guest schedules two threads from the same
> process on separate vcpus, those two vcpus have the same virtual %cr3,
> and currently, Xen programs the same real %cr3 into hardware.
> 
> If we want Xen to have a non-uniform layout, our two options are:
> * Fix Linux to have the same non-uniform layout that Xen wants
> (Backwards compatibility for older 64bit PV guests can be achieved with
> xen-shim).
> * Make use of the XPTI algorithm (specifically, the pagetable sync/copy part)
> forever more in the future.
> 
> Option 2 isn't great (especially for perf on fixed hardware), but does
> keep all the necessary changes in Xen.  Option 1 looks to be the better
> option longterm.
> 
> As an interesting point to note.  The 32bit PV ABI prohibits sharing of
> L3 pagetables, because back in the 32bit hypervisor days, we used to
> have linear mappings in the Xen virtual range.  This check is stale
> (from a functionality point of view), but still present in Xen.  A
> consequence of this is that 32bit PV guests definitely don't share
> top-level pagetables across vcpus.

Correction: the 32bit PV ABI prohibits sharing of L2 pagetables, but L3
pagetables can be shared, so guests can schedule the same top-level
pagetables across vcpus.

However, 64bit Xen creates a monitor table for a 32bit PAE guest and puts
the CR3 provided by the guest into its first slot, so pcpus don't share
the same L4 pagetables. The property we want still holds.
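
To illustrate why the property holds, here is a minimal sketch in C. It
assumes one Xen-owned monitor table per vcpu; the toy_* names and the
struct layout are hypothetical and are not Xen's real data structures.
Two vcpus that load the same guest L3 still end up with different
hardware top-level tables.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_L4_ENTRIES 512
#define TOY_PRESENT    0x1ULL

/* Hypothetical per-vcpu state, not Xen's real struct vcpu. */
struct toy_vcpu {
    uint64_t guest_cr3;                   /* guest's PAE top-level (L3) table */
    uint64_t monitor_l4[TOY_L4_ENTRIES];  /* Xen-owned L4, private to this vcpu */
};

/* Put the guest-provided CR3 into slot 0 of this vcpu's private L4. */
static void toy_load_guest_cr3(struct toy_vcpu *v, uint64_t guest_l3)
{
    assert((guest_l3 & 0xfffULL) == 0);   /* pagetables are page aligned */
    v->guest_cr3 = guest_l3;
    v->monitor_l4[0] = guest_l3 | TOY_PRESENT;
}

/* Stand-in for the value that would be programmed into the real %cr3. */
static const uint64_t *toy_hw_cr3(const struct toy_vcpu *v)
{
    return v->monitor_l4;
}
```

Calling toy_load_guest_cr3() on two vcpus with the same guest L3 gives
identical slot-0 entries, but toy_hw_cr3() returns a different table for
each, which is the "no shared L4" property.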

> 
> Juergen/Boris: Do you have any idea if/how easy this infrastructure
> would be to implement for 64bit PV guests as well?  If a PV guest can
> advertise via Elfnote that it won't share top-level pagetables, then we
> can audit this trivially in Xen.
> 

After reading the Linux kernel code, I think it is not going to be
trivial, as threads in Linux currently share one pagetable (as they
should).

In order to give each thread its own pagetable while still maintaining
the illusion of one address space, there needs to be synchronisation
under the hood.

There is code in Linux to synchronise vmalloc mappings, but that's only
for the kernel portion. The infrastructure to synchronise the userspace
portion is missing.

One idea is to follow the same model as vmalloc -- maintain a reference
pagetable in struct mm and a list of pagetables for threads, then
synchronise the pagetables in the page fault handler. But this is
probably a bit hard to sell to Linux maintainers because it will touch a
lot of the non-Xen code, increase complexity and decrease performance.
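
A rough sketch of that model in C, as a userspace simulation rather
than kernel code. All toy_* names are hypothetical, and the eager
propagation on unmap is my assumption about what removal would need,
since a lazy fault-time sync can only add entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES     8   /* top-level slots covering "userspace" */
#define TOY_MAX_THREADS 4

struct toy_thread;

/* Stand-in for struct mm: reference pagetable plus list of thread tables. */
struct toy_mm {
    uint64_t reference[TOY_ENTRIES];
    struct toy_thread *threads[TOY_MAX_THREADS];
    size_t nr_threads;
};

struct toy_thread {
    struct toy_mm *mm;
    uint64_t pgd[TOY_ENTRIES];  /* private top-level table, filled lazily */
};

static void toy_attach(struct toy_mm *mm, struct toy_thread *t)
{
    assert(mm->nr_threads < TOY_MAX_THREADS);
    t->mm = mm;
    mm->threads[mm->nr_threads++] = t;
}

/* New mappings only touch the reference table. */
static void toy_map(struct toy_mm *mm, size_t idx, uint64_t entry)
{
    assert(idx < TOY_ENTRIES && entry != 0);
    mm->reference[idx] = entry;
}

/* Removal must reach every thread table; a fault cannot undo a stale entry. */
static void toy_unmap(struct toy_mm *mm, size_t idx)
{
    assert(idx < TOY_ENTRIES);
    mm->reference[idx] = 0;
    for (size_t i = 0; i < mm->nr_threads; i++)
        mm->threads[i]->pgd[idx] = 0;
}

/* Fault path: true if fixed by syncing from the reference table,
 * false if this is a genuine fault. */
static bool toy_fault(struct toy_thread *t, size_t idx)
{
    if (idx >= TOY_ENTRIES)
        return false;
    uint64_t ref = t->mm->reference[idx];
    if (ref == 0 || t->pgd[idx] == ref)
        return false;
    t->pgd[idx] = ref;
    return true;
}
```

Even in this toy form, every map costs an extra fault per thread and
every unmap walks the whole thread list, which is where the complexity
and performance concerns come from.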

Thoughts?

Wei.
