From: Ian Jackson <ian.jackson@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	dgdegra@tycho.nsa.gov, Wei Liu <wei.liu2@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Tim Deegan <tim@xen.org>, David Vrabel <david.vrabel@citrix.com>,
	Anthony Perard <anthony.perard@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Device model operation hypercall (DMOP, re qemu depriv)
Date: Mon, 1 Aug 2016 12:32:54 +0100
Message-ID: <22431.13158.46970.556765@mariner.uk.xensource.com>
In-Reply-To: <579A3A62.1020700@citrix.com>

Introducing HVMCTL, Jan wrote:
> A long while back separating out all control kind operations (intended
> for use by only the control domain or device model) from the current
> hvmop hypercall has been discussed. This series aims at finally making
> this reality (at once allowing to streamline the associated XSM checking).

I think we need to introduce a new hypercall (which I will call DMOP
for now) which may augment or replace some of HVMCTL.  Let me explain:


We would like to be able to deprivilege qemu-in-dom0.  This is
because qemu has a large attack surface and has a history of security
bugs.  If we get this right we can easily reduce the impact of `guest
can take over qemu' bugs to DoS; and perhaps with a bit of effort we
can eliminate the DoS too.  (qemu stubdoms are another way to do this
but they have their own difficulties.)

A part of this plan has to be a way for qemu to make hypercalls
related to the guest it is servicing.  But qemu needs to be _unable_
to make _other_ hypercalls.

I see four possible approaches, in (IMO) increasing order of
desirability:

1. We could simply patch the dom0 privcmd driver to know exactly which
   hypercalls are permitted.  This is obviously never going to work
   because there would have to be a massive table in the kernel, kept
   in step with Xen.  We could have a kind of pattern matching engine
   instead, and load the tables from userspace, but that's a daft
   edifice to be building (even if we reuse BPF or something) and a
   total pain to maintain.

2. We could have some kind of privileged proxy or helper process,
   which makes the hypercalls on instruction from qemu.  This would be
   quite complicated and involve a lot of back-and-forth parameter
   passing.  Like option 1, this arrangement would end up embedding
   detailed knowledge about which hypercalls are appropriate, and have
   to understand all of their parameters.

3. We could have the dom0 privcmd driver wrap each of qemu's
   hypercalls in a special "wrap up with different XSM tag" hypercall.
   Then, we could specify the set of allowable hypercalls with XSM.
   If we want qemu deprivileged by default, this depends on turning
   XSM on by default.  But we want qemu depriv ASAP and there are
   difficulties with XSM by default.  This approach also involves
   writing a large and hard-to-verify hypercall permission table, in
   the form of an XSM policy.

4. We could invent a new hypercall `DMOP' for hypercalls which device
   models should be able to use, which always has the target domain in
   a fixed location in the arguments.  We have the dom0 privcmd driver
   know about this one hypercall number and the location of the target
   domid.

Option 4 has the following advantages:

* The specification of which hypercalls are authorised to qemu is
  integrated with the specification of the hypercalls themselves:
  There is no need to maintain a separate table which can get out of
  step (or contain security bugs).

* The changes required to the rest of the system are fairly small.
  In particular, we need only one small, non-varying patch to the
  dom0 kernel.


Let me flesh out option 4 in more detail:


We define a new hypercall DMOP.

Its first argument is always a target domid.  The DMOP hypercall
number and position of the target domid in the arguments are fixed.

A DMOP is defined to never put at risk the stability or security of
the whole system, nor of the domain which calls DMOP.  However, a DMOP
may have arbitrary effects on the target domid.

In the privcmd driver, we provide a new restriction ioctl, which takes
a domid parameter.  After that restriction ioctl is called, the
privcmd driver will permit only DMOP hypercalls, and only with the
specified target domid.

Since the hypercall number and the target domid are stable, this is a
simple check which will not need to be updated as new DMOPs are
defined (and old ones retired).
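
To make the privcmd check concrete, here is a minimal sketch in C of
what the filter might look like.  Everything named here is hypothetical:
HYPERCALL_DMOP is a placeholder value (no number has been assigned),
struct privcmd_file and both functions are invented for illustration,
and I have assumed that the restriction is one-way, ie that a restricted
handle cannot later be pointed at a different domid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder: the proposal does not assign an actual hypercall number. */
#define HYPERCALL_DMOP 0x100u

/* A hypercall as the privcmd driver sees it: number plus raw arguments. */
struct hypercall {
    uint64_t op;
    uint64_t arg[5];
};

/* Hypothetical per-open-file state in the privcmd driver. */
struct privcmd_file {
    bool restricted;
    uint16_t domid;             /* meaningful only if restricted */
};

/*
 * The restriction ioctl.  Assumption: it is one-way, so a restricted
 * handle cannot be retargeted at another domain.
 * Returns 0 on success, -1 if already restricted to a different domid.
 */
int privcmd_restrict(struct privcmd_file *f, uint16_t domid)
{
    assert(f != NULL);
    if (f->restricted && f->domid != domid)
        return -1;
    f->restricted = true;
    f->domid = domid;
    return 0;
}

/*
 * The whole filter: once restricted, allow only DMOP, and only when its
 * first argument is the permitted target domid.  Nothing here depends on
 * which individual DMOPs exist.
 */
bool privcmd_may_issue(const struct privcmd_file *f,
                       const struct hypercall *call)
{
    assert(f != NULL && call != NULL);
    if (!f->restricted)
        return true;
    return call->op == HYPERCALL_DMOP && call->arg[0] == f->domid;
}
```

The point of the sketch is that the check inspects only the hypercall
number and the first argument, so it would not need to change when
DMOPs are added or retired.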

DMOPs are not available to guests (other than stub device model
domains) and do not form part of the guest-stable ABI.  Where the set
of operations provided through DMOPs overlaps with guest-stable
hypercalls, identical functionality must be provided through both
parts of the hypercall namespace.

Privileged toolstack software is permitted to use DMOPs as well as
other hypercalls, of course.  So there is no need to duplicate
functionality between DMOPs and non-stable privileged toolstack
hypercalls.


On ABI/API stability:

For this scheme to work, it is not essential that the DMOPs themselves
should have a stable ABI.

However, we do want to be able to decouple qemu versions from Xen
versions.  This could be done by having the relevant bit of libxc (let
us suppose libdevicemodel) be capable of driving multiple versions of
Xen.  Or by having different libdevicemodel versions, one for each
version of Xen, and some kind of ad-hoc select-the-right-library
arrangement to cope with dual booting.
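
As a rough illustration of the first variant (one libdevicemodel driving
several Xen versions), the library could pick an ABI backend at startup
from the running Xen version.  This is only a sketch: the version
numbers, backend names and the function dm_select_backend are all
invented, and it assumes that a Xen newer than any known backend still
accepts the newest known ABI.

```c
#include <assert.h>
#include <stddef.h>

struct xen_version {
    unsigned major, minor;
};

/* One entry per DMOP ABI the library knows how to speak. */
struct dm_backend {
    struct xen_version since;   /* first Xen version using this ABI */
    const char *name;
};

/* Invented table, sorted by ascending 'since'. */
static const struct dm_backend backends[] = {
    { { 1, 0 }, "dmop-abi-a" },
    { { 1, 2 }, "dmop-abi-b" },
};

static int version_cmp(const struct xen_version *a,
                       const struct xen_version *b)
{
    if (a->major != b->major)
        return a->major < b->major ? -1 : 1;
    if (a->minor != b->minor)
        return a->minor < b->minor ? -1 : 1;
    return 0;
}

/*
 * Return the newest backend whose 'since' version is not later than the
 * running Xen, or NULL if the running Xen predates all known backends.
 */
const struct dm_backend *dm_select_backend(const struct xen_version *running)
{
    const struct dm_backend *best = NULL;
    size_t i;

    assert(running != NULL);
    for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
        if (version_cmp(&backends[i].since, running) <= 0)
            best = &backends[i];
    return best;
}
```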

Alternatively, old DMOP interfaces (ie, old DMOPs) could simply be
retained for a few Xen releases and then retired, providing a
semi-stable ABI to device model software.

In any case, probably the DMOP opcode needs to be a wide field so that
when new DMOPs, or new versions of old DMOPs, arise, we can assign
them new numbers.  (Alternatively we could have a version field in
every DMOP which is checked for equality, but that makes some
compatibility strategies more painful.)
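
A sketch of the wide-opcode approach, in C, may help.  Each new version
of an operation gets a fresh opcode; a retired one simply loses its
handler.  The opcode names, argument structs and dmop_dispatch are all
hypothetical, and the handlers only record which version ran, standing
in for real work.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Invented opcodes: three generations of one example operation. */
enum {
    DMOP_EXAMPLE_V1 = 1,        /* retired */
    DMOP_EXAMPLE_V2 = 2,        /* old, still retained */
    DMOP_EXAMPLE_V3 = 3,        /* current */
};

struct dmop_example_v2 { uint32_t value; };
struct dmop_example_v3 { uint32_t value; uint32_t flags; };

typedef int (*dmop_handler)(uint16_t domid, void *arg);

static int example_v2(uint16_t domid, void *arg)
{
    struct dmop_example_v2 *a = arg;

    assert(a != NULL);
    (void)domid;
    a->value = 2;               /* stand-in: record which handler ran */
    return 0;
}

static int example_v3(uint16_t domid, void *arg)
{
    struct dmop_example_v3 *a = arg;

    assert(a != NULL);
    (void)domid;
    a->value = 3;               /* stand-in: record which handler ran */
    a->flags = 0;
    return 0;
}

/* Retiring an old DMOP means dropping its entry; numbers are never reused. */
static const dmop_handler handlers[] = {
    [DMOP_EXAMPLE_V1] = NULL,
    [DMOP_EXAMPLE_V2] = example_v2,
    [DMOP_EXAMPLE_V3] = example_v3,
};

int dmop_dispatch(uint16_t domid, uint32_t op, void *arg)
{
    if (op >= sizeof(handlers) / sizeof(handlers[0]) || handlers[op] == NULL)
        return -ENOSYS;
    return handlers[op](domid, arg);
}
```

With this shape, a caller built against an older interface keeps
working for as long as its opcode is retained, and gets a clean error
afterwards, without any per-call version comparison.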


What do people think ?

Thanks,
Ian.

