From: John G Johnson <john.g.johnson@oracle.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	"Elena Ufimtseva" <elena.ufimtseva@oracle.com>,
	sstabellini@kernel.org, "Jag Raman" <jag.raman@oracle.com>,
	konrad.wilk@oracle.com, "Stefan Hajnoczi" <stefanha@gmail.com>,
	qemu-devel@nongnu.org, ross.lagerwall@citrix.com,
	liran.alon@oracle.com, kanth.ghatraju@oracle.com
Subject: Re: [Qemu-devel] [multiprocess RFC PATCH 36/37] multi-process: add the concept description to docs/devel/qemu-multiprocess
Date: Thu, 7 Mar 2019 15:29:41 -0800
Message-ID: <BDEBF2EE-DE0F-46CF-B60E-536B3DA9BF77@oracle.com>
In-Reply-To: <20190307192727.GG2915@stefanha-x1.localdomain>



> On Mar 7, 2019, at 11:27 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Thu, Mar 07, 2019 at 02:51:20PM +0000, Daniel P. Berrangé wrote:
>> On Thu, Mar 07, 2019 at 02:26:09PM +0000, Stefan Hajnoczi wrote:
>>> On Wed, Mar 06, 2019 at 11:22:53PM -0800, elena.ufimtseva@oracle.com wrote:
>>>> diff --git a/docs/devel/qemu-multiprocess.txt b/docs/devel/qemu-multiprocess.txt
>>>> new file mode 100644
>>>> index 0000000..e29c6c8
>>>> --- /dev/null
>>>> +++ b/docs/devel/qemu-multiprocess.txt
>>> 
>>> Thanks for this document and the interesting work that you are doing.
>>> I'd like to discuss the security advantages gained by disaggregating
>>> QEMU in more detail.
>>> 
>>> The security model for VMs managed by libvirt (most production x86, ppc,
>>> s390 guests) is that the QEMU process is untrusted and only has access
>>> to resources belonging to the guest.  SELinux is used to restrict the
>>> process from accessing other files, processes, etc on the host.
>> 
>> NB it doesn't have to be SELinux. Libvirt also supports AppArmor and
>> can even do isolation with traditional DAC by putting each QEMU under
>> a distinct UID/GID and having libvirtd set ownership on resources each
>> VM is permitted to use.
>> 
>>> QEMU does not hold privileged resources that must be kept away from the
>>> guest.  An escaped guest can access its image file, tap file descriptor,
>>> etc but they are the same resources it could already access via device
>>> emulation.
>>> 
>>> Can you give specific examples of how disaggregation improves security?
> 
> Elena & collaborators: Dan has posted some ideas but please share yours
> so the security benefits of this patch series can be better understood.
> 

	Dan covered the main point.  The security regime we use (SELinux)
constrains the actions of processes on objects, so having multiple processes
allows us to apply more fine-grained policies.


>> I guess one obvious answer is that the existing security mechanisms like
>> SELinux/AppArmor/DAC can be made to work in a more fine grained manner if
>> there are distinct processes. This would allow for a more useful seccomp
>> filter to better protect against secondary kernel exploits should QEMU
>> itself be exploited, if we can protect individual components.
> 
> Fine-grained sandboxing is possible in theory but tedious in practice.
> From what I can tell this patch series doesn't implement any sandboxing
> for child processes.
> 

	The policies aren’t in QEMU, but in the SELinux config files.
They would say, for example, that when the QEMU process exec()s the
disk emulation process, the process security context type transitions
to a new type.  This type would have permission to access the VM image
objects, whereas the QEMU process type (and any other device emulation
process types) cannot access them.
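
	As a rough illustration, the two decisions involved (which type the
process gets after exec(), and what that type may touch) can be modeled
as below.  This is a toy model in Python, not SELinux policy syntax, and
every type name in it (qemu_t, disk_emu_exec_t, disk_emu_t, vm_image_t)
is made up for the example.

```python
# Toy model of the SELinux decisions described above. This is NOT policy
# syntax, and every type name here is hypothetical.

# (source process type, executable file type) -> new process type
TYPE_TRANSITIONS = {
    ("qemu_t", "disk_emu_exec_t"): "disk_emu_t",
}

# (process type, object type, permission) triples that are allowed.
# Anything not listed is denied, mirroring SELinux's default-deny model.
ALLOW = {
    ("disk_emu_t", "vm_image_t", "read"),
    ("disk_emu_t", "vm_image_t", "write"),
}


def type_after_exec(proc_type, exec_type):
    """Return the process type after exec(); unchanged if no rule matches."""
    return TYPE_TRANSITIONS.get((proc_type, exec_type), proc_type)


def allowed(proc_type, obj_type, perm):
    """Default-deny access check against the ALLOW set."""
    return (proc_type, obj_type, perm) in ALLOW


if __name__ == "__main__":
    child = type_after_exec("qemu_t", "disk_emu_exec_t")
    print(child, allowed(child, "vm_image_t", "read"))
    print("qemu_t", allowed("qemu_t", "vm_image_t", "read"))
```

	The point of the model is that the QEMU process type never appears
in an allow rule for the image objects; only the transitioned type does.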

	If you wanted to use DAC, you could do something similar by
making the disk emulation executable setuid to a UID that can access
VM image files.
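
	A simplified sketch of that DAC arrangement follows.  It only models
the setuid bit and the owner/other read bits; it ignores groups, root,
ACLs and capabilities, and the UIDs and modes are assumptions picked for
the example.

```python
import stat

# Hypothetical UIDs, chosen only for this example.
QEMU_UID = 1001      # UID the main QEMU process runs as
DISK_EMU_UID = 1002  # UID that owns the disk emulation binary and images


def effective_uid_after_exec(caller_uid, exe_owner_uid, exe_mode):
    """With the setuid bit set, exec() switches the effective UID to the
    owner of the executable; otherwise the caller's UID is kept."""
    if exe_mode & stat.S_ISUID:
        return exe_owner_uid
    return caller_uid


def can_read(file_owner_uid, file_mode, proc_euid):
    """Simplified DAC read check: owner bits for the owner, 'other' bits
    for everyone else (groups, root and ACLs are deliberately ignored)."""
    if proc_euid == file_owner_uid:
        return bool(file_mode & stat.S_IRUSR)
    return bool(file_mode & stat.S_IROTH)


if __name__ == "__main__":
    image_mode = 0o600  # image readable and writable by its owner only
    euid = effective_uid_after_exec(QEMU_UID, DISK_EMU_UID, 0o4755)
    print("disk emulation can read image:",
          can_read(DISK_EMU_UID, image_mode, euid))
    print("QEMU can read image:",
          can_read(DISK_EMU_UID, image_mode, QEMU_UID))
```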

	In either case, the policies and permissions are set up before
libvirt even runs, so it doesn’t need to be aware of them.


> There must be a convenient way to get fine-grained sandboxing for
> disaggregated devices.  In other words, it shouldn't be left as an
> exercise to device process authors.
> 

	We can add some MAC or DAC suggestions in the documentation.


> How to do this in practice must be clear from the beginning if
> fine-grained sandboxing is the main selling point.
> 
> Some details to start the discussion:
> 
> * How will fine-grained SELinux/AppArmor/DAC policies be configured for
>   each process?  I guess this requires root, so does libvirt need to
>   know about each process?
> 

	The policies would apply to process security context types (or
UIDs in a DAC regime), so I would not expect libvirt to be aware of them.


> * We need to make sure that processes cannot send signals to each
>   other, ptrace, interfere in /proc/$PID, etc.  How will this be done?
> 

	Any process type restrictions would be enforced by SELinux.


> * Were you planning to use any other sandboxing mechanisms
>   (namespaces?)?  How will they be set up if the device process is
>   forked/executed by an unprivileged QEMU?
> 

	All of the QEMU-related processes belonging to a single VM will run
in the same container, but the container is created, along with its SELinux
policies, before libvirt is run.


>> Not everything is protected by MAC/DAC. For example network based disks
>> typically have a username + password for accessing the remote storage
>> server. Best practice would be a distinct username for every QEMU process
>> such that each can only access its own storage, but I don't know of any
>> app which does that. So ability to split off backends into separate
>> processes could limit exposure of information that is not otherwise
>> protected by current protection models.
> 
> If the disaggregated disk process with a global username + password is
> compromised then all your disk images are compromised.  So you still
> need to follow the best practice of per-VM credentials even with
> disaggregation, and if you do then disaggregation doesn't add anything!
> 

	You could put disk secrets in files that can only be read by the
disk emulation process type.  If you wanted even finer granularity, you
could use MCS to run each disk controller instance in a different security
context category, and make the secret files only readable by the corresponding
category.
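
	To make the MCS idea concrete, here is a small Python model of the
category check alone: a process can access a file only if its category
set contains every category on the file.  Type enforcement still applies
on top of this and is not modeled here; the category names are
hypothetical.

```python
def mcs_dominates(proc_categories, file_categories):
    """Category part of an MCS check: the process must hold every
    category that is attached to the file."""
    return frozenset(file_categories) <= frozenset(proc_categories)


if __name__ == "__main__":
    # Hypothetical layout: one category per disk controller instance.
    controller_a = {"c10"}
    controller_b = {"c11"}
    secret_for_a = {"c10"}

    print("A reads A's secret:", mcs_dominates(controller_a, secret_for_a))
    print("B reads A's secret:", mcs_dominates(controller_b, secret_for_a))
```

	With one category per controller instance, a compromised instance
would be limited to the secrets labeled with its own category.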

	Another layer of security would be to have network security policies
that only allow the disk emulation processes to connect to the storage servers.

								JJ
