xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrange" <berrange@redhat.com>
To: George Dunlap <george.dunlap@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Wei Liu <wei.liu2@citrix.com>, Doug Goldstein <cardoe@cardoe.com>,
	Xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [libvirt] Questions about virtlogd
Date: Wed, 8 Jun 2016 13:11:54 +0100	[thread overview]
Message-ID: <20160608121154.GH7760@redhat.com> (raw)
In-Reply-To: <5757FA29.10605@citrix.com>

On Wed, Jun 08, 2016 at 11:57:45AM +0100, George Dunlap wrote:
> 
> Well we definitely want to make it possible for people to use xl while
> still avoiding DoSes.  But at the simplest level this could be done by
> having qemu's stderr/stdout piped to /dev/null by default, and allowing
> an option for the admin to enable piping it to a file on a per-guest
> basis when necessary.
> 
> This would effectively be declaring a "proper solution" out-of-scope,
> while not opening up our users to security issues.

I hadn't thought of /dev/null as an option, that's a nice idea.


BTW, I want to raise another item as more general food for thought
which is somewhat related to this topic, albeit not useful as an
solution to use today.

If you consider the risk to be a compromised QEMU inflicting DoS
on the host filesystem, then the stderr/out logging and serial
port file writing is not the only attack vector. If you give QEMU
any kind of file based disk, then (unless you have quotas in place)
it can expand those disk files to arbitrary size - this is by design
of course, since QEMU needs to save snapshots inside qcow2, and has
the ability to resize virtual disks, etc.

At the same time, it would be nice to be able to have a possibility
too have a locked down solution. Conceptually it would be nice to be
able to place size limits on individual files that were open. It turns
out Linux introduced just such a facility under the concept "file sealing"

The motivation for this was kDBus which wanted a way to share tmpfs backed
file handles between processes with certain policy rules. This concept was
implemented using a new fcntl() call F_ADD_SEAL which allows a number of
flags to be set against a file descriptor:

  - SHRINK: If SEAL_SHRINK is set, the file in question cannot be reduced
            in size. This affects ftruncate() and open(O_TRUNC).
  - GROW: If SEAL_GROW is set, the file in question cannot be increased
          in size. This affects ftruncate(), fallocate() and write().
  - WRITE: If SEAL_WRITE is set, no write operations (besides resizing)
           are possible. This affects fallocate(PUNCH_HOLE), mmap() and
           write().
  - SEAL: If SEAL_SEAL is set, no further seals can be added to a file.
          This basically prevents the F_ADD_SEAL operation on a file and
          can be set to prevent others from adding further seals that you
          don't want.

The SEAL_GROW flag seems like exactly the kind of thing QEMU could make
use of. First of all we would have to assume that the QEMU disk image
is fully pre-allocated, ie a non-sparse raw file, or a qcow2 file which
has had its extents fully allocated. This is not unreasonable for many
usage scenarios. I could imagine -drive gaining a new parameter
'growable=yes|no'. eg

  $QEMU  -drive file=/some/image/vm.qcow2,growable=no

would cause QEMU to set SEAL_GROW when it open the disk image. After
that point it is not possible for a compromised QEMU to cause further
disk allocation on the /some/image filesystem. This of course relies
on SELinux/AppArmour/other-MAC to prevent QEMU opening/creating other
arbitrary files.

Hotplug is not so simple though. At the time we hotplug the disk we
have to assume QEMU is already hostile, so we can't really rely on
QEMU to honour the request to set SEAL_GROW. To deal with this we
would have to deny QEMU any "open" permission using MAC. Instead
libvirt (or equiv) would have to be responsible for opening disk
images, setting SEAL_GROW and passing the file descriptor onto the
running QEMU to use.

This all feels remarkably simple and useful, with one huge glaring
issue....

....the kernel only implemented support for F_ADD_SEAL on the tmpfs
filesystem, since that's all kDBus needs it for :-) To be useful for
QEMU/libxl/libvirt/etc we'd at least need it supported on ext4 and xfs,
and preferrably NFS too. I've no clue how hard this would be since I'm
not at all familiar with the kernel code.

So clearly this isn't something we can use at time in the forseeable
future, but if people are interested in attacking the more general
problem, it might be an approach worth investigating to see if it
really is viable for the future.

How it would relate to the bug we're talking about here, is that I
could imagine pre-allocating the logfile at say 128 KB in size, opening
it, setting the SEAL_GROW flag and then attaching it to QEMU stdout/err.
This would let QEMU write straight to the file as normal, but not let
it exceed 128 KB.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-06-08 12:12 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20160607121153.GL25922@citrix.com>
2016-06-07 13:21 ` [libvirt] Questions about virtlogd Daniel P. Berrange
2016-06-07 15:57   ` Wei Liu
2016-06-08  9:50     ` George Dunlap
     [not found]     ` <5757EA60.4030004@citrix.com>
2016-06-08 10:07       ` Daniel P. Berrange
     [not found]       ` <20160608100716.GD7760@redhat.com>
2016-06-08 10:57         ` George Dunlap
2016-06-08 11:53           ` Doug Goldstein
2016-06-08 12:46             ` Wei Liu
2016-06-08 13:05               ` George Dunlap
2016-06-08 13:09                 ` Wei Liu
2016-06-08 12:11           ` Daniel P. Berrange [this message]
2016-06-08 12:57             ` George Dunlap
2016-06-08 12:25         ` Wei Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160608121154.GH7760@redhat.com \
    --to=berrange@redhat.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=cardoe@cardoe.com \
    --cc=george.dunlap@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).