From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To: Daniel Stodden <Daniel.Stodden@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	"Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
	Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Subject: Re: Re: blktap: Sync with XCP, dropping zero-copy.
Date: Wed, 17 Nov 2010 12:35:59 +0000
Message-ID: <alpine.DEB.2.00.1011171139500.2373@kaball-desktop>
In-Reply-To: <1289961649.11102.1115.camel@agari.van.xensource.com>

On Wed, 17 Nov 2010, Daniel Stodden wrote:
> I'm not against reducing code and effort. But in order to switch to a
> different base we would need a drop-in match for VHD and at least a good
> match for all the control machinery on which xen-sm presently depends.
> There's also a lot of investment in filter drivers etc.
> 

I am hoping we don't actually have to rewrite all that code, and that we
can instead refactor it to fit the qemu driver APIs.


> Then there is SM control, stuff like pause/unpause to get guests off the
> storage nodes for snapshot/coalesce, more recently calls for statistics
> and monitoring, tweaking some physical I/O details, etc. Used to be a
> bitch, nowadays it's somewhat simpler, but that's all stuff we
> completely depend on.
> 

Upstream qemu has an RPC interface called QMP that can be used to issue
commands and retrieve information. For example, snapshots are already
supported through this interface.
We need to support QMP one way or another, even if only for the pure qemu
emulation use case, so we might as well exploit it.
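
To give an idea of what a toolstack would be dealing with, here is a
minimal sketch that assumes only the basic QMP wire format: JSON objects,
a greeting sent by qemu on connect, and a mandatory qmp_capabilities
handshake before any other command. The helper names are made up for
illustration, and nothing here opens a real monitor socket:

```python
import json


def qmp_command(name, arguments=None, cmd_id=None):
    """Serialize one QMP command as a single JSON line (hypothetical helper)."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    if cmd_id is not None:
        # QMP echoes "id" back in the matching reply.
        msg["id"] = cmd_id
    return json.dumps(msg) + "\n"


def parse_reply(line):
    """Classify one JSON line received from the QMP server."""
    msg = json.loads(line)
    if "QMP" in msg:
        return ("greeting", msg["QMP"])   # sent once, right after connect
    if "return" in msg:
        return ("return", msg["return"])  # command succeeded
    if "error" in msg:
        return ("error", msg["error"])    # command failed
    if "event" in msg:
        return ("event", msg["event"])    # asynchronous notification
    raise ValueError("unrecognized QMP message: %r" % (msg,))


# The handshake has to come first, then ordinary commands can follow.
handshake = qmp_command("qmp_capabilities")
query = qmp_command("query-block", cmd_id=1)
```

In a real client these strings would be written to, and the replies read
from, whatever monitor socket qemu was started with.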


> Moving blkback out of kernel space, into tapdisk, is predictable in size
> and complexity. Replacing tapdisks altogether would be quite a different
> story.
> 

This is not a one-day project; we have time for this.
I was thinking about starting to use blkback qemu (with the current
qemu-xen) as a fallback when blkback2 is not present, mainly to allow
developers to work with upstream 2.6.37.
Meanwhile, in the next few months upstream qemu should apply our patches,
so we could start using upstream qemu for development with xen-unstable.
Upstream qemu will offer much better aio support and QMP.
At that point we could start adding a VHD driver to upstream qemu and,
gradually, everything else we need. By the time 2.6.38/39 is out we could
have a proper backend with VHD support.
What do you think?


> The remainder below isn't fully qualified, just random bits coming to my
> mind, assuming you're not talking about sharing code/libs and
> frameworks, but actual processes.
> 
> 1st, what's the rationale with fully PV'd guests on xen? (That argument
> might not count if just taking qemu as the container process and
> stripping emulation for those.)
> 

PV guests already need qemu for the console and framebuffer backends, so
it fits into the current picture without modifications.


> Related, there's the question of memory footprint. Kernel blkback is
> extremely lightweight. Moving the datapath into userland can create
> headaches, especially on 32bit dom0s with lot of guests and disks on
> backends which used to be bare LUNs under blkback. That's a problem
> tapdisk has to face too, just wondering about the size of the issue in
> qemu.

The PV qemu (the qemu run for PV guests) is very different from the HVM
qemu: it does very little and only runs the backends. I expect its memory
footprint to be really small.


> 
> Related, Xapi depends a lot on dom0 plugs, where the datapath can be
> somewhat hairy when it comes to blocking I/O and resource allocation.
> 

I am not sure what you mean here.


> Then there is sharing. Storage activation normally doesn't operate in a
> specific VM context. It presently doesn't even relate to a particular
> VBD, much less a VM. For qemu alone, putting storage virtualization into
> the same address space is an obvious choice. For Xen, enforcing that
> sounds like a step backward.

Having the backend in qemu and running it in a VM context are two
different things. Qemu is very flexible in this regard: with a one-line
change (or maybe just different command line options) you can have qemu
doing hardware emulation, PV backends, both, or only some PV backends.
You could have:

- 1 qemu doing hardware emulation and PV backend;
- 1 qemu doing hardware emulation and another one doing PV backends;
- 1 qemu doing hardware emulation and 1 qemu doing some PV backends and
1 qemu doing the other PV backends;

and so on.
The only thing that cannot be easily split at the moment is the hardware
emulation, but the backends are completely modular.

> 
> From the shared-framework perspective, and the amount of code involved:
> The ring path alone is too small to consider, and the more difficult
> parts on top of that like state machines for write ordering and syncing
> etc are hard to share because they depend on the queue implementation and
> image driver interface. 
> 

Yeah, if you are thinking about refactoring blktap2 into libraries and
using them in both the standalone tapdisk case and the qemu case, it is
probably not worth it.
In any case, in qemu they do not depend on the image driver interface,
because it is generic.


> Control might be a different story. As far as frontend/backend IPC via
> xenstore goes, right now I still feel like those backends could be
> managed by a single daemon, similar to what blktapctrl did (let's just
> make it stateless/restartable this time). I guess qemu processes run
> their xenstore trees already fine, but internally?

Yes, they do. There is a generic xen_backend interface that adds support
for Xen frontend/backend pairs. Using xen_backend each qemu instance
listens to its own xenstore backend path.
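
As a rough illustration of "its own xenstore backend path", the sketch
below assumes the usual xenstore layout for backends,
/local/domain/<backend-domid>/backend/<type>/<frontend-domid>/<devid>.
The function names and the "qdisk" type string are only illustrative
assumptions, and no xenstore library is involved:

```python
def backend_dir(backend_domid, dev_type, frontend_domid):
    """Directory one qemu instance would watch for a given guest and type."""
    return "/local/domain/%d/backend/%s/%d" % (
        backend_domid, dev_type, frontend_domid)


def device_path(backend_domid, dev_type, frontend_domid, devid):
    """Backend node of a single device underneath that directory."""
    return "%s/%d" % (backend_dir(backend_domid, dev_type, frontend_domid),
                      devid)


def is_under(watch_dir, changed_path):
    """True if a changed xenstore path belongs to the watched directory."""
    return changed_path == watch_dir or changed_path.startswith(watch_dir + "/")


# Two guests served from dom0: each instance only reacts to its own subtree.
guest5 = backend_dir(0, "qdisk", 5)
guest7 = backend_dir(0, "qdisk", 7)
changed = device_path(0, "qdisk", 5, 51712) + "/state"
```

Here is_under(guest5, changed) holds while is_under(guest7, changed) does
not, which is the sense in which the instances stay out of each other's
way.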
