* Upstream QEMU based stubdom and rump kernel
@ 2015-03-17 14:29 Wei Liu
  2015-03-17 14:54 ` Ian Campbell
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Wei Liu @ 2015-03-17 14:29 UTC (permalink / raw)
  To: rumpkernel-users
  Cc: wei.liu2, Ian Campbell, Stefano Stabellini, Ian Jackson,
	xen-devel, Anthony PERARD

Hi all

I'm now working on upstream QEMU stubdom, and rump kernel seems to be a
good fit for this purpose.

A bit of background information. A stubdom is a service domain.  With QEMU
stubdom we are able to run QEMU device emulation code in a separate
domain so that bugs in QEMU don't affect Dom0 (the controlling domain).
Xen currently has a QEMU stubdom, but it's based on our fork of ancient
QEMU (plus some other libraries and mini-os). Eventually we would like
to use upstream QEMU in stubdom.

I've now successfully built QEMU upstream with rump kernel. However to
make it fully functional as a stubdom, there are some missing pieces to
be added in.

1. The ability to access QMP socket (a unix socket) from Dom0. That
   will be used to issue commands to QEMU.
2. The ability to access files in Dom0. That will be used to write to /
   read from QEMU state file.
3. The building process requires mini-os headers. That will be used
   to build libxc (the controlling library).

(Xen folks, did I miss anything?)

One of the lessons I learned from the existing stubdom work is that I
should work with upstream and produce maintainable code. So before I do
anything for real I'd better consult the community. My gut feeling is
that the first two requirements are not really Xen specific. Let me know
what you guys plan and think.

Wei.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:29 Upstream QEMU based stubdom and rump kernel Wei Liu
@ 2015-03-17 14:54 ` Ian Campbell
  2015-03-17 14:57   ` Wei Liu
  2015-03-17 15:15 ` Anthony PERARD
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2015-03-17 14:54 UTC (permalink / raw)
  To: Wei Liu
  Cc: Anthony PERARD, rumpkernel-users, Ian Jackson,
	Stefano Stabellini, xen-devel

On Tue, 2015-03-17 at 14:29 +0000, Wei Liu wrote:
> 2. The ability to access files in Dom0. That will be used to write to /
>    read from QEMU state file.

This requirement is not as broad as you make it sound.

All that is really required is the ability to slurp in or write out a
blob of bytes to a service running in a control domain, not actual
ability to read/write files in dom0 (which would need careful security
consideration!).

For the old qemu-traditional stubdom, for example, this is implemented as
a pair of console devices (one r/o for restore + one w/o for save) which
are set up by the toolstack at start of day and pre-plumbed into two
temporary files.

Ian.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:54 ` Ian Campbell
@ 2015-03-17 14:57   ` Wei Liu
  2015-03-17 15:07     ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2015-03-17 14:57 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, xen-devel,
	rumpkernel-users, Anthony PERARD

On Tue, Mar 17, 2015 at 02:54:09PM +0000, Ian Campbell wrote:
> On Tue, 2015-03-17 at 14:29 +0000, Wei Liu wrote:
> > 2. The ability to access files in Dom0. That will be used to write to /
> >    read from QEMU state file.
> 
> This requirement is not as broad as you make it sound.
> 

Yes. You're right.

> All which is really required is the ability to slurp in or write out a
> blob of bytes to a service running in a control domain, not actual

This is more accurate.

> ability to read/write files in dom0 (which would need careful security
> consideration!).
> 
> For the old qemu-traditional stubdom for example this is implemented as
> a pair of console devices (one r/o for restore + one w/o for save) which
> are setup by the toolstack at start of day and pre-plumbed into two
> temporary files.
> 

Unfortunately I don't think that hack in mini-os is upstreamable in rump
kernel.

Wei.

> Ian.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:57   ` Wei Liu
@ 2015-03-17 15:07     ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2015-03-17 15:07 UTC (permalink / raw)
  To: Wei Liu
  Cc: Anthony PERARD, rumpkernel-users, Ian Jackson,
	Stefano Stabellini, xen-devel

On Tue, 2015-03-17 at 14:57 +0000, Wei Liu wrote:
> On Tue, Mar 17, 2015 at 02:54:09PM +0000, Ian Campbell wrote:
> > On Tue, 2015-03-17 at 14:29 +0000, Wei Liu wrote:
> > > 2. The ability to access files in Dom0. That will be used to write to /
> > >    read from QEMU state file.
> > 
> > This requirement is not as broad as you make it sound.
> > 
> 
> Yes. You're right.
> 
> > All which is really required is the ability to slurp in or write out a
> > blob of bytes to a service running in a control domain, not actual
> 
> This is more accurate.

It's probably also worth mentioning that it is a streaming read or
write; there is no need to support seek or such things.

> > ability to read/write files in dom0 (which would need careful security
> > consideration!).
> > 
> > For the old qemu-traditional stubdom for example this is implemented as
> > a pair of console devices (one r/o for restore + one w/o for save) which
> > are setup by the toolstack at start of day and pre-plumbed into two
> > temporary files.
> > 
> 
> Unfortunately I don't think that hack in mini-os is upstreamable in rump
> kernel.

The mini-os implementation is hacky; ultimately it is just a way of
implementing open("/dev/hvc1", "r") without actually needing the
infrastructure that would normally sit behind such a call.

But the concept of "open a r/o device and read from it" (or vice versa)
doesn't seem to be too bad to me and I expected rumpkernels to have some
sort of concept like this somewhere.
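
To make the "streaming only" point concrete, here is a minimal Python
sketch. It is not the mini-os or rump kernel code: a pipe stands in for the
pre-plumbed read-only console, and the helper name slurp_stream is made up
for illustration. The consumer only ever reads until EOF and never seeks.

```python
import os


def slurp_stream(fd, chunk_size=4096):
    """Read everything from a read-only descriptor until EOF, without seeking."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # an empty read means the writer closed its end
            break
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # A pipe stands in for the pre-plumbed console: it cannot seek at all.
    read_end, write_end = os.pipe()
    os.write(write_end, b"fake device state blob")
    os.close(write_end)
    print(slurp_stream(read_end))
    os.close(read_end)
```

The save direction would be the mirror image: a write-only descriptor that
is written sequentially and then closed.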

Ian.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:29 Upstream QEMU based stubdom and rump kernel Wei Liu
  2015-03-17 14:54 ` Ian Campbell
@ 2015-03-17 15:15 ` Anthony PERARD
  2015-03-17 15:23   ` Stefano Stabellini
                     ` (2 more replies)
  2015-03-17 16:06 ` Antti Kantee
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 26+ messages in thread
From: Anthony PERARD @ 2015-03-17 15:15 UTC (permalink / raw)
  To: Wei Liu
  Cc: rumpkernel-users, Ian Jackson, Ian Campbell, Stefano Stabellini,
	xen-devel

On Tue, Mar 17, 2015 at 02:29:07PM +0000, Wei Liu wrote:
> I've now successfully built QEMU upstream with rump kernel. However to
> make it fully functional as a stubdom, there are some missing pieces to
> be added in.
> 
> 1. The ability to access QMP socket (a unix socket) from Dom0. That
>    will be used to issue command to QEMU.

The QMP "socket" does not need to be a unix socket. It can be any of
these (from qemu --help):
Character device options:
-chardev null,id=id[,mux=on|off]
-chardev socket,id=id[,host=host],port=port[,to=to][,ipv4][,ipv6][,nodelay][,reconnect=seconds]
         [,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off] (tcp)
-chardev socket,id=id,path=path[,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off] (unix)
-chardev udp,id=id[,host=host],port=port[,localaddr=localaddr]
         [,localport=localport][,ipv4][,ipv6][,mux=on|off]
-chardev msmouse,id=id[,mux=on|off]
-chardev vc,id=id[[,width=width][,height=height]][[,cols=cols][,rows=rows]]
         [,mux=on|off]
-chardev ringbuf,id=id[,size=size]
-chardev file,id=id,path=path[,mux=on|off]
-chardev pipe,id=id,path=path[,mux=on|off]
-chardev pty,id=id[,mux=on|off]
-chardev stdio,id=id[,mux=on|off][,signal=on|off]
-chardev serial,id=id,path=path[,mux=on|off]
-chardev tty,id=id,path=path[,mux=on|off]
-chardev parallel,id=id,path=path[,mux=on|off]
-chardev parport,id=id,path=path[,mux=on|off]
-chardev spicevmc,id=id,name=name[,debug=debug]
-chardev spiceport,id=id,name=name[,debug=debug]
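
Whichever chardev carries it, the QMP conversation itself is just
newline-delimited JSON over a byte stream. Below is a rough Python sketch of
the client side, written against any file-like stream so the transport can
be swapped. The greeting and qmp_capabilities exchange follow the QMP
protocol as I understand it; fake_qmp_server is only a toy stand-in for
QEMU, and all helper names are invented for this example.

```python
import json
import socket
import threading


def qmp_command(stream, name, arguments=None):
    """Send one QMP command on a file-like stream and return its reply."""
    request = {"execute": name}
    if arguments is not None:
        request["arguments"] = arguments
    stream.write((json.dumps(request) + "\n").encode())
    stream.flush()
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP peer closed the connection")
        reply = json.loads(line)
        if "event" not in reply:  # skip asynchronous events
            return reply


def qmp_handshake(stream):
    """Read the greeting, then leave capabilities negotiation mode."""
    greeting = json.loads(stream.readline())
    if "QMP" not in greeting:
        raise ValueError("not a QMP greeting")
    return qmp_command(stream, "qmp_capabilities")


def fake_qmp_server(sock):
    """Toy stand-in for QEMU: greet, then answer every command with {}."""
    with sock, sock.makefile("rwb") as stream:
        stream.write(b'{"QMP": {"version": {}, "capabilities": []}}\n')
        stream.flush()
        for line in stream:
            json.loads(line)  # parse only to check that it is valid JSON
            stream.write(b'{"return": {}}\n')
            stream.flush()


if __name__ == "__main__":
    client_sock, server_sock = socket.socketpair()
    threading.Thread(
        target=fake_qmp_server, args=(server_sock,), daemon=True
    ).start()
    with client_sock, client_sock.makefile("rwb") as stream:
        print(qmp_handshake(stream))
```

Because qmp_command only needs readline/write/flush, the same client code
would work unchanged over a tcp socket, a pipe or a console-backed stream.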

> 2. The ability to access files in Dom0. That will be used to write to /
>    read from QEMU state file.

To save a QEMU state (write), we do use a filename. But I guess we could
expand the QMP command (xen-save-devices-state) to use something else, if
it's easier.

To restore, we provide a file descriptor from libxl to QEMU, with the fd
open on the file that contains the state we want to restore. But there are
a few other ways to load a state (from qemu.git/docs/migration.txt):
- tcp migration: do the migration using tcp sockets
- unix migration: do the migration using unix sockets
- exec migration: do the migration using the stdin/stdout through a process.
- fd migration: do the migration using a file descriptor that is
  passed to QEMU.  QEMU doesn't care how this file descriptor is opened.
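
The fd variant is the easiest one to illustrate. In the Python sketch below
a child interpreter stands in for QEMU and a pipe stands in for whatever
libxl would open; the child only receives a descriptor number and reads
from it until EOF. restore_from_fd and CHILD_SOURCE are hypothetical names,
and this is not how libxl actually spawns QEMU.

```python
import os
import subprocess
import sys

# Child program: reads the whole stream from the fd number given in argv[1]
# and prints how many bytes it received.
CHILD_SOURCE = (
    "import os, sys\n"
    "fd = int(sys.argv[1])\n"
    "data = b''\n"
    "while True:\n"
    "    chunk = os.read(fd, 4096)\n"
    "    if not chunk:\n"
    "        break\n"
    "    data += chunk\n"
    "sys.stdout.write(str(len(data)))\n"
)


def restore_from_fd(state):
    """Hand a child process an inherited fd and stream `state` into it."""
    read_end, write_end = os.pipe()
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE, str(read_end)],
        pass_fds=(read_end,),  # only this descriptor is inherited
        stdout=subprocess.PIPE,
    )
    os.close(read_end)  # the parent keeps only the write side
    with os.fdopen(write_end, "wb") as writer:
        writer.write(state)
    output, _ = child.communicate()
    return int(output)


if __name__ == "__main__":
    print(restore_from_fd(b"fake device state"))
```

The child never learns whether the descriptor is a file, a pipe or a
socket, which is the property that makes this option attractive here.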

-- 
Anthony PERARD


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 15:15 ` Anthony PERARD
@ 2015-03-17 15:23   ` Stefano Stabellini
  2015-03-17 15:27   ` Wei Liu
  2015-03-19 11:16   ` Ian Campbell
  2 siblings, 0 replies; 26+ messages in thread
From: Stefano Stabellini @ 2015-03-17 15:23 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson,
	xen-devel, rumpkernel-users

On Tue, 17 Mar 2015, Anthony PERARD wrote:
> On Tue, Mar 17, 2015 at 02:29:07PM +0000, Wei Liu wrote:
> > I've now successfully built QEMU upstream with rump kernel. However to
> > make it fully functional as a stubdom, there are some missing pieces to
> > be added in.
> > 
> > 1. The ability to access QMP socket (a unix socket) from Dom0. That
> >    will be used to issue command to QEMU.
> 
> The QMP "socket" does not need to be a unix socket. It can be any of
> these (from qemu --help):
> Character device options:
> -chardev null,id=id[,mux=on|off]
> -chardev socket,id=id[,host=host],port=port[,to=to][,ipv4][,ipv6][,nodelay][,reconnect=seconds]
>          [,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off] (tcp)
> -chardev socket,id=id,path=path[,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off] (unix)
> -chardev udp,id=id[,host=host],port=port[,localaddr=localaddr]
>          [,localport=localport][,ipv4][,ipv6][,mux=on|off]
> -chardev msmouse,id=id[,mux=on|off]
> -chardev vc,id=id[[,width=width][,height=height]][[,cols=cols][,rows=rows]]
>          [,mux=on|off]
> -chardev ringbuf,id=id[,size=size]
> -chardev file,id=id,path=path[,mux=on|off]
> -chardev pipe,id=id,path=path[,mux=on|off]
> -chardev pty,id=id[,mux=on|off]
> -chardev stdio,id=id[,mux=on|off][,signal=on|off]
> -chardev serial,id=id,path=path[,mux=on|off]
> -chardev tty,id=id,path=path[,mux=on|off]
> -chardev parallel,id=id,path=path[,mux=on|off]
> -chardev parport,id=id,path=path[,mux=on|off]
> -chardev spicevmc,id=id,name=name[,debug=debug]
> -chardev spiceport,id=id,name=name[,debug=debug]
> 
> > 2. The ability to access files in Dom0. That will be used to write to /
> >    read from QEMU state file.
> 
> To save a QEMU state (write), we do use a filename. But I guess we could
> expand the QMP command (xen-save-devices-state) to use something else, if
> it's easier.
> 
> To restore, we provide a file descriptor from libxl to QEMU, with the fd
> open on the file that contains the state we want to restore. But there are
> a few other ways to load a state (from qemu.git/docs/migration.txt):
> - tcp migration: do the migration using tcp sockets
> - unix migration: do the migration using unix sockets
> - exec migration: do the migration using the stdin/stdout through a process.
> - fd migration: do the migration using a file descriptor that is
>   passed to QEMU.  QEMU doesn't care how this file descriptor is opened.

QEMU would definitely be happy if we started using fds instead of files
to save/restore the state on Xen.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 15:15 ` Anthony PERARD
  2015-03-17 15:23   ` Stefano Stabellini
@ 2015-03-17 15:27   ` Wei Liu
  2015-03-17 15:38     ` Ian Campbell
  2015-03-19 11:16   ` Ian Campbell
  2 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2015-03-17 15:27 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson,
	xen-devel, rumpkernel-users

On Tue, Mar 17, 2015 at 03:15:17PM +0000, Anthony PERARD wrote:
> On Tue, Mar 17, 2015 at 02:29:07PM +0000, Wei Liu wrote:
> > I've now successfully built QEMU upstream with rump kernel. However to
> > make it fully functional as a stubdom, there are some missing pieces to
> > be added in.
> > 
> > 1. The ability to access QMP socket (a unix socket) from Dom0. That
> >    will be used to issue command to QEMU.
> 
> The QMP "socket" does not need to be a unix socket. It can be any of
> these (from qemu --help):
> Character device options:
> -chardev null,id=id[,mux=on|off]
> -chardev socket,id=id[,host=host],port=port[,to=to][,ipv4][,ipv6][,nodelay][,reconnect=seconds]
>          [,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off] (tcp)
> -chardev socket,id=id,path=path[,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off] (unix)
> -chardev udp,id=id[,host=host],port=port[,localaddr=localaddr]
>          [,localport=localport][,ipv4][,ipv6][,mux=on|off]
> -chardev msmouse,id=id[,mux=on|off]
> -chardev vc,id=id[[,width=width][,height=height]][[,cols=cols][,rows=rows]]
>          [,mux=on|off]
> -chardev ringbuf,id=id[,size=size]
> -chardev file,id=id,path=path[,mux=on|off]
> -chardev pipe,id=id,path=path[,mux=on|off]
> -chardev pty,id=id[,mux=on|off]
> -chardev stdio,id=id[,mux=on|off][,signal=on|off]
> -chardev serial,id=id,path=path[,mux=on|off]
> -chardev tty,id=id,path=path[,mux=on|off]
> -chardev parallel,id=id,path=path[,mux=on|off]
> -chardev parport,id=id,path=path[,mux=on|off]
> -chardev spicevmc,id=id,name=name[,debug=debug]
> -chardev spiceport,id=id,name=name[,debug=debug]
> 

Ha, thanks for the list. My brain was too locked in to the current
implementation.

So yes, we now have an array of possible transports at our disposal.

> > 2. The ability to access files in Dom0. That will be used to write to /
> >    read from QEMU state file.
> 
> To save a QEMU state (write), we do use a filename. But I guess we could
> expand the QMP command (xen-save-devices-state) to use something else, if
> it's easier.
> 

That's also an option.

> To restore, we provide a file descriptor from libxl to QEMU, with the fd
> open on the file that contains the state we want to restore. But there are
> a few other ways to load a state (from qemu.git/docs/migration.txt):
> - tcp migration: do the migration using tcp sockets
> - unix migration: do the migration using unix sockets
> - exec migration: do the migration using the stdin/stdout through a process.

This looks most interesting as it implies we can easily pipe a console
to it.

Wei.

> - fd migration: do the migration using a file descriptor that is
>   passed to QEMU.  QEMU doesn't care how this file descriptor is opened.
> 
> -- 
> Anthony PERARD


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 15:27   ` Wei Liu
@ 2015-03-17 15:38     ` Ian Campbell
  2015-03-18 11:24       ` Martin Lucina
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2015-03-17 15:38 UTC (permalink / raw)
  To: Wei Liu
  Cc: Anthony PERARD, rumpkernel-users, Ian Jackson,
	Stefano Stabellini, xen-devel

On Tue, 2015-03-17 at 15:27 +0000, Wei Liu wrote:
> This looks most interesting as it implies we can easily pipe a console
> to it.

BTW, rather than raw consoles we should probably consider using the
channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt

Ian.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:29 Upstream QEMU based stubdom and rump kernel Wei Liu
  2015-03-17 14:54 ` Ian Campbell
  2015-03-17 15:15 ` Anthony PERARD
@ 2015-03-17 16:06 ` Antti Kantee
  2015-03-18 11:22   ` Martin Lucina
  2015-03-18 11:20 ` Martin Lucina
  2015-03-19  0:19 ` Samuel Thibault
  4 siblings, 1 reply; 26+ messages in thread
From: Antti Kantee @ 2015-03-17 16:06 UTC (permalink / raw)
  To: wei.liu2, rumpkernel-users
  Cc: Anthony PERARD, Stefano Stabellini, Ian Jackson, Ian Campbell, xen-devel

On 17/03/15 14:29, Wei Liu wrote:
> I've now successfully built QEMU upstream with rump kernel. However to
> make it fully functional as a stubdom, there are some missing pieces to
> be added in.
>
> 1. The ability to access QMP socket (a unix socket) from Dom0. That
>     will be used to issue command to QEMU.
> 2. The ability to access files in Dom0. That will be used to write to /
>     read from QEMU state file.

There's a way to map file access to rump kernel hypercalls with a 
facility called etfs (extra-terrestrial file system).  In fact, the 
current implementation for accessing the Xen block device from the rump 
kernel is done using etfs (... historical reasons, I'd have to go back 
5+ years to explain why it doesn't attach as a regular block device).

etfs isn't a file system, e.g. it doesn't allow listing files or 
removing them, but it does give you complete control of what happens 
when data is read or written for /some/path.  But based on the other 
posts, sounds like it might be enough for what you need.
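
As a purely conceptual illustration of that model, and explicitly not the
rump_etfs API, here is a small Python sketch: a registry where each
registered path maps to its own read and write handlers, and where nothing
other than exact registered keys can be looked up. PathHooks and its
methods are names invented for this example.

```python
class PathHooks:
    """Toy registry: each registered path gets its own read/write handlers."""

    def __init__(self):
        self._hooks = {}

    def register(self, path, on_read=None, on_write=None):
        self._hooks[path] = (on_read, on_write)

    def read(self, path, offset, length):
        on_read, _ = self._lookup(path)
        if on_read is None:
            raise PermissionError(f"{path} is write-only")
        return on_read(offset, length)

    def write(self, path, offset, data):
        _, on_write = self._lookup(path)
        if on_write is None:
            raise PermissionError(f"{path} is read-only")
        return on_write(offset, data)

    def _lookup(self, path):
        # Only exact registered keys resolve; there is no listing or unlink.
        if path not in self._hooks:
            raise FileNotFoundError(path)
        return self._hooks[path]


if __name__ == "__main__":
    blob = b"device state"
    saved = bytearray()

    def append_to_saved(offset, data):
        saved.extend(data)
        return len(data)

    hooks = PathHooks()
    hooks.register("/state/restore", on_read=lambda off, n: blob[off:off + n])
    hooks.register("/state/save", on_write=append_to_saved)
    print(hooks.read("/state/restore", 0, 6))
    hooks.write("/state/save", 0, b"new state")
    print(bytes(saved))
```

In the real thing the handlers would be hypercalls rather than Python
callables, but the shape (one key, full control over read and write) is
the part that matters for the state file use case.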

See:
http://man.netbsd.org/cgi-bin/man-cgi?rump_etfs++NetBSD-current

> 3. The building process requires mini-os headers. That will be used
>     to build libxc (the controlling library).

That's not really a problem, though I do want to limit the amount of 
interface we claim to support with rump kernels.  For example, ISTR you 
mentioned on irc you'd like to use minios wait.h.  It would be better to 
use pthread synchronization instead of minios synchronization.  That 
way, if we do have a need to change the underlying threading in the 
future, you won't run into trouble.

So, we should just determine what is actually needed and expose those 
bits by default.

> One of my lessons learned from the existing stubdom stuffs is that I
> should work with upstream and produce maintainable code. So before I do
> anything for real I'd better consult the community. My gut feeling is
> that the first two requirements are not really Xen specific. Let me know
> what you guys plan and think.

Yes, please.  If there's something silly going on, it's most likely due to:

1) we didn't get that far in our experiments and weren't aware of it
2) we were aware, but some bits were even sillier, taking priority

Either way, a real need is a definite reason to expedite fixing.

   - antti


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:29 Upstream QEMU based stubdom and rump kernel Wei Liu
                   ` (2 preceding siblings ...)
  2015-03-17 16:06 ` Antti Kantee
@ 2015-03-18 11:20 ` Martin Lucina
  2015-03-18 19:05   ` Anil Madhavapeddy
  2015-03-19  0:19 ` Samuel Thibault
  4 siblings, 1 reply; 26+ messages in thread
From: Martin Lucina @ 2015-03-18 11:20 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, Stefano Stabellini, Anil Madhavapeddy, Ian Jackson,
	Richard Mortier, xen-devel, rumpkernel-users, Thomas Gazagnaire,
	Anthony PERARD

(Adding some of the Mirage folks to Cc:)

wei.liu2@citrix.com said:
> Hi all
> 
> I'm now working on upstream QEMU stubdom, and rump kernel seems to be a
> good fit for this purpose.
> 
> A bit background information. A stubdom is a service domain.  With QEMU
> stubdom we are able to run QEMU device emulation code in a separate
> domain so that bugs in QEMU don't affect Dom0 (the controlling domain).
> Xen currently has a QEMU stubdom, but it's based on our fork of ancient
> QEMU (plus some other libraries and mini-os). Eventually we would like
> to use upstream QEMU in stubdom.
> 
> I've now successfully built QEMU upstream with rump kernel. However to
> make it fully functional as a stubdom, there are some missing pieces to
> be added in.
> 
> 1. The ability to access QMP socket (a unix socket) from Dom0. That
>    will be used to issue command to QEMU.
> 2. The ability to access files in Dom0. That will be used to write to /
>    read from QEMU state file.

As I understand from Stefano's and Anthony's replies in this thread, both
of the above can be implemented using an AF_UNIX or AF_INET socket on the
QEMU end. Such an implementation would not require anything special done in
QEMU, just telling it which socket to use using existing mechanisms.

So, let's step back a bit: What we need is a trusted communication channel
from a Rump Kernel domU to dom0, using existing socket or socket-like[*]
APIs at both the domU and dom0 ends.

This fits in with a couple of things I hope to make time to work on in the
next couple of months:

 1. Introspection of Rump Kernel domUs for ops purposes, i.e. get some
    basic "ps", "top", "vmstat"-like information about what the domU is
    doing from the dom0.

 2. Connecting up multiple Rump Kernel domUs and/or Mirage domUs. The
    general idea here is that you can have e.g. a Mirage domU running a
    HTTP+TLS frontend, communicating with a Rump Kernel domU running PHP +
    FastCGI.

    The Mirage folks are already doing something similar in their 
    Jitsu work, using a protocol called Conduit which runs over vchan.

Now, both of the above require exactly the same underlying mechanism.

Point 2. will further require implementing support in the Rump Kernel,
either for a shim which would proxy AF_UNIX / AF_INET transparently using
vchan, or possibly later implementing a separate socket family (AF_VCHAN /
AF_HYPER?). Once that is done you should be able to just drop it in to
QEMU on Rump.
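
To give an idea of what I mean by a transparent shim, here is a rough
Python sketch of a byte relay between two connected endpoints. Both ends
are ordinary socket pairs here; in a real shim the "transport" side would
be vchan, which this example does not attempt to model. pump and
start_proxy are made-up names.

```python
import socket
import threading


def pump(src, dst):
    """Copy bytes from src to dst until src reaches EOF, then half-close dst."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the other side may already be gone


def start_proxy(app_side, transport_side):
    """Relay traffic in both directions between two connected sockets."""
    threads = [
        threading.Thread(target=pump, args=(app_side, transport_side), daemon=True),
        threading.Thread(target=pump, args=(transport_side, app_side), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads


if __name__ == "__main__":
    app, proxy_a = socket.socketpair()    # application <-> shim
    proxy_b, peer = socket.socketpair()   # shim <-> stand-in transport
    start_proxy(proxy_a, proxy_b)
    app.sendall(b"ping")
    print(peer.recv(4))
    peer.sendall(b"pong")
    print(app.recv(4))
```

The application keeps using plain socket calls on its side, which is the
whole point: nothing in QEMU would need to know what sits behind the shim.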

[*] Aside: What I mean by socket-like is that the implementation does not
need to be in the dom0 kernel, it can just be a user-space library. For
example, see the nanomsg or ZeroMQ APIs, which I have worked on extensively
in the past.

> 3. The building process requires mini-os headers. That will be used
>    to build libxc (the controlling library).

As Antti already suggested, if you can use POSIX interfaces rather than
mini-os ones in QEMU, then that would be a better approach.

> One of my lessons learned from the existing stubdom stuffs is that I
> should work with upstream and produce maintainable code. So before I do
> anything for real I'd better consult the community. My gut feeling is
> that the first two requirements are not really Xen specific. Let me know
> what you guys plan and think.

Thanks for getting in touch. I think this is an important discussion!

Martin


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 16:06 ` Antti Kantee
@ 2015-03-18 11:22   ` Martin Lucina
  2015-03-18 13:22     ` Antti Kantee
  0 siblings, 1 reply; 26+ messages in thread
From: Martin Lucina @ 2015-03-18 11:22 UTC (permalink / raw)
  To: Antti Kantee
  Cc: wei.liu2, Ian Campbell, Stefano Stabellini, Ian Jackson,
	xen-devel, rumpkernel-users, Anthony PERARD

pooka@rumpkernel.org said:
> etfs isn't a file system, e.g. it doesn't allow listing files or
> removing them, but it does give you complete control of what happens
> when data is read or written for /some/path.  But based on the other
> posts, sounds like it might be enough for what you need.
> 
> See:
> http://man.netbsd.org/cgi-bin/man-cgi?rump_etfs++NetBSD-current

They'd still need to implement the rumphyper/Mini-OS backend to get etfs to
talk over vchan to the dom0, right?

> That's not really a problem, though I do want to limit the amount of
> interface we claim to support with rump kernels.  For example, ISTR
> you mentioned on irc you'd like to use minios wait.h.  It would be
> better to use pthread synchronization instead of minios
> synchronization.  That way, if we do have a need to change the
> underlying threading in the future, you won't run into trouble.
> 
> So, we should just determine what is actually needed and expose
> those bits by default.

+1

Martin


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 15:38     ` Ian Campbell
@ 2015-03-18 11:24       ` Martin Lucina
  2015-03-18 11:30         ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Martin Lucina @ 2015-03-18 11:24 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, xen-devel,
	rumpkernel-users, Anthony PERARD

ian.campbell@citrix.com said:
> On Tue, 2015-03-17 at 15:27 +0000, Wei Liu wrote:
> > This looks most interesting as it implies we can easily pipe a console
> > to it.
> 
> BTW, rather than raw consoles we should probably consider using the
> channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt

What would be the advantage/rationale for using channels rather than vchan?
(See my other reply to this thread)

Martin


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 11:24       ` Martin Lucina
@ 2015-03-18 11:30         ` Ian Campbell
  2015-03-18 12:45           ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2015-03-18 11:30 UTC (permalink / raw)
  To: Martin Lucina
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, xen-devel,
	rumpkernel-users, Anthony PERARD

On Wed, 2015-03-18 at 12:24 +0100, Martin Lucina wrote:
> ian.campbell@citrix.com said:
> > On Tue, 2015-03-17 at 15:27 +0000, Wei Liu wrote:
> > > This looks most interesting as it implies we can easily pipe a console
> > > to it.
> > 
> > BTW, rather than raw consoles we should probably consider using the
> > channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt
> 
> What would be the advantage/rationale for using channels rather than vchan?
> (See my other reply to this thread)

Not much really.

About the only relevant difference between vchan and channels(/consoles)
is that there is an existing backend running on most xen systems
(xenconsoled) which can be leveraged in some cases for channels, whereas
vchan would need a specific backend writing for each case.

Apart from that implementation convenience vchan is probably going to be
better in terms of proper integration for the other end.

But iff the decision goes the way of consoles then using channels in
preference to raw consoles makes sense.

Ian.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 11:30         ` Ian Campbell
@ 2015-03-18 12:45           ` Stefano Stabellini
  2015-03-18 16:46             ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2015-03-18 12:45 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, xen-devel,
	rumpkernel-users, Anthony PERARD, Martin Lucina

On Wed, 18 Mar 2015, Ian Campbell wrote:
> On Wed, 2015-03-18 at 12:24 +0100, Martin Lucina wrote:
> > ian.campbell@citrix.com said:
> > > On Tue, 2015-03-17 at 15:27 +0000, Wei Liu wrote:
> > > > This looks most interesting as it implies we can easily pipe a console
> > > > to it.
> > > 
> > > > BTW, rather than raw consoles we should probably consider using the
> > > channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt
> > 
> > What would be the advantage/rationale for using channels rather than vchan?
> > (See my other reply to this thread)
> 
> Not much really.
> 
> About the only relevant difference between vchan and channels(/consoles)
> is that there is an existing backend running on most xen systems
> (xenconsoled) which can be leveraged in some cases for channels, whereas
> vchan would need a specific backend writing for each case.
> 
> Apart from that implementation convenience vchan is probably going to be
> better in terms of proper integration for the other end.
> 
> But iff the decision goes the way of consoles then using channels in
> preference to raw consoles makes sense.

I think that for simplicity's sake and to limit dependencies on the
system, using consoles for low bandwidth channels, such as QMP, is
preferable.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 11:22   ` Martin Lucina
@ 2015-03-18 13:22     ` Antti Kantee
  0 siblings, 0 replies; 26+ messages in thread
From: Antti Kantee @ 2015-03-18 13:22 UTC (permalink / raw)
  To: wei.liu2, rumpkernel-users, xen-devel, Stefano Stabellini,
	Anthony PERARD, Ian Jackson, Ian Campbell

On 18/03/15 11:22, Martin Lucina wrote:
> pooka@rumpkernel.org said:
>> etfs isn't a file system, e.g. it doesn't allow listing files or
>> removing them, but it does give you complete control of what happens
>> when data is read or written for /some/path.  But based on the other
>> posts, sounds like it might be enough for what you need.
>>
>> See:
>> http://man.netbsd.org/cgi-bin/man-cgi?rump_etfs++NetBSD-current
>
> They'd still need to implement the rumphyper/Mini-OS backend to get etfs to
> talk over vchan to the dom0, right?

Strictly speaking, they'd have to implement the iov{read,write} 
hypercalls to do that.  But, no, etfs doesn't do magic.  IOW, they'd 
have to define what "host path" means.

It occurred to me that I wrote that manpage, umm, 5 years ago when rump
kernels ran only in userspace and "host path" was a better defined term.
Pile that manpage on the neverending heap of documentation which broke
while the code kept working.


* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 12:45           ` Stefano Stabellini
@ 2015-03-18 16:46             ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2015-03-18 16:46 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Ian Jackson, xen-devel, rumpkernel-users,
	Anthony PERARD, Martin Lucina

On Wed, 2015-03-18 at 12:45 +0000, Stefano Stabellini wrote:
> On Wed, 18 Mar 2015, Ian Campbell wrote:
> > On Wed, 2015-03-18 at 12:24 +0100, Martin Lucina wrote:
> > > ian.campbell@citrix.com said:
> > > > On Tue, 2015-03-17 at 15:27 +0000, Wei Liu wrote:
> > > > > This looks most interesting as it implies we can easily pipe a console
> > > > > to it.
> > > > 
> > > > > BTW, rather than raw consoles we should probably consider using the
> > > > channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt
> > > 
> > > What would be the advantage/rationale for using channels rather than vchan?
> > > (See my other reply to this thread)
> > 
> > Not much really.
> > 
> > About the only relevant difference between vchan and channels(/consoles)
> > is that there is an existing backend running on most xen systems
> > (xenconsoled) which can be leveraged in some cases for channels, whereas
> > vchan would need a specific backend writing for each case.
> > 
> > Apart from that implementation convenience vchan is probably going to be
> > better in terms of proper integration for the other end.
> > 
> > But iff the decision goes the way of consoles then using channels in
> > preference to raw consoles makes sense.
> 
> I think that for simplicity's sake and to limit dependencies on the
> system, using consoles for low bandwidth channels, such as QMP, is
> preferable.

s/consoles/channels/, please ;-)

That said, having libxl be a user of libvchan to slurp the data in/out
of qemu directly (perhaps using the datacopier infrastructure) might be
nicer from a design point of view, since it would mean libxl could
read/write things directly instead of via a temp file and it takes
xenconsoled out of that path, which might be nice.
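
A minimal sketch of the "read/write things directly instead of via a temp
file" idea, assuming only that the transport behaves like a bidirectional
byte stream. socketpair() stands in for the channel; neither libvchan nor
the libxl datacopier is modelled, and the names send_state, slurp_state and
demo_stream_copy are made up for illustration.

```python
import socket
import threading

CHUNK = 4096  # arbitrary copy size for this sketch


def send_state(sock: socket.socket, state: bytes) -> None:
    """Stand-in for the QEMU side: write the state blob, then signal EOF."""
    sock.sendall(state)
    sock.shutdown(socket.SHUT_WR)


def slurp_state(sock: socket.socket) -> bytes:
    """Stand-in for the toolstack side: read until EOF, no temp file."""
    chunks = []
    while True:
        data = sock.recv(CHUNK)
        if not data:  # peer closed its write side
            break
        chunks.append(data)
    return b"".join(chunks)


def demo_stream_copy(state: bytes) -> bytes:
    # socketpair() plays the role of the channel between the two ends.
    qemu_end, toolstack_end = socket.socketpair()
    writer = threading.Thread(target=send_state, args=(qemu_end, state))
    writer.start()
    try:
        return slurp_state(toolstack_end)
    finally:
        writer.join()
        qemu_end.close()
        toolstack_end.close()
```

The reader never needs to know the size up front; end-of-stream marks the
end of the state, which is what a temp file otherwise provides implicitly.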

Ian.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 11:20 ` Martin Lucina
@ 2015-03-18 19:05   ` Anil Madhavapeddy
  2015-03-18 19:11     ` Martin Lucina
  2015-03-18 20:23     ` Antti Kantee
  0 siblings, 2 replies; 26+ messages in thread
From: Anil Madhavapeddy @ 2015-03-18 19:05 UTC (permalink / raw)
  To: Martin Lucina
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Richard Mortier, xen-devel, rumpkernel-users,
	Thomas Gazagnaire, Anthony PERARD

On 18 Mar 2015, at 11:20, Martin Lucina <martin@lucina.net> wrote:
>> 
>> A bit background information. A stubdom is a service domain.  With QEMU
>> stubdom we are able to run QEMU device emulation code in a separate
>> domain so that bugs in QEMU don't affect Dom0 (the controlling domain).
>> Xen currently has a QEMU stubdom, but it's based on our fork of ancient
>> QEMU (plus some other libraries and mini-os). Eventually we would like
>> to use upstream QEMU in stubdom.
>> 
>> I've now successfully built QEMU upstream with rump kernel. However to
>> make it fully functional as a stubdom, there are some missing pieces to
>> be added in.
>> 
>> 1. The ability to access QMP socket (a unix socket) from Dom0. That
>>   will be used to issue command to QEMU.
>> 2. The ability to access files in Dom0. That will be used to write to /
>>   read from QEMU state file.
> 
> As I understand from Stefano's and Anthony's replies in this thread, both
> of the above can be implemented using an AF_UNIX or AF_INET socket on the
> QEMU end. Such an implementation would not require anything special done in
> QEMU, just telling it which socket to use using existing mechanisms.
> 
> So, let's step back a bit: What we need is a trusted communication channel
> from a Rump Kernel domU to dom0, using existing socket or socket-like[*]
> APIs at both the domU and dom0 ends.
> 
> This fits in with a couple of things I hope to make time to work on in the
> next couple of months:
> 
> 1. Introspection of Rump Kernel domUs for ops purposes, i.e. get some
>    basic "ps", "top", "vmstat"-like information about what the domU is
>    doing from the dom0.
> 
> 2. Connecting up multiple Rump Kernel domUs and/or Mirage domUs. The
>    general idea here is that you can have e.g. a Mirage domU running a
>    HTTP+TLS frontend, communicating with a Rump Kernel domU running PHP +
>    FastCGI.
> 
>    The Mirage folks are already doing something similar in their 
>    Jitsu work, using a protocol called Conduit which runs over vchan.

Yeah, this is currently requiring a couple of things:

- kicking the tires with Vchan and its associated machinery, which has
  taken some time.  Dave Scott has built a complementary system for
  the xentropyd which simply sets up a console ring instead of vchan.
  This has the drawback of being a single fixed page, but far simpler.

- A XenStore protocol for setting up stream connections.  This could
  indeed quite easily turn into an AF_VCHAN that could be transparently
  used by rump/Mirage/HaLVM and normal domUs for VM<->VM comms.

> Now, both of the above require exactly the same underlying mechanism.
> 
> Point 2. will further require implementing support in the Rump Kernel,
> either for a shim which would proxy AF_UNIX / AF_INET transparently using
> vchan, or possibly later implementing a separate socket family (AF_VCHAN /
> AF_HYPER?). Once that is done you should be able to just drop it in to
> QEMU on Rump.

I'm a little wary of point 2) asking for filesystem access to dom0.  What
exactly is the qemu state API?  Does it need arbitrary file access, or is
there a slightly higher level set of operations that could be marshalled
along the socket?  In fact, why doesn't qemu privilege separate and use
a QMP socket for its host filesystem operations as well?

> 
> [*] Aside: What I mean by socket-like is that the implementation does not
> need to be in the dom0 kernel, it can just be a user-space library. For
> example, see the nanomsg or ZeroMQ APIs, which I have worked on extensively
> in the past.
> 
>> 3. The building process requires mini-os headers. That will be used
>>   to build libxc (the controlling library).
> 
> As Antti already suggested, if you can use POSIX interfaces rather than
> mini-os ones in QEMU, then that would be a better approach.
> 
>> One of my lessons learned from the existing stubdom stuffs is that I
>> should work with upstream and produce maintainable code. So before I do
>> anything for real I'd better consult the community. My gut feeling is
>> that the first two requirements are not really Xen specific. Let me know
>> what you guys plan and think.
> 
> Thanks for getting in touch. I think this is an important discussion!

Very much so -- the time is definitely right to establish some unikernel
interop standards.  I'm also looking forward to better rump<->MirageOS
comms in particular (shiny new protocol stacks working alongside existing
applications in separate VM containers).

-anil

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 19:05   ` Anil Madhavapeddy
@ 2015-03-18 19:11     ` Martin Lucina
  2015-03-18 20:23     ` Antti Kantee
  1 sibling, 0 replies; 26+ messages in thread
From: Martin Lucina @ 2015-03-18 19:11 UTC (permalink / raw)
  To: Anil Madhavapeddy
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Richard Mortier, xen-devel, rumpkernel-users,
	Thomas Gazagnaire, Anthony PERARD

anil@recoil.org said:
> > Point 2. will further require implementing support in the Rump Kernel,
> > either for a shim which would proxy AF_UNIX / AF_INET transparently using
> > vchan, or possibly later implementing a separate socket family (AF_VCHAN /
> > AF_HYPER?). Once that is done you should be able to just drop it in to
> > QEMU on Rump.
> 
> I'm a little wary of point 2) asking for filesystem access to dom0.  What
> exactly is the qemu state API?  Does it need arbitrary file access, or is
> there a slightly higher level set of operations that could be marshalled
> along the socket?  In fact, why doesn't qemu privilege separate and use
> a QMP socket for its host filesystem operations as well?

Email thread context confusion here. I meant point 2 that I wrote
(connecting up Rump<->Mirage domUs where the Rump application is unmodified
and listens on what it believes is AF_INET/AF_UNIX).

Regarding the qemu state API, as I understood from others' replies to the
full thread
(http://www.freelists.org/post/rumpkernel-users/Upstream-QEMU-based-stubdom-and-rump-kernel)
the state API does not require filesystem access and can be made to work
over a socket.

Martin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 19:05   ` Anil Madhavapeddy
  2015-03-18 19:11     ` Martin Lucina
@ 2015-03-18 20:23     ` Antti Kantee
  2015-03-18 21:21       ` Anil Madhavapeddy
  1 sibling, 1 reply; 26+ messages in thread
From: Antti Kantee @ 2015-03-18 20:23 UTC (permalink / raw)
  To: anil, Martin Lucina
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Richard Mortier, xen-devel, rumpkernel-users,
	Thomas Gazagnaire, Anthony PERARD

On 18/03/15 19:05, Anil Madhavapeddy wrote:
>> This fits in with a couple of things I hope to make time to work on in the
>> next couple of months:
>>
>> 1. Introspection of Rump Kernel domUs for ops purposes, i.e. get some
>>     basic "ps", "top", "vmstat"-like information about what the domU is
>>     doing from the dom0.
>>
>> 2. Connecting up multiple Rump Kernel domUs and/or Mirage domUs. The
>>     general idea here is that you can have e.g. a Mirage domU running a
>>     HTTP+TLS frontend, communicating with a Rump Kernel domU running PHP +
>>     FastCGI.
>>
>>     The Mirage folks are already doing something similar in their
>>     Jitsu work, using a protocol called Conduit which runs over vchan.
>
> Yeah, this is currently requiring a couple of things:
>
> - kicking the tires with Vchan and its associated machinery, which has
>    taken some time.  Dave Scott has built a complementary system for
>    the xentropyd which simply sets up a console ring instead of vchan.
>    This has the drawback of being a single fixed page, but far simpler.
>
> - A XenStore protocol for setting up stream connections.  This could
>    indeed quite easily turn into an AF_VCHAN that could be transparently
>    used by rump/Mirage/HaLVM and normal domUs for VM<->VM comms.

This is not an argument for or against; if you want to expose 
AF_WHATEVER to applications running on a rump kernel, you need to sell 
AF_WHATEVER to NetBSD, not to rumpkernel-users.  Well, preferably you 
need to sell it to everyone implementing sockets and running on some 
sort of hypervisor, but of course gotta start from somewhere.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 20:23     ` Antti Kantee
@ 2015-03-18 21:21       ` Anil Madhavapeddy
  2015-03-18 22:07         ` Antti Kantee
  0 siblings, 1 reply; 26+ messages in thread
From: Anil Madhavapeddy @ 2015-03-18 21:21 UTC (permalink / raw)
  To: Antti Kantee
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Richard Mortier, xen-devel, rumpkernel-users,
	Thomas Gazagnaire, Anthony PERARD, Martin Lucina

On 18 Mar 2015, at 20:23, Antti Kantee <pooka@iki.fi> wrote:
> 
> On 18/03/15 19:05, Anil Madhavapeddy wrote:
>>> This fits in with a couple of things I hope to make time to work on in the
>>> next couple of months:
>>> 
>>> 1. Introspection of Rump Kernel domUs for ops purposes, i.e. get some
>>>    basic "ps", "top", "vmstat"-like information about what the domU is
>>>    doing from the dom0.
>>> 
>>> 2. Connecting up multiple Rump Kernel domUs and/or Mirage domUs. The
>>>    general idea here is that you can have e.g. a Mirage domU running a
>>>    HTTP+TLS frontend, communicating with a Rump Kernel domU running PHP +
>>>    FastCGI.
>>> 
>>>    The Mirage folks are already doing something similar in their
>>>    Jitsu work, using a protocol called Conduit which runs over vchan.
>> 
>> Yeah, this is currently requiring a couple of things:
>> 
>> - kicking the tires with Vchan and its associated machinery, which has
>>   taken some time.  Dave Scott has built a complementary system for
>>   the xentropyd which simply sets up a console ring instead of vchan.
>>   This has the drawback of being a single fixed page, but far simpler.
>> 
>> - A XenStore protocol for setting up stream connections.  This could
>>   indeed quite easily turn into an AF_VCHAN that could be transparently
>>   used by rump/Mirage/HaLVM and normal domUs for VM<->VM comms.
> 
> This is not an argument for or against; if you want to expose AF_WHATEVER to applications running on a rump kernel, you need to sell AF_WHATEVER to NetBSD, not to rumpkernel-users.  Well, preferably you need to sell it to everyone implementing sockets and running on some sort of hypervisor, but of course gotta start from somewhere.

Given that most of the uses of this will be in userspace code, just
faking out AF_UNIX in Rump does seem a lot easier.  It doesn't matter
to MirageOS either way -- we just need a well-defined XenStore/ring
protocol to obey to do connection setup on the other side.

-anil

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 21:21       ` Anil Madhavapeddy
@ 2015-03-18 22:07         ` Antti Kantee
  2015-03-19  8:48           ` Martin Lucina
  0 siblings, 1 reply; 26+ messages in thread
From: Antti Kantee @ 2015-03-18 22:07 UTC (permalink / raw)
  To: Anil Madhavapeddy
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Richard Mortier, xen-devel, rumpkernel-users,
	Thomas Gazagnaire, Anthony PERARD, Martin Lucina

On 18/03/15 21:21, Anil Madhavapeddy wrote:
>> This is not an argument for or against; if you want to expose AF_WHATEVER to applications running on a rump kernel, you need to sell AF_WHATEVER to NetBSD, not to rumpkernel-users.  Well, preferably you need to sell it to everyone implementing sockets and running on some sort of hypervisor, but of course gotta start from somewhere.
>
> Given that most of the uses of this will be in userspace code, just
> faking out AF_UNIX in Rump does seem a lot easier.  It doesn't matter
> to MirageOS either way -- we just need a well-defined XenStore/ring
> protocol to obey to do connection setup on the other side.

Where do you propose to inject that faking out (and what does it even 
mean)?  Someone at Berkeley decided that socket drivers should be 
globally enumerated, and PF_UNIX leads to exactly one handler.  Just 
hacking hooks as local patches into the PF_UNIX driver is against the 
whole point of having unmodified, tested drivers from upstream.

So, if you want your bus to appear as a socket to userspace, I don't see 
any shortcut that avoids going via NetBSD.  If you're happy with something 
other than a socket, that's another story.

Especially if the interface doesn't matter too much for whatever purpose 
you plan to use it for, it's silly to specify the interface so that the 
implementation process is as convoluted as possible ;)
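
A toy model of the "globally enumerated, exactly one handler per family"
constraint described above. This is not NetBSD's domain code; the class,
the handler functions and the PF_HYPER number are all hypothetical and only
illustrate why a second PF_UNIX handler cannot simply be added alongside
the existing one.

```python
from typing import Callable, Dict

# Illustrative ids only; real values come from <sys/socket.h>.
PF_UNIX = 1
PF_HYPER = 99  # made-up family number, does not exist anywhere

Handler = Callable[[str], str]


class DomainTable:
    """Global table mapping each protocol family to exactly one handler."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Handler] = {}

    def attach(self, family: int, handler: Handler) -> None:
        if family in self._handlers:
            raise ValueError(f"family {family} already has a handler")
        self._handlers[family] = handler

    def socket(self, family: int, address: str) -> str:
        # Dispatch is by family alone; there is no per-address hook.
        return self._handlers[family](address)


def local_handler(address: str) -> str:
    return f"local:{address}"


def hyper_handler(address: str) -> str:
    return f"hyper:{address}"
```

In this model the only options are to replace the handler registered for
PF_UNIX wholesale or to attach a handler under a new family number.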

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 14:29 Upstream QEMU based stubdom and rump kernel Wei Liu
                   ` (3 preceding siblings ...)
  2015-03-18 11:20 ` Martin Lucina
@ 2015-03-19  0:19 ` Samuel Thibault
  4 siblings, 0 replies; 26+ messages in thread
From: Samuel Thibault @ 2015-03-19  0:19 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, Stefano Stabellini, Ian Jackson, xen-devel,
	rumpkernel-users, Anthony PERARD

Hello,

Wei Liu, le Tue 17 Mar 2015 14:29:07 +0000, a écrit :
> One of my lessons learned from the existing stubdom stuffs is that I
> should work with upstream and produce maintainable code.

Not only maintainable, but really make sure to have the time to stick
with upstream in the long run, first until it gets integrated in the
upstream QEMU release process and then to keep maintaining it there
afterwards.  The old work on mini-os qemu stubdomain wasn't too bad, but
without actual integration in the QEMU process, and nobody to update the
fork, it was doomed to fall behind.

Samuel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-18 22:07         ` Antti Kantee
@ 2015-03-19  8:48           ` Martin Lucina
  2015-03-19  9:35             ` Antti Kantee
  0 siblings, 1 reply; 26+ messages in thread
From: Martin Lucina @ 2015-03-19  8:48 UTC (permalink / raw)
  To: Antti Kantee
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Anil Madhavapeddy, Ian Jackson, Richard Mortier, xen-devel,
	rumpkernel-users, Thomas Gazagnaire, Anthony PERARD

pooka@iki.fi said:
> Where do you propose to inject that faking out (and what does it
> even mean)?  Someone at Berkeley decided that socket drivers should
> be globally enumerated, and PF_UNIX leads to exactly one handler.
> Just hacking hooks as local patches into the PF_UNIX driver is
> against the whole point of having unmodified, tested drivers from
> upstream.

We do not want to "hack hooks as local patches into the PF_UNIX driver".
Rather, we'd like to develop an entirely new driver (nothing wrong with
that?), which would mimic PF_UNIX semantics but talk to hyperspace instead.

See below for the purpose we want to use it for.

> So, if you want your bus to appear as a socket to userspace, I don't
> see any shortcut to not going via NetBSD.  If you're happy with
> something else than a socket, that's another story.
> 
> Especially if the interface doesn't matter too much for whatever
> purpose you plan to use it for, it's silly to specify the interface
> so that the implementation process is as convoluted as possible ;)

By "faking out" Anil means a shim to get existing applications
which currently use PF_UNIX (and possibly PF_INET, though that will be
harder to fake) to use the hypervisor bus to talk to another colocated
unikernel instead.

The motivations for this are:

- Taking the TCP stack out of the picture entirely for intra-unikernel
  comms (e.g. PHP unikernel <-> MySQL unikernel). Both of those could
  thus be linked without the PF_INET component.
- This means that you do not need to set up and manage a TCP network in
  your infrastructure for intra-unikernel comms, which is a huge advantage
  from an operations point of view.
- It also means that unikernels which should not be talking TCP to
  anywhere, ever, can't do that.

Anil, have I missed anything?

So, the interface does matter in the sense that it should be as simple as
possible to take an existing application and get it to use the new bus.
This could be as simple as linking your unikernel against -lrumpnet_hyper
instead of -lrumpnet_local.

Taking a longer-term view, I do think that there is a wider case for
PF_HYPER and I will be happy to sell it to NetBSD (or whoever) once we are
ready to make that case.

In my mind the semantics of PF_HYPER from an application PoV are pretty
clear: exactly the same as PF_UNIX except that you substitute "filesystem
path" for "hyperspace path", with the exact semantics of "hyperspace path"
left up to your hypervisor. The application need not care, as long as you
can tell it to e.g. use "vchan/mysql" instead of "/tmp/mysql.sock" when
doing bind().
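
A minimal sketch of what "the application need not care" could look like,
assuming the endpoint comes from configuration and is passed to bind()
unchanged. It runs over plain AF_UNIX because PF_HYPER does not exist; the
function names are made up, and the idea that only the family and the
endpoint string would change is an assumption taken from the paragraph
above.

```python
import os
import socket
import tempfile


def make_listener(endpoint: str, family: int = socket.AF_UNIX) -> socket.socket:
    """Bind a stream listener; the endpoint string is opaque to the caller."""
    srv = socket.socket(family, socket.SOCK_STREAM)
    srv.bind(endpoint)
    srv.listen(1)
    return srv


def echo_once(endpoint: str, payload: bytes) -> bytes:
    """Connect to our own listener and bounce a small payload through it."""
    srv = make_listener(endpoint)
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        cli.connect(endpoint)  # queued in the listen backlog
        conn, _ = srv.accept()
        with conn:
            cli.sendall(payload)
            conn.sendall(conn.recv(len(payload)))
        return cli.recv(len(payload))
    finally:
        cli.close()
        srv.close()


def demo_endpoint_from_config() -> bytes:
    # The endpoint would normally come from configuration. With the
    # hypothetical PF_HYPER it might read "vchan/mysql" instead.
    with tempfile.TemporaryDirectory() as tmp:
        return echo_once(os.path.join(tmp, "mysql.sock"), b"ping")
```

Nothing in echo_once() parses the endpoint, which is the property that
would let a different family reinterpret it as a "hyperspace path".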

Martin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-19  8:48           ` Martin Lucina
@ 2015-03-19  9:35             ` Antti Kantee
  2015-03-19 12:22               ` Anil Madhavapeddy
  0 siblings, 1 reply; 26+ messages in thread
From: Antti Kantee @ 2015-03-19  9:35 UTC (permalink / raw)
  To: Anil Madhavapeddy, Wei Liu, rumpkernel-users, xen-devel,
	Stefano Stabellini, David Scott, Anthony PERARD, Ian Jackson,
	Ian Campbell, Thomas Gazagnaire, Richard Mortier

On 19/03/15 08:48, Martin Lucina wrote:
> By "faking out" Anil means a shim to get existing applications
> which currently use PF_UNIX (and possibly PF_INET, though that will be
> harder to fake) to use the hypervisor bus to talk to another colocated
> unikernel instead.
>
> The motivations for this are:
>
> - Taking the TCP stack out of the picture entirely for intra-unikernel
>    comms (e.g. PHP unikernel <-> MySQL unikernel). Both of those could
>    thus be linked without the PF_INET component.
> - This means that you do not need to set up and manage a TCP network in
>    your infrastructure for intra-unikernel comms, which is a huge advantage
>    from an operations point of view.
> - It also means that unikernels which should not be talking TCP to
>    anywhere, ever, can't do that.

Aah, ic, you want to do what rumpnet_sockin does, except use the 
hypervisor bus instead of an external sockets-like networking facility 
like sockin does.

rumpnet_sockin was indeed originally developed so that you wouldn't need 
to include the full TCP/IP stack in a rump kernel, which is nice for 
scenarios where you want to do networking without configuring anything 
for each guest instance; running the kernel NFS client in userspace and 
using the host's network was the original use case.

Yea, that'll just work on the rump kernel side for PF_INET/PF_INET6 
(though you might have to do a bit more handling in your "fake" driver). 
  Not sure what doing the same for PF_UNIX would entail, if anything 
special, but only one way to find out.
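
A rough sketch of the forwarding idea, under the assumption that the shim
only needs connect/send/recv and hands each call to a backend. It does not
reflect how rumpnet_sockin is actually written; LoopbackBus, ShimSocket and
the endpoint name "mysql" are hypothetical, and socketpair() stands in for
the hypervisor bus.

```python
import socket
from typing import Dict, Optional


class LoopbackBus:
    """Stand-in for a hypervisor bus: named endpoints, no TCP/IP stack."""

    def __init__(self) -> None:
        self._peers: Dict[str, socket.socket] = {}

    def listen(self, name: str) -> socket.socket:
        # Pre-create a connected pair; the listener keeps one end and
        # connect() hands out the other.
        server_end, client_end = socket.socketpair()
        self._peers[name] = client_end
        return server_end

    def connect(self, name: str) -> socket.socket:
        return self._peers.pop(name)


class ShimSocket:
    """Socket-like front end; every operation is forwarded to the bus."""

    def __init__(self, bus: LoopbackBus) -> None:
        self._bus = bus
        self._chan: Optional[socket.socket] = None

    def connect(self, name: str) -> None:
        self._chan = self._bus.connect(name)

    def sendall(self, data: bytes) -> None:
        self._chan.sendall(data)

    def recv(self, size: int) -> bytes:
        return self._chan.recv(size)

    def close(self) -> None:
        self._chan.close()


def demo_shim() -> bytes:
    bus = LoopbackBus()
    server = bus.listen("mysql")
    app = ShimSocket(bus)
    app.connect("mysql")
    app.sendall(b"SELECT 1")
    server.sendall(b"ok:" + server.recv(64))
    reply = app.recv(64)
    app.close()
    server.close()
    return reply
```

The application-facing class carries no networking configuration at all,
which is the property the motivations quoted above are after.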

   - antti

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-17 15:15 ` Anthony PERARD
  2015-03-17 15:23   ` Stefano Stabellini
  2015-03-17 15:27   ` Wei Liu
@ 2015-03-19 11:16   ` Ian Campbell
  2 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2015-03-19 11:16 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Ian Jackson, rumpkernel-users, Wei Liu, Stefano Stabellini, xen-devel

On Tue, 2015-03-17 at 15:15 +0000, Anthony PERARD wrote:
> On Tue, Mar 17, 2015 at 02:29:07PM +0000, Wei Liu wrote:
> > I've now successfully built QEMU upstream with rump kernel. However to
> > make it fully functional as a stubdom, there are some missing pieces to
> > be added in.
> > 
> > 1. The ability to access QMP socket (a unix socket) from Dom0. That
> >    will be used to issue command to QEMU.
> 
> The QMP "socket" does not need to be a unix socket. It can be any of
> those (from qemu --help):
> Character device options:
> -chardev null,id=id[,mux=on|off]

How much flexibility/modularity is there on the qemu side for adding new
chardev types? Could we for example add "-chardev vchan,path=path"
without too much trouble?

> To save a QEMU state (write), we do use a filename. But I guess we could
> expand the QMP command (xen-save-devices-state) to use something else, if
> it's easier.

Like, perhaps, an arbitrary chardev?
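
For reference, a sketch of why the transport matters so little to QMP: the
protocol is JSON messages over a byte stream, opened by a greeting and a
qmp_capabilities exchange. The server below is a fake with canned replies
(the "handled" field is invented), asynchronous events are ignored, and
socketpair() stands in for whatever chardev backs the monitor.

```python
import json
import socket
import threading


def send_msg(stream, msg: dict) -> None:
    stream.write(json.dumps(msg).encode() + b"\r\n")
    stream.flush()


def recv_msg(stream) -> dict:
    return json.loads(stream.readline())


def fake_qmp_server(sock: socket.socket) -> None:
    """Stand-in for QEMU's end; replies are canned, not real QEMU output."""
    with sock.makefile("rwb") as stream:
        send_msg(stream, {"QMP": {"version": {}, "capabilities": []}})
        recv_msg(stream)                      # expects qmp_capabilities
        send_msg(stream, {"return": {}})
        cmd = recv_msg(stream)
        send_msg(stream, {"return": {"handled": cmd["execute"]}})


def qmp_session(stream, command: str) -> dict:
    """Greeting, capabilities negotiation, then a single command."""
    greeting = recv_msg(stream)
    if "QMP" not in greeting:
        raise ValueError("not a QMP greeting")
    send_msg(stream, {"execute": "qmp_capabilities"})
    recv_msg(stream)
    send_msg(stream, {"execute": command})
    return recv_msg(stream)


def demo_qmp(command: str = "query-status") -> dict:
    client_sock, server_sock = socket.socketpair()
    server = threading.Thread(target=fake_qmp_server, args=(server_sock,))
    server.start()
    stream = client_sock.makefile("rwb")
    try:
        return qmp_session(stream, command)
    finally:
        stream.close()
        client_sock.close()
        server.join()
        server_sock.close()
```

qmp_session() only reads and writes lines on a file-like object, so the
same client logic would apply to any stream-like chardev.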

Ian.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Upstream QEMU based stubdom and rump kernel
  2015-03-19  9:35             ` Antti Kantee
@ 2015-03-19 12:22               ` Anil Madhavapeddy
  0 siblings, 0 replies; 26+ messages in thread
From: Anil Madhavapeddy @ 2015-03-19 12:22 UTC (permalink / raw)
  To: Antti Kantee
  Cc: David Scott, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Richard Mortier, xen-devel, rumpkernel-users,
	Thomas Gazagnaire, Anthony PERARD

On 19 Mar 2015, at 09:35, Antti Kantee <pooka@iki.fi> wrote:
> 
> On 19/03/15 08:48, Martin Lucina wrote:
>> By "faking out" Anil means a shim to get existing applications
>> which currently use PF_UNIX (and possibly PF_INET, though that will be
>> harder to fake) to use the hypervisor bus to talk to another colocated
>> unikernel instead.
>> 
>> The motivations for this are:
>> 
>> - Taking the TCP stack out of the picture entirely for intra-unikernel
>>   comms (e.g. PHP unikernel <-> MySQL unikernel). Both of those could
>>   thus be linked without the PF_INET component.
>> - This means that you do not need to set up and manage a TCP network in
>>   your infrastructure for intra-unikernel comms, which is a huge advantage
>>   from an operations point of view.
>> - It also means that unikernels which should not be talking TCP to
>>   anywhere, ever, can't do that.
> 
> Aah, ic, you want to do what rumpnet_sockin does, except use the hypervisor bus instead of an external sockets-like networking facility like sockin does.
> 
> rumpnet_sockin was indeed originally developed so that you wouldn't need to include the full TCP/IP stack in a rump kernel, which is nice for scenarios where you want to do networking without configuring anything for each guest instance; running the kernel NFS client in userspace and using the host's network was the original use case.
> 
> Yea, that'll just work on the rump kernel side for PF_INET/PF_INET6 (though you might have to do a bit more handling in your "fake" driver).  Not sure what doing the same for PF_UNIX would entail, if anything special, but only one way to find out.

That's right -- the primary motivation from my end is to short-circuit all the unnecessary network stack serialisation and configuration, and end up with a very simple data path such as shared memory rings and/or vchan.  The challenge is figuring out where to hook in the dynamic lookups required, and what form they would take on the coordination bus (XenStore).

One slight hitch with using XenStore for this is that its permissions model isn't quite good enough to build a full Plan9-like interface (where every listen is published in a per-VM path and can be written to by a connecting VM).  Dave Scott had some thoughts on how to extend XS with this, but it wouldn't be a short-term solution for working with existing toolstacks.  One workaround is to have a trusted arbiter VM running that would coordinate the establishment of connections and hand them off.
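
A toy sketch of the arbiter workaround, assuming the arbiter is the only
party that records listeners and decides who may connect. Nothing here
talks to XenStore; the path layout, the token format and all names are
hypothetical.

```python
from typing import Dict, Set, Tuple


class Arbiter:
    """Trusted coordinator: tracks listeners and brokers connections."""

    def __init__(self) -> None:
        self._listeners: Dict[Tuple[str, str], Set[str]] = {}

    def listen(self, vm: str, service: str, allowed: Set[str]) -> str:
        """Record that `vm` accepts `service` connections from `allowed`."""
        self._listeners[(vm, service)] = set(allowed)
        return f"/rendezvous/{vm}/{service}"  # hypothetical path layout

    def connect(self, client_vm: str, server_vm: str, service: str) -> str:
        """Check policy, then return a hand-off token for the data path."""
        allowed = self._listeners.get((server_vm, service))
        if allowed is None:
            raise LookupError(f"{server_vm} is not listening on {service}")
        if client_vm not in allowed:
            raise PermissionError(f"{client_vm} may not reach {service}")
        # A real arbiter would set up a ring or vchan and hand it off here.
        return f"ring:{server_vm}/{service}<-{client_vm}"
```

The connecting VM never writes into the listener's subtree in this model;
the arbiter mediates every request, which sidesteps the permissions gap
described above at the cost of one more trusted component.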

-anil

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2015-03-19 12:22 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-17 14:29 Upstream QEMU based stubdom and rump kernel Wei Liu
2015-03-17 14:54 ` Ian Campbell
2015-03-17 14:57   ` Wei Liu
2015-03-17 15:07     ` Ian Campbell
2015-03-17 15:15 ` Anthony PERARD
2015-03-17 15:23   ` Stefano Stabellini
2015-03-17 15:27   ` Wei Liu
2015-03-17 15:38     ` Ian Campbell
2015-03-18 11:24       ` Martin Lucina
2015-03-18 11:30         ` Ian Campbell
2015-03-18 12:45           ` Stefano Stabellini
2015-03-18 16:46             ` Ian Campbell
2015-03-19 11:16   ` Ian Campbell
2015-03-17 16:06 ` Antti Kantee
2015-03-18 11:22   ` Martin Lucina
2015-03-18 13:22     ` Antti Kantee
2015-03-18 11:20 ` Martin Lucina
2015-03-18 19:05   ` Anil Madhavapeddy
2015-03-18 19:11     ` Martin Lucina
2015-03-18 20:23     ` Antti Kantee
2015-03-18 21:21       ` Anil Madhavapeddy
2015-03-18 22:07         ` Antti Kantee
2015-03-19  8:48           ` Martin Lucina
2015-03-19  9:35             ` Antti Kantee
2015-03-19 12:22               ` Anil Madhavapeddy
2015-03-19  0:19 ` Samuel Thibault
