Andi Kleen wrote:
> On Wed, Apr 01, 2009 at 08:03:49AM -0400, Gregory Haskins wrote:
>
>> Andi Kleen wrote:
>>
>>> Gregory Haskins writes:
>>>
>>> What might be useful is if you could expand a bit more on what the high level
>>> use cases for this.
>>>
>>> Questions that come to mind and that would be good to answer:
>>>
>>> This seems to be aimed at having multiple VMs talk
>>> to each other, but not talk to the rest of the world, correct?
>>> Is that a common use case?
>>>
>>>
>> Actually we didn't design specifically for either type of environment.
>>
>
> But surely you must have some specific use case in mind? Something
> that it does better than the various methods that are available
> today. Or rather there must be some problem you're trying
> to solve. I'm just not sure what that problem exactly is.
>

Performance.  We are trying to create a high performance IO
infrastructure.

Ideally we would like to see things like virtual-machines have
bare-metal performance (or as close as possible) using just pure
software on commodity hardware.  The data I provided shows that
something like KVM with virtio-net does a good job on throughput even on
10GE, but the latency is several orders of magnitude slower than
bare-metal.  We are addressing this issue and others like it that are a
result of the current design of out-of-kernel emulation.

>
>> What we *are* trying to address is making an easy way to declare virtual
>> resources directly in the kernel so that they can be accessed more
>> efficiently.  Contrast that to the way its done today, where the models
>> live in, say, qemu userspace.
>>
>> So instead of having
>> guest->host->qemu::virtio-net->tap->[iptables|bridge], you simply have
>> guest->host->[iptables|bridge].  How you make your private network (if
>>
>
> So is the goal more performance or simplicity or what?
>

(Answered above)

>
>>> What would be the use cases for non networking devices?
>>>
>>> How would the interfaces to the user look like?
>>>
>>>
>> I am not sure if you are asking about the guests perspective or the
>> host-administators perspective.
>>
>
> I was wondering about the host-administrators perspective.
>

Ah, ok.  Sorry about that.  It was probably good to document that other
thing anyway, so no harm.

So about the host-administrator interface.  The whole thing is driven by
configfs, and the basics are already covered in the documentation in
patch 2, so I won't repeat it here.  Here is a reference to the file for
everyone's convenience:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/vbus/linux-2.6.git;a=blob;f=Documentation/vbus.txt;h=e8a05dafaca2899d37bd4314fb0c7529c167ee0f;hb=f43949f7c340bf667e68af6e6a29552e62f59033

So a sufficiently privileged user can instantiate a new bus (e.g.
container) and devices on that bus via configfs operations.  The types
of devices available to instantiate are dictated by whatever vbus-device
modules you have loaded into your particular kernel.  The loaded modules
available are enumerated under /sys/vbus/deviceclass.

Now presumably the administrator knows what a particular module is and
how to configure it before instantiating it.  Once they instantiate it,
it will present an interface in sysfs with a set of attributes.  For
example, an instantiated venet-tap looks like this:

ghaskins@test:~> tree /sys/vbus/devices
/sys/vbus/devices
`-- foo
    |-- class -> ../../deviceclass/venet-tap
    |-- client_mac
    |-- enabled
    |-- host_mac
    |-- ifname
    `-- interfaces
        `-- 0 -> ../../../instances/bar/devices/0

Some of these attributes, like "class" and "interfaces", are default
attributes that are filled in by the infrastructure.  Other attributes,
like "client_mac" and "enabled", are properties defined by the venet-tap
module itself.  So the administrator can then set these attributes as
desired to manipulate the configuration of the instance of the device,
on a per-device basis.

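To make the attribute-setting step concrete, here is a minimal Python
sketch of what "set these attributes" amounts to: writing a value into
the file named after the attribute.  The helper names (set_attrs,
read_attr), the example MAC address, and the use of "1" to mean enabled
are assumptions for illustration only and are not taken from the vbus
patches.  The sketch runs against a scratch directory that mimics the
layout above, so it can be tried without the vbus modules loaded.

```python
import os
import tempfile


def set_attrs(device_dir, attrs):
    """Write each value into the attribute file of the same name."""
    for name, value in attrs.items():
        with open(os.path.join(device_dir, name), "w") as f:
            f.write(f"{value}\n")


def read_attr(device_dir, name):
    """Return the current contents of one attribute file, stripped."""
    with open(os.path.join(device_dir, name)) as f:
        return f.read().strip()


# Scratch stand-in for /sys/vbus/devices/foo (hypothetical layout copy).
scratch_root = tempfile.mkdtemp()
device_dir = os.path.join(scratch_root, "sys", "vbus", "devices", "foo")
os.makedirs(device_dir)

# Assumed values: the MAC format and "1" for enabled are illustrative.
# The dict keeps insertion order, so the MAC is written before enabling.
set_attrs(device_dir, {"client_mac": "00:30:cc:00:20:42", "enabled": 1})

print(read_attr(device_dir, "client_mac"))
print(read_attr(device_dir, "enabled"))
```

On a real host the same calls would presumably target
/sys/vbus/devices/foo directly and would need sufficient privileges;
which values each attribute accepts is up to the device module.
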
So now imagine we have some kind of disk-io vbus device that is designed
to act kind of like a file-loopback device.  It might define an
attribute allowing you to specify the path to the file/block-dev that
you want it to export.

(Warning: completely fictitious "tree" output to follow ;)

ghaskins@test:~> tree /sys/vbus/devices
/sys/vbus/devices
`-- foo
    |-- class -> ../../deviceclass/vdisk
    |-- src_path
    `-- interfaces
        `-- 0 -> ../../../instances/bar/devices/0

So the admin would instantiate this "vdisk" device and do:

'echo /path/to/my/exported/disk.dat > /sys/vbus/devices/foo/src_path'

to point the device at the file on the host that they want to present as
a vdisk.  Any guest that has access to the particular bus that contains
this device would then see it as a standard "vdisk" ABI device (as if
there were such a thing, yet) and could talk to it using a vdisk
specific driver.

A property of a vbus is that it is inherited by children.  Today, I do
not have direct support in qemu for creating/configuring vbus devices.
Instead what I do is I set up the vbus and devices from bash, and then
launch qemu-kvm so it inherits the bus.  Someday (soon, unless you guys
start telling me this whole idea is rubbish ;) I will add support so you
could do things like "-net nic,model=venet" and that would trigger qemu
to go out and create the container/device on its own.  TBD.

I hope this helps to clarify!
-Greg