On Thu, Dec 03, 2020 at 09:40:37AM -0500, Peter Xu wrote: > On Thu, Dec 03, 2020 at 11:10:36AM +0000, Stefan Hajnoczi wrote: > > On Wed, Dec 02, 2020 at 01:06:28PM -0500, Peter Xu wrote: > > > On Wed, Nov 25, 2020 at 12:44:07PM -0800, Elena Afanasova wrote: > > > > > > [...] > > > > > > > Wire protocol > > > > ------------- > > > > The protocol spoken over the file descriptor is as follows. The device reads > > > > commands from the file descriptor with the following layout:: > > > > > > > > struct ioregionfd_cmd { > > > > __u32 info; > > > > __u32 padding; > > > > __u64 user_data; > > > > __u64 offset; > > > > __u64 data; > > > > }; > > > > > > I'm thinking whether it would be nice to have a handshake on the wire protocol > > > before starting the cmd/resp sequence. > > > > > > I was thinking about migration - we have had a hard time trying to be > > > compatible between old/new qemus. Now we fixed those by applying the same > > > migration capabilities on both sides always so we do the handshake "manually" > > > from libvirt, but it really should be done with a real handshake on the > > > channel, imho.. That's another story, for sure. > > > > > > My understanding is that the wire protocol is kind of a standalone (but tiny) > > > protocol between kvm and the emulation process. So I'm thinking the handshake > > > could also help when e.g. kvm can fallback to an old version of wire protocol > > > if it knows the emulation binary is old. Ideally, I think this could even > > > happen without VMM's awareness. > > > > > > [...] > > > > I imagined that would happen in the control plane (KVM ioctls) instead > > of the data plane (the fd). There is a flags field in > > ioctl(KVM_SET_IOREGION): > > > > struct kvm_ioregion { > > __u64 guest_paddr; /* guest physical address */ > > __u64 memory_size; /* bytes */ > > __u64 user_data; > > __s32 fd; /* previously created with KVM_CREATE_IOREGIONFD */ > > __u32 flags; > > __u8 pad[32]; > > }; > > > > When userspace sets up the ioregionfd it can tell the kernel which > > features to enable. > > > > Feature availability can be checked through ioctl(KVM_CHECK_EXTENSION). > > > > Do you think this existing mechanism is enough? It's not clear to me > > what kind of additional negotiation would be necessary between the > > device emulation process and KVM after the ioregionfd has been > > registered? > > Yes I think kvm capability can be used as a way for the handshake between VMM > and KVM. However I'm not sure how to do similar things to the emulation > process besides passing over the ioregionfd, because that's between VMM and the > emulation process. Is there a way to do so with current proposasl? The interface is designed so the VMM controls the ioregionfd setup on behalf of the device emulation process. This is for security: the device emulation process is untrusted. There is setup involving the VMM and the device emulation process that is outside the scope of the KVM API where the device emulation process could request for the VMM to enable ioregionfd features. For example, if QEMU is initializing a vfio-user device emulation program then QEMU queries available vfio-user device regions from the device emulation program. This is where extra information like ioregionfd support would be indicated by the device emulation program. > The out-of-order feature may not be a good enough example if it's destined to > be useless... but let's just imagine we have such a requirement as an extention > to the current wire protocol. What I was thinking was whether we could have > another handshake somehow to the emulation process, so that we can identify "ok > this emulation process supports out-of-band" or vice versa. Only with that > information could VMM enable/disable the out-of-band in KVM. > > The handshake to the emulation process can start from either VMM or KVM. And > my previous idea was simply let KVM and emualtion process talk rather than VMM > against the emulation, because they really look like two isolated protocols: > > - The VMM <-> KVM talks via KVM_SET_IOREGIONFD, it is only responsible to > setup a mmio/pio region in the guest and create the fd. As long as the > region is properly setup, and the fd is passed over to emulation process, > it's done as a proxy layer. > > - The KVM <-> emulation process talks via the wire protocol (so this protocol > can be irrelevant to KVM_SET_IOREGIONFD protocol itself). For example the > out-of-band feature changes the behavior of the wire protocol. Ideally it > does not even need a new KVM capability because KVM_SET_IOREGIONFD > functionality shouldn't be touched. > > I thought it would be cleaner to have these protocols separate, but I could > also be wrong or over-engineering. This makes sense to me if the wire protocol needs to become complex. For now sticking to the simple control/data plane approach is what I would suggest for security. Stefan