On Fri, 26 Jun 2020 14:49:37 +0200 Janosch Frank wrote:

> On 6/26/20 12:58 PM, Daniel P. Berrangé wrote:
> > On Fri, Jun 26, 2020 at 11:29:03AM +0100, Dr. David Alan Gilbert wrote:
> >> * Janosch Frank (frankja@linux.ibm.com) wrote:
> >>> On 6/26/20 11:32 AM, Daniel P. Berrangé wrote:
> >>>> On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote:
> >>>>> On 6/26/20 8:53 AM, David Hildenbrand wrote:
> >>>>>>>>>> Does this have any implications when probing with the 'none' machine?
> >>>>>>>>>
> >>>>>>>>> I'm not sure. In your case, I guess the cpu bit would still show up
> >>>>>>>>> as before, so it would tell you base feature availability, but not
> >>>>>>>>> whether you can use the new configuration option.
> >>>>>>>>>
> >>>>>>>>> Since the HTL option is generic, you could still set it on the "none"
> >>>>>>>>> machine, though it wouldn't really have any effect. That is, if you
> >>>>>>>>> could create a suitable object to point it at, which would depend on
> >>>>>>>>> ... details.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> The important point is that we never want the (expanded) host cpu model
> >>>>>>>> look different when either specifying or not specifying the HTL
> >>>>>>>> property.
> >>>>>>>
> >>>>>>> Ah, yes, I see your point. So my current suggestion will satisfy
> >>>>>>> that, basically it is:
> >>>>>>>
> >>>>>>> cpu has unpack (inc. by default) && htl specified
> >>>>>>>   => works (allowing secure), as expected
> >>>>>>
> >>>>>> ack
> >>>>>>
> >>>>>>>
> >>>>>>> !cpu has unpack && htl specified
> >>>>>>>   => bails out with an error
> >>>>>>
> >>>>>> ack
> >>>>>>
> >>>>>>>
> >>>>>>> !cpu has unpack && !htl specified
> >>>>>>>   => works for a non-secure guest, as expected
> >>>>>>>   => guest will fail if it attempts to go secure
> >>>>>>
> >>>>>> ack, behavior just like running on older hw without unpack
> >>>>>>
> >>>>>>>
> >>>>>>> cpu has unpack && !htl specified
> >>>>>>>   => works as expected for a non-secure guest (unpack feature is
> >>>>>>>      present, but unused)
> >>>>>>>   => secure guest may work "by accident", but only if all virtio
> >>>>>>>      properties have the right values, which is the user's
> >>>>>>>      problem
> >>>>>>>
> >>>>>>> That last case is kinda ugly, but I think it's tolerable.
> >>>>>>
> >>>>>> Right, we must not affect non-secure guests, and existing secure setups
> >>>>>> (e.g., older qemu machines). Will have to think about this some more,
> >>>>>> but does not sound too crazy.
> >>>>>
> >>>>> I severely dislike having to specify things to make PV work.
> >>>>> The IOMMU is already a thorn in our side and we're working on making the
> >>>>> whole ordeal completely transparent so the only requirement to make this
> >>>>> work is the right machine, kernel, qemu and kernel cmd line option
> >>>>> "prot_virt=1". That's why we do the reboot into PV mode in the first place.
> >>>>>
> >>>>> I.e. the goal is that if customers convert compatible guests into
> >>>>> protected ones and start them up on a z15 on a distro with PV support
> >>>>> they can just use the guest without having to change XML or command line
> >>>>> parameters.
> >>>>
> >>>> If you're exposing new features to the guest machine, then it is usually
> >>>> to be expected that XML and QEMU command line will change. Some simple
> >>>> things might be hidable behind a new QEMU machine type or CPU model, but
> >>>> there's a limit to how much should be hidden that way while staying sane.
> >>>>
> >>>> I'd really expect the configuration to change when switching a guest to
> >>>> a new hardware platform and wanting major new functionality to be enabled.
> >>>> The XML / QEMU config is a low level instantiation of a particular feature
> >>>> set, optimized for a specific machine, rather than a high level description
> >>>> of ideal "best" config independent of host machine.
> >>>
> >>> You still have to set the host command line and make sure that unpack is
> >>> available. Currently you also have to specify the IOMMU which we like to
> >>> drop as a requirement. Everything else is dependent on runtime
> >>> information which tells us if we need to take a PV or non-PV branch.
> >>> Having the unpack facility should be enough to use the unpack facility.
> >>>
> >>> Keep in mind that we have no real concept of a special protected VM to
> >>> begin with. If the VM never boots into a protected kernel it will never
> >>> be protected. On a reboot it drops from protected into unprotected mode
> >>> to execute the bios and boot loader and then may or may not move back
> >>> into a protected state.
> >>
> >> My worry isn't actually how painful adding all the iommu glue is, but
> >> what happens when users forget; especially if they forget for one
> >> device.
> >>
> >> I could appreciate having a machine option to cause iommu to then get
> >> turned on with all other devices; but I think also we could do with
> >> something that failed with a nice error if an iommu flag was missing.
> >> For SEV this could be done pretty early, but for power/s390 I guess
> >> you'd have to do this when someone tried to enable secure mode, but
> >> I'm not sure you can tell.
> >
> > What is the cost / downside of turning on the iommu option for virtio
> > devices ? Is it something that is reasonable for a mgmt app todo
> > unconditionally, regardless of whether memory encryption is in use,
> > or will that have a negative impact on things ?
>
> speed, memory usage and compatibility problems.
> There might also be a problem with s390 having to use <=2GB iommu areas
> in the guest, I need to check with Halil if this is still true.

It is partially true. The coherent_dma_mask is 31 bit and the dma_mask
is 64. That means if iommu=on but !PV the coherent stuff will use <= 2GB
(that stuff allocated by virtio core, like virtqueues, CCWs, etc.) but
there will be no bounce buffering. We don't even initialize swiotlb if
!PV.

I agree with Janosch, we want iommu='on' only when really needed. I've
tried to make that point several times.

Regards,
Halil

>
> Also, if the default or specified IOMMU buffer size isn't big enough for
> your IO workload the guest is gonna have a very bad time. I.e. if
> somebody has an alternative implementation of bounce buffers we'd be
> happy to take it :)
>
> >
> > Regards,
> > Daniel
> >
> >
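
Note on the mask arithmetic in the reply above: a 31-bit coherent_dma_mask limits coherent allocations to addresses below 2 GiB, while a 64-bit dma_mask imposes no practical limit. The following is only a minimal sketch of that arithmetic. The helper names (dma_bit_mask, fits) and the example addresses are made up for illustration and are not kernel or QEMU interfaces; dma_bit_mask merely follows the same idea as the kernel's DMA_BIT_MASK() macro.

```python
# Illustrative sketch only: these helpers are hypothetical, not kernel/QEMU APIs.

GIB = 1 << 30


def dma_bit_mask(bits: int) -> int:
    """Return a mask with the low `bits` bits set."""
    return (1 << bits) - 1


# Values taken from the mail: 31-bit coherent mask, 64-bit streaming mask.
COHERENT_DMA_MASK = dma_bit_mask(31)  # virtqueues, CCWs, other virtio core allocations
DMA_MASK = dma_bit_mask(64)


def fits(addr: int, size: int, mask: int) -> bool:
    """True if the buffer [addr, addr + size) is fully addressable under `mask`."""
    return size > 0 and addr + size - 1 <= mask


if __name__ == "__main__":
    # Highest address reachable through the coherent mask is 2 GiB - 1.
    print(hex(COHERENT_DMA_MASK), (COHERENT_DMA_MASK + 1) // GIB, "GiB limit")
    # A 4 KiB buffer ending exactly at the 2 GiB boundary is still usable.
    print(fits(2 * GIB - 4096, 4096, COHERENT_DMA_MASK))
    # The same buffer placed at 2 GiB only works under the 64-bit mask.
    print(fits(2 * GIB, 4096, COHERENT_DMA_MASK), fits(2 * GIB, 4096, DMA_MASK))
```

The sketch covers only the addressing limit. Whether bounce buffering happens is a separate question; per the reply, swiotlb is not initialized at all when the guest is not PV.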