linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Matthew Rosato <mjrosato@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	linux-s390@vger.kernel.org, cohuck@redhat.com,
	farman@linux.ibm.com, pmorel@linux.ibm.com,
	borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com,
	gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com,
	frankja@linux.ibm.com, david@redhat.com, imbrenda@linux.ibm.com,
	vneethv@linux.ibm.com, oberpar@linux.ibm.com,
	freude@linux.ibm.com, thuth@redhat.com, pasic@linux.ibm.com,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 24/30] vfio-pci/zdev: wire up group notifier
Date: Thu, 10 Feb 2022 15:06:35 +0100	[thread overview]
Message-ID: <fc5cce270dc01d46a6a42f2d268166a0a952fcb3.camel@linux.ibm.com> (raw)
In-Reply-To: <20220210130150.GF4160@nvidia.com>

On Thu, 2022-02-10 at 09:01 -0400, Jason Gunthorpe wrote:
> On Thu, Feb 10, 2022 at 12:15:58PM +0100, Niklas Schnelle wrote:
> 
> > In a KVM or z/VM guest the guest is informed that IOMMU translations
> > need to be refreshed even for previously invalid IOVAs. With this the
> > guest builds it's IOMMU translation tables as normal but then does a
> > RPCIT for the IOVA range it touched. In the hypervisor we can then
> > simply walk the translation tables, pin the guest pages and map them in
> > the host IOMMU. Prior to this series this happened in QEMU which does
> > the map via vfio-iommu-type1 from user-space. This works and will
> > remain as a fallback. Sadly it is quite slow and has a large impact on
> > performance as we need to do a lot of mapping operations as the DMA API
> > of the guest goes through the virtual IOMMU. This series thus adds the
> > same functionality but as a KVM intercept of RPCIT. Now I think this
> > neatly fits into KVM, we're emulating an instruction after all and most
> > of its work is KVM specific pinning of guest pages. Importantly all
> > other handling like IOMMU domain attachment still goes through vfio-
> > iommu-type1 and we just fast path the map/unmap operations.
> 
> So you create an iommu_domain and then hand it over to kvm which then
> does map/unmap operations on it under the covers?

Yes

> 
> How does the page pinning work?

The pinning is done directly in the RPCIT interception handler pinning
both the IOMMU tables and the guest pages mapped for DMA.

> 
> In the design we are trying to reach I would say this needs to be
> modeled as a special iommu_domain that has this automatic map/unmap
> behavior from following user pages. Creating it would specify the kvm
> and the in-guest base address of the guest's page table. 

Makes sense.

> Then the
> magic kernel code you describe can operate on its own domain without
> becoming confused with a normal map/unmap domain.

This sounds like an interesting idea. Looking at
drivers/iommu/s390_iommu.c most of that is pretty trivial domain
handling. I wonder if we could share this by marking the existing
s390_iommu_domain type with kind of a "lent out to KVM" flag. Possibly
by simply having a non-NULL pointer to a struct holding the guest base
address and kvm etc? That way we can share the setup/tear down of the
domain and of host IOMMU tables as well as aperture checks the same
while also being able to keep the IOMMU API from interfering with the
KVM RPCIT intercept and vice versa. I.e. while the domain is under
control of KVM's RPCIT handling we make all IOMMU map/unmap fail.

To me this more direct involvement of IOMMU and KVM on s390x is also a
direct consequence of it using special instructions. Naturally those
instructions can be intercepted or run under hardware accelerated
virtualization.

> 
> It is like the HW nested translation other CPUs are doing, but instead
> of HW nested, it is SW nested.

Yes very good analogy. Has any of that nested IOMMU translations work
been merged yet? I know AMD had something in the works and nested
translations have been around for the MMU for a while and are also used
on s390x. We're definitely thinking about HW nested IOMMU translations
too so any design we come up with would be able to deal with that too.
Basically we would then execute RPCIT without leaving the hardware
virtualization mode (SIE). We believe that that would require pinning
all of guest memory though because HW can't really pin pages.


  reply	other threads:[~2022-02-10 14:06 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-04 21:15 [PATCH v3 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 01/30] s390/sclp: detect the zPCI load/store interpretation facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 02/30] s390/sclp: detect the AISII facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 03/30] s390/sclp: detect the AENI facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 04/30] s390/sclp: detect the AISI facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 05/30] s390/airq: pass more TPI info to airq handlers Matthew Rosato
2022-02-07  8:28   ` Cornelia Huck
2022-02-04 21:15 ` [PATCH v3 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
2022-02-07  8:29   ` Cornelia Huck
2022-02-07  8:42   ` Claudio Imbrenda
2022-02-04 21:15 ` [PATCH v3 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 08/30] s390/pci: stash associated GISA designation Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 09/30] s390/pci: export some routines related to RPCIT processing Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 10/30] s390/pci: stash dtsm and maxstbl Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 11/30] s390/pci: add helper function to find device by handle Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 12/30] s390/pci: get SHM information from list pci Matthew Rosato
2022-02-07 10:08   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 14/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV Matthew Rosato
2022-02-07  8:35   ` Cornelia Huck
2022-02-07 15:43     ` Matthew Rosato
2022-02-07 17:59       ` Cornelia Huck
2022-02-07 20:09         ` Matthew Rosato
2022-02-10 10:07           ` Cornelia Huck
2022-02-04 21:15 ` [PATCH v3 15/30] KVM: s390: pci: add basic kvm_zdev structure Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 16/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 17/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications Matthew Rosato
2022-02-14 12:59   ` Pierre Morel
2022-02-14 20:35     ` Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 18/30] KVM: s390: mechanism to enable guest zPCI Interpretation Matthew Rosato
2022-02-14 13:06   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 19/30] KVM: s390: pci: provide routines for enabling/disabling interpretation Matthew Rosato
2022-02-14 13:22   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 20/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 21/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 22/30] KVM: s390: pci: handle refresh of PCI translations Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 23/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 24/30] vfio-pci/zdev: wire up group notifier Matthew Rosato
2022-02-08 17:43   ` Alex Williamson
2022-02-08 18:51     ` Jason Gunthorpe
2022-02-08 19:26       ` Alex Williamson
2022-02-08 19:51         ` Jason Gunthorpe
2022-02-08 20:33         ` Matthew Rosato
2022-02-08 20:40           ` Jason Gunthorpe
2022-02-08 21:37             ` Matthew Rosato
2022-02-10 11:15             ` Niklas Schnelle
2022-02-10 13:01               ` Jason Gunthorpe
2022-02-10 14:06                 ` Niklas Schnelle [this message]
2022-02-10 15:23                   ` Jason Gunthorpe
2022-02-10 18:59                     ` Matthew Rosato
2022-02-10 23:45                       ` Jason Gunthorpe
2022-02-04 21:15 ` [PATCH v3 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
2022-02-07 16:38   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 28/30] vfio-pci/zdev: add DTSM to clp group capability Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation Matthew Rosato
2022-02-07 16:36   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 30/30] MAINTAINERS: additional files related kvm s390 pci passthrough Matthew Rosato
2022-02-07 13:04   ` Christian Borntraeger
2022-02-07 15:44     ` Matthew Rosato
2022-02-04 21:33 ` [PATCH v3 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fc5cce270dc01d46a6a42f2d268166a0a952fcb3.camel@linux.ibm.com \
    --to=schnelle@linux.ibm.com \
    --cc=agordeev@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=frankja@linux.ibm.com \
    --cc=freude@linux.ibm.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=jgg@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=oberpar@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    --cc=pmorel@linux.ibm.com \
    --cc=thuth@redhat.com \
    --cc=vneethv@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).