kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Rosato <mjrosato@linux.ibm.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: linux-s390@vger.kernel.org, cohuck@redhat.com,
	schnelle@linux.ibm.com, farman@linux.ibm.com,
	pmorel@linux.ibm.com, borntraeger@linux.ibm.com,
	hca@linux.ibm.com, gor@linux.ibm.com,
	gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com,
	frankja@linux.ibm.com, david@redhat.com, imbrenda@linux.ibm.com,
	vneethv@linux.ibm.com, oberpar@linux.ibm.com,
	freude@linux.ibm.com, thuth@redhat.com, pasic@linux.ibm.com,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 24/30] vfio-pci/zdev: wire up group notifier
Date: Tue, 8 Feb 2022 15:33:58 -0500	[thread overview]
Message-ID: <438d8b1e-e149-35f1-a8c9-ed338eb97430@linux.ibm.com> (raw)
In-Reply-To: <20220208122624.43ad52ef.alex.williamson@redhat.com>

On 2/8/22 2:26 PM, Alex Williamson wrote:
> On Tue, 8 Feb 2022 14:51:41 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
>> On Tue, Feb 08, 2022 at 10:43:19AM -0700, Alex Williamson wrote:
>>> On Fri,  4 Feb 2022 16:15:30 -0500
>>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>>    
>>>> KVM zPCI passthrough device logic will need a reference to the associated
>>>> kvm guest that has access to the device.  Let's register a group notifier
>>>> for VFIO_GROUP_NOTIFY_SET_KVM to catch this information in order to create
>>>> an association between a kvm guest and the host zdev.
>>>>
>>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>>>   arch/s390/include/asm/kvm_pci.h  |  2 ++
>>>>   drivers/vfio/pci/vfio_pci_core.c |  2 ++
>>>>   drivers/vfio/pci/vfio_pci_zdev.c | 46 ++++++++++++++++++++++++++++++++
>>>>   include/linux/vfio_pci_core.h    | 10 +++++++
>>>>   4 files changed, 60 insertions(+)
>>>>
>>>> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
>>>> index e4696f5592e1..16290b4cf2a6 100644
>>>> +++ b/arch/s390/include/asm/kvm_pci.h
>>>> @@ -16,6 +16,7 @@
>>>>   #include <linux/kvm.h>
>>>>   #include <linux/pci.h>
>>>>   #include <linux/mutex.h>
>>>> +#include <linux/notifier.h>
>>>>   #include <asm/pci_insn.h>
>>>>   #include <asm/pci_dma.h>
>>>>   
>>>> @@ -32,6 +33,7 @@ struct kvm_zdev {
>>>>   	u64 rpcit_count;
>>>>   	struct kvm_zdev_ioat ioat;
>>>>   	struct zpci_fib fib;
>>>> +	struct notifier_block nb;
>>>>   };
>>>>   
>>>>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>>>> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
>>>> index f948e6cd2993..fc57d4d0abbe 100644
>>>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>>>> @@ -452,6 +452,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
>>>>   
>>>>   	vfio_pci_vf_token_user_add(vdev, -1);
>>>>   	vfio_spapr_pci_eeh_release(vdev->pdev);
>>>> +	vfio_pci_zdev_release(vdev);
>>>>   	vfio_pci_core_disable(vdev);
>>>>   
>>>>   	mutex_lock(&vdev->igate);
>>>> @@ -470,6 +471,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
>>>>   void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
>>>>   {
>>>>   	vfio_pci_probe_mmaps(vdev);
>>>> +	vfio_pci_zdev_open(vdev);
>>>>   	vfio_spapr_pci_eeh_open(vdev->pdev);
>>>>   	vfio_pci_vf_token_user_add(vdev, 1);
>>>>   }
>>>
>>> If this handling were for a specific device, I think we'd be suggesting
>>> this is the point at which we cross over to a vendor variant making use
>>> of vfio-pci-core rather than hooking directly into the core code.
>>
>> Personally, I think it is wrong layering for VFIO to be aware of KVM
>> like this. This marks the first time that VFIO core code itself is
>> being made aware of the KVM linkage.
> 
> I agree, but I've resigned that I've lost that battle.  Both mdev vGPU
> vendors make specific assumptions about running on a VM.  VFIO was
> never intended to be tied to KVM or the specific use case of a VM.
> 
>> It copies the same kind of design the s390 specific mdev use of
>> putting VFIO in charge of KVM functionality. If we are doing this we
>> should just give up and admit that KVM is a first-class part of struct
>> vfio_device and get rid of the notifier stuff too, at least for s390.
> 
> Euw.  You're right, I really don't like vfio core code embracing this
> dependency for s390, device specific use cases are bad enough.
> 
>> Reading the patches and descriptions pretty much everything is boiling
>> down to 'use vfio to tell the kvm architecture code to do something' -
>> which I think needs to be handled through a KVM side ioctl.
> 
> AIF at least sounds a lot like the reason we invented the irq bypass
> mechanism to allow interrupt producers and consumers to register
> independently and associate to each other with a shared token.

Yes, these do sound quite similar, looking at it now though I haven't 
yet fully grokked irq bypass...  But with AIF you have the case where 
either the interrupt will be delivered directly to a guest from firmware 
via an s390 construct (gisa) or under various circumstances the host 
(kvm) will be prodded to perform the delivery (still via gisa) instead.

> 
> Is the purpose of IOAT to associate the device to a set of KVM page
> tables?  That seems like a container or future iommufd operation.  I

Yes, here we are establishing a relationship with the DMA table in the 
guest so that once mappings are established guest PCI operations 
(handled via special instructions in s390) don't need to go through the 
host but can be directly handled by firmware (so, effectively guest can 
keep running on its vcpu vs breaking out).

> read DTSM as supported formats for the IOAT.
> 
>> Or, at the very least, everything needs to be described in some way
>> that makes it clear what is happening to userspace, without kvm,
>> through these ioctls.

Nothing, they don't need these ioctls.  Userspace without a KVM 
registration for the device in question gets -EINVAL.

> 
> As I understand the discussion here:
> 
> https://lore.kernel.org/all/20220204211536.321475-15-mjrosato@linux.ibm.com/
> 
> The assumption is that there is no non-KVM userspace currently.  This
> seems like a regression to me.

It's more that non-KVM userspace doesn't care about what these ioctls 
are doing...  The enabling of 'interp, aif, ioat' is only pertinent when 
there is a KVM userspace, specifically because the information being 
shared / actions being performed as a result are only relevant to 
properly enabling zPCI features when the zPCI device is being passed 
through to a VM guest.  If you're just using a userspace driver to talk 
to the device (no KVM guest involved) then the kernel zPCI layer already 
has this device set up using whatever s390 facilities are available.

> 
>> This seems especially true now that it seems s390 PCI support is
>> almost truely functional, with actual new userspace instructions to
>> issue MMIO operations that work outside of KVM.
>>
>> I'm not sure how this all fits together, but I would expect an outcome
>> where DPDK could run on these new systems and not have to know
>> anything more about s390 beyond using the proper MMIO instructions via
>> some compilation time enablement.
> 
> Yes, fully enabling zPCI with vfio, but only for KVM is not optimal.

See above.  I think there is a misunderstanding here, it's not that we 
are only enabling zPCI with vfio for KVM, but rather than when using 
vfio to pass the device to a guest there is additional work that has to 
happen in order to 'fully enable' zPCI.

> 
>> (I've been reviewing s390 patches updating rdma for a parallel set of
>> stuff)
>>
>>> this is meant to extend vfio-pci proper for the whole arch.  Is there a
>>> compromise in using #ifdefs in vfio_pci_ops to call into zpci specific
>>> code that implements these arch specific hooks and the core for
>>> everything else?  SPAPR code could probably converted similarly, it
>>> exists here for legacy reasons. [Cc Jason]
>>
>> I'm not sure I get what you are suggesting? Where would these ifdefs
>> be?
> 
> Essentially just:
> 
> static const struct vfio_device_ops vfio_pci_ops = {
>          .name           = "vfio-pci",
> #ifdef CONFIG_S390
>          .open_device    = vfio_zpci_open_device,
>          .close_device   = vfio_zpci_close_device,
>          .ioctl          = vfio_zpci_ioctl,
> #else
>          .open_device    = vfio_pci_open_device,
>          .close_device   = vfio_pci_core_close_device,
>          .ioctl          = vfio_pci_core_ioctl,
> #endif
>          .read           = vfio_pci_core_read,
>          .write          = vfio_pci_core_write,
>          .mmap           = vfio_pci_core_mmap,
>          .request        = vfio_pci_core_request,
>          .match          = vfio_pci_core_match,
> };
> 
> It would at least provide more validation/exercise of the core/vendor
> split.  Thanks,
> 
> Alex
> 


  parent reply	other threads:[~2022-02-08 22:23 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-04 21:15 [PATCH v3 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 01/30] s390/sclp: detect the zPCI load/store interpretation facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 02/30] s390/sclp: detect the AISII facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 03/30] s390/sclp: detect the AENI facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 04/30] s390/sclp: detect the AISI facility Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 05/30] s390/airq: pass more TPI info to airq handlers Matthew Rosato
2022-02-07  8:28   ` Cornelia Huck
2022-02-04 21:15 ` [PATCH v3 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
2022-02-07  8:29   ` Cornelia Huck
2022-02-07  8:42   ` Claudio Imbrenda
2022-02-04 21:15 ` [PATCH v3 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 08/30] s390/pci: stash associated GISA designation Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 09/30] s390/pci: export some routines related to RPCIT processing Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 10/30] s390/pci: stash dtsm and maxstbl Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 11/30] s390/pci: add helper function to find device by handle Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 12/30] s390/pci: get SHM information from list pci Matthew Rosato
2022-02-07 10:08   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 14/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV Matthew Rosato
2022-02-07  8:35   ` Cornelia Huck
2022-02-07 15:43     ` Matthew Rosato
2022-02-07 17:59       ` Cornelia Huck
2022-02-07 20:09         ` Matthew Rosato
2022-02-10 10:07           ` Cornelia Huck
2022-02-04 21:15 ` [PATCH v3 15/30] KVM: s390: pci: add basic kvm_zdev structure Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 16/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 17/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications Matthew Rosato
2022-02-14 12:59   ` Pierre Morel
2022-02-14 20:35     ` Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 18/30] KVM: s390: mechanism to enable guest zPCI Interpretation Matthew Rosato
2022-02-14 13:06   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 19/30] KVM: s390: pci: provide routines for enabling/disabling interpretation Matthew Rosato
2022-02-14 13:22   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 20/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 21/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 22/30] KVM: s390: pci: handle refresh of PCI translations Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 23/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 24/30] vfio-pci/zdev: wire up group notifier Matthew Rosato
2022-02-08 17:43   ` Alex Williamson
2022-02-08 18:51     ` Jason Gunthorpe
2022-02-08 19:26       ` Alex Williamson
2022-02-08 19:51         ` Jason Gunthorpe
2022-02-08 20:33         ` Matthew Rosato [this message]
2022-02-08 20:40           ` Jason Gunthorpe
2022-02-08 21:37             ` Matthew Rosato
2022-02-10 11:15             ` Niklas Schnelle
2022-02-10 13:01               ` Jason Gunthorpe
2022-02-10 14:06                 ` Niklas Schnelle
2022-02-10 15:23                   ` Jason Gunthorpe
2022-02-10 18:59                     ` Matthew Rosato
2022-02-10 23:45                       ` Jason Gunthorpe
2022-02-04 21:15 ` [PATCH v3 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
2022-02-07 16:38   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 28/30] vfio-pci/zdev: add DTSM to clp group capability Matthew Rosato
2022-02-04 21:15 ` [PATCH v3 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation Matthew Rosato
2022-02-07 16:36   ` Pierre Morel
2022-02-04 21:15 ` [PATCH v3 30/30] MAINTAINERS: additional files related kvm s390 pci passthrough Matthew Rosato
2022-02-07 13:04   ` Christian Borntraeger
2022-02-07 15:44     ` Matthew Rosato
2022-02-04 21:33 ` [PATCH v3 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=438d8b1e-e149-35f1-a8c9-ed338eb97430@linux.ibm.com \
    --to=mjrosato@linux.ibm.com \
    --cc=agordeev@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=frankja@linux.ibm.com \
    --cc=freude@linux.ibm.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=jgg@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=oberpar@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    --cc=pmorel@linux.ibm.com \
    --cc=schnelle@linux.ibm.com \
    --cc=thuth@redhat.com \
    --cc=vneethv@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).