From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Max Gurtovoy <mgurtovoy@nvidia.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"liranl@nvidia.com" <liranl@nvidia.com>,
	"oren@nvidia.com" <oren@nvidia.com>,
	"tzahio@nvidia.com" <tzahio@nvidia.com>,
	"leonro@nvidia.com" <leonro@nvidia.com>,
	"yarong@nvidia.com" <yarong@nvidia.com>,
	"aviadye@nvidia.com" <aviadye@nvidia.com>,
	"shahafs@nvidia.com" <shahafs@nvidia.com>,
	"artemp@nvidia.com" <artemp@nvidia.com>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"ACurrid@nvidia.com" <ACurrid@nvidia.com>,
	"gmataev@nvidia.com" <gmataev@nvidia.com>,
	"cjia@nvidia.com" <cjia@nvidia.com>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"yishaih@nvidia.com" <yishaih@nvidia.com>,
	"aik@ozlabs.ru" <aik@ozlabs.ru>,
	"Zhao, Yan Y" <yan.y.zhao@intel.com>
Subject: Re: [PATCH v2 0/9] Introduce vfio-pci-core subsystem
Date: Wed, 10 Feb 2021 09:37:46 -0700
Message-ID: <20210210093746.7736b25c@omen.home.shazbot.org>
In-Reply-To: <20210210133452.GW4247@nvidia.com>

On Wed, 10 Feb 2021 09:34:52 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Feb 10, 2021 at 07:52:08AM +0000, Tian, Kevin wrote:
> > > This subsystem framework will also ease adding vendor-specific
> > > functionality to VFIO devices in the future by allowing another module
> > > to provide the pci_driver, which can set up a number of details before
> > > registering with the VFIO subsystem (such as injecting its own
> > > operations).
> > 
> > I'm a bit confused about the change from v1 to v2, especially about
> > how to inject module-specific operations. From a live migration p.o.v.
> > it may require at least two hook points for some devices (e.g. i40e
> > in Yan's original example):
> 
> IMHO, it was too soon to give up on putting the vfio_device_ops in the
> final driver - we should try to define a reasonable public/private
> split of vfio_pci_device, as is the norm in the kernel. No reason we
> can't achieve that.
> 
> >  register a migration region and intercept guest writes to specific
> > registers. [PATCH 4/9] demonstrates the former but not the latter
> > (which was allowed in v1).
> 
> And this is why the ROI of wrapping every vfio op in a PCI op, just to
> keep vfio_pci_device completely private, is poor :(

Says someone who doesn't need to maintain the core, fixing bugs and
adding features, all without breaking vendor drivers that touch private
data in unexpected ways ;)
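
To make the proposal above concrete, the kind of public/private split
being suggested would look roughly like the sketch below.  None of
these names or fields come from this series; it's purely illustrative:

	#include <linux/kernel.h>	/* container_of() */
	#include <linux/mutex.h>
	#include <linux/pci.h>

	/* The public contract vendor drivers may rely on. */
	struct vfio_pci_core_device {
		struct pci_dev	*pdev;
		/* ...only fields vendor drivers may touch... */
	};

	/* Private to vfio-pci-core, invisible outside the core. */
	struct vfio_pci_priv {
		struct vfio_pci_core_device	core;
		struct mutex			igate;	/* irq bookkeeping */
		/* ...config emulation state, region tracking, etc... */
	};

	static inline struct vfio_pci_priv *
	to_priv(struct vfio_pci_core_device *core)
	{
		return container_of(core, struct vfio_pci_priv, core);
	}

The point being that anything behind to_priv() stays free to change
without auditing every vendor driver.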

> > Then another question. Once we have this framework in place, do we
> > mandate this approach for any vendor-specific tweak, or do we still
> > allow doing it as vfio_pci_core extensions (such as igd and zdev in
> > this series)?
> 
> I would say no to any further vfio_pci_core extensions that are tied
> to specific PCI devices. Things like zdev are platform features; they
> are not tied to specific PCI devices.
> 
> > If the latter, what are the criteria for judging which way is
> > desired? Also, what about the scenarios where we just want one-time
> > vendor information, e.g. to tell whether a device can tolerate
> > arbitrary I/O page faults [1], or the offset in VF PCI config space
> > at which to put the PASID/ATS/PRI capabilities [2]? Do we expect to
> > create a module for each device to provide such info? Having those
> > questions answered would help in better understanding this proposal
> > IMO. 😊
> > 
> > [1] https://lore.kernel.org/kvm/d4c51504-24ed-2592-37b4-f390b97fdd00@huawei.com/T/  
> 
> SVA is a platform feature, so no problem. I don't see a vfio-pci
> change in here?
> 
> > [2] https://lore.kernel.org/kvm/20200407095801.648b1371@w520.home/  
> 
> This one could have been done as a broadcom_vfio_pci driver. I'm not
> sure exposing the entire config space unprotected is safe; it's hard
> to know what the device has put in there, and whether it is secure to
> share with a guest...

The same argument could be made about the whole device, not just config
space.  In fact, we know that devices like GPUs provide access to config
space through other address spaces, such as I/O port and MMIO BARs.
Config space is only the architected means to manipulate the link
interface and device features; we can't determine whether it's the only
means.  Any emulation or access restrictions we put on config space are
meant to make the device work better for the user (ex. re-using host
kernel device quirks) or to prevent casual misconfiguration (ex.
incompatible MPS settings, BAR registers); the safety of assigning a
device to a user is only as good as the isolation and error handling
that the platform allows.
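
As a concrete example of the emulation I mean, config write filtering
in this style reduces to a per-register write mask.  A simplified
sketch in that spirit, not the actual vfio_pci_config.c code:

	#include <linux/pci.h>

	/*
	 * Bits set in write_mask pass through to hardware; everything
	 * else is preserved from the current value, so a user can't
	 * casually flip bits like MPS in Device Control or move BARs
	 * underneath the host.
	 */
	static int filtered_config_write(struct pci_dev *pdev, int pos,
					 u32 val, u32 write_mask)
	{
		u32 cur;
		int ret;

		ret = pci_read_config_dword(pdev, pos, &cur);
		if (ret)
			return ret;

		val = (cur & ~write_mask) | (val & write_mask);
		return pci_write_config_dword(pdev, pos, val);
	}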

> > MDEV core is already a well-defined subsystem connecting the mdev
> > bus driver (vfio-mdev) and mdev device drivers (e.g. mlx5-mdev).
> 
> mdev is two things:
>
>  - a driver core bus layer and sysfs that implement a lifetime model
>  - a vfio bus driver that does nothing but forward ops to the
>    main ops
> 
> > vfio-mdev is just the channel to bring the VFIO APIs through mdev core
> > to the underlying vendor-specific mdev device driver, which already has
> > the flexibility to tweak whatever it needs through mdev_parent_ops.
> 
> This is the second thing, and it could just be deleted. The final mdev
> driver can just use vfio_device_ops directly. The redirection shim in
> vfio_mdev.c doesn't add value.
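
(For context, the forwarding in question is essentially one trampoline
like this per vfio_device_ops callback; paraphrasing vfio_mdev.c from
memory rather than quoting it:

	static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
				      size_t count, loff_t *ppos)
	{
		struct mdev_device *mdev = device_data;
		struct mdev_parent *parent = mdev->parent;

		if (unlikely(!parent->ops->read))
			return -EINVAL;

		return parent->ops->read(mdev, buf, count, ppos);
	}

which is exactly the sort of redirection being called out here.)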
> 
> > Then what exact extension is being discussed here by creating another
> > subsystem module? Or are we talking about some general library that
> > can be shared by underlying mdev device drivers to reduce duplicated
> > emulation code?
> 
> IMHO it is more a design philosophy that the end driver should
> implement vfio_device_ops directly rather than having a stack of ops
> structs.

And that's where turning vfio-pci into a vfio-pci-core library comes
in, lowering the bar for a vendor driver to re-use what vfio-pci has
already done.  This doesn't change the basic vfio architecture, which
already allows any vendor driver to register with its own
vfio_device_ops; it's just code reuse, which only makes sense if we're
not simply shifting the support burden from the vendor driver to the
new library-ized core.  Like Kevin though, I don't really understand
the hand-wavy application of this to mdev.  Sure, vfio-mdev could be
collapsed now that we've rejected the idea that other drivers could
bind to mdev devices, but mdev-core provides common lifecycle
management within a class of devices, so if you subscribe to mdev,
your vendor driver is essentially providing vfio_device_ops plus the
extra mdev callbacks.  If your vendor driver doesn't subscribe to that
lifecycle, then just implement a vfio bus driver with your own
vfio_device_ops.  The nature of that mdev lifecycle doesn't seem to
mesh well with a library that expects to manage a whole physical PCI
device, though.
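
For reference, registering with your own vfio_device_ops is already
this simple against the current vfio API.  A rough sketch, where every
my_* name is a placeholder and the real callbacks would delegate to
whatever helpers vfio-pci-core ends up exporting:

	#include <linux/pci.h>
	#include <linux/vfio.h>

	static int my_vfio_open(void *device_data)
	{
		return 0;	/* vendor-specific setup would go here */
	}

	static void my_vfio_release(void *device_data)
	{
		/* vendor-specific teardown */
	}

	static const struct vfio_device_ops my_vfio_ops = {
		.name		= "my-vfio-pci",
		.open		= my_vfio_open,
		.release	= my_vfio_release,
		/* .read/.write/.ioctl/.mmap delegating to core helpers */
	};

	static int my_probe(struct pci_dev *pdev,
			    const struct pci_device_id *id)
	{
		/* NULL stands in for real per-device state */
		return vfio_add_group_dev(&pdev->dev, &my_vfio_ops, NULL);
	}

(with the matching vfio_del_group_dev() in the remove path).  Thanks,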

Alex

