linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Raj, Ashok" <ashok.raj@intel.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"Jiang, Dave" <dave.jiang@intel.com>,
	"vkoul@kernel.org" <vkoul@kernel.org>,
	"megha.dey@linux.intel.com" <megha.dey@linux.intel.com>,
	"maz@kernel.org" <maz@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"Pan, Jacob jun" <jacob.jun.pan@intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Lu, Baolu" <baolu.lu@intel.com>,
	"Kumar, Sanjay K" <sanjay.k.kumar@intel.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	"Lin, Jing" <jing.lin@intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"parav@mellanox.com" <parav@mellanox.com>,
	"dmaengine@vger.kernel.org" <dmaengine@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Ashok Raj <ashok.raj@intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH RFC 00/15] Add VFIO mediated device support and IMS support for the idxd driver.
Date: Fri, 8 May 2020 17:09:09 -0700	[thread overview]
Message-ID: <20200509000909.GA79981@otc-nc-03> (raw)
In-Reply-To: <20200508231610.GO19158@mellanox.com>

Hi Jason

On Fri, May 08, 2020 at 08:16:10PM -0300, Jason Gunthorpe wrote:
> On Fri, May 08, 2020 at 01:47:10PM -0700, Raj, Ashok wrote:
> 
> > Even when uaccel was under development, one of the options
> > was to use VFIO as the transport, goal was the same i.e to keep
> > the user space have one interface. 
> 
> I feel a bit out of the loop here, uaccel isn't in today's kernel is
> it? I've heard about it for a while, it sounds very similar to RDMA,
> so I hope they took some of my advice...

I think since 5.7 maybe? drivers/misc/uacce. I don't think this is like
RDMA, its just a plain accelerator. There is no connection management,
memory registration or other things.. IB was my first job at Intel,
but saying that i would be giving my age away :)

> 
> > But the needs of generic user space application is significantly
> > different from exporting a more functional device model to guest,
> > which isn't full emulated device. which is why VFIO didn't make
> > sense for native use.
> 
> I'm not sure this is true. We've done these kinds of emulated SIOV
> like things already and there is a huge overlap between what a generic
> user application needs and what the VMM neds. Actually almost a
> perfect subset except for interrupt remapping (which is quite
> trivial).

From a simple user application POV, if we need to do simple compression
or such with a shared WQ, all the application needs do do is
bind_mm() that somehow associates the process address space with the 
IOMMU to create that association and communication channel.

For supporting this with guest user, we need to support the same actions
from a guest OS. i.e a guest OS bind should be serviced and end up with the 
IOMMU plumbing it with the guest cr3, and making sure the guest 2nd level 
is plumed right for the nested walk. 

Now we can certainly go bolt all these things again. When VFIO has already 
done the pluming in a generic way.

> 
> The things vfio focuses on, like groups and managing a real config
> space just don't apply here.
> 
> > And when we move things from VFIO which is already established
> > as a general device model and accepted by multiple VMM's it gives
> > instant footing without a whole redesign. 
> 
> Yes, I understand, but I think you need to get more people to support
> this idea. From my standpoint this is taking secure lean VMMs and

When we decided on VFIO, it was after using the best practices then,
after discussion with Kirti Wankhede and Alex. Kevin had used it for
graphics virtualization. It was even presented at KVM forum and such
dating back to 2017. No one has raised alarms until now :-)


> putting emulation code back into them, except in a more dangerous
> kernel location. This does not seem like a net win to me.

Its not a whole lot of emulation right? mdev are soft partitioned. There is
just a single PF, but we can create a separate partition for the guest using
PASID along with the normal BDF (RID). And exposing a consistent PCI like
interface to user space you get everything else for free.

Yes, its not SRIOV, but giving that interface to user space via VFIO, we get 
all of that functionality without having to reinvent a different way to do it.

vDPA went the other way, IRC, they went and put a HW implementation of what
virtio is in hardware. So they sort of fit the model. Here the instance
looks and feels like real hardware for the setup and control aspect.


> 
> You'd be much better to have some userspace library scheme instead of
> being completely tied to a kernel interface for modularity.

Couldn't agree more :-).. all I'm asking is if we can do a phased approach to 
get to that goodness! If we need to move things to user space for emulation
that's a great goal, but it can be evolutionary.

> 
> > When we move things from VFIO to uaccel to bolt on the functionality
> > like VFIO, I suspect we would be moving code/functionality from VFIO
> > to Uaccel. I don't know what the net gain would be.
> 
> Most of VFIO functionality is already decomposed inside the kernel,
> and you need most of it to do secure user access anyhow.
> 
> > For mdev, would you agree we can keep the current architecture,
> > and investigate moving some emulation code to user space (say even for
> > standard vfio_pci) and then expand scope later.
> 
> I won't hard NAK this, but I think you need more people to support
> this general idea of more emulation code in the kernel to go ahead -
> particularly since this is one of many future drivers along this
> design.
> 
> It would be good to hear from the VMM teams that this is what they
> want (and why), for instance.

IRC Paolo was present I think and we can find other VMM folks to chime in if
that helps.


  parent reply	other threads:[~2020-05-09  0:09 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-21 23:33 [PATCH RFC 00/15] Add VFIO mediated device support and IMS support for the idxd driver Dave Jiang
2020-04-21 23:33 ` [PATCH RFC 01/15] drivers/base: Introduce platform_msi_ops Dave Jiang
2020-04-26  7:01   ` Greg KH
2020-04-27 21:38     ` Dave Jiang
2020-04-28  7:34       ` Greg KH
2020-04-21 23:33 ` [PATCH RFC 02/15] drivers/base: Introduce a new platform-msi list Dave Jiang
2020-04-25 21:13   ` Thomas Gleixner
2020-05-04  0:08     ` Dey, Megha
2020-04-21 23:34 ` [PATCH RFC 03/15] drivers/base: Allocate/free platform-msi interrupts by group Dave Jiang
2020-04-25 21:23   ` Thomas Gleixner
2020-05-04  0:08     ` Dey, Megha
2020-04-21 23:34 ` [PATCH RFC 04/15] drivers/base: Add support for a new IMS irq domain Dave Jiang
2020-04-23 20:11   ` Jason Gunthorpe
2020-05-01 22:30     ` Dey, Megha
2020-05-03 22:25       ` Jason Gunthorpe
2020-05-03 22:40         ` Dey, Megha
2020-05-03 22:46           ` Jason Gunthorpe
2020-05-04  0:25             ` Dey, Megha
2020-05-04 12:14               ` Jason Gunthorpe
2020-05-06 10:27                 ` Tian, Kevin
2020-04-25 21:38   ` Thomas Gleixner
2020-05-04  0:11     ` Dey, Megha
2020-04-21 23:34 ` [PATCH RFC 05/15] ims-msi: Add mask/unmask routines Dave Jiang
2020-04-25 21:49   ` Thomas Gleixner
2020-05-04  0:16     ` Dey, Megha
2020-04-21 23:34 ` [PATCH RFC 06/15] ims-msi: Enable IMS interrupts Dave Jiang
2020-04-25 22:13   ` Thomas Gleixner
2020-05-04  0:17     ` Dey, Megha
2020-04-21 23:34 ` [PATCH RFC 07/15] Documentation: Interrupt Message store Dave Jiang
2020-04-23 20:04   ` Jason Gunthorpe
2020-05-01 22:32     ` Dey, Megha
2020-05-03 22:28       ` Jason Gunthorpe
2020-05-03 22:41         ` Dey, Megha
2020-04-21 23:34 ` [PATCH RFC 08/15] vfio/mdev: Add a member for iommu domain in mdev_device Dave Jiang
2020-04-21 23:34 ` [PATCH RFC 09/15] vfio/type1: Save domain when attach domain to mdev Dave Jiang
2020-04-21 23:34 ` [PATCH RFC 10/15] dmaengine: idxd: add config support for readonly devices Dave Jiang
2020-04-21 23:34 ` [PATCH RFC 11/15] dmaengine: idxd: add IMS support in base driver Dave Jiang
2020-04-21 23:35 ` [PATCH RFC 12/15] dmaengine: idxd: add device support functions in prep for mdev Dave Jiang
2020-04-21 23:35 ` [PATCH RFC 13/15] dmaengine: idxd: add support for VFIO mediated device Dave Jiang
2020-04-21 23:35 ` [PATCH RFC 14/15] dmaengine: idxd: add error notification from host driver to " Dave Jiang
2020-04-21 23:35 ` [PATCH RFC 15/15] dmaengine: idxd: add ABI documentation for mediated device support Dave Jiang
2020-04-21 23:54 ` [PATCH RFC 00/15] Add VFIO mediated device support and IMS support for the idxd driver Jason Gunthorpe
2020-04-22  0:53   ` Tian, Kevin
2020-04-22 11:50     ` Jason Gunthorpe
2020-04-22 21:14       ` Raj, Ashok
2020-04-23 19:12         ` Jason Gunthorpe
2020-04-24  3:27           ` Tian, Kevin
2020-04-24 12:44             ` Jason Gunthorpe
2020-04-24 16:25               ` Tian, Kevin
2020-04-24 18:12                 ` Jason Gunthorpe
2020-04-26  5:18                   ` Tian, Kevin
2020-04-26 19:13                     ` Jason Gunthorpe
2020-04-27  3:43                       ` Alex Williamson
2020-04-27 11:58                         ` Jason Gunthorpe
2020-04-27 13:19                           ` Alex Williamson
2020-04-27 13:22                             ` Jason Gunthorpe
2020-04-27 14:18                               ` Alex Williamson
2020-04-27 14:25                                 ` Jason Gunthorpe
2020-04-27 15:41                                   ` Alex Williamson
2020-04-27 16:16                                     ` Jason Gunthorpe
2020-04-27 16:25                                       ` Dave Jiang
2020-04-27 21:56                                         ` Jason Gunthorpe
2020-04-29  9:42                               ` Tian, Kevin
2020-05-08 20:47                                 ` Raj, Ashok
2020-05-08 23:16                                   ` Jason Gunthorpe
2020-05-08 23:52                                     ` Dave Jiang
2020-05-09  0:09                                     ` Raj, Ashok [this message]
2020-05-09 12:21                                       ` Jason Gunthorpe
2020-05-13  2:29                                         ` Jason Wang
2020-05-13  8:30                                         ` Tian, Kevin
2020-05-13 12:40                                           ` Jason Gunthorpe
2020-04-27 12:13                       ` Tian, Kevin
2020-04-27 12:55                         ` Jason Gunthorpe
2020-04-22 21:24   ` Dan Williams
2020-04-23 19:17     ` Dan Williams
2020-04-23 19:49       ` Jason Gunthorpe
2020-05-01 22:31         ` Dey, Megha
2020-05-03 22:21           ` Jason Gunthorpe
2020-05-03 22:32             ` Dey, Megha
2020-04-23 19:18     ` Jason Gunthorpe
2020-05-01 22:31       ` Dey, Megha
2020-05-03 22:22         ` Jason Gunthorpe
2020-05-03 22:31           ` Dey, Megha
2020-05-03 22:36             ` Jason Gunthorpe
2020-05-04  0:20               ` Dey, Megha
2020-04-22 23:04   ` Dey, Megha
2020-04-23 19:44     ` Jason Gunthorpe
2020-05-01 22:32       ` Dey, Megha
2020-04-24  6:31   ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200509000909.GA79981@otc-nc-03 \
    --to=ashok.raj@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=eric.auger@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=jacob.jun.pan@intel.com \
    --cc=jgg@mellanox.com \
    --cc=jing.lin@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=megha.dey@linux.intel.com \
    --cc=parav@mellanox.com \
    --cc=pbonzini@redhat.com \
    --cc=rafael@kernel.org \
    --cc=sanjay.k.kumar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=vkoul@kernel.org \
    --cc=x86@kernel.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).