From: Thomas Gleixner <tglx@linutronix.de>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dave Jiang <dave.jiang@intel.com>,
	alex.williamson@redhat.com, kwankhede@nvidia.com,
	vkoul@kernel.org, megha.dey@intel.com, jacob.jun.pan@intel.com,
	ashok.raj@intel.com, yi.l.liu@intel.com, baolu.lu@intel.com,
	kevin.tian@intel.com, sanjay.k.kumar@intel.com,
	tony.luck@intel.com, dan.j.williams@intel.com,
	eric.auger@redhat.com, pbonzini@redhat.com,
	dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH v6 15/20] vfio/mdev: idxd: ims domain setup for the vdcm
Date: Tue, 01 Jun 2021 01:55:22 +0200
Message-ID: <875yyy31cl.ffs@nanos.tec.linutronix.de>
In-Reply-To: <20210531165754.GV1002214@nvidia.com>

On Mon, May 31 2021 at 13:57, Jason Gunthorpe wrote:
> On Mon, May 31, 2021 at 04:02:02PM +0200, Thomas Gleixner wrote:
>> > I'm quite surprised that every mdev doesn't create its own ims_domain
>> > in its probe function.
>> 
>> What for?
>
> IDXD wouldn't need it, but proper IMS HW with no bound on the number
> of vectors can't provide an ims_info.max_slots value here.

There is no need to do so:

     https://lore.kernel.org/r/20200826112335.202234502@linutronix.de

which has the IMS_MSI_QUEUE variant, which you looked at and said:

 "I haven't looked through everything in detail, but this does look like
  it is good for the mlx5 devices."

ims_info.max_slots is a property of the IMS_MSI_ARRAY variant and does
not impose any restrictions on other storage types.
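
For illustration, a rough sketch of how those two storage models
differ. IMS_MSI_ARRAY and IMS_MSI_QUEUE are the variant names from the
referenced series; the struct layout below is invented for this
example, not a quote from the posted code:

  /* Illustrative only -- the two IMS storage variants discussed above. */
  enum ims_type {
          IMS_MSI_ARRAY,  /* fixed slot array in device memory */
          IMS_MSI_QUEUE,  /* per-queue storage, e.g. in system memory */
  };

  struct ims_info {
          enum ims_type   type;
          /*
           * Only meaningful for IMS_MSI_ARRAY: the slot array is
           * finite, so the driver declares its size up front.
           * IMS_MSI_QUEUE storage is created per queue on demand,
           * so there is no device-global limit to declare.
           */
          unsigned int    max_slots;
  };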
  
> Instead each use site, like VFIO, would want to specify the number
> of vectors to allocate for its own usage, then parcel them out one by
> one in the normal way. Basically VFIO is emulating a normal MSI-X
> table.

Just with a size which exceeds a normal MSI-X table, but that's an
implementation detail of the underlying physical device. It does not put
any restrictions on mdev at all.

>> > This places a global total limit on the # of vectors which makes me
>> > ask what was the point of using IMS in the first place?
>>
>> That depends on how IMS is implemented. The IDXD variant has a fixed
>> sized message store which is shared between all subdevices, so yet
>> another domain would not provide any value.
>
> Right, IDXD would have been perfectly happy to use the normal MSI-X
> table from what I can see.

Again. No, it's a storage size problem and regular MSI-X does not
support auxiliary data.

>> For the case where the IMS store is separate, you still have one central
>> irqdomain per physical device. The domain allocation function can then
>> create storage on demand or reuse existing storage and just fill in the
>> pointers.
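
As a minimal sketch of that scheme, with all helper names invented
(this is not the posted API): the per-physical-device domain's alloc
callback reuses or creates the sub-device's message store and stashes
the pointer as per-interrupt data:

  /* Hypothetical alloc callback for the per-physical-device IMS domain. */
  static int ims_domain_alloc(struct irq_domain *domain, unsigned int virq,
                              unsigned int nr_irqs, void *arg)
  {
          struct ims_store *store;
          int ret, i;

          ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
          if (ret < 0)
                  return ret;

          for (i = 0; i < nr_irqs; i++) {
                  /*
                   * Reuse the subdevice's store or create it on demand.
                   * ims_get_or_create_store() is invented for this
                   * sketch; error unwinding is elided.
                   */
                  store = ims_get_or_create_store(domain->host_data, arg);
                  if (IS_ERR(store))
                          return PTR_ERR(store);

                  irq_domain_set_info(domain, virq + i, i, &ims_irq_chip,
                                      store, handle_edge_irq, NULL, NULL);
          }
          return 0;
  }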
>
> I think it is philosophically backwards, and it is in part what is
> motivating the pretense that this weird auxdomain and PASID stuff is
> generic.

That's a different story, and as I already explained to Dave, hacking
all this into mdev is backwards. But that does not make your idea of an
irqdomain per mdev any more reasonable.

The mdev does not do anything irq chip/domain related. It uses what the
underlying physical device provides. If you think otherwise, then please
show me how it fits the hierarchical model which I explained here:

 https://lore.kernel.org/r/87h7tcgbs2.fsf@nanos.tec.linutronix.de
 https://lore.kernel.org/r/87bljg7u4f.fsf@nanos.tec.linutronix.de
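
Roughly, the model argued for there looks like this (my sketch, not a
drawing from those mails): one IMS irqdomain per physical device sits
in the existing hierarchy, and the mdevs merely allocate interrupts
from it:

  [Vector] --- [Remap] --+-- [PCI/MSI-X]   (the physical function)
                         |
                         +-- [DEV/IMS]     (one per physical device)
                                 |
                         mdev 1 ... mdev N allocate their interrupts
                         here; they do not get domains of their own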

> The VFIO model is that the IRQ table is associated with a VM. When the
> vfio_device is created it decides how big the MSI-X table will be and
> it needs to allocate a block of interrupts to emulate it. For security
> those interrupts need to be linked in the HW to the vfio_device and
> the VM, i.e. VM A cannot trigger an interrupt that would deliver to VM
> B.

Fine.

> IDXD chose to use the PASID, but other HW might use a generic VM_ID.

So what?

> Further, IDXD chose to use a VM_ID per IMS entry, but other HW is
> likely to use a VM_ID per block of IMS entries. I.e. the HW tree starts
> at a VM object, then locates the IMS table for that object, then
> triggers the interrupt.

If you read my other reply to Dave carefully then you might have noticed
that this is crap and irrelevant, because the ID (whatever it is) is per
device and that ID has to be stored in the device. Whether the actual
irq chip/domain driver implementation uses it per associated irq or not
does not matter at all.

> If we think about the latter sort of HW I don't think the whole aux
> data and domain per pci function makes a lot of sense.

First of all, that is already debunked and will go nowhere, and second,
there is no requirement to implement this for some other incarnation of
IMS when done correctly. That whole irq_set_auxdata() stuff is not going
anywhere, simply because it's not needed at all.

All that's needed is a function to store some sort of ID per device
(mdev), and the underlying IMS driver takes care of what to do with it.

That has to happen before the interrupts are allocated, and if that info
is invalid then the allocation function can reject it.
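
A minimal sketch of that flow. None of these helpers exist; the names
are invented purely to illustrate the ordering argument:

  /* Hypothetical helpers -- not an existing kernel interface. */
  int dev_ims_set_id(struct device *dev, u32 id);   /* store PASID/VM_ID */
  int dev_ims_get_id(struct device *dev, u32 *id);  /* look it up */

  /* VFIO/VDCM side: store the ID first, then allocate. */
  static int vdcm_request_irqs(struct device *dev, u32 pasid, int nvec)
  {
          int ret = dev_ims_set_id(dev, pasid);

          if (ret)
                  return ret;
          /*
           * Allocation goes through the usual MSI machinery from here;
           * vdcm_alloc_irqs() is a placeholder for that path.
           */
          return vdcm_alloc_irqs(dev, nvec);
  }

  /* IMS driver side: validate the stored ID at allocation time. */
  static int ims_validate_id(struct device *dev)
  {
          u32 id;

          if (dev_ims_get_id(dev, &id))
                  return -ENODEV;         /* no ID stored -> reject */
          /* 'id' gets programmed into the message store later on. */
          return 0;
  }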

> You'd want a domain per VM_ID and all the IMS entries in that domain
> share the same VM_ID. In this regard the irq domain will correspond to
> the security boundary.

The real problems are:

  - Intel misled me with the requirement to set the PASID after the
    fact, which is simply wrong and which caused me to come up with
    that irq_set_auxdata() workaround.

  - Their absolute disregard for proper layering led to adding all that
    irq_set_auxdata() muck to this mdev library.

Ergo, the proper thing to do is to fix this ID storage problem (PASID,
VM_ID or whatever) at the proper place, i.e. store it in struct device
(which is associated with that mdev), and let the individual drivers
handle it as they require.
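
To make "handle it as they require" concrete, a sketch of what an
IMS_MSI_ARRAY style message writer might do with the stored ID. The
slot layout, ims_desc_to_slot() and the control bit are all invented
for this example:

  /* Sketch of an irq_write_msi_msg() style callback consuming the ID. */
  static void ims_array_write_msg(struct msi_desc *desc, struct msi_msg *msg)
  {
          /* ims_desc_to_slot() and struct ims_slot are invented here. */
          struct ims_slot __iomem *slot = ims_desc_to_slot(desc);
          u32 id;

          if (dev_ims_get_id(desc->dev, &id))   /* see the sketch above */
                  return;

          iowrite32(msg->address_lo, &slot->address_lo);
          iowrite32(msg->address_hi, &slot->address_hi);
          iowrite32(msg->data, &slot->data);
          /* The stored ID ends up in device specific auxiliary bits: */
          iowrite32(id | IMS_CTRL_ID_VALID, &slot->ctrl);
  }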

It's that simple, and this is what needs to be fixed, not some made-up
philosophical question about irqdomains per mdev. Those would be even
worse than what Intel did here.

Thanks,

        tglx


