dmaengine.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: David Woodhouse <dwmw2@infradead.org>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: "Raj\, Ashok" <ashok.raj@intel.com>, "Tian\,
	Kevin" <kevin.tian@intel.com>, "Jiang\,
	Dave" <dave.jiang@intel.com>, Bjorn Helgaas <helgaas@kernel.org>,
	"vkoul\@kernel.org" <vkoul@kernel.org>, "Dey\,
	Megha" <megha.dey@intel.com>, "maz\@kernel.org" <maz@kernel.org>,
	"bhelgaas\@google.com" <bhelgaas@google.com>,
	"alex.williamson\@redhat.com" <alex.williamson@redhat.com>, "Pan\,
	Jacob jun" <jacob.jun.pan@intel.com>, "Liu\,
	Yi L" <yi.l.liu@intel.com>, "Lu\, Baolu" <baolu.lu@intel.com>,
	"Kumar\, Sanjay K" <sanjay.k.kumar@intel.com>, "Luck\,
	Tony" <tony.luck@intel.com>,
	"jing.lin\@intel.com" <jing.lin@intel.com>,
	"kwankhede\@nvidia.com" <kwankhede@nvidia.com>,
	"eric.auger\@redhat.com" <eric.auger@redhat.com>,
	"parav\@mellanox.com" <parav@mellanox.com>,
	"rafael\@kernel.org" <rafael@kernel.org>,
	"netanelg\@mellanox.com" <netanelg@mellanox.com>,
	"shahafs\@mellanox.com" <shahafs@mellanox.com>,
	"yan.y.zhao\@linux.intel.com" <yan.y.zhao@linux.intel.com>,
	"pbonzini\@redhat.com" <pbonzini@redhat.com>, "Ortiz\,
	Samuel" <samuel.ortiz@intel.com>, "Hossain\,
	Mona" <mona.hossain@intel.com>,
	"dmaengine\@vger.kernel.org" <dmaengine@vger.kernel.org>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-pci\@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kvm\@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH v4 06/17] PCI: add SIOV and IMS capability detection
Date: Sun, 08 Nov 2020 23:47:13 +0100	[thread overview]
Message-ID: <87h7pzjwjy.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <d69953378bd1fdcdda54a2fbe285f6c0b1484e8a.camel@infradead.org>

On Sun, Nov 08 2020 at 19:36, David Woodhouse wrote:
> On Sun, 2020-11-08 at 19:47 +0100, Thomas Gleixner wrote:
>> So this needs some thought.
>
> The problem here is that Intel implemented interrupt remapping in a way
> which is anathema to structured, ordered IRQ domains.
>
> When a guest writes an MSI message (addr/data) to the MSI table of a
> PCI device which has been assigned to that guest, it *doesn't* properly
> inherit the MSI composition from a parent irqdomain which knows about
> the (host-side) IOMMU.
>
> What actually happens is the hypervisor *traps* the writes to the
> device's MSI table, and translates them *then*.

That's what I showed in the ascii art :)

> In *precisely* the fashion which we're trying to avoid for IMS.

At least for the IMS variant where the storage is not in trappable
device memory.

> Now, you can imagine a world where it wasn't like this, where
> Remappable Format MSI messages don't exist, and where we let guests
> write native MSI message to the device without trapping — and where the
> IOMMU then sees the incoming interrupt and has to map the APIC ID to a
> *virtual* CPU for that guest, based on the PCI source-id of the
> device.

That would be not convoluted enough and make too much sense.

> In that world, IMS would work naturally. But that isn't how Intel
> designed interrupt remapping. They *designed* to have to trap and
> translate as the message is written to the device.
>
> So it does look like we're going to need a hypercall interface to
> compose an MSI message on behalf of the guest, for IMS to use. In fact
> PCI devices assigned to a guest could use that too, and then we'd only
> need to trap-and-remap any attempt to write a Compatibility Format MSI
> to the device's MSI table, while letting Remappable Format messages get
> written directly.

Yes, if we have the HCALL domain then the message composed by the
hypervisor is valid for everything not only IMS. That's why I left out
any specifics on the Busdomain side. It does not matter which kind of
bus that is. The only mechanics which is provided by the busdomain is
to store the precomposed message and eventually provide mask/unmask at
that level.

> We'd also need a way for an OS running on bare metal to *know* that
> it's on bare metal and can just compose MSI messages for itself. Since
> we do expect bare metal to have an IOMMU, perhaps that is just a
> feature flag on the IOMMU?

There are still CPUs w/o IOMMU out there and new ones are shipped.

So you would basically mandate that IMS with memory storage can only
work on bare metal when the CPU has an IOMMU.

Jason said in [1]: "For x86 I think we could accept linking this to
IOMMU, if really necessary."

OTOH, what's the chance that a guest runs on something which

  1) Does not have X86_FEATURE_HYPERVISOR set in cpuid 1/EDX

and

  2) Cannot be identified as Xen domain

and

  3) Does not have a DMI vendor entry which identifies the
     virtualization solution (we don't use that today, but
     adding that table is trivial enough)

and

  4) Has such an IMS device passed through?

Possible, yes. Likely, no. Do we care?

> That or Intel needs to fix the IOMMU to do proper virtualisation and
> actually translate "Compatibility Format" MSIs for a guest too.

Is that going to happen before I retire?

Thanks,

        tglx

[1] https://lore.kernel.org/r/20200822005125.GB1152540@nvidia.com

  reply	other threads:[~2020-11-08 22:47 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-30 18:50 [PATCH v4 00/17] Add VFIO mediated device support and DEV-MSI support for the idxd driver Dave Jiang
2020-10-30 18:50 ` [PATCH v4 01/17] irqchip: Add IMS (Interrupt Message Store) driver Dave Jiang
2020-10-30 22:01   ` Thomas Gleixner
2020-10-30 18:51 ` [PATCH v4 02/17] iommu/vt-d: Add DEV-MSI support Dave Jiang
2020-10-30 20:31   ` Thomas Gleixner
2020-10-30 20:52     ` Dave Jiang
2020-10-30 18:51 ` [PATCH v4 03/17] dmaengine: idxd: add theory of operation documentation for idxd mdev Dave Jiang
2020-10-30 18:51 ` [PATCH v4 04/17] dmaengine: idxd: add support for readonly config devices Dave Jiang
2020-10-30 18:51 ` [PATCH v4 05/17] dmaengine: idxd: add interrupt handle request support Dave Jiang
2020-10-30 18:51 ` [PATCH v4 06/17] PCI: add SIOV and IMS capability detection Dave Jiang
2020-10-30 19:51   ` Bjorn Helgaas
2020-10-30 21:20     ` Dave Jiang
2020-10-30 21:50       ` Bjorn Helgaas
2020-10-30 22:45       ` Jason Gunthorpe
2020-10-30 22:49         ` Dave Jiang
2020-11-02 13:21           ` Jason Gunthorpe
2020-11-03  2:49             ` Tian, Kevin
2020-11-03 12:43               ` Jason Gunthorpe
2020-11-04  3:41                 ` Tian, Kevin
2020-11-04 12:40                   ` Jason Gunthorpe
2020-11-04 13:34                     ` Tian, Kevin
2020-11-04 13:54                       ` Jason Gunthorpe
2020-11-06  9:48                         ` Tian, Kevin
2020-11-06 13:14                           ` Jason Gunthorpe
2020-11-06 16:48                             ` Raj, Ashok
2020-11-06 17:51                               ` Jason Gunthorpe
2020-11-06 23:47                                 ` Dan Williams
2020-11-07  0:12                                   ` Jason Gunthorpe
2020-11-07  1:42                                     ` Dan Williams
2020-11-08 18:11                                     ` Raj, Ashok
2020-11-08 18:34                                       ` David Woodhouse
2020-11-08 23:25                                         ` Raj, Ashok
2020-11-10 14:19                                           ` Raj, Ashok
2020-11-10 14:41                                             ` David Woodhouse
2020-11-08 23:41                                       ` Jason Gunthorpe
2020-11-09  0:05                                         ` Raj, Ashok
2020-11-08 18:47                                     ` Thomas Gleixner
2020-11-08 19:36                                       ` David Woodhouse
2020-11-08 22:47                                         ` Thomas Gleixner [this message]
2020-11-08 23:29                                           ` Jason Gunthorpe
2020-11-11 15:41                                         ` Christoph Hellwig
2020-11-11 16:09                                           ` Raj, Ashok
2020-11-11 22:27                                             ` Thomas Gleixner
2020-11-11 23:03                                               ` Raj, Ashok
2020-11-12  1:13                                                 ` Thomas Gleixner
2020-11-12 13:10                                                 ` Jason Gunthorpe
2020-11-08 23:23                                       ` Jason Gunthorpe
2020-11-08 23:36                                         ` Raj, Ashok
2020-11-09  7:37                                         ` Tian, Kevin
2020-11-09 16:46                                           ` Jason Gunthorpe
2020-11-08 23:58                                       ` Raj, Ashok
2020-11-09  7:59                                         ` Tian, Kevin
2020-11-09 11:21                                         ` Thomas Gleixner
2020-11-09 17:30                                           ` Jason Gunthorpe
2020-11-09 22:40                                             ` Raj, Ashok
2020-11-09 22:42                                             ` Thomas Gleixner
2020-11-10  5:14                                               ` Raj, Ashok
2020-11-10 10:27                                                 ` Thomas Gleixner
2020-11-10 14:13                                                   ` Raj, Ashok
2020-11-10 14:23                                                     ` Jason Gunthorpe
2020-11-11  2:17                                                       ` Tian, Kevin
2020-11-12 13:46                                                         ` Jason Gunthorpe
2020-11-11  7:14                                                     ` Tian, Kevin
2020-11-12 19:32                                                       ` Konrad Rzeszutek Wilk
2020-11-12 22:42                                                         ` Thomas Gleixner
2020-11-13  2:42                                                           ` Tian, Kevin
2020-11-13 12:57                                                             ` Jason Gunthorpe
2020-11-13 13:32                                                             ` Thomas Gleixner
2020-11-13 16:12                                                               ` Luck, Tony
2020-11-13 17:38                                                                 ` Raj, Ashok
2020-11-14 10:34                                                           ` Christoph Hellwig
2020-11-14 21:18                                                             ` Raj, Ashok
2020-11-15 11:26                                                               ` Thomas Gleixner
2020-11-15 19:31                                                                 ` Raj, Ashok
2020-11-15 22:11                                                                   ` Thomas Gleixner
2020-11-16  0:22                                                                     ` Raj, Ashok
2020-11-16  7:31                                                                       ` Tian, Kevin
2020-11-16 15:46                                                                         ` Jason Gunthorpe
2020-11-16 17:56                                                                           ` Thomas Gleixner
2020-11-16 18:02                                                                             ` Jason Gunthorpe
2020-11-16 20:37                                                                               ` Thomas Gleixner
2020-11-16 23:51                                                                               ` Tian, Kevin
2020-11-17  9:21                                                                                 ` Thomas Gleixner
2020-11-16  8:25                                                               ` Christoph Hellwig
2020-11-10 14:19                                                 ` Jason Gunthorpe
2020-11-11  2:35                                                   ` Tian, Kevin
2020-11-08 21:18                             ` Thomas Gleixner
2020-11-08 22:09                               ` David Woodhouse
2020-11-08 22:52                                 ` Thomas Gleixner
2020-11-07  0:32                           ` Thomas Gleixner
2020-11-09  5:25                             ` Tian, Kevin
2020-10-30 18:51 ` [PATCH v4 07/17] dmaengine: idxd: add IMS support in base driver Dave Jiang
2020-10-30 18:51 ` [PATCH v4 08/17] dmaengine: idxd: add device support functions in prep for mdev Dave Jiang
2020-10-30 18:51 ` [PATCH v4 09/17] dmaengine: idxd: add basic mdev registration and helper functions Dave Jiang
2020-10-30 18:51 ` [PATCH v4 10/17] dmaengine: idxd: add emulation rw routines Dave Jiang
2020-10-30 18:52 ` [PATCH v4 11/17] dmaengine: idxd: prep for virtual device commands Dave Jiang
2020-10-30 18:52 ` [PATCH v4 12/17] dmaengine: idxd: virtual device commands emulation Dave Jiang
2020-10-30 18:52 ` [PATCH v4 13/17] dmaengine: idxd: ims setup for the vdcm Dave Jiang
2020-10-30 21:26   ` Thomas Gleixner
2020-10-30 18:52 ` [PATCH v4 14/17] dmaengine: idxd: add mdev type as a new wq type Dave Jiang
2020-10-30 18:52 ` [PATCH v4 15/17] dmaengine: idxd: add dedicated wq mdev type Dave Jiang
2020-10-30 18:52 ` [PATCH v4 16/17] dmaengine: idxd: add new wq state for mdev Dave Jiang
2020-10-30 18:52 ` [PATCH v4 17/17] dmaengine: idxd: add error notification from host driver to mediated device Dave Jiang
2020-10-30 18:58 ` [PATCH v4 00/17] Add VFIO mediated device support and DEV-MSI support for the idxd driver Jason Gunthorpe
2020-10-30 19:13   ` Dave Jiang
2020-10-30 19:17     ` Jason Gunthorpe
2020-10-30 19:23       ` Raj, Ashok
2020-10-30 19:30         ` Jason Gunthorpe
2020-10-30 20:43           ` Raj, Ashok
2020-10-30 22:54             ` Jason Gunthorpe
2020-10-31  2:50             ` Thomas Gleixner
2020-10-31 23:53               ` Raj, Ashok
2020-11-02 13:20                 ` Jason Gunthorpe
2020-11-02 16:20                   ` Raj, Ashok
2020-11-02 17:19                     ` Jason Gunthorpe
2020-11-02 18:18                       ` Dave Jiang
2020-11-02 18:26                         ` Jason Gunthorpe
2020-11-02 18:38                           ` Dan Williams
2020-11-02 18:51                             ` Jason Gunthorpe
2020-11-02 19:26                               ` Dan Williams
2020-10-30 20:48 ` Thomas Gleixner
2020-10-30 20:59   ` Dave Jiang
2020-10-30 22:10     ` Thomas Gleixner
     [not found] <draft-875z6ekcj5.fsf@nanos.tec.linutronix.de>
2020-11-09 14:08 ` [PATCH v4 06/17] PCI: add SIOV and IMS capability detection Thomas Gleixner
2020-11-09 18:10   ` Raj, Ashok

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7pzjwjy.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=dwmw2@infradead.org \
    --cc=eric.auger@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=jgg@nvidia.com \
    --cc=jing.lin@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=megha.dey@intel.com \
    --cc=mona.hossain@intel.com \
    --cc=netanelg@mellanox.com \
    --cc=parav@mellanox.com \
    --cc=pbonzini@redhat.com \
    --cc=rafael@kernel.org \
    --cc=samuel.ortiz@intel.com \
    --cc=sanjay.k.kumar@intel.com \
    --cc=shahafs@mellanox.com \
    --cc=tony.luck@intel.com \
    --cc=vkoul@kernel.org \
    --cc=yan.y.zhao@linux.intel.com \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).