linux-kernel.vger.kernel.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alex Williamson <alex.williamson@redhat.com>,
	Robin Murphy <robin.murphy@arm.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	linuxppc-dev@lists.ozlabs.org,
	David Gibson <david@gibson.dropbear.id.au>,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org,
	Yongji Xie <elohimes@gmail.com>,
	Eric Auger <eric.auger@redhat.com>,
	Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>,
	Jike Song <jike.song@intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Joerg Roedel <joro@8bytes.org>,
	Arvind Yadav <arvind.yadav.cs@gmail.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
	Neo Jia <cjia@nvidia.com>, Paul Mackerras <paulus@samba.org>,
	Vlad Tsyrklevich <vlad@tsyrklevich.net>,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table
Date: Wed, 16 Aug 2017 10:35:49 +1000
Message-ID: <1502843749.4493.67.camel@kernel.crashing.org>
In-Reply-To: <20170815103717.3b64e10c@w520.home>

On Tue, 2017-08-15 at 10:37 -0600, Alex Williamson wrote:
> Of course I don't think either of those are worth imposing a
> performance penalty where we don't otherwise need one.  However, if we
> look at a VM scenario where the guest is following the PCI standard for
> programming MSI-X interrupts (ie. not POWER), we need some mechanism to
> intercept those MMIO writes to the vector table and configure the host
> interrupt domain of the device rather than allowing the guest direct
> access.  This is simply part of virtualizing the device to the guest.
> So even if the kernel allows mmap'ing the vector table, the hypervisor
> needs to trap it, so the mmap isn't required or used anyway.  It's only
> when you define a non-PCI standard for your guest to program
> interrupts, as POWER has done, and can therefore trust that the
> hypervisor does not need to trap on the vector table that having that
> mmap'able vector table becomes fully useful.  AIUI, ARM supports 64k
> pages too... does ARM have any strategy that would actually make it
> possible to make use of an mmap covering the vector table?  Thanks,

WTF ???? Alex, can you stop once and for all with all that "POWER is
not standard" bullshit please ? It's completely wrong.

This has nothing to do with the PCIe standard !

The PCIe standard says strictly *nothing* whatsoever about how an OS
obtains the magic address/values to put in the device and how the PCIe
host bridge may do appropriate filtering.

There is nothing on POWER that prevents the guest from writing the MSI-
X address/data by hand. The problem isn't who writes the values or even
how. The problem breaks down into these two things that are NOT covered
by any aspect of the PCIe standard:

  1- The OS needs to obtain address/data values for an MSI that will
"work" for the device.

  2- The HW+HV needs to prevent collateral damage caused by a device
issuing stores to an incorrect address or with incorrect data. Now *this*
is necessary for *ANY* kind of DMA whether it's an MSI or something
else anyway.

Now, the filtering done by qemu is NOT a reasonable way to handle 2)
and whatever excuse about "making it harder" doesn't fly a meter when
it comes to security. Making it "harder to break accidentally" I also
don't buy: people don't just randomly put things in their MSI-X tables
"accidentally", that stuff works or doesn't.

That leaves us with 1). Now this is purely a platform-specific matter,
not a spec matter. Once the HW has a way to enforce that you can only
generate "allowed" MSIs, it becomes a matter of having some FW mechanism
that can be used to inform the OS what address/values to use for a
given interrupt.

This is provided on POWER by a combination of device-tree and RTAS. It
could be that x86/ARM64 don't provide good enough mechanisms via ACPI
but this is in no way a problem of standard compliance, just inferior
firmware interfaces.

So again, for the 234789246th time in years, can we get that 1-bit-of-
information sorted one way or another so we can fix our massive
performance issue instead of adding yet another dozen layers of paint
on that shed ?

Ben.
