All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Auger <eric.auger@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: bpf@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
	David Woodhouse <dwmw2@infradead.org>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	Kevin Tian <kevin.tian@intel.com>,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	llvm@lists.linux.dev, Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Tom Rix <trix@redhat.com>, Will Deacon <will@kernel.org>,
	Anthony Krowiak <akrowiak@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Eric Farman <farman@linux.ibm.com>,
	Jason Wang <jasowang@redhat.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Jason Herne <jjherne@linux.ibm.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	kvm@vger.kernel.org, Lixiao Yang <lixiao.yang@intel.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	Yi Liu <yi.l.liu@intel.com>, Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH v5 13/19] iommufd: Add kAPI toward external drivers for physical devices
Date: Mon, 28 Nov 2022 11:55:41 +0100	[thread overview]
Message-ID: <94e6034a-c4c1-be0a-ea8c-f5934dbadd4c@redhat.com> (raw)
In-Reply-To: <Y4P9VzpCv/DyHeaD@nvidia.com>

Hi Jason,

On 11/28/22 01:14, Jason Gunthorpe wrote:
> On Sun, Nov 27, 2022 at 10:13:55PM +0100, Eric Auger wrote:
>>> +static int iommufd_device_setup_msi(struct iommufd_device *idev,
>>> +				    struct iommufd_hw_pagetable *hwpt,
>>> +				    phys_addr_t sw_msi_start,
>>> +				    unsigned int flags)
>>> +{
>>> +	int rc;
>>> +
>>> +	/*
>>> +	 * IOMMU_CAP_INTR_REMAP means that the platform is isolating MSI, and it
>> rather means that the *IOMMU* implements IRQ remapping.
> Not really. The name is a mess, but as it is implemented, it means the
> platform is implementing MSI security. How exactly that is done is not
> really defined, and it doesn't really belong as an iommu property.
> However the security is being created is done in a way that is
> transparent to the iommu_domain user.
Some 'ARM platforms' implement what you call MSI security but they do
not advertise IOMMU_CAP_INTR_REMAP

Besides refering to include/linux/iommu.h:
IOMMU_CAP_INTR_REMAP,           /* IOMMU supports interrupt isolation */

>
> MSI security means that the originating device (eg the RID for PCI) is
> validated when the MSI is processed. RIDs can only use MSIs they are
> authorized to use.
agreed this is what I described below.
>
> It doesn't matter how it is done, if it remapping HW, fancy
> iommu_domain tricks, or built into the MSI controller. Set this flag
> if the platform is secure and doesn't need the code triggered by
> irq_domain_check_msi_remap().
this is not what is implemented as of now. If the IOMMU does support
interrupt isolation, it advertises IOMMU_CAP_INTR_REMAP. On ARM this
feature is implemented by the ITS MSI controller instead and the only
way to retrieve the info whether the device MSIs are directed to that
kind of MSI controller is to use irq_domain_check_msi_remap().
>
>> irq_domain_check_msi_remap() instead means the MSI controller
>> implements that functionality (a given device id is able to trigger
> Not quite, it means that MSI isolation is available, however it is not
> transparent and the iommu_domain user must do the little dance that
> follows.
No I do not agree on that point. The 'little dance' is needed because
the SMMU does not bypass MSI writes as done on Intel. And someone must
take care of the MSI transaction mapping. This is the role of the MSI
cookie stuff. To me this is independent on the above discussion whether
MSI isolation is implemented.
>
> It does not mean the MSI controller implements the security
> functionality because it is not set on x86.
>
> Logically on x86 we should consider the remapping logic as part of the
> MSI controller even if x86 makes them separated for historical
> reasons.
>
>> MSI #n and this #n gets translated into actual MSI #m) So what you
>> want is to prevent an assigned device from being able to DMA into an
>> MSI doorbell that is not protected by either the IOMMU or the MSI
>> controller. If this happens this means the DMA can generate any kind
>> of MSI traffic that can jeopardize the host or other VMs
> I don't know of any real systems that work like this. ARM standard GIC
> uses a shared ITS page, the IOMMU does nothing to provide MSI
> security. MSI security is built into the GIC because it validates the
> device ID as part of processing the MSI. The IOMMU is only involved
> to grant access to the shared ITS page.

?? Yeah that's what I said. The SMMU does nothing about MSI security.
The ITS does.
>
> Intel is similar, it provides security through the MSI controller's
> remapping logic, the only difference is that on Intel the MSI window
> is always present in every iommu_domain (becuase it occures before the
> IOMMU), and in ARM you have to do the little dance.
On Intel the MSI window [FEE0_0000h - FEF0_000h] is bypassed by the
IOMMU. On ARM MSI transactions are translated except in case of HW MSI
RESV regions (I think).
>
> Even if the iommu is to be involved it is all hidden from this layer.
>
>> and afterwards resv_msi is checked to see if we need to create the
>> so-called msi cookie.  This msi cookie tells the MSI writes are
>> translated by the IOMMU and somebody must create a mapping for those
>> transactions.
> The cookie is always necessary if we are going to use the
> non-transparent mode. That is what makes it the non transparent
> mode. We have to supply the reserved IOVA range from one part of the
> iommu driver to another part.
>
>>> +	 * creates the MSI window by default in the iommu domain. Nothing
>>> +	 * further to do.
>>> +	 */
>>> +	if (device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP))
>>> +		return 0;
>>> +
>>> +	/*
>>> +	 * On ARM systems that set the global IRQ_DOMAIN_FLAG_MSI_REMAP every
>>> +	 * allocated iommu_domain will block interrupts by default and this
>> It sounds there is a confusion between IRQ remapping and the fact MSI
>> writes are not bypassed by the IOMMU.
> I don't think I'm confused :)
As soon as there is an SW MSI_RESV region and only in that case you need
to put in place the msi cookie (because it indicates the IOMMU
translates MSI transactions).

The fact the platform provides MSI security (through IOMMU or MSI
controller) looks orthogonal to me.
>
>>> +static int iommufd_device_do_attach(struct iommufd_device *idev,
>>> +				    struct iommufd_hw_pagetable *hwpt,
>>> +				    unsigned int flags)
>>> +{
>>> +	phys_addr_t sw_msi_start = 0;
>>> +	int rc;
>>> +
>>> +	mutex_lock(&hwpt->devices_lock);
>>> +
>>> +	/*
>>> +	 * Try to upgrade the domain we have, it is an iommu driver bug to
>>> +	 * report IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail
>>> +	 * enforce_cache_coherency when there are no devices attached to the
>>> +	 * domain.
>>> +	 */
>>> +	if (idev->enforce_cache_coherency && !hwpt->enforce_cache_coherency) {
>>> +		if (hwpt->domain->ops->enforce_cache_coherency)
>>> +			hwpt->enforce_cache_coherency =
>>> +				hwpt->domain->ops->enforce_cache_coherency(
>>> +					hwpt->domain);
>>> +		if (!hwpt->enforce_cache_coherency) {
>>> +			WARN_ON(list_empty(&hwpt->devices));
>>> +			rc = -EINVAL;
>>> +			goto out_unlock;
>>> +		}
>>> +	}
>>> +
>>> +	rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, idev->dev,
>>> +						   idev->group, &sw_msi_start);
>>> +	if (rc)
>>> +		goto out_unlock;
>>> +
>> so in the case of any IOMMU_RESV_MSI, iommufd_device_setup_msi() will be
>> called with *sw_msi_start = 0 which will return -EPERM?
>> I don't think this is what we want. In that case I think we want the
>> RESV_MSI region to be taken into account as a RESV region but we don't
>> need the MSI cookie.
> This was sort of sloppy copied from VFIO - we should just delete
> it. The is no driver that sets both, and once the platform asserts
> irq_domain_check_msi_remap() it is going down the non-transparent path
> anyhow and must set a cookie to work. [again the names doesn't make
> any sense for the functionality]
>
> Failing with EPERM is probably not so bad since the platform is using
> an invalid configuration. I'm kind of inclined to leave this for right
I don't understand why it is invalid? HW MSI RESV region is a valid
config and not sure you tested with that kind of setup, did you?
> now since it has all be tested and I don't want to make a
> regression. But I can try to send a patch to tidy it a bit more, maybe
> add an appropriate WARN_ON.
>
> The whole thing is actually backwards. The IOMMU_CAP_INTR_REMAP should
> have been some global irq_has_secure_msi() and everything with
> interrupt remapping, and ARM should set it.
You are revisiting this IOMMU_CAP_INTR_REMAP definition ... this is not
what is documented in the header file.

>
> Then the domain should have had a domain cap
> IOMUU_CAP_REQUIRE_MSI_WINDOW_SETUP to do the little dance ARM drivers
> need.
>
> And even better would have been to not have the little dance in the
> first place, as we don't really need the iommu_domain user to convey
> the fixed MSI window from one part of the iommu driver to another.
>
> We may try to fix this up when we get to doing nesting on ARM, or
> maybe just leave the confusing sort of working thing as-is. I don't
> know.
>
> Jason
>
Eric


  reply	other threads:[~2022-11-28 10:55 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-16 21:00 [PATCH v5 00/19] IOMMUFD Generic interface Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 01/19] iommu: Add IOMMU_CAP_ENFORCE_CACHE_COHERENCY Jason Gunthorpe
2022-11-23  8:30   ` Yi Liu
2022-11-23 16:56     ` Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 02/19] iommu: Add device-centric DMA ownership interfaces Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 03/19] interval-tree: Add a utility to iterate over spans in an interval tree Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 04/19] scripts/kernel-doc: support EXPORT_SYMBOL_NS_GPL() with -export Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 05/19] iommufd: Document overview of iommufd Jason Gunthorpe
2022-11-18  9:06   ` Eric Auger
2022-11-30 15:06   ` Binbin Wu
2022-12-01  0:08     ` Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 06/19] iommufd: File descriptor, context, kconfig and makefiles Jason Gunthorpe
2022-11-18 16:27   ` Eric Auger
2022-11-18 20:23     ` Jason Gunthorpe
2022-11-25  8:43       ` Eric Auger
2022-11-16 21:00 ` [PATCH v5 07/19] kernel/user: Allow user::locked_vm to be usable for iommufd Jason Gunthorpe
2022-11-18  9:08   ` Eric Auger
2022-11-18  9:09   ` Eric Auger
2022-11-18 16:28   ` Eric Auger
2022-11-18 20:25     ` Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 08/19] iommufd: PFN handling for iopt_pages Jason Gunthorpe
2022-11-18  2:24   ` Tian, Kevin
2022-11-18  2:27     ` Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 09/19] iommufd: Algorithms for PFN storage Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 10/19] iommufd: Data structure to provide IOVA to PFN mapping Jason Gunthorpe
2022-11-18  2:55   ` Tian, Kevin
2022-11-16 21:00 ` [PATCH v5 11/19] iommufd: IOCTLs for the io_pagetable Jason Gunthorpe
2022-11-27 17:49   ` Eric Auger
2022-11-28  9:05     ` Tian, Kevin
2022-11-28 18:11       ` Jason Gunthorpe
2022-11-28 18:27     ` Jason Gunthorpe
2022-11-28 20:09       ` Eric Auger
2022-11-16 21:00 ` [PATCH v5 12/19] iommufd: Add a HW pagetable object Jason Gunthorpe
2022-11-27 15:12   ` Eric Auger
2022-11-16 21:00 ` [PATCH v5 13/19] iommufd: Add kAPI toward external drivers for physical devices Jason Gunthorpe
2022-11-27 21:13   ` Eric Auger
2022-11-28  0:14     ` Jason Gunthorpe
2022-11-28 10:55       ` Eric Auger [this message]
2022-11-28 13:20         ` Jason Gunthorpe
2022-11-28 14:17           ` Eric Auger
2022-11-29  1:09             ` Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 14/19] iommufd: Add kAPI toward external drivers for kernel access Jason Gunthorpe
2022-11-28 15:48   ` Eric Auger
2022-11-28 18:56     ` Jason Gunthorpe
2022-12-06 20:40       ` Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 15/19] iommufd: vfio container FD ioctl compatibility Jason Gunthorpe
2022-11-18  2:58   ` Tian, Kevin
2022-11-18 15:22     ` Jason Gunthorpe
2022-11-23  1:33       ` Tian, Kevin
2022-11-23  4:31         ` Jason Wang
2022-11-23 13:03         ` Jason Gunthorpe
2022-11-24  5:23           ` Tian, Kevin
2022-11-28 17:53   ` Eric Auger
2022-11-28 19:37     ` Jason Gunthorpe
2022-11-28 20:54       ` Eric Auger
2022-11-16 21:00 ` [PATCH v5 16/19] iommufd: Add kernel support for testing iommufd Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 17/19] iommufd: Add some fault injection points Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 18/19] iommufd: Add additional invariant assertions Jason Gunthorpe
2022-11-16 21:00 ` [PATCH v5 19/19] iommufd: Add a selftest Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=94e6034a-c4c1-be0a-ea8c-f5934dbadd4c@redhat.com \
    --to=eric.auger@redhat.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=bagasdotme@gmail.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=chaitanyak@nvidia.com \
    --cc=cohuck@redhat.com \
    --cc=corbet@lwn.net \
    --cc=daniel.m.jordan@oracle.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=dwmw2@infradead.org \
    --cc=farman@linux.ibm.com \
    --cc=iommu@lists.linux.dev \
    --cc=jasowang@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@nvidia.com \
    --cc=jjherne@linux.ibm.com \
    --cc=joao.m.martins@oracle.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lixiao.yang@intel.com \
    --cc=llvm@lists.linux.dev \
    --cc=mjrosato@linux.ibm.com \
    --cc=mst@redhat.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=nicolinc@nvidia.com \
    --cc=ojeda@kernel.org \
    --cc=pasic@linux.ibm.com \
    --cc=robin.murphy@arm.com \
    --cc=schnelle@linux.ibm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=shuah@kernel.org \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=trix@redhat.com \
    --cc=will@kernel.org \
    --cc=yi.l.liu@intel.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.