iommu.lists.linux-foundation.org archive mirror
From: Robin Murphy <robin.murphy@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Nicolin Chen <nicolinc@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>
Subject: Re: RMRR device on non-Intel platform
Date: Wed, 26 Apr 2023 14:53:53 +0100	[thread overview]
Message-ID: <5ff0d72b-a7b8-c8a9-60e5-396e7a1ef363@arm.com> (raw)
In-Reply-To: <ZEkRnIPjeLNxbkj8@nvidia.com>

On 2023-04-26 12:57, Jason Gunthorpe wrote:
> On Fri, Apr 21, 2023 at 02:58:01PM -0300, Jason Gunthorpe wrote:
> 
>>> which for practical purposes in this context means an ITS.
>>
>> I haven't delved into it in super detail, but.. my impression was..
>>
>> The ITS page only becomes relevant to the IOMMU layer if the actual
>> IRQ driver calls iommu_dma_prepare_msi()
> 
> Nicolin and I sat down and traced this through; this explanation is
> almost right...
> 
> irq-gic-v4.c is a sub-module of irq-gic-v3-its.c, so it does end up
> calling iommu_dma_prepare_msi(), however..

Ignore GICv4; that basically only makes a difference to what happens 
after the CPU receives an interrupt.
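
For anyone following along, the path in question looks roughly like this
(paraphrased from memory of irq-gic-v3-its.c, so field names are
approximate and everything unrelated is trimmed out):

/* Allocation time: ask the IOMMU layer to map the doorbell for this device */
static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
                                unsigned int nr_irqs, void *args)
{
        msi_alloc_info_t *info = args;
        struct its_device *its_dev = info->scratchpad[0].ptr;
        struct its_node *its = its_dev->its;

        return iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
}

/* Compose time: start from the physical GITS_TRANSLATER address... */
static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
{
        struct its_device *its_dev = irq_data_get_irq_chip_data(d);
        u64 addr = its_dev->its->get_msi_base(its_dev);

        msg->address_lo = lower_32_bits(addr);
        msg->address_hi = upper_32_bits(addr);
        msg->data       = its_get_event_id(d);

        /* ...then let the MSI cookie swap in the IOVA it actually mapped */
        iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
}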

> qemu will set up the ACPI so that the VM thinks the ITS page is at
> 0x08080000. I think it maps some dummy CPU memory to this address.
> 
> iommufd will map the real ITS page at MSI_IOVA_BASE = 0x8000000 (!!),
> and only into the IOMMU.
> 
> qemu will set up some RMRR-like thing to make 0x8000000 1:1 in the
> VM's IOMMU.
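
For reference, that 0x8000000 isn't pulled out of thin air: the Arm SMMU
drivers advertise a software-managed MSI region there, which is what
VFIO/iommufd key off (roughly as in arm-smmu-v3.c; the legacy arm-smmu
driver uses the same constants):

#define MSI_IOVA_BASE                   0x8000000
#define MSI_IOVA_LENGTH                 0x100000

static void arm_smmu_get_resv_regions(struct device *dev,
                                      struct list_head *head)
{
        struct iommu_resv_region *region;
        int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;

        /* The window VFIO/iommufd put their MSI doorbell mappings into */
        region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
                                         prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
        if (!region)
                return;

        list_add_tail(&region->list, head);

        iommu_dma_get_resv_regions(dev, head);
}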
> 
> When the DMA API is used, iommu_dma_prepare_msi() is called, which
> will select an MSI page address that avoids the reserved region, so
> it is some random value != 0x8000000, and maps the dummy CPU page to
> it. The VM will then do an MSI-X programming cycle with the S1 IOVA
> of the CPU page and the data. qemu traps this and throws away the
> address from the VM. The kernel sets up the interrupt and assumes
> 0x8000000 is the right IOVA.
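
Concretely, that "random value != 0x8000000" is whatever IOVA the cookie
mapped the doorbell at, substituted into the msi_msg just before the
guest driver writes it into the MSI-X table (roughly
iommu_dma_compose_msi_msg() from dma-iommu.c, quoted from memory):

void iommu_dma_compose_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
{
        struct device *dev = msi_desc_to_dev(desc);
        const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
        const struct iommu_dma_msi_page *msi_page =
                msi_desc_get_iommu_cookie(desc);

        if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
                return;

        /* Replace the physical doorbell address with the cookie's IOVA */
        msg->address_hi = upper_32_bits(msi_page->iova);
        msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
        msg->address_lo += lower_32_bits(msi_page->iova);
}

That rewritten address is exactly the one QEMU then traps and discards.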
> 
> When VFIO is used, iommufd in the VM will force the MSI window to
> 0x8000000, and instead of there being a 1:1 mapping we map the dummy
> CPU page, so everything is broken. Adding the reserved-region check
> is an improvement.
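
For comparison, the legacy type1 code's version of that decision, and
the cookie it then installs, look roughly like this (trimmed from
vfio_iommu_type1.c); presumably the iommufd check would be the
equivalent logic applied to the reserved regions the guest actually
sees:

/* Pick the SW_MSI window, unless the platform has a real (HW) MSI region */
static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
                                  phys_addr_t *base)
{
        struct iommu_resv_region *region;
        bool ret = false;

        list_for_each_entry(region, group_resv_regions, list) {
                if (region->type == IOMMU_RESV_MSI)
                        return false;

                if (region->type == IOMMU_RESV_SW_MSI) {
                        *base = region->start;
                        ret = true;
                }
        }
        return ret;
}

        /* ...later, on the attach path, doorbell mappings go inside it */
        if (resv_msi) {
                ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
                if (ret && ret != -ENODEV)
                        goto out_detach;
        }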
> 
> The only way to properly fix this is to have qemu stop throwing away
> the address during the MSI-X programming; that address needs to be
> programmed into the device instead.
> 
> I have no idea how best to get there with the ARM GIC setup.. It feels
> really hard.

Give QEMU a way to tell IOMMUFD to associate that 0x08080000 address 
with a given device as an MSI target. IOMMUFD then ensures that the S2 
mapping exists from that IPA to the device's real ITS (I vaguely 
remember Eric had a patch to pre-populate an MSI cookie with specific 
pages, which may have been heading along those lines). In the worst case 
this might mean having to subdivide the per-SMMU copies of the S2 domain 
into per-ITS copies as well, so we'd probably want to detect and compare 
devices' ITS parents up-front.
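
To make the first half concrete, I'm imagining something shaped like
this on the uAPI side (purely a hypothetical sketch; none of these names
exist today):

/*
 * Hypothetical, for illustration only: tell IOMMUFD where the guest's
 * ITS doorbell lives for a given device.
 */
struct iommu_dev_set_msi_doorbell {
        __u32 size;
        __u32 dev_id;           /* iommufd device handle */
        __u32 flags;
        __u32 __reserved;
        __aligned_u64 gpa;      /* e.g. 0x08080000 from the guest's ACPI */
};

/*
 * On receipt, IOMMUFD would look up the physical ITS that dev_id's MSIs
 * actually target and install the S2 mapping gpa -> real GITS_TRANSLATER
 * page, splitting the S2 copies per ITS if devices disagree.
 */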

QEMU will presumably also need a way to pass the VA down to IOMMUFD when 
it sees the guest programming the MSI (possibly it could pass the IPA at 
the same time so we don't need a distinct step to set up S2 beforehand?) 
- once the underlying physical MSI configuration comes back from the PCI 
layer, that VA just needs to be dropped in to replace the original 
msi_msg address.
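
On the kernel side that would boil down to something like this once the
physical MSI configuration comes back (hypothetical sketch; the field
carrying the userspace-supplied IOVA is invented for illustration):

static void msi_msg_set_guest_iova(struct msi_desc *desc, struct msi_msg *msg)
{
        /* Invented field: the S1 IOVA QEMU captured from the trapped MSI-X write */
        u64 iova = desc->guest_msi_iova;

        msg->address_hi = upper_32_bits(iova);
        msg->address_lo = lower_32_bits(iova);
        /* msg->data stays whatever the irqchip composed */
}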

TBH at that point it may be easier to just not have a cookie in the S2 
domain at all when nesting is enabled, and just let IOMMUFD make the ITS 
mappings directly for itself.
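
i.e. rather than iommu_get_msi_cookie() on the S2, just something along
these lines per doorbell (sketch only; the names and the exact
iommu_map() signature of the day are illustrative):

static int iommufd_map_vits_doorbell(struct iommu_domain *s2_domain,
                                     unsigned long doorbell_gpa,
                                     phys_addr_t its_phys_base)
{
        /* Map the physical GITS_TRANSLATER page at the GPA the VMM nominated */
        return iommu_map(s2_domain, doorbell_gpa,
                         its_phys_base + GITS_TRANSLATER, PAGE_SIZE,
                         IOMMU_WRITE | IOMMU_MMIO, GFP_KERNEL);
}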

Thanks,
Robin.


Thread overview: 28+ messages
2023-04-20  6:52 RMRR device on non-Intel platform Tian, Kevin
2023-04-20 14:15 ` Alex Williamson
2023-04-20 14:19   ` Robin Murphy
2023-04-20 14:49     ` Alex Williamson
2023-04-20 16:55       ` Robin Murphy
2023-04-20 21:49         ` Alex Williamson
2023-04-21  4:10           ` Tian, Kevin
2023-04-21 11:33             ` Jason Gunthorpe
2023-04-21 11:34             ` Robin Murphy
2023-04-23  8:23               ` Tian, Kevin
2023-04-21 12:04           ` Jason Gunthorpe
2023-04-21 12:29             ` Robin Murphy
2023-04-21 12:45               ` Jason Gunthorpe
2023-04-21 17:22                 ` Robin Murphy
2023-04-21 17:58                   ` Jason Gunthorpe
2023-04-25 14:48                     ` Robin Murphy
2023-04-25 15:58                       ` Jason Gunthorpe
2023-04-26  8:39                         ` Tian, Kevin
2023-04-26 12:24                         ` Robin Murphy
2023-04-26 12:58                           ` Jason Gunthorpe
2023-04-25 16:37                     ` Nicolin Chen
2023-04-26 11:57                     ` Jason Gunthorpe
2023-04-26 13:53                       ` Robin Murphy [this message]
2023-04-26 14:17                         ` Jason Gunthorpe
2023-04-21 13:21             ` Baolu Lu
2023-04-21 13:33               ` Jason Gunthorpe
2023-04-23  8:24             ` Tian, Kevin
2023-04-24  2:50               ` Baolu Lu
