linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Ashok Raj <ashok_raj@linux.intel.com>
Cc: Baolu Lu <baolu.lu@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	iommu@lists.linux.dev, Ashok Raj <ashok.raj@intel.com>
Subject: Re: Question about reserved_regions w/ Intel IOMMU
Date: Thu, 8 Jun 2023 10:10:54 -0700	[thread overview]
Message-ID: <CAKgT0UfTzExYZGNCEXCJaS7huWDxwoC3Z_2JCzJHAgr9Qyxmsg@mail.gmail.com> (raw)
In-Reply-To: <ZIH1/e2OcCuD7DEi@araj-dh-work>

On Thu, Jun 8, 2023 at 8:40 AM Ashok Raj <ashok_raj@linux.intel.com> wrote:
>
> On Thu, Jun 08, 2023 at 07:33:31AM -0700, Alexander Duyck wrote:
> > On Wed, Jun 7, 2023 at 8:05 PM Baolu Lu <baolu.lu@linux.intel.com> wrote:
> > >
> > > On 6/8/23 7:03 AM, Alexander Duyck wrote:
> > > > On Wed, Jun 7, 2023 at 3:40 PM Alexander Duyck
> > > > <alexander.duyck@gmail.com> wrote:
> > > >>
> > > >> I am running into a DMA issue that appears to be a conflict between
> > > >> ACS and IOMMU. As per the documentation I can find, the IOMMU is
> > > >> supposed to create reserved regions for MSI and the memory window
> > > >> behind the root port. However looking at reserved_regions I am not
> > > >> seeing that. I only see the reservation for the MSI.
> > > >>
> > > >> So for example with an enabled NIC and iommu enabled w/o passthru I am seeing:
> > > >> # cat /sys/bus/pci/devices/0000\:83\:00.0/iommu_group/reserved_regions
> > > >> 0x00000000fee00000 0x00000000feefffff msi
> > > >>
> > > >> Shouldn't there also be a memory window for the region behind the root
> > > >> port to prevent any possible peer-to-peer access?
> > > >
> > > > Since the iommu portion of the email bounced I figured I would fix
> > > > that and provide some additional info.
> > > >
> > > > I added some instrumentation to the kernel to dump the resources found
> > > > in iova_reserve_pci_windows. From what I can tell it is finding the
> > > > correct resources for the Memory and Prefetchable regions behind the
> > > > root port. It seems to be calling reserve_iova which is successfully
> > > > allocating an iova to reserve the region.
> > > >
> > > > However still no luck on why it isn't showing up in reserved_regions.
> > >
> > > Perhaps I can ask the opposite question, why it should show up in
> > > reserve_regions? Why does the iommu subsystem block any possible peer-
> > > to-peer DMA access? Isn't that a decision of the device driver.
> > >
> > > The iova_reserve_pci_windows() you've seen is for kernel DMA interfaces
> > > which is not related to peer-to-peer accesses.
> >
> > The problem is if the IOVA overlaps with the physical addresses of
> > other devices that can be routed to via ACS redirect. As such if ACS
> > redirect is enabled a host IOVA could be directed to another device on
> > the switch instead. To prevent that we need to reserve those addresses
> > to avoid address space collisions.

Our test case is just to perform DMA to/from the host on one device on
a switch and what we are seeing is that when we hit an IOVA that
matches up with the physical address of the neighboring devices BAR0
then we are seeing an AER followed by a hot reset.

> Any untranslated address from a device must be forwarded to the IOMMU when
> ACS is enabled correct?I guess if you want true p2p, then you would need
> to map so that the hpa turns into the peer address.. but its always a round
> trip to IOMMU.

This assumes all parts are doing the Request Redirect "correctly". In
our case there is a PCIe switch we are trying to debug and we have a
few working theories. One concern I have is that the switch may be
throwing an ACS violation for us using an address that matches a
neighboring device instead of redirecting it to the upstream port. If
we pull the switch and just run on the root complex the issue seems to
be resolved so I started poking into the code which led me to the
documentation pointing out what is supposed to be reserved based on
the root complex and MSI regions.

As a part of going down that rabbit hole I realized that the
reserved_regions seems to only list the MSI reservation. However after
digging a bit deeper it seems like there is code to reserve the memory
behind the root complex in the IOVA but it doesn't look like that is
visible anywhere and is the piece I am currently trying to sort out.
What I am working on is trying to figure out if the system that is
failing is actually reserving that memory region in the IOVA, or if
that is somehow not happening in our test setup.

  reply	other threads:[~2023-06-08 17:11 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-07 22:40 Question about reserved_regions w/ Intel IOMMU Alexander Duyck
2023-06-07 23:03 ` Alexander Duyck
2023-06-08  3:03   ` Baolu Lu
2023-06-08 14:33     ` Alexander Duyck
2023-06-08 15:38       ` Ashok Raj
2023-06-08 17:10         ` Alexander Duyck [this message]
2023-06-08 17:52           ` Ashok Raj
2023-06-08 18:15             ` Alexander Duyck
2023-06-08 18:02           ` Robin Murphy
2023-06-08 18:17             ` Alexander Duyck
2023-06-08 15:28     ` Robin Murphy
2023-06-13 15:54       ` Jason Gunthorpe
2023-06-16  8:39         ` Tian, Kevin
2023-06-16 12:20           ` Jason Gunthorpe
2023-06-16 15:27             ` Alexander Duyck
2023-06-16 16:34               ` Robin Murphy
2023-06-16 18:59                 ` Jason Gunthorpe
2023-06-19 10:20                   ` Robin Murphy
2023-06-19 14:02                     ` Jason Gunthorpe
2023-06-20 14:57                       ` Alexander Duyck
2023-06-20 16:55                         ` Jason Gunthorpe
2023-06-20 17:47                           ` Alexander Duyck
2023-06-21 11:30                             ` Robin Murphy
2023-06-16 18:48               ` Jason Gunthorpe
2023-06-21  8:16             ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKgT0UfTzExYZGNCEXCJaS7huWDxwoC3Z_2JCzJHAgr9Qyxmsg@mail.gmail.com \
    --to=alexander.duyck@gmail.com \
    --cc=ashok.raj@intel.com \
    --cc=ashok_raj@linux.intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=iommu@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).