All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org, will@kernel.org,
	linux-kernel@vger.kernel.org, hch@lst.de
Subject: Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
Date: Fri, 24 Jun 2022 15:49:47 +0100	[thread overview]
Message-ID: <809b0d12-c5ce-2364-268f-f0c4564414c9@arm.com> (raw)
In-Reply-To: <YrW76PPKadbZuN/3@8bytes.org>

On 2022-06-24 14:28, Joerg Roedel wrote:
> On Thu, Jun 23, 2022 at 12:41:00PM +0100, Robin Murphy wrote:
>> On 2022-06-23 12:33, Joerg Roedel wrote:
>>> On Wed, Jun 22, 2022 at 02:12:39PM +0100, Robin Murphy wrote:
>>>> Thanks for your bravery!
>>>
>>> It already starts, with that patch I am getting:
>>>
>>> 	xhci_hcd 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xff00ffffffefe000 flags=0x0000]
>>>
>>> In my kernel log. The device is an AMD XHCI controller and seems to
>>> funciton normally after boot. The message disappears with
>>> iommu.forcedac=0.
>>>
>>> Need to look more into that...
>>
>> Given how amd_iommu_domain_alloc() sets the domain aperture, presumably the
>> DMA address allocated was 0xffffffffffefe000? Odd that it gets bits punched
>> out in the middle rather than simply truncated off the top as I would have
>> expected :/
> 
> So even more weird, as a workaround I changed the AMD IOMMU driver to
> allocate a 4-level page-table and limit the DMA aperture to 48 bits. I
> still get the same message.

Hmm, in that case my best guess would be that somewhere between the 
device itself and the IOMMU input it's trying to sign-extend the address 
from bit 47 or lower, but for whatever reason bits 55:48 get lost.

Comparing the PCI xHCI I have to hand, mine (with nothing plugged in) 
only has 6 pages mapped for its command ring and other stuff. Thus 
unless it's sharing that domain with other devices, to be accessing 
something down in the second MB of IOVA space suggests that this 
probably isn't the very first access it's made, and therefore it would 
almost certainly have to be the endpoint emitting a corrupted address, 
but only for certain operations.

FWIW I'd be inclined to turn on DMA debug and call 
debug_dma_dump_mappings() from the IOMMU fault handler, and/or add a bit 
of tracing to all the DMA mapping/allocation sites in the xHCI driver, 
to see what the offending address most likely represents.

Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

WARNING: multiple messages have this Message-ID (diff)
From: Robin Murphy <robin.murphy@arm.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: will@kernel.org, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, hch@lst.de, john.garry@huawei.com
Subject: Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
Date: Fri, 24 Jun 2022 15:49:47 +0100	[thread overview]
Message-ID: <809b0d12-c5ce-2364-268f-f0c4564414c9@arm.com> (raw)
In-Reply-To: <YrW76PPKadbZuN/3@8bytes.org>

On 2022-06-24 14:28, Joerg Roedel wrote:
> On Thu, Jun 23, 2022 at 12:41:00PM +0100, Robin Murphy wrote:
>> On 2022-06-23 12:33, Joerg Roedel wrote:
>>> On Wed, Jun 22, 2022 at 02:12:39PM +0100, Robin Murphy wrote:
>>>> Thanks for your bravery!
>>>
>>> It already starts, with that patch I am getting:
>>>
>>> 	xhci_hcd 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xff00ffffffefe000 flags=0x0000]
>>>
>>> In my kernel log. The device is an AMD XHCI controller and seems to
>>> funciton normally after boot. The message disappears with
>>> iommu.forcedac=0.
>>>
>>> Need to look more into that...
>>
>> Given how amd_iommu_domain_alloc() sets the domain aperture, presumably the
>> DMA address allocated was 0xffffffffffefe000? Odd that it gets bits punched
>> out in the middle rather than simply truncated off the top as I would have
>> expected :/
> 
> So even more weird, as a workaround I changed the AMD IOMMU driver to
> allocate a 4-level page-table and limit the DMA aperture to 48 bits. I
> still get the same message.

Hmm, in that case my best guess would be that somewhere between the 
device itself and the IOMMU input it's trying to sign-extend the address 
from bit 47 or lower, but for whatever reason bits 55:48 get lost.

Comparing the PCI xHCI I have to hand, mine (with nothing plugged in) 
only has 6 pages mapped for its command ring and other stuff. Thus 
unless it's sharing that domain with other devices, to be accessing 
something down in the second MB of IOVA space suggests that this 
probably isn't the very first access it's made, and therefore it would 
almost certainly have to be the endpoint emitting a corrupted address, 
but only for certain operations.

FWIW I'd be inclined to turn on DMA debug and call 
debug_dma_dump_mappings() from the IOMMU fault handler, and/or add a bit 
of tracing to all the DMA mapping/allocation sites in the xHCI driver, 
to see what the offending address most likely represents.

Robin.

  reply	other threads:[~2022-06-24 14:49 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-09 15:12 [PATCH v2] iommu/dma: Add config for PCI SAC address trick Robin Murphy
2022-06-09 15:12 ` Robin Murphy
2022-06-09 18:09 ` John Garry via iommu
2022-06-09 18:09   ` John Garry
2022-06-10  6:01 ` Christoph Hellwig
2022-06-10  6:01   ` Christoph Hellwig
2022-06-22 12:59 ` Joerg Roedel
2022-06-22 12:59   ` Joerg Roedel
2022-06-22 13:12   ` Robin Murphy
2022-06-22 13:12     ` Robin Murphy
2022-06-23 11:33     ` Joerg Roedel
2022-06-23 11:33       ` Joerg Roedel
2022-06-23 11:41       ` Robin Murphy
2022-06-23 11:41         ` Robin Murphy
2022-06-24 13:28         ` Joerg Roedel
2022-06-24 13:28           ` Joerg Roedel
2022-06-24 14:49           ` Robin Murphy [this message]
2022-06-24 14:49             ` Robin Murphy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=809b0d12-c5ce-2364-268f-f0c4564414c9@arm.com \
    --to=robin.murphy@arm.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.