From: Will Deacon <will.deacon@arm.com>
To: Ray Jui <ray.jui@broadcom.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Joerg Roedel <joro@8bytes.org>,
	linux-arm-kernel@lists.infradead.org,
	iommu@lists.linux-foundation.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Device address specific mapping of arm,mmu-500
Date: Wed, 31 May 2017 13:44:19 +0100	[thread overview]
Message-ID: <20170531124418.GE9723@arm.com> (raw)
In-Reply-To: <7bd03bf8-71d5-a974-bea2-a38b4349c547@broadcom.com>

On Tue, May 30, 2017 at 11:13:36PM -0700, Ray Jui wrote:
> I did a little more digging myself and I think I now understand what you
> meant by identity mapping, i.e., configuring the MMU-500 with 1:1 mapping
> between the DMA address and the IOVA address.
> 
> I think that should work. In the end, due to this MSI write parsing issue in
> our PCIe controller, the reason to use IOMMU is to allow the cache
> attributes (AxCACHE) of the MSI writes towards GICv3 ITS to be modified by
> the IOMMU to be device type, while leaving the rest of inbound reads/writes
> from/to DDR with more optimized cache attributes setting, to allow I/O
> coherency to be still enabled for the PCIe controller. In fact, the PCIe
> controller itself is fully capable of DMA to/from the full address space of
> our SoC including both DDR and any device memory.
> 
> The 1:1 mapping will still pose some translation overhead like you
> suggested; however, the overhead of allocating page tables and locking will
> be gone. This sounds like the best possible option I have currently.

It might end up being pretty invasive to work around a hardware bug, so
we'll have to see what it looks like. Ideally, we could just use the SMMU
for everything as-is and work on clawing back the lost performance (it
should be possible to get ~95% of the perf if we sort out the locking, which
we *are* working on).

> May I ask, how do I start to try to get this identity mapping to work as an
> experiment and proof of concept? Any pointer or advice is highly appreciated
> as you can see I'm not very experienced with this. I found Will recently
> added the IOMMU_DOMAIN_IDENTITY support to the arm-smmu driver. But I
> suppose that is to bypass the SMMU completely, instead of still going
> through the MMU with 1:1 translation. Is my understanding correct?

Yes, I don't think IOMMU_DOMAIN_IDENTITY is what you need because you
actually need per-page control of memory attributes.

Robin might have a better idea, but I think you'll have to hack dma-iommu.c
so that you can have a version of the DMA ops that:

  * Initialises the identity map (I guess as normal WB cacheable?)
  * Reserves and maps the MSI region appropriately
  * Just returns the physical address for the dma address for map requests
    (return error for the MSI region)
  * Does nothing for unmap requests

But my strong preference would be to fix the locking overhead from the
SMMU so that the perf hit is acceptable.

Will



Thread overview: 38+ messages
2017-05-30  1:18 Device address specific mapping of arm,mmu-500 Ray Jui
2017-05-30 15:14 ` Will Deacon
2017-05-30 16:49   ` Ray Jui
2017-05-30 16:59     ` Marc Zyngier
2017-05-30 17:16       ` Ray Jui
2017-05-30 17:27         ` Marc Zyngier
2017-05-30 22:06           ` Ray Jui
2017-05-31  6:13             ` Ray Jui
2017-05-31 12:44               ` Will Deacon [this message]
2017-05-31 17:32                 ` Ray Jui
2017-06-05 18:03                   ` Ray Jui
2017-06-06 10:02                     ` Robin Murphy
2017-06-07  6:20                       ` Ray Jui
2017-05-30 17:27     ` Robin Murphy