From: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
To: Robin Murphy <robin.murphy@arm.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Joerg Roedel <joro@8bytes.org>,
<iommu@lists.linux-foundation.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
David Woodhouse <dwmw2@infradead.org>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
<Jonathan.Cameron@huawei.com>, <nwatters@codeaurora.org>,
<ray.jui@broadcom.com>
Subject: Re: [PATCH 0/4] Optimise 64-bit IOVA allocations
Date: Fri, 21 Jul 2017 17:48:25 +0800
Message-ID: <5971CDE9.5040600@huawei.com>
In-Reply-To: <19661034-093e-a744-b6fb-3d23a285ebe3@arm.com>
On 2017/7/19 18:23, Robin Murphy wrote:
> On 19/07/17 09:37, Ard Biesheuvel wrote:
>> On 18 July 2017 at 17:57, Robin Murphy <robin.murphy@arm.com> wrote:
>>> Hi all,
>>>
>>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>>> workloads (e.g. storage I/O with large scatterlists) probably remain quite
>>> heavily influenced by IOVA allocation performance. Separately, Ard also
>>> reported massive performance drops for a graphical desktop on AMD Seattle
>>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>>> the overhead of the rbtree slow path. Whilst we could go around trying to
>>> close up all the little gaps that lead to hitting the slowest case, it
>>> seems a much better idea to simply make said slowest case a lot less slow.
>>>
>>> I had a go at rebasing Leizhen's last IOVA series[1], but ended up finding
>>> the changes rather too hard to follow, so I've taken the liberty here of
>>> picking the whole thing up and reimplementing the main part in a rather
>>> less invasive manner.
>>>
>>> Robin.
>>>
>>> [1] https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg17753.html
>>>
>>> Robin Murphy (1):
>>>   iommu/iova: Extend rbtree node caching
>>>
>>> Zhen Lei (3):
>>>   iommu/iova: Optimise rbtree searching
>>>   iommu/iova: Optimise the padding calculation
>>>   iommu/iova: Make dma_32bit_pfn implicit
>>>
>>>  drivers/gpu/drm/tegra/drm.c      |   3 +-
>>>  drivers/gpu/host1x/dev.c         |   3 +-
>>>  drivers/iommu/amd_iommu.c        |   7 +--
>>>  drivers/iommu/dma-iommu.c        |  18 +------
>>>  drivers/iommu/intel-iommu.c      |  11 ++--
>>>  drivers/iommu/iova.c             | 112 ++++++++++++++++-----------------------
>>>  drivers/misc/mic/scif/scif_rma.c |   3 +-
>>>  include/linux/iova.h             |   8 +--
>>>  8 files changed, 60 insertions(+), 105 deletions(-)
>>>
>>
>> These patches look suspiciously like the ones I have been using over
>> the past couple of weeks (modulo the tegra and host1x changes) from
>> your git tree. They work fine on my AMD Overdrive B1, both in DT and
>> in ACPI/IORT modes, although it is difficult to quantify any
>> performance deltas on my setup.
>
> Indeed - this is a rebase (to account for those new callers) with a
> couple of trivial tweaks to error paths and corner cases that normal
> usage shouldn't have been hitting anyway. "No longer unusably awful" is
> a good enough performance delta for me :)
>
>> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
I measured the same performance with this series as with my own version of the patches. It works well.
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
>
> Thanks!
>
> Robin.
>
> .
>
--
Thanks!
Best Regards
Thread overview: 10+ messages
2017-07-18 16:57 [PATCH 0/4] Optimise 64-bit IOVA allocations Robin Murphy
2017-07-18 16:57 ` [PATCH 1/4] iommu/iova: Optimise rbtree searching Robin Murphy
2017-07-18 16:57 ` [PATCH 2/4] iommu/iova: Optimise the padding calculation Robin Murphy
2017-07-18 16:57 ` [PATCH 3/4] iommu/iova: Extend rbtree node caching Robin Murphy
2017-07-18 16:57 ` [PATCH 4/4] iommu/iova: Make dma_32bit_pfn implicit Robin Murphy
2017-07-19 15:07 ` kbuild test robot
2017-07-20 2:55 ` Leizhen (ThunderTown)
2017-07-19 8:37 ` [PATCH 0/4] Optimise 64-bit IOVA allocations Ard Biesheuvel
2017-07-19 10:23 ` Robin Murphy
2017-07-21 9:48 ` Leizhen (ThunderTown) [this message]