From: "Isaac J. Manjarres" <isaacm@codeaurora.org>
To: will@kernel.org, robin.murphy@arm.com, joro@8bytes.org
Cc: "Isaac J. Manjarres" <isaacm@codeaurora.org>,
pdaly@codeaurora.org, linux-kernel@vger.kernel.org,
iommu@lists.linux-foundation.org, pratikp@codeaurora.org,
linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 0/5] Optimize iommu_map_sg() performance
Date: Mon, 11 Jan 2021 06:54:17 -0800
Message-ID: <1610376862-927-1-git-send-email-isaacm@codeaurora.org>

The iommu_map_sg() code currently iterates through the given
scatter-gather list, and in the worst case, invokes iommu_map()
for each element in the scatter-gather list, which calls into
the IOMMU driver through an indirect call. For an IOMMU driver
that uses a format supported by the io-pgtable code, the IOMMU
driver will then call into the io-pgtable code to map the chunk.
Jumping between the IOMMU core code, the IOMMU driver, and the
io-pgtable code and back for each element in a scatter-gather list
is not efficient.
Instead, add a map_sg() hook to both the IOMMU driver ops and the
io-pgtable ops. iommu_map_sg() can then pass the entire scatter-gather
list to the IOMMU driver's map_sg() hook, which in turn calls the
io-pgtable map_sg() hook to process the whole list. This significantly
reduces the number of indirect calls and jumps between these layers,
boosting performance.
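
To make the call pattern described above concrete, here is a minimal
userspace sketch. It is not the code from this series: struct toy_sg,
struct toy_ops, struct toy_domain and every toy_* function are
hypothetical stand-ins for the scatter-gather list, the driver and
io-pgtable ops, and the IOMMU core. It models the worst case, where
every element results in its own map call, and counts how many indirect
calls each path makes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter-gather element (not struct scatterlist). */
struct toy_sg {
	uint64_t phys;
	size_t len;
};

struct toy_domain;

/* One hook table shape, reused for the "driver" and "io-pgtable" layers of the model. */
struct toy_ops {
	int (*map)(struct toy_domain *d, uint64_t iova, uint64_t phys, size_t len);
	int (*map_sg)(struct toy_domain *d, uint64_t iova,
		      const struct toy_sg *sg, size_t nents, size_t *mapped);
};

struct toy_domain {
	const struct toy_ops *drv;	/* models the IOMMU driver ops */
	const struct toy_ops *pgt;	/* models the io-pgtable ops */
	unsigned long indirect_calls;	/* calls made through a function pointer */
	size_t bytes_installed;		/* bytes the "page table" has accepted */
};

/* Direct (non-indirect) helper that pretends to install one mapping. */
static int pgt_install(struct toy_domain *d, uint64_t iova, uint64_t phys, size_t len)
{
	(void)iova;
	(void)phys;
	d->bytes_installed += len;
	return 0;
}

static int pgt_map(struct toy_domain *d, uint64_t iova, uint64_t phys, size_t len)
{
	return pgt_install(d, iova, phys, len);
}

/* The whole list is walked inside the page-table layer, using direct calls only. */
static int pgt_map_sg(struct toy_domain *d, uint64_t iova,
		      const struct toy_sg *sg, size_t nents, size_t *mapped)
{
	for (size_t i = 0; i < nents; i++) {
		int ret = pgt_install(d, iova + *mapped, sg[i].phys, sg[i].len);

		if (ret)
			return ret;
		*mapped += sg[i].len;
	}
	return 0;
}

static int drv_map(struct toy_domain *d, uint64_t iova, uint64_t phys, size_t len)
{
	d->indirect_calls++;		/* driver -> io-pgtable */
	return d->pgt->map(d, iova, phys, len);
}

static int drv_map_sg(struct toy_domain *d, uint64_t iova,
		      const struct toy_sg *sg, size_t nents, size_t *mapped)
{
	d->indirect_calls++;		/* driver -> io-pgtable, once per list */
	return d->pgt->map_sg(d, iova, sg, nents, mapped);
}

static const struct toy_ops toy_pgt_ops = { .map = pgt_map, .map_sg = pgt_map_sg };
static const struct toy_ops toy_drv_ops = { .map = drv_map, .map_sg = drv_map_sg };

void toy_domain_init(struct toy_domain *d)
{
	assert(d != NULL);
	d->drv = &toy_drv_ops;
	d->pgt = &toy_pgt_ops;
	d->indirect_calls = 0;
	d->bytes_installed = 0;
}

/* Worst-case model of the existing path: one trip through both layers per element. */
int toy_map_sg_per_element(struct toy_domain *d, uint64_t iova,
			   const struct toy_sg *sg, size_t nents, size_t *mapped)
{
	*mapped = 0;
	for (size_t i = 0; i < nents; i++) {
		int ret;

		d->indirect_calls++;	/* core -> driver */
		ret = d->drv->map(d, iova + *mapped, sg[i].phys, sg[i].len);
		if (ret)
			return ret;
		*mapped += sg[i].len;
	}
	return 0;
}

/* Model of the proposed path: the list crosses each layer boundary once. */
int toy_map_sg_batched(struct toy_domain *d, uint64_t iova,
		       const struct toy_sg *sg, size_t nents, size_t *mapped)
{
	*mapped = 0;
	d->indirect_calls++;		/* core -> driver */
	return d->drv->map_sg(d, iova, sg, nents, mapped);
}
```

In this model, mapping a list of N elements costs 2 * N indirect calls
on the per-element path and 2 on the batched path, while both paths
install the same number of bytes. The real kernel paths do more work
per call than this sketch suggests, so the counts only illustrate the
shape of the saving, not its size.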
On a system that uses the ARM SMMU driver, and the ARM LPAE format,
the current implementation of iommu_map_sg() yields the following
latencies for mapping scatter-gather lists of various sizes. These
latencies are calculated by repeating the mapping operation 10 times:

    size    iommu_map_sg latency
      4K            0.624 us
     64K            9.468 us
      1M          122.557 us
      2M          239.807 us
     12M         1435.979 us
     24M         2884.968 us
     32M         3832.979 us

On the same system, the proposed modifications yield the following
results:

    size    iommu_map_sg latency
      4K            3.645 us
     64K            4.198 us
      1M           11.010 us
      2M           17.125 us
     12M           82.416 us
     24M          158.677 us
     32M          210.468 us

The procedure for collecting the iommu_map_sg latencies is
the same in both experiments. Clearly, reducing the jumps
between the different layers in the IOMMU code significantly
reduces iommu_map_sg() latency.
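
For readers who want the two sets of measurements side by side, the
sketch below computes the before/after ratio for each size. The numbers
are copied from the tables in this message; the struct and function
names (latency_row, latency_ratio, latency_row_at) are made up for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One row from each table above: latency before and after the series, in us. */
struct latency_row {
	const char *size;
	double before_us;
	double after_us;
};

static const struct latency_row latency_rows[] = {
	{ "4K",     0.624,   3.645 },
	{ "64K",    9.468,   4.198 },
	{ "1M",   122.557,  11.010 },
	{ "2M",   239.807,  17.125 },
	{ "12M", 1435.979,  82.416 },
	{ "24M", 2884.968, 158.677 },
	{ "32M", 3832.979, 210.468 },
};

const size_t latency_row_count = sizeof(latency_rows) / sizeof(latency_rows[0]);

/* Bounds-checked accessor so callers do not index the table directly. */
const struct latency_row *latency_row_at(size_t i)
{
	assert(i < latency_row_count);
	return &latency_rows[i];
}

/* Ratio > 1 means the modified code is faster; < 1 means it is slower. */
double latency_ratio(const struct latency_row *r)
{
	assert(r != NULL && r->after_us > 0.0);
	return r->before_us / r->after_us;
}
```

Looping over latency_row_at(0) through latency_row_at(latency_row_count - 1)
and printing latency_ratio() for each row gives a ratio of roughly 18
at 32M. The 4K row is the one case in these tables where the ratio
falls below 1, since 3.645 us is larger than 0.624 us.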

Changes since v1:

- Fixed an off-by-one error in arm_[lpae/v7s]_map_by_pgsize
  when checking whether the IOVA and physical address ranges
  being mapped are within the appropriate limits.
- Added Sai Prakash Ranjan's "Tested-by" tag.

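
As a general illustration of the kind of range check mentioned in the
first item, and not the actual code or fix from this series, the sketch
below tests whether a range fits in an address space of a given width.
Both function names and the "bits" parameter are hypothetical. The
point is that the last byte of the range is addr + size - 1, so testing
addr + size instead wrongly rejects a range that ends exactly at the
top of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if [addr, addr + size) lies entirely within
 * an address space that is `bits` wide. The check uses the last byte of
 * the range, not the one-past-the-end address.
 */
bool toy_range_fits(uint64_t addr, size_t size, unsigned int bits)
{
	uint64_t last;

	assert(bits > 0 && bits < 64);
	if (size == 0)
		return true;
	last = addr + (uint64_t)size - 1;
	if (last < addr)		/* the range wrapped around */
		return false;
	return (last >> bits) == 0;
}

/*
 * Off-by-one variant, kept only for comparison: it tests the address one
 * past the end, so a range ending exactly at the limit is rejected.
 */
bool toy_range_fits_off_by_one(uint64_t addr, size_t size, unsigned int bits)
{
	uint64_t end = addr + (uint64_t)size;

	assert(bits > 0 && bits < 64);
	return (end >> bits) == 0;
}
```

With a 32-bit address space, the 4K range starting at 0xFFFFF000 is
accepted by toy_range_fits() and rejected by the off-by-one variant,
even though its last byte, 0xFFFFFFFF, is a valid address.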
Thanks,
Isaac

Isaac J. Manjarres (5):
  iommu/io-pgtable: Introduce map_sg() as a page table op
  iommu/io-pgtable-arm: Hook up map_sg()
  iommu/io-pgtable-arm-v7s: Hook up map_sg()
  iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
  iommu/arm-smmu: Hook up map_sg()

 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 ++++++++
 drivers/iommu/io-pgtable-arm-v7s.c    | 90 +++++++++++++++++++++++++++++++++++
 drivers/iommu/io-pgtable-arm.c        | 86 +++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c                 | 25 ++++++++--
 include/linux/io-pgtable.h            |  6 +++
 include/linux/iommu.h                 | 13 +++++
 6 files changed, 234 insertions(+), 5 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project