From: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
To: "hch@lst.de" <hch@lst.de>,
"m.szyprowski@samsung.com" <m.szyprowski@samsung.com>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"will@kernel.org" <will@kernel.org>,
"ganapatrao.kulkarni@cavium.com" <ganapatrao.kulkarni@cavium.com>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>
Cc: "iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>,
huangdaode <huangdaode@huawei.com>,
Linuxarm <linuxarm@huawei.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v3 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA
Date: Mon, 13 Jul 2020 02:45:56 +0000 [thread overview]
Message-ID: <B926444035E5E2439431908E3842AFD257B188@DGGEMI525-MBS.china.huawei.com> (raw)
In-Reply-To: <20200628111251.19108-1-song.bao.hua@hisilicon.com>
> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Sunday, June 28, 2020 11:13 PM
> To: hch@lst.de; m.szyprowski@samsung.com; robin.murphy@arm.com;
> will@kernel.org; ganapatrao.kulkarni@cavium.com;
> catalin.marinas@arm.com
> Cc: iommu@lists.linux-foundation.org; Linuxarm <linuxarm@huawei.com>;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; Song Bao
> Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Subject: [PATCH v3 0/2] make dma_alloc_coherent NUMA-aware by
> per-NUMA CMA
>
> Ganapatrao Kulkarni has put some effort into making arm-smmu-v3 use local
> memory to save command queues[1]. I also did a similar job in the patch
> "iommu/arm-smmu-v3: allocate the memory of queues in local numa node"
> [2] without realizing Ganapatrao had done it before.
>
> But it seems much better to make dma_alloc_coherent() inherently
> NUMA-aware on NUMA-capable systems.
>
> Right now, the SMMU uses dma_alloc_coherent() to get memory for queues
> and tables. Typically, on an ARM64 server, there is a default CMA located at
> node0, which could be far away from node2, node3, etc.
> Placing queues and tables remotely increases the latency of the ARM SMMU
> significantly. For example, when the SMMU is at node2 and the default global
> CMA is at node0, after sending a CMD_SYNC in an empty command queue, we
> have to wait more than 550ns for the CMD_SYNC command to complete.
> However, if we place them locally, we only need to wait 240ns.
>
> With per-NUMA CMA, the SMMU will get memory from the local NUMA node to
> save command queues and page tables. That means dma_unmap latency will
> shrink considerably.
>
> Meanwhile, when iommu.passthrough is on, device drivers which call
> dma_alloc_coherent() will also get local memory and avoid traffic
> between NUMA nodes.
>
> [1] https://lists.linuxfoundation.org/pipermail/iommu/2017-October/024455.html
> [2] https://www.spinics.net/lists/iommu/msg44767.html
>
> -v3:
> * move to using page_to_nid() while freeing CMA, per Robin's
> comment; note this only works after applying my patch below:
> "mm/cma.c: use exact_nid true to fix possible per-numa cma leak"
> https://marc.info/?l=linux-mm&m=159333034726647&w=2
>
> * handle the case count <= 1 more properly, per Robin's
> comment;
>
> * add a pernuma_cma parameter to support dynamic setting of the
> per-numa CMA size;
> Ideally we could leverage CMA_SIZE_MBYTES, CMA_SIZE_PERCENTAGE and
> the "cma=" kernel parameter and avoid a separate new parameter for
> per-numa CMA. Practically, it is too complicated considering the
> below problems:
> (1) if we leverage the size of the default CMA for per-numa, we have
> to avoid creating two CMAs with the same size in node0, since the
> default CMA is probably on node0.
> (2) the default CMA can consider the address limitations of old
> devices, while per-numa CMA doesn't support GFP_DMA and GFP_DMA32;
> all allocations with limitation flags will fall back to the default one.
> (3) it is hard to apply CMA_SIZE_PERCENTAGE to per-numa: it is hard
> to decide whether the percentage should apply to the whole memory
> size or only to the memory size of a specific NUMA node.
> (4) the default CMA size has CMA_SIZE_SEL_MIN and CMA_SIZE_SEL_MAX,
> which make things even more complicated for per-numa CMA.
>
> I haven't figured out a good way to leverage the size of the default
> CMA for per-numa CMA. It seems a separate parameter for per-numa
> could make life easier.
>
> * move dma_pernuma_cma_reserve() after hugetlb_cma_reserve() to
> reuse the comment before hugetlb_cma_reserve() with respect to
> Robin's comment
>
> -v2:
> * fix some issues reported by the kernel test robot
> * fall back to the default CMA when allocation fails in per-numa CMA
> * free memory properly
>
> Barry Song (2):
> dma-direct: provide the ability to reserve per-numa CMA
> arm64: mm: reserve per-numa CMA to localize coherent dma buffers
>
> .../admin-guide/kernel-parameters.txt | 9 ++
> arch/arm64/mm/init.c | 2 +
> include/linux/dma-contiguous.h | 4 +
> kernel/dma/Kconfig | 10 ++
> kernel/dma/contiguous.c | 98 +++++++++++++++++--
> 5 files changed, 114 insertions(+), 9 deletions(-)
Gentle ping :-)
Thanks
Barry
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Thread overview: 11+ messages
2020-06-28 11:12 [PATCH v3 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA Barry Song
2020-06-28 11:12 ` [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA Barry Song
2020-07-22 14:16 ` Christoph Hellwig
2020-07-22 21:26 ` Song Bao Hua (Barry Song)
2020-07-23 11:55 ` Christoph Hellwig
2020-07-22 14:29 ` Christoph Hellwig
2020-07-22 21:41 ` Song Bao Hua (Barry Song)
2020-07-23 12:00 ` Christoph Hellwig
2020-07-23 12:08 ` Song Bao Hua (Barry Song)
2020-06-28 11:12 ` [PATCH v3 2/2] arm64: mm: reserve per-numa CMA to localize coherent dma buffers Barry Song
2020-07-13 2:45 ` Song Bao Hua (Barry Song) [this message]