From: Barry Song <song.bao.hua@hisilicon.com>
To: <hch@lst.de>, <m.szyprowski@samsung.com>, <robin.murphy@arm.com>,
	<catalin.marinas@arm.com>
Cc: <iommu@lists.linux-foundation.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>, <linuxarm@huawei.com>,
	<Jonathan.Cameron@huawei.com>, <john.garry@huawei.com>,
	<prime.zeng@hisilicon.com>,
	Barry Song <song.bao.hua@hisilicon.com>
Subject: [PATCH 0/3] support per-numa CMA for ARM server
Date: Wed, 3 Jun 2020 14:42:28 +1200
Message-ID: <20200603024231.61748-1-song.bao.hua@hisilicon.com>

Right now, the SMMU driver uses dma_alloc_coherent() to allocate memory for
its queues and tables. Typically, on an ARM64 server, there is a single
default CMA area located on node0, which can be far away from node2, node3,
etc. Keeping queues and tables in remote memory increases ARM SMMU latency
significantly. For example, when the SMMU is on node2 and the default global
CMA is on node0, after sending a CMD_SYNC to an empty command queue we have
to wait more than 550ns for the CMD_SYNC to complete. If the queues and
tables are kept in local memory, we only need to wait 240ns.

With per-NUMA CMA, the SMMU driver gets memory from the local NUMA node for
its command queues and page tables, which means dma_unmap latency is reduced
significantly.

Meanwhile, when iommu.passthrough is on, device drivers which call
dma_alloc_coherent() will also get local memory and avoid traffic between
NUMA nodes.
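
The selection logic described above can be sketched in userspace C as
follows. This is only an illustration under stated assumptions, not the code
from the patches: every sketch_* name, the node count, the area sizes and the
fallback to the default area are hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 4      /* hypothetical node count, not from the patch */
#define SKETCH_NO_NODE   (-1)   /* stand-in for "device has no NUMA affinity" */

/* Simplified stand-in for a reserved CMA area; not the kernel's struct cma. */
struct sketch_cma {
	int node;
	size_t size;
};

/* One global default area (assumed to sit on node0) plus one slot per node. */
static struct sketch_cma sketch_default_area = { 0, (size_t)64 << 20 };
static struct sketch_cma *sketch_pernuma_area[SKETCH_MAX_NODES];

/* Record the area reserved for node nid; nid must be a valid node index. */
void sketch_register_area(int nid, struct sketch_cma *area)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);
	sketch_pernuma_area[nid] = area;
}

/* Prefer the area on the device's own node, else fall back to the default. */
struct sketch_cma *sketch_pick_area(int nid)
{
	if (nid >= 0 && nid < SKETCH_MAX_NODES && sketch_pernuma_area[nid])
		return sketch_pernuma_area[nid];
	return &sketch_default_area;
}
```

In this sketch, a device on node2 is served from node2's area once one has
been registered, while a device on a node without its own area, or with no
node affinity at all, still gets the default area.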

Barry Song (3):
  dma-direct: provide the ability to reserve per-numa CMA
  arm64: mm: reserve hugetlb CMA after numa_init
  arm64: mm: reserve per-numa CMA after numa_init

 arch/arm64/mm/init.c           | 12 ++++++----
 include/linux/dma-contiguous.h |  4 ++++
 kernel/dma/Kconfig             | 10 ++++++++
 kernel/dma/contiguous.c        | 43 +++++++++++++++++++++++++++++++++-
 4 files changed, 63 insertions(+), 6 deletions(-)

-- 
2.23.0




Thread overview:
2020-06-03  2:42 [PATCH 0/3] support per-numa CMA for ARM server Barry Song
2020-06-03  2:42 ` [PATCH 1/3] dma-direct: provide the ability to reserve per-numa CMA Barry Song
2020-06-03  6:55   ` kbuild test robot
2020-06-03  7:18   ` kbuild test robot
2020-06-03  7:18   ` [RFC PATCH] dma-direct: dma_contiguous_pernuma_area[] can be static kbuild test robot
2020-06-04 11:36   ` [PATCH 1/3] dma-direct: provide the ability to reserve per-numa CMA Dan Carpenter
2020-06-05  6:04     ` Song Bao Hua (Barry Song)
2020-06-05  8:57       ` Dan Carpenter
2020-06-06  3:46         ` [kbuild-all] " Philip Li
2020-06-06 10:15           ` Song Bao Hua (Barry Song)
2020-06-05 13:57   ` kernel test robot
2020-06-03  2:42 ` [PATCH 2/3] arm64: mm: reserve hugetlb CMA after numa_init Barry Song
2020-06-03  3:22   ` Roman Gushchin
2020-06-07 20:14     ` Matthias Brugger
2020-06-08  0:50       ` Song Bao Hua (Barry Song)
2020-06-09 15:33         ` Matthias Brugger
2020-06-03  2:42 ` [PATCH 3/3] arm64: mm: reserve per-numa " Barry Song
