All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhen Lei <thunder.leizhen@huawei.com>
To: Joerg Roedel <joro@8bytes.org>,
	iommu <iommu@lists.linux-foundation.org>,
	Robin Murphy <robin.murphy@arm.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Sudeep Dutt <sudeep.dutt@intel.com>,
	Ashutosh Dixit <ashutosh.dixit@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Cc: Zefan Li <lizefan@huawei.com>, Xinwei Hu <huxinwei@huawei.com>,
	"Tianhong Ding" <dingtianhong@huawei.com>,
	Hanjun Guo <guohanjun@huawei.com>,
	Zhen Lei <thunder.leizhen@huawei.com>
Subject: [PATCH 0/7] iommu/iova: improve the allocation performance of dma64
Date: Wed, 22 Mar 2017 14:27:40 +0800	[thread overview]
Message-ID: <1490164067-12552-1-git-send-email-thunder.leizhen@huawei.com> (raw)

64 bits devices is very common now. But currently we only defined a cached32_node
to optimize the allocation performance of dma32, and I saw some dma64 drivers chose
to allocate iova from dma32 space first, maybe becuase of current dma64 performance
problem or some other reasons.

For example:(in drivers/iommu/amd_iommu.c)
static unsigned long dma_ops_alloc_iova(......
{
	......
	if (dma_mask > DMA_BIT_MASK(32))
		pfn = alloc_iova_fast(&dma_dom->iovad, pages,
				      IOVA_PFN(DMA_BIT_MASK(32)));
	if (!pfn)
		pfn = alloc_iova_fast(&dma_dom->iovad, pages, IOVA_PFN(dma_mask));
		
For the details of why dma64 iova allocation performance is very bad, please refer the
description of patch-5.

In this patch series, I added a cached64_node to manage the dma64 iova space(iova>=4G), it
takes the same effect as cached32_node(iova<4G).

Below it's the performance data before and after my patch series:
(before)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

(after)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec


Zhen Lei (7):
  iommu/iova: fix incorrect variable types
  iommu/iova: cut down judgement times
  iommu/iova: insert start_pfn boundary of dma32
  iommu/iova: adjust __cached_rbnode_insert_update
  iommu/iova: to optimize the allocation performance of dma64
  iommu/iova: move the caculation of pad mask out of loop
  iommu/iova: fix iovad->dma_32bit_pfn as the last pfn of dma32

 drivers/iommu/amd_iommu.c        |   7 +-
 drivers/iommu/dma-iommu.c        |  22 ++----
 drivers/iommu/intel-iommu.c      |  11 +--
 drivers/iommu/iova.c             | 143 +++++++++++++++++++++------------------
 drivers/misc/mic/scif/scif_rma.c |   3 +-
 include/linux/iova.h             |   7 +-
 6 files changed, 94 insertions(+), 99 deletions(-)

-- 
2.5.0

WARNING: multiple messages have this Message-ID (diff)
From: Zhen Lei <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
To: Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>,
	iommu
	<iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>,
	David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Sudeep Dutt <sudeep.dutt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Ashutosh Dixit
	<ashutosh.dixit-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	linux-kernel
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Cc: Xinwei Hu <huxinwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Zhen Lei
	<thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Hanjun Guo <guohanjun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Tianhong Ding
	<dingtianhong-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Subject: [PATCH 0/7] iommu/iova: improve the allocation performance of dma64
Date: Wed, 22 Mar 2017 14:27:40 +0800	[thread overview]
Message-ID: <1490164067-12552-1-git-send-email-thunder.leizhen@huawei.com> (raw)

64 bits devices is very common now. But currently we only defined a cached32_node
to optimize the allocation performance of dma32, and I saw some dma64 drivers chose
to allocate iova from dma32 space first, maybe becuase of current dma64 performance
problem or some other reasons.

For example:(in drivers/iommu/amd_iommu.c)
static unsigned long dma_ops_alloc_iova(......
{
	......
	if (dma_mask > DMA_BIT_MASK(32))
		pfn = alloc_iova_fast(&dma_dom->iovad, pages,
				      IOVA_PFN(DMA_BIT_MASK(32)));
	if (!pfn)
		pfn = alloc_iova_fast(&dma_dom->iovad, pages, IOVA_PFN(dma_mask));
		
For the details of why dma64 iova allocation performance is very bad, please refer the
description of patch-5.

In this patch series, I added a cached64_node to manage the dma64 iova space(iova>=4G), it
takes the same effect as cached32_node(iova<4G).

Below it's the performance data before and after my patch series:
(before)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

(after)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec


Zhen Lei (7):
  iommu/iova: fix incorrect variable types
  iommu/iova: cut down judgement times
  iommu/iova: insert start_pfn boundary of dma32
  iommu/iova: adjust __cached_rbnode_insert_update
  iommu/iova: to optimize the allocation performance of dma64
  iommu/iova: move the caculation of pad mask out of loop
  iommu/iova: fix iovad->dma_32bit_pfn as the last pfn of dma32

 drivers/iommu/amd_iommu.c        |   7 +-
 drivers/iommu/dma-iommu.c        |  22 ++----
 drivers/iommu/intel-iommu.c      |  11 +--
 drivers/iommu/iova.c             | 143 +++++++++++++++++++++------------------
 drivers/misc/mic/scif/scif_rma.c |   3 +-
 include/linux/iova.h             |   7 +-
 6 files changed, 94 insertions(+), 99 deletions(-)

-- 
2.5.0

             reply	other threads:[~2017-03-22  6:29 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-22  6:27 Zhen Lei [this message]
2017-03-22  6:27 ` [PATCH 0/7] iommu/iova: improve the allocation performance of dma64 Zhen Lei
2017-03-22  6:27 ` [PATCH 1/7] iommu/iova: fix incorrect variable types Zhen Lei
2017-03-22  6:27   ` Zhen Lei
2017-03-23 11:42   ` Robin Murphy
2017-03-24  2:27     ` Leizhen (ThunderTown)
2017-03-24  2:27       ` Leizhen (ThunderTown)
2017-03-31  3:30       ` Leizhen (ThunderTown)
2017-03-31  3:30         ` Leizhen (ThunderTown)
2017-03-22  6:27 ` [PATCH 2/7] iommu/iova: cut down judgement times Zhen Lei
2017-03-22  6:27   ` Zhen Lei
2017-03-23 12:11   ` Robin Murphy
2017-03-23 12:11     ` Robin Murphy
2017-03-31  3:55     ` Leizhen (ThunderTown)
2017-03-31  3:55       ` Leizhen (ThunderTown)
2017-03-22  6:27 ` [PATCH 3/7] iommu/iova: insert start_pfn boundary of dma32 Zhen Lei
2017-03-22  6:27   ` Zhen Lei
2017-03-23 13:01   ` Robin Murphy
2017-03-23 13:01     ` Robin Murphy
2017-03-24  3:43     ` Leizhen (ThunderTown)
2017-03-24  3:43       ` Leizhen (ThunderTown)
2017-03-31  3:32       ` Leizhen (ThunderTown)
2017-03-31  3:32         ` Leizhen (ThunderTown)
2017-03-22  6:27 ` [PATCH 4/7] iommu/iova: adjust __cached_rbnode_insert_update Zhen Lei
2017-03-22  6:27   ` Zhen Lei
2017-03-22  6:27 ` [PATCH 5/7] iommu/iova: to optimize the allocation performance of dma64 Zhen Lei
2017-03-22  6:27   ` Zhen Lei
2017-03-22  6:27 ` [PATCH 6/7] iommu/iova: move the caculation of pad mask out of loop Zhen Lei
2017-03-22  6:27   ` Zhen Lei
2017-03-22  6:27 ` [PATCH 7/7] iommu/iova: fix iovad->dma_32bit_pfn as the last pfn of dma32 Zhen Lei
2017-03-22  6:27   ` Zhen Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1490164067-12552-1-git-send-email-thunder.leizhen@huawei.com \
    --to=thunder.leizhen@huawei.com \
    --cc=ashutosh.dixit@intel.com \
    --cc=dingtianhong@huawei.com \
    --cc=dwmw2@infradead.org \
    --cc=guohanjun@huawei.com \
    --cc=huxinwei@huawei.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=robin.murphy@arm.com \
    --cc=sudeep.dutt@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.