From: John Garry <john.garry@huawei.com>
To: <will@kernel.org>, <robin.murphy@arm.com>
Cc: <joro@8bytes.org>, <trivial@kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<iommu@lists.linux-foundation.org>,
	<linux-kernel@vger.kernel.org>, <linuxarm@huawei.com>,
	<maz@kernel.org>
Subject: Re: [PATCH 0/4] iommu/arm-smmu-v3: Improve cmdq lock efficiency
Date: Wed, 8 Jul 2020 14:00:54 +0100
Message-ID: <a5f98ff2-2d93-7306-9af9-a7bfc347757e@huawei.com>
In-Reply-To: <1592846920-45338-1-git-send-email-john.garry@huawei.com>

On 22/06/2020 18:28, John Garry wrote:

Hi, can you guys let me know if this series is on the radar at all?

I have been talking about this performance issue since January, and have
not really been getting any response.

thanks

> As mentioned in [0], the CPU may consume many cycles processing
> arm_smmu_cmdq_issue_cmdlist(). One issue we find is the cmpxchg() loop to
> get space on the queue takes approx 25% of the cycles for this function.
> 
> This series removes that cmpxchg().
> 
> For my NVMe test with 3x NVMe SSDs, I'm getting a ~24% throughput
> increase:
> Before: 1310 IOPs
> After: 1630 IOPs
> 
> I also have a test harness to check the rate of DMA map+unmaps we can
> achieve:
> 
> CPU count	32	64	128
> Before:		63187	19418	10169
> After:		93287	44789	15862
> 
> (unit is map+unmaps per CPU per second)
> 
> [0] https://lore.kernel.org/linux-iommu/B926444035E5E2439431908E3842AFD24B86DB@DGGEMI525-MBS.china.huawei.com/T/#ma02e301c38c3e94b7725e685757c27e39c7cbde3
> 
> John Garry (4):
>    iommu/arm-smmu-v3: Fix trivial typo
>    iommu/arm-smmu-v3: Calculate bits for prod and owner
>    iommu/arm-smmu-v3: Always issue a CMD_SYNC per batch
>    iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()
> 
>   drivers/iommu/arm-smmu-v3.c | 233 +++++++++++++++++++++++-------------
>   1 file changed, 151 insertions(+), 82 deletions(-)
> 

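For readers unfamiliar with the pattern named in the cover letter, the following is a minimal sketch of why a cmpxchg() retry loop for reserving queue space can cost cycles under contention, next to one common alternative. It is not the driver code and not the method used in this series: the source does not show how the cmpxchg() is removed, so the fetch-add variant is an assumption for illustration only. The names toy_queue, QUEUE_SLOTS, reserve_cmpxchg(), reserve_fetch_add() and range_fits() are hypothetical, and the indices are plain free-running counters rather than the SMMU's prod/cons encoding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SLOTS 8u /* hypothetical queue depth */

/* Toy queue: free-running producer/consumer counters (illustrative only). */
struct toy_queue {
	_Atomic uint32_t prod;
	_Atomic uint32_t cons;
};

static inline void toy_queue_init(struct toy_queue *q)
{
	atomic_init(&q->prod, 0);
	atomic_init(&q->cons, 0);
}

/*
 * Reserve n slots with a compare-and-swap retry loop. Every CPU that loses
 * the race must reload prod and try again, so the cost grows with contention.
 * Returns false if the queue does not currently have n free slots.
 */
static inline bool reserve_cmpxchg(struct toy_queue *q, uint32_t n,
				   uint32_t *start)
{
	uint32_t old = atomic_load(&q->prod);

	assert(n <= QUEUE_SLOTS);
	for (;;) {
		uint32_t used = old - atomic_load(&q->cons);

		if (QUEUE_SLOTS - used < n)
			return false;
		/* On failure, old is refreshed with the current prod value. */
		if (atomic_compare_exchange_weak(&q->prod, &old, old + n)) {
			*start = old;
			return true;
		}
	}
}

/*
 * Assumed alternative: claim a range with one unconditional fetch-add.
 * There is no retry loop, but the caller must then wait until the claimed
 * range actually fits in the queue (see range_fits()).
 */
static inline uint32_t reserve_fetch_add(struct toy_queue *q, uint32_t n)
{
	return atomic_fetch_add(&q->prod, n);
}

/* True once the claimed range [start, start + n) fits below cons + depth. */
static inline bool range_fits(struct toy_queue *q, uint32_t start, uint32_t n)
{
	return (start + n) - atomic_load(&q->cons) <= QUEUE_SLOTS;
}
```

The trade-off in this sketch is that the fetch-add variant moves the waiting from "retry until my update wins" to "wait until the consumer has freed my range", which removes the contended retry loop but means a claimed range cannot be handed back.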
