All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Garry <john.garry@huawei.com>
To: Will Deacon <will@kernel.org>
Cc: <robin.murphy@arm.com>, <joro@8bytes.org>, <trivial@kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<iommu@lists.linux-foundation.org>,
	<linux-kernel@vger.kernel.org>, <linuxarm@huawei.com>,
	<maz@kernel.org>
Subject: Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()
Date: Thu, 16 Jul 2020 11:26:29 +0100	[thread overview]
Message-ID: <36fe9596-745b-b01c-181c-b87a544a413b@huawei.com> (raw)
In-Reply-To: <20200716102037.GB7036@willie-the-truck>

On 16/07/2020 11:20, Will Deacon wrote:
> On Tue, Jun 23, 2020 at 01:28:40AM +0800, John Garry wrote:
>> It has been shown that the cmpxchg() for finding space in the cmdq can
>> be a bottleneck:
>> - for more CPUs contending the cmdq, the cmpxchg() will fail more often
>> - since the software-maintained cons pointer is updated on the same 64b
>>    memory region, the chance of cmpxchg() failure increases again
>>
>> The cmpxchg() is removed as part of 2 related changes:
>>
>> - Update prod and cmdq owner in a single atomic add operation. For this, we
>>    count the prod and owner in separate regions in prod memory.
>>
>>    As with simple binary counting, once the prod+wrap fields overflow, they
>>    will zero. They should never overflow into "owner" region, and we zero
>>    the non-owner, prod region for each owner. This maintains the prod
>>    pointer.
>>
>>    As for the "owner", we now count this value, instead of setting a flag.
>>    Similar to before, once the owner has finished gathering, it will clear
>>    a mask. As such, a CPU declares itself as the "owner" when it reads zero
>>    for this region. This zeroing will also clear possible overflow in
>>    wrap+prod region, above.
>>
>>    The owner is now responsible for all cmdq locking to avoid possible
>>    deadlock. The owner will lock the cmdq for all non-owers it has gathered
>>    when they have space in the queue and have written their entries.
>>
>> - Check for space in the cmdq after the prod pointer has been assigned.
>>
>>    We don't bother checking for space in the cmdq before assigning the prod
>>    pointer, as this would be racy.
>>
>>    So since the prod pointer is updated unconditionally, it would be common
>>    for no space to be available in the cmdq when prod is assigned - that
>>    is, according the software-maintained prod and cons pointer. So now
>>    it must be ensured that the entries are not yet written and not until
>>    there is space.
>>
>>    How the prod pointer is maintained also leads to a strange condition
>>    where the prod pointer can wrap past the cons pointer. We can detect this
>>    condition, and report no space here. However, a prod pointer progressed
>>    twice past the cons pointer cannot be detected. But it can be ensured that
>>    this that this scenario does not occur, as we limit the amount of
>>    commands any CPU can issue at any given time, such that we cannot
>>    progress prod pointer further.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   drivers/iommu/arm-smmu-v3.c | 101 ++++++++++++++++++++++--------------
>>   1 file changed, 61 insertions(+), 40 deletions(-)
> 
> I must admit, you made me smile putting trivial@kernel.org on cc for this ;)
> 

Yes, quite ironic :)

WARNING: multiple messages have this Message-ID (diff)
From: John Garry <john.garry@huawei.com>
To: Will Deacon <will@kernel.org>
Cc: trivial@kernel.org, maz@kernel.org, linux-kernel@vger.kernel.org,
	linuxarm@huawei.com, iommu@lists.linux-foundation.org,
	robin.murphy@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()
Date: Thu, 16 Jul 2020 11:26:29 +0100	[thread overview]
Message-ID: <36fe9596-745b-b01c-181c-b87a544a413b@huawei.com> (raw)
In-Reply-To: <20200716102037.GB7036@willie-the-truck>

On 16/07/2020 11:20, Will Deacon wrote:
> On Tue, Jun 23, 2020 at 01:28:40AM +0800, John Garry wrote:
>> It has been shown that the cmpxchg() for finding space in the cmdq can
>> be a bottleneck:
>> - for more CPUs contending the cmdq, the cmpxchg() will fail more often
>> - since the software-maintained cons pointer is updated on the same 64b
>>    memory region, the chance of cmpxchg() failure increases again
>>
>> The cmpxchg() is removed as part of 2 related changes:
>>
>> - Update prod and cmdq owner in a single atomic add operation. For this, we
>>    count the prod and owner in separate regions in prod memory.
>>
>>    As with simple binary counting, once the prod+wrap fields overflow, they
>>    will zero. They should never overflow into "owner" region, and we zero
>>    the non-owner, prod region for each owner. This maintains the prod
>>    pointer.
>>
>>    As for the "owner", we now count this value, instead of setting a flag.
>>    Similar to before, once the owner has finished gathering, it will clear
>>    a mask. As such, a CPU declares itself as the "owner" when it reads zero
>>    for this region. This zeroing will also clear possible overflow in
>>    wrap+prod region, above.
>>
>>    The owner is now responsible for all cmdq locking to avoid possible
>>    deadlock. The owner will lock the cmdq for all non-owers it has gathered
>>    when they have space in the queue and have written their entries.
>>
>> - Check for space in the cmdq after the prod pointer has been assigned.
>>
>>    We don't bother checking for space in the cmdq before assigning the prod
>>    pointer, as this would be racy.
>>
>>    So since the prod pointer is updated unconditionally, it would be common
>>    for no space to be available in the cmdq when prod is assigned - that
>>    is, according the software-maintained prod and cons pointer. So now
>>    it must be ensured that the entries are not yet written and not until
>>    there is space.
>>
>>    How the prod pointer is maintained also leads to a strange condition
>>    where the prod pointer can wrap past the cons pointer. We can detect this
>>    condition, and report no space here. However, a prod pointer progressed
>>    twice past the cons pointer cannot be detected. But it can be ensured that
>>    this that this scenario does not occur, as we limit the amount of
>>    commands any CPU can issue at any given time, such that we cannot
>>    progress prod pointer further.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   drivers/iommu/arm-smmu-v3.c | 101 ++++++++++++++++++++++--------------
>>   1 file changed, 61 insertions(+), 40 deletions(-)
> 
> I must admit, you made me smile putting trivial@kernel.org on cc for this ;)
> 

Yes, quite ironic :)
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

WARNING: multiple messages have this Message-ID (diff)
From: John Garry <john.garry@huawei.com>
To: Will Deacon <will@kernel.org>
Cc: trivial@kernel.org, maz@kernel.org, joro@8bytes.org,
	linux-kernel@vger.kernel.org, linuxarm@huawei.com,
	iommu@lists.linux-foundation.org, robin.murphy@arm.com,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()
Date: Thu, 16 Jul 2020 11:26:29 +0100	[thread overview]
Message-ID: <36fe9596-745b-b01c-181c-b87a544a413b@huawei.com> (raw)
In-Reply-To: <20200716102037.GB7036@willie-the-truck>

On 16/07/2020 11:20, Will Deacon wrote:
> On Tue, Jun 23, 2020 at 01:28:40AM +0800, John Garry wrote:
>> It has been shown that the cmpxchg() for finding space in the cmdq can
>> be a bottleneck:
>> - for more CPUs contending the cmdq, the cmpxchg() will fail more often
>> - since the software-maintained cons pointer is updated on the same 64b
>>    memory region, the chance of cmpxchg() failure increases again
>>
>> The cmpxchg() is removed as part of 2 related changes:
>>
>> - Update prod and cmdq owner in a single atomic add operation. For this, we
>>    count the prod and owner in separate regions in prod memory.
>>
>>    As with simple binary counting, once the prod+wrap fields overflow, they
>>    will zero. They should never overflow into "owner" region, and we zero
>>    the non-owner, prod region for each owner. This maintains the prod
>>    pointer.
>>
>>    As for the "owner", we now count this value, instead of setting a flag.
>>    Similar to before, once the owner has finished gathering, it will clear
>>    a mask. As such, a CPU declares itself as the "owner" when it reads zero
>>    for this region. This zeroing will also clear possible overflow in
>>    wrap+prod region, above.
>>
>>    The owner is now responsible for all cmdq locking to avoid possible
>>    deadlock. The owner will lock the cmdq for all non-owers it has gathered
>>    when they have space in the queue and have written their entries.
>>
>> - Check for space in the cmdq after the prod pointer has been assigned.
>>
>>    We don't bother checking for space in the cmdq before assigning the prod
>>    pointer, as this would be racy.
>>
>>    So since the prod pointer is updated unconditionally, it would be common
>>    for no space to be available in the cmdq when prod is assigned - that
>>    is, according the software-maintained prod and cons pointer. So now
>>    it must be ensured that the entries are not yet written and not until
>>    there is space.
>>
>>    How the prod pointer is maintained also leads to a strange condition
>>    where the prod pointer can wrap past the cons pointer. We can detect this
>>    condition, and report no space here. However, a prod pointer progressed
>>    twice past the cons pointer cannot be detected. But it can be ensured that
>>    this that this scenario does not occur, as we limit the amount of
>>    commands any CPU can issue at any given time, such that we cannot
>>    progress prod pointer further.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   drivers/iommu/arm-smmu-v3.c | 101 ++++++++++++++++++++++--------------
>>   1 file changed, 61 insertions(+), 40 deletions(-)
> 
> I must admit, you made me smile putting trivial@kernel.org on cc for this ;)
> 

Yes, quite ironic :)

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-07-16 10:28 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-22 17:28 [PATCH 0/4] iommu/arm-smmu-v3: Improve cmdq lock efficiency John Garry
2020-06-22 17:28 ` John Garry
2020-06-22 17:28 ` [PATCH 1/4] iommu/arm-smmu-v3: Fix trivial typo John Garry
2020-06-22 17:28   ` John Garry
2020-06-22 17:28 ` [PATCH 2/4] iommu/arm-smmu-v3: Calculate bits for prod and owner John Garry
2020-06-22 17:28   ` John Garry
2020-06-22 17:28 ` [PATCH 3/4] iommu/arm-smmu-v3: Always issue a CMD_SYNC per batch John Garry
2020-06-22 17:28   ` John Garry
2020-06-22 17:28 ` [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist() John Garry
2020-06-22 17:28   ` John Garry
2020-06-23  1:07   ` kernel test robot
2020-06-23  1:07     ` kernel test robot
2020-06-23  1:07     ` kernel test robot
2020-06-23  9:21     ` John Garry
2020-06-23  9:21       ` John Garry
2020-06-23  9:21       ` John Garry
2020-06-23  9:35       ` Rikard Falkeborn
2020-06-23 10:19         ` John Garry
2020-06-23 10:19           ` John Garry
2020-06-23 10:19           ` John Garry
2020-06-23 13:55           ` Rikard Falkeborn
2020-06-26 10:05             ` John Garry
2020-06-26 10:05               ` John Garry
2020-06-26 10:05               ` John Garry
2020-06-26 10:05               ` John Garry
2020-06-23 16:22       ` Robin Murphy
2020-06-23 16:22         ` Robin Murphy
2020-06-23 16:22         ` Robin Murphy
2020-06-23 16:22         ` Robin Murphy
2020-06-24  8:15         ` John Garry
2020-06-24  8:15           ` John Garry
2020-06-24  8:15           ` John Garry
2020-06-24  8:15           ` John Garry
2020-07-16 10:20   ` Will Deacon
2020-07-16 10:20     ` Will Deacon
2020-07-16 10:20     ` Will Deacon
2020-07-16 10:26     ` John Garry [this message]
2020-07-16 10:26       ` John Garry
2020-07-16 10:26       ` John Garry
2020-07-08 13:00 ` [PATCH 0/4] iommu/arm-smmu-v3: Improve cmdq lock efficiency John Garry
2020-07-08 13:00   ` John Garry
2020-07-08 13:00   ` John Garry
2020-07-16 10:19 ` Will Deacon
2020-07-16 10:19   ` Will Deacon
2020-07-16 10:19   ` Will Deacon
2020-07-16 10:22   ` Will Deacon
2020-07-16 10:22     ` Will Deacon
2020-07-16 10:22     ` Will Deacon
2020-07-16 10:28     ` Will Deacon
2020-07-16 10:28       ` Will Deacon
2020-07-16 10:28       ` Will Deacon
2020-07-16 10:56       ` John Garry
2020-07-16 10:56         ` John Garry
2020-07-16 10:56         ` John Garry
2020-07-16 11:22         ` Robin Murphy
2020-07-16 11:22           ` Robin Murphy
2020-07-16 11:22           ` Robin Murphy
2020-07-16 11:30           ` John Garry
2020-07-16 11:30             ` John Garry
2020-07-16 11:30             ` John Garry
2020-07-16 11:32           ` Will Deacon
2020-07-16 11:32             ` Will Deacon
2020-07-16 11:32             ` Will Deacon
2020-07-16 16:50             ` John Garry
2020-07-16 16:50               ` John Garry
2020-07-16 16:50               ` John Garry
2020-07-16 13:31       ` John Garry
2020-07-16 13:31         ` John Garry
2020-07-16 13:31         ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=36fe9596-745b-b01c-181c-b87a544a413b@huawei.com \
    --to=john.garry@huawei.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=maz@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=trivial@kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.