From: John Garry <john.garry@huawei.com>
To: Will Deacon <will.deacon@arm.com>, Robin Murphy <robin.murphy@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>,
Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
iommu <iommu@lists.linux-foundation.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
LinuxArm <linuxarm@huawei.com>, Hanjun Guo <guohanjun@huawei.com>,
Libin <huawei.libin@huawei.com>
Subject: Re: [PATCH v3 1/2] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
Date: Wed, 15 Aug 2018 19:08:45 +0100 [thread overview]
Message-ID: <5961191f-f913-9bbf-5d0d-81800bec36a1@huawei.com> (raw)
In-Reply-To: <20180815130046.GA19402@arm.com>
On 15/08/2018 14:00, Will Deacon wrote:
> On Wed, Aug 15, 2018 at 01:26:31PM +0100, Robin Murphy wrote:
>> On 15/08/18 11:23, Zhen Lei wrote:
>>> The condition "(int)(VAL - sync_idx) >= 0" used to break the loop in
>>> __arm_smmu_sync_poll_msi() requires that sync_idx increases
>>> monotonically, following the order of the CMDs in the cmdq.
>>>
>>> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not
>>> protected by the spinlock, so the following scenario can occur:
>>> cpu0                      cpu1
>>> msidata=0
>>>                           msidata=1
>>>                           insert cmd1
>>> insert cmd0
>>>                           smmu execute cmd1
>>> smmu execute cmd0
>>>                           poll timeout, because msidata=1 is overridden
>>>                           by cmd0, that means VAL=0, sync_idx=1.
>>>
>>> This is not a functional problem; it just makes the caller wait a long
>>> time, until TIMEOUT. It rarely happens in practice, because any other
>>> CMD_SYNC issued during the waiting period will break the loop.
>>>
>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>> ---
>>> drivers/iommu/arm-smmu-v3.c | 12 ++++++++----
>>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>> index 1d64710..3f5c236 100644
>>> --- a/drivers/iommu/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>>>
>>>  	int				gerr_irq;
>>>  	int				combined_irq;
>>> -	atomic_t			sync_nr;
>>> +	u32				sync_nr;
>>>
>>>  	unsigned long			ias;	/* IPA */
>>>  	unsigned long			oas;	/* PA */
>>> @@ -775,6 +775,11 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
>>>  	return 0;
>>>  }
>>>
>>> +static inline void arm_smmu_cmdq_sync_set_msidata(u64 *cmd, u32 msidata)
>>
>> If we *are* going to go down this route then I think it would make sense to
>> move the msiaddr and CMDQ_SYNC_0_CS_MSI logic here as well; i.e.
>> arm_smmu_cmdq_build_cmd() always generates a "normal" SEV-based sync
>> command, then calling this guy would convert it to an MSI-based one. As-is,
>> having bits of mutually-dependent data handled across two separate places
>> just seems too messy and error-prone.
>
> Yeah, but I'd first like to see some numbers showing that doing all of this
> under the lock actually has an impact.
Update:

I tested this patch against a modified version which builds the command
under the queue spinlock (* below). From my testing there is a small
difference:

Setup:
- single NVMe card
- fio, 15 processes
- no process pinning

Average results:
- v3 patch:          read/r,w/write (IOPS): 301K/149K,149K/307K
- build under lock:  read/r,w/write (IOPS): 304K/150K,150K/311K

I don't know why building under the lock comes out slightly better; we can
test more. Based on these results alone, I suppose there is no
justification for building the command outside the spinlock...
Cheers,
John
* Modified version:
static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
{
	u64 cmd[CMDQ_ENT_DWORDS];
	unsigned long flags;
	struct arm_smmu_cmdq_ent ent = {
		.opcode = CMDQ_OP_CMD_SYNC,
		.sync	= {
			.msiaddr = virt_to_phys(&smmu->sync_count),
		},
	};

	spin_lock_irqsave(&smmu->cmdq.lock, flags);
	ent.sync.msidata = ++smmu->sync_nr;
	arm_smmu_cmdq_build_cmd(cmd, &ent);
	arm_smmu_cmdq_insert_cmd(smmu, cmd);
	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

	return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);
}
> Will
>
> .
>