linux-kernel.vger.kernel.org archive mirror
From: John Garry <john.garry@huawei.com>
To: Will Deacon <will.deacon@arm.com>, Robin Murphy <robin.murphy@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>,
	Joerg Roedel <joro@8bytes.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	iommu <iommu@lists.linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	LinuxArm <linuxarm@huawei.com>, Hanjun Guo <guohanjun@huawei.com>,
	Libin <huawei.libin@huawei.com>
Subject: Re: [PATCH v3 1/2] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
Date: Wed, 15 Aug 2018 19:08:45 +0100
Message-ID: <5961191f-f913-9bbf-5d0d-81800bec36a1@huawei.com>
In-Reply-To: <20180815130046.GA19402@arm.com>

On 15/08/2018 14:00, Will Deacon wrote:
> On Wed, Aug 15, 2018 at 01:26:31PM +0100, Robin Murphy wrote:
>> On 15/08/18 11:23, Zhen Lei wrote:
>>> The condition "(int)(VAL - sync_idx) >= 0" used to break the loop in
>>> __arm_smmu_sync_poll_msi requires that sync_idx increase monotonically,
>>> following the order of the CMDs in the cmdq.
>>>
>>> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
>>> by spinlock, so the following scenarios may appear:
>>> cpu0			cpu1
>>> msidata=0
>>> 			msidata=1
>>> 			insert cmd1
>>> insert cmd0
>>> 			smmu execute cmd1
>>> smmu execute cmd0
>>> 			poll timeout, because msidata=1 is overridden by
>>> 			cmd0, that means VAL=0, sync_idx=1.
>>>
>>> This is not a functional problem; it just makes the caller wait for a long
>>> time, until TIMEOUT. It rarely happens, because any other CMD_SYNC issued
>>> during the waiting period will break the wait.
>>>
>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>> ---
>>>  drivers/iommu/arm-smmu-v3.c | 12 ++++++++----
>>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>> index 1d64710..3f5c236 100644
>>> --- a/drivers/iommu/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>>>
>>>  	int				gerr_irq;
>>>  	int				combined_irq;
>>> -	atomic_t			sync_nr;
>>> +	u32				sync_nr;
>>>
>>>  	unsigned long			ias; /* IPA */
>>>  	unsigned long			oas; /* PA */
>>> @@ -775,6 +775,11 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
>>>  	return 0;
>>>  }
>>>
>>> +static inline void arm_smmu_cmdq_sync_set_msidata(u64 *cmd, u32 msidata)
>>
>> If we *are* going to go down this route then I think it would make sense to
>> move the msiaddr and CMDQ_SYNC_0_CS_MSI logic here as well; i.e.
>> arm_smmu_cmdq_build_cmd() always generates a "normal" SEV-based sync
>> command, then calling this guy would convert it to an MSI-based one. As-is,
>> having bits of mutually-dependent data handled across two separate places
>> just seems too messy and error-prone.
>
> Yeah, but I'd first like to see some numbers showing that doing all of this
> under the lock actually has an impact.

Update:

I tested this patch against a modified version which builds the command
under the queue spinlock (* below). In my testing the difference is small,
roughly 1% in favour of building under the lock:

Setup:
Single NVMe card
fio, 15 processes
No process pinning

Average results (IOPS):
                           read    r,w          write
v3 patch                   301K    149K,149K    307K
Build under lock version   304K    150K,150K    311K

I don't know why building under the lock comes out ahead. We can test more.

Based on these results alone, I suppose there is no justification for
building the command outside the spinlock...

Cheers,
John

* Modified version:
static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
{
     u64 cmd[CMDQ_ENT_DWORDS];
     unsigned long flags;
     struct arm_smmu_cmdq_ent ent = {
         .opcode = CMDQ_OP_CMD_SYNC,
         .sync    = {
             .msiaddr = virt_to_phys(&smmu->sync_count),
         },
     };

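     /*
      * Allocate msidata, build the command and insert it under the same
      * lock, so that msidata order always matches queue insertion order.
      */
     spin_lock_irqsave(&smmu->cmdq.lock, flags);
     ent.sync.msidata = ++smmu->sync_nr;
     arm_smmu_cmdq_build_cmd(cmd, &ent);
     arm_smmu_cmdq_insert_cmd(smmu, cmd);
     spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

     /* Poll for completion outside the lock. */
     return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);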
}
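
For anyone following along, here is a minimal user-space sketch (not driver
code) of the poll condition quoted from the commit message and of why the
cpu0/cpu1 ordering in that scenario fails it. The names sync_reached() and
final_sync_count() are hypothetical and exist only for this illustration. The
model assumes that the SMMU completes CMD_SYNCs in queue order, that each
completion overwrites the shared word with its own msidata, and that the
waiter only samples the word after both commands have completed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wrap-safe check mirroring "(int)(VAL - sync_idx) >= 0": true once the
 * observed value has reached (or passed) the index we are waiting for.
 * The unsigned subtraction wraps modulo 2^32; the cast to a signed type
 * relies on the usual two's-complement conversion.
 */
static bool sync_reached(uint32_t val, uint32_t sync_idx)
{
	return (int32_t)(val - sync_idx) >= 0;
}

/*
 * Toy model of the shared MSI target word (hypothetical helper): commands
 * complete in queue order and each one overwrites the word with its
 * msidata, so the last command in the queue decides the final value.
 */
static uint32_t final_sync_count(const uint32_t *msidata_in_queue_order, int n)
{
	uint32_t val = 0;

	for (int i = 0; i < n; i++)
		val = msidata_in_queue_order[i];

	return val;
}
```

With the queue order {0, 1} the final value is 1 and both waiters pass the
check. With the order {1, 0} from the scenario above, the final value is 0,
so the waiter holding sync_idx=1 never sees the condition become true and
falls through to the timeout. Allocating msidata under the same lock as the
insertion rules out the second ordering.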


> Will
>
> .
>



Thread overview (12+ messages):
2018-08-15 10:23 [PATCH v3 0/2] bugfix and optimization about CMD_SYNC Zhen Lei
2018-08-15 10:23 ` [PATCH v3 1/2] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout Zhen Lei
2018-08-15 12:26   ` Robin Murphy
2018-08-15 13:00     ` Will Deacon
2018-08-15 18:08       ` John Garry [this message]
2018-08-16  4:11         ` Leizhen (ThunderTown)
2018-08-16  8:21     ` Leizhen (ThunderTown)
2018-08-16  9:18       ` Will Deacon
2018-08-16  9:27         ` Robin Murphy
2018-08-19  7:02           ` Leizhen (ThunderTown)
2018-09-05  1:46             ` Leizhen (ThunderTown)
2018-08-15 10:23 ` [PATCH v3 2/2] iommu/arm-smmu-v3: avoid redundant CMD_SYNCs if possible Zhen Lei
