* [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
From: Zhen Lei @ 2018-08-06 12:31 UTC
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: Zhen Lei, LinuxArm, Hanjun Guo, Libin
The condition "(int)(VAL - sync_idx) >= 0" to break loop in function
__arm_smmu_sync_poll_msi requires that sync_idx must be increased
monotonously according to the sequence of the CMDs in the cmdq.
But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
by spinlock, so the following scenarios may appear:
cpu0                    cpu1
msidata=0
                        msidata=1
                        insert cmd1
insert cmd0
                        smmu execute cmd1
smmu execute cmd0
                        poll timeout, because msidata=1 is overridden by
                        cmd0, that means VAL=0, sync_idx=1.
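For illustration, the break condition is a signed, wraparound-safe
comparison; a minimal sketch (the helper wrapper below is made up for
clarity, only the comparison itself comes from the driver):

	/*
	 * val is the 32-bit value last written to smmu->sync_count by the
	 * SMMU's MSI write; sync_idx is the msidata carried by the CMD_SYNC
	 * being waited on. The signed cast keeps the comparison correct
	 * across u32 wraparound, but only if msidata values reach the queue
	 * in increasing order, which is exactly the invariant the scenario
	 * above violates.
	 */
	static bool sync_idx_complete(u32 val, u32 sync_idx)
	{
		return (int)(val - sync_idx) >= 0;
	}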
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/arm-smmu-v3.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1d64710..4810f61 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -566,7 +566,7 @@ struct arm_smmu_device {
int gerr_irq;
int combined_irq;
- atomic_t sync_nr;
+ u32 sync_nr;
unsigned long ias; /* IPA */
unsigned long oas; /* PA */
@@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
- cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
break;
default:
@@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
struct arm_smmu_cmdq_ent ent = {
.opcode = CMDQ_OP_CMD_SYNC,
.sync = {
- .msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
.msiaddr = virt_to_phys(&smmu->sync_count),
},
};
@@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
arm_smmu_cmdq_build_cmd(cmd, &ent);
spin_lock_irqsave(&smmu->cmdq.lock, flags);
+ ent.sync.msidata = ++smmu->sync_nr;
+ cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);
arm_smmu_cmdq_insert_cmd(smmu, cmd);
spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
@@ -2179,7 +2179,6 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
{
int ret;
- atomic_set(&smmu->sync_nr, 0);
ret = arm_smmu_init_queues(smmu);
if (ret)
return ret;
--
1.8.3
* Re: [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
From: Will Deacon @ 2018-08-08 10:12 UTC
To: Zhen Lei
Cc: Robin Murphy, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel, LinuxArm, Hanjun Guo, Libin
Hi Thunder,
On Mon, Aug 06, 2018 at 08:31:29PM +0800, Zhen Lei wrote:
> The condition "(int)(VAL - sync_idx) >= 0" that breaks the polling loop
> in __arm_smmu_sync_poll_msi requires that sync_idx increase monotonically,
> following the order of the CMDs in the cmdq.
>
> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
> by the cmdq spinlock, so the following scenario can occur:
> cpu0                    cpu1
> msidata=0
>                         msidata=1
>                         insert cmd1
> insert cmd0
>                         smmu execute cmd1
> smmu execute cmd0
>                         poll timeout, because msidata=1 is overridden by
>                         cmd0, that means VAL=0, sync_idx=1.
Oh yuck, you're right! We probably want a CC stable on this. Did you see
this go wrong in practice?
One comment on your patch...
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
> drivers/iommu/arm-smmu-v3.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 1d64710..4810f61 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>
> int gerr_irq;
> int combined_irq;
> - atomic_t sync_nr;
> + u32 sync_nr;
>
> unsigned long ias; /* IPA */
> unsigned long oas; /* PA */
> @@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
> - cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
> cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
> break;
> default:
> @@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
> struct arm_smmu_cmdq_ent ent = {
> .opcode = CMDQ_OP_CMD_SYNC,
> .sync = {
> - .msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
> .msiaddr = virt_to_phys(&smmu->sync_count),
> },
> };
> @@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
> arm_smmu_cmdq_build_cmd(cmd, &ent);
>
> spin_lock_irqsave(&smmu->cmdq.lock, flags);
> + ent.sync.msidata = ++smmu->sync_nr;
> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);
I really don't like splitting this out from building the rest of the
command. Can you just move the call to arm_smmu_cmdq_build_cmd into the
critical section, please?
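For illustration, the suggested rework would look roughly like this (a
sketch of the shape only, not a tested patch):

	spin_lock_irqsave(&smmu->cmdq.lock, flags);
	/*
	 * Allocate msidata and build the command under the same lock that
	 * serialises insertion, so msidata values enter the queue in
	 * increasing order. This assumes arm_smmu_cmdq_build_cmd() keeps
	 * setting the MSIDATA field from ent->sync.msidata.
	 */
	ent.sync.msidata = ++smmu->sync_nr;
	arm_smmu_cmdq_build_cmd(cmd, &ent);
	arm_smmu_cmdq_insert_cmd(smmu, cmd);
	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);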
Thanks,
Will
* Re: [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
From: Leizhen (ThunderTown) @ 2018-08-09 1:30 UTC
To: Will Deacon
Cc: Robin Murphy, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel, LinuxArm, Hanjun Guo, Libin
On 2018/8/8 18:12, Will Deacon wrote:
> Hi Thunder,
>
> On Mon, Aug 06, 2018 at 08:31:29PM +0800, Zhen Lei wrote:
>> The condition "(int)(VAL - sync_idx) >= 0" that breaks the polling loop
>> in __arm_smmu_sync_poll_msi requires that sync_idx increase monotonically,
>> following the order of the CMDs in the cmdq.
>>
>> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
>> by the cmdq spinlock, so the following scenario can occur:
>> cpu0                    cpu1
>> msidata=0
>>                         msidata=1
>>                         insert cmd1
>> insert cmd0
>>                         smmu execute cmd1
>> smmu execute cmd0
>>                         poll timeout, because msidata=1 is overridden by
>>                         cmd0, that means VAL=0, sync_idx=1.
>
> Oh yuck, you're right! We probably want a CC stable on this. Did you see
> this go wrong in practice?
It just misreports and makes the caller wait a long time, until the timeout
fires. It rarely happens, because any other CMD_SYNC issued during the
waiting period will break the wait.
>
> One comment on your patch...
>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>> drivers/iommu/arm-smmu-v3.c | 7 +++----
>> 1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 1d64710..4810f61 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>>
>> int gerr_irq;
>> int combined_irq;
>> - atomic_t sync_nr;
>> + u32 sync_nr;
>>
>> unsigned long ias; /* IPA */
>> unsigned long oas; /* PA */
>> @@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
>> - cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
>> cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
>> break;
>> default:
>> @@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
>> struct arm_smmu_cmdq_ent ent = {
>> .opcode = CMDQ_OP_CMD_SYNC,
>> .sync = {
>> - .msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
>> .msiaddr = virt_to_phys(&smmu->sync_count),
>> },
>> };
>> @@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
>> arm_smmu_cmdq_build_cmd(cmd, &ent);
>>
>> spin_lock_irqsave(&smmu->cmdq.lock, flags);
>> + ent.sync.msidata = ++smmu->sync_nr;
>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);
>
> I really don't like splitting this out from building the rest of the
> command. Can you just move the call to arm_smmu_cmdq_build_cmd into the
> critical section, please?
OK. I considered that before, but worried it would increase contention on
the spinlock.
In addition, I will append an optimization patch: for adjacent CMD_SYNCs, we
only need one.
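For illustration, the rough shape such an optimization might take (the
prev_cmd_opcode bookkeeping is hypothetical here, sketching the idea of
not queueing a CMD_SYNC directly behind another one):

	spin_lock_irqsave(&smmu->cmdq.lock, flags);
	/*
	 * If the last command inserted was itself a CMD_SYNC, waiting on
	 * its msidata is sufficient; skip queueing a second, back-to-back
	 * CMD_SYNC and just poll for the previous one's index.
	 */
	if (smmu->prev_cmd_opcode != CMDQ_OP_CMD_SYNC) {
		ent.sync.msidata = ++smmu->sync_nr;
		arm_smmu_cmdq_build_cmd(cmd, &ent);
		arm_smmu_cmdq_insert_cmd(smmu, cmd);
	}
	ent.sync.msidata = smmu->sync_nr;	/* poll target either way */
	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);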
>
> Thanks,
>
> Will
>
> .
>
--
Thanks!
BestRegards
* Re: [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
From: Will Deacon @ 2018-08-09 8:49 UTC
To: Leizhen (ThunderTown)
Cc: Robin Murphy, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel, LinuxArm, Hanjun Guo, Libin
On Thu, Aug 09, 2018 at 09:30:51AM +0800, Leizhen (ThunderTown) wrote:
> On 2018/8/8 18:12, Will Deacon wrote:
> > On Mon, Aug 06, 2018 at 08:31:29PM +0800, Zhen Lei wrote:
> >> The condition "(int)(VAL - sync_idx) >= 0" that breaks the polling loop
> >> in __arm_smmu_sync_poll_msi requires that sync_idx increase monotonically,
> >> following the order of the CMDs in the cmdq.
> >>
> >> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
> >> by the cmdq spinlock, so the following scenario can occur:
> >> cpu0                    cpu1
> >> msidata=0
> >>                         msidata=1
> >>                         insert cmd1
> >> insert cmd0
> >>                         smmu execute cmd1
> >> smmu execute cmd0
> >>                         poll timeout, because msidata=1 is overridden by
> >>                         cmd0, that means VAL=0, sync_idx=1.
> >
> > Oh yuck, you're right! We probably want a CC stable on this. Did you see
> > this go wrong in practice?
> It just misreports and makes the caller wait a long time, until the timeout
> fires. It rarely happens, because any other CMD_SYNC issued during the
> waiting period will break the wait.
Thanks. Please mention that in the commit message, because I think it's
useful to know.
> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >> ---
> >> drivers/iommu/arm-smmu-v3.c | 7 +++----
> >> 1 file changed, 3 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >> index 1d64710..4810f61 100644
> >> --- a/drivers/iommu/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm-smmu-v3.c
> >> @@ -566,7 +566,7 @@ struct arm_smmu_device {
> >>
> >> int gerr_irq;
> >> int combined_irq;
> >> - atomic_t sync_nr;
> >> + u32 sync_nr;
> >>
> >> unsigned long ias; /* IPA */
> >> unsigned long oas; /* PA */
> >> @@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> >> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
> >> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
> >> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
> >> - cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
> >> cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
> >> break;
> >> default:
> >> @@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
> >> struct arm_smmu_cmdq_ent ent = {
> >> .opcode = CMDQ_OP_CMD_SYNC,
> >> .sync = {
> >> - .msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
> >> .msiaddr = virt_to_phys(&smmu->sync_count),
> >> },
> >> };
> >> @@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
> >> arm_smmu_cmdq_build_cmd(cmd, &ent);
> >>
> >> spin_lock_irqsave(&smmu->cmdq.lock, flags);
> >> + ent.sync.msidata = ++smmu->sync_nr;
> >> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);
> >
> > I really don't like splitting this out from building the rest of the
> > command. Can you just move the call to arm_smmu_cmdq_build_cmd into the
> > critical section, please?
> OK. I considered that before, but worried it would increase contention on
> the spinlock.
If you can provide numbers showing that it's a problem, then we could add
a helper function e.g. arm_smmu_cmdq_sync_set_msidata(arm_smmu_cmdq_ent *cmd)
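For illustration, such a helper might just re-patch the MSIDATA field of
an already-built command inside the lock; a hypothetical sketch (it takes
the built command words rather than the ent, a slight variation on the
signature above):

	static void arm_smmu_cmdq_sync_set_msidata(u64 *cmd, u32 msidata)
	{
		/* CMDQ_SYNC_0_MSIDATA is the field mask in command word 0. */
		cmd[0] &= ~CMDQ_SYNC_0_MSIDATA;
		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, msidata);
	}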
> In addition, I will append an optimization patch: for adjacent CMD_SYNCs,
> we only need one.
Ok, but please keep them separate, since I don't want to mix up fixes and
optimisations.
Thanks,
Will
* Re: [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
From: Leizhen (ThunderTown) @ 2018-08-09 10:05 UTC
To: Will Deacon
Cc: Robin Murphy, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel, LinuxArm, Hanjun Guo, Libin
On 2018/8/9 16:49, Will Deacon wrote:
> On Thu, Aug 09, 2018 at 09:30:51AM +0800, Leizhen (ThunderTown) wrote:
>> On 2018/8/8 18:12, Will Deacon wrote:
>>> On Mon, Aug 06, 2018 at 08:31:29PM +0800, Zhen Lei wrote:
>>>> The condition "(int)(VAL - sync_idx) >= 0" that breaks the polling loop
>>>> in __arm_smmu_sync_poll_msi requires that sync_idx increase monotonically,
>>>> following the order of the CMDs in the cmdq.
>>>>
>>>> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
>>>> by the cmdq spinlock, so the following scenario can occur:
>>>> cpu0                    cpu1
>>>> msidata=0
>>>>                         msidata=1
>>>>                         insert cmd1
>>>> insert cmd0
>>>>                         smmu execute cmd1
>>>> smmu execute cmd0
>>>>                         poll timeout, because msidata=1 is overridden by
>>>>                         cmd0, that means VAL=0, sync_idx=1.
>>>
>>> Oh yuck, you're right! We probably want a CC stable on this. Did you see
>>> this go wrong in practice?
>> It just misreports and makes the caller wait a long time, until the timeout
>> fires. It rarely happens, because any other CMD_SYNC issued during the
>> waiting period will break the wait.
>
> Thanks. Please mention that in the commit message, because I think it's
> useful to know.
OK.
>
>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>> ---
>>>> drivers/iommu/arm-smmu-v3.c | 7 +++----
>>>> 1 file changed, 3 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>>> index 1d64710..4810f61 100644
>>>> --- a/drivers/iommu/arm-smmu-v3.c
>>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>>> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>>>>
>>>> int gerr_irq;
>>>> int combined_irq;
>>>> - atomic_t sync_nr;
>>>> + u32 sync_nr;
>>>>
>>>> unsigned long ias; /* IPA */
>>>> unsigned long oas; /* PA */
>>>> @@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>>>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
>>>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
>>>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
>>>> - cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
>>>> cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
>>>> break;
>>>> default:
>>>> @@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
>>>> struct arm_smmu_cmdq_ent ent = {
>>>> .opcode = CMDQ_OP_CMD_SYNC,
>>>> .sync = {
>>>> - .msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
>>>> .msiaddr = virt_to_phys(&smmu->sync_count),
>>>> },
>>>> };
>>>> @@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
>>>> arm_smmu_cmdq_build_cmd(cmd, &ent);
>>>>
>>>> spin_lock_irqsave(&smmu->cmdq.lock, flags);
>>>> + ent.sync.msidata = ++smmu->sync_nr;
>>>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);
>>>
>>> I really don't like splitting this out from building the rest of the
>>> command. Can you just move the call to arm_smmu_cmdq_build_cmd into the
>>> critical section, please?
>> OK. I considered that before, but worried it would increase contention on
>> the spinlock.
>
> If you can provide numbers showing that it's a problem, then we could add
> a helper function e.g. arm_smmu_cmdq_sync_set_msidata(arm_smmu_cmdq_ent *cmd)
The performance data from my current test environment is not stable at the
moment; I'm trying to find another one. So I want to leave this problem for
the time being. If it turns out to be a problem, I will send a new patch.
>
>> In addition, I will append an optimization patch: for adjacent CMD_SYNCs,
>> we only need one.
>
> Ok, but please keep them separate, since I don't want to mix up fixes and
> optimisations.
OK
>
> Thanks,
>
> Will
>
> .
>
--
Thanks!
BestRegards