All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-08  9:20 ` Tomas Krcka
  0 siblings, 0 replies; 18+ messages in thread
From: Tomas Krcka @ 2023-03-08  9:20 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Tomas Krcka, Will Deacon, Robin Murphy, Joerg Roedel, Lu Baolu,
	Shameer Kolothum, iommu, linux-kernel

When an overflow occurs in the event queue, the SMMU toggles overflow
flag OVFLG in the PROD register.
The evtq thread is supposed to acknowledge the overflow flag by toggling
flag OVACKFLG in the CONS register, otherwise the overflow condition is
still active (OVFLG != OVACKFLG).

Currently the acknowledge register is toggled after clearing the event
queue but is never propagated to the hardware. It would be done next
time when executing evtq thread.

The SMMU still adds elements to the queue when the overflow condition is
active but any subsequent overflow information after clearing the event
queue will be lost.

This change keeps the SMMU in sync as it's expected by design.

Signed-off-by: Tomas Krcka <krckatom@amazon.de>
Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f2425b0f0cd6..acc1ff5ff69b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 	/* Sync our overflow flag, as we believe we're up to speed */
 	llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
 		    Q_IDX(llq, llq->cons);
+	queue_sync_cons_out(q);
 	return IRQ_HANDLED;
 }
 
-- 
2.39.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-08  9:20 ` Tomas Krcka
  0 siblings, 0 replies; 18+ messages in thread
From: Tomas Krcka @ 2023-03-08  9:20 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Tomas Krcka, Will Deacon, Robin Murphy, Joerg Roedel, Lu Baolu,
	Shameer Kolothum, iommu, linux-kernel

When an overflow occurs in the event queue, the SMMU toggles overflow
flag OVFLG in the PROD register.
The evtq thread is supposed to acknowledge the overflow flag by toggling
flag OVACKFLG in the CONS register, otherwise the overflow condition is
still active (OVFLG != OVACKFLG).

Currently the acknowledge register is toggled after clearing the event
queue but is never propagated to the hardware. It would be done next
time when executing evtq thread.

The SMMU still adds elements to the queue when the overflow condition is
active but any subsequent overflow information after clearing the event
queue will be lost.

This change keeps the SMMU in sync as it's expected by design.

Signed-off-by: Tomas Krcka <krckatom@amazon.de>
Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f2425b0f0cd6..acc1ff5ff69b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 	/* Sync our overflow flag, as we believe we're up to speed */
 	llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
 		    Q_IDX(llq, llq->cons);
+	queue_sync_cons_out(q);
 	return IRQ_HANDLED;
 }
 
-- 
2.39.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-08  9:20 ` Tomas Krcka
@ 2023-03-08 13:08   ` Robin Murphy
  -1 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2023-03-08 13:08 UTC (permalink / raw)
  To: Tomas Krcka, linux-arm-kernel
  Cc: Will Deacon, Joerg Roedel, Lu Baolu, Shameer Kolothum, iommu,
	linux-kernel

On 2023-03-08 09:20, Tomas Krcka wrote:
> When an overflow occurs in the event queue, the SMMU toggles overflow
> flag OVFLG in the PROD register.
> The evtq thread is supposed to acknowledge the overflow flag by toggling
> flag OVACKFLG in the CONS register, otherwise the overflow condition is
> still active (OVFLG != OVACKFLG).
> 
> Currently the acknowledge register is toggled after clearing the event
> queue but is never propagated to the hardware. It would be done next
> time when executing evtq thread.
> 
> The SMMU still adds elements to the queue when the overflow condition is
> active but any subsequent overflow information after clearing the event
> queue will be lost.
> 
> This change keeps the SMMU in sync as it's expected by design.

If I've understood correctly, the upshot of this is that if the queue 
has overflowed once, become empty, then somehow goes from empty to full 
before we manage to consume a single event, we won't print the "events 
lost" message a second time.

Have you seen this happen in practice? TBH if the event queue ever 
overflows even once it's indicative that the system is hosed anyway, so 
it's not clear to me that there's any great loss of value in sometimes 
failing to repeat a warning for a chronic ongoing operational failure.

It could be argued that we have a subtle inconsistency between 
arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is 
that the Event queue and PRI queue *do* have different overflow 
behaviours, so it could equally be argued that inconsistency in the code 
helps reflect that. FWIW I can't say I have a strong preference either way.

Thanks,
Robin.

> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f2425b0f0cd6..acc1ff5ff69b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>   	/* Sync our overflow flag, as we believe we're up to speed */
>   	llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>   		    Q_IDX(llq, llq->cons);
> +	queue_sync_cons_out(q);
>   	return IRQ_HANDLED;
>   }
>   

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-08 13:08   ` Robin Murphy
  0 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2023-03-08 13:08 UTC (permalink / raw)
  To: Tomas Krcka, linux-arm-kernel
  Cc: Will Deacon, Joerg Roedel, Lu Baolu, Shameer Kolothum, iommu,
	linux-kernel

On 2023-03-08 09:20, Tomas Krcka wrote:
> When an overflow occurs in the event queue, the SMMU toggles overflow
> flag OVFLG in the PROD register.
> The evtq thread is supposed to acknowledge the overflow flag by toggling
> flag OVACKFLG in the CONS register, otherwise the overflow condition is
> still active (OVFLG != OVACKFLG).
> 
> Currently the acknowledge register is toggled after clearing the event
> queue but is never propagated to the hardware. It would be done next
> time when executing evtq thread.
> 
> The SMMU still adds elements to the queue when the overflow condition is
> active but any subsequent overflow information after clearing the event
> queue will be lost.
> 
> This change keeps the SMMU in sync as it's expected by design.

If I've understood correctly, the upshot of this is that if the queue 
has overflowed once, become empty, then somehow goes from empty to full 
before we manage to consume a single event, we won't print the "events 
lost" message a second time.

Have you seen this happen in practice? TBH if the event queue ever 
overflows even once it's indicative that the system is hosed anyway, so 
it's not clear to me that there's any great loss of value in sometimes 
failing to repeat a warning for a chronic ongoing operational failure.

It could be argued that we have a subtle inconsistency between 
arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is 
that the Event queue and PRI queue *do* have different overflow 
behaviours, so it could equally be argued that inconsistency in the code 
helps reflect that. FWIW I can't say I have a strong preference either way.

Thanks,
Robin.

> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f2425b0f0cd6..acc1ff5ff69b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>   	/* Sync our overflow flag, as we believe we're up to speed */
>   	llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>   		    Q_IDX(llq, llq->cons);
> +	queue_sync_cons_out(q);
>   	return IRQ_HANDLED;
>   }
>   

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-08 13:08   ` Robin Murphy
@ 2023-03-08 14:02     ` Tomas Krcka
  -1 siblings, 0 replies; 18+ messages in thread
From: Tomas Krcka @ 2023-03-08 14:02 UTC (permalink / raw)
  To: robin.murphy
  Cc: baolu.lu, iommu, joro, krckatom, linux-arm-kernel, linux-kernel,
	shameerali.kolothum.thodi, will

>> When an overflow occurs in the event queue, the SMMU toggles overflow
>> flag OVFLG in the PROD register.
>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>> still active (OVFLG != OVACKFLG).
>>
>> Currently the acknowledge register is toggled after clearing the event
>> queue but is never propagated to the hardware. It would be done next
>> time when executing evtq thread.
>>
>> The SMMU still adds elements to the queue when the overflow condition is
>> active but any subsequent overflow information after clearing the event
>> queue will be lost.
>>
>> This change keeps the SMMU in sync as it's expected by design.
>
> If I've understood correctly, the upshot of this is that if the queue
> has overflowed once, become empty, then somehow goes from empty to full
> before we manage to consume a single event, we won't print the "events
> lost" message a second time.
>
> Have you seen this happen in practice? TBH if the event queue ever
> overflows even once it's indicative that the system is hosed anyway, so
> it's not clear to me that there's any great loss of value in sometimes
> failing to repeat a warning for a chronic ongoing operational failure.
>

Yes, I did see in practice. And it’s not just about loosing subsequence warning.
The way how it’s done now keeps inconsistent CONS register value between SMMU and the kernel
until any new event happens. The kernel doesn’t inform SMMU that we know about the overflow
and consuming events as fast as we can.

> It could be argued that we have a subtle inconsistency between
> arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is
> that the Event queue and PRI queue *do* have different overflow
> behaviours, so it could equally be argued that inconsistency in the code
> helps reflect that. FWIW I can't say I have a strong preference either way.

For the argument that the code can reflect the difference.
Then the comment 'Sync our overflow flag, as we believe we're up to speed’ is
already misleading.

Thanks.
BR,
Tomas



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-08 14:02     ` Tomas Krcka
  0 siblings, 0 replies; 18+ messages in thread
From: Tomas Krcka @ 2023-03-08 14:02 UTC (permalink / raw)
  To: robin.murphy
  Cc: baolu.lu, iommu, joro, krckatom, linux-arm-kernel, linux-kernel,
	shameerali.kolothum.thodi, will

>> When an overflow occurs in the event queue, the SMMU toggles overflow
>> flag OVFLG in the PROD register.
>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>> still active (OVFLG != OVACKFLG).
>>
>> Currently the acknowledge register is toggled after clearing the event
>> queue but is never propagated to the hardware. It would be done next
>> time when executing evtq thread.
>>
>> The SMMU still adds elements to the queue when the overflow condition is
>> active but any subsequent overflow information after clearing the event
>> queue will be lost.
>>
>> This change keeps the SMMU in sync as it's expected by design.
>
> If I've understood correctly, the upshot of this is that if the queue
> has overflowed once, become empty, then somehow goes from empty to full
> before we manage to consume a single event, we won't print the "events
> lost" message a second time.
>
> Have you seen this happen in practice? TBH if the event queue ever
> overflows even once it's indicative that the system is hosed anyway, so
> it's not clear to me that there's any great loss of value in sometimes
> failing to repeat a warning for a chronic ongoing operational failure.
>

Yes, I did see in practice. And it’s not just about loosing subsequence warning.
The way how it’s done now keeps inconsistent CONS register value between SMMU and the kernel
until any new event happens. The kernel doesn’t inform SMMU that we know about the overflow
and consuming events as fast as we can.

> It could be argued that we have a subtle inconsistency between
> arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is
> that the Event queue and PRI queue *do* have different overflow
> behaviours, so it could equally be argued that inconsistency in the code
> helps reflect that. FWIW I can't say I have a strong preference either way.

For the argument that the code can reflect the difference.
Then the comment 'Sync our overflow flag, as we believe we're up to speed’ is
already misleading.

Thanks.
BR,
Tomas



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-08 14:02     ` Tomas Krcka
@ 2023-03-08 16:46       ` Robin Murphy
  -1 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2023-03-08 16:46 UTC (permalink / raw)
  To: Tomas Krcka
  Cc: baolu.lu, iommu, joro, linux-arm-kernel, linux-kernel,
	shameerali.kolothum.thodi, will

On 2023-03-08 14:02, Tomas Krcka wrote:
>>> When an overflow occurs in the event queue, the SMMU toggles overflow
>>> flag OVFLG in the PROD register.
>>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>>> still active (OVFLG != OVACKFLG).
>>>
>>> Currently the acknowledge register is toggled after clearing the event
>>> queue but is never propagated to the hardware. It would be done next
>>> time when executing evtq thread.
>>>
>>> The SMMU still adds elements to the queue when the overflow condition is
>>> active but any subsequent overflow information after clearing the event
>>> queue will be lost.
>>>
>>> This change keeps the SMMU in sync as it's expected by design.
>>
>> If I've understood correctly, the upshot of this is that if the queue
>> has overflowed once, become empty, then somehow goes from empty to full
>> before we manage to consume a single event, we won't print the "events
>> lost" message a second time.
>>
>> Have you seen this happen in practice? TBH if the event queue ever
>> overflows even once it's indicative that the system is hosed anyway, so
>> it's not clear to me that there's any great loss of value in sometimes
>> failing to repeat a warning for a chronic ongoing operational failure.
>>
> 
> Yes, I did see in practice. And it’s not just about loosing subsequence warning.
> The way how it’s done now keeps inconsistent CONS register value between SMMU and the kernel
> until any new event happens. The kernel doesn’t inform SMMU that we know about the overflow
> and consuming events as fast as we can.

Interesting - out of curiosity, is something blocking the IRQ thread 
from running in a timely manner, or are you just using a really tiny 
event queue?

Either way though, the point is that there is nothing to "inform" the 
SMMU about here. It will see that we're consuming events and making 
space in the queue, because we're still updating CONS.RD. All that an 
update of PROD.OVFLG serves to do is indicate to software that events 
have been discarded since the last time CONS.OVACKFLG was updated. It 
makes no difference to the SMMU if it continues to discard *more* events 
until software updates CONS.OVACKFLG again. It's entirely software's own 
decision how closely it wants to keep track of overflows.

Like I say it's not clear how much Linux really cares about that, given 
that all we do with the information is log a message to indicate that 
some more events have been lost since the last time we logged the same 
message. Furthermore, the only thing we'll do with the overwhelming 
majority of events themselves is also log messages. Thus realistically 
if we're suddenly faced with processing a full event queue out of 
nowhere, then many of the events which *were* delivered to the queue 
will also be "lost" thanks to rate-limiting.

FWIW I think it's still true that for our currently supported use-cases 
in Linux, *any* discardable event is a sign that something's gone wrong; 
a full queue of 32K events would already be a sign that something's gone 
*severely* wrong, so at that point knowing whether it was exactly 32K, 
or 32K + n for some indeterminate value of n, is unlikely to be 
significantly meaningful.

>> It could be argued that we have a subtle inconsistency between
>> arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is
>> that the Event queue and PRI queue *do* have different overflow
>> behaviours, so it could equally be argued that inconsistency in the code
>> helps reflect that. FWIW I can't say I have a strong preference either way.
> 
> For the argument that the code can reflect the difference.
> Then the comment 'Sync our overflow flag, as we believe we're up to speed’ is
> already misleading.

Yes, that is what I was alluding to. Sometimes if a comment doesn't 
clearly match the code it means the code is wrong. Sometimes it just 
means the comment is wrong.

I'm not saying this patch is the wrong answer, but as presented it 
hasn't managed to convince me that it's the right one either. Largely 
since I'm not 100% sure what the exact question is - even with this 
change we'd still have the same ABA problem whenever the queue overflows 
again *before* it's completely drained.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-08 16:46       ` Robin Murphy
  0 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2023-03-08 16:46 UTC (permalink / raw)
  To: Tomas Krcka
  Cc: baolu.lu, iommu, joro, linux-arm-kernel, linux-kernel,
	shameerali.kolothum.thodi, will

On 2023-03-08 14:02, Tomas Krcka wrote:
>>> When an overflow occurs in the event queue, the SMMU toggles overflow
>>> flag OVFLG in the PROD register.
>>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>>> still active (OVFLG != OVACKFLG).
>>>
>>> Currently the acknowledge register is toggled after clearing the event
>>> queue but is never propagated to the hardware. It would be done next
>>> time when executing evtq thread.
>>>
>>> The SMMU still adds elements to the queue when the overflow condition is
>>> active but any subsequent overflow information after clearing the event
>>> queue will be lost.
>>>
>>> This change keeps the SMMU in sync as it's expected by design.
>>
>> If I've understood correctly, the upshot of this is that if the queue
>> has overflowed once, become empty, then somehow goes from empty to full
>> before we manage to consume a single event, we won't print the "events
>> lost" message a second time.
>>
>> Have you seen this happen in practice? TBH if the event queue ever
>> overflows even once it's indicative that the system is hosed anyway, so
>> it's not clear to me that there's any great loss of value in sometimes
>> failing to repeat a warning for a chronic ongoing operational failure.
>>
> 
> Yes, I did see in practice. And it’s not just about loosing subsequence warning.
> The way how it’s done now keeps inconsistent CONS register value between SMMU and the kernel
> until any new event happens. The kernel doesn’t inform SMMU that we know about the overflow
> and consuming events as fast as we can.

Interesting - out of curiosity, is something blocking the IRQ thread 
from running in a timely manner, or are you just using a really tiny 
event queue?

Either way though, the point is that there is nothing to "inform" the 
SMMU about here. It will see that we're consuming events and making 
space in the queue, because we're still updating CONS.RD. All that an 
update of PROD.OVFLG serves to do is indicate to software that events 
have been discarded since the last time CONS.OVACKFLG was updated. It 
makes no difference to the SMMU if it continues to discard *more* events 
until software updates CONS.OVACKFLG again. It's entirely software's own 
decision how closely it wants to keep track of overflows.

Like I say it's not clear how much Linux really cares about that, given 
that all we do with the information is log a message to indicate that 
some more events have been lost since the last time we logged the same 
message. Furthermore, the only thing we'll do with the overwhelming 
majority of events themselves is also log messages. Thus realistically 
if we're suddenly faced with processing a full event queue out of 
nowhere, then many of the events which *were* delivered to the queue 
will also be "lost" thanks to rate-limiting.

FWIW I think it's still true that for our currently supported use-cases 
in Linux, *any* discardable event is a sign that something's gone wrong; 
a full queue of 32K events would already be a sign that something's gone 
*severely* wrong, so at that point knowing whether it was exactly 32K, 
or 32K + n for some indeterminate value of n, is unlikely to be 
significantly meaningful.

>> It could be argued that we have a subtle inconsistency between
>> arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is
>> that the Event queue and PRI queue *do* have different overflow
>> behaviours, so it could equally be argued that inconsistency in the code
>> helps reflect that. FWIW I can't say I have a strong preference either way.
> 
> For the argument that the code can reflect the difference.
> Then the comment 'Sync our overflow flag, as we believe we're up to speed’ is
> already misleading.

Yes, that is what I was alluding to. Sometimes if a comment doesn't 
clearly match the code it means the code is wrong. Sometimes it just 
means the comment is wrong.

I'm not saying this patch is the wrong answer, but as presented it 
hasn't managed to convince me that it's the right one either. Largely 
since I'm not 100% sure what the exact question is - even with this 
change we'd still have the same ABA problem whenever the queue overflows 
again *before* it's completely drained.

Thanks,
Robin.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-08 16:46       ` Robin Murphy
@ 2023-03-09  8:56         ` Krcka, Tomas
  -1 siblings, 0 replies; 18+ messages in thread
From: Krcka, Tomas @ 2023-03-09  8:56 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Krcka, Tomas, baolu.lu, iommu, joro, linux-arm-kernel,
	linux-kernel, shameerali.kolothum.thodi, will


> 
> On 2023-03-08 14:02, Tomas Krcka wrote:
>>>> When an overflow occurs in the event queue, the SMMU toggles overflow
>>>> flag OVFLG in the PROD register.
>>>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>>>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>>>> still active (OVFLG != OVACKFLG).
>>>> 
>>>> Currently the acknowledge register is toggled after clearing the event
>>>> queue but is never propagated to the hardware. It would be done next
>>>> time when executing evtq thread.
>>>> 
>>>> The SMMU still adds elements to the queue when the overflow condition is
>>>> active but any subsequent overflow information after clearing the event
>>>> queue will be lost.
>>>> 
>>>> This change keeps the SMMU in sync as it's expected by design.
>>> 
>>> If I've understood correctly, the upshot of this is that if the queue
>>> has overflowed once, become empty, then somehow goes from empty to full
>>> before we manage to consume a single event, we won't print the "events
>>> lost" message a second time.
>>> 
>>> Have you seen this happen in practice? TBH if the event queue ever
>>> overflows even once it's indicative that the system is hosed anyway, so
>>> it's not clear to me that there's any great loss of value in sometimes
>>> failing to repeat a warning for a chronic ongoing operational failure.
>>> 
>> 
>> Yes, I did see in practice. And it’s not just about loosing subsequence warning.
>> The way how it’s done now keeps inconsistent CONS register value between SMMU and the kernel
>> until any new event happens. The kernel doesn’t inform SMMU that we know about the overflow
>> and consuming events as fast as we can.
> 
> Interesting - out of curiosity, is something blocking the IRQ thread
> from running in a timely manner, or are you just using a really tiny
> event queue?

Our case was the tiny event queue.
> 
> Either way though, the point is that there is nothing to "inform" the
> SMMU about here. It will see that we're consuming events and making
> space in the queue, because we're still updating CONS.RD. All that an
> update of PROD.OVFLG serves to do is indicate to software that events
> have been discarded since the last time CONS.OVACKFLG was updated. It
> makes no difference to the SMMU if it continues to discard *more* events
> until software updates CONS.OVACKFLG again. It's entirely software's own
> decision how closely it wants to keep track of overflows.
> 
> Like I say it's not clear how much Linux really cares about that, given
> that all we do with the information is log a message to indicate that
> some more events have been lost since the last time we logged the same
> message. Furthermore, the only thing we'll do with the overwhelming
> majority of events themselves is also log messages. Thus realistically
> if we're suddenly faced with processing a full event queue out of
> nowhere, then many of the events which *were* delivered to the queue
> will also be "lost" thanks to rate-limiting.
> 
> FWIW I think it's still true that for our currently supported use-cases
> in Linux, *any* discardable event is a sign that something's gone wrong;
> a full queue of 32K events would already be a sign that something's gone
> *severely* wrong, so at that point knowing whether it was exactly 32K,
> or 32K + n for some indeterminate value of n, is unlikely to be
> significantly meaningful.

The issue I see with the current state is that local (kernel) copy of the CONS register
will be different from the SMMU state for an indefinite period of time, until we get new 
event.
In the meantime, we cannot use the local copy as a value representing state in SMMU.
With or without this patch, it should be clearly visible. 
Right now the kernel just prints warning messages.
If any change is implemented in the future, this state should be taken into account.
Syncing the CONS register immediately after the change would prevent any misunderstanding.
That is why I have posted this patch, maybe I should have clarified it better in the
commit message.

Anyway, do you think that we should at least find a way how to make the eventq  
CONS.OVACKFLG  sync workflow clearer ?
> 
>>> It could be argued that we have a subtle inconsistency between
>>> arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is
>>> that the Event queue and PRI queue *do* have different overflow
>>> behaviours, so it could equally be argued that inconsistency in the code
>>> helps reflect that. FWIW I can't say I have a strong preference either way.
>> 
>> For the argument that the code can reflect the difference.
>> Then the comment 'Sync our overflow flag, as we believe we're up to speed’ is
>> already misleading.
> 
> Yes, that is what I was alluding to. Sometimes if a comment doesn't
> clearly match the code it means the code is wrong. Sometimes it just
> means the comment is wrong.
> 
> I'm not saying this patch is the wrong answer, but as presented it
> hasn't managed to convince me that it's the right one either. Largely
> since I'm not 100% sure what the exact question is - even with this
> change we'd still have the same ABA problem whenever the queue overflows
> again *before* it's completely drained.
> 
> Thanks,
> Robin.

Thank you.
Tomas





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-09  8:56         ` Krcka, Tomas
  0 siblings, 0 replies; 18+ messages in thread
From: Krcka, Tomas @ 2023-03-09  8:56 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Krcka, Tomas, baolu.lu, iommu, joro, linux-arm-kernel,
	linux-kernel, shameerali.kolothum.thodi, will


> 
> On 2023-03-08 14:02, Tomas Krcka wrote:
>>>> When an overflow occurs in the event queue, the SMMU toggles overflow
>>>> flag OVFLG in the PROD register.
>>>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>>>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>>>> still active (OVFLG != OVACKFLG).
>>>> 
>>>> Currently the acknowledge register is toggled after clearing the event
>>>> queue but is never propagated to the hardware. It would be done next
>>>> time when executing evtq thread.
>>>> 
>>>> The SMMU still adds elements to the queue when the overflow condition is
>>>> active but any subsequent overflow information after clearing the event
>>>> queue will be lost.
>>>> 
>>>> This change keeps the SMMU in sync as it's expected by design.
>>> 
>>> If I've understood correctly, the upshot of this is that if the queue
>>> has overflowed once, become empty, then somehow goes from empty to full
>>> before we manage to consume a single event, we won't print the "events
>>> lost" message a second time.
>>> 
>>> Have you seen this happen in practice? TBH if the event queue ever
>>> overflows even once it's indicative that the system is hosed anyway, so
>>> it's not clear to me that there's any great loss of value in sometimes
>>> failing to repeat a warning for a chronic ongoing operational failure.
>>> 
>> 
>> Yes, I did see in practice. And it’s not just about loosing subsequence warning.
>> The way how it’s done now keeps inconsistent CONS register value between SMMU and the kernel
>> until any new event happens. The kernel doesn’t inform SMMU that we know about the overflow
>> and consuming events as fast as we can.
> 
> Interesting - out of curiosity, is something blocking the IRQ thread
> from running in a timely manner, or are you just using a really tiny
> event queue?

Our case was the tiny event queue.
> 
> Either way though, the point is that there is nothing to "inform" the
> SMMU about here. It will see that we're consuming events and making
> space in the queue, because we're still updating CONS.RD. All that an
> update of PROD.OVFLG serves to do is indicate to software that events
> have been discarded since the last time CONS.OVACKFLG was updated. It
> makes no difference to the SMMU if it continues to discard *more* events
> until software updates CONS.OVACKFLG again. It's entirely software's own
> decision how closely it wants to keep track of overflows.
> 
> Like I say it's not clear how much Linux really cares about that, given
> that all we do with the information is log a message to indicate that
> some more events have been lost since the last time we logged the same
> message. Furthermore, the only thing we'll do with the overwhelming
> majority of events themselves is also log messages. Thus realistically
> if we're suddenly faced with processing a full event queue out of
> nowhere, then many of the events which *were* delivered to the queue
> will also be "lost" thanks to rate-limiting.
> 
> FWIW I think it's still true that for our currently supported use-cases
> in Linux, *any* discardable event is a sign that something's gone wrong;
> a full queue of 32K events would already be a sign that something's gone
> *severely* wrong, so at that point knowing whether it was exactly 32K,
> or 32K + n for some indeterminate value of n, is unlikely to be
> significantly meaningful.

The issue I see with the current state is that local (kernel) copy of the CONS register
will be different from the SMMU state for an indefinite period of time, until we get new 
event.
In the meantime, we cannot use the local copy as a value representing state in SMMU.
With or without this patch, it should be clearly visible. 
Right now the kernel just prints warning messages.
If any change is implemented in the future, this state should be taken into account.
Syncing the CONS register immediately after the change would prevent any misunderstanding.
That is why I have posted this patch, maybe I should have clarified it better in the
commit message.

Anyway, do you think that we should at least find a way how to make the eventq  
CONS.OVACKFLG  sync workflow clearer ?
> 
>>> It could be argued that we have a subtle inconsistency between
>>> arm_smmu_evtq_thread() and arm_smmu_priq_thread() here, but the fact is
>>> that the Event queue and PRI queue *do* have different overflow
>>> behaviours, so it could equally be argued that inconsistency in the code
>>> helps reflect that. FWIW I can't say I have a strong preference either way.
>> 
>> For the argument that the code can reflect the difference.
>> Then the comment 'Sync our overflow flag, as we believe we're up to speed’ is
>> already misleading.
> 
> Yes, that is what I was alluding to. Sometimes if a comment doesn't
> clearly match the code it means the code is wrong. Sometimes it just
> means the comment is wrong.
> 
> I'm not saying this patch is the wrong answer, but as presented it
> hasn't managed to convince me that it's the right one either. Largely
> since I'm not 100% sure what the exact question is - even with this
> change we'd still have the same ABA problem whenever the queue overflows
> again *before* it's completely drained.
> 
> Thanks,
> Robin.

Thank you.
Tomas





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-08  9:20 ` Tomas Krcka
@ 2023-03-27 12:12   ` Will Deacon
  -1 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2023-03-27 12:12 UTC (permalink / raw)
  To: Tomas Krcka
  Cc: linux-arm-kernel, Robin Murphy, Joerg Roedel, Lu Baolu,
	Shameer Kolothum, iommu, linux-kernel

On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
> When an overflow occurs in the event queue, the SMMU toggles overflow
> flag OVFLG in the PROD register.
> The evtq thread is supposed to acknowledge the overflow flag by toggling
> flag OVACKFLG in the CONS register, otherwise the overflow condition is
> still active (OVFLG != OVACKFLG).
> 
> Currently the acknowledge register is toggled after clearing the event
> queue but is never propagated to the hardware. It would be done next
> time when executing evtq thread.
> 
> The SMMU still adds elements to the queue when the overflow condition is
> active but any subsequent overflow information after clearing the event
> queue will be lost.
> 
> This change keeps the SMMU in sync as it's expected by design.
> 
> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f2425b0f0cd6..acc1ff5ff69b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>  	/* Sync our overflow flag, as we believe we're up to speed */
>  	llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>  		    Q_IDX(llq, llq->cons);
> +	queue_sync_cons_out(q);
>  	return IRQ_HANDLED;
>  }

I think I probably did mean to have something like this, but can we
only do the actual h/w update if overflow has occurred? Otherwise I think
we're pointlessly writing back the same value most of the time.

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-27 12:12   ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2023-03-27 12:12 UTC (permalink / raw)
  To: Tomas Krcka
  Cc: linux-arm-kernel, Robin Murphy, Joerg Roedel, Lu Baolu,
	Shameer Kolothum, iommu, linux-kernel

On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
> When an overflow occurs in the event queue, the SMMU toggles overflow
> flag OVFLG in the PROD register.
> The evtq thread is supposed to acknowledge the overflow flag by toggling
> flag OVACKFLG in the CONS register, otherwise the overflow condition is
> still active (OVFLG != OVACKFLG).
> 
> Currently the acknowledge register is toggled after clearing the event
> queue but is never propagated to the hardware. It would be done next
> time when executing evtq thread.
> 
> The SMMU still adds elements to the queue when the overflow condition is
> active but any subsequent overflow information after clearing the event
> queue will be lost.
> 
> This change keeps the SMMU in sync as it's expected by design.
> 
> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f2425b0f0cd6..acc1ff5ff69b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>  	/* Sync our overflow flag, as we believe we're up to speed */
>  	llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>  		    Q_IDX(llq, llq->cons);
> +	queue_sync_cons_out(q);
>  	return IRQ_HANDLED;
>  }

I think I probably did mean to have something like this, but can we
only do the actual h/w update if overflow has occurred? Otherwise I think
we're pointlessly writing back the same value most of the time.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-27 12:12   ` Will Deacon
@ 2023-03-28  7:13     ` Krcka, Tomas
  -1 siblings, 0 replies; 18+ messages in thread
From: Krcka, Tomas @ 2023-03-28  7:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: Krcka, Tomas, linux-arm-kernel, Robin Murphy, Joerg Roedel,
	Lu Baolu, Shameer Kolothum, iommu, linux-kernel


> On 27. Mar 2023, at 14:12, Will Deacon <will@kernel.org> wrote:
> 
> 
> On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
>> When an overflow occurs in the event queue, the SMMU toggles overflow
>> flag OVFLG in the PROD register.
>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>> still active (OVFLG != OVACKFLG).
>> 
>> Currently the acknowledge register is toggled after clearing the event
>> queue but is never propagated to the hardware. It would be done next
>> time when executing evtq thread.
>> 
>> The SMMU still adds elements to the queue when the overflow condition is
>> active but any subsequent overflow information after clearing the event
>> queue will be lost.
>> 
>> This change keeps the SMMU in sync as it's expected by design.
>> 
>> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
>> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
>> ---
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index f2425b0f0cd6..acc1ff5ff69b 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>>      /* Sync our overflow flag, as we believe we're up to speed */
>>      llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>>                  Q_IDX(llq, llq->cons);
>> +     queue_sync_cons_out(q);
>>      return IRQ_HANDLED;
>> }
> 
> I think I probably did mean to have something like this, but can we
> only do the actual h/w update if overflow has occurred? Otherwise I think
> we're pointlessly writing back the same value most of the time.
> 
> Will

Yes, we can, but then same applies for the priq as well, there we also write back
every time.

Tomas



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-28  7:13     ` Krcka, Tomas
  0 siblings, 0 replies; 18+ messages in thread
From: Krcka, Tomas @ 2023-03-28  7:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: Krcka, Tomas, linux-arm-kernel, Robin Murphy, Joerg Roedel,
	Lu Baolu, Shameer Kolothum, iommu, linux-kernel


> On 27. Mar 2023, at 14:12, Will Deacon <will@kernel.org> wrote:
> 
> 
> On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
>> When an overflow occurs in the event queue, the SMMU toggles overflow
>> flag OVFLG in the PROD register.
>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>> still active (OVFLG != OVACKFLG).
>> 
>> Currently the acknowledge register is toggled after clearing the event
>> queue but is never propagated to the hardware. It would be done next
>> time when executing evtq thread.
>> 
>> The SMMU still adds elements to the queue when the overflow condition is
>> active but any subsequent overflow information after clearing the event
>> queue will be lost.
>> 
>> This change keeps the SMMU in sync as it's expected by design.
>> 
>> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
>> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
>> ---
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index f2425b0f0cd6..acc1ff5ff69b 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>>      /* Sync our overflow flag, as we believe we're up to speed */
>>      llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>>                  Q_IDX(llq, llq->cons);
>> +     queue_sync_cons_out(q);
>>      return IRQ_HANDLED;
>> }
> 
> I think I probably did mean to have something like this, but can we
> only do the actual h/w update if overflow has occurred? Otherwise I think
> we're pointlessly writing back the same value most of the time.
> 
> Will

Yes, we can, but then same applies for the priq as well, there we also write back
every time.

Tomas



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-28  7:13     ` Krcka, Tomas
@ 2023-03-28 11:56       ` Will Deacon
  -1 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2023-03-28 11:56 UTC (permalink / raw)
  To: Krcka, Tomas
  Cc: linux-arm-kernel, Robin Murphy, Joerg Roedel, Lu Baolu,
	Shameer Kolothum, iommu, linux-kernel

On Tue, Mar 28, 2023 at 07:13:52AM +0000, Krcka, Tomas wrote:
> 
> > On 27. Mar 2023, at 14:12, Will Deacon <will@kernel.org> wrote:
> > 
> > 
> > On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
> >> When an overflow occurs in the event queue, the SMMU toggles overflow
> >> flag OVFLG in the PROD register.
> >> The evtq thread is supposed to acknowledge the overflow flag by toggling
> >> flag OVACKFLG in the CONS register, otherwise the overflow condition is
> >> still active (OVFLG != OVACKFLG).
> >> 
> >> Currently the acknowledge register is toggled after clearing the event
> >> queue but is never propagated to the hardware. It would be done next
> >> time when executing evtq thread.
> >> 
> >> The SMMU still adds elements to the queue when the overflow condition is
> >> active but any subsequent overflow information after clearing the event
> >> queue will be lost.
> >> 
> >> This change keeps the SMMU in sync as it's expected by design.
> >> 
> >> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
> >> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
> >> ---
> >> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
> >> 1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> index f2425b0f0cd6..acc1ff5ff69b 100644
> >> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >>      /* Sync our overflow flag, as we believe we're up to speed */
> >>      llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
> >>                  Q_IDX(llq, llq->cons);
> >> +     queue_sync_cons_out(q);
> >>      return IRQ_HANDLED;
> >> }
> > 
> > I think I probably did mean to have something like this, but can we
> > only do the actual h/w update if overflow has occurred? Otherwise I think
> > we're pointlessly writing back the same value most of the time.
> > 
> > Will
> 
> Yes, we can, but then same applies for the priq as well, there we also write back
> every time.

Sure, feel free to update both.

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-28 11:56       ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2023-03-28 11:56 UTC (permalink / raw)
  To: Krcka, Tomas
  Cc: linux-arm-kernel, Robin Murphy, Joerg Roedel, Lu Baolu,
	Shameer Kolothum, iommu, linux-kernel

On Tue, Mar 28, 2023 at 07:13:52AM +0000, Krcka, Tomas wrote:
> 
> > On 27. Mar 2023, at 14:12, Will Deacon <will@kernel.org> wrote:
> > 
> > 
> > On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
> >> When an overflow occurs in the event queue, the SMMU toggles overflow
> >> flag OVFLG in the PROD register.
> >> The evtq thread is supposed to acknowledge the overflow flag by toggling
> >> flag OVACKFLG in the CONS register, otherwise the overflow condition is
> >> still active (OVFLG != OVACKFLG).
> >> 
> >> Currently the acknowledge register is toggled after clearing the event
> >> queue but is never propagated to the hardware. It would be done next
> >> time when executing evtq thread.
> >> 
> >> The SMMU still adds elements to the queue when the overflow condition is
> >> active but any subsequent overflow information after clearing the event
> >> queue will be lost.
> >> 
> >> This change keeps the SMMU in sync as it's expected by design.
> >> 
> >> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
> >> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
> >> ---
> >> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
> >> 1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> index f2425b0f0cd6..acc1ff5ff69b 100644
> >> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >>      /* Sync our overflow flag, as we believe we're up to speed */
> >>      llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
> >>                  Q_IDX(llq, llq->cons);
> >> +     queue_sync_cons_out(q);
> >>      return IRQ_HANDLED;
> >> }
> > 
> > I think I probably did mean to have something like this, but can we
> > only do the actual h/w update if overflow has occurred? Otherwise I think
> > we're pointlessly writing back the same value most of the time.
> > 
> > Will
> 
> Yes, we can, but then same applies for the priq as well, there we also write back
> every time.

Sure, feel free to update both.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
  2023-03-28 11:56       ` Will Deacon
@ 2023-03-29 12:36         ` Krcka, Tomas
  -1 siblings, 0 replies; 18+ messages in thread
From: Krcka, Tomas @ 2023-03-29 12:36 UTC (permalink / raw)
  To: Will Deacon
  Cc: Krcka, Tomas, linux-arm-kernel, Robin Murphy, Joerg Roedel,
	Lu Baolu, Shameer Kolothum, iommu, linux-kernel



> On 28. Mar 2023, at 13:56, Will Deacon <will@kernel.org> wrote:
> 
> 
> On Tue, Mar 28, 2023 at 07:13:52AM +0000, Krcka, Tomas wrote:
>> 
>>> On 27. Mar 2023, at 14:12, Will Deacon <will@kernel.org> wrote:
>>> 
>>> 
>>> On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
>>>> When an overflow occurs in the event queue, the SMMU toggles overflow
>>>> flag OVFLG in the PROD register.
>>>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>>>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>>>> still active (OVFLG != OVACKFLG).
>>>> 
>>>> Currently the acknowledge register is toggled after clearing the event
>>>> queue but is never propagated to the hardware. It would be done next
>>>> time when executing evtq thread.
>>>> 
>>>> The SMMU still adds elements to the queue when the overflow condition is
>>>> active but any subsequent overflow information after clearing the event
>>>> queue will be lost.
>>>> 
>>>> This change keeps the SMMU in sync as it's expected by design.
>>>> 
>>>> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
>>>> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
>>>> ---
>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>>>> 1 file changed, 1 insertion(+)
>>>> 
>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> index f2425b0f0cd6..acc1ff5ff69b 100644
>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>>>>     /* Sync our overflow flag, as we believe we're up to speed */
>>>>     llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>>>>                 Q_IDX(llq, llq->cons);
>>>> +     queue_sync_cons_out(q);
>>>>     return IRQ_HANDLED;
>>>> }
>>> 
>>> I think I probably did mean to have something like this, but can we
>>> only do the actual h/w update if overflow has occurred? Otherwise I think
>>> we're pointlessly writing back the same value most of the time.
>>> 
>>> Will
>> 
>> Yes, we can, but then same applies for the priq as well, there we also write back
>> every time.
> 
> Sure, feel free to update both.
> 

OK, I sent it as new patch: https://lore.kernel.org/lkml/20230329123420.34641-1-tomas.krcka@gmail.com





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment
@ 2023-03-29 12:36         ` Krcka, Tomas
  0 siblings, 0 replies; 18+ messages in thread
From: Krcka, Tomas @ 2023-03-29 12:36 UTC (permalink / raw)
  To: Will Deacon
  Cc: Krcka, Tomas, linux-arm-kernel, Robin Murphy, Joerg Roedel,
	Lu Baolu, Shameer Kolothum, iommu, linux-kernel



> On 28. Mar 2023, at 13:56, Will Deacon <will@kernel.org> wrote:
> 
> 
> On Tue, Mar 28, 2023 at 07:13:52AM +0000, Krcka, Tomas wrote:
>> 
>>> On 27. Mar 2023, at 14:12, Will Deacon <will@kernel.org> wrote:
>>> 
>>> 
>>> On Wed, Mar 08, 2023 at 09:20:47AM +0000, Tomas Krcka wrote:
>>>> When an overflow occurs in the event queue, the SMMU toggles overflow
>>>> flag OVFLG in the PROD register.
>>>> The evtq thread is supposed to acknowledge the overflow flag by toggling
>>>> flag OVACKFLG in the CONS register, otherwise the overflow condition is
>>>> still active (OVFLG != OVACKFLG).
>>>> 
>>>> Currently the acknowledge register is toggled after clearing the event
>>>> queue but is never propagated to the hardware. It would be done next
>>>> time when executing evtq thread.
>>>> 
>>>> The SMMU still adds elements to the queue when the overflow condition is
>>>> active but any subsequent overflow information after clearing the event
>>>> queue will be lost.
>>>> 
>>>> This change keeps the SMMU in sync as it's expected by design.
>>>> 
>>>> Signed-off-by: Tomas Krcka <krckatom@amazon.de>
>>>> Suggested-by: KarimAllah Ahmed <karahmed@amazon.de>
>>>> ---
>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
>>>> 1 file changed, 1 insertion(+)
>>>> 
>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> index f2425b0f0cd6..acc1ff5ff69b 100644
>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> @@ -1579,6 +1579,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>>>>     /* Sync our overflow flag, as we believe we're up to speed */
>>>>     llq->cons = Q_OVF(llq->prod) | Q_WRP(llq, llq->cons) |
>>>>                 Q_IDX(llq, llq->cons);
>>>> +     queue_sync_cons_out(q);
>>>>     return IRQ_HANDLED;
>>>> }
>>> 
>>> I think I probably did mean to have something like this, but can we
>>> only do the actual h/w update if overflow has occurred? Otherwise I think
>>> we're pointlessly writing back the same value most of the time.
>>> 
>>> Will
>> 
>> Yes, we can, but then same applies for the priq as well, there we also write back
>> every time.
> 
> Sure, feel free to update both.
> 

OK, I sent it as new patch: https://lore.kernel.org/lkml/20230329123420.34641-1-tomas.krcka@gmail.com





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-03-29 12:37 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-08  9:20 [PATCH] iommu/arm-smmu-v3: Fix event queue overflow acknowledgment Tomas Krcka
2023-03-08  9:20 ` Tomas Krcka
2023-03-08 13:08 ` Robin Murphy
2023-03-08 13:08   ` Robin Murphy
2023-03-08 14:02   ` Tomas Krcka
2023-03-08 14:02     ` Tomas Krcka
2023-03-08 16:46     ` Robin Murphy
2023-03-08 16:46       ` Robin Murphy
2023-03-09  8:56       ` Krcka, Tomas
2023-03-09  8:56         ` Krcka, Tomas
2023-03-27 12:12 ` Will Deacon
2023-03-27 12:12   ` Will Deacon
2023-03-28  7:13   ` Krcka, Tomas
2023-03-28  7:13     ` Krcka, Tomas
2023-03-28 11:56     ` Will Deacon
2023-03-28 11:56       ` Will Deacon
2023-03-29 12:36       ` Krcka, Tomas
2023-03-29 12:36         ` Krcka, Tomas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.