* [dovetail][PATCH] x86: pipeline: Fix vector stall after vector error handling
@ 2021-11-08 6:00 Florian Bezdeka
2021-11-08 6:54 ` Jan Kiszka
2021-11-08 7:56 ` Philippe Gerum
0 siblings, 2 replies; 3+ messages in thread
From: Florian Bezdeka @ 2021-11-08 6:00 UTC
To: xenomai, rpm
Whenever an IRQ was handled for a vector whose descriptor was NULL or in
one of the error states, the interrupt was not acknowledged at the APIC.
That can happen if a vector is cleaned up by one of the device drivers
while there is still one IRQ in flight.
This has two effects:
- If the affected vector is re-assigned later, it does not work: the
IRQ never makes its way to the CPU
- Interrupts with lower priority are no longer delivered to the CPU
The problem was observed on a fairly large Intel Xeon machine where some
vectors / IRQs were temporarily used, cleaned up, and re-assigned
later.
Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
arch/x86/kernel/irq_pipeline.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/irq_pipeline.c b/arch/x86/kernel/irq_pipeline.c
index 48d9959bc11a..63de68141b21 100644
--- a/arch/x86/kernel/irq_pipeline.c
+++ b/arch/x86/kernel/irq_pipeline.c
@@ -239,6 +239,8 @@ void arch_handle_irq(struct pt_regs *regs, u8 vector, bool irq_movable)
} else {
desc = __this_cpu_read(vector_irq[vector]);
if (unlikely(IS_ERR_OR_NULL(desc))) {
+ __ack_APIC_irq();
+
if (desc == VECTOR_UNUSED) {
pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n",
__func__, smp_processor_id(),
--
2.30.2
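To illustrate why the missing acknowledgement stalls delivery, here is a minimal user-space sketch. It is not the kernel's APIC code: `toy_apic`, `toy_apic_raise`, `toy_apic_deliver`, `toy_apic_eoi` and `toy_stall_demo` are hypothetical names made up for this illustration. The model assumes the usual local APIC rule that a pending vector is only dispatched when its priority class (vector >> 4) is higher than that of the highest vector still marked in service, and it ignores the task priority register.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_VECTORS 256

/* Toy local APIC: one "pending" (IRR) and one "in service" (ISR) flag per vector. */
struct toy_apic {
	bool irr[NR_VECTORS];
	bool isr[NR_VECTORS];
};

/* Return the highest vector set in bits[], or -1 if none is set. */
static int highest_set(const bool *bits)
{
	for (int v = NR_VECTORS - 1; v >= 0; v--)
		if (bits[v])
			return v;
	return -1;
}

/* A device raises an interrupt on the given vector. */
static void toy_apic_raise(struct toy_apic *apic, int vector)
{
	assert(vector >= 0 && vector < NR_VECTORS);
	apic->irr[vector] = true;
}

/*
 * Dispatch the highest pending vector to the CPU, or return -1 if nothing
 * is pending or if delivery is blocked by a vector still in service.
 */
static int toy_apic_deliver(struct toy_apic *apic)
{
	int pending = highest_set(apic->irr);
	int in_service = highest_set(apic->isr);

	if (pending < 0)
		return -1;
	/* Same or lower priority class than the in-service vector: held back. */
	if (in_service >= 0 && (pending >> 4) <= (in_service >> 4))
		return -1;

	apic->irr[pending] = false;
	apic->isr[pending] = true;
	return pending;
}

/* End of interrupt: clear the highest in-service vector. */
static void toy_apic_eoi(struct toy_apic *apic)
{
	int v = highest_set(apic->isr);

	if (v >= 0)
		apic->isr[v] = false;
}

/*
 * Replay the scenario from the commit message and return how many of the
 * follow-up interrupts reach the CPU. ack_on_error selects whether the
 * error path acknowledges the stray interrupt (the patched behaviour).
 */
static int toy_stall_demo(bool ack_on_error)
{
	struct toy_apic apic;
	int delivered = 0;

	memset(&apic, 0, sizeof(apic));

	/* Stray IRQ in flight on a vector whose descriptor is already gone. */
	toy_apic_raise(&apic, 0x50);
	if (toy_apic_deliver(&apic) == 0x50 && ack_on_error)
		toy_apic_eoi(&apic);

	/* Later: the vector is re-assigned, and a lower-priority IRQ fires. */
	toy_apic_raise(&apic, 0x50);
	toy_apic_raise(&apic, 0x30);
	while (toy_apic_deliver(&apic) >= 0) {
		delivered++;
		toy_apic_eoi(&apic);	/* the regular path always acks */
	}
	return delivered;
}
```

In this model, `toy_stall_demo(false)` leaves vector 0x50 marked in service forever, so neither the re-assigned vector nor the lower-priority one is dispatched, while `toy_stall_demo(true)` lets both through. That mirrors the two effects listed in the commit message.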
* Re: [dovetail][PATCH] x86: pipeline: Fix vector stall after vector error handling
From: Jan Kiszka @ 2021-11-08 6:54 UTC
To: Florian Bezdeka, xenomai, rpm
On 08.11.21 07:00, Florian Bezdeka wrote:
> Whenever an IRQ was handled for a vector whose descriptor was NULL or in
> one of the error states, the interrupt was not acknowledged at the APIC.
> That can happen if a vector is cleaned up by one of the device drivers
> while there is still one IRQ in flight.
>
> This has two effects:
> - If the affected vector is re-assigned later, it does not work: the
> IRQ never makes its way to the CPU
> - Interrupts with lower priority are no longer delivered to the CPU
>
> The problem was observed on a fairly large Intel Xeon machine where some
> vectors / IRQs were temporarily used, cleaned up, and re-assigned
> later.
>
> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> ---
> arch/x86/kernel/irq_pipeline.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/kernel/irq_pipeline.c b/arch/x86/kernel/irq_pipeline.c
> index 48d9959bc11a..63de68141b21 100644
> --- a/arch/x86/kernel/irq_pipeline.c
> +++ b/arch/x86/kernel/irq_pipeline.c
> @@ -239,6 +239,8 @@ void arch_handle_irq(struct pt_regs *regs, u8 vector, bool irq_movable)
> } else {
> desc = __this_cpu_read(vector_irq[vector]);
> if (unlikely(IS_ERR_OR_NULL(desc))) {
> + __ack_APIC_irq();
> +
> if (desc == VECTOR_UNUSED) {
> pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n",
> __func__, smp_processor_id(),
>
Nice catch! And very hard work to get there - well done!
Jan
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
* Re: [dovetail][PATCH] x86: pipeline: Fix vector stall after vector error handling
From: Philippe Gerum @ 2021-11-08 7:56 UTC
To: Florian Bezdeka; +Cc: xenomai, jan.kiszka, henning.schild
Florian Bezdeka <florian.bezdeka@siemens.com> writes:
> Whenever an IRQ was handled for a vector whose descriptor was NULL or in
> one of the error states, the interrupt was not acknowledged at the APIC.
> That can happen if a vector is cleaned up by one of the device drivers
> while there is still one IRQ in flight.
>
> This has two effects:
> - If the affected vector is re-assigned later, it does not work: the
> IRQ never makes its way to the CPU
> - Interrupts with lower priority are no longer delivered to the CPU
>
> The problem was observed on a fairly large Intel Xeon machine where some
> vectors / IRQs were temporarily used, cleaned up, and re-assigned
> later.
>
> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> ---
> arch/x86/kernel/irq_pipeline.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/kernel/irq_pipeline.c b/arch/x86/kernel/irq_pipeline.c
> index 48d9959bc11a..63de68141b21 100644
> --- a/arch/x86/kernel/irq_pipeline.c
> +++ b/arch/x86/kernel/irq_pipeline.c
> @@ -239,6 +239,8 @@ void arch_handle_irq(struct pt_regs *regs, u8 vector, bool irq_movable)
> } else {
> desc = __this_cpu_read(vector_irq[vector]);
> if (unlikely(IS_ERR_OR_NULL(desc))) {
> + __ack_APIC_irq();
> +
> if (desc == VECTOR_UNUSED) {
> pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n",
> __func__, smp_processor_id(),
Ouch. Thanks for digging this. Merged upstream.
--
Philippe.