x86/irq: Do not touch IRQ chip_data if it does not belong to x86_vector_domain

Message ID 20161003101708.34795-1-mika.westerberg@linux.intel.com
State New, archived

Commit Message

Mika Westerberg Oct. 3, 2016, 10:17 a.m. UTC
When a CPU is about to be offlined we call fixup_irqs(), which resets the
IRQ affinities related to the CPU in question. The same thing is also done
when the system is suspended to S-states like S3 (mem).

For each IRQ we try to complete any on-going move regardless of whether the
IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch
its chip_data, assume it is of type struct apic_chip_data and manipulate it
by clearing old_domain mask etc. irq_chips that are not part of the
x86_vector_domain, like those created by various GPIO drivers, will find
their chip_data being changed unexpectedly.
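
The corruption described above can be reproduced in a small userspace
sketch. Everything in it is hypothetical: the toy_* structures and
functions are simplified stand-ins, not the kernel's struct apic_chip_data
or struct gpio_chip, and the layouts are chosen so that the cleared field
happens to overlap a callback pointer. It also assumes that an all-zero
bit pattern reads back as a NULL pointer, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data the vector code expects. */
struct toy_apic_chip_data {
	uintptr_t domain;
	uintptr_t old_domain;
};

/* Hypothetical stand-in for a GPIO driver's private chip_data. */
struct toy_gpio_chip {
	const char *label;
	int (*get)(unsigned int offset);
};

/* The demo relies on both second members sharing one offset and size. */
_Static_assert(offsetof(struct toy_apic_chip_data, old_domain) ==
	       offsetof(struct toy_gpio_chip, get), "layouts must overlap");
_Static_assert(sizeof(uintptr_t) == sizeof(int (*)(unsigned int)),
	       "sizes must match");

static int toy_get(unsigned int offset)
{
	return (int)(offset & 1);
}

/* Clears old_domain without checking what chip_data really points to. */
static void toy_force_complete_move_unchecked(void *chip_data)
{
	unsigned char *bytes = chip_data;

	memset(bytes + offsetof(struct toy_apic_chip_data, old_domain), 0,
	       sizeof(uintptr_t));
}

/* Returns 1 if the GPIO chip lost its ->get callback. */
static int toy_gpio_get_lost(void)
{
	struct toy_gpio_chip chip = { .label = "gpiochip0", .get = toy_get };

	assert(chip.get != NULL);
	toy_force_complete_move_unchecked(&chip);
	return chip.get == NULL;
}
```

Calling toy_gpio_get_lost() from a test program shows the callback being
wiped even though the cleanup code never meant to touch a GPIO chip.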

Below is an example where the GPIO chip owned by pinctrl-sunrisepoint.c
gets corrupted after resume:

  # cat /sys/kernel/debug/gpio
  gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
   gpio-511 (                    |sysfs               ) in  hi

  # rtcwake -s10 -mmem
  <10 seconds pass>

  # cat /sys/kernel/debug/gpio
  gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
   gpio-511 (                    |sysfs               ) in  ?

Note the '?' in the output. It means that the ->get callback of struct
gpio_chip is NULL, whereas before suspend it was set.

Fix this by first checking that the IRQ belongs to x86_vector_domain before
we try to use the chip_data as struct apic_chip_data.
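
The idea behind the check can be sketched in userspace as follows. This is
a simplified model, not the kernel code: the toy_* types and functions are
hypothetical, and the lookup assumes a hierarchy where each level of an
IRQ's data records the domain that owns it and points to its parent level.
The chip_data is only used when a level owned by the vector domain is
found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an IRQ domain. */
struct toy_domain {
	const char *name;
};

/* One level of per-IRQ data; parent_data points toward the root. */
struct toy_irq_data {
	struct toy_domain *domain;
	void *chip_data;
	struct toy_irq_data *parent_data;
};

/* What the vector domain stores as chip_data in this model. */
struct toy_vector_data {
	int move_in_progress;
};

static struct toy_domain toy_vector_domain = { "vector" };

/* Walk the hierarchy and return the level owned by the given domain. */
static struct toy_irq_data *
toy_domain_get_irq_data(struct toy_domain *domain, struct toy_irq_data *top)
{
	struct toy_irq_data *d;

	for (d = top; d; d = d->parent_data)
		if (d->domain == domain)
			return d;
	return NULL;
}

/* Returns 1 if vector data was cleaned up, 0 if the IRQ was skipped. */
static int toy_force_complete_move(struct toy_irq_data *top)
{
	struct toy_irq_data *d;
	struct toy_vector_data *data;

	assert(top != NULL);

	/* Skip IRQs that have no level in the vector domain. */
	d = toy_domain_get_irq_data(&toy_vector_domain, top);
	if (!d)
		return 0;

	data = d->chip_data;
	if (!data)
		return 0;

	data->move_in_progress = 0;
	return 1;
}
```

An IRQ whose only level belongs to some other domain is left alone, while
an IRQ stacked on top of the vector domain still gets its pending move
cleared.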

Reported-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 arch/x86/kernel/apic/vector.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

Comments

Sakari Ailus Oct. 3, 2016, 1:37 p.m. UTC | #1
Hi Mika,

On 10/03/16 13:17, Mika Westerberg wrote:
> When a CPU is about to be offlined we call fixup_irqs() that resets IRQ
> affinities related to the CPU in question. The same thing is also done when
> the system is suspended to S-states like S3 (mem).
> 
> For each IRQ we try to complete any on-going move regardless of whether the
> IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch
> its chip_data, assume it is of type struct apic_chip_data and manipulate it
> by clearing old_domain mask etc. irq_chips that are not part of the
> x86_vector_domain, like those created by various GPIO drivers, will find
> their chip_data being changed unexpectedly.
> 
> Below is an example where GPIO chip owned by pinctrl-sunrisepoint.c gets
> corrupted after resume:
> 
>   # cat /sys/kernel/debug/gpio
>   gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
>    gpio-511 (                    |sysfs               ) in  hi
> 
>   # rtcwake -s10 -mmem
>   <10 seconds passes>
> 
>   # cat /sys/kernel/debug/gpio
>   gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
>    gpio-511 (                    |sysfs               ) in  ?
> 
> Note '?' in the output. It means the struct gpio_chip ->get function is
> NULL whereas before suspend it was there.
> 
> Fix this by first checking that the IRQ belongs to x86_vector_domain before
> we try to use the chip_data as struct apic_chip_data.
> 
> Reported-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Thanks for debugging this! I've tested it on the laptop where the SD
card is no longer detected after suspend; with this patch it works fine.

Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>

Patch

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 6066d945c40e..5d30c5e42bb1 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -661,11 +661,28 @@  void irq_complete_move(struct irq_cfg *cfg)
  */
 void irq_force_complete_move(struct irq_desc *desc)
 {
-	struct irq_data *irqdata = irq_desc_get_irq_data(desc);
-	struct apic_chip_data *data = apic_chip_data(irqdata);
-	struct irq_cfg *cfg = data ? &data->cfg : NULL;
+	struct irq_data *irqdata;
+	struct apic_chip_data *data;
+	struct irq_cfg *cfg;
 	unsigned int cpu;
 
+	/*
+	 * The function is called for all descriptors regardless of which
+	 * irqdomain they belong to. For example if an IRQ is provided by
+	 * an irq_chip as part of a GPIO driver, the chip data for that
+	 * descriptor is specific to the irq_chip in question.
+	 *
+	 * Check first that the chip_data is what we expect
+	 * (apic_chip_data) before touching it any further.
+	 */
+	irqdata = irq_domain_get_irq_data(x86_vector_domain,
+					  irq_desc_get_irq(desc));
+	if (!irqdata)
+		return;
+
+	data = apic_chip_data(irqdata);
+	cfg = data ? &data->cfg : NULL;
+
 	if (!cfg)
 		return;