linux-tegra.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing
@ 2020-10-05 11:14 Marc Zyngier
  2020-10-05 11:14 ` [PATCH 1/3] gpio: tegra186: Allow optional irq parent callbacks Marc Zyngier
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 11:14 UTC (permalink / raw)
  To: linux-tegra, linux-kernel, linux-arm-kernel
  Cc: Thierry Reding, Jonathan Hunter, Dmitry Osipenko,
	Sowjanya Komatineni, Venkat Reddy Talla, Thomas Gleixner,
	kernel-team

Jon recently reported that one of the Tegra systems (Jetson TX2, aka
tegra186) stopped booting with the introduction of the "IPI as IRQs"
series. After a few weeks of head scratching and complete puzzlement,
I obtained a board and started looking at what was happening.

The interrupt hierarchy looks like this:

	[DEVICE] -A-> [PMC] -B-> [GIC]

which seems simple enough. However, not all the devices attached to
the PMC follow this hierarchy, and in some cases, the 'B' link isn't
present in the HW. In other cases, neither 'A' nor 'B' are present.
And yet the PMC driver creates such linkages using random hwirq values
for the non-existent links, potentially overriding existing mappings
in the process. "What could possibly go wrong?"

It turns out that for the 'B' link, the PMC driver uses hwirq 0, which
is SGI0 for the GIC, and used as the rescheduling IPI. Obviously, this
doesn't go very well, nor very far, as the IPI gets routed to random
drivers. Also, as the handling flow has been overridden, this
interrupt never gets deactivated and can't fire anymore. Yes, this is
bad.

The 'A' link is less problematic, but the hwirq value is still out of
the irqdomain range, and gets remapped every time a new 'A'-less
driver comes up.

Instead, let's trim the unused hierarchy levels as needed. This
requires some checks in the upper levels of the hierarchy as we now
have optional levels, but this looks a lot saner than what we
currently have. With this, tegra186 is back booting on -next.

I haven't tested any wake-up stuff, nor any other nvidia system (this
is the only one I have). If people agree to these changes, I can take
them via the irqchip tree so that they make it into the next merge
window.

	M.

Marc Zyngier (3):
  gpio: tegra186: Allow optional irq parent callbacks
  soc/tegra: pmc: Allow optional irq parent callbacks
  soc/tegra: pmc: Don't create fake interrupt hierarchy levels

 drivers/gpio/gpio-tegra186.c |  15 +++-
 drivers/soc/tegra/pmc.c      | 142 ++++++++++++++++++++---------------
 2 files changed, 95 insertions(+), 62 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] gpio: tegra186: Allow optional irq parent callbacks
  2020-10-05 11:14 [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Marc Zyngier
@ 2020-10-05 11:14 ` Marc Zyngier
  2020-10-05 11:14 ` [PATCH 2/3] soc/tegra: pmc: " Marc Zyngier
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 11:14 UTC (permalink / raw)
  To: linux-tegra, linux-kernel, linux-arm-kernel
  Cc: Thierry Reding, Jonathan Hunter, Dmitry Osipenko,
	Sowjanya Komatineni, Venkat Reddy Talla, Thomas Gleixner,
	kernel-team

Make the tegra186 GPIO driver resistent to variable depth
interrupt hierarchy, which we are about to introduce.

No functionnal change yet.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/gpio/gpio-tegra186.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-tegra186.c b/drivers/gpio/gpio-tegra186.c
index 178e9128ded0..9500074b1f1b 100644
--- a/drivers/gpio/gpio-tegra186.c
+++ b/drivers/gpio/gpio-tegra186.c
@@ -430,7 +430,18 @@ static int tegra186_irq_set_type(struct irq_data *data, unsigned int type)
 	else
 		irq_set_handler_locked(data, handle_edge_irq);
 
-	return irq_chip_set_type_parent(data, type);
+	if (data->parent_data)
+		return irq_chip_set_type_parent(data, type);
+
+	return 0;
+}
+
+static int tegra186_irq_set_wake(struct irq_data *data, unsigned int on)
+{
+	if (data->parent_data)
+		return irq_chip_set_wake_parent(data, on);
+
+	return 0;
 }
 
 static void tegra186_gpio_irq(struct irq_desc *desc)
@@ -678,7 +689,7 @@ static int tegra186_gpio_probe(struct platform_device *pdev)
 	gpio->intc.irq_mask = tegra186_irq_mask;
 	gpio->intc.irq_unmask = tegra186_irq_unmask;
 	gpio->intc.irq_set_type = tegra186_irq_set_type;
-	gpio->intc.irq_set_wake = irq_chip_set_wake_parent;
+	gpio->intc.irq_set_wake = tegra186_irq_set_wake;
 
 	irq = &gpio->gpio.irq;
 	irq->chip = &gpio->intc;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] soc/tegra: pmc: Allow optional irq parent callbacks
  2020-10-05 11:14 [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Marc Zyngier
  2020-10-05 11:14 ` [PATCH 1/3] gpio: tegra186: Allow optional irq parent callbacks Marc Zyngier
@ 2020-10-05 11:14 ` Marc Zyngier
  2020-10-05 11:27   ` Thierry Reding
  2020-10-05 11:14 ` [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels Marc Zyngier
  2020-10-05 11:22 ` [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Thierry Reding
  3 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 11:14 UTC (permalink / raw)
  To: linux-tegra, linux-kernel, linux-arm-kernel
  Cc: Thierry Reding, Jonathan Hunter, Dmitry Osipenko,
	Sowjanya Komatineni, Venkat Reddy Talla, Thomas Gleixner,
	kernel-team

Make the PMC driver resistent to variable depth interrupt hierarchy,
which we are about to introduce. The irq_chip structure is now
allocated statically, providing the indirection for the couple of
callbacks that are SoC-specific.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/soc/tegra/pmc.c | 65 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 54 insertions(+), 11 deletions(-)

diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
index d332e5d9abac..9960f7c18431 100644
--- a/drivers/soc/tegra/pmc.c
+++ b/drivers/soc/tegra/pmc.c
@@ -439,7 +439,6 @@ struct tegra_pmc {
 	struct pinctrl_dev *pctl_dev;
 
 	struct irq_domain *domain;
-	struct irq_chip irq;
 
 	struct notifier_block clk_nb;
 };
@@ -1928,6 +1927,58 @@ static void tegra_pmc_reset_sysfs_init(struct tegra_pmc *pmc)
 	}
 }
 
+static void tegra_irq_mask_parent(struct irq_data *data)
+{
+	if (data->parent_data)
+		irq_chip_mask_parent(data);
+}
+
+static void tegra_irq_unmask_parent(struct irq_data *data)
+{
+	if (data->parent_data)
+		irq_chip_unmask_parent(data);
+}
+
+static void tegra_irq_eoi_parent(struct irq_data *data)
+{
+	if (data->parent_data)
+		irq_chip_eoi_parent(data);
+}
+
+static int tegra_irq_set_affinity_parent(struct irq_data *data,
+					 const struct cpumask *dest,
+					 bool force)
+{
+	if (data->parent_data)
+		return irq_chip_set_affinity_parent(data, dest, force);
+
+	return -EINVAL;
+}
+
+static int tegra_irq_set_type(struct irq_data *data, unsigned int type)
+{
+	struct tegra_pmc *pmc = irq_data_get_irq_chip_data(data);
+
+	return pmc->soc->irq_set_type(data, type);
+}
+
+static int tegra_irq_set_wake(struct irq_data *data, unsigned int on)
+{
+	struct tegra_pmc *pmc = irq_data_get_irq_chip_data(data);
+
+	return pmc->soc->irq_set_wake(data, on);
+}
+
+static struct irq_chip pmc_irqchip = {
+	.name			= "tegra-pmc",
+	.irq_mask		= tegra_irq_mask_parent,
+	.irq_unmask		= tegra_irq_unmask_parent,
+	.irq_eoi		= tegra_irq_eoi_parent,
+	.irq_set_affinity	= tegra_irq_set_affinity_parent,
+	.irq_set_type		= tegra_irq_set_type,
+	.irq_set_wake		= tegra_irq_set_wake,
+};
+
 static int tegra_pmc_irq_translate(struct irq_domain *domain,
 				   struct irq_fwspec *fwspec,
 				   unsigned long *hwirq,
@@ -1965,7 +2016,7 @@ static int tegra_pmc_irq_alloc(struct irq_domain *domain, unsigned int virq,
 
 			err = irq_domain_set_hwirq_and_chip(domain, virq,
 							    event->id,
-							    &pmc->irq, pmc);
+							    &pmc_irqchip, pmc);
 			if (err < 0)
 				break;
 
@@ -2015,7 +2066,7 @@ static int tegra_pmc_irq_alloc(struct irq_domain *domain, unsigned int virq,
 	 */
 	if (i == soc->num_wake_events) {
 		err = irq_domain_set_hwirq_and_chip(domain, virq, ULONG_MAX,
-						    &pmc->irq, pmc);
+						    &pmc_irqchip, pmc);
 
 		/*
 		 * Interrupts without a wake event don't have a corresponding
@@ -2198,14 +2249,6 @@ static int tegra_pmc_irq_init(struct tegra_pmc *pmc)
 	if (!parent)
 		return 0;
 
-	pmc->irq.name = dev_name(pmc->dev);
-	pmc->irq.irq_mask = irq_chip_mask_parent;
-	pmc->irq.irq_unmask = irq_chip_unmask_parent;
-	pmc->irq.irq_eoi = irq_chip_eoi_parent;
-	pmc->irq.irq_set_affinity = irq_chip_set_affinity_parent;
-	pmc->irq.irq_set_type = pmc->soc->irq_set_type;
-	pmc->irq.irq_set_wake = pmc->soc->irq_set_wake;
-
 	pmc->domain = irq_domain_add_hierarchy(parent, 0, 96, pmc->dev->of_node,
 					       &tegra_pmc_irq_domain_ops, pmc);
 	if (!pmc->domain) {
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels
  2020-10-05 11:14 [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Marc Zyngier
  2020-10-05 11:14 ` [PATCH 1/3] gpio: tegra186: Allow optional irq parent callbacks Marc Zyngier
  2020-10-05 11:14 ` [PATCH 2/3] soc/tegra: pmc: " Marc Zyngier
@ 2020-10-05 11:14 ` Marc Zyngier
  2020-10-05 11:33   ` Thierry Reding
  2020-10-05 11:22 ` [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Thierry Reding
  3 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 11:14 UTC (permalink / raw)
  To: linux-tegra, linux-kernel, linux-arm-kernel
  Cc: Thierry Reding, Jonathan Hunter, Dmitry Osipenko,
	Sowjanya Komatineni, Venkat Reddy Talla, Thomas Gleixner,
	kernel-team

The Tegra PMC driver does ungodly things with the interrupt hierarchy,
repeatedly corrupting it by pulling hwirq numbers out of thin air,
overriding existing IRQ mappings and changing the handling flow
of unsuspecting users.

All of this is done in the name of preserving the interrupt hierarchy
even when these levels do not exist in the HW. Together with the use
of proper IRQs for IPIs, this leads to an unbootable system as the
rescheduling IPI gets repeatedly repurposed for random drivers...

Instead, let's allow the hierarchy to be trimmed to the level that
actually makes sense for the HW, and not any deeper. This avoids
having unnecessary callbacks, overriding mappings, and otherwise
keeps the hierarchy sane.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/soc/tegra/pmc.c | 79 +++++++++++++++--------------------------
 1 file changed, 29 insertions(+), 50 deletions(-)

diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
index 9960f7c18431..4eea3134fb3e 100644
--- a/drivers/soc/tegra/pmc.c
+++ b/drivers/soc/tegra/pmc.c
@@ -1993,6 +1993,30 @@ static int tegra_pmc_irq_translate(struct irq_domain *domain,
 	return 0;
 }
 
+/* Trim the irq hierarchy from a particular irq domain */
+static void trim_hierarchy(unsigned int virq, struct irq_domain *domain)
+{
+	struct irq_data *tail, *irq_data = irq_get_irq_data(virq);
+
+	/* The PMC doesn't generate any interrupt by itself */
+	if (WARN_ON(!irq_data->parent_data))
+		return;
+
+	/* Skip until we find the right domain */
+	while (irq_data->parent_data && irq_data->parent_data->domain != domain)
+		irq_data = irq_data->parent_data;
+
+	/* Sever the inner part of the hierarchy...  */
+	tail = irq_data->parent_data;
+	irq_data->parent_data = NULL;
+
+	/* ... and free it */
+	for (irq_data = tail; irq_data; irq_data = tail) {
+		tail = irq_data->parent_data;
+		kfree(irq_data);
+	};
+}
+
 static int tegra_pmc_irq_alloc(struct irq_domain *domain, unsigned int virq,
 			       unsigned int num_irqs, void *data)
 {
@@ -2039,46 +2063,15 @@ static int tegra_pmc_irq_alloc(struct irq_domain *domain, unsigned int virq,
 
 			err = irq_domain_set_hwirq_and_chip(domain, virq,
 							    event->id,
-							    &pmc->irq, pmc);
-
-			/*
-			 * GPIOs don't have an equivalent interrupt in the
-			 * parent controller (GIC). However some code, such
-			 * as the one in irq_get_irqchip_state(), require a
-			 * valid IRQ chip to be set. Make sure that's the
-			 * case by passing NULL here, which will install a
-			 * dummy IRQ chip for the interrupt in the parent
-			 * domain.
-			 */
-			if (domain->parent)
-				irq_domain_set_hwirq_and_chip(domain->parent,
-							      virq, 0, NULL,
-							      NULL);
-
+							    &pmc_irqchip, pmc);
+			if (!err)
+				trim_hierarchy(virq, domain->parent);
 			break;
 		}
 	}
 
-	/*
-	 * For interrupts that don't have associated wake events, assign a
-	 * dummy hardware IRQ number. This is used in the ->irq_set_type()
-	 * and ->irq_set_wake() callbacks to return early for these IRQs.
-	 */
-	if (i == soc->num_wake_events) {
-		err = irq_domain_set_hwirq_and_chip(domain, virq, ULONG_MAX,
-						    &pmc_irqchip, pmc);
-
-		/*
-		 * Interrupts without a wake event don't have a corresponding
-		 * interrupt in the parent controller (GIC). Pass NULL for the
-		 * chip here, which causes a dummy IRQ chip to be installed
-		 * for the interrupt in the parent domain, to make this
-		 * explicit.
-		 */
-		if (domain->parent)
-			irq_domain_set_hwirq_and_chip(domain->parent, virq, 0,
-						      NULL, NULL);
-	}
+	if (i == soc->num_wake_events)
+		trim_hierarchy(virq, domain);
 
 	return err;
 }
@@ -2094,9 +2087,6 @@ static int tegra210_pmc_irq_set_wake(struct irq_data *data, unsigned int on)
 	unsigned int offset, bit;
 	u32 value;
 
-	if (data->hwirq == ULONG_MAX)
-		return 0;
-
 	offset = data->hwirq / 32;
 	bit = data->hwirq % 32;
 
@@ -2131,9 +2121,6 @@ static int tegra210_pmc_irq_set_type(struct irq_data *data, unsigned int type)
 	unsigned int offset, bit;
 	u32 value;
 
-	if (data->hwirq == ULONG_MAX)
-		return 0;
-
 	offset = data->hwirq / 32;
 	bit = data->hwirq % 32;
 
@@ -2174,10 +2161,6 @@ static int tegra186_pmc_irq_set_wake(struct irq_data *data, unsigned int on)
 	unsigned int offset, bit;
 	u32 value;
 
-	/* nothing to do if there's no associated wake event */
-	if (WARN_ON(data->hwirq == ULONG_MAX))
-		return 0;
-
 	offset = data->hwirq / 32;
 	bit = data->hwirq % 32;
 
@@ -2205,10 +2188,6 @@ static int tegra186_pmc_irq_set_type(struct irq_data *data, unsigned int type)
 	struct tegra_pmc *pmc = irq_data_get_irq_chip_data(data);
 	u32 value;
 
-	/* nothing to do if there's no associated wake event */
-	if (data->hwirq == ULONG_MAX)
-		return 0;
-
 	value = readl(pmc->wake + WAKE_AOWAKE_CNTRL(data->hwirq));
 
 	switch (type) {
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing
  2020-10-05 11:14 [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Marc Zyngier
                   ` (2 preceding siblings ...)
  2020-10-05 11:14 ` [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels Marc Zyngier
@ 2020-10-05 11:22 ` Thierry Reding
  2020-10-05 13:06   ` Marc Zyngier
  3 siblings, 1 reply; 12+ messages in thread
From: Thierry Reding @ 2020-10-05 11:22 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

[-- Attachment #1: Type: text/plain, Size: 2448 bytes --]

On Mon, Oct 05, 2020 at 12:14:40PM +0100, Marc Zyngier wrote:
> Jon recently reported that one of the Tegra systems (Jetson TX2, aka
> tegra186) stopped booting with the introduction of the "IPI as IRQs"
> series. After a few weeks of head scratching and complete puzzlement,
> I obtained a board and started looking at what was happening.
> 
> The interrupt hierarchy looks like this:
> 
> 	[DEVICE] -A-> [PMC] -B-> [GIC]
> 
> which seems simple enough. However, not all the devices attached to
> the PMC follow this hierarchy, and in some cases, the 'B' link isn't
> present in the HW. In other cases, neither 'A' nor 'B' are present.
> And yet the PMC driver creates such linkages using random hwirq values
> for the non-existent links, potentially overriding existing mappings
> in the process. "What could possibly go wrong?"

Yes, that would've been my fault. It seemed like the right thing to do
at the time, but the way you describe it makes it obvious that it was
not. I can't say I understand why this would've worked prior to the
rework that made this surface, though.

> It turns out that for the 'B' link, the PMC driver uses hwirq 0, which
> is SGI0 for the GIC, and used as the rescheduling IPI. Obviously, this
> doesn't go very well, nor very far, as the IPI gets routed to random
> drivers. Also, as the handling flow has been overridden, this
> interrupt never gets deactivated and can't fire anymore. Yes, this is
> bad.
> 
> The 'A' link is less problematic, but the hwirq value is still out of
> the irqdomain range, and gets remapped every time a new 'A'-less
> driver comes up.
> 
> Instead, let's trim the unused hierarchy levels as needed. This
> requires some checks in the upper levels of the hierarchy as we now
> have optional levels, but this looks a lot saner than what we
> currently have. With this, tegra186 is back booting on -next.
> 
> I haven't tested any wake-up stuff, nor any other nvidia system (this
> is the only one I have). If people agree to these changes, I can take
> them via the irqchip tree so that they make it into the next merge
> window.

Yeah, it sounds like this needs to go in ideally before the rework that
caused this to surface in order to preserve bisectibility. But if it
goes in afterwards that's probably fine as well.

Let Jon and myself do a bit of testing with this to verify that the wake
up paths are still working.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] soc/tegra: pmc: Allow optional irq parent callbacks
  2020-10-05 11:14 ` [PATCH 2/3] soc/tegra: pmc: " Marc Zyngier
@ 2020-10-05 11:27   ` Thierry Reding
  2020-10-05 12:59     ` Marc Zyngier
  0 siblings, 1 reply; 12+ messages in thread
From: Thierry Reding @ 2020-10-05 11:27 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

[-- Attachment #1: Type: text/plain, Size: 941 bytes --]

On Mon, Oct 05, 2020 at 12:14:42PM +0100, Marc Zyngier wrote:
> Make the PMC driver resistent to variable depth interrupt hierarchy,
> which we are about to introduce. The irq_chip structure is now
> allocated statically, providing the indirection for the couple of
> callbacks that are SoC-specific.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  drivers/soc/tegra/pmc.c | 65 ++++++++++++++++++++++++++++++++++-------
>  1 file changed, 54 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
> index d332e5d9abac..9960f7c18431 100644
> --- a/drivers/soc/tegra/pmc.c
> +++ b/drivers/soc/tegra/pmc.c
> @@ -439,7 +439,6 @@ struct tegra_pmc {
>  	struct pinctrl_dev *pctl_dev;
>  
>  	struct irq_domain *domain;
> -	struct irq_chip irq;

Did you have any particular reason for pulling this out of the struct
tegra_pmc and making it a global variable?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels
  2020-10-05 11:14 ` [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels Marc Zyngier
@ 2020-10-05 11:33   ` Thierry Reding
  2020-10-05 13:10     ` Marc Zyngier
  0 siblings, 1 reply; 12+ messages in thread
From: Thierry Reding @ 2020-10-05 11:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

[-- Attachment #1: Type: text/plain, Size: 2617 bytes --]

On Mon, Oct 05, 2020 at 12:14:43PM +0100, Marc Zyngier wrote:
> The Tegra PMC driver does ungodly things with the interrupt hierarchy,
> repeatedly corrupting it by pulling hwirq numbers out of thin air,
> overriding existing IRQ mappings and changing the handling flow
> of unsuspecting users.
> 
> All of this is done in the name of preserving the interrupt hierarchy
> even when these levels do not exist in the HW. Together with the use
> of proper IRQs for IPIs, this leads to an unbootable system as the
> rescheduling IPI gets repeatedly repurposed for random drivers...
> 
> Instead, let's allow the hierarchy to be trimmed to the level that
> actually makes sense for the HW, and not any deeper. This avoids
> having unnecessary callbacks, overriding mappings, and otherwise
> keeps the hierarchy sane.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  drivers/soc/tegra/pmc.c | 79 +++++++++++++++--------------------------
>  1 file changed, 29 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
> index 9960f7c18431..4eea3134fb3e 100644
> --- a/drivers/soc/tegra/pmc.c
> +++ b/drivers/soc/tegra/pmc.c
> @@ -1993,6 +1993,30 @@ static int tegra_pmc_irq_translate(struct irq_domain *domain,
>  	return 0;
>  }
>  
> +/* Trim the irq hierarchy from a particular irq domain */
> +static void trim_hierarchy(unsigned int virq, struct irq_domain *domain)
> +{
> +	struct irq_data *tail, *irq_data = irq_get_irq_data(virq);
> +
> +	/* The PMC doesn't generate any interrupt by itself */
> +	if (WARN_ON(!irq_data->parent_data))
> +		return;
> +
> +	/* Skip until we find the right domain */
> +	while (irq_data->parent_data && irq_data->parent_data->domain != domain)
> +		irq_data = irq_data->parent_data;
> +
> +	/* Sever the inner part of the hierarchy...  */
> +	tail = irq_data->parent_data;
> +	irq_data->parent_data = NULL;
> +
> +	/* ... and free it */
> +	for (irq_data = tail; irq_data; irq_data = tail) {
> +		tail = irq_data->parent_data;
> +		kfree(irq_data);
> +	};
> +}

That kind of looks like what I originally wanted to do and (naively)
thought that passing the (0, NULL, NULL) triplet would achieve.

Given that this is fairly low-level stuff that deals with the inner
workings of the IRQ infrastructure, should we eventually pull this out
of the driver and make it into a core helper? I don't seriously expect
this to be widely useful, but putting it into the core might help keep
it more maintainable.

I volunteer to do that work if you think it's a good idea.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] soc/tegra: pmc: Allow optional irq parent callbacks
  2020-10-05 11:27   ` Thierry Reding
@ 2020-10-05 12:59     ` Marc Zyngier
  0 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 12:59 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

On 2020-10-05 12:27, Thierry Reding wrote:
> On Mon, Oct 05, 2020 at 12:14:42PM +0100, Marc Zyngier wrote:
>> Make the PMC driver resistent to variable depth interrupt hierarchy,
>> which we are about to introduce. The irq_chip structure is now
>> allocated statically, providing the indirection for the couple of
>> callbacks that are SoC-specific.
>> 
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>  drivers/soc/tegra/pmc.c | 65 
>> ++++++++++++++++++++++++++++++++++-------
>>  1 file changed, 54 insertions(+), 11 deletions(-)
>> 
>> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
>> index d332e5d9abac..9960f7c18431 100644
>> --- a/drivers/soc/tegra/pmc.c
>> +++ b/drivers/soc/tegra/pmc.c
>> @@ -439,7 +439,6 @@ struct tegra_pmc {
>>  	struct pinctrl_dev *pctl_dev;
>> 
>>  	struct irq_domain *domain;
>> -	struct irq_chip irq;
> 
> Did you have any particular reason for pulling this out of the struct
> tegra_pmc and making it a global variable?

The main reason is that it really isn't per-PMC. You can do everything
with a single one that is used even if you have multiple PMCs of 
different
types (not sure that's possible with the current HW, but still).

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing
  2020-10-05 11:22 ` [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Thierry Reding
@ 2020-10-05 13:06   ` Marc Zyngier
  2020-10-05 15:45     ` Thierry Reding
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 13:06 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

On 2020-10-05 12:22, Thierry Reding wrote:
> On Mon, Oct 05, 2020 at 12:14:40PM +0100, Marc Zyngier wrote:
>> Jon recently reported that one of the Tegra systems (Jetson TX2, aka
>> tegra186) stopped booting with the introduction of the "IPI as IRQs"
>> series. After a few weeks of head scratching and complete puzzlement,
>> I obtained a board and started looking at what was happening.
>> 
>> The interrupt hierarchy looks like this:
>> 
>> 	[DEVICE] -A-> [PMC] -B-> [GIC]
>> 
>> which seems simple enough. However, not all the devices attached to
>> the PMC follow this hierarchy, and in some cases, the 'B' link isn't
>> present in the HW. In other cases, neither 'A' nor 'B' are present.
>> And yet the PMC driver creates such linkages using random hwirq values
>> for the non-existent links, potentially overriding existing mappings
>> in the process. "What could possibly go wrong?"
> 
> Yes, that would've been my fault. It seemed like the right thing to do
> at the time, but the way you describe it makes it obvious that it was
> not. I can't say I understand why this would've worked prior to the
> rework that made this surface, though.

Because until these IPI patches, the range 0-7 never ever appeared
as actual hwirqs in the GIC domain. SGIs were handled in the GIC
code, behind the core kernel's back. As soon as we start using
an actual domain mapping for hwirq 0, the PMC driver starts messing
with it.

>> It turns out that for the 'B' link, the PMC driver uses hwirq 0, which
>> is SGI0 for the GIC, and used as the rescheduling IPI. Obviously, this
>> doesn't go very well, nor very far, as the IPI gets routed to random
>> drivers. Also, as the handling flow has been overridden, this
>> interrupt never gets deactivated and can't fire anymore. Yes, this is
>> bad.
>> 
>> The 'A' link is less problematic, but the hwirq value is still out of
>> the irqdomain range, and gets remapped every time a new 'A'-less
>> driver comes up.
>> 
>> Instead, let's trim the unused hierarchy levels as needed. This
>> requires some checks in the upper levels of the hierarchy as we now
>> have optional levels, but this looks a lot saner than what we
>> currently have. With this, tegra186 is back booting on -next.
>> 
>> I haven't tested any wake-up stuff, nor any other nvidia system (this
>> is the only one I have). If people agree to these changes, I can take
>> them via the irqchip tree so that they make it into the next merge
>> window.
> 
> Yeah, it sounds like this needs to go in ideally before the rework that
> caused this to surface in order to preserve bisectibility. But if it
> goes in afterwards that's probably fine as well.

It's easy to take it as part of the same pull request as the IPI
stuff. Not fully bisectable for this platform, but close enough.
I may even be able to merge this in *before* the IPI patches
(I'd need to rebuild irq/irqchip-next, but that won't alter the commit
ids of the individual patches as they are on separate branches).

> Let Jon and myself do a bit of testing with this to verify that the 
> wake
> up paths are still working.

Sure. Let me know what you find.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels
  2020-10-05 11:33   ` Thierry Reding
@ 2020-10-05 13:10     ` Marc Zyngier
  0 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2020-10-05 13:10 UTC (permalink / raw)
  To: Thierry Reding
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

On 2020-10-05 12:33, Thierry Reding wrote:
> On Mon, Oct 05, 2020 at 12:14:43PM +0100, Marc Zyngier wrote:
>> The Tegra PMC driver does ungodly things with the interrupt hierarchy,
>> repeatedly corrupting it by pulling hwirq numbers out of thin air,
>> overriding existing IRQ mappings and changing the handling flow
>> of unsuspecting users.
>> 
>> All of this is done in the name of preserving the interrupt hierarchy
>> even when these levels do not exist in the HW. Together with the use
>> of proper IRQs for IPIs, this leads to an unbootable system as the
>> rescheduling IPI gets repeatedly repurposed for random drivers...
>> 
>> Instead, let's allow the hierarchy to be trimmed to the level that
>> actually makes sense for the HW, and not any deeper. This avoids
>> having unnecessary callbacks, overriding mappings, and otherwise
>> keeps the hierarchy sane.
>> 
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>  drivers/soc/tegra/pmc.c | 79 
>> +++++++++++++++--------------------------
>>  1 file changed, 29 insertions(+), 50 deletions(-)
>> 
>> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
>> index 9960f7c18431..4eea3134fb3e 100644
>> --- a/drivers/soc/tegra/pmc.c
>> +++ b/drivers/soc/tegra/pmc.c
>> @@ -1993,6 +1993,30 @@ static int tegra_pmc_irq_translate(struct 
>> irq_domain *domain,
>>  	return 0;
>>  }
>> 
>> +/* Trim the irq hierarchy from a particular irq domain */
>> +static void trim_hierarchy(unsigned int virq, struct irq_domain 
>> *domain)
>> +{
>> +	struct irq_data *tail, *irq_data = irq_get_irq_data(virq);
>> +
>> +	/* The PMC doesn't generate any interrupt by itself */
>> +	if (WARN_ON(!irq_data->parent_data))
>> +		return;
>> +
>> +	/* Skip until we find the right domain */
>> +	while (irq_data->parent_data && irq_data->parent_data->domain != 
>> domain)
>> +		irq_data = irq_data->parent_data;
>> +
>> +	/* Sever the inner part of the hierarchy...  */
>> +	tail = irq_data->parent_data;
>> +	irq_data->parent_data = NULL;
>> +
>> +	/* ... and free it */
>> +	for (irq_data = tail; irq_data; irq_data = tail) {
>> +		tail = irq_data->parent_data;
>> +		kfree(irq_data);
>> +	};
>> +}
> 
> That kind of looks like what I originally wanted to do and (naively)
> thought that passing the (0, NULL, NULL) triplet would achieve.
> 
> Given that this is fairly low-level stuff that deals with the inner
> workings of the IRQ infrastructure, should we eventually pull this out
> of the driver and make it into a core helper? I don't seriously expect
> this to be widely useful, but putting it into the core might help keep
> it more maintainable.

That's the ultimate plan, but I wanted to give it some soaking time
on Tegra before exposing it to the outside world 
(irq_domain_free_irq_data()
could be rewritten in terms of this primitive, for example).

> I volunteer to do that work if you think it's a good idea.

Sure, once we know we're good to go with this.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing
  2020-10-05 13:06   ` Marc Zyngier
@ 2020-10-05 15:45     ` Thierry Reding
  2020-10-05 18:23       ` Jon Hunter
  0 siblings, 1 reply; 12+ messages in thread
From: Thierry Reding @ 2020-10-05 15:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Jonathan Hunter,
	Dmitry Osipenko, Sowjanya Komatineni, Venkat Reddy Talla,
	Thomas Gleixner, kernel-team

[-- Attachment #1: Type: text/plain, Size: 6398 bytes --]

On Mon, Oct 05, 2020 at 02:06:37PM +0100, Marc Zyngier wrote:
> On 2020-10-05 12:22, Thierry Reding wrote:
> > On Mon, Oct 05, 2020 at 12:14:40PM +0100, Marc Zyngier wrote:
> > > Jon recently reported that one of the Tegra systems (Jetson TX2, aka
> > > tegra186) stopped booting with the introduction of the "IPI as IRQs"
> > > series. After a few weeks of head scratching and complete puzzlement,
> > > I obtained a board and started looking at what was happening.
> > > 
> > > The interrupt hierarchy looks like this:
> > > 
> > > 	[DEVICE] -A-> [PMC] -B-> [GIC]
> > > 
> > > which seems simple enough. However, not all the devices attached to
> > > the PMC follow this hierarchy, and in some cases, the 'B' link isn't
> > > present in the HW. In other cases, neither 'A' nor 'B' are present.
> > > And yet the PMC driver creates such linkages using random hwirq values
> > > for the non-existent links, potentially overriding existing mappings
> > > in the process. "What could possibly go wrong?"
> > 
> > Yes, that would've been my fault. It seemed like the right thing to do
> > at the time, but the way you describe it makes it obvious that it was
> > not. I can't say I understand why this would've worked prior to the
> > rework that made this surface, though.
> 
> Because until these IPI patches, the range 0-7 never ever appeared
> as actual hwirqs in the GIC domain. SGIs were handled in the GIC
> code, behind the core kernel's back. As soon as we start using
> an actual domain mapping for hwirq 0, the PMC driver starts messing
> with it.

Okay, that makes sense.

> > > It turns out that for the 'B' link, the PMC driver uses hwirq 0, which
> > > is SGI0 for the GIC, and used as the rescheduling IPI. Obviously, this
> > > doesn't go very well, nor very far, as the IPI gets routed to random
> > > drivers. Also, as the handling flow has been overridden, this
> > > interrupt never gets deactivated and can't fire anymore. Yes, this is
> > > bad.
> > > 
> > > The 'A' link is less problematic, but the hwirq value is still out of
> > > the irqdomain range, and gets remapped every time a new 'A'-less
> > > driver comes up.
> > > 
> > > Instead, let's trim the unused hierarchy levels as needed. This
> > > requires some checks in the upper levels of the hierarchy as we now
> > > have optional levels, but this looks a lot saner than what we
> > > currently have. With this, tegra186 is back booting on -next.
> > > 
> > > I haven't tested any wake-up stuff, nor any other nvidia system (this
> > > is the only one I have). If people agree to these changes, I can take
> > > them via the irqchip tree so that they make it into the next merge
> > > window.
> > 
> > Yeah, it sounds like this needs to go in ideally before the rework that
> > caused this to surface in order to preserve bisectibility. But if it
> > goes in afterwards that's probably fine as well.
> 
> It's easy to take it as part of the same pull request as the IPI
> stuff. Not fully bisectable for this platform, but close enough.
> I may even be able to merge this in *before* the IPI patches
> (I'd need to rebuild irq/irqchip-next, but that won't alter the commit
> ids of the individual patches as they are on separate branches).

I noticed there's a small conflict with another patch that I've queued
up and that I was thinking of sending in for v5.10. This adds a callback
for .irq_retrigger. I don't exactly recall why I needed this, but I've
been carrying that patch locally for a while now. I don't think it's
critical, but there must've been an issue that it fixed. Unfortunately I
seem to have been in a hurry because the commit message is a bit terse.

I can probably just leave that out for now to not further complicated
things, in which case it should be fine to take your patches here into
v5.10 along with the IPI changes. As I already mentioned I'm a bit
unhappy about the needless move of the irq_chip to a global variable,
but I can send out an updated version with those changed back, which is
basically what I have on my local test branch because I had to manually
apply that patch anyway due to the conflict I mentioned above, so
keeping the embedded irq_chip was actually simpler and reduces the diff
a little bit.

> > Let Jon and myself do a bit of testing with this to verify that the wake
> > up paths are still working.
> 
> Sure. Let me know what you find.

The results are in and it's a bit of a mixed bag. I was able to confirm
that Tegra194 also boots again after this series and I'm also able to
resume from sleep using either rtcwake or the power-key as wakeup
source, so the wake-events mechanism is still functional after the
series. I do see a bit of breakage on resume, but none of that seems
related to your patches and is likely something that crept in while we
were looking into the current issue.

Jon had started a job in our test farm in parallel and that came back
with a failing suspend/resume test on Tegra186 (Jetson TX2), but that
seems to have been a pre-existing issue. This was already in linux-next
around next-20200910 and Jon had been investigating it when the boot
failures due to the IPI changes started happening. So I then hooked up
my Jetson TX2 and verified locally that I can properly suspend/resume
using either rtcwake or the power-key as wakeup source, just like I
previously did on Tegra194 (Jetson AGX Xavier). Tegra186 seems to be a
little more unstable because it didn't boot every time for me, but that
is probably not related to this.

I also did a bit of testing on Tegra210 and that seems to also still
work. We cannot test suspend/resume and therefore wake-events there
out-of-the-box because of some long-standing issue with USB refusing
to suspend, but I was able to at least trigger wake-up using the
RTC by first unloading the XUSB driver. It doesn't resume to a fully
functional system, but if I do see it resume, then the wake-event
code must have worked.

So, I'm tempted to say:

Tested-by: Thierry Reding <treding@nvidia.com>

on the series and then we can pick up the other pieces once everything
is back to at least booting.

Thanks again for tracking this down, and apologies for breaking it in
the first place. Let me know if you want me to send the bike-shedded
version of your patches.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing
  2020-10-05 15:45     ` Thierry Reding
@ 2020-10-05 18:23       ` Jon Hunter
  0 siblings, 0 replies; 12+ messages in thread
From: Jon Hunter @ 2020-10-05 18:23 UTC (permalink / raw)
  To: Thierry Reding, Marc Zyngier
  Cc: linux-tegra, linux-kernel, linux-arm-kernel, Dmitry Osipenko,
	Sowjanya Komatineni, Venkat Reddy Talla, Thomas Gleixner,
	kernel-team


On 05/10/2020 16:45, Thierry Reding wrote:

...

>>> Let Jon and myself do a bit of testing with this to verify that the wake
>>> up paths are still working.
>>
>> Sure. Let me know what you find.
> 
> The results are in and it's a bit of a mixed bag. I was able to confirm
> that Tegra194 also boots again after this series and I'm also able to
> resume from sleep using either rtcwake or the power-key as wakeup
> source, so the wake-events mechanism is still functional after the
> series. I do see a bit of breakage on resume, but none of that seems
> related to your patches and is likely something that crept in while we
> were looking into the current issue.
> 
> Jon had started a job in our test farm in parallel and that came back
> with a failing suspend/resume test on Tegra186 (Jetson TX2), but that
> seems to have been a pre-existing issue. This was already in linux-next
> around next-20200910 and Jon had been investigating it when the boot
> failures due to the IPI changes started happening. So I then hooked up
> my Jetson TX2 and verified locally that I can properly suspend/resume
> using either rtcwake or the power-key as wakeup source, just like I
> previously did on Tegra194 (Jetson AGX Xavier). Tegra186 seems to be a
> little more unstable because it didn't boot every time for me, but that
> is probably not related to this.

Yes my feeling is that those are other issues too that we need to look
at next.

> So, I'm tempted to say:
> 
> Tested-by: Thierry Reding <treding@nvidia.com>

Yes and you can have my ...

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Thanks again Marc for tracking this down!

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-10-05 18:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-05 11:14 [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Marc Zyngier
2020-10-05 11:14 ` [PATCH 1/3] gpio: tegra186: Allow optional irq parent callbacks Marc Zyngier
2020-10-05 11:14 ` [PATCH 2/3] soc/tegra: pmc: " Marc Zyngier
2020-10-05 11:27   ` Thierry Reding
2020-10-05 12:59     ` Marc Zyngier
2020-10-05 11:14 ` [PATCH 3/3] soc/tegra: pmc: Don't create fake interrupt hierarchy levels Marc Zyngier
2020-10-05 11:33   ` Thierry Reding
2020-10-05 13:10     ` Marc Zyngier
2020-10-05 11:22 ` [PATCH 0/3] soc/tegra: Prevent the PMC driver from corrupting interrupt routing Thierry Reding
2020-10-05 13:06   ` Marc Zyngier
2020-10-05 15:45     ` Thierry Reding
2020-10-05 18:23       ` Jon Hunter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).