Linux-ARM-Kernel Archive on lore.kernel.org
From: Marc Zyngier <maz@kernel.org>
To: dann frazier <dann.frazier@canonical.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Sumit Garg <sumit.garg@linaro.org>,
	kernel-team@android.com, Russell King <linux@arm.linux.org.uk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH 08/11] irqchip/gic: Configure SGIs as standard interrupts
Date: Wed, 21 Apr 2021 11:58:40 +0100
Message-ID: <8735vjrjj3.wl-maz@kernel.org>
In-Reply-To: <YH9G3+aDUWpcLCpD@xps13.dannf>

Hi Dann,

On Tue, 20 Apr 2021 22:25:51 +0100,
dann frazier <dann.frazier@canonical.com> wrote:
> 
> On Tue, Apr 20, 2021 at 02:37:10PM -0600, dann frazier wrote:
> > On Tue, May 19, 2020 at 05:17:52PM +0100, Marc Zyngier wrote:
> > > Change the way we deal with GIC SGIs by turning them into proper
> > > IRQs, and calling into the arch code to register the interrupt range
> > > instead of a callback.
> > > 
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > 
> > hey Marc,
> > 
> >   I bisected a boot failure on our Gigabyte R120-T33 systems (ThunderX
> > CN88XX) down to this commit, but only when running in ACPI mode. See below:
> > 
> > 
> > EFI stub: Booting Linux Kernel...
> > EFI stub: EFI_RNG_PROTOCOL unavailable, KASLR will be disabled
> > EFI stub: Using DTB from configuration table
> > EFI stub: Exiting boot services and installing virtual address map...
> > [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x431f0a11]
> > [    0.000000] Linux version 5.11.0-13-generic (buildd@bos02-arm64-067) (gcc (Ubuntu 10.2.1-23ubuntu1) 10.2.1 20210312, GNU ld (GNU Binutils for Ubuntu) 2.36.1) #14-Ubuntu SMP Fri Mar 19 16:57:35 UTC 2021 (Ubuntu 5.11.0-13.14-generic 5.11.7)
> 
> Sorry, realized I posted a log from an Ubuntu kernel. Here's an
> upstream one:

[...]

> 
> [    7.842174] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 243)
> [    7.849699] io scheduler mq-deadline registered
> [    7.857591] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> [    7.865127] efifb: probing for efifb
> [    7.868738] efifb: No BGRT, not showing boot graphics
> [    7.873783] efifb: framebuffer at 0x881010000000, using 3072k, total 3072k
> [    7.880649] efifb: mode is 1024x768x32, linelength=4096, pages=1
> [    7.886647] efifb: scrolling: redraw
> [    7.890212] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
> [    7.895905] fbcon: Deferring console take-over
> [    7.900350] fb0: EFI VGA frame buffer device
> [    7.905289] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
> [    7.913714] ACPI: button: Power Button [PWRB]
> [    7.919549] ACPI GTDT: [Firmware Bug]: failed to get the Watchdog base address.
> [    7.927289] Unable to handle kernel read from unreadable memory at virtual address 0000000000000028
> [    7.936326] Mem abort info:
> [    7.939108]   ESR = 0x96000004
> [    7.942151]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    7.947451]   SET = 0, FnV = 0
> [    7.950494]   EA = 0, S1PTW = 0
> [    7.953624] Data abort info:
> [    7.956492]   ISV = 0, ISS = 0x00000004
> [    7.960316]   CM = 0, WnR = 0
> [    7.963273] [0000000000000028] user address but active_mm is swapper
> [    7.969616] Internal error: Oops: 96000004 [#1] SMP
> [    7.974483] Modules linked in:
> [    7.977531] CPU: 9 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc8 #19
> [    7.983874] Hardware name: GIGABYTE R120-T33/MT30-GS1, BIOS F02 08/06/2019
> [    7.990737] pstate: 40400085 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
> [    7.996732] pc : __ipi_send_mask+0x60/0x114
> [    8.000910] lr : smp_cross_call+0x40/0xcc
> [    8.004913] sp : ffff800012753c10
> [    8.008216] x29: ffff800012753c10 x28: ffff000100de5d00 
> [    8.013521] x27: 000000000000000a x26: ffff80001225da20 
> [    8.018825] x25: 0000000000000000 x24: ffff000ff62719b0 
> [    8.024129] x23: ffff80001225d000 x22: ffff800012368108 
> [    8.029433] x21: ffff800010f69a20 x20: 0000000000000000 
> [    8.034737] x19: ffff000100143c60 x18: 0000000000000020 
> [    8.040041] x17: 000000008e74252f x16: 00000000bf0ab2ad 
> [    8.045345] x15: ffffffffffffffff x14: 0000000000000000 
> [    8.050649] x13: 003d090000000000 x12: 00003d0900000000 
> [    8.055953] x11: 0000000000000000 x10: 00003d0900000000 
> [    8.061257] x9 : ffff800010027f14 x8 : 0000000000000000 
> [    8.066561] x7 : 00000000ffffffff x6 : ffff000ff6148698 
> [    8.071865] x5 : ffff80001159d040 x4 : ffff80001159d110 
> [    8.077169] x3 : ffff800010f69a00 x2 : 0000000000000000 
> [    8.082473] x1 : ffff800010f69a20 x0 : 0000000000000000 
> [    8.087777] Call trace:
> [    8.090213]  __ipi_send_mask+0x60/0x114
> [    8.094038]  smp_cross_call+0x40/0xcc
> [    8.097691]  smp_send_reschedule+0x3c/0x50
> [    8.101778]  resched_curr+0x5c/0xb0
> [    8.105258]  check_preempt_curr+0x58/0x90
> [    8.109258]  ttwu_do_wakeup+0x2c/0x190
> [    8.112996]  ttwu_do_activate+0x7c/0x114
> [    8.116909]  try_to_wake_up+0x388/0x670
> [    8.120735]  wake_up_process+0x24/0x30
> [    8.124474]  swake_up_one+0x48/0x9c
> [    8.127953]  rcu_gp_kthread_wake+0x68/0x8c
> [    8.132041]  rcu_accelerate_cbs_unlocked+0xb4/0xf0
> [    8.136822]  rcu_core+0x520/0x694
> [    8.140128]  rcu_core_si+0x1c/0x2c
> [    8.143520]  __do_softirq+0x128/0x388
> [    8.147172]  irq_exit+0xc4/0xec
> [    8.150304]  __handle_domain_irq+0x8c/0xec
> [    8.154394]  gic_handle_irq+0xd8/0x2f0
> [    8.158132]  el1_irq+0xc0/0x180
> [    8.161262]  __pi_strcmp+0x20/0x158
> [    8.164742]  driver_register+0x68/0x140
> [    8.168571]  __platform_driver_register+0x34/0x40
> [    8.173265]  imx8mp_clk_driver_init+0x28/0x34
> [    8.177614]  do_one_initcall+0x50/0x260
> [    8.181440]  kernel_init_freeable+0x24c/0x2d4
> [    8.185790]  kernel_init+0x20/0x134
> [    8.189271]  ret_from_fork+0x10/0x18
> [    8.192840] Code: a90363f7 aa0103f5 d0010957 f9401260 (b9402800) 
> [    8.198955] ---[ end trace c24172add816c1f0 ]---
> [    8.203562] Kernel panic - not syncing: Oops: Fatal exception in interrupt
> [    8.210442] SMP: stopping secondary CPUs
> [    9.258360] SMP: failed to stop secondary CPUs 0,9
> [    9.263141] Kernel Offset: disabled
> [    9.266617] CPU features: 0x00040002,69101108
> [    9.270963] Memory Limit: none
> [    9.274024] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

Please feed this stacktrace to scripts/decode_stacktrace.sh so that I
can get an idea about what is going wrong. I bet something is playing
ungodly games with one of the IPIs, and things go horribly wrong.
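
As a side note, the "Code:" line of the quoted oops can be decoded by
hand while waiting for the symbolised trace. The sketch below is not
kernel code: the struct and function names are made up for this
illustration, and it only understands the A64 "LDR (immediate, unsigned
offset)" form for W and X registers. It shows that the faulting word
(b9402800) is a 32-bit load from [x0 + 40]. With x0 being zero in the
register dump, that matches the reported fault address of 0x28. Which
structure that NULL pointer was supposed to be is what the decoded
trace should tell us.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this illustration only. */
struct ldr_uoff {
	bool valid;          /* true if insn is LDR Wt/Xt, [Xn, #imm] */
	unsigned size_bytes; /* 4 for a W register, 8 for an X register */
	unsigned rn;         /* base register number */
	unsigned rt;         /* destination register number */
	unsigned offset;     /* byte offset (imm12 scaled by the access size) */
};

/*
 * Decode A64 "LDR (immediate, unsigned offset)" for general-purpose
 * registers. Bits [29:22] must be 0b11100101 (load/store class, V = 0,
 * opc = 01); bits [31:30] give the access size. Only the 32-bit and
 * 64-bit forms are accepted here; everything else is reported invalid.
 */
static struct ldr_uoff decode_ldr_uoff(uint32_t insn)
{
	struct ldr_uoff d = { 0 };
	unsigned size = insn >> 30;

	if (((insn >> 22) & 0xff) != 0xe5 || size < 2)
		return d;

	d.valid = true;
	d.size_bytes = 1u << size;
	d.rn = (insn >> 5) & 0x1f;
	d.rt = insn & 0x1f;
	d.offset = ((insn >> 10) & 0xfff) << size;
	return d;
}

/* Address the load would touch, given the value held in the base register. */
static uint64_t fault_address(struct ldr_uoff d, uint64_t base)
{
	assert(d.valid);
	return base + d.offset;
}
```

Feeding it 0xb9402800 yields rn = 0, rt = 0 and an offset of 40 bytes.
The word just before it in the dump (f9401260) decodes as a 64-bit load
from [x19 + 32] into x0, so the NULL pointer was itself fetched from
the object x19 points to.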

Now, here's a hunch: in the fine TX1 tradition, the firmware is broken
and the GTDT table looks unusable. Amusingly, the crash happens right
after the SBSA watchdog fails to probe.

And looking at the code that implements that driver, it looks dodgy as
hell, as it unmaps an interrupt it doesn't even know is valid. And it
does that right when the driver fails the way you experienced it. If,
by any chance, the interrupt field is 0 in the firmware table, this
results in SGI0 being unmapped. Given that this is the rescheduling
interrupt, fireworks happen.
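
To make the suspected failure mode concrete, here is a toy model in
plain C. None of it is kernel code: every toy_* name is invented for
this sketch, the mapping table is a simple array, and the rule that the
toy map function refuses IDs 0-15 is an assumption made for the
illustration, not a statement about what acpi_register_gsi() does. It
only shows why an unconditional unmap on the error path is dangerous
when the firmware hands us an interrupt number of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_IRQS 64
#define TOY_NR_SGIS 16	/* on a GIC, SGIs use interrupt IDs 0-15 */

/* Toy stand-in for the interrupt mappings: true means "mapped". */
static bool toy_mapped[TOY_NR_IRQS];

/* Hypothetical, trimmed-down view of a GTDT watchdog entry. */
struct toy_watchdog {
	unsigned long control_frame_address;
	unsigned long refresh_frame_address;
	unsigned int timer_interrupt;
};

/* Model boot: the arch code owns the SGIs and has them mapped. */
static void toy_boot_map_sgis(void)
{
	for (size_t i = 0; i < TOY_NR_IRQS; i++)
		toy_mapped[i] = i < TOY_NR_SGIS;
}

/* Toy map: returns the interrupt on success, 0 on failure (assumption). */
static int toy_map(unsigned int gsi)
{
	if (gsi < TOY_NR_SGIS || gsi >= TOY_NR_IRQS)
		return 0;
	toy_mapped[gsi] = true;
	return (int)gsi;
}

/* Toy unmap: tears the mapping down without checking who owns it. */
static void toy_unmap(unsigned int gsi)
{
	assert(gsi < TOY_NR_IRQS);
	toy_mapped[gsi] = false;
}

/* Old ordering: map first, then unmap blindly when the frames are bogus. */
static int toy_probe_buggy(const struct toy_watchdog *wd)
{
	int irq = toy_map(wd->timer_interrupt);

	if (!(wd->refresh_frame_address && wd->control_frame_address)) {
		toy_unmap(wd->timer_interrupt);	/* irq was never validated */
		return -1;
	}
	(void)irq;
	return 0;
}

/* New ordering: validate the frames before touching any interrupt. */
static int toy_probe_fixed(const struct toy_watchdog *wd)
{
	if (!(wd->refresh_frame_address && wd->control_frame_address))
		return -1;	/* nothing mapped yet, so nothing to undo */

	(void)toy_map(wd->timer_interrupt);
	return 0;
}
```

With an all-zero watchdog entry, toy_probe_buggy() fails and leaves
toy_mapped[0] cleared, which is the toy equivalent of losing SGI0.
toy_probe_fixed() fails in the same way but leaves the SGI mappings
alone, because it never reaches the map/unmap calls.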

Can you have a go with the patchlet below, and let me know if that
helps?

Thanks,

	M.

diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c
index f2d0e5915dab..0a0a982f9c28 100644
--- a/drivers/acpi/arm64/gtdt.c
+++ b/drivers/acpi/arm64/gtdt.c
@@ -329,7 +329,7 @@ static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
 					int index)
 {
 	struct platform_device *pdev;
-	int irq = map_gt_gsi(wd->timer_interrupt, wd->timer_flags);
+	int irq;
 
 	/*
 	 * According to SBSA specification the size of refresh and control
@@ -338,7 +338,7 @@ static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
 	struct resource res[] = {
 		DEFINE_RES_MEM(wd->control_frame_address, SZ_4K),
 		DEFINE_RES_MEM(wd->refresh_frame_address, SZ_4K),
-		DEFINE_RES_IRQ(irq),
+		{},
 	};
 	int nr_res = ARRAY_SIZE(res);
 
@@ -348,10 +348,11 @@ static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
 
 	if (!(wd->refresh_frame_address && wd->control_frame_address)) {
 		pr_err(FW_BUG "failed to get the Watchdog base address.\n");
-		acpi_unregister_gsi(wd->timer_interrupt);
 		return -EINVAL;
 	}
 
+	irq = map_gt_gsi(wd->timer_interrupt, wd->timer_flags);
+	res[2] = (struct resource)DEFINE_RES_IRQ(irq);
 	if (irq <= 0) {
 		pr_warn("failed to map the Watchdog interrupt.\n");
 		nr_res--;
@@ -364,7 +365,8 @@ static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
 	 */
 	pdev = platform_device_register_simple("sbsa-gwdt", index, res, nr_res);
 	if (IS_ERR(pdev)) {
-		acpi_unregister_gsi(wd->timer_interrupt);
+		if (irq > 0)
+			acpi_unregister_gsi(wd->timer_interrupt);
 		return PTR_ERR(pdev);
 	}
 

-- 
Without deviation from the norm, progress is not possible.


