From: Nathan Chancellor <nathan@kernel.org>
To: Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Will Deacon <will@kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: linuxppc-dev@lists.ozlabs.org, linux-next@vger.kernel.org,
	Claire Chang <tientzu@chromium.org>,
	Christoph Hellwig <hch@lst.de>,
	Robin Murphy <robin.murphy@arm.com>,
	iommu@lists.linux-foundation.org
Subject: Re: [powerpc][next-20210727] Boot failure - kernel BUG at arch/powerpc/kernel/interrupt.c:98!
Date: Wed, 28 Jul 2021 10:35:34 -0700
Message-ID: <YQGVZnMe9hFieF8D@Ryzen-9-3900X.localdomain>
In-Reply-To: <1905CD70-7656-42AE-99E2-A31FC3812EAC@linux.vnet.ibm.com>

On Wed, Jul 28, 2021 at 01:31:06PM +0530, Sachin Sant wrote:
> linux-next fails to boot on Power server (POWER8/POWER9). Following traces
> are seen during boot
> 
> [    0.010799] software IO TLB: tearing down default memory pool
> [    0.010805] ------------[ cut here ]------------
> [    0.010808] kernel BUG at arch/powerpc/kernel/interrupt.c:98!
> [    0.010812] Oops: Exception in kernel mode, sig: 5 [#1]
> [    0.010816] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [    0.010820] Modules linked in:
> [    0.010824] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc3-next-20210727 #1
> [    0.010830] NIP:  c000000000032cfc LR: c00000000000c764 CTR: c00000000000c670
> [    0.010834] REGS: c000000003603b10 TRAP: 0700   Not tainted  (5.14.0-rc3-next-20210727)
> [    0.010838] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000002
> [    0.010848] CFAR: c00000000000c760 IRQMASK: 3 
> [    0.010848] GPR00: c00000000000c764 c000000003603db0 c0000000029bd000 0000000000000001 
> [    0.010848] GPR04: 0000000000000a68 0000000000000400 c000000003603868 ffffffffffffffff 
> [    0.010848] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000003 
> [    0.010848] GPR12: ffffffffffffffff c00000001ec9ee80 c000000000012a28 0000000000000000 
> [    0.010848] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010848] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010848] GPR24: 000000000000f134 0000000000000000 ffffffffffffffff c000000003603868 
> [    0.010848] GPR28: 0000000000000400 0000000000000a68 c00000000202e9c0 c000000003603e80 
> [    0.010896] NIP [c000000000032cfc] system_call_exception+0x8c/0x2e0
> [    0.010901] LR [c00000000000c764] system_call_common+0xf4/0x258
> [    0.010907] Call Trace:
> [    0.010909] [c000000003603db0] [c00000000016a6dc] calculate_sigpending+0x4c/0xe0 (unreliable)
> [    0.010915] [c000000003603e10] [c00000000000c764] system_call_common+0xf4/0x258
> [    0.010921] --- interrupt: c00 at kvm_template_end+0x4/0x8
> [    0.010926] NIP:  c000000000092dec LR: c000000000114fc8 CTR: 0000000000000000
> [    0.010930] REGS: c000000003603e80 TRAP: 0c00   Not tainted  (5.14.0-rc3-next-20210727)
> [    0.010934] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000000
> [    0.010943] IRQMASK: 0 
> [    0.010943] GPR00: c00000000202e9c0 c000000003603b00 c0000000029bd000 000000000000f134 
> [    0.010943] GPR04: 0000000000000a68 0000000000000400 c000000003603868 ffffffffffffffff 
> [    0.010943] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010943] GPR12: 0000000000000000 c00000001ec9ee80 c000000000012a28 0000000000000000 
> [    0.010943] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010943] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010943] GPR24: c0000000020033c4 c00000000110afc0 c000000002081950 c000000003277d40 
> [    0.010943] GPR28: 0000000000000000 c00000000a680000 0000000004000000 00000000000d0000 
> [    0.010989] NIP [c000000000092dec] kvm_template_end+0x4/0x8
> [    0.010993] LR [c000000000114fc8] set_memory_encrypted+0x38/0x60
> [    0.010999] --- interrupt: c00
> [    0.011001] [c000000003603b00] [c00000000000c764] system_call_common+0xf4/0x258 (unreliable)
> [    0.011008] Instruction dump:
> [    0.011011] 694a0003 312affff 7d495110 0b0a0000 60000000 60000000 e87f0108 68690002 
> [    0.011019] 7929ffe2 0b090000 68634000 786397e2 <0b030000> e93f0138 792907e0 0b090000 
> [    0.011029] ---[ end trace a20ad55589efcb10 ]---
> [    0.012297] 
> [    1.012304] Kernel panic - not syncing: Fatal exception
> 
> next-20210723 was good. The boot failure seems to have been introduced with next-20210726.
> 
> I have attached the boot log.

I noticed this with openSUSE's ppc64le config [1] and my bisect landed on
commit ad6c00283163 ("swiotlb: Free tbl memory in swiotlb_exit()"). That
series just keeps on giving... Adding some people from that thread to
this one. Original thread:
https://lore.kernel.org/r/1905CD70-7656-42AE-99E2-A31FC3812EAC@linux.vnet.ibm.com/

[1]: https://github.com/openSUSE/kernel-source/raw/master/config/ppc64le/default

Cheers,
Nathan
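
For readers following the quoted oops: the word in angle brackets in the
"Instruction dump" is the instruction at NIP, and the first GPR dump holds the
register values at the time of the trap. The sketch below is only an
illustration and not part of the original report. It decodes that word as a
Power ISA trap-immediate instruction and checks the condition against r3 from
the dump. The helper names (faulting_word, gprs, decode_trap_imm, trap_fires)
are hypothetical, and the sample text is trimmed from the log above with the
"> [timestamp]" prefixes removed.

```python
import re

# Trimmed from the quoted oops; only the lines this sketch needs.
DUMP = """\
GPR00: c00000000000c764 c000000003603db0 c0000000029bd000 0000000000000001
Instruction dump:
694a0003 312affff 7d495110 0b0a0000 60000000 60000000 e87f0108 68690002
7929ffe2 0b090000 68634000 786397e2 <0b030000> e93f0138 792907e0 0b090000
"""

# TO-field condition bits of the Power ISA tdi/twi instructions.
TO_LT, TO_GT, TO_EQ, TO_LGT, TO_LLT = 16, 8, 4, 2, 1


def faulting_word(text):
    """Return the instruction word marked <...> in the dump as an int."""
    match = re.search(r"<([0-9a-f]{8})>", text)
    return int(match.group(1), 16)


def gprs(text):
    """Map register number -> value, keeping the first dump that appears."""
    regs = {}
    for match in re.finditer(r"GPR(\d\d):((?: [0-9a-f]{16})+)", text):
        base = int(match.group(1))
        for offset, value in enumerate(match.group(2).split()):
            regs.setdefault(base + offset, int(value, 16))
    return regs


def decode_trap_imm(word):
    """Decode tdi (opcode 2) or twi (opcode 3); return None for anything else."""
    opcode = (word >> 26) & 0x3F
    if opcode not in (2, 3):
        return None
    to = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    si = word & 0xFFFF
    if si & 0x8000:  # the immediate is sign-extended
        si -= 0x10000
    return ("tdi" if opcode == 2 else "twi", to, ra, si)


def trap_fires(to, a, b, bits=64):
    """Evaluate the TO conditions for register value a and immediate b."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask
    sa = ua - (1 << bits) if ua >> (bits - 1) else ua
    sb = ub - (1 << bits) if ub >> (bits - 1) else ub
    return bool(
        (to & TO_LT and sa < sb)
        or (to & TO_GT and sa > sb)
        or (to & TO_EQ and sa == sb)
        or (to & TO_LLT and ua < ub)
        or (to & TO_LGT and ua > ub)
    )


if __name__ == "__main__":
    mnemonic, to, ra, si = decode_trap_imm(faulting_word(DUMP))
    value = gprs(DUMP)[ra]
    print(f"{mnemonic} {to},r{ra},{si}  r{ra}={value:#x}  "
          f"fires={trap_fires(to, value, si)}")
```

With TO=24 (less-than or greater-than, i.e. "not equal") this is the extended
mnemonic tdnei r3,0, a trap taken when r3 is nonzero. The dump shows r3 = 1,
which is consistent with a BUG_ON()-style check firing and with the reported
TRAP: 0700 program interrupt.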


Thread overview: 22+ messages
2021-07-28  8:01 [powerpc][next-20210727] Boot failure - kernel BUG at arch/powerpc/kernel/interrupt.c:98! Sachin Sant
2021-07-28 17:35 ` Nathan Chancellor [this message]
2021-07-29  4:08   ` Nicholas Piggin
2021-07-29  4:21   ` Sachin Sant
2021-07-29 16:13   ` Will Deacon
2021-07-29 16:35     ` Konrad Rzeszutek Wilk
2021-07-29 19:05       ` Nathan Chancellor
2021-07-30  5:17     ` Sachin Sant
