linuxppc-dev.lists.ozlabs.org archive mirror
From: Gaurav Batra <gbatra@linux.ibm.com>
To: "Michal Suchánek" <msuchanek@suse.de>,
	"Michael Ellerman" <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc/pseries/iommu: LPAR panics when rebooted with a frozen PE
Date: Fri, 19 Apr 2024 09:41:11 -0500
Message-ID: <3064baac-4727-4b9b-ab86-fc9476c937e0@linux.ibm.com>
In-Reply-To: <20240419111127.GZ20665@kitsune.suse.cz>

You are right. I think "reboot" should be replaced with just "boot up".
If there are no other comments or code changes, I can reword the commit
message and resubmit it for review.

Thanks,

Gaurav

On 4/19/24 6:11 AM, Michal Suchánek wrote:
> Hello,
>
> On Fri, Apr 19, 2024 at 04:12:46PM +1000, Michael Ellerman wrote:
>> Gaurav Batra <gbatra@linux.ibm.com> writes:
>>> At the time of LPAR reboot, partition firmware provides Open Firmware
>>> property ibm,dma-window for the PE. This property is provided on the PCI
>>> bus the PE is attached to.
>> AFAICS you're actually describing a bug that happens during boot *up*?
>>
>> Describing it as "reboot" makes me think you're talking about the
>> shutdown path. I think that will confuse people, me at least :)
> There is probably an assumption that it must have been running
> previously for the errors to happen in the first place, but given that
> the error state persists for a day, it may be a very long 'reboot'.
>
> Thanks
>
> Michal
>> cheers
>>
>>> There are exceptions where the partition firmware might not provide this
>>> property for the PE at the time of LPAR reboot. One such scenario is
>>> where the firmware has frozen the PE due to some error condition. The
>>> PE stays frozen for 24 hours or until the whole system is reinitialized.
>>>
>>> Within this time frame, if the LPAR is rebooted, the frozen PE will be
>>> presented to the LPAR but the ibm,dma-window property could be missing.
>>>
>>> Today, under these circumstances, the LPAR oopses with a NULL pointer
>>> dereference when configuring the PCI bus the PE is attached to.
>>>
>>> BUG: Kernel NULL pointer dereference on read at 0x000000c8
>>> Faulting instruction address: 0xc0000000001024c0
>>> Oops: Kernel access of bad area, sig: 7 [#1]
>>> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>>> Modules linked in:
>>> Supported: Yes
>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.4.0-150600.9-default #1
>>> Hardware name: IBM,9043-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_023) hv:phyp pSeries
>>> NIP:  c0000000001024c0 LR: c0000000001024b0 CTR: c000000000102450
>>> REGS: c0000000037db5c0 TRAP: 0300   Not tainted  (6.4.0-150600.9-default)
>>> MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 28000822  XER: 00000000
>>> CFAR: c00000000010254c DAR: 00000000000000c8 DSISR: 00080000 IRQMASK: 0
>>> ...
>>> NIP [c0000000001024c0] pci_dma_bus_setup_pSeriesLP+0x70/0x2a0
>>> LR [c0000000001024b0] pci_dma_bus_setup_pSeriesLP+0x60/0x2a0
>>> Call Trace:
>>> 	pci_dma_bus_setup_pSeriesLP+0x60/0x2a0 (unreliable)
>>> 	pcibios_setup_bus_self+0x1c0/0x370
>>> 	__of_scan_bus+0x2f8/0x330
>>> 	pcibios_scan_phb+0x280/0x3d0
>>> 	pcibios_init+0x88/0x12c
>>> 	do_one_initcall+0x60/0x320
>>> 	kernel_init_freeable+0x344/0x3e4
>>> 	kernel_init+0x34/0x1d0
>>> 	ret_from_kernel_user_thread+0x14/0x1c
>>>
>>> Fixes: b1fc44eaa9ba ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window")
>>> Signed-off-by: Gaurav Batra <gbatra@linux.ibm.com>
>>> ---
>>>   arch/powerpc/platforms/pseries/iommu.c | 8 ++++++++
>>>   1 file changed, 8 insertions(+)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>>> index e8c4129697b1..e808d5b1fa49 100644
>>> --- a/arch/powerpc/platforms/pseries/iommu.c
>>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>>> @@ -786,8 +786,16 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
>>>   	 * parent bus. During reboot, there will be ibm,dma-window property to
>>>   	 * define DMA window. For kdump, there will at least be default window or DDW
>>>   	 * or both.
>>> +	 * There is an exception to the above. In case the PE goes into frozen
>>> +	 * state, firmware may not provide ibm,dma-window property at the time
>>> +	 * of LPAR reboot.
>>>   	 */
>>>   
>>> +	if (!pdn) {
>>> +		pr_debug("  no ibm,dma-window property !\n");
>>> +		return;
>>> +	}
>>> +
>>>   	ppci = PCI_DN(pdn);
>>>   
>>>   	pr_debug("  parent is %pOF, iommu_table: 0x%p\n",
>>>
>>> base-commit: 2c71fdf02a95b3dd425b42f28fd47fb2b1d22702
>>> -- 
>>> 2.39.3 (Apple Git-146)
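
For readers following the thread, the guard added by the patch can be
illustrated outside the kernel. The following is only a userspace sketch
under stated assumptions: struct node, find_dma_window_node(), bus_setup()
and demo() are hypothetical names invented for this illustration and are
not the kernel's struct device_node, its lookup helper, or
pci_dma_bus_setup_pSeriesLP(). It shows the general pattern of a parent
walk that can come back empty, and a caller that returns early instead of
dereferencing the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device-tree node (NOT struct device_node). */
struct node {
	struct node *parent;      /* NULL at the root of the tree */
	const char *dma_window;   /* NULL when the property is absent */
	void *data;               /* per-node private data */
};

/*
 * Walk from dn towards the root and return the first node that carries
 * a DMA window property, or NULL if no node on the path has one.
 */
static struct node *find_dma_window_node(struct node *dn)
{
	for (; dn; dn = dn->parent)
		if (dn->dma_window)
			return dn;
	return NULL;
}

/*
 * Returns 0 when a window-carrying node was found and used, -1 when
 * setup was skipped because no such node exists (the frozen-PE case
 * described in the commit message).
 */
static int bus_setup(struct node *bus_dn)
{
	struct node *pdn = find_dma_window_node(bus_dn);
	void *priv;

	/* Guard: without this check, pdn->data below would fault. */
	if (!pdn)
		return -1;

	priv = pdn->data;
	(void)priv;               /* real code would configure the window here */
	return 0;
}

/* Small usage example covering both the normal and the frozen case. */
static void demo(void)
{
	struct node root = { NULL, "window", NULL };
	struct node child = { &root, NULL, NULL };
	struct node frozen = { NULL, NULL, NULL };

	assert(bus_setup(&child) == 0);    /* property found on the parent */
	assert(bus_setup(&frozen) == -1);  /* no property anywhere: skipped */
}
```

In this sketch the missing-property case is reported to the caller as -1;
the patch itself returns from a void function after a pr_debug() message.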

