All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
	Nathan Chancellor <nathan@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP
Date: Tue, 31 Aug 2021 23:48:17 +1000	[thread overview]
Message-ID: <87mtoxendq.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <55783e78-3159-9ab2-7955-fb5aa8aa0ddd@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 30/08/2021 à 13:55, Michael Ellerman a écrit :
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> Le 30/08/2021 à 09:52, Michael Ellerman a écrit :
>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>> Le 29/08/2021 à 20:55, Nathan Chancellor a écrit :
>>>>>> On Thu, Jul 08, 2021 at 04:49:43PM +0000, Christophe Leroy wrote:
>>>>>>> This patch converts powerpc to the generic PTDUMP implementation.
>>>>>>>
>>>>>>
>>>>>> This patch as commit e084728393a5 ("powerpc/ptdump: Convert powerpc to
>>>>>> GENERIC_PTDUMP") in powerpc/next causes a panic with Fedora's ppc64le
>>>>>> config [1] when booting up in QEMU with [2]:
>>>>>>
>>>>>> [    1.621864] BUG: Unable to handle kernel data access on read at 0xc0eeff7f00000000
>>>>>> [    1.623058] Faulting instruction address: 0xc00000000045e5fc
>>>>>> [    1.623832] Oops: Kernel access of bad area, sig: 11 [#1]
>>>>>> [    1.624318] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>>>>>> [    1.625015] Modules linked in:
>>>>>> [    1.625463] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7-next-20210827 #16
>>>>>> [    1.626237] NIP:  c00000000045e5fc LR: c00000000045e580 CTR: c000000000518220
>>>>>> [    1.626839] REGS: c00000000752b820 TRAP: 0380   Not tainted  (5.14.0-rc7-next-20210827)
>>>>>> [    1.627528] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 84002482  XER: 20000000
>>>>>> [    1.628449] CFAR: c000000000518300 IRQMASK: 0
>>>>>> [    1.628449] GPR00: c00000000045e580 c00000000752bac0 c0000000028a9300 0000000000000000
>>>>>> [    1.628449] GPR04: c200800000000000 ffffffffffffffff 000000000000000a 0000000000000001
>>>>>> [    1.628449] GPR08: c0eeff7f00000000 0000000000000012 0000000000000000 0000000000000000
>>>>>> [    1.628449] GPR12: 0000000000000000 c000000002b20000 fffffffffffffffe c000000002971a70
>>>>>> [    1.628449] GPR16: c000000002960040 c0000000011a8f98 c00000000752bbf0 ffffffffffffffff
>>>>>> [    1.628449] GPR20: c2008fffffffffff c0eeff7f00000000 c000000002971a68 c00a0003ff000000
>>>>>> [    1.628449] GPR24: c000000002971a78 0000000000000002 0000000000000001 c0000000011a8f98
>>>>>> [    1.628449] GPR28: c0000000011a8f98 c0000000028daef8 c200800000000000 c200900000000000
>>>>>> [    1.634090] NIP [c00000000045e5fc] __walk_page_range+0x2bc/0xce0
>>>>>> [    1.635117] LR [c00000000045e580] __walk_page_range+0x240/0xce0
>>>>>> [    1.635755] Call Trace:
>>>>>> [    1.636018] [c00000000752bac0] [c00000000045e580] __walk_page_range+0x240/0xce0 (unreliable)
>>>>>> [    1.636811] [c00000000752bbd0] [c00000000045f234] walk_page_range_novma+0x74/0xb0
>>>>>> [    1.637459] [c00000000752bc20] [c000000000518448] ptdump_walk_pgd+0x98/0x170
>>>>>> [    1.638138] [c00000000752bc70] [c0000000000aa988] ptdump_check_wx+0x88/0xd0
>>>>>> [    1.638738] [c00000000752bd50] [c00000000008d6d8] mark_rodata_ro+0x48/0x80
>>>>>> [    1.639299] [c00000000752bdb0] [c000000000012a34] kernel_init+0x74/0x1a0
>>>>>> [    1.639842] [c00000000752be10] [c00000000000cfd4] ret_from_kernel_thread+0x5c/0x64
>>>>>> [    1.640597] Instruction dump:
>>>>>> [    1.641021] 38e7ffff 39490010 7ce707b4 7fca5436 79081564 7d4a3838 7908f082 794a1f24
>>>>>> [    1.641740] 78a8f00e 30e6ffff 7ea85214 7ce73110 <7d48502a> 78f90fa4 2c2a0000 39290010
>>>>>> [    1.642771] ---[ end trace 6cf72b085097ad52 ]---
>>>>>> [    1.643220]
>>>>>> [    2.644228] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>>>>>> [    2.645523] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>>>>>>
>>>>>> This is not compiler specific, I can reproduce it with GCC 11.2.0 and
>>>>>> binutils 2.37. If there is any additional information I can provide,
>>>>>> please let me know.
>>>>>
>>>>> Can you provide a dissassembly of __walk_page_range() ? Or provide your vmlinux binary.
>>>>
>>>> It seems to be walking of the end of the pgd.
>>>>
>>>> [    3.373800] walk_p4d_range: addr c00fff0000000000 end c00fff8000000000
>>>> [    3.373852] walk_p4d_range: addr c00fff8000000000 end c010000000000000	<- end of pgd at PAGE_OFFSET + 4PB
>>>> [    3.373905] walk_p4d_range: addr c010000000000000 end c010008000000000
>>>
>>> Yes, I want it to walk from TASK_SIZE_MAX up to 0xffffffffffffffff :)
>> 
>> But the page table doesn't span that far? 0_o
>> 
>>> static struct ptdump_range ptdump_range[] __ro_after_init = {
>>> 	{TASK_SIZE_MAX, ~0UL},
>>> 	{0, 0}
>>> };
>>>
>>> Ok, well, ppc32 go up to 0xffffffff
>>>
>>> What's the top address to be used for ppc64 ?
>> 
>> It's different for (hash | radix) x page size.
>> 
>> The below works, and matches what we used to do.
>> 
>> Possibly we can come up with something cleaner, not sure.
>> 
>> cheers
>> 
>> 
>> diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
>> index 2d80d775d15e..3d3778a74969 100644
>> --- a/arch/powerpc/mm/ptdump/ptdump.c
>> +++ b/arch/powerpc/mm/ptdump/ptdump.c
>> @@ -359,6 +359,8 @@ static int __init ptdump_init(void)
>>   		ptdump_range[0].start = KERN_VIRT_START;
>>   	else
>>   		ptdump_range[0].start = PAGE_OFFSET;
>> +
>> +	ptdump_range[0].end = ptdump_range[0].start + (PGDIR_SIZE * PTRS_PER_PGD);
>
> Hum ...
>
> It was:
>
> 	for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
>
> And there is
>
> #define pgd_index(a)  (((a) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))

Yes you're right.

> Do we have the following ?
>
> 	pgd_index(KERN_VIRT_START) == 0

No.

Since 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")

It's:

 	pgd_index(PAGE_OFFSET) == 0


> Shouldn't it be something like
>
> 	ptdump_range[0].end = PAGE_OFFSET + (PGDIR_SIZE * PTRS_PER_PGD);

Yep.

And we should also change the start address for hash to be PAGE_OFFSET.
Even though we don't expect anything in the page tables between
PAGE_OFFSET and KERN_VIRT_START, it's still good to check that range.

cheers

WARNING: multiple messages have this Message-ID
From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
	Nathan Chancellor <nathan@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras <paulus@samba.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP
Date: Tue, 31 Aug 2021 23:48:17 +1000	[thread overview]
Message-ID: <87mtoxendq.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <55783e78-3159-9ab2-7955-fb5aa8aa0ddd@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 30/08/2021 à 13:55, Michael Ellerman a écrit :
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> Le 30/08/2021 à 09:52, Michael Ellerman a écrit :
>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>> Le 29/08/2021 à 20:55, Nathan Chancellor a écrit :
>>>>>> On Thu, Jul 08, 2021 at 04:49:43PM +0000, Christophe Leroy wrote:
>>>>>>> This patch converts powerpc to the generic PTDUMP implementation.
>>>>>>>
>>>>>>
>>>>>> This patch as commit e084728393a5 ("powerpc/ptdump: Convert powerpc to
>>>>>> GENERIC_PTDUMP") in powerpc/next causes a panic with Fedora's ppc64le
>>>>>> config [1] when booting up in QEMU with [2]:
>>>>>>
>>>>>> [    1.621864] BUG: Unable to handle kernel data access on read at 0xc0eeff7f00000000
>>>>>> [    1.623058] Faulting instruction address: 0xc00000000045e5fc
>>>>>> [    1.623832] Oops: Kernel access of bad area, sig: 11 [#1]
>>>>>> [    1.624318] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>>>>>> [    1.625015] Modules linked in:
>>>>>> [    1.625463] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7-next-20210827 #16
>>>>>> [    1.626237] NIP:  c00000000045e5fc LR: c00000000045e580 CTR: c000000000518220
>>>>>> [    1.626839] REGS: c00000000752b820 TRAP: 0380   Not tainted  (5.14.0-rc7-next-20210827)
>>>>>> [    1.627528] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 84002482  XER: 20000000
>>>>>> [    1.628449] CFAR: c000000000518300 IRQMASK: 0
>>>>>> [    1.628449] GPR00: c00000000045e580 c00000000752bac0 c0000000028a9300 0000000000000000
>>>>>> [    1.628449] GPR04: c200800000000000 ffffffffffffffff 000000000000000a 0000000000000001
>>>>>> [    1.628449] GPR08: c0eeff7f00000000 0000000000000012 0000000000000000 0000000000000000
>>>>>> [    1.628449] GPR12: 0000000000000000 c000000002b20000 fffffffffffffffe c000000002971a70
>>>>>> [    1.628449] GPR16: c000000002960040 c0000000011a8f98 c00000000752bbf0 ffffffffffffffff
>>>>>> [    1.628449] GPR20: c2008fffffffffff c0eeff7f00000000 c000000002971a68 c00a0003ff000000
>>>>>> [    1.628449] GPR24: c000000002971a78 0000000000000002 0000000000000001 c0000000011a8f98
>>>>>> [    1.628449] GPR28: c0000000011a8f98 c0000000028daef8 c200800000000000 c200900000000000
>>>>>> [    1.634090] NIP [c00000000045e5fc] __walk_page_range+0x2bc/0xce0
>>>>>> [    1.635117] LR [c00000000045e580] __walk_page_range+0x240/0xce0
>>>>>> [    1.635755] Call Trace:
>>>>>> [    1.636018] [c00000000752bac0] [c00000000045e580] __walk_page_range+0x240/0xce0 (unreliable)
>>>>>> [    1.636811] [c00000000752bbd0] [c00000000045f234] walk_page_range_novma+0x74/0xb0
>>>>>> [    1.637459] [c00000000752bc20] [c000000000518448] ptdump_walk_pgd+0x98/0x170
>>>>>> [    1.638138] [c00000000752bc70] [c0000000000aa988] ptdump_check_wx+0x88/0xd0
>>>>>> [    1.638738] [c00000000752bd50] [c00000000008d6d8] mark_rodata_ro+0x48/0x80
>>>>>> [    1.639299] [c00000000752bdb0] [c000000000012a34] kernel_init+0x74/0x1a0
>>>>>> [    1.639842] [c00000000752be10] [c00000000000cfd4] ret_from_kernel_thread+0x5c/0x64
>>>>>> [    1.640597] Instruction dump:
>>>>>> [    1.641021] 38e7ffff 39490010 7ce707b4 7fca5436 79081564 7d4a3838 7908f082 794a1f24
>>>>>> [    1.641740] 78a8f00e 30e6ffff 7ea85214 7ce73110 <7d48502a> 78f90fa4 2c2a0000 39290010
>>>>>> [    1.642771] ---[ end trace 6cf72b085097ad52 ]---
>>>>>> [    1.643220]
>>>>>> [    2.644228] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>>>>>> [    2.645523] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>>>>>>
>>>>>> This is not compiler specific, I can reproduce it with GCC 11.2.0 and
>>>>>> binutils 2.37. If there is any additional information I can provide,
>>>>>> please let me know.
>>>>>
>>>>> Can you provide a dissassembly of __walk_page_range() ? Or provide your vmlinux binary.
>>>>
>>>> It seems to be walking of the end of the pgd.
>>>>
>>>> [    3.373800] walk_p4d_range: addr c00fff0000000000 end c00fff8000000000
>>>> [    3.373852] walk_p4d_range: addr c00fff8000000000 end c010000000000000	<- end of pgd at PAGE_OFFSET + 4PB
>>>> [    3.373905] walk_p4d_range: addr c010000000000000 end c010008000000000
>>>
>>> Yes, I want it to walk from TASK_SIZE_MAX up to 0xffffffffffffffff :)
>> 
>> But the page table doesn't span that far? 0_o
>> 
>>> static struct ptdump_range ptdump_range[] __ro_after_init = {
>>> 	{TASK_SIZE_MAX, ~0UL},
>>> 	{0, 0}
>>> };
>>>
>>> Ok, well, ppc32 go up to 0xffffffff
>>>
>>> What's the top address to be used for ppc64 ?
>> 
>> It's different for (hash | radix) x page size.
>> 
>> The below works, and matches what we used to do.
>> 
>> Possibly we can come up with something cleaner, not sure.
>> 
>> cheers
>> 
>> 
>> diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
>> index 2d80d775d15e..3d3778a74969 100644
>> --- a/arch/powerpc/mm/ptdump/ptdump.c
>> +++ b/arch/powerpc/mm/ptdump/ptdump.c
>> @@ -359,6 +359,8 @@ static int __init ptdump_init(void)
>>   		ptdump_range[0].start = KERN_VIRT_START;
>>   	else
>>   		ptdump_range[0].start = PAGE_OFFSET;
>> +
>> +	ptdump_range[0].end = ptdump_range[0].start + (PGDIR_SIZE * PTRS_PER_PGD);
>
> Hum ...
>
> It was:
>
> 	for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
>
> And there is
>
> #define pgd_index(a)  (((a) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))

Yes you're right.

> Do we have the following ?
>
> 	pgd_index(KERN_VIRT_START) == 0

No.

Since 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")

It's:

 	pgd_index(PAGE_OFFSET) == 0


> Shouldn't it be something like
>
> 	ptdump_range[0].end = PAGE_OFFSET + (PGDIR_SIZE * PTRS_PER_PGD);

Yep.

And we should also change the start address for hash to be PAGE_OFFSET.
Even though we don't expect anything in the page tables between
PAGE_OFFSET and KERN_VIRT_START, it's still good to check that range.

cheers

  reply	other threads:[~2021-08-31 13:48 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-08 16:49 [PATCH v4 1/4] powerpc/ptdump: Use DEFINE_SHOW_ATTRIBUTE() Christophe Leroy
2021-07-08 16:49 ` Christophe Leroy
2021-07-08 16:49 ` [PATCH v4 2/4] powerpc/ptdump: Remove unused 'page_size' parameter Christophe Leroy
2021-07-08 16:49   ` Christophe Leroy
2021-07-08 16:49 ` [PATCH v4 3/4] powerpc/ptdump: Reduce level numbers by 1 in note_page() and add p4d level Christophe Leroy
2021-07-08 16:49   ` Christophe Leroy
2021-07-08 16:49 ` [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP Christophe Leroy
2021-07-08 16:49   ` Christophe Leroy
2021-08-29 18:55   ` Nathan Chancellor
2021-08-29 18:55     ` Nathan Chancellor
2021-08-29 19:11     ` Christophe Leroy
2021-08-29 19:11       ` Christophe Leroy
2021-08-29 21:39       ` Nathan Chancellor
2021-08-29 21:39         ` Nathan Chancellor
2021-08-30  7:52       ` Michael Ellerman
2021-08-30  7:52         ` Michael Ellerman
2021-08-30  8:16         ` Christophe Leroy
2021-08-30  8:16           ` Christophe Leroy
2021-08-30 11:55           ` Michael Ellerman
2021-08-30 11:55             ` Michael Ellerman
2021-08-30 13:16             ` Christophe Leroy
2021-08-30 13:16               ` Christophe Leroy
2021-08-31 13:48               ` Michael Ellerman [this message]
2021-08-31 13:48                 ` Michael Ellerman
2021-08-27 13:16 ` [PATCH v4 1/4] powerpc/ptdump: Use DEFINE_SHOW_ATTRIBUTE() Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mtoxendq.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=benh@kernel.crashing.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nathan@kernel.org \
    --cc=paulus@samba.org \
    --subject='Re: [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.