* Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
@ 2017-09-19 18:24 Guenter Roeck
  2017-09-20  3:05 ` Michael Ellerman
  0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2017-09-19 18:24 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Michael Ellerman, linux-kernel, Benjamin Herrenschmidt,
	linuxppc-dev, Paul Mackerras

Hi,

I see the following traceback when running an SMP image based on
85xx/mpc85xx_cds_defconfig in qemu.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
task: cf830000 task.stack: cf82e000
NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000

GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001 
GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000 
GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000 
GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000 
NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
LR [c00a9634] smp_call_function+0x3c/0x50
Call Trace:
[cf82fe90] [00000010] 0x10 (unreliable)
[cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
[cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
[cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
[cf82ff20] [c001484c] free_initmem+0x20/0x4c
[cf82ff30] [c000316c] kernel_init+0x1c/0x108
[cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac 
3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78 
---[ end trace 7da7bdcf8b15ddb3 ]---

A complete log is available at:
http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio

Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
after freeing unused memory on PPC32"). Bisect log is attached. A quick look
suggests that mark_initmem_nx() is called with interrupts disabled, which
triggers the traceback.

Guenter

---
# bad: [ebb2c2437d8008d46796902ff390653822af6cc4] Merge tag 'mmc-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
# good: [e7d0c41ecc2e372a81741a30894f556afec24315] Merge tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect start 'HEAD' 'e7d0c41ecc2e'
# bad: [c0da4fa0d1a54495d6055c009ac46b76d1da2c86] Merge tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect bad c0da4fa0d1a54495d6055c009ac46b76d1da2c86
# good: [aae3dbb4776e7916b6cd442d00159bea27a695c1] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good aae3dbb4776e7916b6cd442d00159bea27a695c1
# bad: [3645e6d0dc80be4376f87acc9ee527768387c909] Merge tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
git bisect bad 3645e6d0dc80be4376f87acc9ee527768387c909
# bad: [bac65d9d87b383471d8d29128319508d71b74180] Merge tag 'powerpc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect bad bac65d9d87b383471d8d29128319508d71b74180
# good: [57e88b43b81301d9b28f124a5576ac43a1cf9e8d] Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 57e88b43b81301d9b28f124a5576ac43a1cf9e8d
# bad: [f9065c83ccf4a6c1ff5419d216ad8276e99bee6c] powerpc/configs: Explicitly drop CONFIG_INPUT_MOUSEDEV
git bisect bad f9065c83ccf4a6c1ff5419d216ad8276e99bee6c
# good: [ea16e83aec40f9110be9cb0c3398ef41ae890ca6] powerpc/cpm1: link to CONFIG_CPM1 instead of CONFIG_8xx
git bisect good ea16e83aec40f9110be9cb0c3398ef41ae890ca6
# bad: [8bfa42ab84910841336218265fcee94fd1e6285a] powerpc: Add const to bin_attribute structures
git bisect bad 8bfa42ab84910841336218265fcee94fd1e6285a
# good: [36992606eee8016c36ad2576687e97422f2f35ed] powerpc/chrp: Store the intended structure
git bisect good 36992606eee8016c36ad2576687e97422f2f35ed
# bad: [86b19520e7ef5539eb081c76fe2f5c955180205f] powerpc/mm: declare some local functions static
git bisect bad 86b19520e7ef5539eb081c76fe2f5c955180205f
# good: [87be3e2d31c01d3858bff43ab663769db03aab17] powerpc/8xx: Do not allow Pinned TLBs with STRICT_KERNEL_RWX or DEBUG_PAGEALLOC
git bisect good 87be3e2d31c01d3858bff43ab663769db03aab17
# good: [e611939fc8ec13387018df88083de7102a438730] powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs
git bisect good e611939fc8ec13387018df88083de7102a438730
# bad: [95902e6c8864d39b09134dcaa3c99d8161d1deea] powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32
git bisect bad 95902e6c8864d39b09134dcaa3c99d8161d1deea
# bad: [3184cc4b6f6a1dc0c1745aafe2b14b1206ef3187] powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32
git bisect bad 3184cc4b6f6a1dc0c1745aafe2b14b1206ef3187
# first bad commit: [3184cc4b6f6a1dc0c1745aafe2b14b1206ef3187] powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32

* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
  2017-09-19 18:24 Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu Guenter Roeck
@ 2017-09-20  3:05 ` Michael Ellerman
  2017-09-20  3:45   ` Guenter Roeck
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2017-09-20  3:05 UTC (permalink / raw)
  To: Guenter Roeck, Christophe Leroy
  Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras

Guenter Roeck <linux@roeck-us.net> writes:

> Hi,
>
> I see the following traceback when running an SMP image based on
> 85xx/mpc85xx_cds_defconfig in qemu.
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
> task: cf830000 task.stack: cf82e000
> NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
> REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
> MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000
>
> GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001 
> GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000 
> GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000 
> GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000 
> NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
> LR [c00a9634] smp_call_function+0x3c/0x50
> Call Trace:
> [cf82fe90] [00000010] 0x10 (unreliable)
> [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
> [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
> [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
> [cf82ff20] [c001484c] free_initmem+0x20/0x4c
> [cf82ff30] [c000316c] kernel_init+0x1c/0x108
> [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
> Instruction dump:
> 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac 
> 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78 
> ---[ end trace 7da7bdcf8b15ddb3 ]---

Thanks.

I guess the system still runs OK otherwise, you're just seeing the warning?

> A complete log is available at:
> http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
>
> Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
> after freeing unused memory on PPC32"). Bisect log is attached. A quick look
> suggests that mark_initmem_nx() is called with interrupts disabled, which
> triggers the traceback.

Hmm. Yes the MSR says you have interrupts disabled (EE missing).

But I don't see why. start_kernel() did local_irq_enable(), so I don't
understand why we got to mark_initmem_nx() with them disabled. I hope
Christophe has some idea.
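
For reference, the MSR value in the trace can be decoded by hand. The
sketch below assumes the Book-E bit positions (CE = bit 17, EE = bit 15,
PR = bit 14, ME = bit 12, counting from the least significant bit). The
helper names are hypothetical and not part of any kernel interface.

```python
# Subset of Book-E MSR bits. The positions are an assumption stated in the
# text above; check them against the headers for the core in question.
MSR_BITS = {
    "CE": 1 << 17,  # critical interrupts enabled
    "EE": 1 << 15,  # external interrupts enabled
    "PR": 1 << 14,  # problem (user) state
    "ME": 1 << 12,  # machine check enabled
}


def decode_msr(msr):
    """Return the names of the known bits that are set in msr."""
    return [name for name, mask in MSR_BITS.items() if msr & mask]


def irqs_enabled(msr):
    """External interrupts are enabled only when EE is set."""
    return bool(msr & MSR_BITS["EE"])


if __name__ == "__main__":
    msr = 0x00021000  # value reported in the traceback
    print(decode_msr(msr), irqs_enabled(msr))
```

With the value from the trace, only CE and ME are set, which matches the
"<CE,ME>" annotation printed by the kernel; EE is clear.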

cheers

* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
  2017-09-20  3:05 ` Michael Ellerman
@ 2017-09-20  3:45   ` Guenter Roeck
  2017-09-21 18:44     ` Christophe LEROY
  0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2017-09-20  3:45 UTC (permalink / raw)
  To: Michael Ellerman, Christophe Leroy
  Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras

On 09/19/2017 08:05 PM, Michael Ellerman wrote:
> Guenter Roeck <linux@roeck-us.net> writes:
> 
>> Hi,
>>
>> I see the following traceback when running an SMP image based on
>> 85xx/mpc85xx_cds_defconfig in qemu.
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
>> task: cf830000 task.stack: cf82e000
>> NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
>> REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
>> MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000
>>
>> GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
>> GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
>> GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
>> NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
>> LR [c00a9634] smp_call_function+0x3c/0x50
>> Call Trace:
>> [cf82fe90] [00000010] 0x10 (unreliable)
>> [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
>> [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
>> [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
>> [cf82ff20] [c001484c] free_initmem+0x20/0x4c
>> [cf82ff30] [c000316c] kernel_init+0x1c/0x108
>> [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
>> Instruction dump:
>> 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
>> 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78
>> ---[ end trace 7da7bdcf8b15ddb3 ]---
> 
> Thanks.
> 
> I guess the system still runs OK otherwise, you're just seeing the warning?
> 
Yes, though I am not sure if that is because there is only one active CPU (there is
still only one if I say "-smp 4" on the qemu command line).

>> A complete log is available at:
>> http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
>>
>> Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
>> after freeing unused memory on PPC32"). Bisect log is attached. A quick look
>> suggests that mark_initmem_nx() is called with interrupts disabled, which
>> triggers the traceback.
> 
> Hmm. Yes the MSR says you have interrupts disabled (EE missing).
> 
> But I don't see why. start_kernel() did local_irq_enable(), so I don't
> understand why we got to mark_initmem_nx() with them disabled. I'll hope
> that Christophe has some idea.
> 
Good question. I only see this with one of 9 ppc emulations, with 85xx/mpc85xx_cds_defconfig
+CONFIG_DEVTMPFS=y +CONFIG_SMP=y. Maybe there is a platform specific init function
which leaves interrupts disabled. Question is which one that might be.

Guenter

* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
  2017-09-20  3:45   ` Guenter Roeck
@ 2017-09-21 18:44     ` Christophe LEROY
  2017-09-24 16:05       ` Guenter Roeck
  0 siblings, 1 reply; 6+ messages in thread
From: Christophe LEROY @ 2017-09-21 18:44 UTC (permalink / raw)
  To: Guenter Roeck, Michael Ellerman
  Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras



Le 20/09/2017 à 05:45, Guenter Roeck a écrit :
> On 09/19/2017 08:05 PM, Michael Ellerman wrote:
>> Guenter Roeck <linux@roeck-us.net> writes:
>>
>>> Hi,
>>>
>>> I see the following traceback when running an SMP image based on
>>> 85xx/mpc85xx_cds_defconfig in qemu.
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
>>> task: cf830000 task.stack: cf82e000
>>> NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
>>> REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
>>> MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000
>>>
>>> GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
>>> GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
>>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
>>> GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
>>> NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
>>> LR [c00a9634] smp_call_function+0x3c/0x50
>>> Call Trace:
>>> [cf82fe90] [00000010] 0x10 (unreliable)
>>> [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
>>> [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
>>> [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
>>> [cf82ff20] [c001484c] free_initmem+0x20/0x4c
>>> [cf82ff30] [c000316c] kernel_init+0x1c/0x108
>>> [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
>>> Instruction dump:
>>> 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
>>> 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78
>>> ---[ end trace 7da7bdcf8b15ddb3 ]---
>>
>> Thanks.
>>
>> I guess the system still runs OK otherwise, you're just seeing the warning?
>>
> Yes, though I am not sure if that is because there is only one active CPU (there is
> still only one if I say "-smp 4" on the qemu command line).
> 
>>> A complete log is available at:
>>> http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
>>>
>>> Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
>>> after freeing unused memory on PPC32"). Bisect log is attached. A quick look
>>> suggests that mark_initmem_nx() is called with interrupts disabled, which
>>> triggers the traceback.
>>
>> Hmm. Yes the MSR says you have interrupts disabled (EE missing).
>>
>> But I don't see why. start_kernel() did local_irq_enable(), so I don't
>> understand why we got to mark_initmem_nx() with them disabled. I'll hope
>> that Christophe has some idea.
>>
> Good question. I only see this with one of 9 ppc emulations, with 85xx/mpc85xx_cds_defconfig
> +CONFIG_DEVTMPFS=y +CONFIG_SMP=y. Maybe there is a platform specific init function
> which leaves interrupts disabled. Question is which one that might be.
> 

Unfortunately no, I have no idea. My three platforms (860, 885 and 8321)
are not SMP, so that warning would not appear, but I added a WARN_ON(1)
just before calling mark_initmem_nx(), and I can confirm that the MSR has
EE set on all three at that time.

So as you suggest, there must be something platform-specific leaving the
interrupts disabled.

Christophe


> Guenter

* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
  2017-09-21 18:44     ` Christophe LEROY
@ 2017-09-24 16:05       ` Guenter Roeck
  2017-09-25  6:36         ` Christophe LEROY
  0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2017-09-24 16:05 UTC (permalink / raw)
  To: Christophe LEROY, Michael Ellerman
  Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras

On 09/21/2017 11:44 AM, Christophe LEROY wrote:
> 
> 
> Le 20/09/2017 à 05:45, Guenter Roeck a écrit :
>> On 09/19/2017 08:05 PM, Michael Ellerman wrote:
>>> Guenter Roeck <linux@roeck-us.net> writes:
>>>
>>>> Hi,
>>>>
>>>> I see the following traceback when running an SMP image based on
>>>> 85xx/mpc85xx_cds_defconfig in qemu.
>>>>
>>>> ------------[ cut here ]------------
>>>> WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
>>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
>>>> task: cf830000 task.stack: cf82e000
>>>> NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
>>>> REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
>>>> MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000
>>>>
>>>> GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
>>>> GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
>>>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
>>>> GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
>>>> NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
>>>> LR [c00a9634] smp_call_function+0x3c/0x50
>>>> Call Trace:
>>>> [cf82fe90] [00000010] 0x10 (unreliable)
>>>> [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
>>>> [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
>>>> [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
>>>> [cf82ff20] [c001484c] free_initmem+0x20/0x4c
>>>> [cf82ff30] [c000316c] kernel_init+0x1c/0x108
>>>> [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
>>>> Instruction dump:
>>>> 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
>>>> 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78
>>>> ---[ end trace 7da7bdcf8b15ddb3 ]---
>>>
>>> Thanks.
>>>
>>> I guess the system still runs OK otherwise, you're just seeing the warning?
>>>
>> Yes, though I am not sure if that is because there is only one active CPU (there is
>> still only one if I say "-smp 4" on the qemu command line).
>>
>>>> A complete log is available at:
>>>> http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
>>>>
>>>> Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
>>>> after freeing unused memory on PPC32"). Bisect log is attached. A quick look
>>>> suggests that mark_initmem_nx() is called with interrupts disabled, which
>>>> triggers the traceback.
>>>
>>> Hmm. Yes the MSR says you have interrupts disabled (EE missing).
>>>
>>> But I don't see why. start_kernel() did local_irq_enable(), so I don't
>>> understand why we got to mark_initmem_nx() with them disabled. I'll hope
>>> that Christophe has some idea.
>>>
>> Good question. I only see this with one of 9 ppc emulations, with 85xx/mpc85xx_cds_defconfig
>> +CONFIG_DEVTMPFS=y +CONFIG_SMP=y. Maybe there is a platform specific init function
>> which leaves interrupts disabled. Question is which one that might be.
>>
> 
> Unfortunately no, I have no idea. My three platforms (860, 885 and 8321) are not SMP, so that warning would not appear, but I added a WARN_ON(1) just before calling mark_initmem_nx(), and I can confirm that the MSR has EE set on all three at that time.
> 

You should still be able to compile and run an SMP kernel. mpc85xx_cds_defconfig
without CONFIG_SMP=y does not show the warning either.

Turns out interrupts are disabled in change_page_attr(), called by mark_initmem_nx().
change_page_attr() calls flush_tlb_kernel_range() with interrupts disabled.
This only happens if CONFIG_PPC_MMU_NOHASH=y.
Given that, I would assume that this will be seen with every 32 bit ppc build which has
CONFIG_SMP=y and CONFIG_PPC_MMU_NOHASH=y.

Maybe the problem was really introduced with commit e611939fc8ec1 ("powerpc/mm: Ensure
change_page_attr() doesn't invalidate pinned TLBs"). From the context it appears that
flush_tlb_kernel_range() should not be called with interrupts disabled.
Indeed, moving flush_tlb_kernel_range() outside the irq disabled code fixes
the problem for me.
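
The ordering problem described above can be illustrated with a toy model.
This is not kernel code and not the actual patch: the thread does not show
the body of change_page_attr(), so the class and both function variants
below are hypothetical stand-ins that only reproduce the order of
operations (IRQs off, PTE update, TLB flush, IRQs back on).

```python
class ToyCpu:
    """Minimal stand-in for one CPU's interrupt state (illustration only)."""

    def __init__(self, smp=True):
        self.smp = smp
        self.irqs_off = False
        self.warnings = []

    def local_irq_save(self):
        # Remember the previous state, then disable interrupts.
        flags = self.irqs_off
        self.irqs_off = True
        return flags

    def local_irq_restore(self, flags):
        self.irqs_off = flags

    def flush_tlb_kernel_range(self):
        # In the trace, the SMP flush goes through smp_call_function(),
        # which warns when it is entered with interrupts disabled.
        if self.smp and self.irqs_off:
            self.warnings.append("smp_call_function with IRQs disabled")


def change_page_attr_buggy(cpu):
    flags = cpu.local_irq_save()
    # ... page table entry would be updated here ...
    cpu.flush_tlb_kernel_range()  # flush inside the IRQ-disabled region
    cpu.local_irq_restore(flags)


def change_page_attr_fixed(cpu):
    flags = cpu.local_irq_save()
    # ... page table entry would be updated here ...
    cpu.local_irq_restore(flags)
    cpu.flush_tlb_kernel_range()  # flush after IRQs are restored


if __name__ == "__main__":
    for variant in (change_page_attr_buggy, change_page_attr_fixed):
        cpu = ToyCpu(smp=True)
        variant(cpu)
        print(variant.__name__, cpu.warnings)
```

In this model only the first variant records a warning, and only when the
CPU is marked as SMP, which mirrors the observation that the non-SMP build
does not show the traceback.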

Thanks,
Guenter

> So as you suggest, there must be something platform-specific leaving the interrupts disabled.
> 
> Christophe
> 
> 
>> Guenter
> 

* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
  2017-09-24 16:05       ` Guenter Roeck
@ 2017-09-25  6:36         ` Christophe LEROY
  0 siblings, 0 replies; 6+ messages in thread
From: Christophe LEROY @ 2017-09-25  6:36 UTC (permalink / raw)
  To: Guenter Roeck, Michael Ellerman
  Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras



Le 24/09/2017 à 18:05, Guenter Roeck a écrit :
> On 09/21/2017 11:44 AM, Christophe LEROY wrote:
>>
>>
>> Le 20/09/2017 à 05:45, Guenter Roeck a écrit :
>>> On 09/19/2017 08:05 PM, Michael Ellerman wrote:
>>>> Guenter Roeck <linux@roeck-us.net> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I see the following traceback when running an SMP image based on
>>>>> 85xx/mpc85xx_cds_defconfig in qemu.
>>>>>
>>>>> ------------[ cut here ]------------
>>>>> WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
>>>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
>>>>> task: cf830000 task.stack: cf82e000
>>>>> NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
>>>>> REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
>>>>> MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000
>>>>>
>>>>> GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
>>>>> GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
>>>>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
>>>>> GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
>>>>> NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
>>>>> LR [c00a9634] smp_call_function+0x3c/0x50
>>>>> Call Trace:
>>>>> [cf82fe90] [00000010] 0x10 (unreliable)
>>>>> [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
>>>>> [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
>>>>> [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
>>>>> [cf82ff20] [c001484c] free_initmem+0x20/0x4c
>>>>> [cf82ff30] [c000316c] kernel_init+0x1c/0x108
>>>>> [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
>>>>> Instruction dump:
>>>>> 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
>>>>> 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78
>>>>> ---[ end trace 7da7bdcf8b15ddb3 ]---
>>>>
>>>> Thanks.
>>>>
>>>> I guess the system still runs OK otherwise, you're just seeing the warning?
>>>>
>>> Yes, though I am not sure if that is because there is only one active CPU (there is
>>> still only one if I say "-smp 4" on the qemu command line).
>>>
>>>>> A complete log is available at:
>>>>> http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
>>>>>
>>>>> Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
>>>>> after freeing unused memory on PPC32"). Bisect log is attached. A quick look
>>>>> suggests that mark_initmem_nx() is called with interrupts disabled, which
>>>>> triggers the traceback.
>>>>
>>>> Hmm. Yes the MSR says you have interrupts disabled (EE missing).
>>>>
>>>> But I don't see why. start_kernel() did local_irq_enable(), so I don't
>>>> understand why we got to mark_initmem_nx() with them disabled. I'll hope
>>>> that Christophe has some idea.
>>>>
>>> Good question. I only see this with one of 9 ppc emulations, with 85xx/mpc85xx_cds_defconfig
>>> +CONFIG_DEVTMPFS=y +CONFIG_SMP=y. Maybe there is a platform specific init function
>>> which leaves interrupts disabled. Question is which one that might be.
>>>
>>
>> Unfortunately no, I have no idea. My three platforms (860, 885 and
>> 8321) are not SMP, so that warning would not appear, but I added a
>> WARN_ON(1) just before calling mark_initmem_nx(), and I can confirm
>> that the MSR has EE set on all three at that time.
>>
> 
> You should still be able to compile and run an SMP kernel.

SMP is not supported on the 8xx, and the 83xx has a hash MMU.

> mpc85xx_cds_defconfig without CONFIG_SMP=y does not show the warning either.

Yes, that's normal, as smp_call_function() is not called in that
case, hence my test with a WARN_ON(1) just before calling mark_initmem_nx().

> 
> Turns out interrupts are disabled in change_page_attr(), called by 
> mark_initmem_nx().

Oops, you're right, I missed it.

> change_page_attr() calls flush_tlb_kernel_range() with interrupts disabled.
> This only happens if CONFIG_PPC_MMU_NOHASH=y.
> Given that, I would assume that this will be seen with every 32 bit ppc build which has
> CONFIG_SMP=y and CONFIG_PPC_MMU_NOHASH=y.
> 
> Maybe the problem was really introduced with commit e611939fc8ec1 ("powerpc/mm: Ensure
> change_page_attr() doesn't invalidate pinned TLBs"). From the context it appears that
> flush_tlb_kernel_range() should not be called with interrupts disabled.

Right, it looks like that warning was introduced by this commit.
However, looking at flush_tlb_page(), which was the function called
before that commit, there was most likely already an issue with SMP,
because flush_tlb_page() called with a NULL vma results in a
warning in the SMP NOHASH version of flush_tlb_page().

> Indeed, moving flush_tlb_kernel_range() outside the irq disabled code fixes
> the problem for me.

Yes, that seems likely to be the solution.

Thanks
Christophe

> 
> Thanks,
> Guenter
> 
>> So as you suggest, there must be something platform-specific leaving the
>> interrupts disabled.
>>
>> Christophe
>>
>>
>>> Guenter
>>
