linux-kernel.vger.kernel.org archive mirror
* Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
@ 2017-06-14 19:26 Guenter Roeck
  2017-06-14 21:31 ` Frank Rowand
  2017-06-15  6:48 ` Michael Ellerman
  0 siblings, 2 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-06-14 19:26 UTC
  To: Frank Rowand; +Cc: linux-kernel, Rob Herring

Hi Frank,

your commit 'of: remove *phandle properties from expanded device tree' in
-next causes several of my ppc qemu tests to crash. Looking into qemu, it
sets "linux,phandle" properties for the mpic and for other devices.

The crashes are along the lines of

------------[ cut here ]------------
kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=32 
NUMA 
CoreNet Generic
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc5-next-20170614 #1
task: c000000000ad8cc0 task.stack: c000000000bec000
NIP: c000000000a8ca7c LR: c000000000a8ca6c CTR: c000000000a8ca20
REGS: c000000000befb90 TRAP: 0700   Not tainted  (4.12.0-rc5-next-20170614)
MSR: 0000000080021000 <CE,ME>
  CR: 22000042  XER: 00000000
  SOFTE: 0 
  GPR00: c000000000a8ca6c c000000000befe10 c000000000befa00 0000000000000000 
  GPR04: 0000000000000000 c000000000ac8458 c000000000ac8438 c000000000830658 
  GPR08: 0000000000000001 0000000000000001 0000000000000000 0000000000009531 
  GPR12: 0000000022000022 c00000003fff1000 0000000000000000 0000000000000000 
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
  GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
  GPR28: c000000000000300 c00000003fff2cc0 c000000000ac06e0 c000000000ac06e0 
  NIP [c000000000a8ca7c] .corenet_gen_pic_init+0x5c/0x90
  LR [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
  Call Trace:
  [c000000000befe10] [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90 (unreliable)
  [c000000000befe80] [c000000000a832f8] .init_IRQ+0x34/0x4c
  [c000000000befef0] [c000000000a7fc88] .start_kernel+0x2fc/0x500
  [c000000000beff90] [c000000000000554] start_here_common+0x1c/0x48
  Instruction dump:
  e8aa0068 39088268 39407002 38600000 7fa54800 39205002 7caa4f9e 4bffe9e9 
  60000000 2c230000 7d200026 55291ffe <0b090000> 4bfff335 60000000 3ca2ffdd 
  random: 0x600000003d220004 get_random_bytes called with crng_init=0
  ---[ end trace 0000000000000000 ]---

and are caused by the kernel not finding the mpic node anymore.

Any idea how to solve the problem ?

Bisect log is attached.

Thanks,
Guenter

---
# bad: [b14746170b0684005bab3e07893e6b91baf7dbf6] Add linux-next specific files for 20170614
# good: [32c1431eea4881a6b17bd7c639315010aeefa452] Linux 4.12-rc5
git bisect start 'HEAD' 'v4.12-rc5'
# good: [0500b956eedb4686b0420308ae01a74b00f9ab64] Merge remote-tracking branch 'crypto/master'
git bisect good 0500b956eedb4686b0420308ae01a74b00f9ab64
# bad: [4717c17660509cee9d3596eb19b99f3e26d57c36] Merge remote-tracking branch 'tip/auto-latest'
git bisect bad 4717c17660509cee9d3596eb19b99f3e26d57c36
# good: [f32807fd889514af115c32f597f59763d44ffae4] next-20170613/sound-asoc
git bisect good f32807fd889514af115c32f597f59763d44ffae4
# good: [8bf3df94bf566c7294b6f972cb5afa2d9a3a83f5] Merge remote-tracking branch 'iommu/next'
git bisect good 8bf3df94bf566c7294b6f972cb5afa2d9a3a83f5
# good: [e5c91c3569136b20783bd0799f026b89e4a2752a] Merge branch 'sched/core'
git bisect good e5c91c3569136b20783bd0799f026b89e4a2752a
# good: [3ff2be7e0e543ed1fbdd1a9f5ca49417be7b2a66] Merge branch 'x86/boot'
git bisect good 3ff2be7e0e543ed1fbdd1a9f5ca49417be7b2a66
# good: [2b37bbbc6291132aa8b08088ec31652eaf66ce6a] Merge remote-tracking branches 'spi/topic/rockchip', 'spi/topic/sh-msiof', 'spi/topic/spidev' and 'spi/topic/st-ssc4' into spi-next
git bisect good 2b37bbbc6291132aa8b08088ec31652eaf66ce6a
# good: [82a28f6c16030d04f5719889999f4fa9a35bcfc7] Merge branch 'x86/timers'
git bisect good 82a28f6c16030d04f5719889999f4fa9a35bcfc7
# bad: [d19a4961ac001b1284013ecff3deb6456a09abda] of: make __of_attach_node() static
git bisect bad d19a4961ac001b1284013ecff3deb6456a09abda
# good: [e5e9b5fae7e7d1fad87e4abb52f5f3d55c9f4e25] iio: proximity: as3935: add missing required spi-max-frequency
git bisect good e5e9b5fae7e7d1fad87e4abb52f5f3d55c9f4e25
# good: [d20dc1493db438fbbfb7733adc82f472dd8a0789] of: Support const and non-const use for to_of_node()
git bisect good d20dc1493db438fbbfb7733adc82f472dd8a0789
# good: [4811a1a7800bc59074e640a4fe9befdb668ae56f] Merge branch 'dt/property-move' into dt/next
git bisect good 4811a1a7800bc59074e640a4fe9befdb668ae56f
# bad: [f847192ce4061dc7e9087eb9136a38e3bf582efb] of: remove *phandle properties from expanded device tree
git bisect bad f847192ce4061dc7e9087eb9136a38e3bf582efb
# good: [6fedb069def034a4738584920fe94535ab29637a] of: Provide dummy of_device_compatible_match() for compile-testing
git bisect good 6fedb069def034a4738584920fe94535ab29637a
# first bad commit: [f847192ce4061dc7e9087eb9136a38e3bf582efb] of: remove *phandle properties from expanded device tree


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 19:26 Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree' Guenter Roeck
@ 2017-06-14 21:31 ` Frank Rowand
  2017-06-14 22:35   ` Guenter Roeck
  2017-06-15  6:48 ` Michael Ellerman
  1 sibling, 1 reply; 9+ messages in thread
From: Frank Rowand @ 2017-06-14 21:31 UTC
  To: Guenter Roeck, Frank Rowand; +Cc: linux-kernel, Rob Herring

Hi Guenter,

Thanks for reporting this.


On 06/14/17 12:26, Guenter Roeck wrote:
> Hi Frank,
> 
> your commit 'of: remove *phandle properties from expanded device tree' in
> -next causes several of my ppc qemu tests to crash. Looking into qemu, it
> sets "linux,phandle" properties for the mpic and for other devices.
> 
> The crashes are along the lines of
> 
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=32 
> NUMA 
> CoreNet Generic
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc5-next-20170614 #1
> task: c000000000ad8cc0 task.stack: c000000000bec000
> NIP: c000000000a8ca7c LR: c000000000a8ca6c CTR: c000000000a8ca20
> REGS: c000000000befb90 TRAP: 0700   Not tainted  (4.12.0-rc5-next-20170614)
> MSR: 0000000080021000 <CE,ME>
>   CR: 22000042  XER: 00000000
>   SOFTE: 0 
>   GPR00: c000000000a8ca6c c000000000befe10 c000000000befa00 0000000000000000 
>   GPR04: 0000000000000000 c000000000ac8458 c000000000ac8438 c000000000830658 
>   GPR08: 0000000000000001 0000000000000001 0000000000000000 0000000000009531 
>   GPR12: 0000000022000022 c00000003fff1000 0000000000000000 0000000000000000 
>   GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>   GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>   GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>   GPR28: c000000000000300 c00000003fff2cc0 c000000000ac06e0 c000000000ac06e0 
>   NIP [c000000000a8ca7c] .corenet_gen_pic_init+0x5c/0x90
>   LR [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
>   Call Trace:
>   [c000000000befe10] [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90 (unreliable)
>   [c000000000befe80] [c000000000a832f8] .init_IRQ+0x34/0x4c
>   [c000000000befef0] [c000000000a7fc88] .start_kernel+0x2fc/0x500
>   [c000000000beff90] [c000000000000554] start_here_common+0x1c/0x48
>   Instruction dump:
>   e8aa0068 39088268 39407002 38600000 7fa54800 39205002 7caa4f9e 4bffe9e9 
>   60000000 2c230000 7d200026 55291ffe <0b090000> 4bfff335 60000000 3ca2ffdd 
>   random: 0x600000003d220004 get_random_bytes called with crng_init=0
>   ---[ end trace 0000000000000000 ]---
> 
> and are caused by the kernel not finding the mpic node anymore.
> 
> Any idea how to solve the problem ?

The BUG() is triggered if mpic_alloc() returns NULL.

I looked through mpic_alloc(), and the functions that it calls, and nothing
is jumping out as being related to phandles.

Can you add some printks to mpic_alloc() to determine what problem is
causing it to return NULL?

Can you also include the console messages before the "[ cut here ]" line?

-Frank

> 
> Bisect log is attached.
> 
> Thanks,
> Guenter

< snip >


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 21:31 ` Frank Rowand
@ 2017-06-14 22:35   ` Guenter Roeck
  2017-06-15  0:45     ` Frank Rowand
  0 siblings, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-06-14 22:35 UTC
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
> Hi Guenter,
> 
> Thanks for reporting this.
> 
> 
> On 06/14/17 12:26, Guenter Roeck wrote:
> > Hi Frank,
> > 
> > your commit 'of: remove *phandle properties from expanded device tree' in
> > -next causes several of my ppc qemu tests to crash. Looking into qemu, it
> > sets "linux,phandle" properties for the mpic and for other devices.
> > 
> > The crashes are along the lines of
> > 
> > ------------[ cut here ]------------
> > kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
> > Oops: Exception in kernel mode, sig: 5 [#1]
> > SMP NR_CPUS=32 
> > NUMA 
> > CoreNet Generic
> > Modules linked in:
> > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc5-next-20170614 #1
> > task: c000000000ad8cc0 task.stack: c000000000bec000
> > NIP: c000000000a8ca7c LR: c000000000a8ca6c CTR: c000000000a8ca20
> > REGS: c000000000befb90 TRAP: 0700   Not tainted  (4.12.0-rc5-next-20170614)
> > MSR: 0000000080021000 <CE,ME>
> >   CR: 22000042  XER: 00000000
> >   SOFTE: 0 
> >   GPR00: c000000000a8ca6c c000000000befe10 c000000000befa00 0000000000000000 
> >   GPR04: 0000000000000000 c000000000ac8458 c000000000ac8438 c000000000830658 
> >   GPR08: 0000000000000001 0000000000000001 0000000000000000 0000000000009531 
> >   GPR12: 0000000022000022 c00000003fff1000 0000000000000000 0000000000000000 
> >   GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> >   GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> >   GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> >   GPR28: c000000000000300 c00000003fff2cc0 c000000000ac06e0 c000000000ac06e0 
> >   NIP [c000000000a8ca7c] .corenet_gen_pic_init+0x5c/0x90
> >   LR [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
> >   Call Trace:
> >   [c000000000befe10] [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90 (unreliable)
> >   [c000000000befe80] [c000000000a832f8] .init_IRQ+0x34/0x4c
> >   [c000000000befef0] [c000000000a7fc88] .start_kernel+0x2fc/0x500
> >   [c000000000beff90] [c000000000000554] start_here_common+0x1c/0x48
> >   Instruction dump:
> >   e8aa0068 39088268 39407002 38600000 7fa54800 39205002 7caa4f9e 4bffe9e9 
> >   60000000 2c230000 7d200026 55291ffe <0b090000> 4bfff335 60000000 3ca2ffdd 
> >   random: 0x600000003d220004 get_random_bytes called with crng_init=0
> >   ---[ end trace 0000000000000000 ]---
> > 
> > and are caused by the kernel not finding the mpic node anymore.
> > 
> > Any idea how to solve the problem ?
> 
> The BUG() is triggered if mpic_alloc() returns NULL.
> 
Yes, I got that far as well ...

> I looked through mpic_alloc(), and the functions that it calls, and nothing
> is jumping out as being related to phandles.
> 
> Can you add some printks to mpic_alloc() to determine what problem is
> causing it to return NULL?
> 
I'll try later tonight.

> Can you also include the console messages before the "[ cut here ]" line?
> 
http://kerneltests.org/builders

Check qemu test results in the 'next' column. ppc and ppc64 show related console
messages.

Guenter


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 22:35   ` Guenter Roeck
@ 2017-06-15  0:45     ` Frank Rowand
  2017-06-15  2:10       ` Guenter Roeck
  2017-06-15  4:12       ` Guenter Roeck
  0 siblings, 2 replies; 9+ messages in thread
From: Frank Rowand @ 2017-06-15  0:45 UTC
  To: Guenter Roeck; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/14/17 15:35, Guenter Roeck wrote:
> On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
>> Hi Guenter,

< snip >

>> Can you also include the console messages before the "[ cut here ]" line?
>>
> http://kerneltests.org/builders
> 
> Check qemu test results in the 'next' column. ppc and ppc64 show related console
> messages.

Thanks for the pointer.  Unfortunately I did not see any additional clues (yet)
in the full log.

I tried to compare the failed boot to a good boot, but did not find a console
log for a good boot.  I started at the qemu-ppc64-next builder page:

  http://kerneltests.org/builders/qemu-ppc64-next

and looked at recent tests that were successful (like #645).  But the log file
link from that test does not show the contents of the console for tests that
pass.  Is there some way to see what the console for a successful test looks
like?

-Frank


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15  0:45     ` Frank Rowand
@ 2017-06-15  2:10       ` Guenter Roeck
  2017-06-15  4:12       ` Guenter Roeck
  1 sibling, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-06-15  2:10 UTC
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

[-- Attachment #1: Type: text/plain, Size: 1082 bytes --]

On Wed, Jun 14, 2017 at 05:45:52PM -0700, Frank Rowand wrote:
> On 06/14/17 15:35, Guenter Roeck wrote:
> > On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
> >> Hi Guenter,
> 
> < snip >
> 
> >> Can you also include the console messages before the "[ cut here ]" line?
> >>
> > http://kerneltests.org/builders
> > 
> > Check qemu test results in the 'next' column. ppc and ppc64 show related console
> > messages.
> 
> Thanks for the pointer.  Unfortunately I did not see any additional clues (yet)
> in the full log.
> 
> I tried to compare the failed boot to a good boot, but did not find a console
> log for a good boot.  I started at the qemu-ppc64-next builder page:
> 
>   http://kerneltests.org/builders/qemu-ppc64-next
> 
> and looked at recent tests that were successful (like #645).  But the log file
> link from that test does not show the contents of the console for tests that
> pass.  Is there some way to see what the console for a successful test looks
> like?
> 

See attached. I am on the road; I'll try to do some debugging later from home.

Guenter


[-- Attachment #2: ppc64.log --]
[-- Type: text/plain, Size: 6407 bytes --]

MMU: Supported page sizes
         4 KB as direct
      4096 KB as direct
     16384 KB as direct
     65536 KB as direct
    262144 KB as direct
   1048576 KB as direct
MMU: Book3E HW tablewalk not supported
Linux version 4.12.0-rc4 (groeck@mars) (gcc version 4.8.1 (GCC) ) #1 SMP Wed Jun 14 19:06:31 PDT 2017
Found initrd at 0xc000000004000000:0xc000000004200c00
Using CoreNet Generic machine description
bootconsole [udbg0] enabled
CPU maps initialized for 1 thread per core
-----------------------------------------------------
phys_mem_size     = 0x40000000
dcache_bsize      = 0x40
icache_bsize      = 0x40
cpu_features      = 0x00180400181802c0
  possible        = 0x00180480581802c0
  always          = 0x00180400581802c0
cpu_user_features = 0xcc008000 0x08000000
mmu_features      = 0x000a0010
firmware_features = 0x0000000000000000
-----------------------------------------------------
numa:   NODE_DATA [mem 0x3ffd6740-0x3ffdffff]
CoreNet Generic board
Zone ranges:
  DMA      [mem 0x0000000000000000-0x000000003fffffff]
  DMA32    empty
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000000000-0x000000003fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
MMU: Allocated 2112 bytes of context maps for 255 contexts
percpu: Embedded 18 pages/cpu @c00000003fe00000 s35544 r0 d38184 u1048576
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 258560
Policy zone: DMA
Kernel command line: rdinit=/sbin/init console=tty console=ttyS0 doreboot
PID hash table entries: 4096 (order: 3, 32768 bytes)
Memory: 952996K/1048576K available (8224K kernel code, 1280K rwdata, 2448K rodata, 368K init, 441K bss, 95580K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
	RCU debugfs-based tracing is enabled.
	RCU restricting CPUs from NR_CPUS=32 to nr_cpu_ids=1.
RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS:512 nr_irqs:512 16
mpic: Setting up MPIC " OpenPIC  " version 1.2 at e0040000, max 1 CPUs
mpic: ISU size: 512, shift: 9, mask: 1ff
mpic: Initializing for 512 sources
clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x5c4093a7d1, max_idle_ns: 440795210635 ns
clocksource: timebase mult[2800000] shift[24] registered
Console: colour dummy device 80x25
console [tty0] enabled
pid_max: default: 32768 minimum: 301
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes)
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 1 CPU
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 256 (order: 2, 16384 bytes)
NET: Registered protocol family 16
Machine: MPC8544DS
SoC family: QorIQ
SoC ID: svr:0x00000000, Revision: 0.0
PCI: Probing PCI hardware
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
PTP clock support registered
Advanced Linux Sound Architecture Driver Initialized.
clocksource: Switched to clocksource timebase
NET: Registered protocol family 2
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Trying to unpack rootfs image as initramfs...
Freeing initrd memory: 2048K
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(0.224:1): state=initialized audit_enabled=0 res=1
workingset: timestamp_bits=54 max_order=18 bucket_order=0
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
ntfs: driver 2.1.32 [Flags: R/O].
jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
io scheduler mq-deadline registered
io scheduler kyber registered
Serial: 8250/16550 driver, 2 ports, IRQ sharing enabled
console [ttyS0] disabled
serial8250.0: ttyS0 at MMIO 0xe0004500 (irq = 42, base_baud = 115200) is a 16550A
console [ttyS0] enabled
console [ttyS0] enabled
bootconsole [udbg0] disabled
bootconsole [udbg0] disabled
brd: module loaded
loop: module loaded
st: Version 20160209, fixed bufsize 32768, s/g segs 256
libphy: Fixed MDIO Bus: probed
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
usbcore: registered new interface driver usb-storage
i2c /dev entries driver
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
sdhci-pltfm: SDHCI platform and OF driver helper
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
ipip: IPv4 and MPLS over IPv4 tunneling driver
Initializing XFRM netlink socket
NET: Registered protocol family 10
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
Key type dns_resolver registered
hctosys: unable to open rtc device (rtc0)
ALSA device list:
  No soundcards found.
Freeing unused kernel memory: 368K
This architecture does not have kernel memory protection.

Boot successful.
Rebooting.
swapoff: can't open '/etc/fstab': No such file or directory
umount: can't umount /: Invalid argument

The system is going down NOW!

Sent SIGTERM to all processes

Sent SIGKILL to all processes

Requesting system reboot
reboot: Restarting system


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15  0:45     ` Frank Rowand
  2017-06-15  2:10       ` Guenter Roeck
@ 2017-06-15  4:12       ` Guenter Roeck
  2017-06-15  7:58         ` Frank Rowand
  1 sibling, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-06-15  4:12 UTC
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/14/2017 05:45 PM, Frank Rowand wrote:
> On 06/14/17 15:35, Guenter Roeck wrote:
>> On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
>>> Hi Guenter,
> 
> < snip >
> 
>>> Can you also include the console messages before the "[ cut here ]" line?
>>>
>> http://kerneltests.org/builders
>>
>> Check qemu test results in the 'next' column. ppc and ppc64 show related console
>> messages.
> 
> Thanks for the pointer.  Unfortunately I did not see any additional clues (yet)
> in the full log.
> 
> I tried to compare the failed boot to a good boot, but did not find a console
> log for a good boot.  I started at the qemu-ppc64-next builder page:
> 
>    http://kerneltests.org/builders/qemu-ppc64-next
> 
> and looked at recent tests that were successful (like #645).  But the log file
> link from that test does not show the contents of the console for tests that
> pass.  Is there some way to see what the console for a successful test looks
> like?
> 
> -Frank
> 
Good (v4.12-rc4):

...
NR_IRQS:512 nr_irqs:512 16
OF: Checking node /
OF:   node '/' compatible '' type 'open-pic' name '' score 0
OF:   node '/' compatible 'open-pic' type '' name '' score 0
OF: Checking node /pci@e0008000
OF:   node '/pci@e0008000' compatible '' type 'open-pic' name '' score 0
OF:   node '/pci@e0008000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000
OF:   node '/soc@e0000000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/msi@41600
OF:   node '/soc@e0000000/msi@41600' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/msi@41600' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/global-utilities@e0000
OF:   node '/soc@e0000000/global-utilities@e0000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/global-utilities@e0000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/serial@4500
OF:   node '/soc@e0000000/serial@4500' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/serial@4500' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/pic@40000
OF:     type match
OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 2
OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
mpic: Setting up MPIC " OpenPIC  " version 1.2 at e0040000, max 1 CPUs
mpic: ISU size: 512, shift: 9, mask: 1ff
mpic: Initializing for 512 sources

bad:

NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
OF: Checking node /
OF:   node '/' compatible '' type 'open-pic' name '' score 0
OF:   node '/' compatible 'open-pic' type '' name '' score 0
OF: Checking node /pci@e0008000
OF:   node '/pci@e0008000' compatible '' type 'open-pic' name '' score 0
OF:   node '/pci@e0008000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000
OF:   node '/soc@e0000000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/msi@41600
OF:   node '/soc@e0000000/msi@41600' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/msi@41600' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/global-utilities@e0000
OF:   node '/soc@e0000000/global-utilities@e0000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/global-utilities@e0000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/serial@4500
OF:   node '/soc@e0000000/serial@4500' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/serial@4500' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/pic@40000
OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /aliases
OF:   node '/aliases' compatible '' type 'open-pic' name '' score 0
OF:   node '/aliases' compatible 'open-pic' type '' name '' score 0
OF: Checking node /cpus
OF:   node '/cpus' compatible '' type 'open-pic' name '' score 0
OF:   node '/cpus' compatible 'open-pic' type '' name '' score 0
OF: Checking node /cpus/PowerPC,8544@0
OF:   node '/cpus/PowerPC,8544@0' compatible '' type 'open-pic' name '' score 0
OF:   node '/cpus/PowerPC,8544@0' compatible 'open-pic' type '' name '' score 0
OF: Checking node /chosen
OF:   node '/chosen' compatible '' type 'open-pic' name '' score 0
OF:   node '/chosen' compatible 'open-pic' type '' name '' score 0
OF: Checking node /memory
OF:   node '/memory' compatible '' type 'open-pic' name '' score 0
OF:   node '/memory' compatible 'open-pic' type '' name '' score 0
No matching open-pic node
------------[ cut here ]------------
kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!

So the difference is in __of_device_is_compatible(), in the check after

         /* Matching type is better than matching name */

Further debugging shows that device->type is NULL in the bad case.

OF: Checking node /soc@e0000000/pic@40000
OF:     trying type match open-pic - <NULL>
OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

Do you need more information ?

Thanks,
Guenter


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 19:26 Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree' Guenter Roeck
  2017-06-14 21:31 ` Frank Rowand
@ 2017-06-15  6:48 ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2017-06-15  6:48 UTC
  To: Guenter Roeck, Frank Rowand; +Cc: linux-kernel, Rob Herring

Guenter Roeck <linux@roeck-us.net> writes:

> Hi Frank,
>
> your commit 'of: remove *phandle properties from expanded device tree' in
> -next causes several of my ppc qemu tests to crash. Looking into qemu, it
> sets "linux,phandle" properties for the mpic and for other devices.

Yeah this broke ~50% of my machines.

Various back traces, or in some cases nothing at all.

cheers

eg:

   XICS: Cannot find a Source Controller !
   ------------[ cut here ]------------
   kernel BUG at arch/powerpc/sysdev/xics/xics-common.c:58!
   Oops: Exception in kernel mode, sig: 5 [#1]
   SMP NR_CPUS=2048 
   NUMA 
   pSeries
   Modules linked in:
   CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.12.0-rc5-gcc5-next-20170614-gb147461 #1
   task: c000000000eb1180 task.stack: c000000001084000
   NIP: c00000000008d780 LR: c00000000008d770 CTR: 0000000000000000
   REGS: c000000001087a40 TRAP: 0700   Tainted: G        W        (4.12.0-rc5-gcc5-next-20170614-gb147461)
   MSR: 8000000000021032 <SF,ME,IR,DR,RI>
     CR: 24000422  XER: 00000001
   CFAR: c0000000008dd280 SOFTE: 0 
   GPR00: c00000000008d770 c000000001087cc0 c000000001086400 0000000000000000 
   GPR04: 0000000000000000 0000000000000000 c000000000ad14c8 0000000000000002 
   GPR08: 0000000000000002 0000000000000001 0000000000000002 0000000000000000 
   GPR12: 0000000022000424 c000000006af0000 00000000054dd288 00000000054b5618 
   GPR16: 00000000054b5320 00000000054b59e8 000000000554dd20 0000000000000060 
   GPR20: 000000000462eea0 0000000001b56c80 0000000000000040 0000000000000000 
   GPR24: 0000000004814000 0000000005aa0028 0000000004814000 0000000005ab158e 
   GPR28: ffffffffd00dfeed c000000000e115e0 0000000000000000 c000000000eb54f4 
   NIP [c00000000008d780] .xics_update_irq_servers+0x40/0x140
   LR [c00000000008d770] .xics_update_irq_servers+0x30/0x140
   Call Trace:
   [c000000001087cc0] [c00000000008d770] .xics_update_irq_servers+0x30/0x140 (unreliable)
   [c000000001087d50] [c000000000db85f0] .xics_init+0x134/0x188
   [c000000001087dd0] [c000000000dbdc64] .pseries_init_irq+0x48/0x230
   [c000000001087e80] [c000000000da8dcc] .init_IRQ+0x3c/0x50
   [c000000001087ef0] [c000000000da44e4] .start_kernel+0x31c/0x528
   [c000000001087f90] [c00000000000b070] start_here_common+0x1c/0x4ac
   Instruction dump:
   f821ff71 60000000 60000000 3d02ffe3 38800000 3be8f0f4 e87f0002 4884fa85 
   60000000 7c690074 7c7e1b78 7929d182 <0b090000> e93f0002 3d02000b 3c82ffc2 
   ---[ end trace 523b05d3a02887f6 ]---


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15  4:12       ` Guenter Roeck
@ 2017-06-15  7:58         ` Frank Rowand
  2017-06-15  9:53           ` Guenter Roeck
  0 siblings, 1 reply; 9+ messages in thread
From: Frank Rowand @ 2017-06-15  7:58 UTC
  To: Guenter Roeck; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/14/17 21:12, Guenter Roeck wrote:

< snip >

> Good (v4.12-rc4):
> 

< snip >

> OF: Checking node /soc@e0000000/pic@40000
> OF:     type match
> OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 2
> OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

< snip >

> 
> bad:

< snip >

> OF: Checking node /soc@e0000000/pic@40000
> OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
> OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

< snip >

> No matching open-pic node
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
> 
> So the difference is in __of_device_is_compatible(), in the check after
> 
>         /* Matching type is better than matching name */
> 
> Further debugging shows that device->type is NULL in the bad case.
> 
> OF: Checking node /soc@e0000000/pic@40000
> OF:     trying type match open-pic - <NULL>
> OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
> OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
> 
> Do you need more information ?

I think I know what part of my patch is causing the problem.

Can you try the following patch to see if it fixes the failure in
__of_device_is_compatible()?

If this fixes the failure, then I know what is going on.  If it works
then I will have to rework my original patch in a different way than
this quick hack.

-Frank



---
 drivers/of/dynamic.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

Index: b/drivers/of/dynamic.c
===================================================================
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -218,6 +218,20 @@ int of_property_notify(int action, struc
 
 static void __of_attach_node(struct device_node *np)
 {
+	const __be32 *phandle;
+	int sz;
+
+	/* use "<NULL>" to be consistent with populate_node() */
+	np->name = __of_get_property(np, "name", NULL) ? : "<NULL>";
+	np->type = __of_get_property(np, "device_type", NULL) ? : "<NULL>";
+
+	phandle = __of_get_property(np, "phandle", &sz);
+	if (!phandle)
+		phandle = __of_get_property(np, "linux,phandle", &sz);
+	if (IS_ENABLED(CONFIG_PPC_PSERIES) && !phandle)
+		phandle = __of_get_property(np, "ibm,phandle", &sz);
+	np->phandle = (phandle && (sz >= 4)) ? be32_to_cpup(phandle) : 0;
+
 	np->child = NULL;
 	np->sibling = np->parent->child;
 	np->parent->child = np;
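
The property lookup order in the hunk above can be modelled outside the kernel. What follows is only a userspace sketch under stated assumptions: `struct model_prop`, `model_find()`, `model_be32()` and `model_phandle()` are hypothetical names invented for illustration, not kernel APIs, and the `pseries` flag stands in for `IS_ENABLED(CONFIG_PPC_PSERIES)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device tree property: name plus raw bytes. */
struct model_prop {
	const char *name;
	const uint8_t *value;
	size_t len;
};

/* Hypothetical lookup helper; returns NULL when the property is absent. */
static const struct model_prop *model_find(const struct model_prop *props,
					   size_t nprops, const char *name)
{
	assert(props != NULL || nprops == 0);
	for (size_t i = 0; i < nprops; i++)
		if (strcmp(props[i].name, name) == 0)
			return &props[i];
	return NULL;
}

/* Decode one big-endian 32-bit cell (the role be32_to_cpup() plays in the hunk). */
static uint32_t model_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Same order as the hunk: "phandle", then "linux,phandle", then
 * "ibm,phandle" (only when pseries is nonzero). Returns 0 when no
 * property is found or the one found is shorter than 4 bytes.
 */
static uint32_t model_phandle(const struct model_prop *props, size_t nprops,
			      int pseries)
{
	const struct model_prop *p = model_find(props, nprops, "phandle");

	if (!p)
		p = model_find(props, nprops, "linux,phandle");
	if (pseries && !p)
		p = model_find(props, nprops, "ibm,phandle");
	return (p && p->len >= 4) ? model_be32(p->value) : 0;
}
```

As in the hunk, a "phandle" property that exists but is too short yields 0 and does not fall through to "linux,phandle".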


* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15  7:58         ` Frank Rowand
@ 2017-06-15  9:53           ` Guenter Roeck
  0 siblings, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-06-15  9:53 UTC (permalink / raw)
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/15/2017 12:58 AM, Frank Rowand wrote:
> On 06/14/17 21:12, Guenter Roeck wrote:
> 
> < snip >
> 
>> Good (v4.12-rc4):
>>
> 
> < snip >
> 
>> OF: Checking node /soc@e0000000/pic@40000
>> OF:     type match
>> OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 2
>> OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
> 
> < snip >
> 
>>
>> bad:
> 
> < snip >
> 
>> OF: Checking node /soc@e0000000/pic@40000
>> OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
>> OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
> 
> < snip >
> 
>> No matching open-pic node
>> ------------[ cut here ]------------
>> kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
>>
>> So the difference is in __of_device_is_compatible(), in the check after
>>
>>          /* Matching type is better than matching name */
>>
>> Further debugging shows that device->type is NULL in the bad case.
>>
>> OF: Checking node /soc@e0000000/pic@40000
>> OF:     trying type match open-pic - <NULL>
>> OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
>> OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
>>
>> Do you need more information ?
> 
> I think I know what part of my patch is causing the problem.
> 
> Can you try the following patch to see if it fixes the failure in
> __of_device_is_compatible()?
> 
> If this fixes the failure, then I know what is going on.  If it works
> then I will have to rework my original patch in a different way than
> this quick hack.
> 

Sorry, doesn't make a difference.

OF: Checking node /soc@e0000000/pic@40000
OF:     trying type match open-pic - <NULL>
OF:   node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
OF:   node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

I added a log message into __of_attach_node(); it is not called.

Guenter
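
For readers following the log output, the scoring behaviour it shows (score 2 on a type match, score 0 when device->type is NULL) can be reproduced with a small userspace model. This is a sketch under assumptions, not the kernel function: `struct model_node` and `model_score()` are hypothetical names, the node carries at most one compatible string, and plain `strcmp()` is used where the kernel uses its own comparison helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node: one compatible string, a type and a name. */
struct model_node {
	const char *compatible;	/* NULL if the node has no such property */
	const char *type;	/* NULL models the "bad" case in the logs */
	const char *name;
};

/*
 * Model of the match scoring: an empty or NULL criterion is skipped,
 * a requested criterion that does not match gives 0 immediately.
 */
static int model_score(const struct model_node *node, const char *compat,
		       const char *type, const char *name)
{
	int score = 0;

	assert(node != NULL);

	/* Compatible match has the highest priority */
	if (compat && compat[0]) {
		if (!node->compatible || strcmp(compat, node->compatible) != 0)
			return 0;
		score = INT_MAX / 2;
	}

	/* Matching type is better than matching name */
	if (type && type[0]) {
		if (!node->type || strcmp(type, node->type) != 0)
			return 0;
		score += 2;
	}

	/* Matching name adds one more point */
	if (name && name[0]) {
		if (!node->name || strcmp(name, node->name) != 0)
			return 0;
		score++;
	}

	return score;
}
```

With a node whose type is "open-pic", a query for type 'open-pic' scores 2, as in the good log; with type set to NULL the same query scores 0, which is the case that leads to "No matching open-pic node".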


end of thread, other threads:[~2017-06-15  9:53 UTC | newest]

Thread overview: 9+ messages
2017-06-14 19:26 Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree' Guenter Roeck
2017-06-14 21:31 ` Frank Rowand
2017-06-14 22:35   ` Guenter Roeck
2017-06-15  0:45     ` Frank Rowand
2017-06-15  2:10       ` Guenter Roeck
2017-06-15  4:12       ` Guenter Roeck
2017-06-15  7:58         ` Frank Rowand
2017-06-15  9:53           ` Guenter Roeck
2017-06-15  6:48 ` Michael Ellerman
