From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-next@vger.kernel.org,
	ppc-dev <linuxppc-dev@lists.ozlabs.org>,
	gregkh@linuxfoundation.org,
	Arjan van de Ven <arjan@linux.intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Milton Miller <miltonm@bga.com>,
	mikey@neuling.org,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	benh@kernel.crashing.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Subject: Re: Boot failure with next-20120208
Date: Tue, 14 Feb 2012 03:12:34 +0530
Message-ID: <4F3983CA.2070403@linux.vnet.ibm.com>
In-Reply-To: <20120212113805.c7e5d902c95a9d0f4037e12c@canb.auug.org.au>

On 02/12/2012 06:08 AM, Stephen Rothwell wrote:

> Hi all,
> 
> Just a quick note to say I got a boot OOPs with next-20120208 and 9 on a
> Power7 blade (my other PowerPC boot tests are ok.  I'll investigate this
> further on Monday.
> 
> The line referenced below is:
> 
> BUG_ON(!kobj || !kobj->sd || !attr);
> 
> in sysfs_create_file().
> 
> calling  .topology_init+0x0/0x1ac @ 1
> initcall 7_.async_cpu_up+0x0/0x40 returned 0 after 9765 usecs
> async_continuing @ 20 after 9765 usec
> ------------[ cut here ]------------
> kernel BUG at fs/sysfs/file.c:573!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=32 NUMA pSeries
> Modules linked in:
> NIP: c00000000024a35c LR: c0000000004ee050 CTR: c00000000083ca24
> REGS: c0000003fd9e7560 TRAP: 0700   Not tainted  (3.3.0-rc2-autokern1)
> MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 88002082  XER: 0000000f
> CFAR: c00000000024a370
> TASK = c0000003fd9e8000[20] 'kworker/u:6' THREAD: c0000003fd9e4000 CPU: 0
> GPR00: 0000000000000001 c0000003fd9e77e0 c000000000d19bb8 0000000000000000 
> GPR04: c000000000bf37a8 0000000000000008 8000000002096400 0000000000000000 
> GPR08: 0000000000000000 c000000000f80028 c000000000d52bd8 0000000000000000 
> GPR12: 0000000048002088 c00000000f33b000 0000000001affa78 00000000009aa000 
> GPR16: 0000000000e1f3c8 0000000002d517f0 0000000001aff984 0000000000000060 
> GPR20: 0000000000000000 ffffffffffffffff 0000000000000000 c000000000c45128 
> GPR24: 0000000000000000 0000000000000008 0000000000000000 c000000000c44200 
> GPR28: c000000000f80028 0000000000000008 c000000000c85038 0000000000000002 
> NIP [c00000000024a35c] .sysfs_create_file+0x1c/0x40
> LR [c0000000004ee050] .device_create_file+0x20/0x40
> Call Trace:
> [c0000003fd9e77e0] [c0000003fd9e78a0] 0xc0000003fd9e78a0 (unreliable)
> [c0000003fd9e7850] [c00000000083c9a4] .register_cpu_online+0x1d0/0x250
> [c0000003fd9e7900] [c00000000083ca8c] .sysfs_cpu_notify+0x68/0x28c
> [c0000003fd9e79b0] [c00000000083769c] .notifier_call_chain+0x9c/0x100
> [c0000003fd9e7a50] [c0000000000a5878] .__cpu_notify+0x38/0x80
> [c0000003fd9e7ad0] [c00000000083e124] ._cpu_up+0x10c/0x178
> [c0000003fd9e7b90] [c00000000083e2c8] .cpu_up+0x138/0x164
> [c0000003fd9e7c20] [c000000000ba46d0] .async_cpu_up+0x28/0x40
> [c0000003fd9e7ca0] [c0000000000d81ec] .async_run_entry_fn+0xbc/0x1f0
> [c0000003fd9e7d50] [c0000000000c7cbc] .process_one_work+0x19c/0x590
> [c0000003fd9e7e10] [c0000000000c8618] .worker_thread+0x188/0x4b0
> [c0000003fd9e7ed0] [c0000000000ce57c] .kthread+0xbc/0xd0
> [c0000003fd9e7f90] [c000000000021448] .kernel_thread+0x54/0x70
> Instruction dump:
> 7fa3eb78 ebe1fff8 eba1ffe8 7c0803a6 4e800020 2c230000 41820024 e8630030 
> 7c800074 7800d182 2fa30000 419e0014 <0b000000> 38a00002 4bfffebc e8630030 
> ---[ end trace 31fd0ba7d8756001 ]---
> initcall .topology_init+0x0/0x1ac returned 0 after 0 usecs
> calling  .pcibios_init+0x0/0xe8 @ 1
> PCI: Probing PCI hardware
> PCI: Probing PCI hardware done
> initcall .pcibios_init+0x0/0xe8 returned 0 after 0 usecs
> calling  .add_system_ram_resources+0x0/0x140 @ 1
> initcall .add_system_ram_resources+0x0/0x140 returned 0 after 0 usecs
> calling  .__machine_initcall_powermac_pmac_i2c_create_platform_devices+0x0/0xc8 @ 1
> initcall .__machine_initcall_powermac_pmac_i2c_create_platform_devices+0x0/0xc8 returned 0 after 0 usecs
> calling  .opal_init+0x0/0x1cc @ 1
> opal: Node not found
> initcall .opal_init+0x0/0x1cc returned -19 after 0 usecs
> calling  .__machine_initcall_pseries_ioei_init+0x0/0xa0 @ 1
> 


I took a brief look. This looks like a race in register_cpu_online().

init/main.c calls smp_init() followed by do_basic_setup().
do_basic_setup() executes all the post-SMP initcalls.

arch/powerpc/kernel/sysfs.c:topology_init() is a subsys_initcall.
Hence it gets run from do_basic_setup().

And topology_init() does 2 things:
It calls register_cpu_notifier() and also calls register_cpu_online()
inside a 'for' loop.

And the sysfs_cpu_notify() function also calls register_cpu_online().

This was safe as long as topology_init() and CPU hotplug ran serially
(i.e., smp_init() finished onlining all CPUs and only then was
do_basic_setup() called, as was the case before Arjan's patch).

But Arjan's patch makes these two things run in parallel, and since
register_cpu_online() doesn't have any protection, we hit the race
condition.
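
To make the interleaving described above concrete, here is a minimal
userspace sketch using pthreads. It is not kernel code and not a proposed
fix: every model_* name is hypothetical, the two threads only stand in for
the hotplug notifier path and the topology_init() loop, and the mutex is
just one assumed way of serializing them.

```c
/* Userspace model only; not kernel code. All model_* names are hypothetical. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

struct model_cpu {
	bool online;            /* set by the modelled hotplug path */
	bool device_registered; /* set by the modelled initcall path */
	int attr_files;         /* how many times the attributes were created */
};

static struct model_cpu model_cpus[MODEL_NR_CPUS];
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Caller must hold model_lock. Creates the attributes at most once, and
 * only when the CPU is online and its device already exists.
 */
static void model_register_cpu_online_locked(struct model_cpu *c)
{
	if (c->online && c->device_registered && c->attr_files == 0)
		c->attr_files++;
}

/* Stands in for the CPU_ONLINE notifier path. */
static void *model_hotplug_path(void *unused)
{
	(void)unused;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		pthread_mutex_lock(&model_lock);
		model_cpus[cpu].online = true;
		model_register_cpu_online_locked(&model_cpus[cpu]);
		pthread_mutex_unlock(&model_lock);
	}
	return NULL;
}

/* Stands in for the per-CPU loop run from the initcall. */
static void *model_initcall_path(void *unused)
{
	(void)unused;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		pthread_mutex_lock(&model_lock);
		model_cpus[cpu].device_registered = true;
		model_register_cpu_online_locked(&model_cpus[cpu]);
		pthread_mutex_unlock(&model_lock);
	}
	return NULL;
}

/* Runs both paths in parallel; returns the total number of attribute sets. */
int model_run(void)
{
	pthread_t hotplug, initcall;
	int total = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_cpus[cpu] = (struct model_cpu){ false, false, 0 };

	if (pthread_create(&hotplug, NULL, model_hotplug_path, NULL) != 0)
		return -1;
	if (pthread_create(&initcall, NULL, model_initcall_path, NULL) != 0) {
		pthread_join(hotplug, NULL);
		return -1;
	}
	pthread_join(hotplug, NULL);
	pthread_join(initcall, NULL);

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += model_cpus[cpu].attr_files;
	return total;
}
```

In this model, whichever path reaches a given CPU second is the one that
creates its attributes, so each CPU gets exactly one set regardless of how
the two threads interleave. Without the lock and the checks, either path
could try to create attributes for a CPU whose device does not exist yet,
which is the kind of state the BUG_ON in sysfs_create_file() catches.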

I'll try to take a deeper look into this tomorrow...

Regards,
Srivatsa S. Bhat
IBM Linux Technology Center


