From: "Charles (Chas) Williams" <ciwillia@brocade.com>
To: <linux-kernel@vger.kernel.org>
Subject: Oops in rapl_cpu_prepare()
Date: Thu, 20 Oct 2016 16:27:55 -0400
Message-ID: <d40f8e3c-b332-c331-38b9-11eb4f4aaaa7@brocade.com>

Recent 4.8 kernels have been oopsing when running under VMware:

[    2.270203] BUG: unable to handle kernel NULL pointer dereference at 0000000000000408
[    2.270325] IP: [<ffffffff81012bb9>] rapl_cpu_online+0x59/0x70
[    2.270448] PGD 0
[    2.270570] Oops: 0002 [#1] SMP
[    2.270693] Modules linked in:
[    2.270815] CPU: 2 PID: 21 Comm: cpuhp/2 Not tainted 4.8.2-1-amd64-vyatta #1
[    2.270938] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/14/2014
[    2.271060] task: ffff8802361fc2c0 task.stack: ffff880236208000
[    2.271183] RIP: 0010:[<ffffffff81012bb9>]  [<ffffffff81012bb9>] rapl_cpu_online+0x59/0x70
[    2.271306] RSP: 0000:ffff88023620be68  EFLAGS: 00010246
[    2.271428] RAX: 0000000000000004 RBX: ffff88023fd0d940 RCX: 0000000000000000
[    2.271551] RDX: 0000000000000040 RSI: 0000000000000004 RDI: 0000000000000004
[    2.271673] RBP: 0000000000000002 R08: fffffffffffffffc R09: 0000000000000000
[    2.271796] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000400
[    2.271918] R13: ffff8802361fc2c0 R14: ffff8802361fc2c0 R15: ffff8802361fc2c0
[    2.272041] FS:  0000000000000000(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
[    2.272163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.272286] CR2: 0000000000000408 CR3: 0000000001a06000 CR4: 00000000000406e0
[    2.272408] Stack:
[    2.272531]  ffff88023fd0d940 0000000000000002 ffffffff81a38240 ffffffff81061231
[    2.272654]  ffff8802361fc2c0 ffff880237002180 ffffffff8107ddcf 0000000000000000
[    2.272776]  ffff8802361a5a80 ffff880237002180 ffffffff8107dcb0 ffffffff81a6a380
[    2.272899] Call Trace:
[    2.273021]  [<ffffffff81061231>] ? cpuhp_thread_fun+0x31/0x100
[    2.273144]  [<ffffffff8107ddcf>] ? smpboot_thread_fn+0x11f/0x180
[    2.273266]  [<ffffffff8107dcb0>] ? sort_range+0x20/0x20
[    2.273389]  [<ffffffff8107b05a>] ? kthread+0xca/0xe0
[    2.273511]  [<ffffffff8157677f>] ? ret_from_fork+0x1f/0x40
[    2.273634]  [<ffffffff8107af90>] ? kthread_park+0x50/0x50
[    2.273757] Code: 00 00 48 83 c0 22 4c 8b 24 c1 48 c7 c0 30 a1 00 00 48 8b 14 10 e8 a8 61 26 00 3b 05 b6 56 ae 00 7c 0e f0 48 0f a
[    2.279445] RIP  [<ffffffff81012bb9>] rapl_cpu_online+0x59/0x70
[    2.279568]  RSP <ffff88023620be68>
[    2.279690] CR2: 0000000000000408
[    2.279813] ---[ end trace c95da920748eb432 ]---
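
For readers decoding the trace above: the "Oops: 0002" value is the x86
page-fault error code, which is a bit mask. Below is a minimal userspace
sketch of decoding its low bits. The enum, struct and function names are
local to this sketch (the kernel has its own definitions), and only the
five lowest bits are handled.

```c
#include <stdbool.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code (names local to this sketch). */
enum pf_bit {
	PF_BIT_PROT  = 0x01, /* 0: page not present, 1: protection violation */
	PF_BIT_WRITE = 0x02, /* 0: read access,      1: write access */
	PF_BIT_USER  = 0x04, /* 0: kernel mode,      1: user mode */
	PF_BIT_RSVD  = 0x08, /* reserved bit set in a paging structure */
	PF_BIT_INSTR = 0x10, /* fault happened on an instruction fetch */
};

struct pf_info {
	bool protection;
	bool write;
	bool user;
	bool reserved;
	bool instr_fetch;
};

/* Split the numeric error code into individual flags. */
static struct pf_info decode_pf_error(unsigned int code)
{
	struct pf_info info = {
		.protection  = (code & PF_BIT_PROT) != 0,
		.write       = (code & PF_BIT_WRITE) != 0,
		.user        = (code & PF_BIT_USER) != 0,
		.reserved    = (code & PF_BIT_RSVD) != 0,
		.instr_fetch = (code & PF_BIT_INSTR) != 0,
	};
	return info;
}

/* Print a one-line, human-readable summary of the error code. */
static void print_pf_info(unsigned int code)
{
	struct pf_info i = decode_pf_error(code);

	printf("%04x: %s on %s access in %s mode\n", code,
	       i.protection ? "protection violation" : "page not present",
	       i.write ? "write" : "read",
	       i.user ? "user" : "kernel");
}
```

Calling print_pf_info(0x0002) describes a write to a not-present page from
kernel mode, which is consistent with a store through a bad pointer at the
faulting address reported in CR2.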


gdb tells me:

(gdb) info line *(rapl_cpu_online+0x59)
Line 595 of "arch/x86/events/intel/rapl.c" starts at address 0xffffffff81012bb9 <rapl_cpu_online+89>
    and ends at 0xffffffff81012bbe <rapl_cpu_online+94>.

Which is:


         target = cpumask_any_and(&rapl_cpu_mask, topology_core_cpumask(cpu));
         if (target < nr_cpu_ids)
                 return 0;

         cpumask_set_cpu(cpu, &rapl_cpu_mask);
         pmu->cpu = cpu;		<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
         return 0;

This code was recently changed by commit 8b5b773d6245138c
"perf/x86/intel/rapl: Convert to hotplug state machine" and it
appears that the setup is now done through hotplug callbacks:

         /*
          * Install callbacks. Core will call them for each online cpu.
          */

         ret = cpuhp_setup_state(CPUHP_PERF_X86_RAPL_PREP, "PERF_X86_RAPL_PREP",
                                 rapl_cpu_prepare, NULL);
         if (ret)
                 goto out;

         ret = cpuhp_setup_state(CPUHP_AP_PERF_X86_RAPL_ONLINE,
                                 "AP_PERF_X86_RAPL_ONLINE",
                                 rapl_cpu_online, rapl_cpu_offline);

Is the order of these callbacks guaranteed?  Will rapl_cpu_prepare()
always run before the online/offline callbacks?  Additionally,
rapl_cpu_prepare() can fail to allocate the pmu:

	static int rapl_cpu_prepare(unsigned int cpu)
	{
		struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);

		if (pmu)
			return 0;

		pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
		if (!pmu)
			return -ENOMEM;

But rapl_cpu_online() has no way of knowing that the allocation failed.
What should be done in this case?
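
To make the concern concrete, here is a minimal userspace sketch of a
two-step prepare/online pair in which the online step checks for missing
state instead of dereferencing it. Everything here (the model_* names,
MODEL_NR_CPUS, the per-CPU array, the choice of -ENODEV) is hypothetical
and only illustrates the defensive check. It is not the rapl driver code,
not a proposed patch, and it makes no claim about how the hotplug core
orders or retries callbacks.

```c
#include <errno.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-CPU pmu state. */
struct model_pmu {
	int cpu;
};

static struct model_pmu *model_pmus[MODEL_NR_CPUS];

/* Stand-in for the prepare step: allocate the state once per CPU. */
static int model_cpu_prepare(unsigned int cpu)
{
	struct model_pmu *pmu;

	if (cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	if (model_pmus[cpu])
		return 0;

	pmu = calloc(1, sizeof(*pmu));
	if (!pmu)
		return -ENOMEM;

	pmu->cpu = -1;	/* not yet bound to an online CPU */
	model_pmus[cpu] = pmu;
	return 0;
}

/* Stand-in for the online step: refuse to run without prepared state. */
static int model_cpu_online(unsigned int cpu)
{
	struct model_pmu *pmu;

	if (cpu >= MODEL_NR_CPUS)
		return -EINVAL;

	pmu = model_pmus[cpu];
	if (!pmu)
		return -ENODEV;	/* prepare never ran or failed */

	pmu->cpu = (int)cpu;
	return 0;
}
```

In this model, calling model_cpu_online() before a successful
model_cpu_prepare() returns an error rather than writing through a bad
pointer. Whether such a check belongs in the driver, or whether the
hotplug core already rules the situation out, is exactly the open question.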
