All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
@ 2015-04-23  5:47 Stephane Eranian
  2015-04-24 13:38 ` Vince Weaver
  0 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2015-04-23  5:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, ak, kan.liang, mingo, sonnyrao


This patch fixes a bug introduced by:

commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
Author: Kan Liang <kan.liang@intel.com>
Date:   Tue Jan 20 04:54:25 2015 +0000

    perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization

It moves uncore_box_init() out of place but this meant that for desktop memory
controller (IMC) PCI-based PMU, the intialization was now missing causing a
crash at first access. This patch fixes the issue by adding the call to the
uncore_enable_box() in snb_uncore_imc_read_counter() to ensure the box
is initialized and the PCI BAR is io-remapped.

Signed-off-by: Stephane Eranian <eranian@google.com>
--

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
index 4562e9e..755ec05 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
@@ -329,6 +329,9 @@ static void snb_uncore_imc_event_start(struct perf_event *event, int flags)
 
 	list_add_tail(&event->active_entry, &box->active_list);
 
+	if (box->n_active == 1)
+		uncore_enable_box(box);
+
 	count = snb_uncore_imc_read_counter(box, event);
 	local64_set(&event->hw.prev_count, count);
 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-23  5:47 [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization Stephane Eranian
@ 2015-04-24 13:38 ` Vince Weaver
  2015-04-24 14:03   ` Vince Weaver
  0 siblings, 1 reply; 14+ messages in thread
From: Vince Weaver @ 2015-04-24 13:38 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, peterz, ak, kan.liang, mingo, sonnyrao

On Thu, 23 Apr 2015, Stephane Eranian wrote:

> This patch fixes a bug introduced by:
> 
> commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> Author: Kan Liang <kan.liang@intel.com>
> Date:   Tue Jan 20 04:54:25 2015 +0000

I was hoping this would fix the uncore/imc bug that the perf_fuzzer 
triggers on my Haswell box, but even with this patch I can still quickly 
hard lock the system

[   79.468201] ------------[ cut here ]------------
[   79.473121] kernel BUG at mm/vmalloc.c:1335!
[   79.477705] invalid opcode: 0000 [#1] SMP 
[   79.482141] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul snd_hda_intel snd_hda_controller snd_hda_codec crc32_pclmul snd_hda_core snd_hwdep ghash_clmulni_intel i915 aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ppdev snd_pcm iTCO_vendor_support evdev ablk_helper drm_kms_helper psmouse snd_timer cryptd drm tpm_tis serio_raw pcspkr xhci_pci snd parport_pc soundcore tpm xhci_hcd lpc_ich parport mei_me wmi mfd_core processor video battery i2c_i801 i2c_algo_bit button mei sg sr_mod cdrom sd_mod ehci_pci ehci_hcd ahci e1000e libahci ptp crc32c_intel usbcore libata scsi_mod usb_common pps_core thermal fan thermal_sys
[   79.554768] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0+ #135
[   79.561315] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[   79.569250] task: ffffffff81c14580 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[   79.577270] RIP: 0010:[<ffffffff811b0cd0>]  [<ffffffff811b0cd0>] __get_vm_area_node+0x170/0x180
[   79.586597] RSP: 0000:ffff88011ea03cb8  EFLAGS: 00010006
[   79.592301] RAX: 0000000080010000 RBX: 0000000000006000 RCX: ffffc90000000000
[   79.599950] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000006000
[   79.607581] RBP: ffff88011ea03cf8 R08: ffffe8ffffffffff R09: 00000000ffffffff
[   79.615220] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002
[   79.622850] R13: 00000000fed16000 R14: 00000000fed10000 R15: 00000000fed10000
[   79.630480] FS:  0000000000000000(0000) GS:ffff88011ea00000(0000) knlGS:0000000000000000
[   79.639167] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   79.645302] CR2: 000000000061c218 CR3: 0000000001c0d000 CR4: 00000000001407f0
[   79.652953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   79.660571] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   79.668188] Stack:
[   79.670344]  00000000fed10000 ffffc90000000000 ffff88011ea03d18 0000000000006000
[   79.678352]  0000000000000002 00000000fed16000 00000000fed10000 00000000fed10000
[   79.686346]  ffff88011ea03d18 ffffffff811b2130 00000000000000d0 ffffffff81039f4d
[   79.694391] Call Trace:
[   79.696997]  <IRQ> 
[   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
[   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
[   79.712414]  [<ffffffff810635e5>] __ioremap_caller+0x195/0x350
[   79.718610]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
[   79.725462]  [<ffffffff81427f6b>] ? debug_object_activate+0x14b/0x1e0
[   79.732346]  [<ffffffff810637b7>] ioremap_nocache+0x17/0x20
[   79.738283]  [<ffffffff81039f4d>] snb_uncore_imc_init_box+0x6d/0x90
[   79.744945]  [<ffffffff81039cf7>] snb_uncore_imc_event_start+0xb7/0x110
[   79.752020]  [<ffffffff81039d97>] snb_uncore_imc_event_add+0x47/0x60
[   79.758832]  [<ffffffff81162cbb>] event_sched_in.isra.85+0xfb/0x330
[   79.765519]  [<ffffffff81162f5f>] group_sched_in+0x6f/0x1e0
[   79.771481]  [<ffffffff8101df1a>] ? native_sched_clock+0x2a/0x90
[   79.777858]  [<ffffffff811637bc>] __perf_event_enable+0x25c/0x2a0
[   79.784418]  [<ffffffff810f3e69>] ? tick_nohz_irq_exit+0x29/0x30
[   79.790820]  [<ffffffff8115ef30>] ? cpu_clock_event_start+0x40/0x40
[   79.797546]  [<ffffffff8115ef80>] remote_function+0x50/0x60
[   79.803535]  [<ffffffff810f8cd1>] flush_smp_call_function_queue+0x81/0x180
[   79.810840]  [<ffffffff810f9763>] generic_smp_call_function_single_interrupt+0x13/0x60
[   79.819328]  [<ffffffff8104b5e8>] smp_trace_call_function_single_interrupt+0x38/0xc0
[   79.827614]  [<ffffffff816de9be>] trace_call_function_single_interrupt+0x6e/0x80
[   79.835465]  <EOI> 
[   79.837543]  [<ffffffff8156e8b5>] ? cpuidle_enter_state+0x65/0x160
[   79.844377]  [<ffffffff8156e8a1>] ? cpuidle_enter_state+0x51/0x160
[   79.851015]  [<ffffffff8156e9e7>] cpuidle_enter+0x17/0x20
[   79.856791]  [<ffffffff810b6e39>] cpu_startup_entry+0x399/0x440
[   79.863165]  [<ffffffff816c9ddb>] rest_init+0xbb/0xd0
[   79.868555]  [<ffffffff81d46f74>] start_kernel+0x44e/0x45b
[   79.874433]  [<ffffffff81d46120>] ? early_idt_handlers+0x120/0x120
[   79.881083]  [<ffffffff81d464d7>] x86_64_start_reservations+0x2a/0x2c
[   79.888625]  [<ffffffff81d46614>] x86_64_start_kernel+0x13b/0x14a
[   79.895775] Code: fe ff ff 0f 1f 84 00 00 00 00 00 4c 89 ef e8 d8 b9 01 00 48 83 c4 18 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 80 00 00 00 00 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 
[   79.918001] RIP  [<ffffffff811b0cd0>] __get_vm_area_node+0x170/0x180
[   79.925511]  RSP <ffff88011ea03cb8>
[   79.929921] ---[ end trace 838bf78817e86b4e ]---
[   79.935575] Kernel panic - not syncing: Fatal exception in interrupt
[   79.943131] Kernel Offset: disabled
[   79.947644] drm_kms_helper: panic occurred, switching back to text console
[   79.955802] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
[   79.964192] ------------[ cut here ]------------



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-24 13:38 ` Vince Weaver
@ 2015-04-24 14:03   ` Vince Weaver
  2015-04-24 14:38     ` Stephane Eranian
  0 siblings, 1 reply; 14+ messages in thread
From: Vince Weaver @ 2015-04-24 14:03 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Stephane Eranian, linux-kernel, peterz, ak, kan.liang, mingo, sonnyrao

On Fri, 24 Apr 2015, Vince Weaver wrote:

> [   79.473121] kernel BUG at mm/vmalloc.c:1335!
> [   79.477705] invalid opcode: 0000 [#1] SMP 
...
> [   79.694391] Call Trace:
> [   79.696997]  <IRQ> 
> [   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
> [   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90

This maps to:

	static void snb_uncore_imc_init_box(struct intel_uncore_box *box) {
		...
	        box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);

The machine I am seeing this on is a Haswell desktop, Intel 6/60/3.

Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-24 14:03   ` Vince Weaver
@ 2015-04-24 14:38     ` Stephane Eranian
  2015-04-24 19:22       ` Bjorn Helgaas
  0 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2015-04-24 14:38 UTC (permalink / raw)
  To: Vince Weaver
  Cc: LKML, Peter Zijlstra, ak, Liang, Kan, mingo, Sonny Rao, Bjorn Helgaas

On Fri, Apr 24, 2015 at 7:03 AM, Vince Weaver <vincent.weaver@maine.edu> wrote:
> On Fri, 24 Apr 2015, Vince Weaver wrote:
>
>> [   79.473121] kernel BUG at mm/vmalloc.c:1335!
>> [   79.477705] invalid opcode: 0000 [#1] SMP
> ...
>> [   79.694391] Call Trace:
>> [   79.696997]  <IRQ>
>> [   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
>> [   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
>
> This maps to:
>
>         static void snb_uncore_imc_init_box(struct intel_uncore_box *box) {
>                 ...
>                 box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
>
> The machine I am seeing this on is a Haswell desktop, Intel 6/60/3.
>
I also got that on one Lenovo IvyBridge laptop but I had just updated the BIOS.
It was working before with older kernels. I am not sure what is going on here.
It will be easier to debug once I get my desktop Haswell back and setup with
serial console. Need to verify that the address of the BAR is the same compared
with older kernels.

Maybe Bjorn can shed some lights. Why would ioremap() die like this in the
latest 4.0 tree.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-24 14:38     ` Stephane Eranian
@ 2015-04-24 19:22       ` Bjorn Helgaas
  2015-04-24 19:29         ` Stephane Eranian
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2015-04-24 19:22 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Vince Weaver, LKML, Peter Zijlstra, ak, Liang, Kan, mingo, Sonny Rao

On Fri, Apr 24, 2015 at 9:38 AM, Stephane Eranian <eranian@google.com> wrote:
> On Fri, Apr 24, 2015 at 7:03 AM, Vince Weaver <vincent.weaver@maine.edu> wrote:
>> On Fri, 24 Apr 2015, Vince Weaver wrote:
>>
>>> [   79.473121] kernel BUG at mm/vmalloc.c:1335!
>>> [   79.477705] invalid opcode: 0000 [#1] SMP
>> ...
>>> [   79.694391] Call Trace:
>>> [   79.696997]  <IRQ>
>>> [   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
>>> [   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
>>
>> This maps to:
>>
>>         static void snb_uncore_imc_init_box(struct intel_uncore_box *box) {
>>                 ...
>>                 box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
>>
>> The machine I am seeing this on is a Haswell desktop, Intel 6/60/3.
>>
> I also got that on one Lenovo IvyBridge laptop but I had just updated the BIOS.
> It was working before with older kernels. I am not sure what is going on here.
> It will be easier to debug once I get my desktop Haswell back and setup with
> serial console. Need to verify that the address of the BAR is the same compared
> with older kernels.
>
> Maybe Bjorn can shed some lights. Why would ioremap() die like this in the
> latest 4.0 tree.

  1328  static struct vm_struct *__get_vm_area_node(unsigned long size,
  1329                  unsigned long align, unsigned long flags,
unsigned long start,
  1330                  unsigned long end, int node, gfp_t gfp_mask,
const void *caller)
  1331  {
  1332          struct vmap_area *va;
  1333          struct vm_struct *area;
  1334
  1335          BUG_ON(in_interrupt());           <-----------------

Is there some perfmon initialization happening in interrupt context?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-24 19:22       ` Bjorn Helgaas
@ 2015-04-24 19:29         ` Stephane Eranian
  2015-04-25  4:38           ` Andi Kleen
  0 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2015-04-24 19:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Vince Weaver, LKML, Peter Zijlstra, ak, Liang, Kan, mingo, Sonny Rao

On Fri, Apr 24, 2015 at 12:22 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Fri, Apr 24, 2015 at 9:38 AM, Stephane Eranian <eranian@google.com> wrote:
>> On Fri, Apr 24, 2015 at 7:03 AM, Vince Weaver <vincent.weaver@maine.edu> wrote:
>>> On Fri, 24 Apr 2015, Vince Weaver wrote:
>>>
>>>> [   79.473121] kernel BUG at mm/vmalloc.c:1335!
>>>> [   79.477705] invalid opcode: 0000 [#1] SMP
>>> ...
>>>> [   79.694391] Call Trace:
>>>> [   79.696997]  <IRQ>
>>>> [   79.699090]  [<ffffffff811b2130>] get_vm_area_caller+0x40/0x50
>>>> [   79.705505]  [<ffffffff81039f4d>] ? snb_uncore_imc_init_box+0x6d/0x90
>>>
>>> This maps to:
>>>
>>>         static void snb_uncore_imc_init_box(struct intel_uncore_box *box) {
>>>                 ...
>>>                 box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
>>>
>>> The machine I am seeing this on is a Haswell desktop, Intel 6/60/3.
>>>
>> I also got that on one Lenovo IvyBridge laptop but I had just updated the BIOS.
>> It was working before with older kernels. I am not sure what is going on here.
>> It will be easier to debug once I get my desktop Haswell back and setup with
>> serial console. Need to verify that the address of the BAR is the same compared
>> with older kernels.
>>
>> Maybe Bjorn can shed some lights. Why would ioremap() die like this in the
>> latest 4.0 tree.
>
>   1328  static struct vm_struct *__get_vm_area_node(unsigned long size,
>   1329                  unsigned long align, unsigned long flags,
> unsigned long start,
>   1330                  unsigned long end, int node, gfp_t gfp_mask,
> const void *caller)
>   1331  {
>   1332          struct vmap_area *va;
>   1333          struct vm_struct *area;
>   1334
>   1335          BUG_ON(in_interrupt());           <-----------------
>
> Is there some perfmon initialization happening in interrupt context?

May be in IPI context if I read Vince's stack trace.

This leads me to believe that this patch:

commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
Author: Kan Liang <kan.liang@intel.com>
Date:   Tue Jan 20 04:54:25 2015 +0000

    perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization

If I revert it, I bet things will work again.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-24 19:29         ` Stephane Eranian
@ 2015-04-25  4:38           ` Andi Kleen
  2015-04-27  3:43             ` Liang, Kan
  0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2015-04-25  4:38 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Bjorn Helgaas, Vince Weaver, LKML, Peter Zijlstra, Liang, Kan,
	mingo, Sonny Rao

> This leads me to believe that this patch:
> 
> commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> Author: Kan Liang <kan.liang@intel.com>
> Date:   Tue Jan 20 04:54:25 2015 +0000
> 
>     perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization
> 
> If I revert it, I bet things will work again.

Yes the initialization needs to be moved out of the IPI context.

-Andi


-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-25  4:38           ` Andi Kleen
@ 2015-04-27  3:43             ` Liang, Kan
  2015-04-28  6:23               ` Stephane Eranian
  2015-04-28 10:52               ` Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Liang, Kan @ 2015-04-27  3:43 UTC (permalink / raw)
  To: Andi Kleen, Stephane Eranian
  Cc: Bjorn Helgaas, Vince Weaver, LKML, Peter Zijlstra, mingo, Sonny Rao


>
> > This leads me to believe that this patch:
> >
> > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> > Author: Kan Liang <kan.liang@intel.com>
> > Date:   Tue Jan 20 04:54:25 2015 +0000
> >
> >     perf/x86/intel/uncore: Move uncore_box_init() out of driver
> initialization
> >
> > If I revert it, I bet things will work again.
>
> Yes the initialization needs to be moved out of the IPI context.
>

Maybe we can move them to event init, which is not in IPI context.

What do you think of this patch?

---

>From 8a61c48144921e9d1c841656829c3bae9bfb4408 Mon Sep 17 00:00:00 2001
From: Kan Liang <kan.liang@intel.com>
Date: Sun, 26 Apr 2015 16:24:59 -0400
Subject: [PATCH 1/1] perf/x86/intel/uncore: move uncore_box_init to uncore
 event init

commit c05199e5a57a("perf/x86/intel/uncore: Move uncore_box_init() out
of driver initialization") moves uncore_box_init into uncore_enable_box
to prevent potential boot failures. However, uncore_enable_box is not
called on some client platforms (SNB/IVB/HSW) for counting IMC event.
When it is not called, the box is not initialized, which hard locks the
system.

Additionally, uncore_enable_box along with the initialization code in it
is always called in uncore event start functions, which are in IPI
context. But the initizlization code should not be in IPI context. This
is because, for example, the IMC box initialization codes for client
platforms include ioremap, which is not allowed to be called in IPI
context.

This patch moves uncore_box_init out of IPI context, to uncore event
init. The box is initialized only when it has not yet been initialized.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c     | 4 ++++
 arch/x86/kernel/cpu/perf_event_intel_uncore.h     | 2 --
 arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c | 3 +++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index c635b8b..cbc1a93 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -623,6 +623,10 @@ static int uncore_pmu_event_init(struct perf_event *event)
        box = uncore_pmu_to_box(pmu, event->cpu);
        if (!box || box->cpu < 0)
                return -EINVAL;
+
+       /* Init box if it's not initialized yet */
+       uncore_box_init(box);
+
        event->cpu = box->cpu;

        event->hw.idx = -1;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index 6c8c1e7..1fb2905 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -273,8 +273,6 @@ static inline void uncore_disable_box(struct intel_uncore_box *box)

 static inline void uncore_enable_box(struct intel_uncore_box *box)
 {
-       uncore_box_init(box);
-
        if (box->pmu->type->ops->enable_box)
                box->pmu->type->ops->enable_box(box);
 }
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
index 4562e9e..ead70a6 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
@@ -279,6 +279,9 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
        if (!box || box->cpu < 0)
                return -EINVAL;

+       /* Init box if it's not initialized yet */
+       uncore_box_init(box);
+
        event->cpu = box->cpu;

        event->hw.idx = -1;

Thanks,
Kan

> -Andi
>
>
> --
> ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-27  3:43             ` Liang, Kan
@ 2015-04-28  6:23               ` Stephane Eranian
  2015-04-28 10:52               ` Peter Zijlstra
  1 sibling, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-04-28  6:23 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Andi Kleen, Bjorn Helgaas, Vince Weaver, LKML, Peter Zijlstra,
	mingo, Sonny Rao

On Sun, Apr 26, 2015 at 8:43 PM, Liang, Kan <kan.liang@intel.com> wrote:
>
>>
>> > This leads me to believe that this patch:
>> >
>> > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
>> > Author: Kan Liang <kan.liang@intel.com>
>> > Date:   Tue Jan 20 04:54:25 2015 +0000
>> >
>> >     perf/x86/intel/uncore: Move uncore_box_init() out of driver
>> initialization
>> >
>> > If I revert it, I bet things will work again.
>>
>> Yes the initialization needs to be moved out of the IPI context.
>>
>
> Maybe we can move them to event init, which is not in IPI context.
>
> What do you think of this patch?
>
> ---
>
> From 8a61c48144921e9d1c841656829c3bae9bfb4408 Mon Sep 17 00:00:00 2001
> From: Kan Liang <kan.liang@intel.com>
> Date: Sun, 26 Apr 2015 16:24:59 -0400
> Subject: [PATCH 1/1] perf/x86/intel/uncore: move uncore_box_init to uncore
>  event init
>
> commit c05199e5a57a("perf/x86/intel/uncore: Move uncore_box_init() out
> of driver initialization") moves uncore_box_init into uncore_enable_box
> to prevent potential boot failures. However, uncore_enable_box is not
> called on some client platforms (SNB/IVB/HSW) for counting IMC event.
> When it is not called, the box is not initialized, which hard locks the
> system.
>
> Additionally, uncore_enable_box along with the initialization code in it
> is always called in uncore event start functions, which are in IPI
> context. But the initizlization code should not be in IPI context. This
> is because, for example, the IMC box initialization codes for client
> platforms include ioremap, which is not allowed to be called in IPI
> context.
>
> This patch moves uncore_box_init out of IPI context, to uncore event
> init. The box is initialized only when it has not yet been initialized.
>
> Signed-off-by: Kan Liang <kan.liang@intel.com>

Ok this works for me now. Thanks.

Tested-by: Stephane Eranian <eranian@google.com>

> ---
>  arch/x86/kernel/cpu/perf_event_intel_uncore.c     | 4 ++++
>  arch/x86/kernel/cpu/perf_event_intel_uncore.h     | 2 --
>  arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c | 3 +++
>  3 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index c635b8b..cbc1a93 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -623,6 +623,10 @@ static int uncore_pmu_event_init(struct perf_event *event)
>         box = uncore_pmu_to_box(pmu, event->cpu);
>         if (!box || box->cpu < 0)
>                 return -EINVAL;
> +
> +       /* Init box if it's not initialized yet */
> +       uncore_box_init(box);
> +
>         event->cpu = box->cpu;
>
>         event->hw.idx = -1;
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
> index 6c8c1e7..1fb2905 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
> @@ -273,8 +273,6 @@ static inline void uncore_disable_box(struct intel_uncore_box *box)
>
>  static inline void uncore_enable_box(struct intel_uncore_box *box)
>  {
> -       uncore_box_init(box);
> -
>         if (box->pmu->type->ops->enable_box)
>                 box->pmu->type->ops->enable_box(box);
>  }
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
> index 4562e9e..ead70a6 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
> @@ -279,6 +279,9 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
>         if (!box || box->cpu < 0)
>                 return -EINVAL;
>
> +       /* Init box if it's not initialized yet */
> +       uncore_box_init(box);
> +
>         event->cpu = box->cpu;
>
>         event->hw.idx = -1;
>
> Thanks,
> Kan
>
>> -Andi
>>
>>
>> --
>> ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-27  3:43             ` Liang, Kan
  2015-04-28  6:23               ` Stephane Eranian
@ 2015-04-28 10:52               ` Peter Zijlstra
  2015-05-11 10:00                 ` Ingo Molnar
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2015-04-28 10:52 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Andi Kleen, Stephane Eranian, Bjorn Helgaas, Vince Weaver, LKML,
	mingo, Sonny Rao

On Mon, Apr 27, 2015 at 03:43:32AM +0000, Liang, Kan wrote:
> 
> >
> > > This leads me to believe that this patch:
> > >
> > > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> > > Author: Kan Liang <kan.liang@intel.com>
> > > Date:   Tue Jan 20 04:54:25 2015 +0000
> > >
> > >     perf/x86/intel/uncore: Move uncore_box_init() out of driver
> > initialization
> > >
> > > If I revert it, I bet things will work again.
> >
> > Yes the initialization needs to be moved out of the IPI context.

I'm for the clean revert I think. Crashing is bad, but hiding/delaying
it seems counter productive too, it'll just mean we'll only learn about
it later.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-04-28 10:52               ` Peter Zijlstra
@ 2015-05-11 10:00                 ` Ingo Molnar
  2015-06-08 17:45                   ` Vince Weaver
  2015-06-09  8:33                   ` Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Ingo Molnar @ 2015-05-11 10:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Liang, Kan, Andi Kleen, Stephane Eranian, Bjorn Helgaas,
	Vince Weaver, LKML, mingo, Sonny Rao


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Apr 27, 2015 at 03:43:32AM +0000, Liang, Kan wrote:
> > 
> > >
> > > > This leads me to believe that this patch:
> > > >
> > > > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> > > > Author: Kan Liang <kan.liang@intel.com>
> > > > Date:   Tue Jan 20 04:54:25 2015 +0000
> > > >
> > > >     perf/x86/intel/uncore: Move uncore_box_init() out of driver
> > > initialization
> > > >
> > > > If I revert it, I bet things will work again.
> > >
> > > Yes the initialization needs to be moved out of the IPI context.
> 
> I'm for the clean revert I think. Crashing is bad, but hiding/delaying
> it seems counter productive too, it'll just mean we'll only learn about
> it later.

So should I revert c05199e5a57a, with a Cc: stable?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-05-11 10:00                 ` Ingo Molnar
@ 2015-06-08 17:45                   ` Vince Weaver
  2015-06-09  7:20                     ` Ingo Molnar
  2015-06-09  8:33                   ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Vince Weaver @ 2015-06-08 17:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Liang, Kan, Andi Kleen, Stephane Eranian,
	Bjorn Helgaas, Vince Weaver, LKML, mingo, Sonny Rao

On Mon, 11 May 2015, Ingo Molnar wrote:

> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Mon, Apr 27, 2015 at 03:43:32AM +0000, Liang, Kan wrote:
> > > 
> > > >
> > > > > This leads me to believe that this patch:
> > > > >
> > > > > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> > > > > Author: Kan Liang <kan.liang@intel.com>
> > > > > Date:   Tue Jan 20 04:54:25 2015 +0000
> > > > >
> > > > >     perf/x86/intel/uncore: Move uncore_box_init() out of driver
> > > > initialization
> > > > >
> > > > > If I revert it, I bet things will work again.
> > > >
> > > > Yes the initialization needs to be moved out of the IPI context.
> > 
> > I'm for the clean revert I think. Crashing is bad, but hiding/delaying
> > it seems counter productive too, it'll just mean we'll only learn about
> > it later.
> 
> So should I revert c05199e5a57a, with a Cc: stable?

any progress on this issue?  With 4.1-rc7 I can still quickly crash a 
Haswell machine due to this bug.

Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-06-08 17:45                   ` Vince Weaver
@ 2015-06-09  7:20                     ` Ingo Molnar
  0 siblings, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2015-06-09  7:20 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, Liang, Kan, Andi Kleen, Stephane Eranian,
	Bjorn Helgaas, LKML, mingo, Sonny Rao


* Vince Weaver <vincent.weaver@maine.edu> wrote:

> On Mon, 11 May 2015, Ingo Molnar wrote:
> 
> > 
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > On Mon, Apr 27, 2015 at 03:43:32AM +0000, Liang, Kan wrote:
> > > > 
> > > > >
> > > > > > This leads me to believe that this patch:
> > > > > >
> > > > > > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
> > > > > > Author: Kan Liang <kan.liang@intel.com>
> > > > > > Date:   Tue Jan 20 04:54:25 2015 +0000
> > > > > >
> > > > > >     perf/x86/intel/uncore: Move uncore_box_init() out of driver
> > > > > initialization
> > > > > >
> > > > > > If I revert it, I bet things will work again.
> > > > >
> > > > > Yes the initialization needs to be moved out of the IPI context.
> > > 
> > > I'm for the clean revert I think. Crashing is bad, but hiding/delaying
> > > it seems counter productive too, it'll just mean we'll only learn about
> > > it later.
> > 
> > So should I revert c05199e5a57a, with a Cc: stable?
> 
> any progress on this issue?  With 4.1-rc7 I can still quickly crash a 
> Haswell machine due to this bug.

This dropped through the cracks. Peter?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization
  2015-05-11 10:00                 ` Ingo Molnar
  2015-06-08 17:45                   ` Vince Weaver
@ 2015-06-09  8:33                   ` Peter Zijlstra
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2015-06-09  8:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Liang, Kan, Andi Kleen, Stephane Eranian, Bjorn Helgaas,
	Vince Weaver, LKML, mingo, Sonny Rao

On Mon, May 11, 2015 at 12:00:30PM +0200, Ingo Molnar wrote:
> > I'm for the clean revert I think. Crashing is bad, but hiding/delaying
> > it seems counter productive too, it'll just mean we'll only learn about
> > it later.
> 
> So should I revert c05199e5a57a, with a Cc: stable?

Yep.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-06-09  8:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-23  5:47 [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization Stephane Eranian
2015-04-24 13:38 ` Vince Weaver
2015-04-24 14:03   ` Vince Weaver
2015-04-24 14:38     ` Stephane Eranian
2015-04-24 19:22       ` Bjorn Helgaas
2015-04-24 19:29         ` Stephane Eranian
2015-04-25  4:38           ` Andi Kleen
2015-04-27  3:43             ` Liang, Kan
2015-04-28  6:23               ` Stephane Eranian
2015-04-28 10:52               ` Peter Zijlstra
2015-05-11 10:00                 ` Ingo Molnar
2015-06-08 17:45                   ` Vince Weaver
2015-06-09  7:20                     ` Ingo Molnar
2015-06-09  8:33                   ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.