All of lore.kernel.org
 help / color / mirror / Atom feed
* x86 PMU broken in current Linus' tree
@ 2016-08-02 12:04 Jiri Kosina
  2016-08-02 12:55 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Kosina @ 2016-08-02 12:04 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin
  Cc: x86, linux-kernel

With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR 
write warning during bootup, and kernel panic when shutting PMUs down 
during poweroff.

The MSR warning is below, the camera capture of the poweroff panic can be 
found at

	http://www.jikos.cz/jikos/junk/pmu-panic.jpg

The last previous kernel version that I've booted on this particular 
machine was 4.7.0-rc4, and it had neither of those symptoms, so I can 
eventually bisect if needed.

=== [ snip ] ==
[    0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6)
[    0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
[    0.136000] ... version:                2
[    0.136000] ... bit width:              40
[    0.136000] ... generic registers:      2
[    0.136000] ... value mask:             000000ffffffffff
[    0.136000] ... max period:             000000007fffffff
[    0.136000] ... fixed-purpose events:   3
[    0.136000] ... event mask:             0000000700000003
[    0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[    0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190)
[    0.136000]  ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660
[    0.136000]  ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760
[    0.136000]  ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0
[    0.136000] Call Trace:
[    0.136000]  [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110
[    0.136000]  [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0
[    0.136000]  [<ffffffff90175642>] perf_pmu_enable+0x22/0x30
[    0.136000]  [<ffffffff901766b1>] ctx_resched+0x51/0x60
[    0.136000]  [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240
[    0.136000]  [<ffffffff9016e5f9>] event_function+0xa9/0x180
[    0.136000]  [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
[    0.136000]  [<ffffffff9016fcef>] remote_function+0x3f/0x50
[    0.136000]  [<ffffffff901012b6>] generic_exec_single+0xb6/0x140
[    0.136000]  [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
[    0.136000]  [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
[    0.136000]  [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110
[    0.136000]  [<ffffffff9016e984>] cpu_function_call+0x34/0x40
[    0.136000]  [<ffffffff9016e550>] ? list_del_event+0x150/0x150
[    0.136000]  [<ffffffff9016ecda>] event_function_call+0x11a/0x120
[    0.136000]  [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
[    0.136000]  [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70
[    0.136000]  [<ffffffff901736ac>] perf_event_enable+0x1c/0x40
[    0.136000]  [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0
[    0.136000]  [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0
[    0.136000]  [<ffffffff90092360>] ? sort_range+0x30/0x30
[    0.136000]  [<ffffffff9008e7e2>] kthread+0xf2/0x110
[    0.136000]  [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110
[    0.136000]  [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40
[    0.136000]  [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220
=== [ snip ] ===

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: x86 PMU broken in current Linus' tree
  2016-08-02 12:04 x86 PMU broken in current Linus' tree Jiri Kosina
@ 2016-08-02 12:55 ` Peter Zijlstra
  2016-08-02 13:31   ` Jiri Kosina
  2016-08-08 10:26   ` Peter Zijlstra
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2016-08-02 12:55 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86,
	linux-kernel

On Tue, Aug 02, 2016 at 02:04:58PM +0200, Jiri Kosina wrote:
> With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR 
> write warning during bootup, and kernel panic when shutting PMUs down 
> during poweroff.
> 
> The MSR warning is below, the camera capture of the poweroff panic can be 
> found at
> 
> 	http://www.jikos.cz/jikos/junk/pmu-panic.jpg
> 
> The last previous kernel version that I've booted on this particular 
> machine was 4.7.0-rc4, and it had neither of those symptoms, so I can 
> eventually bisect if needed.
> 
> === [ snip ] ==
> [    0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6)
> [    0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
> [    0.136000] ... version:                2
> [    0.136000] ... bit width:              40
> [    0.136000] ... generic registers:      2
> [    0.136000] ... value mask:             000000ffffffffff
> [    0.136000] ... max period:             000000007fffffff
> [    0.136000] ... fixed-purpose events:   3
> [    0.136000] ... event mask:             0000000700000003
> [    0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> [    0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190)

'Curious'.. :/

x86_perf_event_set_period() only does:

  wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);

and hwc->event ends up being:

  MSR_ARCH_PERFMON_PERFCTR0 + index

>From which we can deduce that index = 0xdf - 0xc1 = 30, which is
somewhat larger than the max reported number of counters (2).

Lemme go see how that can happen.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: x86 PMU broken in current Linus' tree
  2016-08-02 12:55 ` Peter Zijlstra
@ 2016-08-02 13:31   ` Jiri Kosina
  2016-08-08 10:26   ` Peter Zijlstra
  1 sibling, 0 replies; 13+ messages in thread
From: Jiri Kosina @ 2016-08-02 13:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86,
	linux-kernel

On Tue, 2 Aug 2016, Peter Zijlstra wrote:

> > With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR 
> > write warning during bootup, and kernel panic when shutting PMUs down 
> > during poweroff.
> > 
> > The MSR warning is below, the camera capture of the poweroff panic can be 
> > found at
> > 
> > 	http://www.jikos.cz/jikos/junk/pmu-panic.jpg
> > 
> > The last previous kernel version that I've booted on this particular 
> > machine was 4.7.0-rc4, and it had neither of those symptoms, so I can 
> > eventually bisect if needed.
> > 
> > === [ snip ] ==
> > [    0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6)
> > [    0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
> > [    0.136000] ... version:                2
> > [    0.136000] ... bit width:              40
> > [    0.136000] ... generic registers:      2
> > [    0.136000] ... value mask:             000000ffffffffff
> > [    0.136000] ... max period:             000000007fffffff
> > [    0.136000] ... fixed-purpose events:   3
> > [    0.136000] ... event mask:             0000000700000003
> > [    0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> > [    0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190)
> 
> 'Curious'.. :/
> 
> x86_perf_event_set_period() only does:
> 
>   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
> 
> and hwc->event ends up being:
> 
>   MSR_ARCH_PERFMON_PERFCTR0 + index
> 
> From which we can deduce that index = 0xdf - 0xc1 = 30, which is
> somewhat larger than the max reported number of counters (2).
> 
> Lemme go see how that can happen.

FTR, I tried the very same kernel on Xeon E5, and the issue didn't pop up. 
So it might be somehow specific to the older Core2, or somehow otherwise 
not really completely generic problem.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: x86 PMU broken in current Linus' tree
  2016-08-02 12:55 ` Peter Zijlstra
  2016-08-02 13:31   ` Jiri Kosina
@ 2016-08-08 10:26   ` Peter Zijlstra
  2016-08-08 14:41     ` Jiri Kosina
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2016-08-08 10:26 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86,
	linux-kernel, Mike Galbraith

On Tue, Aug 02, 2016 at 02:55:24PM +0200, Peter Zijlstra wrote:
> On Tue, Aug 02, 2016 at 02:04:58PM +0200, Jiri Kosina wrote:
> > With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR 
> > write warning during bootup, and kernel panic when shutting PMUs down 
> > during poweroff.
> > 
> > The MSR warning is below, the camera capture of the poweroff panic can be 
> > found at
> > 
> > 	http://www.jikos.cz/jikos/junk/pmu-panic.jpg
> > 
> > The last previous kernel version that I've booted on this particular 
> > machine was 4.7.0-rc4, and it had neither of those symptoms, so I can 
> > eventually bisect if needed.
> > 
> > === [ snip ] ==
> > [    0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6)
> > [    0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
> > [    0.136000] ... version:                2
> > [    0.136000] ... bit width:              40
> > [    0.136000] ... generic registers:      2
> > [    0.136000] ... value mask:             000000ffffffffff
> > [    0.136000] ... max period:             000000007fffffff
> > [    0.136000] ... fixed-purpose events:   3
> > [    0.136000] ... event mask:             0000000700000003
> > [    0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> > [    0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190)
> 
> 'Curious'.. :/
> 
> x86_perf_event_set_period() only does:
> 
>   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
> 
> and hwc->event ends up being:
> 
>   MSR_ARCH_PERFMON_PERFCTR0 + index
> 
> From which we can deduce that index = 0xdf - 0xc1 = 30, which is
> somewhat larger than the max reported number of counters (2).
> 
> Lemme go see how that can happen.

So I can reproduce on my Lenovo T500 which has a Core2 as well. By long
and tedious printk() it looks like the event constraint:

  FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */

which is updated in:

  intel_pmu_init()

to include the generic counter masks, gets corrupted for some reason.
But the moment I put printk()s in there to print the idxmsk64 values,
everything works as expected again.

I'll go prod mode.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: x86 PMU broken in current Linus' tree
  2016-08-08 10:26   ` Peter Zijlstra
@ 2016-08-08 14:41     ` Jiri Kosina
  2016-08-08 16:12       ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Kosina @ 2016-08-08 14:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86,
	linux-kernel, Mike Galbraith, Borislav Petkov

On Mon, 8 Aug 2016, Peter Zijlstra wrote:

> So I can reproduce on my Lenovo T500 which has a Core2 as well. By long
> and tedious printk() it looks like the event constraint:
> 
>   FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */

[ adding Boris to CC ]

This made me wonder, and it turned out that HWEIGHT() doesn't seem to be 
doing the right thing on machines that don't have popcnt instruction 
(which is probably the distinguishing factor here).

How about the patch below? It fixes the symptoms on my system.




From: Jiri Kosina <jkosina@suse.cz>
Subject: [PATCH] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations

The open-coded version of __sw_hweight32() and __sw_hweight64() are broken;
the variable <-> register mapping in the comments doesn't match the registers
being used, resulting in wrong calculations.

This has a potential to demonstrate itself by various syptoms on machines
where this is actually being used to compute the hweight (due to lack of
popcnt insn). On my core2 system, this demonstrates itself as

 NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
 unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc
t_set_period+0xdc/0x190)
  ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660
  ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760
  ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0
 Call Trace:
  [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110
  [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0
  [<ffffffff90175642>] perf_pmu_enable+0x22/0x30
  [<ffffffff901766b1>] ctx_resched+0x51/0x60
  [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240
  [<ffffffff9016e5f9>] event_function+0xa9/0x180
  [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
  [<ffffffff9016fcef>] remote_function+0x3f/0x50
  [<ffffffff901012b6>] generic_exec_single+0xb6/0x140
  [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
  [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
  [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110
  [<ffffffff9016e984>] cpu_function_call+0x34/0x40
  [<ffffffff9016e550>] ? list_del_event+0x150/0x150
  [<ffffffff9016ecda>] event_function_call+0x11a/0x120
  [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
  [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70
  [<ffffffff901736ac>] perf_event_enable+0x1c/0x40
  [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0
  [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0
  [<ffffffff90092360>] ? sort_range+0x30/0x30
  [<ffffffff9008e7e2>] kthread+0xf2/0x110
  [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110
  [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40
  [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220

as FIXED_EVENT_CONSTRAINT() doesn't procude correct results, and followup panic
in x86_pmu_stop(). YMMV.

Fix this by rewriting the code so that it actually matches the math in
comments.

Fixes: f5967101e9d ("x86/hweight: Get rid of the special calling convention")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 arch/x86/lib/hweight.S | 43 +++++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/arch/x86/lib/hweight.S b/arch/x86/lib/hweight.S
index 02de3d7..9d0540f 100644
--- a/arch/x86/lib/hweight.S
+++ b/arch/x86/lib/hweight.S
@@ -8,24 +8,23 @@
  */
 ENTRY(__sw_hweight32)
 
+	__ASM_SIZE(push,) %__ASM_REG(dx)
 #ifdef CONFIG_X86_64
-	movl %edi, %eax				# w
+	movl %edi, %edx				# w -> t
 #endif
-	__ASM_SIZE(push,) %__ASM_REG(dx)
-	movl %eax, %edx				# w -> t
-	shrl %edx				# t >>= 1
-	andl $0x55555555, %edx			# t &= 0x55555555
-	subl %edx, %eax				# w -= t
+	shrl %edx				# w >>= 1
+	andl $0x55555555, %edx			# w &= 0x55555555
+	subl %edx, %edi				# w -= t
+	movl %edi, %eax				# w -> t
 
-	movl %eax, %edx				# w -> t
-	shrl $2, %eax				# w_tmp >>= 2
-	andl $0x33333333, %edx			# t	&= 0x33333333
-	andl $0x33333333, %eax			# w_tmp &= 0x33333333
-	addl %edx, %eax				# w = w_tmp + t
+	shrl $2, %edi				# w_tmp >>= 2
+	andl $0x33333333, %eax			# t	&= 0x33333333
+	andl $0x33333333, %edi			# w_tmp &= 0x33333333
+	addl %eax, %edi				# w = w_tmp + t
 
-	movl %eax, %edx				# w -> t
-	shrl $4, %edx				# t >>= 4
-	addl %edx, %eax				# w_tmp += t
+	movl %edi, %eax				# w -> t
+	shrl $4, %eax				# t >>= 4
+	addl %edi, %eax				# w_tmp += t
 	andl  $0x0f0f0f0f, %eax			# w_tmp &= 0x0f0f0f0f
 	imull $0x01010101, %eax, %eax		# w_tmp *= 0x01010101
 	shrl $24, %eax				# w = w_tmp >> 24
@@ -40,20 +39,20 @@ ENTRY(__sw_hweight64)
 	movq    %rdi, %rdx                      # w -> t
 	movabsq $0x5555555555555555, %rax
 	shrq    %rdx                            # t >>= 1
-	andq    %rdx, %rax                      # t &= 0x5555555555555555
+	andq    %rax, %rdx                      # t &= 0x5555555555555555
+	subq    %rdx, %rdi                      # w -= t
 	movabsq $0x3333333333333333, %rdx
-	subq    %rax, %rdi                      # w -= t
 
 	movq    %rdi, %rax                      # w -> t
 	shrq    $2, %rdi                        # w_tmp >>= 2
 	andq    %rdx, %rax                      # t     &= 0x3333333333333333
-	andq    %rdi, %rdx                      # w_tmp &= 0x3333333333333333
-	addq    %rdx, %rax                      # w = w_tmp + t
-
-	movq    %rax, %rdx                      # w -> t
-	shrq    $4, %rdx                        # t >>= 4
-	addq    %rdx, %rax                      # w_tmp += t
+	andq    %rdx, %rdi                      # w_tmp &= 0x3333333333333333
 	movabsq $0x0f0f0f0f0f0f0f0f, %rdx
+	addq    %rax, %rdi                      # w = w_tmp + t
+
+	movq    %rdi, %rax                      # w -> t
+	shrq    $4, %rax                        # t >>= 4
+	addq    %rdi, %rax                      # w_tmp += t
 	andq    %rdx, %rax                      # w_tmp &= 0x0f0f0f0f0f0f0f0f
 	movabsq $0x0101010101010101, %rdx
 	imulq   %rdx, %rax                      # w_tmp *= 0x0101010101010101

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-08 14:41     ` Jiri Kosina
@ 2016-08-08 16:12       ` Jiri Kosina
  2016-08-08 16:34         ` kbuild test robot
                           ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jiri Kosina @ 2016-08-08 16:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86,
	linux-kernel, Mike Galbraith, Borislav Petkov

From: Jiri Kosina <jkosina@suse.cz>

The open-coded version of __sw_hweight32() and __sw_hweight64() are 
broken; the variable <-> register mapping in the comments doesn't match 
the registers being used, resulting in wrong calculations.

This has a potential to demonstrate itself by various syptoms on machines 
where this is actually being used to compute the hweight (due to lack of 
popcnt insn). On my core2 system, this demonstrates itself as

 NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
 unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acct_set_period+0xdc/0x190)
  ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660
  ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760
  ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0
 Call Trace:
  [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110
  [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0
  [<ffffffff90175642>] perf_pmu_enable+0x22/0x30
  [<ffffffff901766b1>] ctx_resched+0x51/0x60
  [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240
  [<ffffffff9016e5f9>] event_function+0xa9/0x180
  [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
  [<ffffffff9016fcef>] remote_function+0x3f/0x50
  [<ffffffff901012b6>] generic_exec_single+0xb6/0x140
  [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
  [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
  [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110
  [<ffffffff9016e984>] cpu_function_call+0x34/0x40
  [<ffffffff9016e550>] ? list_del_event+0x150/0x150
  [<ffffffff9016ecda>] event_function_call+0x11a/0x120
  [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
  [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70
  [<ffffffff901736ac>] perf_event_enable+0x1c/0x40
  [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0
  [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0
  [<ffffffff90092360>] ? sort_range+0x30/0x30
  [<ffffffff9008e7e2>] kthread+0xf2/0x110
  [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110
  [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40
  [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220

as FIXED_EVENT_CONSTRAINT() doesn't produce correct results, and followup 
panic in x86_pmu_stop(). YMMV.

Fix this by rewriting the code so that it actually matches the math 
lib/hweight.c. While at it, nuke the original comments altogether; the 
"variable" names don't really match what's being used in the C-version of 
the function, and I find them to be more misleading than helping.

This version of gcc:

	gcc (SUSE Linux) 4.8.5

produces identical assembly code from lib/hweight.c (modulo frame pointer 
maths).

Fixes: f5967101e9d ("x86/hweight: Get rid of the special calling convention")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---

v1 -> v2: remove the comments altogether; put "approved by gcc" note into 
          changelog

 arch/x86/lib/hweight.S | 69 +++++++++++++++++++++++++-------------------------
 1 file changed, 34 insertions(+), 35 deletions(-)

diff --git a/arch/x86/lib/hweight.S b/arch/x86/lib/hweight.S
index 02de3d7..e1cea44f 100644
--- a/arch/x86/lib/hweight.S
+++ b/arch/x86/lib/hweight.S
@@ -8,56 +8,55 @@
  */
 ENTRY(__sw_hweight32)
 
-#ifdef CONFIG_X86_64
-	movl %edi, %eax				# w
-#endif
 	__ASM_SIZE(push,) %__ASM_REG(dx)
-	movl %eax, %edx				# w -> t
-	shrl %edx				# t >>= 1
-	andl $0x55555555, %edx			# t &= 0x55555555
-	subl %edx, %eax				# w -= t
 
-	movl %eax, %edx				# w -> t
-	shrl $2, %eax				# w_tmp >>= 2
-	andl $0x33333333, %edx			# t	&= 0x33333333
-	andl $0x33333333, %eax			# w_tmp &= 0x33333333
-	addl %edx, %eax				# w = w_tmp + t
+	movl %edi, %edx
+
+	shrl %edx
+	andl $0x55555555, %edx
+	subl %edx, %edi
+	movl %edi, %eax
+
+	shrl $2, %edi
+	andl $0x33333333, %eax
+	andl $0x33333333, %edi
+	addl %eax, %edi
 
-	movl %eax, %edx				# w -> t
-	shrl $4, %edx				# t >>= 4
-	addl %edx, %eax				# w_tmp += t
-	andl  $0x0f0f0f0f, %eax			# w_tmp &= 0x0f0f0f0f
-	imull $0x01010101, %eax, %eax		# w_tmp *= 0x01010101
-	shrl $24, %eax				# w = w_tmp >> 24
+	movl %edi, %eax
+	shrl $4, %eax
+	addl %edi, %eax
+	andl  $0x0f0f0f0f, %eax
+	imull $0x01010101, %eax, %eax
+	shrl $24, %eax
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	ret
 ENDPROC(__sw_hweight32)
 
 ENTRY(__sw_hweight64)
-#ifdef CONFIG_X86_64
+
 	pushq   %rdx
 
-	movq    %rdi, %rdx                      # w -> t
+	movq    %rdi, %rdx
 	movabsq $0x5555555555555555, %rax
-	shrq    %rdx                            # t >>= 1
-	andq    %rdx, %rax                      # t &= 0x5555555555555555
+	shrq    %rdx
+	andq    %rax, %rdx
+	subq    %rdx, %rdi
 	movabsq $0x3333333333333333, %rdx
-	subq    %rax, %rdi                      # w -= t
-
-	movq    %rdi, %rax                      # w -> t
-	shrq    $2, %rdi                        # w_tmp >>= 2
-	andq    %rdx, %rax                      # t     &= 0x3333333333333333
-	andq    %rdi, %rdx                      # w_tmp &= 0x3333333333333333
-	addq    %rdx, %rax                      # w = w_tmp + t
 
-	movq    %rax, %rdx                      # w -> t
-	shrq    $4, %rdx                        # t >>= 4
-	addq    %rdx, %rax                      # w_tmp += t
+	movq    %rdi, %rax
+	shrq    $2, %rdi
+	andq    %rdx, %rax
+	andq    %rdx, %rdi
 	movabsq $0x0f0f0f0f0f0f0f0f, %rdx
-	andq    %rdx, %rax                      # w_tmp &= 0x0f0f0f0f0f0f0f0f
+	addq    %rax, %rdi
+
+	movq    %rdi, %rax
+	shrq    $4, %rax
+	addq    %rdi, %rax
+	andq    %rdx, %rax
 	movabsq $0x0101010101010101, %rdx
-	imulq   %rdx, %rax                      # w_tmp *= 0x0101010101010101
-	shrq    $56, %rax                       # w = w_tmp >> 56
+	imulq   %rdx, %rax
+	shrq    $56, %rax
 
 	popq    %rdx
 	ret
-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-08 16:12       ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina
@ 2016-08-08 16:34         ` kbuild test robot
  2016-08-08 18:48         ` Jiri Kosina
  2016-08-10 13:59         ` Ingo Molnar
  2 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2016-08-08 16:34 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: kbuild-all, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel,
	Mike Galbraith, Borislav Petkov

[-- Attachment #1: Type: text/plain, Size: 2538 bytes --]

Hi Jiri,

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v4.8-rc1 next-20160808]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jiri-Kosina/x86-hweight-fix-open-coded-versions-of-32bit-and-64bit-hweight-calculations/20160809-001505
config: i386-tinyconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

>> arch/x86/lib/hweight.S:63:2: error: #else without #if
    #else /* CONFIG_X86_32 */
     ^~~~
>> arch/x86/lib/hweight.S:75:2: error: #endif without #if
    #endif
     ^~~~~

vim +63 arch/x86/lib/hweight.S

f5967101 Borislav Petkov 2016-05-30  57  	movabsq $0x0101010101010101, %rdx
4792d6cc Jiri Kosina     2016-08-08  58  	imulq   %rdx, %rax
4792d6cc Jiri Kosina     2016-08-08  59  	shrq    $56, %rax
f5967101 Borislav Petkov 2016-05-30  60  
f5967101 Borislav Petkov 2016-05-30  61  	popq    %rdx
f5967101 Borislav Petkov 2016-05-30  62  	ret
f5967101 Borislav Petkov 2016-05-30 @63  #else /* CONFIG_X86_32 */
f5967101 Borislav Petkov 2016-05-30  64  	/* We're getting an u64 arg in (%eax,%edx): unsigned long hweight64(__u64 w) */
f5967101 Borislav Petkov 2016-05-30  65  	pushl   %ecx
f5967101 Borislav Petkov 2016-05-30  66  
f5967101 Borislav Petkov 2016-05-30  67  	call    __sw_hweight32
f5967101 Borislav Petkov 2016-05-30  68  	movl    %eax, %ecx                      # stash away result
f5967101 Borislav Petkov 2016-05-30  69  	movl    %edx, %eax                      # second part of input
f5967101 Borislav Petkov 2016-05-30  70  	call    __sw_hweight32
f5967101 Borislav Petkov 2016-05-30  71  	addl    %ecx, %eax                      # result
f5967101 Borislav Petkov 2016-05-30  72  
f5967101 Borislav Petkov 2016-05-30  73  	popl    %ecx
f5967101 Borislav Petkov 2016-05-30  74  	ret
f5967101 Borislav Petkov 2016-05-30 @75  #endif
f5967101 Borislav Petkov 2016-05-30  76  ENDPROC(__sw_hweight64)

:::::: The code at line 63 was first introduced by commit
:::::: f5967101e9de12addcda4510dfbac66d7c5779c3 x86/hweight: Get rid of the special calling convention

:::::: TO: Borislav Petkov <bp@suse.de>
:::::: CC: Ingo Molnar <mingo@kernel.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 6290 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-08 16:12       ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina
  2016-08-08 16:34         ` kbuild test robot
@ 2016-08-08 18:48         ` Jiri Kosina
  2016-08-08 18:53           ` Borislav Petkov
  2016-08-10 13:59         ` Ingo Molnar
  2 siblings, 1 reply; 13+ messages in thread
From: Jiri Kosina @ 2016-08-08 18:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86,
	linux-kernel, Mike Galbraith, Borislav Petkov

Okay, scratch this thread, it's been just covered (and Linus already 
applied it) in

	https://lkml.kernel.org/r/1470677729-10561-1-git-send-email-ville.syrjala@linux.intel.com

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-08 18:48         ` Jiri Kosina
@ 2016-08-08 18:53           ` Borislav Petkov
  2016-08-08 18:55             ` Jiri Kosina
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2016-08-08 18:53 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, x86, linux-kernel, Mike Galbraith,
	Borislav Petkov

On Mon, Aug 08, 2016 at 08:48:44PM +0200, Jiri Kosina wrote:
> Okay, scratch this thread, it's been just covered (and Linus already 
> applied it) in
> 
> 	https://lkml.kernel.org/r/1470677729-10561-1-git-send-email-ville.syrjala@linux.intel.com

Can you confirm it fixes the issue on your machine too?

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-08 18:53           ` Borislav Petkov
@ 2016-08-08 18:55             ` Jiri Kosina
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Kosina @ 2016-08-08 18:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, x86, linux-kernel, Mike Galbraith,
	Borislav Petkov

On Mon, 8 Aug 2016, Borislav Petkov wrote:

> > Okay, scratch this thread, it's been just covered (and Linus already 
> > applied it) in
> > 
> > 	https://lkml.kernel.org/r/1470677729-10561-1-git-send-email-ville.syrjala@linux.intel.com
> 
> Can you confirm it fixes the issue on your machine too?

Confirmed, it does. Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-08 16:12       ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina
  2016-08-08 16:34         ` kbuild test robot
  2016-08-08 18:48         ` Jiri Kosina
@ 2016-08-10 13:59         ` Ingo Molnar
  2016-08-10 14:05           ` Jiri Kosina
  2 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2016-08-10 13:59 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, x86, linux-kernel, Mike Galbraith,
	Borislav Petkov


* Jiri Kosina <jikos@kernel.org> wrote:

>  ENTRY(__sw_hweight64)
> -#ifdef CONFIG_X86_64
> +

So this removes an #ifdef without removing the #else branch.

Help?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-10 13:59         ` Ingo Molnar
@ 2016-08-10 14:05           ` Jiri Kosina
  2016-08-11  0:09             ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Kosina @ 2016-08-10 14:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, x86, linux-kernel, Mike Galbraith,
	Borislav Petkov

On Wed, 10 Aug 2016, Ingo Molnar wrote:

> >  ENTRY(__sw_hweight64)
> > -#ifdef CONFIG_X86_64
> > +
> 
> So this removes an #ifdef without removing the #else branch.

Hi Ingo,

this patch is obsolete; the real issue has already been fixed by 65ea11ec6 
("x86/hweight: Don't clobber %rdi") in Linus' tree already.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations
  2016-08-10 14:05           ` Jiri Kosina
@ 2016-08-11  0:09             ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2016-08-11  0:09 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, x86, linux-kernel, Mike Galbraith,
	Borislav Petkov


* Jiri Kosina <jikos@kernel.org> wrote:

> On Wed, 10 Aug 2016, Ingo Molnar wrote:
> 
> > >  ENTRY(__sw_hweight64)
> > > -#ifdef CONFIG_X86_64
> > > +
> > 
> > So this removes an #ifdef without removing the #else branch.
> 
> Hi Ingo,
> 
> this patch is obsolete; the real issue has already been fixed by 65ea11ec6 
> ("x86/hweight: Don't clobber %rdi") in Linus' tree already.

Ok, great!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-08-11  0:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-02 12:04 x86 PMU broken in current Linus' tree Jiri Kosina
2016-08-02 12:55 ` Peter Zijlstra
2016-08-02 13:31   ` Jiri Kosina
2016-08-08 10:26   ` Peter Zijlstra
2016-08-08 14:41     ` Jiri Kosina
2016-08-08 16:12       ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina
2016-08-08 16:34         ` kbuild test robot
2016-08-08 18:48         ` Jiri Kosina
2016-08-08 18:53           ` Borislav Petkov
2016-08-08 18:55             ` Jiri Kosina
2016-08-10 13:59         ` Ingo Molnar
2016-08-10 14:05           ` Jiri Kosina
2016-08-11  0:09             ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.