* "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
@ 2018-01-04  1:59 Thomas Zeitlhofer
  2018-01-04 10:20 ` Thomas Zeitlhofer
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-04  1:59 UTC (permalink / raw)
  To: linux-kernel

Hello,

on an Ivybridge CPU, I get with 4.14.11:

   BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
   caller is native_flush_tlb_single+0x57/0xc0
   CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
   Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
   Call Trace:
    dump_stack+0x5c/0x86
    check_preemption_disabled+0xdd/0xe0
    native_flush_tlb_single+0x57/0xc0
    ? __set_pte_vaddr+0x2d/0x40
    __set_pte_vaddr+0x2d/0x40
    set_pte_vaddr+0x2f/0x40
    cea_set_pte+0x30/0x40
    ds_update_cea.constprop.4+0x4d/0x70
    reserve_ds_buffers+0x159/0x410
    ? wp_page_copy+0x36d/0x6a0
    x86_reserve_hardware+0x150/0x160
    x86_pmu_event_init+0x3e/0x1f0
    perf_try_init_event+0x69/0x80
    perf_event_alloc+0x652/0x740
    SyS_perf_event_open+0x3f6/0xd60
    do_syscall_64+0x5c/0x190
    entry_SYSCALL64_slow_path+0x25/0x25
   RIP: 0033:0x74a1d94580b9
   RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
   RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
   RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
   R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
   R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600

This does not show up when booting with pti=off.

Maybe it is related to the issue that is fixed for the upcoming 4.4.110
release by https://lkml.org/lkml/2018/1/3/692

Thanks,

Thomas


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04  1:59 "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11 Thomas Zeitlhofer
@ 2018-01-04 10:20 ` Thomas Zeitlhofer
  2018-01-04 10:51   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-04 10:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Hugh Dickins; +Cc: linux-kernel

On Thu, Jan 04, 2018 at 02:59:06AM +0100, Thomas Zeitlhofer wrote:
> Hello,
> 
> on an Ivybridge CPU, I get with 4.14.11:
> 
>    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
>    caller is native_flush_tlb_single+0x57/0xc0
>    CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
>    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
>    Call Trace:
>     dump_stack+0x5c/0x86
>     check_preemption_disabled+0xdd/0xe0
>     native_flush_tlb_single+0x57/0xc0
>     ? __set_pte_vaddr+0x2d/0x40
>     __set_pte_vaddr+0x2d/0x40
>     set_pte_vaddr+0x2f/0x40
>     cea_set_pte+0x30/0x40
>     ds_update_cea.constprop.4+0x4d/0x70
>     reserve_ds_buffers+0x159/0x410
>     ? wp_page_copy+0x36d/0x6a0
>     x86_reserve_hardware+0x150/0x160
>     x86_pmu_event_init+0x3e/0x1f0
>     perf_try_init_event+0x69/0x80
>     perf_event_alloc+0x652/0x740
>     SyS_perf_event_open+0x3f6/0xd60
>     do_syscall_64+0x5c/0x190
>     entry_SYSCALL64_slow_path+0x25/0x25
>    RIP: 0033:0x74a1d94580b9
>    RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
>    RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
>    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
>    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
>    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
>    R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> 
> This does not show up when booting with pti=off.
> 
> Maybe it is related to the issue that is fixed for the upcoming 4.4.110
> release by https://lkml.org/lkml/2018/1/3/692

JFYI, the very same kernel does not show this issue on a Haswell CPU.

Thanks,

Thomas


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 10:20 ` Thomas Zeitlhofer
@ 2018-01-04 10:51   ` Greg Kroah-Hartman
  2018-01-04 12:43     ` Thomas Zeitlhofer
  0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-04 10:51 UTC (permalink / raw)
  To: Thomas Zeitlhofer; +Cc: Hugh Dickins, linux-kernel

On Thu, Jan 04, 2018 at 11:20:29AM +0100, Thomas Zeitlhofer wrote:
> On Thu, Jan 04, 2018 at 02:59:06AM +0100, Thomas Zeitlhofer wrote:
> > Hello,
> > 
> > on an Ivybridge CPU, I get with 4.14.11:
> > 
> >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
> >    caller is native_flush_tlb_single+0x57/0xc0
> >    CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
> >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> >    Call Trace:
> >     dump_stack+0x5c/0x86
> >     check_preemption_disabled+0xdd/0xe0
> >     native_flush_tlb_single+0x57/0xc0
> >     ? __set_pte_vaddr+0x2d/0x40
> >     __set_pte_vaddr+0x2d/0x40
> >     set_pte_vaddr+0x2f/0x40
> >     cea_set_pte+0x30/0x40
> >     ds_update_cea.constprop.4+0x4d/0x70
> >     reserve_ds_buffers+0x159/0x410
> >     ? wp_page_copy+0x36d/0x6a0
> >     x86_reserve_hardware+0x150/0x160
> >     x86_pmu_event_init+0x3e/0x1f0
> >     perf_try_init_event+0x69/0x80
> >     perf_event_alloc+0x652/0x740
> >     SyS_perf_event_open+0x3f6/0xd60
> >     do_syscall_64+0x5c/0x190
> >     entry_SYSCALL64_slow_path+0x25/0x25
> >    RIP: 0033:0x74a1d94580b9
> >    RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> >    RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
> >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
> >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> >    R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> > 
> > This does not show up when booting with pti=off.
> > 
> > Maybe it is related to the issue that is fixed for the upcoming 4.4.110
> > release by https://lkml.org/lkml/2018/1/3/692

I don't understand this link.  The 4.4 and 4.9 backports are much
different than the 4.14 tree.

> JFYI, the very same kernel does not show this issue on a Haswell CPU.

I have now queued up a bunch of patches that are in Linus's tree, can
you test these out as well:
	https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.14

thanks,

greg k-h


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 10:51   ` Greg Kroah-Hartman
@ 2018-01-04 12:43     ` Thomas Zeitlhofer
  2018-01-04 12:55       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-04 12:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Hugh Dickins, linux-kernel

On Thu, Jan 04, 2018 at 11:51:11AM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 11:20:29AM +0100, Thomas Zeitlhofer wrote:
> > On Thu, Jan 04, 2018 at 02:59:06AM +0100, Thomas Zeitlhofer wrote:
> > > Hello,
> > > 
> > > on an Ivybridge CPU, I get with 4.14.11:
> > > 
> > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
> > >    caller is native_flush_tlb_single+0x57/0xc0
> > >    CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
> > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > >    Call Trace:
> > >     dump_stack+0x5c/0x86
> > >     check_preemption_disabled+0xdd/0xe0
> > >     native_flush_tlb_single+0x57/0xc0
> > >     ? __set_pte_vaddr+0x2d/0x40
> > >     __set_pte_vaddr+0x2d/0x40
> > >     set_pte_vaddr+0x2f/0x40
> > >     cea_set_pte+0x30/0x40
> > >     ds_update_cea.constprop.4+0x4d/0x70
> > >     reserve_ds_buffers+0x159/0x410
> > >     ? wp_page_copy+0x36d/0x6a0
> > >     x86_reserve_hardware+0x150/0x160
> > >     x86_pmu_event_init+0x3e/0x1f0
> > >     perf_try_init_event+0x69/0x80
> > >     perf_event_alloc+0x652/0x740
> > >     SyS_perf_event_open+0x3f6/0xd60
> > >     do_syscall_64+0x5c/0x190
> > >     entry_SYSCALL64_slow_path+0x25/0x25
> > >    RIP: 0033:0x74a1d94580b9
> > >    RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > >    RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
> > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
> > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > >    R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> > > 
> > > This does not show up when booting with pti=off.
> > > 
> > > Maybe it is related to the issue that is fixed for the upcoming 4.4.110
> > > release by https://lkml.org/lkml/2018/1/3/692
> 
> I don't understand this link.  

I found that link when trying to search for the error message. That
patch touches __native_flush_tlb_single() and mentions hardware
differences in Ivybridge and below:

	"We have many machines (Westmere, Sandybridge, Ivybridge)
	supporting PCID but not INVPCID..."

As I see the error message only on Ivybridge and not on Haswell, I came
up with the vague guess that this could be related.
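
For reference, the hardware difference quoted above can be expressed as a small check on CPUID feature bits. This is only a sketch: PCID is reported in CPUID.01H:ECX bit 17 and INVPCID in CPUID.(EAX=07H,ECX=0):EBX bit 10, but the helper names below are made up for illustration and are not kernel functions. The register values are assumed to be read elsewhere (on Linux, the `pcid` and `invpcid` flags in /proc/cpuinfo should show the same information).

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 17 reports PCID support. */
#define CPUID_1_ECX_PCID    (UINT32_C(1) << 17)
/* CPUID.(EAX=07H,ECX=0):EBX bit 10 reports INVPCID support. */
#define CPUID_7_EBX_INVPCID (UINT32_C(1) << 10)

/* Hypothetical helper: true if the leaf 1 ECX value advertises PCID. */
static inline bool cpu_has_pcid(uint32_t leaf1_ecx)
{
	return (leaf1_ecx & CPUID_1_ECX_PCID) != 0;
}

/* Hypothetical helper: true if the leaf 7 EBX value advertises INVPCID. */
static inline bool cpu_has_invpcid(uint32_t leaf7_ebx)
{
	return (leaf7_ebx & CPUID_7_EBX_INVPCID) != 0;
}

/*
 * The combination named in the quoted patch description:
 * PCID is available but INVPCID is not.
 */
static inline bool pcid_without_invpcid(uint32_t leaf1_ecx, uint32_t leaf7_ebx)
{
	return cpu_has_pcid(leaf1_ecx) && !cpu_has_invpcid(leaf7_ebx);
}
```

If the guess is right, the Ivybridge machine would return true from pcid_without_invpcid() and the Haswell machine false.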

> The 4.4 and 4.9 backports are much different than the 4.14 tree.

Yes, I have seen that.

> > JFYI, the very same kernel does not show this issue on a Haswell CPU.
> 
> I have now queued up a bunch of patches that are in Linus's tree, can
> you test these out as well:
> 	https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.14

Does not seem to make any difference - with those patches applied I
still get:

   BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4383
   caller is native_flush_tlb_single+0x57/0xc0
   CPU: 3 PID: 4383 Comm: ovsdb-server Not tainted 4.14.11-kvm-00435-g3138001170c9 #3
   Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
   Call Trace:
    dump_stack+0x5c/0x86
    check_preemption_disabled+0xdd/0xe0
    native_flush_tlb_single+0x57/0xc0
    ? __set_pte_vaddr+0x2d/0x40
    __set_pte_vaddr+0x2d/0x40
    set_pte_vaddr+0x2f/0x40
    cea_set_pte+0x30/0x40
    ds_update_cea.constprop.4+0x4d/0x70
    reserve_ds_buffers+0x159/0x410
    ? wp_page_copy+0x36d/0x6a0
    x86_reserve_hardware+0x150/0x160
    x86_pmu_event_init+0x3e/0x1f0
    perf_try_init_event+0x69/0x80
    perf_event_alloc+0x652/0x740
    SyS_perf_event_open+0x3f6/0xd60
    do_syscall_64+0x5c/0x190
    entry_SYSCALL64_slow_path+0x25/0x25
   RIP: 0033:0x755c0b8580b9
   RSP: 002b:00007fffc87cf9e8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
   RAX: ffffffffffffffda RBX: 00007fffc87cfbc0 RCX: 0000755c0b8580b9
   RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fffc87cf9f0
   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
   R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
   R13: 0000000000000000 R14: 00007fffc87cfba0 R15: 000062ea2cbff600

Thanks,

Thomas


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 12:43     ` Thomas Zeitlhofer
@ 2018-01-04 12:55       ` Greg Kroah-Hartman
  2018-01-04 15:25         ` Thomas Zeitlhofer
  0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-04 12:55 UTC (permalink / raw)
  To: Thomas Zeitlhofer, Thomas Gleixner; +Cc: Hugh Dickins, linux-kernel

On Thu, Jan 04, 2018 at 01:43:20PM +0100, Thomas Zeitlhofer wrote:
> On Thu, Jan 04, 2018 at 11:51:11AM +0100, Greg Kroah-Hartman wrote:
> > On Thu, Jan 04, 2018 at 11:20:29AM +0100, Thomas Zeitlhofer wrote:
> > > On Thu, Jan 04, 2018 at 02:59:06AM +0100, Thomas Zeitlhofer wrote:
> > > > Hello,
> > > > 
> > > > on an Ivybridge CPU, I get with 4.14.11:
> > > > 
> > > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
> > > >    caller is native_flush_tlb_single+0x57/0xc0
> > > >    CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
> > > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > >    Call Trace:
> > > >     dump_stack+0x5c/0x86
> > > >     check_preemption_disabled+0xdd/0xe0
> > > >     native_flush_tlb_single+0x57/0xc0
> > > >     ? __set_pte_vaddr+0x2d/0x40
> > > >     __set_pte_vaddr+0x2d/0x40
> > > >     set_pte_vaddr+0x2f/0x40
> > > >     cea_set_pte+0x30/0x40
> > > >     ds_update_cea.constprop.4+0x4d/0x70
> > > >     reserve_ds_buffers+0x159/0x410
> > > >     ? wp_page_copy+0x36d/0x6a0
> > > >     x86_reserve_hardware+0x150/0x160
> > > >     x86_pmu_event_init+0x3e/0x1f0
> > > >     perf_try_init_event+0x69/0x80
> > > >     perf_event_alloc+0x652/0x740
> > > >     SyS_perf_event_open+0x3f6/0xd60
> > > >     do_syscall_64+0x5c/0x190
> > > >     entry_SYSCALL64_slow_path+0x25/0x25
> > > >    RIP: 0033:0x74a1d94580b9
> > > >    RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > >    RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
> > > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
> > > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > >    R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> > > > 
> > > > This does not show up when booting with pti=off.
> > > > 
> > > > Maybe it is related to the issue that is fixed for the upcoming 4.4.110
> > > > release by https://lkml.org/lkml/2018/1/3/692
> > 
> > I don't understand this link.  
> 
> I found that link when trying to search for the error message. That
> patch touches __native_flush_tlb_single() and mentions hardware
> differences in Ivybridge and below:
> 
> 	"We have many machines (Westmere, Sandybridge, Ivybridge)
> 	supporting PCID but not INVPCID..."
> 
> As I see the error message only on Ivybridge and not on Haswell, I came
> up with the vague guess that this could be related.
> 
> > The 4.4 and 4.9 backports are much different than the 4.14 tree.
> 
> Yes, I have seen that.
> 
> > > JFYI, the very same kernel does not show this issue on a Haswell CPU.
> > 
> > I have now queued up a bunch of patches that are in Linus's tree, can
> > you test these out as well:
> > 	https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.14
> 
> Does not seem to make any difference - with those patches applied I
> still get:
> 
>    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4383
>    caller is native_flush_tlb_single+0x57/0xc0
>    CPU: 3 PID: 4383 Comm: ovsdb-server Not tainted 4.14.11-kvm-00435-g3138001170c9 #3
>    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
>    Call Trace:
>     dump_stack+0x5c/0x86
>     check_preemption_disabled+0xdd/0xe0
>     native_flush_tlb_single+0x57/0xc0
>     ? __set_pte_vaddr+0x2d/0x40
>     __set_pte_vaddr+0x2d/0x40
>     set_pte_vaddr+0x2f/0x40
>     cea_set_pte+0x30/0x40
>     ds_update_cea.constprop.4+0x4d/0x70
>     reserve_ds_buffers+0x159/0x410
>     ? wp_page_copy+0x36d/0x6a0
>     x86_reserve_hardware+0x150/0x160
>     x86_pmu_event_init+0x3e/0x1f0
>     perf_try_init_event+0x69/0x80
>     perf_event_alloc+0x652/0x740
>     SyS_perf_event_open+0x3f6/0xd60
>     do_syscall_64+0x5c/0x190
>     entry_SYSCALL64_slow_path+0x25/0x25
>    RIP: 0033:0x755c0b8580b9
>    RSP: 002b:00007fffc87cf9e8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
>    RAX: ffffffffffffffda RBX: 00007fffc87cfbc0 RCX: 0000755c0b8580b9
>    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fffc87cf9f0
>    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
>    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
>    R13: 0000000000000000 R14: 00007fffc87cfba0 R15: 000062ea2cbff600
> 

Odd, does 4.15-rc6 also trigger the same error?  Thomas is working on an
issue with KASLR (see lkml with:
	Subject: Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
)

thanks,

greg k-h


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 12:55       ` Greg Kroah-Hartman
@ 2018-01-04 15:25         ` Thomas Zeitlhofer
  2018-01-04 15:37           ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-04 15:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Thomas Gleixner, Hugh Dickins, linux-kernel

On Thu, Jan 04, 2018 at 01:55:28PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 01:43:20PM +0100, Thomas Zeitlhofer wrote:
> > On Thu, Jan 04, 2018 at 11:51:11AM +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Jan 04, 2018 at 11:20:29AM +0100, Thomas Zeitlhofer wrote:
> > > > On Thu, Jan 04, 2018 at 02:59:06AM +0100, Thomas Zeitlhofer wrote:
> > > > > Hello,
> > > > > 
> > > > > on an Ivybridge CPU, I get with 4.14.11:
> > > > > 
> > > > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
> > > > >    caller is native_flush_tlb_single+0x57/0xc0
> > > > >    CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
> > > > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > > >    Call Trace:
> > > > >     dump_stack+0x5c/0x86
> > > > >     check_preemption_disabled+0xdd/0xe0
> > > > >     native_flush_tlb_single+0x57/0xc0
> > > > >     ? __set_pte_vaddr+0x2d/0x40
> > > > >     __set_pte_vaddr+0x2d/0x40
> > > > >     set_pte_vaddr+0x2f/0x40
> > > > >     cea_set_pte+0x30/0x40
> > > > >     ds_update_cea.constprop.4+0x4d/0x70
> > > > >     reserve_ds_buffers+0x159/0x410
> > > > >     ? wp_page_copy+0x36d/0x6a0
> > > > >     x86_reserve_hardware+0x150/0x160
> > > > >     x86_pmu_event_init+0x3e/0x1f0
> > > > >     perf_try_init_event+0x69/0x80
> > > > >     perf_event_alloc+0x652/0x740
> > > > >     SyS_perf_event_open+0x3f6/0xd60
> > > > >     do_syscall_64+0x5c/0x190
> > > > >     entry_SYSCALL64_slow_path+0x25/0x25
> > > > >    RIP: 0033:0x74a1d94580b9
> > > > >    RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > > >    RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
> > > > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
> > > > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > > >    R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> > > > > 
> > > > > This does not show up when booting with pti=off.
> > > > > 
> > > > > Maybe it is related to the issue that is fixed for the upcoming 4.4.110
> > > > > release by https://lkml.org/lkml/2018/1/3/692
> > > 
> > > I don't understand this link.  
> > 
> > I found that link when trying to search for the error message. That
> > patch touches __native_flush_tlb_single() and mentions hardware
> > differences in Ivybridge and below:
> > 
> > 	"We have many machines (Westmere, Sandybridge, Ivybridge)
> > 	supporting PCID but not INVPCID..."
> > 
> > As I see the error message only on Ivybridge and not on Haswell, I came
> > up with the vague guess that this could be related.
> > 
> > > The 4.4 and 4.9 backports are much different than the 4.14 tree.
> > 
> > Yes, I have seen that.
> > 
> > > > JFYI, the very same kernel does not show this issue on a Haswell CPU.
> > > 
> > > I have now queued up a bunch of patches that are in Linus's tree, can
> > > you test these out as well:
> > > 	https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.14
> > 
> > Does not seem to make any difference - with those patches applied I
> > still get:
> > 
> >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4383
> >    caller is native_flush_tlb_single+0x57/0xc0
> >    CPU: 3 PID: 4383 Comm: ovsdb-server Not tainted 4.14.11-kvm-00435-g3138001170c9 #3
> >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> >    Call Trace:
> >     dump_stack+0x5c/0x86
> >     check_preemption_disabled+0xdd/0xe0
> >     native_flush_tlb_single+0x57/0xc0
> >     ? __set_pte_vaddr+0x2d/0x40
> >     __set_pte_vaddr+0x2d/0x40
> >     set_pte_vaddr+0x2f/0x40
> >     cea_set_pte+0x30/0x40
> >     ds_update_cea.constprop.4+0x4d/0x70
> >     reserve_ds_buffers+0x159/0x410
> >     ? wp_page_copy+0x36d/0x6a0
> >     x86_reserve_hardware+0x150/0x160
> >     x86_pmu_event_init+0x3e/0x1f0
> >     perf_try_init_event+0x69/0x80
> >     perf_event_alloc+0x652/0x740
> >     SyS_perf_event_open+0x3f6/0xd60
> >     do_syscall_64+0x5c/0x190
> >     entry_SYSCALL64_slow_path+0x25/0x25
> >    RIP: 0033:0x755c0b8580b9
> >    RSP: 002b:00007fffc87cf9e8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> >    RAX: ffffffffffffffda RBX: 00007fffc87cfbc0 RCX: 0000755c0b8580b9
> >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fffc87cf9f0
> >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> >    R13: 0000000000000000 R14: 00007fffc87cfba0 R15: 000062ea2cbff600
> > 
> 
> Odd, does 4.15-rc6 also trigger the same error? 

Yes:

   BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
   caller is native_flush_tlb_single+0x57/0xc0
   CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
   Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
   Call Trace:
    dump_stack+0x5c/0x86
    check_preemption_disabled+0xdd/0xe0
    native_flush_tlb_single+0x57/0xc0
    ? __set_pte_vaddr+0x2d/0x40
    __set_pte_vaddr+0x2d/0x40
    set_pte_vaddr+0x2f/0x40
    cea_set_pte+0x30/0x40
    ds_update_cea.constprop.4+0x4d/0x70
    reserve_ds_buffers+0x159/0x410
    ? wp_page_copy+0x370/0x6c0
    x86_reserve_hardware+0x150/0x160
    x86_pmu_event_init+0x3e/0x1f0
    perf_try_init_event+0x69/0x80
    perf_event_alloc+0x652/0x740
    SyS_perf_event_open+0x3f6/0xd60
    do_syscall_64+0x5c/0x190
    entry_SYSCALL64_slow_path+0x25/0x25
   RIP: 0033:0x72bff0a3c0b9
   RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
   RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
   RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
   R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
   R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
   device ovs-system entered promiscuous mode
   netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.

In addition, with v4.15-rc6, netlink messages like in the last line show
up, but I guess this is a different openvswitch related issue.

> Thomas is working on an
> issue with KASLR (see lkml with:
> 	Subject: Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
> )

Yes, I have also seen that thread, but I did not see any similarities to
my issue. Anyway, I also tried out the patch proposed in
https://lkml.org/lkml/2018/1/4/313 but it does not change anything here.

Thanks,

Thomas


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 15:25         ` Thomas Zeitlhofer
@ 2018-01-04 15:37           ` Thomas Gleixner
  2018-01-04 17:07             ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2018-01-04 15:37 UTC (permalink / raw)
  To: Thomas Zeitlhofer; +Cc: Greg Kroah-Hartman, Hugh Dickins, LKML, Peter Zijlstra

On Thu, 4 Jan 2018, Thomas Zeitlhofer wrote:
> On Thu, Jan 04, 2018 at 01:55:28PM +0100, Greg Kroah-Hartman wrote:
> > > > > > on an Ivybridge CPU, I get with 4.14.11:
> > > > > > 
> > > > > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
> > > > > >    caller is native_flush_tlb_single+0x57/0xc0
> > > > > >    CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
> > > > > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > > > >    Call Trace:
> > > > > >     dump_stack+0x5c/0x86
> > > > > >     check_preemption_disabled+0xdd/0xe0
> > > > > >     native_flush_tlb_single+0x57/0xc0
> > > > > >     ? __set_pte_vaddr+0x2d/0x40
> > > > > >     __set_pte_vaddr+0x2d/0x40
> > > > > >     set_pte_vaddr+0x2f/0x40
> > > > > >     cea_set_pte+0x30/0x40
> > > > > >     ds_update_cea.constprop.4+0x4d/0x70
> > > > > >     reserve_ds_buffers+0x159/0x410
> > > > > >     ? wp_page_copy+0x36d/0x6a0
> > > > > >     x86_reserve_hardware+0x150/0x160
> > > > > >     x86_pmu_event_init+0x3e/0x1f0
> > > > > >     perf_try_init_event+0x69/0x80
> > > > > >     perf_event_alloc+0x652/0x740
> > > > > >     SyS_perf_event_open+0x3f6/0xd60
> > > > > >     do_syscall_64+0x5c/0x190
> > > > > >     entry_SYSCALL64_slow_path+0x25/0x25
> > > > > >    RIP: 0033:0x74a1d94580b9
> > > > > >    RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > > > >    RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
> > > > > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
> > > > > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > > > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > > > >    R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> > > > > > 
> > > > > > This does not show up when booting with pti=off.

Right, because the code path is not invoked ....

> > Odd, does 4.15-rc6 also trigger the same error? 
> 
> Yes:
> 
>    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
>    caller is native_flush_tlb_single+0x57/0xc0
>    CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
>    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
>    Call Trace:
>     dump_stack+0x5c/0x86
>     check_preemption_disabled+0xdd/0xe0
>     native_flush_tlb_single+0x57/0xc0
>     ? __set_pte_vaddr+0x2d/0x40
>     __set_pte_vaddr+0x2d/0x40
>     set_pte_vaddr+0x2f/0x40
>     cea_set_pte+0x30/0x40
>     ds_update_cea.constprop.4+0x4d/0x70
>     reserve_ds_buffers+0x159/0x410
>     ? wp_page_copy+0x370/0x6c0
>     x86_reserve_hardware+0x150/0x160
>     x86_pmu_event_init+0x3e/0x1f0
>     perf_try_init_event+0x69/0x80
>     perf_event_alloc+0x652/0x740
>     SyS_perf_event_open+0x3f6/0xd60
>     do_syscall_64+0x5c/0x190
>     entry_SYSCALL64_slow_path+0x25/0x25
>    RIP: 0033:0x72bff0a3c0b9
>    RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
>    RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
>    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
>    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
>    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
>    R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
>    device ovs-system entered promiscuous mode
>    netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
> 
> In addition, with v4.15-rc6, netlink messages like in the last line show
> up, but I guess this is a different openvswitch related issue.
> 
> > Thomas is working on an
> > issue with KASLR (see lkml with:
> > 	Subject: Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
> > )
> 
> Yes, I have also seen that thread, but I did not see any similarities to
> my issue. Anyway, I also tried out the patch proposed in
> https://lkml.org/lkml/2018/1/4/313 but it does not change anything here.

Correct. I'm looking into a fix. Stay tuned.

Thanks,

	tglx


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 15:37           ` Thomas Gleixner
@ 2018-01-04 17:07             ` Peter Zijlstra
  2018-01-04 18:38               ` Thomas Zeitlhofer
                                 ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Peter Zijlstra @ 2018-01-04 17:07 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Thomas Zeitlhofer, Greg Kroah-Hartman, Hugh Dickins, LKML

On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > Yes:
> > 
> >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
> >    caller is native_flush_tlb_single+0x57/0xc0
> >    CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> >    Call Trace:
> >     dump_stack+0x5c/0x86
> >     check_preemption_disabled+0xdd/0xe0
> >     native_flush_tlb_single+0x57/0xc0
> >     ? __set_pte_vaddr+0x2d/0x40
> >     __set_pte_vaddr+0x2d/0x40
> >     set_pte_vaddr+0x2f/0x40
> >     cea_set_pte+0x30/0x40
> >     ds_update_cea.constprop.4+0x4d/0x70
> >     reserve_ds_buffers+0x159/0x410
> >     ? wp_page_copy+0x370/0x6c0
> >     x86_reserve_hardware+0x150/0x160
> >     x86_pmu_event_init+0x3e/0x1f0
> >     perf_try_init_event+0x69/0x80
> >     perf_event_alloc+0x652/0x740
> >     SyS_perf_event_open+0x3f6/0xd60
> >     do_syscall_64+0x5c/0x190
> >     entry_SYSCALL64_slow_path+0x25/0x25
> >    RIP: 0033:0x72bff0a3c0b9
> >    RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> >    RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
> >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
> >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> >    R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600

Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
per CPU. But the DS crud does cross CPU updates of those tables.

So we need some additional fun and games..

How's the below?

---
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8f0aace08b87..8156e47da7ba 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 
 #include <asm/cpu_entry_area.h>
 #include <asm/perf_event.h>
+#include <asm/tlbflush.h>
 #include <asm/insn.h>
 
 #include "../perf_event.h"
@@ -283,20 +284,35 @@ static DEFINE_PER_CPU(void *, insn_buffer);
 
 static void ds_update_cea(void *cea, void *addr, size_t size, pgprot_t prot)
 {
+	unsigned long start = (unsigned long)cea;
 	phys_addr_t pa;
 	size_t msz = 0;
 
 	pa = virt_to_phys(addr);
+
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, pa += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, pa, prot);
+
+	/*
+	 * This is a cross-CPU update of the cpu_entry_area, we must shoot down
+	 * all TLB entries for it.
+	 */
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void ds_clear_cea(void *cea, size_t size)
 {
+	unsigned long start = (unsigned long)cea;
 	size_t msz = 0;
 
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, 0, PAGE_NONE);
+
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
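
The shape of the change above can be modelled in plain userspace C to show why the splat goes away. This is a toy sketch, not kernel code: every model_* name is hypothetical, the "preempt count" is a simple integer, and the real debug check has further conditions that are not modelled here. It only assumes what the trace shows, namely that each per-page PTE update ends in a local flush that reads the current CPU id.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel symbols. */
static int model_preempt_count;  /* > 0 means "preemption disabled" */
static int model_bug_reports;    /* times the debug check would have fired */
static int model_pte_writes;     /* per-page updates performed */
static int model_range_flushes;  /* range flushes issued after the loop */

static inline void model_preempt_disable(void) { model_preempt_count++; }
static inline void model_preempt_enable(void)  { model_preempt_count--; }

/* Models the debug check: reading the CPU id while preemptible is reported. */
static inline int model_smp_processor_id(void)
{
	if (model_preempt_count == 0)
		model_bug_reports++;
	return 0; /* only one CPU is modelled */
}

/* Models one PTE write followed by a local flush that reads the CPU id. */
static inline void model_set_pte(void)
{
	model_pte_writes++;
	(void)model_smp_processor_id();
}

/* Models the range flush issued once after all pages are updated. */
static inline void model_flush_range(void) { model_range_flushes++; }

/* Shape before the patch: the loop runs in preemptible context. */
static inline void model_update_unguarded(size_t size, size_t page_size)
{
	for (size_t msz = 0; msz < size; msz += page_size)
		model_set_pte();
}

/* Shape after the patch: loop and range flush run with preemption off. */
static inline void model_update_guarded(size_t size, size_t page_size)
{
	model_preempt_disable();
	for (size_t msz = 0; msz < size; msz += page_size)
		model_set_pte();
	model_flush_range();
	model_preempt_enable();
}
```

In this model the unguarded variant records one report per page, while the guarded variant records none and issues a single range flush. Whether the range flush in the real patch reaches every CPU that may hold stale entries is a property of flush_tlb_kernel_range() itself and is not captured by the model.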


* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 17:07             ` Peter Zijlstra
@ 2018-01-04 18:38               ` Thomas Zeitlhofer
  2018-01-06 21:38                 ` Thomas Zeitlhofer
  2018-01-04 22:11               ` [tip:x86/pti] x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers tip-bot for Peter Zijlstra
  2018-01-04 23:49               ` tip-bot for Peter Zijlstra
  2 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-04 18:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Thomas Gleixner, Greg Kroah-Hartman, Hugh Dickins, LKML

On Thu, Jan 04, 2018 at 06:07:12PM +0100, Peter Zijlstra wrote:
> On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > > Yes:
> > > 
> > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
> > >    caller is native_flush_tlb_single+0x57/0xc0
> > >    CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > >    Call Trace:
> > >     dump_stack+0x5c/0x86
> > >     check_preemption_disabled+0xdd/0xe0
> > >     native_flush_tlb_single+0x57/0xc0
> > >     ? __set_pte_vaddr+0x2d/0x40
> > >     __set_pte_vaddr+0x2d/0x40
> > >     set_pte_vaddr+0x2f/0x40
> > >     cea_set_pte+0x30/0x40
> > >     ds_update_cea.constprop.4+0x4d/0x70
> > >     reserve_ds_buffers+0x159/0x410
> > >     ? wp_page_copy+0x370/0x6c0
> > >     x86_reserve_hardware+0x150/0x160
> > >     x86_pmu_event_init+0x3e/0x1f0
> > >     perf_try_init_event+0x69/0x80
> > >     perf_event_alloc+0x652/0x740
> > >     SyS_perf_event_open+0x3f6/0xd60
> > >     do_syscall_64+0x5c/0x190
> > >     entry_SYSCALL64_slow_path+0x25/0x25
> > >    RIP: 0033:0x72bff0a3c0b9
> > >    RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > >    RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
> > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
> > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > >    R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
> 
> Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
> per CPU. But the DS crud does cross CPU updates of those tables.
> 
> So we need some additional fun and games..
> 
> How's the below?
[...]

Looks good - I have successfully tested it on top of 4.14.11 and
4.15-rc6. In both cases, the error message is gone when this patch is
applied.

Thanks,

Thomas

* [tip:x86/pti] x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers
  2018-01-04 17:07             ` Peter Zijlstra
  2018-01-04 18:38               ` Thomas Zeitlhofer
@ 2018-01-04 22:11               ` tip-bot for Peter Zijlstra
  2018-01-04 23:49               ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Peter Zijlstra @ 2018-01-04 22:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: gregkh, peterz, hughd, thomas.zeitlhofer+lkml, mingo,
	linux-kernel, tglx, hpa

Commit-ID:  2f8411c2e98d93100de55d413f7c54a090bdf04e
Gitweb:     https://git.kernel.org/tip/2f8411c2e98d93100de55d413f7c54a090bdf04e
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Thu, 4 Jan 2018 18:07:12 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 4 Jan 2018 23:04:58 +0100

x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers

Thomas reported the following warning:

 BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
 caller is native_flush_tlb_single+0x57/0xc0
 native_flush_tlb_single+0x57/0xc0
 __set_pte_vaddr+0x2d/0x40
 set_pte_vaddr+0x2f/0x40
 cea_set_pte+0x30/0x40
 ds_update_cea.constprop.4+0x4d/0x70
 reserve_ds_buffers+0x159/0x410
 x86_reserve_hardware+0x150/0x160
 x86_pmu_event_init+0x3e/0x1f0
 perf_try_init_event+0x69/0x80
 perf_event_alloc+0x652/0x740
 SyS_perf_event_open+0x3f6/0xd60
 do_syscall_64+0x5c/0x190

set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
there are two problems with that:

 1) The resulting flush is not supposed to be called in preemptible context

 2) The cpu entry area is supposed to be per CPU, but the debug store
    buffers are mapped for all CPUs so these mappings need to be flushed
    globally.

Add the necessary preemption protection across the mapping code and flush
TLBs globally.

Fixes: c1961a4631da ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net

---
 arch/x86/events/intel/ds.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8f0aace..8156e47 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 
 #include <asm/cpu_entry_area.h>
 #include <asm/perf_event.h>
+#include <asm/tlbflush.h>
 #include <asm/insn.h>
 
 #include "../perf_event.h"
@@ -283,20 +284,35 @@ static DEFINE_PER_CPU(void *, insn_buffer);
 
 static void ds_update_cea(void *cea, void *addr, size_t size, pgprot_t prot)
 {
+	unsigned long start = (unsigned long)cea;
 	phys_addr_t pa;
 	size_t msz = 0;
 
 	pa = virt_to_phys(addr);
+
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, pa += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, pa, prot);
+
+	/*
+	 * This is a cross-CPU update of the cpu_entry_area, we must shoot down
+	 * all TLB entries for it.
+	 */
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void ds_clear_cea(void *cea, size_t size)
 {
+	unsigned long start = (unsigned long)cea;
 	size_t msz = 0;
 
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, 0, PAGE_NONE);
+
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
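
The first problem named in the changelog can be illustrated with a toy
userspace model. Everything below is an assumption made for the sake of
the example: the *_model names are hypothetical, the counter only imitates
the kernel's preemption count, and the real debug check also accepts some
other situations (for instance interrupts being disabled). The sketch
shows why wrapping the per-page loop in a disable/enable pair silences the
complaint, while the unprotected loop triggers it once per page.

```c
#include <assert.h>
#include <stdbool.h>

static int preempt_count;	/* models the per-task preemption counter */
static int debug_splats;	/* how often the check would have complained */

static void preempt_disable_model(void)
{
	preempt_count++;
}

static void preempt_enable_model(void)
{
	assert(preempt_count > 0);	/* enable must pair with a disable */
	preempt_count--;
}

/* Models the debug check: complain when the caller is preemptible. */
static bool check_preemption_disabled_model(void)
{
	if (preempt_count == 0) {
		debug_splats++;
		return false;
	}
	return true;
}

/* Stand-in for one PTE write plus a local flush that needs the CPU id. */
static void cea_set_pte_model(void)
{
	check_preemption_disabled_model();
}

/* Shape of the code before the patch: nothing pins the task to a CPU. */
static void update_unprotected(int npages)
{
	for (int i = 0; i < npages; i++)
		cea_set_pte_model();
}

/* Shape of the code after the patch: the whole loop runs non-preemptibly. */
static void update_protected(int npages)
{
	preempt_disable_model();
	for (int i = 0; i < npages; i++)
		cea_set_pte_model();
	preempt_enable_model();
}
```

Calling update_unprotected() raises debug_splats by one per page, whereas
update_protected() leaves it unchanged and returns with the counter back
at zero.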

* [tip:x86/pti] x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers
  2018-01-04 17:07             ` Peter Zijlstra
  2018-01-04 18:38               ` Thomas Zeitlhofer
  2018-01-04 22:11               ` [tip:x86/pti] x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers tip-bot for Peter Zijlstra
@ 2018-01-04 23:49               ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Peter Zijlstra @ 2018-01-04 23:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: thomas.zeitlhofer+lkml, linux-kernel, tglx, peterz, gregkh,
	hughd, hpa, mingo

Commit-ID:  42f3bdc5dd962a5958bc024c1e1444248a6b8b4a
Gitweb:     https://git.kernel.org/tip/42f3bdc5dd962a5958bc024c1e1444248a6b8b4a
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Thu, 4 Jan 2018 18:07:12 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 5 Jan 2018 00:39:58 +0100

x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers

Thomas reported the following warning:

 BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
 caller is native_flush_tlb_single+0x57/0xc0
 native_flush_tlb_single+0x57/0xc0
 __set_pte_vaddr+0x2d/0x40
 set_pte_vaddr+0x2f/0x40
 cea_set_pte+0x30/0x40
 ds_update_cea.constprop.4+0x4d/0x70
 reserve_ds_buffers+0x159/0x410
 x86_reserve_hardware+0x150/0x160
 x86_pmu_event_init+0x3e/0x1f0
 perf_try_init_event+0x69/0x80
 perf_event_alloc+0x652/0x740
 SyS_perf_event_open+0x3f6/0xd60
 do_syscall_64+0x5c/0x190

set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
there are two problems with that:

 1) The resulting flush is not supposed to be called in preemptible context

 2) The cpu entry area is supposed to be per CPU, but the debug store
    buffers are mapped for all CPUs so these mappings need to be flushed
    globally.

Add the necessary preemption protection across the mapping code and flush
TLBs globally.

Fixes: c1961a4631da ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net

---
 arch/x86/events/intel/ds.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8f0aace..8156e47 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 
 #include <asm/cpu_entry_area.h>
 #include <asm/perf_event.h>
+#include <asm/tlbflush.h>
 #include <asm/insn.h>
 
 #include "../perf_event.h"
@@ -283,20 +284,35 @@ static DEFINE_PER_CPU(void *, insn_buffer);
 
 static void ds_update_cea(void *cea, void *addr, size_t size, pgprot_t prot)
 {
+	unsigned long start = (unsigned long)cea;
 	phys_addr_t pa;
 	size_t msz = 0;
 
 	pa = virt_to_phys(addr);
+
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, pa += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, pa, prot);
+
+	/*
+	 * This is a cross-CPU update of the cpu_entry_area, we must shoot down
+	 * all TLB entries for it.
+	 */
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void ds_clear_cea(void *cea, size_t size)
 {
+	unsigned long start = (unsigned long)cea;
 	size_t msz = 0;
 
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, 0, PAGE_NONE);
+
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)

* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-04 18:38               ` Thomas Zeitlhofer
@ 2018-01-06 21:38                 ` Thomas Zeitlhofer
  2018-01-07  8:17                   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-06 21:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Thomas Gleixner, Greg Kroah-Hartman, Hugh Dickins, LKML

On Thu, Jan 04, 2018 at 07:38:00PM +0100, Thomas Zeitlhofer wrote:
> On Thu, Jan 04, 2018 at 06:07:12PM +0100, Peter Zijlstra wrote:
> > On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > > > Yes:
> > > > 
> > > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
> > > >    caller is native_flush_tlb_single+0x57/0xc0
> > > >    CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> > > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > >    Call Trace:
> > > >     dump_stack+0x5c/0x86
> > > >     check_preemption_disabled+0xdd/0xe0
> > > >     native_flush_tlb_single+0x57/0xc0
> > > >     ? __set_pte_vaddr+0x2d/0x40
> > > >     __set_pte_vaddr+0x2d/0x40
> > > >     set_pte_vaddr+0x2f/0x40
> > > >     cea_set_pte+0x30/0x40
> > > >     ds_update_cea.constprop.4+0x4d/0x70
> > > >     reserve_ds_buffers+0x159/0x410
> > > >     ? wp_page_copy+0x370/0x6c0
> > > >     x86_reserve_hardware+0x150/0x160
> > > >     x86_pmu_event_init+0x3e/0x1f0
> > > >     perf_try_init_event+0x69/0x80
> > > >     perf_event_alloc+0x652/0x740
> > > >     SyS_perf_event_open+0x3f6/0xd60
> > > >     do_syscall_64+0x5c/0x190
> > > >     entry_SYSCALL64_slow_path+0x25/0x25
> > > >    RIP: 0033:0x72bff0a3c0b9
> > > >    RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > >    RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
> > > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
> > > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > >    R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
> > 
> > Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
> > per CPU. But the DS crud does cross CPU updates of those tables.
> > 
> > So we need some additional fun and games..
> > 
> > How's the below?
> [...]
> 
> Looks good - I have successfully tested it on top of 4.14.11 and
> 4.15-rc6. In both cases, the error message is gone when this patch is
> applied.

While solving the previous problem, this patch also introduces new "fun
and games"...  

Now, terminating a systemd-nspawn container reliably crashes the host
(so far tested only on Haswell, if that matters). Once, I was able to
capture the following trace:

   BUG: unable to handle kernel paging request at 0000000000206ccc
   IP: __task_pid_nr_ns+0x57/0xc0
   PGD 0 P4D 0 
   Oops: 0000 [#1] PREEMPT SMP PTI
   Modules linked in: uinput veth ip_vti ip_tunnel esp4 xfrm6_mode_tunnel fuse ccm xt_CHECKSUM tun bridge stp llc xfrm_user xfrm_algo ebtable_filter twofish_generic twofish_avx_x86_64 ebtables twofish_x86_64_3way twofish_x86_64 twofish_common vxlan ip6_udp_tunnel udp_tunnel serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic devlink blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 ablk_helper camellia_x86_64 xcbc openvswitch nf_nat_ipv6 md4 algif_hash af_alg cmac rfcomm bnep xt_policy nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat msr nf_nat_ipv4 nf_nat xt_TCPMSS iptable_mangle ipt_REJECT
    nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc iptable_filter snd_hda_codec_hdmi hid_sensor_als hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio rtsx_pci_sdmmc mmc_core iTCO_wdt wmi_bmof arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo joydev wacom videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_sensor_hub videodev btusb btrtl hid_multitouch btbcm media btintel rtsx_pci i915 bluetooth snd_hda_codec_conexant lpc_ich snd_hda_codec_generic mfd_core iwlmvm iosf_mbi i2c_algo_bit ecdh_generic
    drm_kms_helper mac80211 snd_hda_intel syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core snd_pcm_oss iwlwifi fb_sys_fops thinkpad_acpi snd_mixer_oss drm nvram snd_pcm video cfg80211 intel_gtt snd_timer rfkill snd evdev wmi ecryptfs nfsd ip_tables x_tables ipv6 crc_ccitt
   CPU: 2 PID: 1 Comm: systemd Not tainted 4.14.12-kvm-00437-gd6765c06f03d #4
   Hardware name: LENOVO 20CD0035GE/20CD0035GE, BIOS GQET40WW (1.20 ) 11/07/2014
   task: ffff9c66560e0d00 task.stack: ffffbc6a00038000
   RIP: 0010:__task_pid_nr_ns+0x57/0xc0
   RSP: 0018:ffffbc6a0003bdb0 EFLAGS: 00010246
   RAX: ffff9c66560e8680 RBX: 0000000000000000 RCX: 0000000000206cc8
   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000004d0
   RBP: 0000000000000000 R08: ffffffffb0237b10 R09: 0000000000000005
   R10: ffffbc6a0003bee0 R11: ffff9c65aa33c004 R12: ffffffffb02309a0
   R13: 0000000000001000 R14: ffff9c65ecbd4a00 R15: ffff9c6624516b00
   FS:  0000767a01669980(0000) GS:ffff9c665f280000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 0000000000206ccc CR3: 0000000215476003 CR4: 00000000001606e0
   Call Trace:
    cgroup_procs_show+0x10/0x30
    seq_read+0x30c/0x3d0
    __vfs_read+0x2e/0x150
    vfs_read+0x84/0x110
    SyS_read+0x4d/0xc0
    do_syscall_64+0x5c/0x190
    entry_SYSCALL64_slow_path+0x25/0x25
   RIP: 0033:0x767a00fa671d
   RSP: 002b:00007ffca8edc6e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
   RAX: ffffffffffffffda RBX: 000057d4d8a02c10 RCX: 0000767a00fa671d
   RDX: 0000000000001000 RSI: 000057d4d8a05320 RDI: 0000000000000083
   RBP: 0000000000000d68 R08: 0000767a01265178 R09: 0000000000001010
   R10: 000057d4d8a03490 R11: 0000000000000293 R12: 0000767a01261440
   R13: 0000767a01260900 R14: 00000000ffffffff R15: 0000000000000000
   Code: 74 0d 48 8d 44 6d 00 48 8d 3c c5 d0 04 00 00 48 8b 9b 98 04 00 00 48 01 fb 48 8b 0b 48 85 c9 74 37 41 8b b4 24 30 08 00 00 31 db <3b> 71 04 77 0d 48 c1 e6 05 48 01 f1 4c 3b 61 38 74 0c e8 12 db 
   RIP: __task_pid_nr_ns+0x57/0xc0 RSP: ffffbc6a0003bdb0
   CR2: 0000000000206ccc
   ---[ end trace ce7578070732b5ee ]---
   BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
   IP: pids_free+0xb/0x30
   PGD 0 P4D 0 
   Oops: 0000 [#2] PREEMPT SMP PTI
   Modules linked in: uinput veth ip_vti ip_tunnel esp4 xfrm6_mode_tunnel fuse ccm xt_CHECKSUM tun bridge stp llc xfrm_user xfrm_algo ebtable_filter twofish_generic twofish_avx_x86_64 ebtables twofish_x86_64_3way twofish_x86_64 twofish_common vxlan ip6_udp_tunnel udp_tunnel serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic devlink blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 ablk_helper camellia_x86_64 xcbc openvswitch nf_nat_ipv6 md4 algif_hash af_alg cmac rfcomm bnep xt_policy nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat msr nf_nat_ipv4 nf_nat xt_TCPMSS iptable_mangle ipt_REJECT
    nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc iptable_filter snd_hda_codec_hdmi hid_sensor_als hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio rtsx_pci_sdmmc mmc_core iTCO_wdt wmi_bmof arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo joydev wacom videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_sensor_hub videodev btusb btrtl hid_multitouch btbcm media btintel rtsx_pci i915 bluetooth snd_hda_codec_conexant lpc_ich snd_hda_codec_generic mfd_core iwlmvm iosf_mbi i2c_algo_bit ecdh_generic
    drm_kms_helper mac80211 snd_hda_intel syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core snd_pcm_oss iwlwifi fb_sys_fops thinkpad_acpi snd_mixer_oss drm nvram snd_pcm video cfg80211 intel_gtt snd_timer rfkill snd evdev wmi ecryptfs nfsd ip_tables x_tables ipv6 crc_ccitt
   CPU: 2 PID: 1 Comm: systemd Tainted: G      D         4.14.12-kvm-00437-gd6765c06f03d #4
   Hardware name: LENOVO 20CD0035GE/20CD0035GE, BIOS GQET40WW (1.20 ) 11/07/2014
   task: ffff9c66560e0d00 task.stack: ffffbc6a00038000
   RIP: 0010:pids_free+0xb/0x30
   RSP: 0018:ffffbc6a0003bdd8 EFLAGS: 00010297
   RAX: 0000000000000000 RBX: 000000000000000a RCX: 000000000000000a
   RDX: 000000000000000a RSI: 000000000000000c RDI: ffff9c6624516b00
   RBP: ffff9c6624516b00 R08: 0000000000000000 R09: 0000000000000000
   R10: ffff9c65bf8a8510 R11: ffff9c6656003800 R12: ffffffffb02387e0
   R13: ffff9c662ac6d590 R14: ffff9c66534cc7a0 R15: ffff9c6625d5f1e0
   FS:  0000000000000000(0000) GS:ffff9c665f280000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00000000000000b0 CR3: 000000008220a006 CR4: 00000000001606e0
   Call Trace:
    cgroup_free+0x57/0xd0
    __put_task_struct+0x38/0x130
    cgroup_procs_release+0x12/0x20
    kernfs_fop_release+0x82/0x90
    __fput+0x9d/0x220
    task_work_run+0x84/0xa0
    do_exit+0x2b1/0xab0
    rewind_stack_do_exit+0x17/0x20
   Code: c7 e8 6a fd ff ff 48 8b 80 b0 00 00 00 48 83 b8 b0 00 00 00 00 75 e7 f3 c3 0f 1f 80 00 00 00 00 48 8b 87 88 07 00 00 48 8b 40 50 <48> 83 b8 b0 00 00 00 00 74 19 48 89 c7 e8 33 fd ff ff 48 8b 80 
   RIP: pids_free+0xb/0x30 RSP: ffffbc6a0003bdd8
   CR2: 00000000000000b0
   ---[ end trace ce7578070732b5ef ]---
   Fixing recursive fault but reboot is needed!
   ------------[ cut here ]------------
   WARNING: CPU: 2 PID: 1 at kernel/rcu/tree_plugin.h:329 rcu_note_context_switch+0x27/0x350
   Modules linked in: uinput veth ip_vti ip_tunnel esp4 xfrm6_mode_tunnel fuse ccm xt_CHECKSUM tun bridge stp llc xfrm_user xfrm_algo ebtable_filter twofish_generic twofish_avx_x86_64 ebtables twofish_x86_64_3way twofish_x86_64 twofish_common vxlan ip6_udp_tunnel udp_tunnel serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic devlink blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 ablk_helper camellia_x86_64 xcbc openvswitch nf_nat_ipv6 md4 algif_hash af_alg cmac rfcomm bnep xt_policy nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat msr nf_nat_ipv4 nf_nat xt_TCPMSS iptable_mangle ipt_REJECT
    nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc iptable_filter snd_hda_codec_hdmi hid_sensor_als hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio rtsx_pci_sdmmc mmc_core iTCO_wdt wmi_bmof arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo joydev wacom videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_sensor_hub videodev btusb btrtl hid_multitouch btbcm media btintel rtsx_pci i915 bluetooth snd_hda_codec_conexant lpc_ich snd_hda_codec_generic mfd_core iwlmvm iosf_mbi i2c_algo_bit ecdh_generic
    drm_kms_helper mac80211 snd_hda_intel syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core snd_pcm_oss iwlwifi fb_sys_fops thinkpad_acpi snd_mixer_oss drm nvram snd_pcm video cfg80211 intel_gtt snd_timer rfkill snd evdev wmi ecryptfs nfsd ip_tables x_tables ipv6 crc_ccitt
   CPU: 2 PID: 1 Comm: systemd Tainted: G      D         4.14.12-kvm-00437-gd6765c06f03d #4
   Hardware name: LENOVO 20CD0035GE/20CD0035GE, BIOS GQET40WW (1.20 ) 11/07/2014
   task: ffff9c66560e0d00 task.stack: ffffbc6a00038000
   RIP: 0010:rcu_note_context_switch+0x27/0x350
   RSP: 0018:ffffbc6a0003be58 EFLAGS: 00010002
   RAX: 0000000000000001 RBX: ffff9c66560e0d00 RCX: 0000000000000001
   RDX: 0000000000000000 RSI: ffffffffafff992f RDI: ffffffffaffb7ead
   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000365
   R10: 0000000000000086 R11: 0000000000000000 R12: ffff9c665f29fbc0
   R13: ffff9c66560e0d00 R14: ffff9c66560e12a8 R15: 000000000001fbc0
   FS:  0000000000000000(0000) GS:ffff9c665f280000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00000000000000b0 CR3: 000000008220a006 CR4: 00000000001606e0
   Call Trace:
    __schedule+0x84/0x6f0
    schedule+0x37/0x90
    do_exit+0x8c2/0xab0
    rewind_stack_do_exit+0x17/0x20
   Code: 00 00 00 00 41 56 41 55 41 54 55 89 fd 53 65 48 8b 1c 25 00 4d 01 00 e8 48 da ff ff 40 84 ed 8b 83 f8 02 00 00 75 7d 85 c0 7e 7d <0f> ff 80 bb fc 02 00 00 00 0f 84 89 00 00 00 e8 c5 ca ff ff e8 
   ---[ end trace ce7578070732b5f0 ]---
   INFO: rcu_preempt detected stalls on CPUs/tasks:
   	Tasks blocked on level-0 rcu_node (CPUs 0-7): P1
   	(detected by 2, t=60002 jiffies, g=551687, c=551686, q=11683)
   systemd         D    0     1      0 0x80080002
   Call Trace:
    ? __schedule+0x292/0x6f0
    schedule+0x37/0x90
    do_exit+0x8c2/0xab0
    rewind_stack_do_exit+0x17/0x20
   systemd         D    0     1      0 0x80080002
   Call Trace:
    ? __schedule+0x292/0x6f0
    schedule+0x37/0x90
    do_exit+0x8c2/0xab0
    rewind_stack_do_exit+0x17/0x20
   
The crash does not happen with plain 4.14.11, but when this patch (*) is
included it happens with 4.14.1[12], and 4.14.12 plus the following set
of patches from the current 4.14 stable-queue:

	x86-mm-set-modules_end-to-0xffffffffff000000.patch
	x86-mm-map-cpu_entry_area-at-the-same-place-on-4-5-level.patch
	x86-kaslr-fix-the-vaddr_end-mess.patch
(*)	x86-events-intel-ds-use-the-proper-cache-flush-method-for-mapping-ds-buffers.patch
	x86-tlb-drop-the-_gpl-from-the-cpu_tlbstate-export.patch
	x86-alternatives-add-missing-n-at-end-of-alternative-inline-asm.patch
	x86-pti-rename-bug_cpu_insecure-to-bug_cpu_meltdown.patch
	kernel-acct.c-fix-the-acct-needcheck-check-in-check_free_space.patch
	mm-mprotect-add-a-cond_resched-inside-change_pmd_range.patch
	mm-sparse.c-wrong-allocation-for-mem_section.patch
	userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails.patch
	btrfs-fix-refcount_t-usage-when-deleting-btrfs_delayed_nodes.patch
	efi-capsule-loader-reinstate-virtual-capsule-mapping.patch
	crypto-n2-cure-use-after-free.patch
	crypto-chacha20poly1305-validate-the-digest-size.patch
	crypto-pcrypt-fix-freeing-pcrypt-instances.patch
	crypto-chelsio-select-crypto_gf128mul.patch
	drm-i915-disable-dc-states-around-gmbus-on-glk.patch
	drm-i915-apply-display-wa-1183-on-skl-kbl-and-cfl.patch
	sunxi-rsb-include-of-based-modalias-in-device-uevent.patch
	fscache-fix-the-default-for-fscache_maybe_release_page.patch

Thanks,

Thomas
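
As a side note on reading the first oops above: the byte marked in its
"Code:" line is <3b> followed by 71 04, which on x86-64 encodes a compare
of %esi against the 32-bit value at 4(%rcx). With RCX = 0x206cc8 that
access lands on 0x206ccc, the address reported in CR2. The small C sketch
below reproduces that arithmetic. It is an illustration under stated
assumptions: the struct and function names are hypothetical, and it
handles only the "register base plus 8-bit displacement" ModRM form with
no REX prefix, which is all this particular instruction needs.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded ModRM byte for the mod == 01 form (base register + disp8). */
struct modrm_disp8 {
	unsigned reg;	/* register operand number (6 is esi/rsi) */
	unsigned rm;	/* base register number (1 is rcx) */
	int8_t disp;	/* signed 8-bit displacement */
};

static struct modrm_disp8 decode_modrm_disp8(uint8_t modrm, uint8_t disp)
{
	struct modrm_disp8 d;

	assert((modrm >> 6) == 1);	/* only the mod == 01 form is handled */
	assert((modrm & 7) != 4);	/* rm == 100b would mean a SIB byte */
	d.reg = (modrm >> 3) & 7;
	d.rm = modrm & 7;
	d.disp = (int8_t)disp;
	return d;
}

/* Effective address: base register value plus sign-extended displacement. */
static uint64_t effective_address(uint64_t base, int8_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}
```

Feeding the bytes 0x71 and 0x04 into decode_modrm_disp8() and the RCX
value from the register dump into effective_address() gives the CR2 value
of the first oops.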

* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-06 21:38                 ` Thomas Zeitlhofer
@ 2018-01-07  8:17                   ` Greg Kroah-Hartman
  2018-01-07  8:53                     ` Thomas Zeitlhofer
  0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-07  8:17 UTC (permalink / raw)
  To: Thomas Zeitlhofer; +Cc: Peter Zijlstra, Thomas Gleixner, Hugh Dickins, LKML

On Sat, Jan 06, 2018 at 10:38:38PM +0100, Thomas Zeitlhofer wrote:
> On Thu, Jan 04, 2018 at 07:38:00PM +0100, Thomas Zeitlhofer wrote:
> > On Thu, Jan 04, 2018 at 06:07:12PM +0100, Peter Zijlstra wrote:
> > > On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > > > > Yes:
> > > > > 
> > > > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
> > > > >    caller is native_flush_tlb_single+0x57/0xc0
> > > > >    CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> > > > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > > >    Call Trace:
> > > > >     dump_stack+0x5c/0x86
> > > > >     check_preemption_disabled+0xdd/0xe0
> > > > >     native_flush_tlb_single+0x57/0xc0
> > > > >     ? __set_pte_vaddr+0x2d/0x40
> > > > >     __set_pte_vaddr+0x2d/0x40
> > > > >     set_pte_vaddr+0x2f/0x40
> > > > >     cea_set_pte+0x30/0x40
> > > > >     ds_update_cea.constprop.4+0x4d/0x70
> > > > >     reserve_ds_buffers+0x159/0x410
> > > > >     ? wp_page_copy+0x370/0x6c0
> > > > >     x86_reserve_hardware+0x150/0x160
> > > > >     x86_pmu_event_init+0x3e/0x1f0
> > > > >     perf_try_init_event+0x69/0x80
> > > > >     perf_event_alloc+0x652/0x740
> > > > >     SyS_perf_event_open+0x3f6/0xd60
> > > > >     do_syscall_64+0x5c/0x190
> > > > >     entry_SYSCALL64_slow_path+0x25/0x25
> > > > >    RIP: 0033:0x72bff0a3c0b9
> > > > >    RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > > >    RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
> > > > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
> > > > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > > >    R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
> > > 
> > > Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
> > > per CPU. But the DS crud does cross CPU updates of those tables.
> > > 
> > > So we need some additional fun and games..
> > > 
> > > How's the below?
> > [...]
> > 
> > Looks good - I have successfully tested it on top of 4.14.11 and
> > 4.15-rc6. In both cases, the error message is gone when this patch is
> > applied.
> 
> While solving the previous problem, this patch also introduces new "fun
> and games"...  
> 
> Now, terminating a systemd-nspawn container reliably crashes the host
> (so far tested only on Haswell, if that matters). Once, I was able to
> capture the following trace:

Is this also reproducible on Linus's tree right now?

I've been running nspawn containers on it with no issues like this at
all :(

thanks,

greg k-h

* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-07  8:17                   ` Greg Kroah-Hartman
@ 2018-01-07  8:53                     ` Thomas Zeitlhofer
  2018-01-08  0:37                       ` Thomas Zeitlhofer
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-07  8:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Peter Zijlstra, Thomas Gleixner, Hugh Dickins, LKML

On Sun, Jan 07, 2018 at 09:17:18AM +0100, Greg Kroah-Hartman wrote:
> On Sat, Jan 06, 2018 at 10:38:38PM +0100, Thomas Zeitlhofer wrote:
> > On Thu, Jan 04, 2018 at 07:38:00PM +0100, Thomas Zeitlhofer wrote:
> > > On Thu, Jan 04, 2018 at 06:07:12PM +0100, Peter Zijlstra wrote:
> > > > On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > > > > > Yes:
> > > > > > 
> > > > > >    BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
> > > > > >    caller is native_flush_tlb_single+0x57/0xc0
> > > > > >    CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> > > > > >    Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > > > >    Call Trace:
> > > > > >     dump_stack+0x5c/0x86
> > > > > >     check_preemption_disabled+0xdd/0xe0
> > > > > >     native_flush_tlb_single+0x57/0xc0
> > > > > >     ? __set_pte_vaddr+0x2d/0x40
> > > > > >     __set_pte_vaddr+0x2d/0x40
> > > > > >     set_pte_vaddr+0x2f/0x40
> > > > > >     cea_set_pte+0x30/0x40
> > > > > >     ds_update_cea.constprop.4+0x4d/0x70
> > > > > >     reserve_ds_buffers+0x159/0x410
> > > > > >     ? wp_page_copy+0x370/0x6c0
> > > > > >     x86_reserve_hardware+0x150/0x160
> > > > > >     x86_pmu_event_init+0x3e/0x1f0
> > > > > >     perf_try_init_event+0x69/0x80
> > > > > >     perf_event_alloc+0x652/0x740
> > > > > >     SyS_perf_event_open+0x3f6/0xd60
> > > > > >     do_syscall_64+0x5c/0x190
> > > > > >     entry_SYSCALL64_slow_path+0x25/0x25
> > > > > >    RIP: 0033:0x72bff0a3c0b9
> > > > > >    RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > > > >    RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
> > > > > >    RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
> > > > > >    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > > > >    R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > > > >    R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
> > > > 
> > > > Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
> > > > per CPU. But the DS crud does cross CPU updates of those tables.
> > > > 
> > > > So we need some additional fun and games..
> > > > 
> > > > How's the below?
> > > [...]
> > > 
> > > Looks good - I have successfully tested it on top of 4.14.11 and
> > > 4.15-rc6. In both cases, the error message is gone when this patch is
> > > applied.
> > 
> > While solving the previous problem, this patch also introduces new "fun
> > and games"...  
> > 
> > Now, terminating a systemd-nspawn container, reliably crashes the host
> > (so far tested only on Haswell, if that matters). Once, I was able to
> > capture the following trace:
> 
> > Is this also reproducible on Linus's tree right now?

It is reproducible with this patch on top of 4.15-rc6 (I might be able to
test Linus's current tree later today).

Thanks,

Thomas

* Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
  2018-01-07  8:53                     ` Thomas Zeitlhofer
@ 2018-01-08  0:37                       ` Thomas Zeitlhofer
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Zeitlhofer @ 2018-01-08  0:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Peter Zijlstra, Thomas Gleixner, Hugh Dickins, LKML

On Sun, Jan 07, 2018 at 09:53:19AM +0100, Thomas Zeitlhofer wrote:
> On Sun, Jan 07, 2018 at 09:17:18AM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Jan 06, 2018 at 10:38:38PM +0100, Thomas Zeitlhofer wrote:
[...]
> > > While solving the previous problem, this patch also introduces new
> > > "fun and games"...  
> > > 
> > > Now, terminating a systemd-nspawn container, reliably crashes the
> > > host (so far tested only on Haswell, if that matters). Once, I was
> > > able to capture the following trace:
> > 
> > Is this also reproducible on Linus's tree right now?
> 
> It is reproducible with this patch on top of 4.15-rc6 (I might be able
> to test Linus's current tree later today).

Some more testing showed that this is not caused by the patch after all;
sorry for the noise.

The crash happens quite reliably, but occasionally it does not occur.
When I tested 4.14.11 without the patch, I evidently hit one of those
rare runs; in the meantime, 4.14.11 without the patch has also crashed.
The situation is unchanged with 4.15-rc7.

Interestingly, it happens only when using the boot switch "-b", i.e.:

        systemd-nspawn -b -D <path to rootfs>

_and_ terminating the container by pressing ^] three times.  Other
combinations (e.g., no "-b" and terminating with ^]^]^], "-b" and
terminating by running shutdown inside the container) work just fine.
Anyway, this is already off-topic and might be subject to a different
thread...

Thanks,

Thomas

Thread overview: 15+ messages
2018-01-04  1:59 "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11 Thomas Zeitlhofer
2018-01-04 10:20 ` Thomas Zeitlhofer
2018-01-04 10:51   ` Greg Kroah-Hartman
2018-01-04 12:43     ` Thomas Zeitlhofer
2018-01-04 12:55       ` Greg Kroah-Hartman
2018-01-04 15:25         ` Thomas Zeitlhofer
2018-01-04 15:37           ` Thomas Gleixner
2018-01-04 17:07             ` Peter Zijlstra
2018-01-04 18:38               ` Thomas Zeitlhofer
2018-01-06 21:38                 ` Thomas Zeitlhofer
2018-01-07  8:17                   ` Greg Kroah-Hartman
2018-01-07  8:53                     ` Thomas Zeitlhofer
2018-01-08  0:37                       ` Thomas Zeitlhofer
2018-01-04 22:11               ` [tip:x86/pti] x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers tip-bot for Peter Zijlstra
2018-01-04 23:49               ` tip-bot for Peter Zijlstra
