Date: Thu, 4 Jan 2018 16:25:16 +0100
From: Thomas Zeitlhofer
To: Greg Kroah-Hartman
Cc: Thomas Gleixner, Hugh Dickins, linux-kernel@vger.kernel.org
Subject: Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
Message-ID: <20180104152516.3sql2ayoemlephig@toau>
References: <20180104015906.czhm7kis33iizsia@toau>
 <20180104102029.tpv5utpbdkrisgvl@toau>
 <20180104105111.GA2754@kroah.com>
 <20180104124320.eawuo6q7wnwzpf7s@toau>
 <20180104125528.GA15548@kroah.com>
In-Reply-To: <20180104125528.GA15548@kroah.com>

On Thu, Jan 04, 2018 at 01:55:28PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 01:43:20PM +0100, Thomas Zeitlhofer wrote:
> > On Thu, Jan 04, 2018 at 11:51:11AM +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Jan 04, 2018 at 11:20:29AM +0100, Thomas Zeitlhofer wrote:
> > > > On Thu, Jan 04, 2018 at 02:59:06AM +0100, Thomas Zeitlhofer wrote:
> > > > > Hello,
> > > > >
> > > > > on an Ivybridge CPU, I get with 4.14.11:
> > > > >
> > > > >   BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4510
> > > > >   caller is native_flush_tlb_single+0x57/0xc0
> > > > >   CPU: 3 PID: 4510 Comm: ovsdb-server Not tainted 4.14.11-kvm-00434-gcd0b8eb84f5c #3
> > > > >   Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > > >   Call Trace:
> > > > >    dump_stack+0x5c/0x86
> > > > >    check_preemption_disabled+0xdd/0xe0
> > > > >    native_flush_tlb_single+0x57/0xc0
> > > > >    ? __set_pte_vaddr+0x2d/0x40
> > > > >    __set_pte_vaddr+0x2d/0x40
> > > > >    set_pte_vaddr+0x2f/0x40
> > > > >    cea_set_pte+0x30/0x40
> > > > >    ds_update_cea.constprop.4+0x4d/0x70
> > > > >    reserve_ds_buffers+0x159/0x410
> > > > >    ? wp_page_copy+0x36d/0x6a0
> > > > >    x86_reserve_hardware+0x150/0x160
> > > > >    x86_pmu_event_init+0x3e/0x1f0
> > > > >    perf_try_init_event+0x69/0x80
> > > > >    perf_event_alloc+0x652/0x740
> > > > >    SyS_perf_event_open+0x3f6/0xd60
> > > > >    do_syscall_64+0x5c/0x190
> > > > >    entry_SYSCALL64_slow_path+0x25/0x25
> > > > >   RIP: 0033:0x74a1d94580b9
> > > > >   RSP: 002b:00007fff0c01d5d8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > > >   RAX: ffffffffffffffda RBX: 00007fff0c01d7b0 RCX: 000074a1d94580b9
> > > > >   RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fff0c01d5e0
> > > > >   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > > >   R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > > >   R13: 0000000000000000 R14: 00007fff0c01d790 R15: 00005df43a799600
> > > > >
> > > > > This does not show up when booting with pti=off.
> > > > >
> > > > > Maybe it is related to the issue that is fixed for the upcoming 4.4.110
> > > > > release by https://lkml.org/lkml/2018/1/3/692
> > > I don't understand this link.
> > I found that link when trying to search for the error message. That
> > patch touches __native_flush_tlb_single() and mentions hardware
> > differences in Ivybridge and below:
> >
> >   "We have many machines (Westmere, Sandybridge, Ivybridge)
> >   supporting PCID but not INVPCID..."
> >
> > As I see the error message only on Ivybridge and not on Haswell, I came
> > up with the vague guess that this could be related.
> >
> > > The 4.4 and 4.9 backports are much different than the 4.14 tree.
> >
> > Yes, I have seen that.
> >
> > > > JFYI, the very same kernel does not show this issue on a Haswell CPU.
> > >
> > > I have now queued up a bunch of patches that are in Linus's tree, can
> > > you test these out as well:
> > >   https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.14
> >
> > Does not seem to make any difference - with those patches applied I
> > still get:
> >
> >   BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4383
> >   caller is native_flush_tlb_single+0x57/0xc0
> >   CPU: 3 PID: 4383 Comm: ovsdb-server Not tainted 4.14.11-kvm-00435-g3138001170c9 #3
> >   Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> >   Call Trace:
> >    dump_stack+0x5c/0x86
> >    check_preemption_disabled+0xdd/0xe0
> >    native_flush_tlb_single+0x57/0xc0
> >    ? __set_pte_vaddr+0x2d/0x40
> >    __set_pte_vaddr+0x2d/0x40
> >    set_pte_vaddr+0x2f/0x40
> >    cea_set_pte+0x30/0x40
> >    ds_update_cea.constprop.4+0x4d/0x70
> >    reserve_ds_buffers+0x159/0x410
> >    ? wp_page_copy+0x36d/0x6a0
> >    x86_reserve_hardware+0x150/0x160
> >    x86_pmu_event_init+0x3e/0x1f0
> >    perf_try_init_event+0x69/0x80
> >    perf_event_alloc+0x652/0x740
> >    SyS_perf_event_open+0x3f6/0xd60
> >    do_syscall_64+0x5c/0x190
> >    entry_SYSCALL64_slow_path+0x25/0x25
> >   RIP: 0033:0x755c0b8580b9
> >   RSP: 002b:00007fffc87cf9e8 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> >   RAX: ffffffffffffffda RBX: 00007fffc87cfbc0 RCX: 0000755c0b8580b9
> >   RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007fffc87cf9f0
> >   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> >   R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> >   R13: 0000000000000000 R14: 00007fffc87cfba0 R15: 000062ea2cbff600
> >
>
> Odd, does 4.15-rc6 also trigger the same error?

Yes:

  BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
  caller is native_flush_tlb_single+0x57/0xc0
  CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
  Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
  Call Trace:
   dump_stack+0x5c/0x86
   check_preemption_disabled+0xdd/0xe0
   native_flush_tlb_single+0x57/0xc0
   ? __set_pte_vaddr+0x2d/0x40
   __set_pte_vaddr+0x2d/0x40
   set_pte_vaddr+0x2f/0x40
   cea_set_pte+0x30/0x40
   ds_update_cea.constprop.4+0x4d/0x70
   reserve_ds_buffers+0x159/0x410
   ? wp_page_copy+0x370/0x6c0
   x86_reserve_hardware+0x150/0x160
   x86_pmu_event_init+0x3e/0x1f0
   perf_try_init_event+0x69/0x80
   perf_event_alloc+0x652/0x740
   SyS_perf_event_open+0x3f6/0xd60
   do_syscall_64+0x5c/0x190
   entry_SYSCALL64_slow_path+0x25/0x25
  RIP: 0033:0x72bff0a3c0b9
  RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
  RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
  RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
  RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
  R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
  R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
  device ovs-system entered promiscuous mode
  netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.

In addition, with v4.15-rc6, netlink messages like in the last line show
up, but I guess this is a different openvswitch related issue.

> Thomas is working on an
> issue with KALSR (see lkml with:
>   Subject: Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
> )

Yes, I have also seen that thread, but I did not see any similarities to
my issue. Anyway, I also tried out the patch proposed in

  https://lkml.org/lkml/2018/1/4/313

but it does not change anything here.

Thanks,

Thomas