From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753549AbeAFViw (ORCPT + 1 other);
        Sat, 6 Jan 2018 16:38:52 -0500
Received: from vie01a-dmta-pe05-1.mx.upcmail.net ([84.116.36.11]:29252
        "EHLO vie01a-dmta-pe05-1.mx.upcmail.net" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1753180AbeAFViu (ORCPT);
        Sat, 6 Jan 2018 16:38:50 -0500
X-SourceIP: 84.112.117.109
Date: Sat, 6 Jan 2018 22:38:38 +0100
From: Thomas Zeitlhofer
To: Peter Zijlstra
Cc: Thomas Gleixner, Greg Kroah-Hartman, Hugh Dickins, LKML
Subject: Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11
Message-ID: <20180106213838.zxzbvufa3j7xeyhe@toau>
References: <20180104015906.czhm7kis33iizsia@toau>
        <20180104102029.tpv5utpbdkrisgvl@toau>
        <20180104105111.GA2754@kroah.com>
        <20180104124320.eawuo6q7wnwzpf7s@toau>
        <20180104125528.GA15548@kroah.com>
        <20180104152516.3sql2ayoemlephig@toau>
        <20180104170712.GB3040@hirez.programming.kicks-ass.net>
        <20180104183800.ewx42etmmzk5b544@toau>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180104183800.ewx42etmmzk5b544@toau>
User-Agent: NeoMutt/20170113 (1.7.2)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Return-Path:

On Thu, Jan 04, 2018 at 07:38:00PM +0100, Thomas Zeitlhofer wrote:
> On Thu, Jan 04, 2018 at 06:07:12PM +0100, Peter Zijlstra wrote:
> > On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > > > Yes:
> > > >
> > > > BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
> > > > caller is native_flush_tlb_single+0x57/0xc0
> > > > CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> > > > Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> > > > Call Trace:
> > > >  dump_stack+0x5c/0x86
> > > >  check_preemption_disabled+0xdd/0xe0
> > > >  native_flush_tlb_single+0x57/0xc0
> > > >  ? __set_pte_vaddr+0x2d/0x40
> > > >  __set_pte_vaddr+0x2d/0x40
> > > >  set_pte_vaddr+0x2f/0x40
> > > >  cea_set_pte+0x30/0x40
> > > >  ds_update_cea.constprop.4+0x4d/0x70
> > > >  reserve_ds_buffers+0x159/0x410
> > > >  ? wp_page_copy+0x370/0x6c0
> > > >  x86_reserve_hardware+0x150/0x160
> > > >  x86_pmu_event_init+0x3e/0x1f0
> > > >  perf_try_init_event+0x69/0x80
> > > >  perf_event_alloc+0x652/0x740
> > > >  SyS_perf_event_open+0x3f6/0xd60
> > > >  do_syscall_64+0x5c/0x190
> > > >  entry_SYSCALL64_slow_path+0x25/0x25
> > > > RIP: 0033:0x72bff0a3c0b9
> > > > RSP: 002b:00007ffed11c2f18 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
> > > > RAX: ffffffffffffffda RBX: 00007ffed11c30f0 RCX: 000072bff0a3c0b9
> > > > RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffed11c2f20
> > > > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000007000000000
> > > > R10: 00000000ffffffff R11: 0000000000000206 R12: 0000000000000008
> > > > R13: 0000000000000000 R14: 00007ffed11c30d0 R15: 000060986ecfb600
> >
> > Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
> > per CPU. But the DS crud does cross CPU updates of those tables.
> >
> > So we need some additional fun and games..
> >
> > How's the below?
> [...]
>
> Looks good - I have successfully tested it on top of 4.14.11 and
> 4.15-rc6. In both cases, the error message is gone when this patch is
> applied.

While solving the previous problem, this patch also introduces new "fun
and games"...

Now, terminating a systemd-nspawn container reliably crashes the host
(so far tested only on Haswell, if that matters).
Once, I was able to capture the following trace:

BUG: unable to handle kernel paging request at 0000000000206ccc
IP: __task_pid_nr_ns+0x57/0xc0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
Modules linked in: uinput veth ip_vti ip_tunnel esp4 xfrm6_mode_tunnel fuse ccm xt_CHECKSUM tun bridge stp llc xfrm_user xfrm_algo ebtable_filter twofish_generic twofish_avx_x86_64 ebtables twofish_x86_64_3way twofish_x86_64 twofish_common vxlan ip6_udp_tunnel udp_tunnel serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic devlink blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 ablk_helper camellia_x86_64 xcbc openvswitch nf_nat_ipv6 md4 algif_hash af_alg cmac rfcomm bnep xt_policy nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat msr nf_nat_ipv4 nf_nat xt_TCPMSS iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc iptable_filter snd_hda_codec_hdmi hid_sensor_als hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio rtsx_pci_sdmmc mmc_core iTCO_wdt wmi_bmof arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo joydev wacom videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_sensor_hub videodev btusb btrtl hid_multitouch btbcm media btintel rtsx_pci i915 bluetooth snd_hda_codec_conexant lpc_ich snd_hda_codec_generic mfd_core iwlmvm iosf_mbi i2c_algo_bit ecdh_generic drm_kms_helper mac80211 snd_hda_intel syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core snd_pcm_oss iwlwifi fb_sys_fops thinkpad_acpi snd_mixer_oss drm nvram snd_pcm video cfg80211 intel_gtt snd_timer rfkill snd evdev wmi ecryptfs nfsd ip_tables x_tables ipv6 crc_ccitt
CPU: 2 PID: 1 Comm: systemd Not tainted 4.14.12-kvm-00437-gd6765c06f03d #4
Hardware name: LENOVO 20CD0035GE/20CD0035GE, BIOS GQET40WW (1.20 ) 11/07/2014
task: ffff9c66560e0d00 task.stack: ffffbc6a00038000
RIP: 0010:__task_pid_nr_ns+0x57/0xc0
RSP: 0018:ffffbc6a0003bdb0 EFLAGS: 00010246
RAX: ffff9c66560e8680 RBX: 0000000000000000 RCX: 0000000000206cc8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000004d0
RBP: 0000000000000000 R08: ffffffffb0237b10 R09: 0000000000000005
R10: ffffbc6a0003bee0 R11: ffff9c65aa33c004 R12: ffffffffb02309a0
R13: 0000000000001000 R14: ffff9c65ecbd4a00 R15: ffff9c6624516b00
FS: 0000767a01669980(0000) GS:ffff9c665f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000206ccc CR3: 0000000215476003 CR4: 00000000001606e0
Call Trace:
 cgroup_procs_show+0x10/0x30
 seq_read+0x30c/0x3d0
 __vfs_read+0x2e/0x150
 vfs_read+0x84/0x110
 SyS_read+0x4d/0xc0
 do_syscall_64+0x5c/0x190
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x767a00fa671d
RSP: 002b:00007ffca8edc6e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000057d4d8a02c10 RCX: 0000767a00fa671d
RDX: 0000000000001000 RSI: 000057d4d8a05320 RDI: 0000000000000083
RBP: 0000000000000d68 R08: 0000767a01265178 R09: 0000000000001010
R10: 000057d4d8a03490 R11: 0000000000000293 R12: 0000767a01261440
R13: 0000767a01260900 R14: 00000000ffffffff R15: 0000000000000000
Code: 74 0d 48 8d 44 6d 00 48 8d 3c c5 d0 04 00 00 48 8b 9b 98 04 00 00 48 01 fb 48 8b 0b 48 85 c9 74 37 41 8b b4 24 30 08 00 00 31 db <3b> 71 04 77 0d 48 c1 e6 05 48 01 f1 4c 3b 61 38 74 0c e8 12 db
RIP: __task_pid_nr_ns+0x57/0xc0 RSP: ffffbc6a0003bdb0
CR2: 0000000000206ccc
---[ end trace ce7578070732b5ee ]---
BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
IP: pids_free+0xb/0x30
PGD 0 P4D 0
Oops: 0000 [#2] PREEMPT SMP PTI
Modules linked in: uinput veth ip_vti ip_tunnel esp4 xfrm6_mode_tunnel fuse ccm xt_CHECKSUM tun bridge stp llc xfrm_user xfrm_algo ebtable_filter twofish_generic twofish_avx_x86_64 ebtables twofish_x86_64_3way twofish_x86_64 twofish_common vxlan ip6_udp_tunnel udp_tunnel serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic devlink blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 ablk_helper camellia_x86_64 xcbc openvswitch nf_nat_ipv6 md4 algif_hash af_alg cmac rfcomm bnep xt_policy nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat msr nf_nat_ipv4 nf_nat xt_TCPMSS iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc iptable_filter snd_hda_codec_hdmi hid_sensor_als hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio rtsx_pci_sdmmc mmc_core iTCO_wdt wmi_bmof arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo joydev wacom videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_sensor_hub videodev btusb btrtl hid_multitouch btbcm media btintel rtsx_pci i915 bluetooth snd_hda_codec_conexant lpc_ich snd_hda_codec_generic mfd_core iwlmvm iosf_mbi i2c_algo_bit ecdh_generic drm_kms_helper mac80211 snd_hda_intel syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core snd_pcm_oss iwlwifi fb_sys_fops thinkpad_acpi snd_mixer_oss drm nvram snd_pcm video cfg80211 intel_gtt snd_timer rfkill snd evdev wmi ecryptfs nfsd ip_tables x_tables ipv6 crc_ccitt
CPU: 2 PID: 1 Comm: systemd Tainted: G D 4.14.12-kvm-00437-gd6765c06f03d #4
Hardware name: LENOVO 20CD0035GE/20CD0035GE, BIOS GQET40WW (1.20 ) 11/07/2014
task: ffff9c66560e0d00 task.stack: ffffbc6a00038000
RIP: 0010:pids_free+0xb/0x30
RSP: 0018:ffffbc6a0003bdd8 EFLAGS: 00010297
RAX: 0000000000000000 RBX: 000000000000000a RCX: 000000000000000a
RDX: 000000000000000a RSI: 000000000000000c RDI: ffff9c6624516b00
RBP: ffff9c6624516b00 R08: 0000000000000000 R09: 0000000000000000
R10: ffff9c65bf8a8510 R11: ffff9c6656003800 R12: ffffffffb02387e0
R13: ffff9c662ac6d590 R14: ffff9c66534cc7a0 R15: ffff9c6625d5f1e0
FS: 0000000000000000(0000) GS:ffff9c665f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b0 CR3: 000000008220a006 CR4: 00000000001606e0
Call Trace:
 cgroup_free+0x57/0xd0
 __put_task_struct+0x38/0x130
 cgroup_procs_release+0x12/0x20
 kernfs_fop_release+0x82/0x90
 __fput+0x9d/0x220
 task_work_run+0x84/0xa0
 do_exit+0x2b1/0xab0
 rewind_stack_do_exit+0x17/0x20
Code: c7 e8 6a fd ff ff 48 8b 80 b0 00 00 00 48 83 b8 b0 00 00 00 00 75 e7 f3 c3 0f 1f 80 00 00 00 00 48 8b 87 88 07 00 00 48 8b 40 50 <48> 83 b8 b0 00 00 00 00 74 19 48 89 c7 e8 33 fd ff ff 48 8b 80
RIP: pids_free+0xb/0x30 RSP: ffffbc6a0003bdd8
CR2: 00000000000000b0
---[ end trace ce7578070732b5ef ]---
Fixing recursive fault but reboot is needed!
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1 at kernel/rcu/tree_plugin.h:329 rcu_note_context_switch+0x27/0x350
Modules linked in: uinput veth ip_vti ip_tunnel esp4 xfrm6_mode_tunnel fuse ccm xt_CHECKSUM tun bridge stp llc xfrm_user xfrm_algo ebtable_filter twofish_generic twofish_avx_x86_64 ebtables twofish_x86_64_3way twofish_x86_64 twofish_common vxlan ip6_udp_tunnel udp_tunnel serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic devlink blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 ablk_helper camellia_x86_64 xcbc openvswitch nf_nat_ipv6 md4 algif_hash af_alg cmac rfcomm bnep xt_policy nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat msr nf_nat_ipv4 nf_nat xt_TCPMSS iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc iptable_filter snd_hda_codec_hdmi hid_sensor_als hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio rtsx_pci_sdmmc mmc_core iTCO_wdt wmi_bmof arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo joydev wacom videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_sensor_hub videodev btusb btrtl hid_multitouch btbcm media btintel rtsx_pci i915 bluetooth snd_hda_codec_conexant lpc_ich snd_hda_codec_generic mfd_core iwlmvm iosf_mbi i2c_algo_bit ecdh_generic drm_kms_helper mac80211 snd_hda_intel syscopyarea snd_hda_codec sysfillrect sysimgblt snd_hda_core snd_pcm_oss iwlwifi fb_sys_fops thinkpad_acpi snd_mixer_oss drm nvram snd_pcm video cfg80211 intel_gtt snd_timer rfkill snd evdev wmi ecryptfs nfsd ip_tables x_tables ipv6 crc_ccitt
CPU: 2 PID: 1 Comm: systemd Tainted: G D 4.14.12-kvm-00437-gd6765c06f03d #4
Hardware name: LENOVO 20CD0035GE/20CD0035GE, BIOS GQET40WW (1.20 ) 11/07/2014
task: ffff9c66560e0d00 task.stack: ffffbc6a00038000
RIP: 0010:rcu_note_context_switch+0x27/0x350
RSP: 0018:ffffbc6a0003be58 EFLAGS: 00010002
RAX: 0000000000000001 RBX: ffff9c66560e0d00 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffffafff992f RDI: ffffffffaffb7ead
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000365
R10: 0000000000000086 R11: 0000000000000000 R12: ffff9c665f29fbc0
R13: ffff9c66560e0d00 R14: ffff9c66560e12a8 R15: 000000000001fbc0
FS: 0000000000000000(0000) GS:ffff9c665f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b0 CR3: 000000008220a006 CR4: 00000000001606e0
Call Trace:
 __schedule+0x84/0x6f0
 schedule+0x37/0x90
 do_exit+0x8c2/0xab0
 rewind_stack_do_exit+0x17/0x20
Code: 00 00 00 00 41 56 41 55 41 54 55 89 fd 53 65 48 8b 1c 25 00 4d 01 00 e8 48 da ff ff 40 84 ed 8b 83 f8 02 00 00 75 7d 85 c0 7e 7d <0f> ff 80 bb fc 02 00 00 00 0f 84 89 00 00 00 e8 c5 ca ff ff e8
---[ end trace ce7578070732b5f0 ]---
INFO: rcu_preempt detected stalls on CPUs/tasks:
 Tasks blocked on level-0 rcu_node (CPUs 0-7): P1
 (detected by 2, t=60002 jiffies, g=551687, c=551686, q=11683)
systemd D 0 1 0 0x80080002
Call Trace:
 ? __schedule+0x292/0x6f0
 schedule+0x37/0x90
 do_exit+0x8c2/0xab0
 rewind_stack_do_exit+0x17/0x20
systemd D 0 1 0 0x80080002
Call Trace:
 ? __schedule+0x292/0x6f0
 schedule+0x37/0x90
 do_exit+0x8c2/0xab0
 rewind_stack_do_exit+0x17/0x20

The crash does not happen with plain 4.14.11. With this patch (*)
included, it happens with 4.14.1[12], and also with 4.14.12 plus the
following set of patches from the current 4.14 stable-queue:

    x86-mm-set-modules_end-to-0xffffffffff000000.patch
    x86-mm-map-cpu_entry_area-at-the-same-place-on-4-5-level.patch
    x86-kaslr-fix-the-vaddr_end-mess.patch
(*) x86-events-intel-ds-use-the-proper-cache-flush-method-for-mapping-ds-buffers.patch
    x86-tlb-drop-the-_gpl-from-the-cpu_tlbstate-export.patch
    x86-alternatives-add-missing-n-at-end-of-alternative-inline-asm.patch
    x86-pti-rename-bug_cpu_insecure-to-bug_cpu_meltdown.patch
    kernel-acct.c-fix-the-acct-needcheck-check-in-check_free_space.patch
    mm-mprotect-add-a-cond_resched-inside-change_pmd_range.patch
    mm-sparse.c-wrong-allocation-for-mem_section.patch
    userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails.patch
    btrfs-fix-refcount_t-usage-when-deleting-btrfs_delayed_nodes.patch
    efi-capsule-loader-reinstate-virtual-capsule-mapping.patch
    crypto-n2-cure-use-after-free.patch
    crypto-chacha20poly1305-validate-the-digest-size.patch
    crypto-pcrypt-fix-freeing-pcrypt-instances.patch
    crypto-chelsio-select-crypto_gf128mul.patch
    drm-i915-disable-dc-states-around-gmbus-on-glk.patch
    drm-i915-apply-display-wa-1183-on-skl-kbl-and-cfl.patch
    sunxi-rsb-include-of-based-modalias-in-device-uevent.patch
    fscache-fix-the-default-for-fscache_maybe_release_page.patch

Thanks,

Thomas