From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1753228AbdKNQLN (ORCPT ); Tue, 14 Nov 2017 11:11:13 -0500
Received: from mail.kernel.org ([198.145.29.99]:53840 "EHLO mail.kernel.org"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1750851AbdKNQLD (ORCPT ); Tue, 14 Nov 2017 11:11:03 -0500
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0B8B821992
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=luto@kernel.org
X-Google-Smtp-Source: AGs4zMZ8NDeE68nb8DcmIDzCiXwYQuLEIH3PnnFkK/TybLX97rIGZmZZuS1+ugtZ5Q549bjAcwYPsfSiumt43BDZRts=
MIME-Version: 1.0
In-Reply-To: <20171114160541.GC3165@worktop.lehotels.local>
References: <20171110211249.10742-1-mathieu.desnoyers@efficios.com>
 <885227610.13045.1510351034488.JavaMail.zimbra@efficios.com>
 <617343212.13932.1510592207202.JavaMail.zimbra@efficios.com>
 <4d47fbb8-8f99-19d3-a9cf-66841aeffac3@scylladb.com>
 <4431530.14831.1510672632887.JavaMail.zimbra@efficios.com>
 <20171114160541.GC3165@worktop.lehotels.local>
From: Andy Lutomirski
Date: Tue, 14 Nov 2017 08:10:41 -0800
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration
To: Peter Zijlstra
Cc: Mathieu Desnoyers, Avi Kivity, Linus Torvalds, Andy Lutomirski,
 linux-kernel, linux-api, "Paul E. McKenney", Boqun Feng, Andrew Hunter,
 maged michael, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
 Dave Watson, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Andrea Parri,
 "Russell King, ARM Linux", Greg Hackmann, Will Deacon, David Sehr, x86
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 14, 2017 at 8:05 AM, Peter Zijlstra wrote:
> On Tue, Nov 14, 2017 at 03:17:12PM +0000, Mathieu Desnoyers wrote:
>> I've tried to create a small single-threaded self-modifying loop in
>> user-space to trigger a trace cache or speculative execution quirk,
>> but I have not succeeded yet. I suspect that I would need to know
>> more about the internals of the processor architecture to create the
>> right stalls that would allow speculative execution to move further
>> ahead, and trigger an incoherent execution flow. Ideas on how to
>> trigger this would be welcome.
>
> I thought the whole problem was per definition multi-threaded.
>
> Single-threaded stuff can't get out of sync with itself; you'll always
> observe your own stores.
>
> And ISTR the JIT scenario being something like the JIT overwriting
> previously executed but supposedly no longer used code. And in this
> scenario you'd want to guarantee all CPUs observe the new code before
> jumping into it.
>
> The current approach is using mprotect(), except that on a number of
> platforms the TLB invalidate from that is not guaranteed to be strong
> enough to sync for code changes.
>
> On x86 the mprotect() should work just fine, since we broadcast IPIs for
> the TLB invalidate and the IRET from those will get the things synced up
> again (if nothing else; very likely we'll have done a MOV-CR3 which will
> of course also have sufficient syncness on it).
>
> But PowerPC, s390, ARM et al that do TLB invalidates without interrupts
> and don't guarantee their TLB invalidate sync against execution units
> are left broken by this scheme.
>

On x86 single-thread, you can still get in trouble, I think. Do a store, get migrated, execute the stored code.
There's no actual guarantee that the new CPU does a CR3 load due to laziness.
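
For illustration only, and not code from this thread: a minimal sketch of the "do a store, get migrated, execute the stored code" sequence. It assumes Linux on x86, a system that permits a writable and executable anonymous mapping, and ideally at least two CPUs in the affinity mask. The names store_migrate_execute() and pin_to_cpu() are hypothetical, and sched_setaffinity() stands in for an involuntary migration. On real hardware this is expected to return the new value in practice, which matches the report above that the quirk is hard to trigger; the sketch only shows where the window sits, not how to hit it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: restrict the calling thread to a single CPU. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Hypothetical sketch: store code, migrate, execute the stored code.
 * Returns what the rewritten code returned, or -1 if the experiment
 * cannot be set up (non-x86 build, no RWX mapping, no usable CPU).
 */
int store_migrate_execute(int32_t old_val, int32_t new_val)
{
#if !defined(__x86_64__) && !defined(__i386__)
    return -1; /* the opcodes below are x86 machine code */
#endif
    cpu_set_t allowed;
    int first = -1, second = -1;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    /* Pick the first two CPUs this thread is allowed to run on. */
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;
        if (first < 0) {
            first = cpu;
        } else {
            second = cpu;
            break;
        }
    }
    if (first < 0)
        return -1;

    unsigned char *code = mmap(NULL, 4096,
                               PROT_READ | PROT_WRITE | PROT_EXEC,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (code == MAP_FAILED)
        return -1;

    /* Encode: mov eax, imm32 ; ret */
    code[0] = 0xB8;
    memcpy(code + 1, &old_val, sizeof(old_val));
    code[5] = 0xC3;

    int (*fn)(void) = (int (*)(void))(uintptr_t)code;

    /* Run the original code once on the first CPU. */
    pin_to_cpu(first);
    int before = fn();
    assert(before == old_val);
    (void)before;

    /* The store: patch the immediate operand in place. */
    memcpy(code + 1, &new_val, sizeof(new_val));

    /* The migration: move to another CPU if one is available. */
    if (second >= 0)
        pin_to_cpu(second);

    /* Execute the stored code with no explicit serializing instruction. */
    int result = fn();

    sched_setaffinity(0, sizeof(allowed), &allowed); /* restore the mask */
    munmap(code, 4096);
    return result;
}
```

Calling store_migrate_execute(1, 2) in a loop and counting any result other than 2 would be one way to look for the problem. Note that the return path from sched_setaffinity() itself may or may not serialize the core, so a clean run proves nothing about the architectural guarantee.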