From: Peter Zijlstra <peterz@infradead.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	"H. Peter Anvin" <h.peter.anvin@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Jan Kiszka <jan.kiszka@siemens.com>, X86 ML <x86@kernel.org>
Subject: Re: [RFC][PATCH 2/2] x86: add extra serialization for non-serializing MSRs
Date: Fri, 5 Feb 2021 11:02:10 +0100
Message-ID: <YB0XonRIr1GcCy6M@hirez.programming.kicks-ass.net>
In-Reply-To: <CALCETrXMhe3ULF9UDc1=8CKVfKqneCxJ2wYmCdKPpntkkMNGWg@mail.gmail.com>

On Thu, Feb 04, 2021 at 04:11:12PM -0800, Andy Lutomirski wrote:
> I'm wondering if a more mild violation is possible:
> 
> Initialize *addr = 0.
> 
> mov $1, (addr)
> wrmsr
> 
> remote cpu's IDT vector:
> 
> mov (addr), %rax
> %rax == 0!
> 
> There's no speculative-execution-becoming-visible-even-if-it-doesn't-retire
> here -- there's just an ordering violation.  For Linux, this would
> presumably only manifest as a potential deadlock or confusion if the
> IPI vector code looks at the list of pending work and doesn't find the
> expected work in it.
> 
> Dave?  hpa?  What is the SDM trying to tell us?

[ Big caveat, I've not spoken to any hardware people about this. The
below is purely my own understanding. ]

This is my interpretation as well. Without the MFENCE+LFENCE there is no
guarantee the store is out of the store buffer, so the remote load isn't
guaranteed to observe it.

What I think the SDM is trying to tell us is that the IPI, even if it
goes on the same regular coherency fabric as memory transfers, is not
subject to the regular memory ordering rules.

Normal TSO rules tell us that when:

P1() {
	x = 1;
	y = 1;
}

P2() {
	r1 = y;
	r2 = x;
}

r2 must not be 0 when r1 is 1. Because if we see the store to y, we must
also see the store to x. But the IPI thing doesn't behave like a store. The
(fast) wrmsr isn't even considered a memop.

The thing is, the above ordering does not guarantee we have r2 != 0.
r2==0 is allowed when r1==0. And that's an entirely sane outcome even if
we run the instructions like:

		CPU1		CPU2

cycle-1		mov $1, ([x])
cycle-2		mov $1, ([y])
cycle-3				mov ([y]), rax
cycle-4				mov ([x]), rbx

There is no guarantee _any_ of the stores will have made it out. And
that's exactly the issue. The IPI might make it out of the core before
any of the stores will.
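
The P1/P2 example above can be checked against a toy store-buffer model.
The sketch below is only an illustration under stated assumptions: one FIFO
store buffer on CPU1, loads on CPU2 that read shared memory only, and
hypothetical names (tso_explore, tso_outcomes) that come from nowhere in the
kernel. It enumerates every interleaving and records which (r1, r2) pairs
are reachable.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (an assumption, not a hardware description):
 *   issued  = stores CPU1 has executed (0..2: first x, then y)
 *   drained = stores that have left CPU1's FIFO store buffer
 *   p2_pc   = loads CPU2 has executed (0..2: first y, then x)
 */
static void tso_explore(int issued, int drained, int p2_pc,
			int r1, int r2, bool seen[2][2])
{
	int x = drained >= 1;	/* x is the oldest buffered store */
	int y = drained >= 2;	/* y can only drain after x (FIFO) */

	assert(drained <= issued && issued <= 2);

	if (p2_pc == 2) {	/* both loads done: record (r1, r2) */
		seen[r1][r2] = true;
		return;
	}
	if (issued < 2)		/* CPU1 executes its next store */
		tso_explore(issued + 1, drained, p2_pc, r1, r2, seen);
	if (drained < issued)	/* oldest buffered store becomes visible */
		tso_explore(issued, drained + 1, p2_pc, r1, r2, seen);
	if (p2_pc == 0)		/* CPU2: r1 = y */
		tso_explore(issued, drained, 1, y, r2, seen);
	else			/* CPU2: r2 = x */
		tso_explore(issued, drained, 2, r1, x, seen);
}

/* Fill seen[r1][r2] with every outcome the model can reach. */
static void tso_outcomes(bool seen[2][2])
{
	for (int i = 0; i < 2; i++)
		for (int j = 0; j < 2; j++)
			seen[i][j] = false;
	tso_explore(0, 0, 0, 0, 0, seen);
}
```

In this model r1==1, r2==0 is never reached, while r1==0, r2==0 is reached
even on paths where both stores were executed before either load, because
the stores may still be sitting in the buffer.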

Furthermore, since there is no dependency between:

	mov	$1, ([x])
	wrmsr

The CPU is allowed to reorder the execution and retire the wrmsr before
the store. Very much like it would for normal non-dependent
instructions.
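
To make the consequence concrete, here is a toy model of the store/wrmsr
pair. It encodes my reading above rather than anything verified on hardware.
All names are hypothetical, and `ordered` is a made-up knob standing for
"something forces the store to be globally visible before the wrmsr
executes". The remote IPI handler is assumed to load x at some point after
the IPI has been sent.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (an assumption, not a hardware description):
 *   stored  = CPU1 has executed "mov $1, (x)"; the value may still
 *             sit in the store buffer
 *   visible = the store has drained and is globally visible
 *   sent    = CPU1 has executed the wrmsr that raises the IPI
 * seen[v] is set when the remote handler can load x == v.
 */
static void ipi_explore(bool stored, bool visible, bool sent,
			bool ordered, bool seen[2])
{
	assert(!visible || stored);	/* nothing drains before it is stored */

	if (sent)		/* handler may run and load x right now */
		seen[visible] = true;
	if (!stored)		/* CPU1 executes the store */
		ipi_explore(true, visible, sent, ordered, seen);
	if (stored && !visible)	/* store leaves the store buffer */
		ipi_explore(stored, true, sent, ordered, seen);
	if (!sent && (!ordered || visible))	/* CPU1 executes the wrmsr */
		ipi_explore(stored, visible, true, ordered, seen);
}
```

With ordered == false the handler can observe x == 0 as well as x == 1;
with ordered == true only x == 1 remains.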

And presumably it is still allowed to do that when we write it like:

	mov	$1, ([x])
	mfence
	wrmsr

because mfence only orders memops, and the (fast) wrmsr is not
a memop.

Which then brings us to:

	mov	$1, ([x])
	mfence
	lfence
	wrmsr

In this case, the lfence acts like the newly minted ifence (see
spectre), and will block execution of (any) later instructions until
completion of all prior instructions. This, and only this, ensures the
wrmsr happens after the mfence, which in turn ensures the store to x is
globally visible.
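
As a rough C sketch of that last sequence: wrmsr is privileged, so the
snippet below uses a plain variable (fake_msr) as a stand-in, and every
name in it is hypothetical. Only the "mfence; lfence" pair is the real
thing. The non-x86 branch exists purely so the sketch builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical stand-ins: real code would execute wrmsr at ring 0. */
static volatile uint64_t fake_msr;
static volatile int x;

/* mfence drains the store; lfence holds back later instructions. */
static inline void store_before_wrmsr_fence(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("mfence; lfence" ::: "memory");
#else
	__atomic_thread_fence(__ATOMIC_SEQ_CST);	/* build-only fallback */
#endif
}

/* Stand-in for the (fast) wrmsr that would raise the IPI. */
static void fake_wrmsr(uint64_t val)
{
	fake_msr = val;
}

static void publish_then_ipi(uint64_t icr)
{
	x = 1;				/* mov $1, ([x]) */
	store_before_wrmsr_fence();	/* mfence; lfence */
	fake_wrmsr(icr);		/* wrmsr */
}
```

In userspace this only demonstrates the instruction order. The ordering
problem itself needs a real non-serializing MSR write to show up.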


