All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>,
	"Luck, Tony" <tony.luck@intel.com>,
	"Yu, Fenghua" <fenghua.yu@intel.com>,
	David Laight <David.Laight@aculab.com>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, "Raj, Ashok" <ashok.raj@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter
Date: Wed, 11 Dec 2019 10:12:56 -0800	[thread overview]
Message-ID: <CALCETrXUZ790WFk9SEzuiKg-wMva=RpWhZNYPf+MqzT0xdu+gg@mail.gmail.com> (raw)
In-Reply-To: <20191211175202.GQ2827@hirez.programming.kicks-ass.net>

On Wed, Dec 11, 2019 at 9:52 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Nov 22, 2019 at 01:23:30PM -0800, Andy Lutomirski wrote:
> > On Fri, Nov 22, 2019 at 12:31 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Fri, Nov 22, 2019 at 05:48:14PM +0000, Luck, Tony wrote:
> > > > > When we use byte ops, we must consider the word as 4 independent
> > > > > variables. And in that case the later load might observe the lock-byte
> > > > > state from 3, because the modification to the lock byte from 4 is in
> > > > > CPU2's store-buffer.
> > > >
> > > > So we absolutely violate this with the optimization for constant arguments
> > > > to set_bit(), clear_bit() and change_bit() that are implemented as byte ops.
> > > >
> > > > So is code that does:
> > > >
> > > >       set_bit(0, bitmap);
> > > >
> > > > on one CPU. While another is doing:
> > > >
> > > >       set_bit(mybit, bitmap);
> > > >
> > > > on another CPU safe? The first operates on just one byte, the second  on 8 bytes.
> > >
> > > It is safe if all you care about is the consistency of that one bit.
> > >
> >
> > I'm still lost here.  Can you explain how one could write code that
> > observes an issue?  My trusty SDM, Vol 3 8.2.2 says "Locked
> > instructions have a total order."
>
> This is the thing I don't fully believe. Per this thread the bus-lock is
> *BAD* and not used for normal LOCK prefixed operations. But without the
> bus-lock it becomes very hard to guarantee total order.
>
> After all, if some CPU doesn't observe a specific variable, it doesn't
> care where in the order it fell. So I'm thinking they punted and went
> with some partial order that is near enough that it becomes very hard to
> tell the difference the moment you actually do observe stuff.

I hope that, if the SDM is indeed wrong, that Intel would fix the SDM.
It's definitely not fun to try to understand locking if we don't trust
the manual.

>
> > 8.2.3.9 says "Loads and Stores Are
> > Not Reordered with Locked Instructions."  Admittedly, the latter is an
> > "example", but the section is very clear about the fact that a locked
> > instruction prevents reordering of a load or a store issued by the
> > same CPU relative to the locked instruction *regardless of whether
> > they overlap*.
>
> IIRC this rule is CPU-local.
>
> Sure, but we're talking two cpus here.
>
>         u32 var = 0;
>         u8 *ptr = &var;
>
>         CPU0                    CPU1
>
>                                 xchg(ptr, 1)
>
>         xchg((ptr+1, 1);
>         r = READ_ONCE(var);
>
> AFAICT nothing guarantees r == 0x0101. The CPU1 store can be stuck in
> CPU1's store-buffer. CPU0's xchg() does not overlap and therefore
> doesn't force a snoop or forward.

I think I don't quite understand.  The final value of var had better
be 0x0101 or something is severely wrong.  But r can be 0x0100 because
nothing in this example guarantees that the total order of the locked
instructions has CPU 1's instruction first.

>
> From the perspective of the LOCK prefixed instructions CPU0 never
> observes the variable @ptr. And therefore doesn't need to provide order.

I suspect that the implementation works on whole cache lines for
everything except the actual store buffer entries, which would mean
that CPU 0 does think it observed ptr[0].

>
> Note how the READ_ONCE() is a normal load on CPU0, and per the rules is
> only forced to happen after it's own LOCK prefixed instruction, but it
> is free to observe ptr[0,2,3] from before, only ptr[1] will be forwarded
> from its own store-buffer.
>
> This is exactly the one reorder TSO allows.

If so, then our optimized smp_mb() has all kinds of problems, no?

>
> > I understand that the CPU is probably permitted to optimize a LOCK RMW
> > operation such that it retires before the store buffers of earlier
> > instructions are fully flushed, but only if the store buffer and cache
> > coherency machinery work together to preserve the architecturally
> > guaranteed ordering.
>
> Maybe, maybe not. I'm very loathe to trust this without things being
> better specified.
>
> Like I said, it is possible that it all works, but the way I understand
> things I _really_ don't want to rely on it.
>
> Therefore, I've written:
>
>         u32 var = 0;
>         u8 *ptr = &var;
>
>         CPU0                    CPU1
>
>                                 xchg(ptr, 1)
>
>         set_bit(8, ptr);
>
>         r = READ_ONCE(var);
>
> Because then the LOCK BTSL overlaps with the LOCK XCHGB and CPU0 now
> observes the variable @ptr and therefore must force order.
>
> Did this clarify, or confuse more?

Probably confuses more.

If you're actual concerned that the SDM is wrong, I think that roping
in some architects would be a good idea.

I still think that making set_bit() do 32-bit or smaller accesses is okay.

  reply	other threads:[~2019-12-11 18:13 UTC|newest]

Thread overview: 145+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-21  0:53 [PATCH v10 0/6] Enable split lock detection for real time and debug Fenghua Yu
2019-11-21  0:53 ` [PATCH v10 1/6] x86/msr-index: Add two new MSRs Fenghua Yu
2019-11-21  0:53 ` [PATCH v10 2/6] x86/cpufeatures: Enumerate the IA32_CORE_CAPABILITIES MSR Fenghua Yu
2019-11-21  0:53 ` [PATCH v10 3/6] x86/split_lock: Enumerate split lock detection by " Fenghua Yu
2019-11-21  0:53 ` [PATCH v10 4/6] x86/split_lock: Enumerate split lock detection if the IA32_CORE_CAPABILITIES MSR is not supported Fenghua Yu
2019-11-21 22:07   ` Andy Lutomirski
2019-11-22  0:37     ` Fenghua Yu
2019-11-22  2:13       ` Andy Lutomirski
2019-11-22  9:46         ` Peter Zijlstra
2019-11-21  0:53 ` [PATCH v10 5/6] x86/split_lock: Handle #AC exception for split lock Fenghua Yu
2019-11-21 22:10   ` Andy Lutomirski
2019-11-21 23:14     ` Fenghua Yu
2019-11-21 23:12       ` Andy Lutomirski
2019-11-21  0:53 ` [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter Fenghua Yu
2019-11-21  6:04   ` Ingo Molnar
2019-11-21 13:01     ` Peter Zijlstra
2019-11-21 13:15       ` Peter Zijlstra
2019-11-21 21:51         ` Luck, Tony
2019-11-21 22:24           ` Andy Lutomirski
2019-11-21 22:29             ` Luck, Tony
2019-11-21 23:18               ` Andy Lutomirski
2019-11-21 23:53                 ` Fenghua Yu
2019-11-22  1:52                   ` Sean Christopherson
2019-11-22  2:21                     ` Andy Lutomirski
2019-11-22  2:39                       ` Xiaoyao Li
2019-11-22  2:57                         ` Andy Lutomirski
2019-11-21 23:55                 ` Luck, Tony
2019-11-22  0:55             ` Luck, Tony
2019-11-22 10:08           ` Peter Zijlstra
2019-11-21 16:14       ` Fenghua Yu
2019-11-21 17:14         ` Ingo Molnar
2019-11-21 17:35         ` Peter Zijlstra
2019-11-21 17:12       ` Ingo Molnar
2019-11-21 17:34         ` Luck, Tony
2019-11-22 10:51           ` Peter Zijlstra
2019-11-22 15:27             ` Peter Zijlstra
2019-11-22 17:22               ` Luck, Tony
2019-11-22 20:23                 ` Peter Zijlstra
2019-11-22 18:02               ` Luck, Tony
2019-11-22 20:23                 ` Peter Zijlstra
2019-11-22 20:42                   ` Fenghua Yu
2019-11-22 21:25                     ` Andy Lutomirski
2019-12-12  8:57                       ` Peter Zijlstra
2019-12-12 18:52                         ` Luck, Tony
2019-12-12 19:46                           ` Luck, Tony
2019-12-12 20:01                             ` Andy Lutomirski
2019-12-16 16:21                               ` David Laight
2019-11-22 18:44               ` Sean Christopherson
2019-11-22 20:30                 ` Peter Zijlstra
2019-11-23  0:30               ` Luck, Tony
2019-11-25 16:13                 ` Sean Christopherson
2019-12-02 18:20                   ` Luck, Tony
2019-12-12  8:59                   ` Peter Zijlstra
2020-01-10 19:24                     ` [PATCH v11] x86/split_lock: Enable split lock detection by kernel Luck, Tony
2020-01-14  5:55                       ` Sean Christopherson
2020-01-15 22:27                         ` Luck, Tony
2020-01-15 22:57                           ` Sean Christopherson
2020-01-15 23:48                             ` Luck, Tony
2020-01-22 18:55                             ` [PATCH v12] " Luck, Tony
2020-01-22 19:04                               ` Borislav Petkov
2020-01-22 20:03                                 ` Luck, Tony
2020-01-22 20:55                                   ` Borislav Petkov
2020-01-22 22:42                               ` Arvind Sankar
2020-01-22 22:52                                 ` Arvind Sankar
2020-01-22 23:24                                 ` Luck, Tony
2020-01-23  0:45                                   ` Arvind Sankar
2020-01-23  1:23                                     ` Luck, Tony
2020-01-23  4:21                                       ` Arvind Sankar
2020-01-23 17:15                                         ` Luck, Tony
2020-01-23  3:53                                     ` [PATCH v13] " Luck, Tony
2020-01-23  4:45                                       ` Arvind Sankar
2020-01-23 23:16                                         ` [PATCH v14] " Luck, Tony
2020-01-24 21:36                                           ` Thomas Gleixner
2020-01-25  2:47                                             ` [PATCH v15] " Luck, Tony
2020-01-25 10:44                                               ` Borislav Petkov
2020-01-25 19:55                                                 ` Luck, Tony
2020-01-25 20:12                                                   ` Peter Zijlstra
2020-01-25 20:33                                                     ` Borislav Petkov
2020-01-25 21:42                                                       ` Luck, Tony
2020-01-25 22:17                                                         ` Borislav Petkov
2020-01-25 20:29                                                   ` Borislav Petkov
2020-01-25 13:41                                               ` Thomas Gleixner
2020-01-25 22:07                                                 ` [PATCH v16] " Luck, Tony
2020-01-25 22:43                                                   ` Mark D Rustad
2020-01-25 23:10                                                     ` Luck, Tony
2020-01-26 17:27                                                       ` Mark D Rustad
2020-01-26 20:05                                                         ` [PATCH v17] " Luck, Tony
2020-01-29 12:31                                                           ` Thomas Gleixner
2020-01-29 15:24                                                           ` [tip: x86/cpu] " tip-bot2 for Peter Zijlstra (Intel)
2020-02-03 20:41                                                           ` [PATCH v17] " Sean Christopherson
2020-02-06  0:49                                                             ` [PATCH] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR Luck, Tony
2020-02-06  1:18                                                               ` Andy Lutomirski
2020-02-06 16:46                                                                 ` Luck, Tony
2020-02-06 19:37                                                                   ` Andy Lutomirski
2020-03-03 19:22                                                                     ` Sean Christopherson
2020-02-04  0:04                                                           ` [PATCH v17] x86/split_lock: Enable split lock detection by kernel Sean Christopherson
2020-02-04 12:52                                                             ` Thomas Gleixner
2020-01-26  0:34                                                   ` [PATCH v16] " Andy Lutomirski
2020-01-26 20:01                                                     ` Luck, Tony
2020-01-25 21:25                                               ` [PATCH v15] " Arvind Sankar
2020-01-25 21:50                                                 ` Luck, Tony
2020-01-25 23:51                                                   ` Arvind Sankar
2020-01-26  2:52                                                     ` Luck, Tony
2020-01-27  2:05                                                       ` Tony Luck
2020-01-27  8:04                                                   ` Peter Zijlstra
2020-01-27  8:36                                                     ` Peter Zijlstra
2020-01-27 17:35                                                     ` Luck, Tony
2020-01-27  8:02                                                 ` Peter Zijlstra
2019-12-13  0:09               ` [PATCH v11] x86/split_lock: Enable split lock detection by kernel parameter Tony Luck
2019-12-13  0:16                 ` Luck, Tony
2019-11-21 17:43         ` [PATCH v10 6/6] " David Laight
2019-11-21 17:51           ` Andy Lutomirski
2019-11-21 18:53             ` Fenghua Yu
2019-11-21 19:01               ` Andy Lutomirski
2019-11-21 20:25                 ` Fenghua Yu
2019-11-21 20:19                   ` Peter Zijlstra
2019-11-21 19:46               ` Peter Zijlstra
2019-11-21 20:25               ` Peter Zijlstra
2019-11-21 21:22                 ` Andy Lutomirski
2019-11-22  9:25                   ` Peter Zijlstra
2019-11-22 17:48                     ` Luck, Tony
2019-11-22 20:31                       ` Peter Zijlstra
2019-11-22 21:23                         ` Andy Lutomirski
2019-12-11 17:52                           ` Peter Zijlstra
2019-12-11 18:12                             ` Andy Lutomirski [this message]
2019-12-11 22:34                               ` Peter Zijlstra
2019-12-12 19:40                                 ` Andy Lutomirski
2019-12-16  9:59                                   ` David Laight
2019-12-16 17:22                                     ` Andy Lutomirski
2019-12-16 17:45                                       ` David Laight
2019-12-16 18:06                                         ` Andy Lutomirski
2019-12-17 10:03                                           ` David Laight
2019-12-11 18:44                             ` Luck, Tony
2019-12-11 22:39                               ` Peter Zijlstra
2019-12-12 10:36                                 ` David Laight
2019-12-12 13:04                                   ` Peter Zijlstra
2019-12-12 16:02                                     ` Andy Lutomirski
2019-12-12 16:23                                       ` David Laight
2019-12-12 16:29                                     ` David Laight
2019-11-21 19:56             ` Peter Zijlstra
2019-11-21 21:01               ` Andy Lutomirski
2019-11-22  9:36                 ` Peter Zijlstra
2019-11-22  9:46             ` David Laight
2019-11-22 20:32               ` Peter Zijlstra
2019-11-21  8:00   ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrXUZ790WFk9SEzuiKg-wMva=RpWhZNYPf+MqzT0xdu+gg@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=David.Laight@aculab.com \
    --cc=ashok.raj@intel.com \
    --cc=bp@alien8.de \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.