LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Andi Kleen <ak@linux.intel.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>,
	Ingo Molnar <mingo@kernel.org>, Andy Lutomirski <luto@kernel.org>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Ravi Shankar <ravi.v.shankar@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	x86@kernel.org, Linus Torvalds <torvalds@linux-foundation.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: New feature/ABI review process [was Re: [RESEND PATCH v6 04/12] x86/fsgsbase/64:..]
Date: Wed, 27 Mar 2019 22:15:50 +0100 (CET)
Message-ID: <alpine.DEB.2.21.1903272147180.1789@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20190326225638.GQ18020@tassilo.jf.intel.com>

On Tue, 26 Mar 2019, Andi Kleen wrote:
> As long as everything is cache hot it's likely only a couple
> of cycles difference (as Intel CPUs are very good executing
> crappy code too), but if it's not then you end up with a huge cache miss
> cost, causing jitter. That's a problem for real time for example.

That extra cache miss is really not the worst issue for realtime. The
inherent latencies of contemporary systems have way worse to offer than
that. Any realtime system has to cope with the worst case and an extra
cache miss is not the end of the world.

> >   > Accessing user GSBASE needs a couple of SWAPGS operations. It is
> >   > avoidable if the user GSBASE is saved at kernel entry, being updated as
> >   > changes, and restored back at kernel exit. However, it seems to spend
> >   > more cycles for savings and restorations. Little or no benefit was
> >   > measured from experiments.
> > 
> > So little or no benefit was measured. I don't see how that maps to your
> > 'SWAPGS will be a lot faster' claim. One of those claims is obviously
> > wrong.
> 
> If everything is cache hot it won't make much difference,
> but if you have a cache miss you end up eating the cost.
> 
> > 
> > Aside of this needs more than numbers:
> > 
> >   1) Proper documentation how the mixed bag is managed.
> 
> How SWAPGS is managed?
> 
> Like it always was since 20+ years when the x86_64
> port was originally born.

I know how SWAPGS works.
 
> The only case which has to do an two SWAPGS is the 
> context switch when it switches the base. Everything else
> just does SWAPGS at the edges for kernel entries.

And exactly here is the problem. You are not even describing it correctly
now:

	You cannot do SWAPGS on _all_ edges.

You cannot do SWAPGS in the paranoid entry when FSGSBASE is in use, because
user space can write arbitrary values into GS. Which breaks the existing
differentiation of kernel/user GS. That's why you have the FSGSBASE variant
there. Is that documented?

The changelog has some convoluted description of it:

  "The FSGSBASE instructions allow fast accesses on GSBASE.  Now, at the
   paranoid_entry, the per-CPU base value can be always copied to GSBASE.
   And the original GSBASE value will be restored at the exit."

So that part blurbs about fast access and comes first. Really useful.

  "So far, GSBASE modification has not been directly allowed from userspace.
   So, swapping GSBASE has been conditionally executed according to the
   kernel-enforced convention that a negative GSBASE indicates a kernel value.
   But when FSGSBASE is enabled, userspace can put an arbitrary value in
   GSBASE. The change will secure a correct GSBASE value with FSGSBASE."

I can decode that because I'm familiar with the inner workings of the
paranoid entry code. But that changelog is just not providing properly
structured information and the full context.

What's worse is the comment in the code itself:

+ * When FSGSBASE enabled, current GSBASE is always copied to %rbx.

Where is the documentation that FSGSBASE is required to be used here and
why? I can blody well see from the code that the FSGSBASE path does this
unconditionally. But that does not explain why and it does not explain why
FSGSBASE is not used all over the place instead of SWAPGS and just here.

+ * Without FSGSBASE, SWAPGS is needed when entering from userspace.
+ * A positive GSBASE means it is a user value and a negative GSBASE
+ * means it is a kernel value.

So this has more explanation about the SWAPGS mode than about the
subtlities of FSGSBASE.

This stuff wants to be documented in great length for everyones sake
including yourself when you have to stare into that code a year from now. I
don't care about you're headache but I care about mine and that of people
who might end up debugging some subtle bug in that area.

Thanks,

	tglx








  reply index

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-15 20:06 [RESEND PATCH v6 00/12] x86: Enable FSGSBASE instructions Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 01/12] x86/fsgsbase/64: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 02/12] kbuild: Raise the minimum required binutils version to 2.21 Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 03/12] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 04/12] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions Chang S. Bae
2019-03-25 11:38   ` Thomas Gleixner
2019-03-25 12:46     ` Thomas Gleixner
2019-03-25 13:05       ` Thomas Gleixner
2019-03-26  0:38     ` Andi Kleen
2019-03-26 15:01       ` New feature/ABI review process [was Re: [RESEND PATCH v6 04/12] x86/fsgsbase/64:..] Thomas Gleixner
2019-03-26 22:56         ` Andi Kleen
2019-03-27 21:15           ` Thomas Gleixner [this message]
2019-03-15 20:06 ` [RESEND PATCH v6 05/12] x86/fsgsbase/64: Preserve FS/GS state in __switch_to() if FSGSBASE is on Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 06/12] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 07/12] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro Chang S. Bae
2019-03-25  9:02   ` Thomas Gleixner
2019-05-01 13:52     ` Bae, Chang Seok
2019-03-15 20:06 ` [RESEND PATCH v6 08/12] x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry Chang S. Bae
2019-03-25  9:44   ` Thomas Gleixner
2019-04-05  8:35     ` Thomas Gleixner
2019-04-05 13:50       ` Andy Lutomirski
2019-05-01 13:52         ` Bae, Chang Seok
2019-05-01 17:40           ` Andy Lutomirski
2019-05-01 18:01             ` Bae, Chang Seok
     [not found]               ` <7029A32B-958E-4C1E-8B5F-D49BA68E4755@intel.com>
2019-05-01 20:25                 ` Andy Lutomirski
2019-05-01 21:04                   ` Bae, Chang Seok
2019-05-02  0:29                     ` Andy Lutomirski
2019-05-06 22:56     ` Bae, Chang Seok
2019-03-15 20:06 ` [RESEND PATCH v6 09/12] selftests/x86/fsgsbase: Test WRGSBASE Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 10/12] x86/fsgsbase/64: Enable FSGSBASE by default and add a chicken bit Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 11/12] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Chang S. Bae
2019-03-15 20:06 ` [RESEND PATCH v6 12/12] x86/fsgsbase/64: Add documentation for FSGSBASE Chang S. Bae
2019-03-30 16:15   ` Randy Dunlap
2019-03-26  0:43 ` [RESEND PATCH v6 00/12] x86: Enable FSGSBASE instructions Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.1903272147180.1789@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=ak@linux.intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=arjan@linux.intel.com \
    --cc=chang.seok.bae@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git