linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Dave Hansen <dave.hansen@intel.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [GIT PULL] x86/shstk for 6.4
Date: Sat, 6 May 2023 13:09:07 -0700	[thread overview]
Message-ID: <CAHk-=wiB0wy6oXOsPtYU4DSbqJAY8z5iNBKdjdOp2LP23khUoA@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wiuVXTfgapmjYQvrEDzn3naF2oYnHuky+feEJSj_G_yFQ@mail.gmail.com>

On Sat, May 6, 2023 at 12:34 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> And I'm about a quarter in, haven't even gotten to the meat yet, and
> I've already found a bug.

Ok, so despite this I'm going a bit further, just to familiarize
myself with this even if I can't pull it.

And this is horrendous: pte_wrprotect() doing

    if (pte_dirty(pte))
        pte = pte_mksaveddirty(pte);

which actually has two fundamental problems:

 (a) it shouldn't be a conditional thing at all, it should just be bit
operations

 (b) "pte_dirty()" isn't even the right thing to use, since it
includes the SW dirty bit.

so what this *should* do is likely to just do

   pte.val |= (pte.val >> _PAGE_BIT_DIRTY) & 1) << _PAGE_BIT_SOFT_DIRTY;
   pte.val &= ~_PAGE_DIRTY;

which the compiler should be able to turn into a nice unconditional
series. Smaller than any conditional.

(You could simplify the above by just using fixed bit numbers - the
above is written with two shifts just to work with "any bit pair", but
obviously it could be written to be simpler and more straightforward
by just shifting the bit right into place)

I think the compilers may be able to do that all the simplification
for you even with a

    if (pte.val & _PAGE_DIRTY)
        pte.val |= _PAGE_SOFT_DIRTY;
    pte.val &= ~_PAGE_DIRTY;

but that is only true when there are *not* things like those
cpu_feature_enabled() tests in between those operations.

So I strongly suspect that all those

     if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK))

around this area are only making things worse. You're replacing a
couple of simple bit operations with jump-arounds (and then doing the
bit operations anyway in one side). And you are limited the compiler
from doing obvious simplifications in the process.

Conditionals are really bad, even when they can be turned into static jumps.

As static jumps they just cause odd code flow, and lack of testing.
And compiler barriers.

All these bit operations are actually cheaper - and would get more
test coverage - if they are just done unconditionally with a couple of
shift-and-mask operations.

Now, my reaction here is

 - the whole shadow stack notion of "dirty but not writable is a magic
marker" is *DISGUSTING*. It's wrong.

   Whatever Intel designer that came up with that - instead of just
picking another bit for the *HARDWARE* to check - should be ashamed.

   Now we have to pick a software bit instead, and play games for
this. BAD BAD BAD.

   I'm assuming this is something where Microsoft went "we already
don't have that, and we want all the sw bits for sw, so do this". But
from a design standpoint it's just nasty.

 - But if we have to play those games, just *play* them. Do it all
unconditionally, and make the x86-64 rules be that "dirty but not
writable" is something we should never have.

   Having two different rules, and conditionals for them, is both more
complex for maintainers, *and* for compilers.

So just make that _PAGE_BIT_SOFT_DIRTY be unconditional, and let's
just live with it. But make the bit operations efficient.

Maybe I'm missing something, and the people who have been working on
this have a really good reason for this mess. But it really looks
horrible to me.

So maybe you can explain in small words why I'm wrong, but right now
my feeling is that not only did I find an arm bug in the series
(trivially fixed with a one-liner, but worrying, and triggered by the
series being done in a particularly fragile way).

But I also think there are some x86 things that are just not done the
way they should have been done.

But again: maybe I don't understand some of the problems.

                   Linus

  reply	other threads:[~2023-05-06 20:09 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-24 21:21 [GIT PULL] x86/shstk for 6.4 Dave Hansen
2023-04-28 18:17 ` Linus Torvalds
2023-04-29  0:26   ` Edgecombe, Rick P
2023-04-29  0:40     ` Dave Hansen
2023-05-06 19:34       ` Linus Torvalds
2023-05-06 20:09         ` Linus Torvalds [this message]
2023-05-07  0:18           ` Edgecombe, Rick P
2023-05-07  0:38             ` Linus Torvalds
2023-05-07 15:57               ` Edgecombe, Rick P
2023-05-08 22:57           ` Dave Hansen
2023-05-08 23:31             ` Linus Torvalds
2023-05-08 23:47               ` Linus Torvalds
2023-05-12 17:34                 ` Dave Hansen
2023-05-12 21:55                   ` Linus Torvalds
2023-05-15 21:36                     ` Dave Hansen
2023-05-15 21:37                       ` Dave Hansen
2023-05-15 22:40                       ` Linus Torvalds
2023-05-15 23:02                         ` Linus Torvalds
2023-05-16 20:38                         ` Linus Torvalds
2023-05-16 20:42                           ` Dave Hansen
2023-05-09  0:07               ` Dave Hansen
2023-05-07  0:10         ` Edgecombe, Rick P
2023-05-07  0:19           ` Linus Torvalds
2023-05-07 16:24             ` Edgecombe, Rick P
2023-05-15 21:22               ` Deepak Gupta
2023-05-25 16:20                 ` Mark Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wiB0wy6oXOsPtYU4DSbqJAY8z5iNBKdjdOp2LP23khUoA@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).