linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: ira.weiny@intel.com, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-doc@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [PATCH V3 04/10] x86/pks: Preserve the PKRS MSR on context switch
Date: Fri, 18 Dec 2020 14:57:51 +0100	[thread overview]
Message-ID: <878s9vqkrk.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <87mtycqcjf.fsf@nanos.tec.linutronix.de>

On Thu, Dec 17 2020 at 23:43, Thomas Gleixner wrote:
> The only use case for this in your tree is: kmap() and the possible
> usage of that mapping outside of the thread context which sets it up.
>
> The only hint for doing this at all is:
>
>     Some users, such as kmap(), sometimes requires PKS to be global.
>
> 'sometime requires' is really _not_ a technical explanation.
>
> Where is the explanation why kmap() usage 'sometimes' requires this
> global trainwreck in the first place and where is the analysis why this
> can't be solved differently?
>
> Detailed use case analysis please.

A lengthy conversation with Dan and Dave over IRC confirmed what I was
suspecting.

The approach of this whole PKS thing is to make _all_ existing code
magically "work". That means aside of the obvious thread local mappings,
the kmap() part is needed to solve the problem of async handling where
the mapping is handed to some other context which then uses it and
notifies the context which created the mapping when done. That's the
principle which was used to make highmem work long time ago.

IMO that was a mistake back then. The right thing would have been to
change the code so that it does not rely on a temporary mapping created
by the initiator. Instead let the initiator hand the page over to the
other context which then creates a temporary mapping for fiddling with
it. Water under the bridge...

Glueing PKS on to that kmap() thing is horrible and global PKS is pretty
much the opposite of what PKS wants to achieve. It's disabling
protection systemwide for an unspecified amount of time and for all
contexts.

So instead of trying to make global PKS "work" we really should go and
take a smarter approach.

  1) Many kmap() use cases are strictly thread local and the mapped
     address is never handed to some other context, which means this can
     be replaced with kmap_local() now, which preserves the mapping
     accross preemption. PKS just works nicely on top of that.

  2) Modify kmap() so that it marks the to be mapped page as 'globaly
     unprotected' instead of doing this global unprotect PKS dance.
     kunmap() undoes that. That obviously needs some thought
     vs. refcounting if there are concurrent users, but that's a
     solvable problem either as part of struct page itself or
     stored in some global hash.

  3) Have PKS modes:

     - STRICT:   No pardon
     
     - RELAXED:  Warn and unprotect temporary for the current context

     - SILENT:	 Like RELAXED, but w/o warning to make sysadmins happy.
                 Default should be RELAXED.

     - OFF:      Disable the whole PKS thing


  4) Have a smart #PF mechanism which does:

     if (error_code & X86_PF_PK) {
         page = virt_to_page(address);

         if (!page || !page_is_globaly_unprotected(page))
                 goto die;

         if (pks_mode == PKS_MODE_STRICT)
         	 goto die;

         WARN_ONCE(pks_mode == PKS_MODE_RELAXED, "Useful info ...");

         temporary_unprotect(page, regs);
         return;
     }

     temporary_unprotect(page, regs)
     {
        key = page_to_key(page);

	/* Return from #PF will establish this for the faulting context */
        extended_state(regs)->pks &= ~PKS_MASK(key);
     }

     This temporary unprotect is undone when the context is left, so
     depending on the context (thread, interrupt, softirq) the
     unprotected section might be way wider than actually needed, but
     that's still orders of magnitudes better than having this fully
     unrestricted global PKS mode which is completely scopeless.

     The above is at least restricted to the pages which are in use for
     a particular operation. Stray pointers during that time are
     obviously not caught, but that's not any different from that
     proposed global thingy.

     The warning allows to find the non-obvious places so they can be
     analyzed and worked on.

  5) The DAX case which you made "work" with dev_access_enable() and
     dev_access_disable(), i.e. with yet another lazy approach of
     avoiding to change a handful of usage sites.

     The use cases are strictly context local which means the global
     magic is not used at all. Why does it exist in the first place?

     Aside of that this global thing would never work at all because the
     refcounting is per thread and not global.

     So that DAX use case is just a matter of:

        grant/revoke_access(DEV_PKS_KEY, READ/WRITE)

     which is effective for the current execution context and really
     wants to be a distinct READ/WRITE protection and not the magic
     global thing which just has on/off. All usage sites know whether
     they want to read or write.
   
     That leaves the question about the refcount. AFAICT, nothing nests
     in that use case for a given execution context. I'm surely missing
     something subtle here.

     Hmm?

Thanks,

        tglx
     


  reply	other threads:[~2020-12-18 13:58 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-06 23:28 [PATCH V3 00/10] PKS: Add Protection Keys Supervisor (PKS) support V3 ira.weiny
2020-11-06 23:28 ` [PATCH V3 01/10] x86/pkeys: Create pkeys_common.h ira.weiny
2020-11-06 23:29 ` [PATCH V3 02/10] x86/fpu: Refactor arch_set_user_pkey_access() for PKS support ira.weiny
2020-11-06 23:29 ` [PATCH V3 03/10] x86/pks: Add PKS defines and Kconfig options ira.weiny
2020-11-06 23:29 ` [PATCH V3 04/10] x86/pks: Preserve the PKRS MSR on context switch ira.weiny
2020-12-17 14:50   ` Thomas Gleixner
2020-12-17 22:43     ` Thomas Gleixner
2020-12-18 13:57       ` Thomas Gleixner [this message]
2020-12-18 19:20         ` Dan Williams
2020-12-18 21:06           ` Thomas Gleixner
2020-12-18 21:58             ` Dan Williams
2020-12-18 22:44               ` Thomas Gleixner
2020-12-18 19:42         ` Ira Weiny
2020-12-18 20:10           ` Dave Hansen
2020-12-18 21:30           ` Thomas Gleixner
2020-12-18  4:05     ` Ira Weiny
2020-12-17 20:41   ` [NEEDS-REVIEW] " Dave Hansen
2020-12-18  4:10     ` Ira Weiny
2020-12-18 15:33       ` Dave Hansen
2020-11-06 23:29 ` [PATCH V3 05/10] x86/entry: Pass irqentry_state_t by reference ira.weiny
2020-11-15 18:58   ` Thomas Gleixner
2020-11-16 18:49     ` Ira Weiny
2020-11-16 20:36       ` Thomas Gleixner
2020-11-24  6:09   ` [PATCH V3.1] entry: " ira.weiny
2020-12-11 22:14     ` Andy Lutomirski
2020-12-16  1:32       ` Ira Weiny
2020-12-16  2:09         ` Andy Lutomirski
2020-12-17  0:38           ` Ira Weiny
2020-12-17 13:07       ` Thomas Gleixner
2020-12-17 13:19         ` Peter Zijlstra
2020-12-17 15:35           ` Andy Lutomirski
2020-12-17 16:58     ` Thomas Gleixner
2020-11-06 23:29 ` [PATCH V3 06/10] x86/entry: Preserve PKRS MSR across exceptions ira.weiny
2020-12-17 15:28   ` Thomas Gleixner
2020-11-06 23:29 ` [PATCH V3 07/10] x86/fault: Report the PKRS state on fault ira.weiny
2020-11-06 23:29 ` [PATCH V3 08/10] x86/pks: Add PKS kernel API ira.weiny
2020-12-23 20:39   ` Randy Dunlap
2020-11-06 23:29 ` [PATCH V3 09/10] x86/pks: Enable Protection Keys Supervisor (PKS) ira.weiny
2020-11-06 23:29 ` [PATCH V3 10/10] x86/pks: Add PKS test code ira.weiny
2020-12-17 20:55   ` Dave Hansen
2020-12-18  4:05     ` Ira Weiny
2020-12-18 16:59       ` Dan Williams
2020-12-07 22:14 ` [PATCH V3 00/10] PKS: Add Protection Keys Supervisor (PKS) support V3 Ira Weiny
2020-12-08 15:55   ` Thomas Gleixner
2020-12-08 17:22     ` Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878s9vqkrk.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=ira.weiny@intel.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).