linux-kernel.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-doc@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [PATCH V3 04/10] x86/pks: Preserve the PKRS MSR on context switch
Date: Fri, 18 Dec 2020 22:30:16 +0100
Message-ID: <873602redz.fsf@nanos.tec.linutronix.de>
In-Reply-To: <20201218194249.GE1563847@iweiny-DESK2.sc.intel.com>

On Fri, Dec 18 2020 at 11:42, Ira Weiny wrote:
> On Fri, Dec 18, 2020 at 02:57:51PM +0100, Thomas Gleixner wrote:
>>   2) Modify kmap() so that it marks the to-be-mapped page as 'globally
>>      unprotected' instead of doing this global unprotect PKS dance.
>>      kunmap() undoes that. That obviously needs some thought
>>      vs. refcounting if there are concurrent users, but that's a
>>      solvable problem either as part of struct page itself or
>>      stored in some global hash.
>
> How would this globally unprotected flag work?  I suppose if kmap created a new
> PTE we could make that PTE non-PKS protected then we don't have to fiddle with
> the register...  I think I like that idea.

No. Look at the highmem implementation of kmap(). It's a terrible idea,
really. Don't even think about that.

There is _no_ global flag. The point is that the kmap is strictly bound
to a particular struct page. So you can simply do:

  kmap(page)
    if (page_is_access_protected(page))
        atomic_inc(&page->unprotect);

  kunmap(page)
    if (page_is_access_protected(page))
        atomic_dec(&page->unprotect);

and in the #PF handler:

    if (!page->unprotect)
       goto die;

The reason why I said: either in struct page itself or in a global hash
is that struct page is already packed and people are not really happy
about increasing its size. But the principle is roughly the same.
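
A minimal userspace sketch of the per-page counter described above, assuming the counter lives in the page structure itself. `struct page_model` and the `model_*` names are hypothetical stand-ins for illustration, not kernel APIs; the fault check returns a boolean instead of jumping to `die`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the fields this sketch needs. */
struct page_model {
    bool access_protected;  /* page is covered by a protection key */
    atomic_int unprotect;   /* number of outstanding kmap() users */
};

/* kmap(): note one more user that needs the page unprotected. */
static void model_kmap(struct page_model *page)
{
    if (page->access_protected)
        atomic_fetch_add(&page->unprotect, 1);
}

/* kunmap(): drop one user; the count must never go negative. */
static void model_kunmap(struct page_model *page)
{
    if (page->access_protected) {
        int old = atomic_fetch_sub(&page->unprotect, 1);
        assert(old > 0);  /* unbalanced kunmap() */
    }
}

/* #PF handler check: false corresponds to "goto die". */
static bool model_fault_allowed(struct page_model *page)
{
    return atomic_load(&page->unprotect) != 0;
}
```

With a global hash instead, the same counter would be looked up by page rather than embedded, and the three operations would stay the same.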

>> 
>>   4) Have a smart #PF mechanism which does:
>> 
>>      if (error_code & X86_PF_PK) {
>>          page = virt_to_page(address);
>> 
>>          if (!page || !page_is_globaly_unprotected(page))
>>                  goto die;
>> 
>>          if (pks_mode == PKS_MODE_STRICT)
>>                  goto die;
>> 
>>          WARN_ONCE(pks_mode == PKS_MODE_RELAXED, "Useful info ...");
>> 
>>          temporary_unprotect(page, regs);
>>          return;
>>      }
>
> I feel like this is very similar to what I had in the global patch you found in
> my git tree with the exception of the RELAXED mode.  I simply had globally
> unprotected or die.

Your stuff depends on that global_pks_state, which is not maintainable,
especially not the teardown side. This depends on per page state, which
is clearly way simpler and more focussed.
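
The decision order of the quoted #PF sketch can be modelled in userspace roughly as follows. This is an illustration under assumptions, not the proposed kernel code: the enum names are hypothetical, `PKS_MODEL_SILENT` is an assumed third mode implied by the conditional WARN_ONCE(), and the per-page state is reduced to two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum pks_mode_model {
    PKS_MODEL_STRICT,   /* every such fault is fatal */
    PKS_MODEL_RELAXED,  /* fix up the fault and warn once */
    PKS_MODEL_SILENT,   /* hypothetical: fix up without warning */
};

enum fault_result { FAULT_DIE, FAULT_FIXED_UP };

static bool warned_once;  /* models the one-shot behaviour of WARN_ONCE() */

static enum fault_result model_pks_fault(bool page_valid,
                                         bool page_unprotected,
                                         enum pks_mode_model mode)
{
    /* Per-page state decides first: no page or no kmap() user means die. */
    if (!page_valid || !page_unprotected)
        return FAULT_DIE;

    if (mode == PKS_MODEL_STRICT)
        return FAULT_DIE;

    if (mode == PKS_MODEL_RELAXED && !warned_once) {
        warned_once = true;
        fprintf(stderr, "PKS fault fixed up outside the mapping context\n");
    }

    /* The real handler would call temporary_unprotect(page, regs) here. */
    return FAULT_FIXED_UP;
}
```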

> Regardless I think unprotecting a global context is the easy part.  The code
> you had a problem with (and I see is fully broken) was the restriction of
> access.  A failure to update in that direction would only result in a wider
> window of access.  I contemplated not doing a global update at all and just
> leaving the access open until the next context switch.  But the code as it stands
> tries to force an update for a couple of reasons:
>
> 1) kmap_local_page() removes most of the need for global pks.  So I was
>    thinking that global PKS could be a slow path.
>
> 2) kmap()s that are handed to other contexts are likely to be 'long term'
>    and should not need to be updated 'too' often.  I will admit that I don't
>    know how often 'too often' is.

Even once in a while is not a justification for stopping the world for N
milliseconds.

>>      temporary_unprotect(page, regs)
>>      {
>>         key = page_to_key(page);
>> 
>>         /* Return from #PF will establish this for the faulting context */
>>         extended_state(regs)->pks &= ~PKS_MASK(key);
>>      }
>> 
>>      This temporary unprotect is undone when the context is left, so
>>      depending on the context (thread, interrupt, softirq) the
>>      unprotected section might be way wider than actually needed, but
>>      that's still orders of magnitude better than having this fully
>>      unrestricted global PKS mode which is completely scopeless.
>
> I'm not sure I follow you.  How would we know when the context is
> left?

The context goes away on its own. Either context switch or return from
interrupt. As I said there is an extended window where the external
context still might have unprotected access even if the initiating
context has called kunmap() already. It's not pretty, but it's not the
end of the world either.
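
As a rough illustration of the bit manipulation behind temporary_unprotect(), here is a userspace sketch. It assumes the PKRS layout of 16 keys with two bits each (access-disable and write-disable); `pks_mask()` and `struct saved_state_model` are hypothetical names standing in for PKS_MASK() and extended_state(regs).

```c
#include <assert.h>
#include <stdint.h>

#define PKS_NUM_KEYS      16
#define PKS_BITS_PER_KEY  2
#define PKS_AD_BIT        0x1u  /* access disable */
#define PKS_WD_BIT        0x2u  /* write disable */

/* Both restriction bits of one key, shifted to that key's position. */
static uint32_t pks_mask(unsigned int key)
{
    assert(key < PKS_NUM_KEYS);
    return (PKS_AD_BIT | PKS_WD_BIT) << (key * PKS_BITS_PER_KEY);
}

/* Hypothetical saved state of the faulting context. */
struct saved_state_model {
    uint32_t pks;
};

/* Clear the restrictions for one key in the saved state only. */
static void model_temporary_unprotect(struct saved_state_model *st,
                                      unsigned int key)
{
    st->pks &= ~pks_mask(key);
}
```

Nothing in this sketch re-protects the key. Only the saved value of the faulting context is changed, so the relaxation disappears together with that context.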

That's why I suggested to have that WARN_ONCE() so we can actually see
why and where that happens and think about solutions to make this go
into local context, e.g. by changing the vaddr pointer to a struct page
pointer for these particular use cases and then the other context can do
kmap/unmap_local().

>>   5) The DAX case which you made "work" with dev_access_enable() and
>>      dev_access_disable(), i.e. with yet another lazy approach of
>>      avoiding to change a handful of usage sites.
>> 
>>      The use cases are strictly context local which means the global
>>      magic is not used at all. Why does it exist in the first place?
>
> I'm not following.  What is 'it'?

That global argument to dev_access_enable()/disable(). 

>>      That leaves the question about the refcount. AFAICT, nothing nests
>>      in that use case for a given execution context. I'm surely missing
>>      something subtle here.
>
> The refcount is needed for non-global pks as well as global.  I've not resolved
> if anything needs to be done with the refcount on the global update since the
> following is legal.
>
> kmap()
> kmap_local_page()
> kunmap()
> kunmap_local()
>
> Which would be a problem.  But I don't think it is ever actually done.

If it does not exist why would we support it in the first place? We can
have some warning there to catch that case.

> Another problem would be if the kmap and kunmap happened in different
> contexts...  :-/  I don't think that is done either but I don't know for
> certain.
>
> Frankly, my main focus before any of this global support has been to
> get rid of as many kmaps as possible.[1] Once that is done I think
> more of these questions can be answered better.

I was expecting that you could answer these questions :)

Thanks,

        tglx
