From: Andy Lutomirski <luto@amacapital.net>
To: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	X86 ML <x86@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>
Subject: Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions
Date: Thu, 23 Jul 2020 10:08:42 -0700	[thread overview]
Message-ID: <C03DA782-BD1A-42E3-B118-ABB34BC5F2AF@amacapital.net> (raw)
In-Reply-To: <20200723165204.GB77434@romley-ivt3.sc.intel.com>


> On Jul 23, 2020, at 9:52 AM, Fenghua Yu <fenghua.yu@intel.com> wrote:
> 
> Hi, Dave,
> 
>> On Thu, Jul 23, 2020 at 09:23:13AM -0700, Dave Hansen wrote:
>>> On 7/23/20 9:18 AM, Fenghua Yu wrote:
>>> The PKRS MSR has been preserved in thread_info during kernel entry. We
>>> don't need to preserve it in another place (i.e. idtentry_state).
>> 
>> I'm missing how the PKRS MSR gets preserved in thread_info.  Could you
>> explain the mechanism by which this happens and point to the code
>> implementing it, please?
> 
> [Sorry, my mistake: I mean "thread_struct" instead of "thread_info".
> Hopefully the typo doesn't change the essential part in my last email.]
> 
> The "saved_pkrs" is defined in thread_struct and context switched in
> patch 04/17:
> https://lore.kernel.org/lkml/20200717072056.73134-5-ira.weiny@intel.com/
> 
> Because there is no XSAVE support the PKRS MSR, we preserve it in
> "saved_pkrs" in thread_struct. It's initialized as 0 (init state, no
> protection key) in fork() or exec(). It's updated to a right protection
> value when a driver calls the updating API. The PKRS MSR is context
> switched by "saved_pkrs" when switching to a task (unless optimized if the
> cached MSR is the same as the saved one).
> 
> 

Suppose some kernel code (a syscall or kernel thread) changes PKRS and then takes a page fault. The page fault handler needs a fresh PKRS. Then the page fault handler (say a VMA’s .fault handler) changes PKRS.  Then we get an interrupt. The interrupt *also* needs a fresh PKRS, and the page fault value needs to be saved somewhere.

So we have more than one saved value per thread, and thread_struct isn’t going to solve this problem.
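
To make the clobbering concrete, here is a minimal user-space sketch of the nesting described above. Nothing in it is kernel code: pkrs_msr is a plain variable standing in for the MSR, thread_saved_pkrs models a single per-thread slot like saved_pkrs, and PKRS_FRESH, slot_entry(), slot_exit() and the values 0x4 and 0x10 are hypothetical names and numbers chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical model: one variable stands in for the PKRS MSR. */
static uint32_t pkrs_msr;

/* One save slot per thread, modeled on saved_pkrs in thread_struct. */
static uint32_t thread_saved_pkrs;

/* Assumed "fresh" value that each new entry starts with. */
#define PKRS_FRESH 0u

/* Entry: stash the current PKRS in the single slot, then reset it. */
static void slot_entry(void)
{
	thread_saved_pkrs = pkrs_msr;
	pkrs_msr = PKRS_FRESH;
}

/* Exit: restore PKRS from the single slot. */
static void slot_exit(void)
{
	pkrs_msr = thread_saved_pkrs;
}

/*
 * Returns the PKRS value the interrupted syscall code sees once the
 * page fault has returned.
 */
uint32_t nested_with_single_slot(void)
{
	pkrs_msr = 0x4;   /* syscall code changes PKRS */
	slot_entry();     /* page fault: slot = 0x4 */
	pkrs_msr = 0x10;  /* .fault handler changes PKRS */
	slot_entry();     /* interrupt: slot = 0x10, the 0x4 is overwritten */
	slot_exit();      /* back in the fault handler: PKRS = 0x10 */
	slot_exit();      /* back in the syscall: PKRS = 0x10, not 0x4 */
	return pkrs_msr;
}
```

In this model the syscall code resumes with the fault handler's value rather than its own, which is the sense in which one slot per thread is not enough.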

But idtentry_state is also not great for a couple of reasons.  Not all entries have idtentry_state, and the unwinder can’t find it for debugging. For that matter, the page fault logic probably wants to know the previous PKRS, so it should either be stashed somewhere findable or it should be explicitly passed around.

My suggestion is to enlarge pt_regs.  The save and restore logic can probably be in C, but pt_regs is the logical place to put a register that is saved and restored across all entries.
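
As a rough sketch of that idea, again in user space and under stated assumptions: struct regs_sketch is a hypothetical, heavily trimmed stand-in for an enlarged pt_regs, its pkrs field is an assumed name, and entry_save_pkrs() and exit_restore_pkrs() are made-up helpers, not existing kernel functions. The point is only that each entry gets its own frame, so nested entries each have somewhere to save the interrupted value.

```c
#include <stdint.h>

/* Hypothetical model: one variable stands in for the PKRS MSR. */
static uint32_t pkrs_msr;

/* Assumed "fresh" value that each new entry starts with. */
#define PKRS_FRESH 0u

/* Hypothetical, heavily trimmed stand-in for an enlarged pt_regs. */
struct regs_sketch {
	uint32_t pkrs;      /* assumed new field: PKRS of the interrupted context */
	unsigned long ip;
	unsigned long sp;
};

/* Entry: save the interrupted PKRS in this entry's frame, then reset it. */
static void entry_save_pkrs(struct regs_sketch *regs)
{
	regs->pkrs = pkrs_msr;
	pkrs_msr = PKRS_FRESH;
}

/* Exit: restore the PKRS saved in this entry's frame. */
static void exit_restore_pkrs(const struct regs_sketch *regs)
{
	pkrs_msr = regs->pkrs;
}

/*
 * Same nesting as in the text: syscall, then page fault, then interrupt.
 * Returns the PKRS the syscall code sees after the fault returns, or
 * UINT32_MAX if the fault handler did not get its own value back.
 */
uint32_t nested_with_per_entry_frames(void)
{
	struct regs_sketch fault_regs = {0}, irq_regs = {0};

	pkrs_msr = 0x4;                 /* syscall code changes PKRS */
	entry_save_pkrs(&fault_regs);   /* page fault: its frame holds 0x4 */
	pkrs_msr = 0x10;                /* .fault handler changes PKRS */
	entry_save_pkrs(&irq_regs);     /* interrupt: its frame holds 0x10 */
	exit_restore_pkrs(&irq_regs);   /* back in the fault handler */
	if (pkrs_msr != 0x10)
		return UINT32_MAX;
	exit_restore_pkrs(&fault_regs); /* back in the syscall */
	return pkrs_msr;
}
```

Because the saved value lives in the frame that is handed to the handler, a handler that wants the previous PKRS can read it from its regs argument, which fits the "stashed somewhere findable" requirement.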

Whoever does this work will have the delightful job of figuring out whether BPF thinks that the layout of pt_regs is ABI and, if so, fixing the resulting mess.

The fact that the new fields will go at the beginning of pt_regs will make this an entertaining prospect.
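
A small illustration of why the position matters, assuming two hypothetical, heavily trimmed layouts (neither is the real pt_regs, and the field name pkrs is made up): prepending a field moves every existing field, so anything that baked in the old offsets would read the wrong slot.

```c
#include <stddef.h>

/* Hypothetical trimmed "before" layout. */
struct regs_old {
	unsigned long r15;
	unsigned long r14;
	unsigned long ip;
};

/* Hypothetical "after" layout with one new field at the beginning. */
struct regs_new {
	unsigned long pkrs; /* assumed new field, widened to a full word */
	unsigned long r15;
	unsigned long r14;
	unsigned long ip;
};

/* How far the ip field moved once the new field was prepended. */
size_t ip_offset_shift(void)
{
	return offsetof(struct regs_new, ip) - offsetof(struct regs_old, ip);
}
```

In this sketch every pre-existing field moves by one word. Appending the field instead would leave the existing offsets alone, but that option is outside what the text above proposes.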

