linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: ira.weiny@intel.com, Dave Hansen <dave.hansen@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-mm@kvack.org
Subject: Re: [PATCH V7 08/18] x86/entry: Preserve PKRS MSR across exceptions
Date: Thu, 25 Nov 2021 15:12:47 +0100	[thread overview]
Message-ID: <87r1b4l3xc.ffs@tglx> (raw)
In-Reply-To: <20210804043231.2655537-9-ira.weiny@intel.com>

Ira,

On Tue, Aug 03 2021 at 21:32, ira weiny wrote:
> +/*
> + * __call_ext_ptregs - Helper macro to call into C with extended pt_regs
> + * @cfunc:		C function to be called
> + *
> + * This will ensure that extended_ptregs is added and removed as needed during
> + * a call into C code.
> + */
> +.macro __call_ext_ptregs cfunc annotate_retpoline_safe:req
> +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
> +	/* add space for extended_pt_regs */
> +	subq    $EXTENDED_PT_REGS_SIZE, %rsp
> +#endif
> +	.if \annotate_retpoline_safe == 1
> +		ANNOTATE_RETPOLINE_SAFE
> +	.endif

This annotation is new and nowhere mentioned why it is part of this
patch.

Can you please do _ONE_ functional change per patch and not a
unreviewable pile of changes in one go? Same applies for the ASM and the
C code changes. The ASM change has to go first and then the C code can
build upon it.

> +	call	\cfunc
> +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
> +	/* remove space for extended_pt_regs */
> +	addq    $EXTENDED_PT_REGS_SIZE, %rsp
> +#endif

I really have to ask the question whether this #ifdeffery has any value
at all. 8 bytes extra stack usage is not going to be the end of the
world and distro kernels will enable that config anyway.

If we really want to save the space then certainly not by sprinkling
CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS all over the place and hiding the
extra sized ptregs in the pkrs header.

You are changing generic architecture code so you better think about
making such a change generic and extensible. Can folks please start
thinking beyond the brim of their teacup and not pretend that the
feature they are working on is the unicorn which requires unique magic
brandnamed after the unicorn of the day.

If the next feature comes around which needs to save something in that
extended area then we are going to change the world again, right?
Certainly not.

This wants to go into asm/ptrace.h:

struct pt_regs_aux {
	u32	pkrs;
};

struct pt_regs_extended {
	struct pt_regs_aux	aux;
        struct pt_regs		regs __attribute__((aligned(8)));
};

and then have in asm-offset:

   DEFINE(PT_REGS_AUX_SIZE, sizeof(struct pt_regs_extended) - sizeof(struct pt_regs));

which does the right thing whatever the size of pt_regs_aux is. So for
the above it will have:

 #define PT_REGS_AUX_SIZE 8 /* sizeof(struct pt_regs_extended) - sizeof(struct pt_regs) */

Even, if you do

struct pt_regs_aux {
#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
	u32	pkrs;
#endif        
};

and the config switch is disabled. It's still correct:

 #define PT_REGS_AUX_SIZE 0 /* sizeof(struct pt_regs_extended) - sizeof(struct pt_regs) */

See? No magic hardcoded constant, no build time error checking for that
constant. Nothing, it just works.

That's one part, but let me come back to this:

> +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
> +	/* add space for extended_pt_regs */
> +	subq    $EXTENDED_PT_REGS_SIZE, %rsp

What guarantees that RSP points to pt_regs at this point?  Nothing at
all. It's just pure luck and a question of time until this explodes in
hard to diagnose ways.

Because between

        movq	%rsp, %rdi
and
        call    ....

can legitimately be other code which causes the stack pointer to
change. It's not the case today, but nothing prevents this in the
future.

The correct thing to do is:

        movq	%rsp, %rdi
        RSP_MAKE_PT_REGS_AUX_SPACE
        call	...
        RSP_REMOVE_PT_REGS_AUX_SPACE

The few extra macro lines in the actual code are way better as they make
it completely obvious what's going on and any misuse can be spotted
easily.

> +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
> +/*
> + * PKRS is a per-logical-processor MSR which overlays additional protection for
> + * pages which have been mapped with a protection key.
> + *
> + * Context switches save the MSR in the task struct thus taking that value to
> + * other processors if necessary.
> + *
> + * To protect against exceptions having access to this memory save the current
> + * thread value and set the PKRS value to be used during the exception.
> + */
> +void pkrs_save_irq(struct pt_regs *regs)

That's a misnomer as this is invoked for _any_ exception not just
interrupts.

>  #ifdef CONFIG_XEN_PV
>  #ifndef CONFIG_PREEMPTION
>  /*
> @@ -309,6 +361,8 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)
>  
>  	inhcall = get_and_clear_inhcall();
>  	if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
> +		/* Normally called by irqentry_exit, restore pkrs here */
> +		pkrs_restore_irq(regs);
> 		irqentry_exit_cond_resched();

Sigh. Consistency is overrated....

> +
>  void setup_pks(void);
>  void pkrs_write_current(void);
>  void pks_init_task(struct task_struct *task);
> +void write_pkrs(u32 new_pkrs);

So we have pkrs_write_current() and write_pkrs() now. Can you please
stick to a common prefix, i.e. pkrs_ ?

> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index bf16395b9e13..aa0b1e8dd742 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -6,6 +6,7 @@
>  #include <linux/livepatch.h>
>  #include <linux/audit.h>
>  #include <linux/tick.h>
> +#include <linux/pkeys.h>
>  
>  #include "common.h"
>  
> @@ -364,7 +365,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>  		instrumentation_end();
>  
>  		ret.exit_rcu = true;
> -		return ret;
> +		goto done;
>  	}
>  
>  	/*
> @@ -379,6 +380,8 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>  	trace_hardirqs_off_finish();
>  	instrumentation_end();
>  
> +done:
> +	pkrs_save_irq(regs);

This still calls out into instrumentable code. I explained you before
why this is wrong. Also objtool emits warnings to that effect if you do a
proper verified build.

>  	return ret;
>  }
>  
> @@ -404,7 +407,12 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
>  	/* Check whether this returns to user mode */
>  	if (user_mode(regs)) {
>  		irqentry_exit_to_user_mode(regs);
> -	} else if (!regs_irqs_disabled(regs)) {
> +		return;
> +	}
> +
> +	pkrs_restore_irq(regs);

At least you are now putting it consistently at the wrong place
vs. noinstr.

Though, if you look at the xen_pv_evtchn_do_upcall() part where you
added this extra invocation you might figure out that adding
pkrs_restore_irq() to irqentry_exit_cond_resched() and explicitely to
the 'else' path in irqentry_exit() makes it magically consistent for
both use cases.

Thanks,

        tglx

  parent reply	other threads:[~2021-11-25 14:14 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-04  4:32 [PATCH V7 00/18] PKS/PMEM: Add Stray Write Protection ira.weiny
2021-08-04  4:32 ` [PATCH V7 01/18] x86/pkeys: Create pkeys_common.h ira.weiny
2021-08-04  4:32 ` [PATCH V7 02/18] x86/fpu: Refactor arch_set_user_pkey_access() ira.weiny
2021-11-25 14:23   ` Thomas Gleixner
2021-08-04  4:32 ` [PATCH V7 03/18] x86/pks: Add additional PKEY helper macros ira.weiny
2021-11-25 14:25   ` Thomas Gleixner
2021-11-25 16:58     ` Thomas Gleixner
2021-12-08  0:51     ` Ira Weiny
2021-12-08 15:11       ` Thomas Gleixner
2021-08-04  4:32 ` [PATCH V7 04/18] x86/pks: Add PKS defines and Kconfig options ira.weiny
2021-08-04  4:32 ` [PATCH V7 05/18] x86/pks: Add PKS setup code ira.weiny
2021-11-25 15:15   ` Thomas Gleixner
2021-11-26  3:11     ` taoyi.ty
2021-11-26  9:57       ` Thomas Gleixner
2021-11-26 11:03     ` Thomas Gleixner
2021-08-04  4:32 ` [PATCH V7 06/18] x86/fault: Adjust WARN_ON for PKey fault ira.weiny
2021-08-04  4:32 ` [PATCH V7 07/18] x86/pks: Preserve the PKRS MSR on context switch ira.weiny
2021-11-25 15:25   ` Thomas Gleixner
2021-08-04  4:32 ` [PATCH V7 08/18] x86/entry: Preserve PKRS MSR across exceptions ira.weiny
2021-11-13  0:50   ` Ira Weiny
2021-11-25 11:19     ` Thomas Gleixner
2021-12-03  1:13     ` Andy Lutomirski
2021-11-25 14:12   ` Thomas Gleixner [this message]
2021-12-07  1:54     ` Ira Weiny
2021-12-07  4:45       ` Ira Weiny
2021-12-08  0:21       ` Thomas Gleixner
2021-08-04  4:32 ` [PATCH V7 09/18] x86/pks: Add PKS kernel API ira.weiny
2021-08-04  4:32 ` [PATCH V7 10/18] x86/pks: Introduce pks_abandon_protections() ira.weiny
2021-08-04  4:32 ` [PATCH V7 11/18] x86/pks: Add PKS Test code ira.weiny
2021-08-04  4:32 ` [PATCH V7 12/18] x86/pks: Add PKS fault callbacks ira.weiny
2021-08-11 21:18   ` Edgecombe, Rick P
2021-08-17  3:21     ` Ira Weiny
2021-08-04  4:32 ` [PATCH V7 13/18] memremap_pages: Add access protection via supervisor Protection Keys (PKS) ira.weiny
2021-08-04  4:32 ` [PATCH V7 14/18] memremap_pages: Add memremap.pks_fault_mode ira.weiny
2021-08-04  4:57   ` Randy Dunlap
2021-08-07 19:32     ` Ira Weiny
2021-08-11 19:01   ` Edgecombe, Rick P
2021-08-17  3:12     ` Ira Weiny
2021-08-04  4:32 ` [PATCH V7 15/18] kmap: Add stray access protection for devmap pages ira.weiny
2021-08-04  4:32 ` [PATCH V7 16/18] dax: Stray access protection for dax_direct_access() ira.weiny
2021-08-04  4:32 ` [PATCH V7 17/18] nvdimm/pmem: Enable stray access protection ira.weiny
2021-08-04  4:32 ` [PATCH V7 18/18] devdax: " ira.weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r1b4l3xc.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).