From: Brendan Jackman <jackmanb@google.com>
To: linux-kernel@vger.kernel.org
Cc: linux-tip-commits@vger.kernel.org,
	Lai Jiangshan <laijs@linux.alibaba.com>,
	 Thomas Gleixner <tglx@linutronix.de>,
	x86@kernel.org, Kevin Cheng <chengkev@google.com>,
	 Yosry Ahmed <yosryahmed@google.com>
Subject: Re: [tip: x86/entry] x86/entry: Avoid redundant CR3 write on paranoid returns
Date: Mon, 19 Feb 2024 11:49:46 +0100
Message-ID: <CA+i-1C1OpZQTS3EQa8fEc5BTzcLNMcgrwt0b9mR_jqiY0-zV3A@mail.gmail.com>
In-Reply-To: <170612139384.398.13715690088153668463.tip-bot2@tip-bot2>

[Apologies if you see this as a duplicate, accidentally sent the
original in HTML, please disregard the other one]

Hi Thomas,

I have just noticed that the commit has disappeared from
tip/x86/entry. Is that deliberate?

Thanks,
Brendan


On Wed, 24 Jan 2024 at 19:36, tip-bot2 for Lai Jiangshan
<tip-bot2@linutronix.de> wrote:
>
> The following commit has been merged into the x86/entry branch of tip:
>
> Commit-ID:     bb998361999e79bc87dae1ebe0f5bf317f632585
> Gitweb:        https://git.kernel.org/tip/bb998361999e79bc87dae1ebe0f5bf317f632585
> Author:        Lai Jiangshan <laijs@linux.alibaba.com>
> AuthorDate:    Mon, 08 Jan 2024 11:39:50
> Committer:     Thomas Gleixner <tglx@linutronix.de>
> CommitterDate: Wed, 24 Jan 2024 13:57:59 +01:00
>
> x86/entry: Avoid redundant CR3 write on paranoid returns
>
> The CR3 restore happens in:
>
>   1. #NMI return.
>   2. paranoid_exit() (i.e. #MCE, #VC, #DB and #DF return)
>
> Contrary to the implication in commit 21e94459110252 ("x86/mm: Optimize
> RESTORE_CR3"), the kernel never modifies CR3 in any of these exceptions,
> except for switching from user to kernel pagetables under PTI. That
> means that most of the time when returning from an exception that
> interrupted the kernel no CR3 restore is necessary. Writing CR3 is
> expensive on some machines.
>
> "Most of the time" because the interrupt might have arrived during kernel
> entry before the user-to-kernel CR3 switch, or during exit after the
> kernel-to-user switch. In the former case skipping the restore would be
> correct, but definitely not in the latter.
>
> So check the saved CR3 value and restore it only if it is a user CR3.
>
> Give the macro a new name to clarify its usage, and remove a comment that
> was describing the original behaviour along with the no longer needed jump
> label.
>
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lore.kernel.org/r/20240108113950.360438-1-jackmanb@google.com
>
> [Rewrote commit message; responded to review comments]
> Change-Id: I6e56978c4753fb943a7897ff101f519514fa0827
> ---
>  arch/x86/entry/calling.h  | 26 ++++++++++----------------
>  arch/x86/entry/entry_64.S |  7 +++----
>  2 files changed, 13 insertions(+), 20 deletions(-)
>
> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> index 9f1d947..92dca4a 100644
> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -239,17 +239,19 @@ For 32-bit we have the following conventions - kernel is built with
>  .Ldone_\@:
>  .endm
>
> -.macro RESTORE_CR3 scratch_reg:req save_reg:req
> +/* Restore CR3 from a kernel context. May restore a user CR3 value. */
> +.macro PARANOID_RESTORE_CR3 scratch_reg:req save_reg:req
>         ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
>
> -       ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
> -
>         /*
> -        * KERNEL pages can always resume with NOFLUSH as we do
> -        * explicit flushes.
> +        * If CR3 contained the kernel page tables at the paranoid exception
> +        * entry, then there is nothing to restore as CR3 is not modified while
> +        * handling the exception.
>          */
>         bt      $PTI_USER_PGTABLE_BIT, \save_reg
> -       jnc     .Lnoflush_\@
> +       jnc     .Lend_\@
> +
> +       ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
>
>         /*
>          * Check if there's a pending flush for the user ASID we're
> @@ -257,20 +259,12 @@ For 32-bit we have the following conventions - kernel is built with
>          */
>         movq    \save_reg, \scratch_reg
>         andq    $(0x7FF), \scratch_reg
> -       bt      \scratch_reg, THIS_CPU_user_pcid_flush_mask
> -       jnc     .Lnoflush_\@
> -
>         btr     \scratch_reg, THIS_CPU_user_pcid_flush_mask
> -       jmp     .Lwrcr3_\@
> +       jc      .Lwrcr3_\@
>
> -.Lnoflush_\@:
>         SET_NOFLUSH_BIT \save_reg
>
>  .Lwrcr3_\@:
> -       /*
> -        * The CR3 write could be avoided when not changing its value,
> -        * but would require a CR3 read *and* a scratch register.
> -        */
>         movq    \save_reg, %cr3
>  .Lend_\@:
>  .endm
> @@ -285,7 +279,7 @@ For 32-bit we have the following conventions - kernel is built with
>  .endm
>  .macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
>  .endm
> -.macro RESTORE_CR3 scratch_reg:req save_reg:req
> +.macro PARANOID_RESTORE_CR3 scratch_reg:req save_reg:req
>  .endm
>
>  #endif
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index c40f89a..aedd169 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -968,14 +968,14 @@ SYM_CODE_START_LOCAL(paranoid_exit)
>         IBRS_EXIT save_reg=%r15
>
>         /*
> -        * The order of operations is important. RESTORE_CR3 requires
> +        * The order of operations is important. PARANOID_RESTORE_CR3 requires
>          * kernel GSBASE.
>          *
>          * NB to anyone to try to optimize this code: this code does
>          * not execute at all for exceptions from user mode. Those
>          * exceptions go through error_return instead.
>          */
> -       RESTORE_CR3     scratch_reg=%rax save_reg=%r14
> +       PARANOID_RESTORE_CR3 scratch_reg=%rax save_reg=%r14
>
>         /* Handle the three GSBASE cases */
>         ALTERNATIVE "jmp .Lparanoid_exit_checkgs", "", X86_FEATURE_FSGSBASE
> @@ -1404,8 +1404,7 @@ end_repeat_nmi:
>         /* Always restore stashed SPEC_CTRL value (see paranoid_entry) */
>         IBRS_EXIT save_reg=%r15
>
> -       /* Always restore stashed CR3 value (see paranoid_entry) */
> -       RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
> +       PARANOID_RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
>
>         /*
>          * The above invocation of paranoid_entry stored the GSBASE
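
For readers following the control flow of the new macro, here is a rough
userspace model of PARANOID_RESTORE_CR3 in C. It is only a sketch and not
kernel code: struct cpu_model, its fields and the function name are
hypothetical, the two ALTERNATIVE patch sites are modelled as plain
booleans, and the per-CPU user_pcid_flush_mask is modelled as a bool
array. The bit positions (12 for PTI_USER_PGTABLE_BIT, 63 for the NOFLUSH
bit) are assumptions based on the usual x86-64 definitions; only the
0x7FF ASID mask appears in the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; the patch only shows the 0x7FF mask explicitly. */
#define PTI_USER_PGTABLE_BIT 12      /* set in CR3 when user page tables are active */
#define CR3_NOFLUSH_BIT      63      /* PCID "do not flush the TLB" bit */
#define ASID_MASK            0x7FFull

/* Hypothetical stand-in for the per-CPU state the macro touches. */
struct cpu_model {
    bool pti;                        /* models X86_FEATURE_PTI */
    bool pcid;                       /* models X86_FEATURE_PCID */
    uint64_t cr3;                    /* current CR3 value */
    bool flush_pending[0x800];       /* models THIS_CPU_user_pcid_flush_mask */
    unsigned cr3_writes;             /* counts modelled "movq ..., %cr3" */
};

static void paranoid_restore_cr3(struct cpu_model *cpu, uint64_t saved_cr3)
{
    /* ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI */
    if (!cpu->pti)
        return;

    /* bt $PTI_USER_PGTABLE_BIT; jnc .Lend: kernel CR3 was saved, nothing to do */
    if (!(saved_cr3 & (1ull << PTI_USER_PGTABLE_BIT)))
        return;

    /* ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID */
    if (cpu->pcid) {
        unsigned asid = (unsigned)(saved_cr3 & ASID_MASK);

        /* btr: test the pending-flush bit and clear it in one step */
        bool pending = cpu->flush_pending[asid];
        cpu->flush_pending[asid] = false;

        /* jc .Lwrcr3: a pending flush means writing CR3 without NOFLUSH */
        if (!pending)
            saved_cr3 |= 1ull << CR3_NOFLUSH_BIT;
    }

    /* .Lwrcr3: movq \save_reg, %cr3 */
    cpu->cr3 = saved_cr3;
    cpu->cr3_writes++;
}
```

In this model a saved kernel CR3 leaves cr3_writes untouched, which is the
write the patch avoids. A saved user CR3 is always written back, with the
NOFLUSH bit set unless a flush was pending for that ASID.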

Thread overview: 6+ messages
2024-01-08 11:39 [PATCH v3 RESEND] x86/entry: Avoid redundant CR3 write on paranoid returns Brendan Jackman
2024-01-23  5:33 ` Yosry Ahmed
2024-01-24 18:36 ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
2024-02-19 10:49   ` Brendan Jackman [this message]
2024-02-19 14:42     ` Borislav Petkov
2024-02-19 14:51       ` Brendan Jackman
