All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Liu <wei.liu@kernel.org>
To: mhklinux@outlook.com
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	kirill.shutemov@linux.intel.com, haiyangz@microsoft.com,
	wei.liu@kernel.org, decui@microsoft.com, luto@kernel.org,
	peterz@infradead.org, akpm@linux-foundation.org,
	urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
	thomas.lendacky@amd.com, ardb@kernel.org, jroedel@suse.de,
	seanjc@google.com, rick.p.edgecombe@intel.com,
	sathyanarayanan.kuppuswamy@linux.intel.com,
	linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	linux-hyperv@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4 0/3] x86/hyperv: Mark CoCo VM pages not present when changing encrypted state
Date: Fri, 1 Mar 2024 09:26:18 +0000	[thread overview]
Message-ID: <ZeGfOiizl5o-PNLZ@liuwe-devbox-debian-v2> (raw)
In-Reply-To: <20240116022008.1023398-1-mhklinux@outlook.com>

On Mon, Jan 15, 2024 at 06:20:05PM -0800, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> In a CoCo VM, when transitioning memory from encrypted to decrypted, or
> vice versa, the caller of set_memory_encrypted() or set_memory_decrypted()
> is responsible for ensuring the memory isn't in use and isn't referenced
> while the transition is in progress.  The transition has multiple steps,
> and the memory is in an inconsistent state until all steps are complete.
> A reference while the state is inconsistent could result in an exception
> that can't be cleanly fixed up.
> 
> However, the kernel load_unaligned_zeropad() mechanism could cause a stray
> reference that can't be prevented by the caller of set_memory_encrypted()
> or set_memory_decrypted(), so there's specific code to handle this case.
> But a CoCo VM running on Hyper-V may be configured to run with a paravisor,
> with the #VC or #VE exception routed to the paravisor. There's no
> architectural way to forward the exceptions back to the guest kernel, and
> in such a case, the load_unaligned_zeropad() specific code doesn't work.
> 
> To avoid this problem, mark pages as "not present" while a transition
> is in progress. If load_unaligned_zeropad() causes a stray reference, a
> normal page fault is generated instead of #VC or #VE, and the
> page-fault-based fixup handlers for load_unaligned_zeropad() resolve the
> reference. When the encrypted/decrypted transition is complete, mark the
> pages as "present" again.
> 
> This version of the patch series marks transitioning pages "not present"
> only when running as a Hyper-V guest with a paravisor. Previous
> versions[1] marked transitioning pages "not present" regardless of the
> hypervisor and regardless of whether a paravisor is in use.  That more
> general use had the benefit of decoupling the load_unaligned_zeropad()
> fixup from CoCo VM #VE and #VC exception handling.  But the implementation
> was problematic for SEV-SNP because the SEV-SNP hypervisor callbacks
> require a valid virtual address, not a physical address like with TDX and
> the Hyper-V paravisor.  Marking the transitioning pages "not present"
> causes the virtual address to not be valid, and the PVALIDATE
> instruction in the SEV-SNP callback fails. Constructing a temporary
> virtual address for this purpose is slower and adds complexity that
> negates the benefits of the more general use. So this version narrows
> the applicability of the approach to just where it is required
> because of the #VC and #VE exceptions being routed to a paravisor.
> 
> The previous version minimized the TLB flushing done during page
> transitions between encrypted and decrypted. Because this version
> marks the pages "not present" in hypervisor specific callbacks and
> not in __set_memory_enc_pgtable(), doing such optimization is more
> difficult to coordinate. But the page transitions are not a hot path,
> so this version eschews optimization of TLB flushing in favor of
> simplicity.
> 
> Since this version no longer touches __set_memory_enc_pgtable(),
> I've also removed patches that add comments about error handling
> in that function.  Rick Edgecombe has proposed patches to improve
> that error handling, and I'll leave those comments to Rick's
> patches.
> 
> Patch 1 handles implications of the hypervisor callbacks needing
> to do virt-to-phys translations on pages that are temporarily
> marked not present.
> 
> Patch 2 makes the existing set_memory_p() function available for
> use in the hypervisor callbacks.
> 
> Patch 3 is the core change that marks the transitioning pages
> as not present.
> 
> This patch set is based on the linux-next20240103 code tree.
> 
> Changes in v4:
> * Patch 1: Updated comment in slow_virt_to_phys() to reduce the
>   likelihood of the comment becoming stale.  The new comment
>   describes the requirement to work with leaf PTE not present,
>   but doesn't directly reference the CoCo hypervisor callbacks.
>   [Rick Edgecombe]
> * Patch 1: Decomposed a complex line-wrapped statement into
>   multiple statements for ease of understanding. No functional
>   change compared with v3. [Kirill Shutemov]
> * Patch 3: Fixed handling of memory allocation errors. [Rick
>   Edgecombe]
> 
> Changes in v3:
> * Major rework and simplification per discussion above.
> 
> Changes in v2:
> * Added Patches 3 and 4 to deal with the failure on SEV-SNP
>   [Tom Lendacky]
> * Split the main change into two separate patches (Patch 5 and
>   Patch 6) to improve reviewability and to offer the option of
>   retaining both hypervisor callbacks.
> * Patch 5 moves set_memory_p() out of an #ifdef CONFIG_X86_64
>   so that the code builds correctly for 32-bit, even though it
>   is never executed for 32-bit [reported by kernel test robot]
> 
> [1] https://lore.kernel.org/lkml/20231121212016.1154303-1-mhklinux@outlook.com/
> 
> Michael Kelley (3):
>   x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor
>     callback
>   x86/mm: Regularize set_memory_p() parameters and make non-static
>   x86/hyperv: Make encrypted/decrypted changes safe for
>     load_unaligned_zeropad()

Applied. Thanks.

The changes to mm code are mostly cosmetic. The changes had been pending
for a while without any objections from the maintainers, so I picked
them up as well.

If this becomes problematic, please let me know.

> 
>  arch/x86/hyperv/ivm.c             | 65 ++++++++++++++++++++++++++++---
>  arch/x86/include/asm/set_memory.h |  1 +
>  arch/x86/mm/pat/set_memory.c      | 24 +++++++-----
>  3 files changed, 75 insertions(+), 15 deletions(-)
> 
> -- 
> 2.25.1
> 

      parent reply	other threads:[~2024-03-01  9:26 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-16  2:20 [PATCH v4 0/3] x86/hyperv: Mark CoCo VM pages not present when changing encrypted state mhkelley58
2024-01-16  2:20 ` [PATCH v4 1/3] x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback mhkelley58
2024-01-16 14:58   ` Edgecombe, Rick P
2024-01-16 17:02   ` rick.p.edgecombe
2024-01-17 13:37   ` kirill.shutemov
2024-01-16  2:20 ` [PATCH v4 2/3] x86/mm: Regularize set_memory_p() parameters and make non-static mhkelley58
2024-01-16  2:20 ` [PATCH v4 3/3] x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad() mhkelley58
2024-01-16 14:59 ` [PATCH v4 0/3] x86/hyperv: Mark CoCo VM pages not present when changing encrypted state Edgecombe, Rick P
2024-01-16 17:10 ` Rick Edgecombe
2024-02-09 15:51 ` Michael Kelley
2024-03-01  9:26 ` Wei Liu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZeGfOiizl5o-PNLZ@liuwe-devbox-debian-v2 \
    --to=wei.liu@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jroedel@suse.de \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=luto@kernel.org \
    --cc=mhklinux@outlook.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=urezki@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.