linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
@ 2023-06-20 15:48 Dexuan Cui
  2023-06-20 15:48 ` [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
  2023-06-20 15:48 ` [PATCH v8 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
  0 siblings, 2 replies; 9+ messages in thread
From: Dexuan Cui @ 2023-06-20 15:48 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	x86, mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui

v8 is a rebased version of v7:
  v8 is based on tip.git's master branch, which has commit
      75d090fd167a ("x86/tdx: Add unaccepted memory support"), so I have
      to rebase v7 to the commit and post v8.

  v7 is based on tip.git's x86/tdx branch.

The two patches (which are based on the latest master branch of the tip
tree) are the x86/tdx part of the v6 patchset:
https://lwn.net/ml/linux-kernel/20230504225351.10765-1-decui@microsoft.com/

The other patches of the v6 patchset needs more changes in preparation for
the upcoming paravisor support, so let me post the x86/tdx part first.

This v8 patchset addressed Dave's comments on patch 1:
see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com/

Patch 2 is just a repost. There was a race between set_memory_encrypted()
and load_unaligned_zeropad(), which has been fixed by the 3 patches of
Kirill in the tip tree:
  3f6819dd192e ("x86/mm: Allow guest.enc_status_change_prepare() to fail")
  195edce08b63 ("x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()")
  94142c9d1bdf ("x86/mm: Fix enc_status_change_finish_noop()")
  (see https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/tdx)

If you want to view the patchset on github, it is here:
https://github.com/dcui/tdx/commits/decui/upstream-tip/master/tdx/v8-x86-tdx-only

Dexuan Cui (2):
  x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  x86/tdx: Support vmalloc() for tdx_enc_status_changed()

 arch/x86/coco/tdx/tdx.c           | 86 ++++++++++++++++++++++++++-----
 arch/x86/include/asm/shared/tdx.h |  2 +
 2 files changed, 76 insertions(+), 12 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-20 15:48 [PATCH v8 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
@ 2023-06-20 15:48 ` Dexuan Cui
  2023-06-20 18:30   ` Sathyanarayanan Kuppuswamy
  2023-06-20 23:34   ` Dave Hansen
  2023-06-20 15:48 ` [PATCH v8 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
  1 sibling, 2 replies; 9+ messages in thread
From: Dexuan Cui @ 2023-06-20 15:48 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	x86, mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui

GHCI spec for TDX 1.0 says that the MapGPA call may fail with the R10
error code = TDG.VP.VMCALL_RETRY (1), and the guest must retry this
operation for the pages in the region starting at the GPA specified
in R11.

When a fully enlightened TDX guest runs on Hyper-V, Hyper-V can return
the retry error when set_memory_decrypted() is called to decrypt up to
1GB of swiotlb bounce buffers.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---

 arch/x86/coco/tdx/tdx.c           | 63 +++++++++++++++++++++++++------
 arch/x86/include/asm/shared/tdx.h |  2 +
 2 files changed, 53 insertions(+), 12 deletions(-)

Changes in v2:
  Used __tdx_hypercall() directly in tdx_map_gpa().
  Added a max_retry_cnt of 1000.
  Renamed a few variables, e.g., r11 -> map_fail_paddr.

Changes in v3:
  Changed max_retry_cnt from 1000 to 3.

Changes in v4:
  __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT) -> __tdx_hypercall_ret()
  Added Kirill's Acked-by.

Changes in v5:
  Added Michael's Reviewed-by.

Changes in v6: None.

Changes in v7:
  Addressed Dave's comments:
  see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com

Changes in v8:
  Rebased to tip.git's master branch.

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..0c198ab73aa7 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -703,14 +703,16 @@ static bool tdx_cache_flush_required(void)
 }
 
 /*
- * Inform the VMM of the guest's intent for this physical page: shared with
- * the VMM or private to the guest.  The VMM is expected to change its mapping
- * of the page in response.
+ * Notify the VMM about page mapping conversion. More info about ABI
+ * can be found in TDX Guest-Host-Communication Interface (GHCI),
+ * section "TDG.VP.VMCALL<MapGPA>".
  */
-static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 {
-	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+	const int max_retries_per_page = 3;
+	struct tdx_hypercall_args args;
+	u64 map_fail_paddr, ret;
+	int retry_count = 0;
 
 	if (!enc) {
 		/* Set the shared (decrypted) bits: */
@@ -718,12 +720,49 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 		end   |= cc_mkdec(0);
 	}
 
-	/*
-	 * Notify the VMM about page mapping conversion. More info about ABI
-	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
-	 * section "TDG.VP.VMCALL<MapGPA>"
-	 */
-	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
+	while (retry_count < max_retries_per_page) {
+		memset(&args, 0, sizeof(args));
+		args.r10 = TDX_HYPERCALL_STANDARD;
+		args.r11 = TDVMCALL_MAP_GPA;
+		args.r12 = start;
+		args.r13 = end - start;
+
+		ret = __tdx_hypercall_ret(&args);
+		if (ret != TDVMCALL_STATUS_RETRY)
+			return !ret;
+		/*
+		 * The guest must retry the operation for the pages in the
+		 * region starting at the GPA specified in R11. R11 comes
+		 * from the untrusted VMM. Sanity check it.
+		 */
+		map_fail_paddr = args.r11;
+		if (map_fail_paddr < start || map_fail_paddr >= end)
+			return false;
+
+		/* "Consume" a retry without forward progress */
+		if (map_fail_paddr == start) {
+			retry_count++;
+			continue;
+		}
+
+		start = map_fail_paddr;
+		retry_count = 0;
+	}
+
+	return false;
+}
+
+/*
+ * Inform the VMM of the guest's intent for this physical page: shared with
+ * the VMM or private to the guest.  The VMM is expected to change its mapping
+ * of the page in response.
+ */
+static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+{
+	phys_addr_t start = __pa(vaddr);
+	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+
+	if (!tdx_map_gpa(start, end, enc))
 		return false;
 
 	/* shared->private conversion requires memory to be accepted before use */
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 90ea813c4b99..9db89a99ae5b 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -24,6 +24,8 @@
 #define TDVMCALL_MAP_GPA		0x10001
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
 
+#define TDVMCALL_STATUS_RETRY		1
+
 #ifndef __ASSEMBLY__
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v8 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
  2023-06-20 15:48 [PATCH v8 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
  2023-06-20 15:48 ` [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
@ 2023-06-20 15:48 ` Dexuan Cui
  2023-06-20 18:34   ` Sathyanarayanan Kuppuswamy
  1 sibling, 1 reply; 9+ messages in thread
From: Dexuan Cui @ 2023-06-20 15:48 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	x86, mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui

When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
allocates buffers using vzalloc(), and needs to share the buffers with the
host OS by calling set_memory_decrypted(), which is not working for
vmalloc() yet. Add the support by handling the pages one by one.

Co-developed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/coco/tdx/tdx.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

Changes in v2:
  Changed tdx_enc_status_changed() in place.

Changes in v3:
  No change since v2.

Changes in v4:
  Added Kirill's Co-developed-by since Kirill helped to improve the
    code by adding tdx_enc_status_changed_phys().

  Thanks Kirill for the clarification on load_unaligned_zeropad()!

Changes in v5:
  Added Kirill's Signed-off-by.
  Added Michael's Reviewed-by.

Changes in v6: None.

Changes in v7: None.
  Note: there was a race between set_memory_encrypted() and
  load_unaligned_zeropad(), which has been fixed by the 3 patches of
  Kirill in the x86/tdx branch of the tip tree.

Changes in v8:
  Rebased to tip.git's master branch.

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 0c198ab73aa7..a313d5ab42f1 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -8,6 +8,7 @@
 #include <linux/export.h>
 #include <linux/io.h>
+#include <linux/mm.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
 #include <asm/insn.h>
@@ -752,6 +753,19 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 	return false;
 }
 
+static bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end,
+					bool enc)
+{
+	if (!tdx_map_gpa(start, end, enc))
+		return false;
+
+	/* shared->private conversion requires memory to be accepted before use */
+	if (enc)
+		return tdx_accept_memory(start, end);
+
+	return true;
+}
+
 /*
  * Inform the VMM of the guest's intent for this physical page: shared with
  * the VMM or private to the guest.  The VMM is expected to change its mapping
@@ -759,15 +773,24 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
  */
 static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
-	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+	unsigned long start = vaddr;
+	unsigned long end = start + numpages * PAGE_SIZE;
 
-	if (!tdx_map_gpa(start, end, enc))
+	if (offset_in_page(start) != 0)
 		return false;
 
-	/* shared->private conversion requires memory to be accepted before use */
-	if (enc)
-		return tdx_accept_memory(start, end);
+	if (!is_vmalloc_addr((void *)start))
+		return tdx_enc_status_changed_phys(__pa(start), __pa(end), enc);
+
+	while (start < end) {
+		phys_addr_t start_pa = slow_virt_to_phys((void *)start);
+		phys_addr_t end_pa = start_pa + PAGE_SIZE;
+
+		if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
+			return false;
+
+		start += PAGE_SIZE;
+	}
 
 	return true;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-20 15:48 ` [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
@ 2023-06-20 18:30   ` Sathyanarayanan Kuppuswamy
  2023-06-20 19:23     ` Dexuan Cui
  2023-06-20 23:34   ` Dave Hansen
  1 sibling, 1 reply; 9+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-20 18:30 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-arch, linux-hyperv, luto, mingo,
	peterz, rostedt, seanjc, tglx, tony.luck, wei.liu, x86, mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe

Hi,

On 6/20/23 8:48 AM, Dexuan Cui wrote:
> GHCI spec for TDX 1.0 says that the MapGPA call may fail with the R10
> error code = TDG.VP.VMCALL_RETRY (1), and the guest must retry this
> operation for the pages in the region starting at the GPA specified
> in R11.
> 
> When a fully enlightened TDX guest runs on Hyper-V, Hyper-V can return
> the retry error when set_memory_decrypted() is called to decrypt up to
> 1GB of swiotlb bounce buffers.
> 
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
> 
>  arch/x86/coco/tdx/tdx.c           | 63 +++++++++++++++++++++++++------
>  arch/x86/include/asm/shared/tdx.h |  2 +
>  2 files changed, 53 insertions(+), 12 deletions(-)
> 
> Changes in v2:
>   Used __tdx_hypercall() directly in tdx_map_gpa().
>   Added a max_retry_cnt of 1000.
>   Renamed a few variables, e.g., r11 -> map_fail_paddr.
> 
> Changes in v3:
>   Changed max_retry_cnt from 1000 to 3.
> 
> Changes in v4:
>   __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT) -> __tdx_hypercall_ret()
>   Added Kirill's Acked-by.
> 
> Changes in v5:
>   Added Michael's Reviewed-by.
> 
> Changes in v6: None.
> 
> Changes in v7:
>   Addressed Dave's comments:
>   see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com
> 
> Changes in v8:
>   Rebased to tip.git's master branch.
> 
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 1d6b863c42b0..0c198ab73aa7 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -703,14 +703,16 @@ static bool tdx_cache_flush_required(void)
>  }
>  
>  /*
> - * Inform the VMM of the guest's intent for this physical page: shared with
> - * the VMM or private to the guest.  The VMM is expected to change its mapping
> - * of the page in response.
> + * Notify the VMM about page mapping conversion. More info about ABI
> + * can be found in TDX Guest-Host-Communication Interface (GHCI),
> + * section "TDG.VP.VMCALL<MapGPA>".
>   */
> -static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
> +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
>  {
> -	phys_addr_t start = __pa(vaddr);
> -	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> +	const int max_retries_per_page = 3;

Add some details about why you chose 3? Maybe you can also use macro for it.

> +	struct tdx_hypercall_args args;
> +	u64 map_fail_paddr, ret;
> +	int retry_count = 0;
>  
>  	if (!enc) {
>  		/* Set the shared (decrypted) bits: */
> @@ -718,12 +720,49 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
>  		end   |= cc_mkdec(0);
>  	}
>  
> -	/*
> -	 * Notify the VMM about page mapping conversion. More info about ABI
> -	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
> -	 * section "TDG.VP.VMCALL<MapGPA>"
> -	 */
> -	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
> +	while (retry_count < max_retries_per_page) {
> +		memset(&args, 0, sizeof(args));
> +		args.r10 = TDX_HYPERCALL_STANDARD;
> +		args.r11 = TDVMCALL_MAP_GPA;
> +		args.r12 = start;
> +		args.r13 = end - start;
> +
> +		ret = __tdx_hypercall_ret(&args);
> +		if (ret != TDVMCALL_STATUS_RETRY)
> +			return !ret;
> +		/*
> +		 * The guest must retry the operation for the pages in the
> +		 * region starting at the GPA specified in R11. R11 comes
> +		 * from the untrusted VMM. Sanity check it.
> +		 */
> +		map_fail_paddr = args.r11;

Do you really need map_fail_paddr? Why not directly use args.r11?

> +		if (map_fail_paddr < start || map_fail_paddr >= end)
> +			return false;
> +
> +		/* "Consume" a retry without forward progress */
> +		if (map_fail_paddr == start) {
> +			retry_count++;
> +			continue;
> +		}
> +
> +		start = map_fail_paddr;
> +		retry_count = 0;
> +	}
> +
> +	return false;
> +}
> +
> +/*
> + * Inform the VMM of the guest's intent for this physical page: shared with
> + * the VMM or private to the guest.  The VMM is expected to change its mapping
> + * of the page in response.
> + */
> +static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
> +{
> +	phys_addr_t start = __pa(vaddr);
> +	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> +
> +	if (!tdx_map_gpa(start, end, enc))
>  		return false;
>  
>  	/* shared->private conversion requires memory to be accepted before use */
> diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
> index 90ea813c4b99..9db89a99ae5b 100644
> --- a/arch/x86/include/asm/shared/tdx.h
> +++ b/arch/x86/include/asm/shared/tdx.h
> @@ -24,6 +24,8 @@
>  #define TDVMCALL_MAP_GPA		0x10001
>  #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
>  
> +#define TDVMCALL_STATUS_RETRY		1
> +
>  #ifndef __ASSEMBLY__
>  
>  /*

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
  2023-06-20 15:48 ` [PATCH v8 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
@ 2023-06-20 18:34   ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 9+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-20 18:34 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-arch, linux-hyperv, luto, mingo,
	peterz, rostedt, seanjc, tglx, tony.luck, wei.liu, x86, mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe

Hi,

On 6/20/23 8:48 AM, Dexuan Cui wrote:
> When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
> allocates buffers using vzalloc(), and needs to share the buffers with the
> host OS by calling set_memory_decrypted(), which is not working for
> vmalloc() yet. Add the support by handling the pages one by one.
> 
> Co-developed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---

Looks good to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

>  arch/x86/coco/tdx/tdx.c | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)
> 
> Changes in v2:
>   Changed tdx_enc_status_changed() in place.
> 
> Changes in v3:
>   No change since v2.
> 
> Changes in v4:
>   Added Kirill's Co-developed-by since Kirill helped to improve the
>     code by adding tdx_enc_status_changed_phys().
> 
>   Thanks Kirill for the clarification on load_unaligned_zeropad()!
> 
> Changes in v5:
>   Added Kirill's Signed-off-by.
>   Added Michael's Reviewed-by.
> 
> Changes in v6: None.
> 
> Changes in v7: None.
>   Note: there was a race between set_memory_encrypted() and
>   load_unaligned_zeropad(), which has been fixed by the 3 patches of
>   Kirill in the x86/tdx branch of the tip tree.
> 
> Changes in v8:
>   Rebased to tip.git's master branch.
> 
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 0c198ab73aa7..a313d5ab42f1 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -8,6 +8,7 @@
>  #include <linux/export.h>
>  #include <linux/io.h>
> +#include <linux/mm.h>
>  #include <asm/coco.h>
>  #include <asm/tdx.h>
>  #include <asm/vmx.h>
>  #include <asm/insn.h>
> @@ -752,6 +753,19 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
>  	return false;
>  }
>  
> +static bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end,
> +					bool enc)
> +{
> +	if (!tdx_map_gpa(start, end, enc))
> +		return false;
> +
> +	/* shared->private conversion requires memory to be accepted before use */
> +	if (enc)
> +		return tdx_accept_memory(start, end);
> +
> +	return true;
> +}
> +
>  /*
>   * Inform the VMM of the guest's intent for this physical page: shared with
>   * the VMM or private to the guest.  The VMM is expected to change its mapping
> @@ -759,15 +773,24 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
>   */
>  static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
>  {
> -	phys_addr_t start = __pa(vaddr);
> -	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> +	unsigned long start = vaddr;
> +	unsigned long end = start + numpages * PAGE_SIZE;
>  
> -	if (!tdx_map_gpa(start, end, enc))
> +	if (offset_in_page(start) != 0)
>  		return false;
>  
> -	/* shared->private conversion requires memory to be accepted before use */
> -	if (enc)
> -		return tdx_accept_memory(start, end);
> +	if (!is_vmalloc_addr((void *)start))
> +		return tdx_enc_status_changed_phys(__pa(start), __pa(end), enc);
> +
> +	while (start < end) {
> +		phys_addr_t start_pa = slow_virt_to_phys((void *)start);
> +		phys_addr_t end_pa = start_pa + PAGE_SIZE;
> +
> +		if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
> +			return false;
> +
> +		start += PAGE_SIZE;
> +	}
>  
>  	return true;
>  }

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-20 18:30   ` Sathyanarayanan Kuppuswamy
@ 2023-06-20 19:23     ` Dexuan Cui
  2023-06-20 19:44       ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 9+ messages in thread
From: Dexuan Cui @ 2023-06-20 19:23 UTC (permalink / raw)
  To: Sathyanarayanan Kuppuswamy, ak, arnd, bp, brijesh.singh,
	dan.j.williams, dave.hansen, dave.hansen, Haiyang Zhang, hpa,
	jane.chu, kirill.shutemov, KY Srinivasan, linux-arch,
	linux-hyperv, luto, mingo, peterz, rostedt, seanjc, tglx,
	tony.luck, wei.liu, x86, Michael Kelley (LINUX)
  Cc: linux-kernel, Tianyu Lan, rick.p.edgecombe

> From: Sathyanarayanan Kuppuswamy
> Sent: Tuesday, June 20, 2023 11:31 AM
> > ...
> > -static bool tdx_enc_status_changed(unsigned long vaddr, int numpages,
> bool enc)
> > +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
> >  {
> > -	phys_addr_t start = __pa(vaddr);
> > -	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> > +	const int max_retries_per_page = 3;
> 
> Add some details about why you chose 3? Maybe you can also use macro for it.

It's a small number recommended by Kirill:
https://lwn.net/ml/linux-kernel/20221208194800.n27ak4xj6pmyny46@box.shutemov.name/

The spec doesn't define a max retry count. Normally I guess a max retry count
of 2 should be enough, at least for Hyper-V according to my testing.

Maybe we can add a comment like this:

/* Retrying the hypercall a second time should succeed; use 3 just in case. */

Does this look good to all?

> > +	struct tdx_hypercall_args args;
> > +	u64 map_fail_paddr, ret;
> > +	int retry_count = 0;
> >
> >  	if (!enc) {
> >  		/* Set the shared (decrypted) bits: */
> > @@ -718,12 +720,49 @@ static bool tdx_enc_status_changed(unsigned long
> vaddr, int numpages, bool enc)
> >  		end   |= cc_mkdec(0);
> >  	}
> >
> > -	/*
> > -	 * Notify the VMM about page mapping conversion. More info about ABI
> > -	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
> > -	 * section "TDG.VP.VMCALL<MapGPA>"
> > -	 */
> > -	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
> > +	while (retry_count < max_retries_per_page) {
> > +		memset(&args, 0, sizeof(args));
> > +		args.r10 = TDX_HYPERCALL_STANDARD;
> > +		args.r11 = TDVMCALL_MAP_GPA;
> > +		args.r12 = start;
> > +		args.r13 = end - start;
> > +
> > +		ret = __tdx_hypercall_ret(&args);
> > +		if (ret != TDVMCALL_STATUS_RETRY)
> > +			return !ret;
> > +		/*
> > +		 * The guest must retry the operation for the pages in the
> > +		 * region starting at the GPA specified in R11. R11 comes
> > +		 * from the untrusted VMM. Sanity check it.
> > +		 */
> > +		map_fail_paddr = args.r11;
> 
> Do you really need map_fail_paddr? Why not directly use args.r11?
> 
> > +		if (map_fail_paddr < start || map_fail_paddr >= end)
> > +			return false;

Originally, I used r11. 

Dave says " 'r11' needs a real, logical name":
https://lwn.net/ml/linux-kernel/6bb65614-d420-49d3-312f-316dc8ca4cc4@intel.com/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-20 19:23     ` Dexuan Cui
@ 2023-06-20 19:44       ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 9+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-06-20 19:44 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, Haiyang Zhang, hpa, jane.chu,
	kirill.shutemov, KY Srinivasan, linux-arch, linux-hyperv, luto,
	mingo, peterz, rostedt, seanjc, tglx, tony.luck, wei.liu, x86,
	Michael Kelley (LINUX)
  Cc: linux-kernel, Tianyu Lan, rick.p.edgecombe

Hi,

On 6/20/23 12:23 PM, Dexuan Cui wrote:
>> From: Sathyanarayanan Kuppuswamy
>> Sent: Tuesday, June 20, 2023 11:31 AM
>>> ...
>>> -static bool tdx_enc_status_changed(unsigned long vaddr, int numpages,
>> bool enc)
>>> +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
>>>  {
>>> -	phys_addr_t start = __pa(vaddr);
>>> -	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
>>> +	const int max_retries_per_page = 3;
>>
>> Add some details about why you chose 3? Maybe you can also use macro for it.
> 
> It's a small number recommended by Kirill:
> https://lwn.net/ml/linux-kernel/20221208194800.n27ak4xj6pmyny46@box.shutemov.name/
> 
> The spec doesn't define a max retry count. Normally I guess a max retry count
> of 2 should be enough, at least for Hyper-V according to my testing.
> 
> Maybe we can add a comment like this:
> 
> /* Retrying the hypercall a second time should succeed; use 3 just in case. */
> 
> Does this look good to all?

Looks fine to me.

> 
>>> +	struct tdx_hypercall_args args;
>>> +	u64 map_fail_paddr, ret;
>>> +	int retry_count = 0;
>>>
>>>  	if (!enc) {
>>>  		/* Set the shared (decrypted) bits: */
>>> @@ -718,12 +720,49 @@ static bool tdx_enc_status_changed(unsigned long
>> vaddr, int numpages, bool enc)
>>>  		end   |= cc_mkdec(0);
>>>  	}
>>>
>>> -	/*
>>> -	 * Notify the VMM about page mapping conversion. More info about ABI
>>> -	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
>>> -	 * section "TDG.VP.VMCALL<MapGPA>"
>>> -	 */
>>> -	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
>>> +	while (retry_count < max_retries_per_page) {
>>> +		memset(&args, 0, sizeof(args));
>>> +		args.r10 = TDX_HYPERCALL_STANDARD;
>>> +		args.r11 = TDVMCALL_MAP_GPA;
>>> +		args.r12 = start;
>>> +		args.r13 = end - start;
>>> +
>>> +		ret = __tdx_hypercall_ret(&args);
>>> +		if (ret != TDVMCALL_STATUS_RETRY)
>>> +			return !ret;
>>> +		/*
>>> +		 * The guest must retry the operation for the pages in the
>>> +		 * region starting at the GPA specified in R11. R11 comes
>>> +		 * from the untrusted VMM. Sanity check it.
>>> +		 */
>>> +		map_fail_paddr = args.r11;
>>
>> Do you really need map_fail_paddr? Why not directly use args.r11?
>>
>>> +		if (map_fail_paddr < start || map_fail_paddr >= end)
>>> +			return false;
> 
> Originally, I used r11. 
> 
> Dave says " 'r11' needs a real, logical name":
> https://lwn.net/ml/linux-kernel/6bb65614-d420-49d3-312f-316dc8ca4cc4@intel.com/

Got it.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-20 15:48 ` [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
  2023-06-20 18:30   ` Sathyanarayanan Kuppuswamy
@ 2023-06-20 23:34   ` Dave Hansen
  2023-06-21  0:28     ` Dexuan Cui
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Hansen @ 2023-06-20 23:34 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	x86, mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe

On 6/20/23 08:48, Dexuan Cui wrote:
> -static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
> +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
>  {
> -	phys_addr_t start = __pa(vaddr);
> -	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> +	const int max_retries_per_page = 3;
> +	struct tdx_hypercall_args args;
> +	u64 map_fail_paddr, ret;
> +	int retry_count = 0;
>  
>  	if (!enc) {
>  		/* Set the shared (decrypted) bits: */
> @@ -718,12 +720,49 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
>  		end   |= cc_mkdec(0);
>  	}
>  
> -	/*
> -	 * Notify the VMM about page mapping conversion. More info about ABI
> -	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
> -	 * section "TDG.VP.VMCALL<MapGPA>"
> -	 */
> -	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
> +	while (retry_count < max_retries_per_page) {
> +		memset(&args, 0, sizeof(args));
> +		args.r10 = TDX_HYPERCALL_STANDARD;
> +		args.r11 = TDVMCALL_MAP_GPA;
> +		args.r12 = start;
> +		args.r13 = end - start;
> +

What's wrong with:

	while (retry_count < max_retries_per_page) {
		struct tdx_hypercall_args args = {
			.r10 = TDX_HYPERCALL_STANDARD,
			.r11 = TDVMCALL_MAP_GPA,
			.r12 = start,
			.r13 = end - start };

?

Or maybe with the brackets slightly differently arranged.

Why'd you declare all the variables outside the while() loop anyway?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-20 23:34   ` Dave Hansen
@ 2023-06-21  0:28     ` Dexuan Cui
  0 siblings, 0 replies; 9+ messages in thread
From: Dexuan Cui @ 2023-06-21  0:28 UTC (permalink / raw)
  To: Dave Hansen, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, Haiyang Zhang, hpa, jane.chu, kirill.shutemov,
	KY Srinivasan, linux-arch, linux-hyperv, luto, mingo, peterz,
	rostedt, sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck,
	wei.liu, x86, Michael Kelley (LINUX)
  Cc: linux-kernel, Tianyu Lan, rick.p.edgecombe

> From: Dave Hansen <dave.hansen@intel.com>
> Sent: Tuesday, June 20, 2023 4:34 PM
> >  ...
> > -	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
> > +	while (retry_count < max_retries_per_page) {
> > +		memset(&args, 0, sizeof(args));
> > +		args.r10 = TDX_HYPERCALL_STANDARD;
> > +		args.r11 = TDVMCALL_MAP_GPA;
> > +		args.r12 = start;
> > +		args.r13 = end - start;
> > +
> 
> What's wrong with:
> 
> 	while (retry_count < max_retries_per_page) {
> 		struct tdx_hypercall_args args = {
> 			.r10 = TDX_HYPERCALL_STANDARD,
> 			.r11 = TDVMCALL_MAP_GPA,
> 			.r12 = start,
> 			.r13 = end - start };
> 
> ?
> 
> Or maybe with the brackets slightly differently arranged.
> 
> Why'd you declare all the variables outside the while() loop anyway?

Thanks for the suggestion of making the code compact!
I'll apply the below diff, and post v9 tomorrow (trying to
not post too frequently...)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 08eac8f46c11..1cb7e9ee3a68 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -710,9 +710,8 @@ static bool tdx_cache_flush_required(void)
  */
 static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 {
+	/* Retrying the hypercall a second time should succeed; use 3 just in case */
 	const int max_retries_per_page = 3;
-	struct tdx_hypercall_args args;
-	u64 map_fail_paddr, ret;
 	int retry_count = 0;
 
 	if (!enc) {
@@ -722,13 +721,14 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 	}
 
 	while (retry_count < max_retries_per_page) {
-		memset(&args, 0, sizeof(args));
-		args.r10 = TDX_HYPERCALL_STANDARD;
-		args.r11 = TDVMCALL_MAP_GPA;
-		args.r12 = start;
-		args.r13 = end - start;
-
-		ret = __tdx_hypercall_ret(&args);
+		struct tdx_hypercall_args args = {
+			.r10 = TDX_HYPERCALL_STANDARD,
+			.r11 = TDVMCALL_MAP_GPA,
+			.r12 = start,
+			.r13 = end - start };
+
+		u64 map_fail_paddr;
+		u64 ret = __tdx_hypercall_ret(&args);
 		if (ret != TDVMCALL_STATUS_RETRY)
 			return !ret;
 		/*


The function now looks like this:

/*
 * Notify the VMM about page mapping conversion. More info about ABI
 * can be found in TDX Guest-Host-Communication Interface (GHCI),
 * section "TDG.VP.VMCALL<MapGPA>".
 */
static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
{
	/* Retrying the hypercall a second time should succeed; use 3 just in case */
	const int max_retries_per_page = 3;
	int retry_count = 0;

	if (!enc) {
		/* Set the shared (decrypted) bits: */
		start |= cc_mkdec(0);
		end   |= cc_mkdec(0);
	}

	while (retry_count < max_retries_per_page) {
		struct tdx_hypercall_args args = {
			.r10 = TDX_HYPERCALL_STANDARD,
			.r11 = TDVMCALL_MAP_GPA,
			.r12 = start,
			.r13 = end - start };

		u64 map_fail_paddr;
		u64 ret = __tdx_hypercall_ret(&args);
		if (ret != TDVMCALL_STATUS_RETRY)
			return !ret;
		/*
		 * The guest must retry the operation for the pages in the
		 * region starting at the GPA specified in R11. R11 comes
		 * from the untrusted VMM. Sanity check it.
		 */
		map_fail_paddr = args.r11;
		if (map_fail_paddr < start || map_fail_paddr >= end)
			return false;

		/* "Consume" a retry without forward progress */
		if (map_fail_paddr == start) {
			retry_count++;
			continue;
		}

		start = map_fail_paddr;
		retry_count = 0;
	}

	return false;
}


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-06-21  0:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-20 15:48 [PATCH v8 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
2023-06-20 15:48 ` [PATCH v8 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
2023-06-20 18:30   ` Sathyanarayanan Kuppuswamy
2023-06-20 19:23     ` Dexuan Cui
2023-06-20 19:44       ` Sathyanarayanan Kuppuswamy
2023-06-20 23:34   ` Dave Hansen
2023-06-21  0:28     ` Dexuan Cui
2023-06-20 15:48 ` [PATCH v8 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
2023-06-20 18:34   ` Sathyanarayanan Kuppuswamy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).