All of lore.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Jarkko Sakkinen <jarkko@kernel.org>, <linux-sgx@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	<stable@vger.kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	<x86@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Jethro Beekman <jethro@fortanix.com>,
	Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH v3] x86/sgx: Free backing memory after faulting the enclave page
Date: Wed, 12 Jan 2022 22:08:02 -0800	[thread overview]
Message-ID: <cd26205a-8551-194f-58df-05f89cd4f049@intel.com> (raw)
In-Reply-To: <20220108140510.76583-1-jarkko@kernel.org>

Hi Jarkko,

On 1/8/2022 6:05 AM, Jarkko Sakkinen wrote:
> There is a limited amount of SGX memory (EPC) on each system.  When that
> memory is used up, SGX has its own swapping mechanism which is similar
> in concept but totally separate from the core mm/* code.  Instead of
> swapping to disk, SGX swaps from EPC to normal RAM.  That normal RAM
> comes from a shared memory pseudo-file and can itself be swapped by the
> core mm code.  There is a hierarchy like this:
> 
> 	EPC <-> shmem <-> disk
> 
> After data is swapped back in from shmem to EPC, the shmem backing
> storage needs to be freed.  Currently, the backing shmem is not freed.
> This effectively wastes the shmem while the enclave is running.  The
> memory is recovered when the enclave is destroyed and the backing
> storage freed.
> 
> Sort this out by freeing memory with shmem_truncate_range(), as soon as
> a page is faulted back to the EPC.  In addition, free the memory for
> PCMD pages as soon as all PCMD's in a page have been marked as unused
> by zeroing its contents.
> 
> Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: stable@vger.kernel.org
> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> ---
> v3:
> * Resend.
> v2:
> * Rewrite commit message as proposed by Dave.
> * Truncate PCMD pages (Dave).
> ---
>  arch/x86/kernel/cpu/sgx/encl.c | 48 +++++++++++++++++++++++++++++++---
>  1 file changed, 44 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 001808e3901c..ea43c10e5458 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -12,6 +12,27 @@
>  #include "encls.h"
>  #include "sgx.h"
>  
> +
> +/*
> + * Get the page number of the page in the backing storage, which stores the PCMD
> + * of the enclave page in the given page index.  PCMD pages are located after
> + * the backing storage for the visible enclave pages and SECS.
> + */
> +static inline pgoff_t sgx_encl_get_backing_pcmd_nr(struct sgx_encl *encl, pgoff_t index)
> +{
> +	return PFN_DOWN(encl->size) + 1 + (index / sizeof(struct sgx_pcmd));
> +}
> +
> +/*
> + * Free a page from the backing storage in the given page index.
> + */
> +static inline void sgx_encl_truncate_backing_page(struct sgx_encl *encl, pgoff_t index)
> +{
> +	struct inode *inode = file_inode(encl->backing);
> +
> +	shmem_truncate_range(inode, PFN_PHYS(index), PFN_PHYS(index) + PAGE_SIZE - 1);
> +}
> +
>  /*
>   * ELDU: Load an EPC page as unblocked. For more info, see "OS Management of EPC
>   * Pages" in the SDM.
> @@ -24,7 +45,10 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  	struct sgx_encl *encl = encl_page->encl;
>  	struct sgx_pageinfo pginfo;
>  	struct sgx_backing b;
> +	bool pcmd_page_empty;
>  	pgoff_t page_index;
> +	pgoff_t pcmd_index;
> +	u8 *pcmd_page;
>  	int ret;
>  
>  	if (secs_page)
> @@ -38,8 +62,8 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  
>  	pginfo.addr = encl_page->desc & PAGE_MASK;
>  	pginfo.contents = (unsigned long)kmap_atomic(b.contents);
> -	pginfo.metadata = (unsigned long)kmap_atomic(b.pcmd) +
> -			  b.pcmd_offset;
> +	pcmd_page = kmap_atomic(b.pcmd);
> +	pginfo.metadata = (unsigned long)pcmd_page + b.pcmd_offset;
>  
>  	if (secs_page)
>  		pginfo.secs = (u64)sgx_get_epc_virt_addr(secs_page);
> @@ -55,11 +79,27 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  		ret = -EFAULT;
>  	}
>  
> -	kunmap_atomic((void *)(unsigned long)(pginfo.metadata - b.pcmd_offset));
> +	memset(pcmd_page + b.pcmd_offset, 0, sizeof(struct sgx_pcmd));
> +
> +	/*
> +	 * The area for the PCMD in the page was zeroed above.  Check if the
> +	 * whole page is now empty meaning that all PCMD's have been zeroed:
> +	 */
> +	pcmd_page_empty = !memchr_inv(pcmd_page, 0, PAGE_SIZE);
> +
> +	kunmap_atomic(pcmd_page);
>  	kunmap_atomic((void *)(unsigned long)pginfo.contents);
>  
>  	sgx_encl_put_backing(&b, false);
>  
> +	/* Free the backing memory. */
> +	sgx_encl_truncate_backing_page(encl, page_index);
> +
> +	if (pcmd_page_empty) {
> +		pcmd_index = sgx_encl_get_backing_pcmd_nr(encl, page_index);
> +		sgx_encl_truncate_backing_page(encl, pcmd_index);
> +	}
> +
>  	return ret;
>  }
>  
> @@ -577,7 +617,7 @@ static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
>  int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
>  			 struct sgx_backing *backing)
>  {
> -	pgoff_t pcmd_index = PFN_DOWN(encl->size) + 1 + (page_index >> 5);
> +	pgoff_t pcmd_index = sgx_encl_get_backing_pcmd_nr(encl, page_index);
>  	struct page *contents;
>  	struct page *pcmd;
>  

I applied this patch on top of commit 2056e2989bf4 ("x86/sgx: Fix NULL pointer
dereference on non-SGX systems") found on branch x86/sgx of the tip repo.

When I run the SGX selftests the new oversubscription test case is failing with
the error below:
./test_sgx
TAP version 13
1..6
# Starting 6 tests from 2 test cases.
#  RUN           enclave.unclobbered_vdso ...
#            OK  enclave.unclobbered_vdso
ok 1 enclave.unclobbered_vdso
#  RUN           enclave.unclobbered_vdso_oversubscribed ...
# main.c:330:unclobbered_vdso_oversubscribed:Expected (&self->run)->function (2) == EEXIT (4)
# main.c:330:unclobbered_vdso_oversubscribed:0x0e 0x06 0x00007f6000000fff
# main.c:338:unclobbered_vdso_oversubscribed:Expected get_op.value (0) == MAGIC (1234605616436508552)
# main.c:339:unclobbered_vdso_oversubscribed:Expected (&self->run)->function (2) == EEXIT (4)
# main.c:339:unclobbered_vdso_oversubscribed:0x0e 0x06 0x00007f6000000fff
# unclobbered_vdso_oversubscribed: Test failed at step #2
#          FAIL  enclave.unclobbered_vdso_oversubscribed
not ok 2 enclave.unclobbered_vdso_oversubscribed
#  RUN           enclave.clobbered_vdso ...
#            OK  enclave.clobbered_vdso
ok 3 enclave.clobbered_vdso
#  RUN           enclave.clobbered_vdso_and_user_function ...
#            OK  enclave.clobbered_vdso_and_user_function
ok 4 enclave.clobbered_vdso_and_user_function
#  RUN           enclave.tcs_entry ...
#            OK  enclave.tcs_entry
ok 5 enclave.tcs_entry
#  RUN           enclave.pte_permissions ...
#            OK  enclave.pte_permissions

The kernel logs also contain a splat that I have not encountered before:

------------[ cut here ]------------
ELDU returned 9 (0x9)
WARNING: CPU: 6 PID: 2470 at arch/x86/kernel/cpu/sgx/encl.c:77 sgx_encl_eldu+0x37c/0x3f0
Modules linked in: intel_rapl_msr intel_rapl_common i10nm_edac x86_pkg_temp_thermal ipmi_ssif coretemp kvm_intel kvm cmdlinepart intel_spi_pci intel_spi spi_nor ipmi_si mei_me ipmi_devintf input_leds irqbypass mtd mei ioatdma intel_pch_thermal wmi ipmi_msghandler acpi_power_meter iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear ixgbe crct10dif_pclmul crc32_pclmul ghash_clmulni_intel hid_generic aesni_intel crypto_simd xfrm_algo usbhid cryptd ast dca hid mdio drm_vram_helper drm_ttm_helper
CPU: 6 PID: 2470 Comm: test_sgx Not tainted 5.16.0-rc1+ #24
Hardware name: Intel Corporation 
RIP: 0010:sgx_encl_eldu+0x37c/0x3f0
Code: 89 c2 48 c7 c6 e1 e9 3e 9b 48 c7 c7 e6 e9 3e 9b 44 89 95 54 ff ff ff 4c 89 85 58 ff ff ff c6 05 fc bd dd 01 01 e8 54 88 03 00 <0f> 0b 44 8b 95 54 ff ff ff 4c 8b 85 58 ff ff ff e9 46 fe ff ff 48
<snip>
Call Trace:
<TASK>
sgx_encl_load_page+0x82/0xc0
? sgx_encl_load_page+0x82/0xc0
sgx_vma_fault+0x40/0xe0
__do_fault+0x32/0x110
__handle_mm_fault+0xf84/0x1510
handle_mm_fault+0x13e/0x3f0
do_user_addr_fault+0x210/0x660
? rcu_read_lock_sched_held+0x4f/0x80
exc_page_fault+0x7b/0x270
? asm_exc_page_fault+0x8/0x30
asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7ffe7fdc3dba
<snip>

I ran the test on two systems and in both cases the test failed accompanied by
the kernel splat.

Reinette

  reply	other threads:[~2022-01-13  6:08 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-08 14:05 [PATCH v3] x86/sgx: Free backing memory after faulting the enclave page Jarkko Sakkinen
2022-01-13  6:08 ` Reinette Chatre [this message]
2022-01-14 21:25   ` Jarkko Sakkinen
2022-01-17  8:30     ` Jarkko Sakkinen
2022-01-24 17:17   ` Dave Hansen
2022-01-26 13:31     ` Jarkko Sakkinen
2022-01-27  7:01 ` Jarkko Sakkinen
2022-01-31 23:30   ` Dave Hansen
2022-02-20 18:52     ` Jarkko Sakkinen
2022-02-20 18:58       ` Dave Hansen
2022-02-20 20:09         ` Jarkko Sakkinen
2022-02-20 20:42           ` Jarkko Sakkinen
2022-02-22  0:15     ` Jarkko Sakkinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cd26205a-8551-194f-58df-05f89cd4f049@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=jethro@fortanix.com \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=seanjc@google.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.