From: Alexander Graf <graf@amazon.de>
To: Andra Paraschiv <andraprs@amazon.com>, <linux-kernel@vger.kernel.org>
Cc: Anthony Liguori <aliguori@amazon.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Colm MacCarthaigh <colmmacc@amazon.com>,
"Bjoern Doebel" <doebel@amazon.de>,
David Woodhouse <dwmw@amazon.co.uk>,
"Frank van der Linden" <fllinden@amazon.com>,
Greg KH <gregkh@linuxfoundation.org>,
Martin Pohlack <mpohlack@amazon.de>, Matt Wilson <msw@amazon.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
Balbir Singh <sblbir@amazon.com>,
"Stefano Garzarella" <sgarzare@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Stewart Smith <trawets@amazon.com>,
Uwe Dannowski <uwed@amazon.de>, <kvm@vger.kernel.org>,
<ne-devel-upstream@amazon.com>
Subject: Re: [PATCH v4 11/18] nitro_enclaves: Add logic for enclave memory region set
Date: Mon, 6 Jul 2020 12:46:39 +0200
Message-ID: <798dbb9f-0fe4-9fd9-2e64-f6f2bc740abf@amazon.de>
In-Reply-To: <20200622200329.52996-12-andraprs@amazon.com>

On 22.06.20 22:03, Andra Paraschiv wrote:
> Another resource that is being set for an enclave is memory. User space
> memory regions, that need to be backed by contiguous memory regions,
> are associated with the enclave.
>
> One solution for allocating / reserving contiguous memory regions, that
> is used for integration, is hugetlbfs. The user space process that is
> associated with the enclave passes to the driver these memory regions.
>
> The enclave memory regions need to be from the same NUMA node as the
> enclave CPUs.
>
> Add ioctl command logic for setting user space memory region for an
> enclave.
>
> Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
> Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
> ---
> Changelog
>
> v3 -> v4
>
> * Check enclave memory regions are from the same NUMA node as the
> enclave CPUs.
> * Use dev_err instead of custom NE log pattern.
> * Update the NE ioctl call to match the decoupling from the KVM API.
>
> v2 -> v3
>
> * Remove the WARN_ON calls.
> * Update static calls sanity checks.
> * Update kzfree() calls to kfree().
>
> v1 -> v2
>
> * Add log pattern for NE.
> * Update goto labels to match their purpose.
> * Remove the BUG_ON calls.
> * Check if enclave max memory regions is reached when setting an enclave
> memory region.
> * Check if enclave state is init when setting an enclave memory region.
> ---
> drivers/virt/nitro_enclaves/ne_misc_dev.c | 257 ++++++++++++++++++++++
> 1 file changed, 257 insertions(+)
>
> diff --git a/drivers/virt/nitro_enclaves/ne_misc_dev.c b/drivers/virt/nitro_enclaves/ne_misc_dev.c
> index cfdefa52ed2a..17ccb6cdbd75 100644
> --- a/drivers/virt/nitro_enclaves/ne_misc_dev.c
> +++ b/drivers/virt/nitro_enclaves/ne_misc_dev.c
> @@ -476,6 +476,233 @@ static int ne_create_vcpu_ioctl(struct ne_enclave *ne_enclave, u32 vcpu_id)
> return rc;
> }
>
> +/**
> + * ne_sanity_check_user_mem_region - Sanity check the userspace memory
> + * region received during the set user memory region ioctl call.
> + *
> + * This function gets called with the ne_enclave mutex held.
> + *
> + * @ne_enclave: private data associated with the current enclave.
> + * @mem_region: user space memory region to be sanity checked.
> + *
> + * @returns: 0 on success, negative return value on failure.
> + */
> +static int ne_sanity_check_user_mem_region(struct ne_enclave *ne_enclave,
> + struct ne_user_memory_region *mem_region)
> +{
> + if (ne_enclave->mm != current->mm)
> + return -EIO;
> +
> + if ((mem_region->memory_size % NE_MIN_MEM_REGION_SIZE) != 0) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Mem size not multiple of 2 MiB\n");
> +
> + return -EINVAL;
Can we make this an error that gets propagated to user space explicitly?
I'd rather have a clear error return value of this function than a
random message in dmesg.
> + }
> +
> + if ((mem_region->userspace_addr & (NE_MIN_MEM_REGION_SIZE - 1)) ||
This logic already relies on the fact that NE_MIN_MEM_REGION_SIZE is a
power of two. Can you do the same above on the memory_size check?
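As a rough illustration of this suggestion (a user-space sketch, not code from the patch), the size check can use the same mask form as the address check. MIN_REGION_SIZE and is_region_multiple() are hypothetical names, and the 2 MiB value is assumed from the error message quoted above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: 2 MiB, taken from the "multiple of 2 MiB" message. */
#define MIN_REGION_SIZE (2ULL * 1024 * 1024)

/*
 * Hypothetical helper: true if value is a multiple of MIN_REGION_SIZE.
 * The mask form is only valid because MIN_REGION_SIZE is a power of two.
 */
static bool is_region_multiple(uint64_t value)
{
	return (value & (MIN_REGION_SIZE - 1)) == 0;
}
```

The same helper would then cover both memory_size and userspace_addr, so the two checks cannot drift apart.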
> + !access_ok((void __user *)(unsigned long)mem_region->userspace_addr,
> + mem_region->memory_size)) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Invalid user space addr range\n");
> +
> + return -EINVAL;
Same comment again. Return different errors for different conditions, so
that user space has a chance to print proper errors to its users.
Also, don't we have to check alignment of userspace_addr as well?
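One possible shape for this, sketched in user space: each failing condition gets its own return value so the caller can report a specific reason. The function name, the 2 MiB constant, and the particular errno values are placeholders for illustration, not what the driver uses, and the wrap-around test only stands in for a real access_ok() check:

```c
#include <errno.h>
#include <stdint.h>

#define REGION_ALIGN (2ULL * 1024 * 1024) /* assumed 2 MiB granule */

/*
 * Hypothetical check: every failure has a distinct return value.
 * The errno choices below are placeholders.
 */
static int check_user_region(uint64_t addr, uint64_t size)
{
	if (size & (REGION_ALIGN - 1))
		return -EINVAL;	/* size is not a multiple of the granule */

	if (addr & (REGION_ALIGN - 1))
		return -EFAULT;	/* start address is not aligned */

	if (addr + size < addr)
		return -ERANGE;	/* range wraps around the address space */

	return 0;
}
```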
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * ne_set_user_memory_region_ioctl - Add user space memory region to the slot
> + * associated with the current enclave.
> + *
> + * This function gets called with the ne_enclave mutex held.
> + *
> + * @ne_enclave: private data associated with the current enclave.
> + * @mem_region: user space memory region to be associated with the given slot.
> + *
> + * @returns: 0 on success, negative return value on failure.
> + */
> +static int ne_set_user_memory_region_ioctl(struct ne_enclave *ne_enclave,
> + struct ne_user_memory_region *mem_region)
> +{
> + struct ne_pci_dev_cmd_reply cmd_reply = {};
> + long gup_rc = 0;
> + unsigned long i = 0;
> + struct ne_mem_region *ne_mem_region = NULL;
> + unsigned long nr_phys_contig_mem_regions = 0;
> + unsigned long nr_pinned_pages = 0;
> + struct page **phys_contig_mem_regions = NULL;
> + int rc = -EINVAL;
> + struct slot_add_mem_req slot_add_mem_req = {};
> +
> + rc = ne_sanity_check_user_mem_region(ne_enclave, mem_region);
> + if (rc < 0)
> + return rc;
> +
> + ne_mem_region = kzalloc(sizeof(*ne_mem_region), GFP_KERNEL);
> + if (!ne_mem_region)
> + return -ENOMEM;
> +
> + /*
> + * TODO: Update nr_pages value to handle contiguous virtual address
> + * ranges mapped to non-contiguous physical regions. Hugetlbfs can give
> + * 2 MiB / 1 GiB contiguous physical regions.
> + */
> + ne_mem_region->nr_pages = mem_region->memory_size /
> + NE_MIN_MEM_REGION_SIZE;
> +
> + ne_mem_region->pages = kcalloc(ne_mem_region->nr_pages,
> + sizeof(*ne_mem_region->pages),
> + GFP_KERNEL);
> + if (!ne_mem_region->pages) {
> + kfree(ne_mem_region);
> +
> + return -ENOMEM;
kfree(NULL) is a nop, so you can just set rc and goto free_mem_region
here and below.
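A user-space sketch of the single-label cleanup being suggested, with free() standing in for kfree() (free(NULL) is likewise a no-op). The struct and function names are made up for the example:

```c
#include <errno.h>
#include <stdlib.h>

struct region {
	void **pages;
	size_t nr_pages;
};

/*
 * Hypothetical analogue of the allocation sequence: every failure
 * after the first allocation jumps to one label. Pointers that were
 * never allocated are still NULL there, and freeing NULL does nothing.
 */
static int region_setup(size_t nr_pages, struct region **out)
{
	struct region *r = NULL;
	void **scratch = NULL;
	int rc = -ENOMEM;

	r = calloc(1, sizeof(*r));
	if (!r)
		return rc;

	r->pages = calloc(nr_pages, sizeof(*r->pages));
	if (!r->pages)
		goto free_region;

	scratch = calloc(nr_pages, sizeof(*scratch));
	if (!scratch)
		goto free_region;

	r->nr_pages = nr_pages;
	free(scratch);	/* temporary array, released on success as well */
	*out = r;

	return 0;

free_region:
	free(scratch);
	free(r->pages);
	free(r);

	return rc;
}
```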
> + }
> +
> + phys_contig_mem_regions = kcalloc(ne_mem_region->nr_pages,
> + sizeof(*phys_contig_mem_regions),
> + GFP_KERNEL);
> + if (!phys_contig_mem_regions) {
> + kfree(ne_mem_region->pages);
> + kfree(ne_mem_region);
> +
> + return -ENOMEM;
> + }
> +
> + /*
> + * TODO: Handle non-contiguous memory regions received from user space.
> + * Hugetlbfs can give 2 MiB / 1 GiB contiguous physical regions. The
> + * virtual address space can be seen as contiguous, although it is
> + * mapped underneath to 2 MiB / 1 GiB physical regions e.g. 8 MiB
> + * virtual address space mapped to 4 physically contiguous regions of 2
> + * MiB.
> + */
> + do {
> + unsigned long tmp_nr_pages = ne_mem_region->nr_pages -
> + nr_pinned_pages;
> + struct page **tmp_pages = ne_mem_region->pages +
> + nr_pinned_pages;
> + u64 tmp_userspace_addr = mem_region->userspace_addr +
> + nr_pinned_pages * NE_MIN_MEM_REGION_SIZE;
> +
> + gup_rc = get_user_pages(tmp_userspace_addr, tmp_nr_pages,
> + FOLL_GET, tmp_pages, NULL);
> + if (gup_rc < 0) {
> + rc = gup_rc;
> +
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Error in gup [rc=%d]\n", rc);
> +
> + unpin_user_pages(ne_mem_region->pages, nr_pinned_pages);
> +
> + goto free_mem_region;
> + }
> +
> + nr_pinned_pages += gup_rc;
> +
> + } while (nr_pinned_pages < ne_mem_region->nr_pages);
Can this deadlock the kernel? Shouldn't we rather return an error when
we can't pin all pages?
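To make the concern concrete, here is a user-space sketch of a pin loop that treats a call making no progress as an error instead of retrying forever. pin_fn, pin_all() and the two stub functions are hypothetical stand-ins, and -EFAULT is only a placeholder error value:

```c
#include <errno.h>

/* Hypothetical stand-in for a pinning call: pages pinned, or < 0. */
typedef long (*pin_fn)(unsigned long start_idx, unsigned long nr);

/*
 * Pin nr_pages in as many calls as needed. A call that pins nothing
 * ends the loop with an error. On failure, *pinned tells the caller
 * how many pages it still has to release.
 */
static int pin_all(pin_fn pin, unsigned long nr_pages, unsigned long *pinned)
{
	*pinned = 0;

	while (*pinned < nr_pages) {
		long rc = pin(*pinned, nr_pages - *pinned);

		if (rc < 0)
			return (int)rc;
		if (rc == 0)
			return -EFAULT;	/* no progress: give up */

		*pinned += (unsigned long)rc;
	}

	return 0;
}

/* Stub: pins at most two pages per call. */
static long pin_in_pairs(unsigned long start_idx, unsigned long nr)
{
	(void)start_idx;
	return nr < 2 ? (long)nr : 2;
}

/* Stub: pins one page per call, then stops making progress at index 3. */
static long pin_stalls(unsigned long start_idx, unsigned long nr)
{
	(void)nr;
	return start_idx < 3 ? 1 : 0;
}
```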
> +
> + /*
> + * TODO: Update checks once physically contiguous regions are collected
> + * based on the user space address and get_user_pages() results.
> + */
> + for (i = 0; i < ne_mem_region->nr_pages; i++) {
> + if (!PageHuge(ne_mem_region->pages[i])) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Not a hugetlbfs page\n");
> +
> + goto unpin_pages;
> + }
> +
> + if (huge_page_size(page_hstate(ne_mem_region->pages[i])) !=
> + NE_MIN_MEM_REGION_SIZE) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Page size isn't 2 MiB\n");
Why is a huge page size larger than 2 MiB a problem? Can't we just use
huge_page_size() as the NE memory slot size?
> +
> + goto unpin_pages;
> + }
> +
> + if (ne_enclave->numa_node !=
> + page_to_nid(ne_mem_region->pages[i])) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Page isn't from NUMA node %d\n",
> + ne_enclave->numa_node);
> +
> + goto unpin_pages;
Is there a way to give user space hints on *why* things are going wrong?
> + }
> +
> + /*
> + * TODO: Update once handled non-contiguous memory regions
> + * received from user space.
> + */
> + phys_contig_mem_regions[i] = ne_mem_region->pages[i];
> + }
> +
> + /*
> + * TODO: Update once handled non-contiguous memory regions received
> + * from user space.
> + */
> + nr_phys_contig_mem_regions = ne_mem_region->nr_pages;
> +
> + if ((ne_enclave->nr_mem_regions + nr_phys_contig_mem_regions) >
> + ne_enclave->max_mem_regions) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Reached max memory regions %lld\n",
> + ne_enclave->max_mem_regions);
> +
> + goto unpin_pages;
> + }
> +
> + for (i = 0; i < nr_phys_contig_mem_regions; i++) {
> + u64 phys_addr = page_to_phys(phys_contig_mem_regions[i]);
> +
> + slot_add_mem_req.slot_uid = ne_enclave->slot_uid;
> + slot_add_mem_req.paddr = phys_addr;
> + /*
> + * TODO: Update memory size of physical contiguous memory
> + * region, in case of non-contiguous memory regions received
> + * from user space.
> + */
> + slot_add_mem_req.size = NE_MIN_MEM_REGION_SIZE;
Yeah, for now, just make it huge_page_size()! :)
> +
> + rc = ne_do_request(ne_enclave->pdev, SLOT_ADD_MEM,
> + &slot_add_mem_req, sizeof(slot_add_mem_req),
> + &cmd_reply, sizeof(cmd_reply));
> + if (rc < 0) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Error in slot add mem [rc=%d]\n",
> + rc);
> +
> + /* TODO: Only unpin memory regions not added. */
Are we sure we're not creating an unusable system here?
> + goto unpin_pages;
> + }
> +
> + ne_enclave->mem_size += slot_add_mem_req.size;
> + ne_enclave->nr_mem_regions++;
> +
> + memset(&slot_add_mem_req, 0, sizeof(slot_add_mem_req));
> + memset(&cmd_reply, 0, sizeof(cmd_reply));
If you define the variables in the for loop scope, you don't need to
manually zero them again.
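A small user-space sketch of the loop-scope variant. The struct layout and names are invented for the example; the point is only that an initializer on a loop-body declaration runs on every iteration, so fields never assigned stay zero without a memset():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout, not the driver's struct. */
struct add_mem_req {
	uint64_t slot_uid;
	uint64_t paddr;
	uint64_t size;
	uint64_t flags;	/* never assigned below, so it must stay 0 */
};

/* Build one request per physical address and copy it into out[]. */
static size_t build_requests(uint64_t slot_uid, const uint64_t *paddrs,
			     size_t n, uint64_t size,
			     struct add_mem_req *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Re-initialised each iteration; nothing carries over. */
		struct add_mem_req req = { 0 };

		req.slot_uid = slot_uid;
		req.paddr = paddrs[i];
		req.size = size;

		out[i] = req;
	}

	return n;
}
```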
Alex
> + }
> +
> + list_add(&ne_mem_region->mem_region_list_entry,
> + &ne_enclave->mem_regions_list);
> +
> + kfree(phys_contig_mem_regions);
> +
> + return 0;
> +
> +unpin_pages:
> + unpin_user_pages(ne_mem_region->pages, ne_mem_region->nr_pages);
> +free_mem_region:
> + kfree(phys_contig_mem_regions);
> + kfree(ne_mem_region->pages);
> + kfree(ne_mem_region);
> +
> + return rc;
> +}
> +
> static long ne_enclave_ioctl(struct file *file, unsigned int cmd,
> unsigned long arg)
> {
> @@ -561,6 +788,36 @@ static long ne_enclave_ioctl(struct file *file, unsigned int cmd,
> return 0;
> }
>
> + case NE_SET_USER_MEMORY_REGION: {
> + struct ne_user_memory_region mem_region = {};
> + int rc = -EINVAL;
> +
> + if (copy_from_user(&mem_region, (void *)arg,
> + sizeof(mem_region))) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Error in copy from user\n");
> +
> + return -EFAULT;
> + }
> +
> + mutex_lock(&ne_enclave->enclave_info_mutex);
> +
> + if (ne_enclave->state != NE_STATE_INIT) {
> + dev_err_ratelimited(ne_misc_dev.this_device,
> + "Enclave isn't in init state\n");
> +
> + mutex_unlock(&ne_enclave->enclave_info_mutex);
> +
> + return -EINVAL;
> + }
> +
> + rc = ne_set_user_memory_region_ioctl(ne_enclave, &mem_region);
> +
> + mutex_unlock(&ne_enclave->enclave_info_mutex);
> +
> + return rc;
> + }
> +
> default:
> return -ENOTTY;
> }
>