linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Graf <graf@amazon.de>
To: Andra Paraschiv <andraprs@amazon.com>, <linux-kernel@vger.kernel.org>
Cc: Anthony Liguori <aliguori@amazon.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Colm MacCarthaigh <colmmacc@amazon.com>,
	"Bjoern Doebel" <doebel@amazon.de>,
	David Woodhouse <dwmw@amazon.co.uk>,
	"Frank van der Linden" <fllinden@amazon.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Martin Pohlack <mpohlack@amazon.de>, Matt Wilson <msw@amazon.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	Balbir Singh <sblbir@amazon.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Stewart Smith <trawets@amazon.com>,
	Uwe Dannowski <uwed@amazon.de>, <kvm@vger.kernel.org>,
	<ne-devel-upstream@amazon.com>
Subject: Re: [PATCH v4 11/18] nitro_enclaves: Add logic for enclave memory region set
Date: Mon, 6 Jul 2020 12:46:39 +0200	[thread overview]
Message-ID: <798dbb9f-0fe4-9fd9-2e64-f6f2bc740abf@amazon.de> (raw)
In-Reply-To: <20200622200329.52996-12-andraprs@amazon.com>



On 22.06.20 22:03, Andra Paraschiv wrote:
> Another resource that is being set for an enclave is memory. User space
> memory regions, that need to be backed by contiguous memory regions,
> are associated with the enclave.
> 
> One solution for allocating / reserving contiguous memory regions, that
> is used for integration, is hugetlbfs. The user space process that is
> associated with the enclave passes to the driver these memory regions.
> 
> The enclave memory regions need to be from the same NUMA node as the
> enclave CPUs.
> 
> Add ioctl command logic for setting user space memory region for an
> enclave.
> 
> Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
> Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
> ---
> Changelog
> 
> v3 -> v4
> 
> * Check enclave memory regions are from the same NUMA node as the
>    enclave CPUs.
> * Use dev_err instead of custom NE log pattern.
> * Update the NE ioctl call to match the decoupling from the KVM API.
> 
> v2 -> v3
> 
> * Remove the WARN_ON calls.
> * Update static calls sanity checks.
> * Update kzfree() calls to kfree().
> 
> v1 -> v2
> 
> * Add log pattern for NE.
> * Update goto labels to match their purpose.
> * Remove the BUG_ON calls.
> * Check if enclave max memory regions is reached when setting an enclave
>    memory region.
> * Check if enclave state is init when setting an enclave memory region.
> ---
>   drivers/virt/nitro_enclaves/ne_misc_dev.c | 257 ++++++++++++++++++++++
>   1 file changed, 257 insertions(+)
> 
> diff --git a/drivers/virt/nitro_enclaves/ne_misc_dev.c b/drivers/virt/nitro_enclaves/ne_misc_dev.c
> index cfdefa52ed2a..17ccb6cdbd75 100644
> --- a/drivers/virt/nitro_enclaves/ne_misc_dev.c
> +++ b/drivers/virt/nitro_enclaves/ne_misc_dev.c
> @@ -476,6 +476,233 @@ static int ne_create_vcpu_ioctl(struct ne_enclave *ne_enclave, u32 vcpu_id)
>   	return rc;
>   }
>   
> +/**
> + * ne_sanity_check_user_mem_region - Sanity check the userspace memory
> + * region received during the set user memory region ioctl call.
> + *
> + * This function gets called with the ne_enclave mutex held.
> + *
> + * @ne_enclave: private data associated with the current enclave.
> + * @mem_region: user space memory region to be sanity checked.
> + *
> + * @returns: 0 on success, negative return value on failure.
> + */
> +static int ne_sanity_check_user_mem_region(struct ne_enclave *ne_enclave,
> +	struct ne_user_memory_region *mem_region)
> +{
> +	if (ne_enclave->mm != current->mm)
> +		return -EIO;
> +
> +	if ((mem_region->memory_size % NE_MIN_MEM_REGION_SIZE) != 0) {
> +		dev_err_ratelimited(ne_misc_dev.this_device,
> +				    "Mem size not multiple of 2 MiB\n");
> +
> +		return -EINVAL;

Can we make this an error that gets propagated to user space explicitly? 
I'd rather have a clear error return value of this function than a 
random message in dmesg.

> +	}
> +
> +	if ((mem_region->userspace_addr & (NE_MIN_MEM_REGION_SIZE - 1)) ||

This logic already relies on the fact that NE_MIN_MEM_REGION_SIZE is a 
power of two. Can you do the same above on the memory_size check?

> +	    !access_ok((void __user *)(unsigned long)mem_region->userspace_addr,
> +		       mem_region->memory_size)) {
> +		dev_err_ratelimited(ne_misc_dev.this_device,
> +				    "Invalid user space addr range\n");
> +
> +		return -EINVAL;

Same comment again. Return different errors for different conditions, so 
that user space has a chance to print proper errors to its users.

Also, don't we have to check alignment of userspace_addr as well?

> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * ne_set_user_memory_region_ioctl - Add user space memory region to the slot
> + * associated with the current enclave.
> + *
> + * This function gets called with the ne_enclave mutex held.
> + *
> + * @ne_enclave: private data associated with the current enclave.
> + * @mem_region: user space memory region to be associated with the given slot.
> + *
> + * @returns: 0 on success, negative return value on failure.
> + */
> +static int ne_set_user_memory_region_ioctl(struct ne_enclave *ne_enclave,
> +	struct ne_user_memory_region *mem_region)
> +{
> +	struct ne_pci_dev_cmd_reply cmd_reply = {};
> +	long gup_rc = 0;
> +	unsigned long i = 0;
> +	struct ne_mem_region *ne_mem_region = NULL;
> +	unsigned long nr_phys_contig_mem_regions = 0;
> +	unsigned long nr_pinned_pages = 0;
> +	struct page **phys_contig_mem_regions = NULL;
> +	int rc = -EINVAL;
> +	struct slot_add_mem_req slot_add_mem_req = {};
> +
> +	rc = ne_sanity_check_user_mem_region(ne_enclave, mem_region);
> +	if (rc < 0)
> +		return rc;
> +
> +	ne_mem_region = kzalloc(sizeof(*ne_mem_region), GFP_KERNEL);
> +	if (!ne_mem_region)
> +		return -ENOMEM;
> +
> +	/*
> +	 * TODO: Update nr_pages value to handle contiguous virtual address
> +	 * ranges mapped to non-contiguous physical regions. Hugetlbfs can give
> +	 * 2 MiB / 1 GiB contiguous physical regions.
> +	 */
> +	ne_mem_region->nr_pages = mem_region->memory_size /
> +		NE_MIN_MEM_REGION_SIZE;
> +
> +	ne_mem_region->pages = kcalloc(ne_mem_region->nr_pages,
> +				       sizeof(*ne_mem_region->pages),
> +				       GFP_KERNEL);
> +	if (!ne_mem_region->pages) {
> +		kfree(ne_mem_region);
> +
> +		return -ENOMEM;

kfree(NULL) is a nop, so you can just set rc and goto free_mem_region 
here and below.

> +	}
> +
> +	phys_contig_mem_regions = kcalloc(ne_mem_region->nr_pages,
> +					  sizeof(*phys_contig_mem_regions),
> +					  GFP_KERNEL);
> +	if (!phys_contig_mem_regions) {
> +		kfree(ne_mem_region->pages);
> +		kfree(ne_mem_region);
> +
> +		return -ENOMEM;
> +	}
> +
> +	/*
> +	 * TODO: Handle non-contiguous memory regions received from user space.
> +	 * Hugetlbfs can give 2 MiB / 1 GiB contiguous physical regions. The
> +	 * virtual address space can be seen as contiguous, although it is
> +	 * mapped underneath to 2 MiB / 1 GiB physical regions e.g. 8 MiB
> +	 * virtual address space mapped to 4 physically contiguous regions of 2
> +	 * MiB.
> +	 */
> +	do {
> +		unsigned long tmp_nr_pages = ne_mem_region->nr_pages -
> +			nr_pinned_pages;
> +		struct page **tmp_pages = ne_mem_region->pages +
> +			nr_pinned_pages;
> +		u64 tmp_userspace_addr = mem_region->userspace_addr +
> +			nr_pinned_pages * NE_MIN_MEM_REGION_SIZE;
> +
> +		gup_rc = get_user_pages(tmp_userspace_addr, tmp_nr_pages,
> +					FOLL_GET, tmp_pages, NULL);
> +		if (gup_rc < 0) {
> +			rc = gup_rc;
> +
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Error in gup [rc=%d]\n", rc);
> +
> +			unpin_user_pages(ne_mem_region->pages, nr_pinned_pages);
> +
> +			goto free_mem_region;
> +		}
> +
> +		nr_pinned_pages += gup_rc;
> +
> +	} while (nr_pinned_pages < ne_mem_region->nr_pages);

Can this deadlock the kernel? Shouldn't we rather return an error when 
we can't pin all pages?

> +
> +	/*
> +	 * TODO: Update checks once physically contiguous regions are collected
> +	 * based on the user space address and get_user_pages() results.
> +	 */
> +	for (i = 0; i < ne_mem_region->nr_pages; i++) {
> +		if (!PageHuge(ne_mem_region->pages[i])) {
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Not a hugetlbfs page\n");
> +
> +			goto unpin_pages;
> +		}
> +
> +		if (huge_page_size(page_hstate(ne_mem_region->pages[i])) !=
> +		    NE_MIN_MEM_REGION_SIZE) {
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Page size isn't 2 MiB\n");

Why is a huge page size of >2MB a problem? Can't we just make 
huge_page_size() the ne mem slot size?

> +
> +			goto unpin_pages;
> +		}
> +
> +		if (ne_enclave->numa_node !=
> +		    page_to_nid(ne_mem_region->pages[i])) {
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Page isn't from NUMA node %d\n",
> +					    ne_enclave->numa_node);
> +
> +			goto unpin_pages;

Is there a way to give user space hints on *why* things are going wrong?

> +		}
> +
> +		/*
> +		 * TODO: Update once handled non-contiguous memory regions
> +		 * received from user space.
> +		 */
> +		phys_contig_mem_regions[i] = ne_mem_region->pages[i];
> +	}
> +
> +	/*
> +	 * TODO: Update once handled non-contiguous memory regions received
> +	 * from user space.
> +	 */
> +	nr_phys_contig_mem_regions = ne_mem_region->nr_pages;
> +
> +	if ((ne_enclave->nr_mem_regions + nr_phys_contig_mem_regions) >
> +	    ne_enclave->max_mem_regions) {
> +		dev_err_ratelimited(ne_misc_dev.this_device,
> +				    "Reached max memory regions %lld\n",
> +				    ne_enclave->max_mem_regions);
> +
> +		goto unpin_pages;
> +	}
> +
> +	for (i = 0; i < nr_phys_contig_mem_regions; i++) {
> +		u64 phys_addr = page_to_phys(phys_contig_mem_regions[i]);
> +
> +		slot_add_mem_req.slot_uid = ne_enclave->slot_uid;
> +		slot_add_mem_req.paddr = phys_addr;
> +		/*
> +		 * TODO: Update memory size of physical contiguous memory
> +		 * region, in case of non-contiguous memory regions received
> +		 * from user space.
> +		 */
> +		slot_add_mem_req.size = NE_MIN_MEM_REGION_SIZE;

Yeah, for now, just make it huge_page_size()! :)

> +
> +		rc = ne_do_request(ne_enclave->pdev, SLOT_ADD_MEM,
> +				   &slot_add_mem_req, sizeof(slot_add_mem_req),
> +				   &cmd_reply, sizeof(cmd_reply));
> +		if (rc < 0) {
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Error in slot add mem [rc=%d]\n",
> +					    rc);
> +
> +			/* TODO: Only unpin memory regions not added. */

Are we sure we're not creating an unusable system here?

> +			goto unpin_pages;
> +		}
> +
> +		ne_enclave->mem_size += slot_add_mem_req.size;
> +		ne_enclave->nr_mem_regions++;
> +
> +		memset(&slot_add_mem_req, 0, sizeof(slot_add_mem_req));
> +		memset(&cmd_reply, 0, sizeof(cmd_reply));

If you define the variables in the for loop scope, you don't need to 
manually zero them again.


Alex

> +	}
> +
> +	list_add(&ne_mem_region->mem_region_list_entry,
> +		 &ne_enclave->mem_regions_list);
> +
> +	kfree(phys_contig_mem_regions);
> +
> +	return 0;
> +
> +unpin_pages:
> +	unpin_user_pages(ne_mem_region->pages, ne_mem_region->nr_pages);
> +free_mem_region:
> +	kfree(phys_contig_mem_regions);
> +	kfree(ne_mem_region->pages);
> +	kfree(ne_mem_region);
> +
> +	return rc;
> +}
> +
>   static long ne_enclave_ioctl(struct file *file, unsigned int cmd,
>   			     unsigned long arg)
>   {
> @@ -561,6 +788,36 @@ static long ne_enclave_ioctl(struct file *file, unsigned int cmd,
>   		return 0;
>   	}
>   
> +	case NE_SET_USER_MEMORY_REGION: {
> +		struct ne_user_memory_region mem_region = {};
> +		int rc = -EINVAL;
> +
> +		if (copy_from_user(&mem_region, (void *)arg,
> +				   sizeof(mem_region))) {
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Error in copy from user\n");
> +
> +			return -EFAULT;
> +		}
> +
> +		mutex_lock(&ne_enclave->enclave_info_mutex);
> +
> +		if (ne_enclave->state != NE_STATE_INIT) {
> +			dev_err_ratelimited(ne_misc_dev.this_device,
> +					    "Enclave isn't in init state\n");
> +
> +			mutex_unlock(&ne_enclave->enclave_info_mutex);
> +
> +			return -EINVAL;
> +		}
> +
> +		rc = ne_set_user_memory_region_ioctl(ne_enclave, &mem_region);
> +
> +		mutex_unlock(&ne_enclave->enclave_info_mutex);
> +
> +		return rc;
> +	}
> +
>   	default:
>   		return -ENOTTY;
>   	}
> 



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




  reply	other threads:[~2020-07-06 10:47 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-22 20:03 [PATCH v4 00/18] Add support for Nitro Enclaves Andra Paraschiv
2020-06-22 20:03 ` [PATCH v4 01/18] nitro_enclaves: Add ioctl interface definition Andra Paraschiv
2020-06-23  8:56   ` Stefan Hajnoczi
2020-06-24 14:02     ` Paraschiv, Andra-Irina
2020-06-25 13:29       ` Stefan Hajnoczi
2020-06-25 17:42         ` Paraschiv, Andra-Irina
2020-07-02 15:24   ` Alexander Graf
2020-07-04  8:09     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 02/18] nitro_enclaves: Define the PCI device interface Andra Paraschiv
2020-07-02 15:24   ` Alexander Graf
2020-07-04  8:20     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 03/18] nitro_enclaves: Define enclave info for internal bookkeeping Andra Paraschiv
2020-07-02 15:24   ` Alexander Graf
2020-07-04  8:23     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 04/18] nitro_enclaves: Init PCI device driver Andra Paraschiv
2020-07-02 15:09   ` Alexander Graf
2020-07-04 10:00     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 05/18] nitro_enclaves: Handle PCI device command requests Andra Paraschiv
2020-07-02 15:19   ` Alexander Graf
2020-07-04 15:05     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 06/18] nitro_enclaves: Handle out-of-band PCI device events Andra Paraschiv
2020-07-02 15:24   ` Alexander Graf
2020-07-04 15:43     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 07/18] nitro_enclaves: Init misc device providing the ioctl interface Andra Paraschiv
2020-06-29 16:20   ` Greg KH
2020-06-29 17:45     ` Paraschiv, Andra-Irina
2020-06-30  8:05       ` Greg KH
2020-06-30  9:08         ` Paraschiv, Andra-Irina
2020-07-06  7:13   ` Alexander Graf
2020-07-06  7:49     ` Paraschiv, Andra-Irina
2020-07-06  8:01       ` Alexander Graf
2020-07-06 13:09         ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 08/18] nitro_enclaves: Add logic for enclave vm creation Andra Paraschiv
2020-07-06  7:53   ` Alexander Graf
2020-07-06 13:12     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 09/18] nitro_enclaves: Add logic for enclave vcpu creation Andra Paraschiv
2020-07-06 10:12   ` Alexander Graf
2020-07-08 12:46     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 10/18] nitro_enclaves: Add logic for enclave image load info Andra Paraschiv
2020-07-06 10:16   ` Alexander Graf
2020-07-06 13:35     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 11/18] nitro_enclaves: Add logic for enclave memory region set Andra Paraschiv
2020-07-06 10:46   ` Alexander Graf [this message]
2020-07-09  7:36     ` Paraschiv, Andra-Irina
2020-07-09  8:40       ` Alexander Graf
2020-07-09  9:41         ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 12/18] nitro_enclaves: Add logic for enclave start Andra Paraschiv
2020-07-06 11:21   ` Alexander Graf
2020-07-07 18:27     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 13/18] nitro_enclaves: Add logic for enclave termination Andra Paraschiv
2020-07-06 11:26   ` Alexander Graf
2020-07-06 14:15     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 14/18] nitro_enclaves: Add Kconfig for the Nitro Enclaves driver Andra Paraschiv
2020-07-06 11:28   ` Alexander Graf
2020-07-06 13:50     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 15/18] nitro_enclaves: Add Makefile " Andra Paraschiv
2020-07-06 11:30   ` Alexander Graf
2020-07-06 14:00     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 16/18] nitro_enclaves: Add sample for ioctl interface usage Andra Paraschiv
2020-07-06 11:39   ` Alexander Graf
2020-07-07 19:03     ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 17/18] nitro_enclaves: Add overview documentation Andra Paraschiv
2020-06-23  8:59   ` Stefan Hajnoczi
2020-06-24 14:39     ` Paraschiv, Andra-Irina
2020-06-25 13:10       ` Stefan Hajnoczi
2020-06-25 17:36         ` Paraschiv, Andra-Irina
2020-06-22 20:03 ` [PATCH v4 18/18] MAINTAINERS: Add entry for the Nitro Enclaves driver Andra Paraschiv

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=798dbb9f-0fe4-9fd9-2e64-f6f2bc740abf@amazon.de \
    --to=graf@amazon.de \
    --cc=aliguori@amazon.com \
    --cc=andraprs@amazon.com \
    --cc=benh@kernel.crashing.org \
    --cc=colmmacc@amazon.com \
    --cc=doebel@amazon.de \
    --cc=dwmw@amazon.co.uk \
    --cc=fllinden@amazon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpohlack@amazon.de \
    --cc=msw@amazon.com \
    --cc=ne-devel-upstream@amazon.com \
    --cc=pbonzini@redhat.com \
    --cc=sblbir@amazon.com \
    --cc=sgarzare@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=trawets@amazon.com \
    --cc=uwed@amazon.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).