Linux-Sgx Archive on lore.kernel.org
 help / color / Atom feed
From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: x86@kernel.org, linux-sgx@vger.kernel.org,
	akpm@linux-foundation.org, dave.hansen@intel.com,
	nhorman@redhat.com, npmccallum@redhat.com, serge.ayoun@intel.com,
	shay.katz-zamir@intel.com, haitao.huang@intel.com,
	andriy.shevchenko@linux.intel.com, tglx@linutronix.de,
	kai.svahn@intel.com, bp@alien8.de, josh@joshtriplett.org,
	luto@kernel.org, kai.huang@intel.com, rientjes@google.com,
	Suresh Siddha <suresh.b.siddha@intel.com>
Subject: Re: [PATCH v19 16/27] x86/sgx: Add the Linux SGX Enclave Driver
Date: Thu, 21 Mar 2019 10:38:13 -0700
Message-ID: <20190321173813.GF6519@linux.intel.com> (raw)
In-Reply-To: <20190321161827.GT4603@linux.intel.com>

On Thu, Mar 21, 2019 at 06:18:27PM +0200, Jarkko Sakkinen wrote:
> On Tue, Mar 19, 2019 at 04:00:47PM -0700, Sean Christopherson wrote:
> > On Sun, Mar 17, 2019 at 11:14:45PM +0200, Jarkko Sakkinen wrote:
> > > Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> > > can be used by applications to set aside private regions of code and
> > > data. The code outside the enclave is disallowed to access the memory
> > > inside the enclave by the CPU access control.
> > > 
> > > This commit adds the Linux SGX Enclave Driver that provides an ioctl API
> > > to manage enclaves. The address range for an enclave, commonly referred
> > > as ELRANGE in the documentation (e.g. Intel SDM), is reserved with
> > > mmap() against /dev/sgx. After that a set ioctls is used to build
> > > the enclave to the ELRANGE.
> > 
> > 
> > ...
> > 
> > > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > > new file mode 100644
> > > index 000000000000..bd8bcd748976
> > > --- /dev/null
> > > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > 
> > ...
> > 
> > > +/**
> > > + * sgx_encl_next_mm() - Iterate to the next mm
> > > + * @encl:	an enclave
> > > + * @mm:		an mm list entry
> > > + * @iter:	iterator status
> > > + *
> > > + * Return: the enclave mm or NULL
> > > + */
> > > +struct sgx_encl_mm *sgx_encl_next_mm(struct sgx_encl *encl,
> > > +				     struct sgx_encl_mm *mm, int *iter)
> > > +{
> > > +	struct list_head *entry;
> > > +
> > > +	WARN(!encl, "%s: encl is NULL", __func__);
> > > +	WARN(!iter, "%s: iter is NULL", __func__);
> > > +
> > > +	spin_lock(&encl->mm_lock);
> > > +
> > > +	entry = mm ? mm->list.next : encl->mm_list.next;
> > > +	WARN(!entry, "%s: entry is NULL", __func__);
> > > +
> > > +	if (entry == &encl->mm_list) {
> > > +		mm = NULL;
> > > +		*iter = SGX_ENCL_MM_ITER_DONE;
> > > +		goto out;
> > > +	}
> > > +
> > > +	mm = list_entry(entry, struct sgx_encl_mm, list);
> > > +
> > > +	if (!kref_get_unless_zero(&mm->refcount)) {
> > > +		*iter = SGX_ENCL_MM_ITER_RESTART;
> > > +		mm = NULL;
> > > +		goto out;
> > > +	}
> > > +
> > > +	if (!atomic_add_unless(&mm->mm->mm_count, 1, 0)) {
> > 
> > This is a use-after-free scenario if mm_count==0.  Once the count goes
> > to zero, __mmdrop() begins, at which point this code is racing against
> > free_mm().  What you want here (or rather, in flows where mm != current->mm
> > and you want to access PTEs) is mmget_not_zero(), i.e. "unless zero"
> > on mm_users.  mm_count prevents the mm_struct from being freed, but
> > doesn't protect the page tables.  mm_users protects the page tables,
> > i.e. lets us safely call sgx_encl_test_and_clear_young in the reclaimer.
> > 
> > To ensure liveliness of the mm itself, register an mmu_notifier for each
> > mm_struct (I think in sgx_vma_open()).  The enclave's .release callback
> > would then delete the mm from its list and drop its reference (exit_mmap()
> > holds a reference to mm_count so it's safe to do mmdrop() in the .release
> > callback).  E.g.:
> > 
> > static void sgx_vma_open(struct vm_area_struct *vma)
> > {
> > 	...
> > 
> > 	rcu_read_lock();
> > 	list_for_each_entry_rcu(...) {
> > 		if (vma->vm_mm == tmp->mm) {
> > 			encl_mm = tmp;
> > 			break;
> > 		}
> > 	}
> > 	rcu_read_unlock();
> > 
> > 	if (!encl_mm) {
> > 		mm = kzalloc(sizeof(*mm), GFP_KERNEL);
> > 		if (!mm) {
> > 			goto error;
> > 
> > 		encl_mm->encl = encl;
> > 		encl_mm->mm = vma->vm_mm;
> > 
> > 		if (mmu_notifier_register(&encl->mmu_notifier, encl_mm)) {
> > 			kfree(encl_mm);
> > 			goto error;
> > 		}
> 
> OK, thanks for catching the bug. I'm cool with adding MMU notifier back.
> Just wondering when unregister should be called.

We'd need to refcount the encl_mm and unregister its callback when its
refcount goes to zero.  I dislike the idea of refcounting both encl and
encl_mm, but it does seem to be the most (only?) robust solution.

The alternative is to not refcount encl_mm, e.g. unregister the callback
when the encl itself is freed, but then there is no way to detect when
the last vma is closed, i.e. we have to hold the encl_mm until the entire
mm_struct dies.

> 
> > 
> > 		spin_lock(&encl->mm_lock);
> > 		list_add(&encl_mm->list, &encl->mm_list);
> > 		spin_unlock(&encl->mm_lock);
> > 	}
> > 
> > 	...
> > error:
> > 	<not sure what should go here if we don't kill the enclave>
> > }
> > 
> > static void sgx_encl_mmu_release(struct mmu_notifier *mn, struct mm_struct *mm)
> > {
> > 	struct sgx_encl_mm *encl_mm =
> > 		container_of(mn, struct sgx_encl_mm, mmu_notifier);
> > 
> > 	spin_lock(encl_mm->encl->mm_lock);
> > 	list_del_rcu(&encl_mm->list);
> > 	spin_unlock(encl_mm->encl->mm_lock);
> > 
> > 	synchronize_rcu();
> > 
> > 	mmdrop(mm);
> > }
> > 
> > Alternatively, the sgx_encl_mmu_release() could mark the encl_mm as dead
> > instead of removing it from the list, but I don't think that'd mesh well
> > with an RCU list, i.e. we'd need a regular lock-protected list and a
> > custom walker.
> > 
> > The only downside with the RCU approach that I can think of is that the
> > encl_mm would stay on the enclave's list until the enclave or the mm
> > itself died.  That could result in unnecessary IPIs during reclaim (or
> > invalidation), but that seems like a minor corner case that could be
> > avoided in userspace, e.g. don't mmap() an enclave unless you actually
> > plan on running it.
> 
> Yeah, that is really the root why ended up what I have i.e to be able
> to move them real time. If they can be in the list forever, then RCU
> is doable. I was wondering with your RCU comments how you would deal
> with this.

Aha!  Similar to the old 1:1 encl:mm approach, the release callback would
simply mark the associated entity "dead", in this case the encl_mm.  We'd
still refcount encl_mm and handle unregistering and whatnot when the last
vma is closed, i.e. refcount goes to zero.  Marking the encl_mm as dead
instead of trying to delete it from the list avoids scenarios where we're
holding a reference to the encl_mm but it's no longer on the list.

The synchronize_srcu() during release ensures we don't operate on a dead
enclave.  And the only time we'd take mm_lock is to insert/delete to/from
the list, i.e. vma open/close, thus sidestepping lock ordering issues
between mm_lock and mmap_sem.  Traversing the list in the fault handler
can be avoided by nullifying vm_private_data or by checking the liveliness
of the enclave iself.

E.g.:

static void sgx_encl_mmu_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
        struct sgx_encl_mm *encl_mm =
                container_of(mn, struct sgx_encl_mm, mmu_notifier);

	encl_mm->dead = true;

        synchronize_srcu(&encl_srcu);
}


And reclaim looks something like:

static void sgx_reclaimer_block(struct sgx_epc_page *epc_page)
{
	...

	id = srcu_read_lock(&encl_srcu);

	list_for_each_entry_rcu(...) {
		if (encl_mm->dead)
			continue;

		down_read(&encl_mm->mm->mmap_sem);

		ret = sgx_encl_find(encl_mm->mm, addr, &vma);
		if (!ret && encl == vma->vm_private_data)
			zap_vma_ptes(vma, addr, PAGE_SIZE);

		up_read(&encl_mm->mm->mmap_sem);
	}

	srcu_read_unlock(&encl_srcu, id);

	mutex_lock(&encl->lock);

	if (!(encl->flags & SGX_ENCL_DEAD)) {
		ret = __eblock(sgx_epc_addr(epc_page));
		if (encls_failed(ret))
			ENCLS_WARN(ret, "EBLOCK");
	}

	mutex_unlock(&encl->lock);
}



  reply index

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-17 21:14 [PATCH v19 00/27] Intel SGX1 support Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 01/27] x86/cpufeatures: Add Intel-defined SGX feature bit Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 02/27] x86/cpufeatures: Add SGX sub-features (as Linux-defined bits) Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 03/27] x86/msr: Add IA32_FEATURE_CONTROL.SGX_ENABLE definition Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 04/27] x86/cpufeatures: Add Intel-defined SGX_LC feature bit Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 05/27] x86/msr: Add SGX Launch Control MSR definitions Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 06/27] x86/mm: x86/sgx: Add new 'PF_SGX' page fault error code bit Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 07/27] x86/mm: x86/sgx: Signal SIGSEGV for userspace #PFs w/ PF_SGX Jarkko Sakkinen
2019-03-18 17:15   ` Dave Hansen
2019-03-18 19:53     ` Sean Christopherson
2019-03-17 21:14 ` [PATCH v19 08/27] x86/cpu/intel: Detect SGX support and update caps appropriately Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 09/27] x86/sgx: Add ENCLS architectural error codes Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 10/27] x86/sgx: Add SGX1 and SGX2 architectural data structures Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 11/27] x86/sgx: Add definitions for SGX's CPUID leaf and variable sub-leafs Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 12/27] x86/sgx: Enumerate and track EPC sections Jarkko Sakkinen
2019-03-18 19:50   ` Sean Christopherson
2019-03-21 14:40     ` Jarkko Sakkinen
2019-03-21 15:28       ` Sean Christopherson
2019-03-22 10:19         ` Jarkko Sakkinen
2019-03-22 10:50           ` Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 13/27] x86/sgx: Add wrappers for ENCLS leaf functions Jarkko Sakkinen
2019-03-19 19:59   ` Sean Christopherson
2019-03-21 14:51     ` Jarkko Sakkinen
2019-03-21 15:40       ` Sean Christopherson
2019-03-22 11:00         ` Jarkko Sakkinen
2019-03-22 16:43           ` Sean Christopherson
2019-03-17 21:14 ` [PATCH v19 16/27] x86/sgx: Add the Linux SGX Enclave Driver Jarkko Sakkinen
2019-03-19 21:19   ` Sean Christopherson
2019-03-21 15:51     ` Jarkko Sakkinen
2019-03-21 16:47       ` Sean Christopherson
2019-03-22 11:10         ` Jarkko Sakkinen
2019-03-26 13:26       ` Jarkko Sakkinen
2019-03-26 23:58         ` Sean Christopherson
2019-03-27  5:28           ` Jarkko Sakkinen
2019-03-27 17:57             ` Sean Christopherson
2019-03-27 18:38             ` Jethro Beekman
2019-03-27 20:06               ` Sean Christopherson
2019-03-28  1:21                 ` Jethro Beekman
2019-03-28 13:19                 ` Jarkko Sakkinen
2019-03-28 19:05                   ` Andy Lutomirski
2019-03-29  9:43                     ` Jarkko Sakkinen
2019-03-29 16:20                     ` Sean Christopherson
2019-04-01 10:01                       ` Jarkko Sakkinen
2019-04-01 17:25                         ` Jethro Beekman
2019-04-01 22:57                           ` Jarkko Sakkinen
2019-03-28 13:15               ` Jarkko Sakkinen
2019-03-19 23:00   ` Sean Christopherson
2019-03-21 16:18     ` Jarkko Sakkinen
2019-03-21 17:38       ` Sean Christopherson [this message]
2019-03-22 11:17         ` Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 17/27] x86/sgx: Add provisioning Jarkko Sakkinen
2019-03-19 20:09   ` Sean Christopherson
2019-03-21  2:08     ` Huang, Kai
2019-03-21 14:32       ` Jarkko Sakkinen
2019-03-21 21:41         ` Huang, Kai
2019-03-22 11:31           ` Jarkko Sakkinen
2019-03-21 14:30     ` Jarkko Sakkinen
2019-03-21 14:38   ` Nathaniel McCallum
2019-03-22 11:22     ` Jarkko Sakkinen
2019-03-21 16:50   ` Andy Lutomirski
2019-03-22 11:29     ` Jarkko Sakkinen
2019-03-22 11:43       ` Jarkko Sakkinen
2019-03-22 18:20         ` Andy Lutomirski
2019-03-25 14:55           ` Jarkko Sakkinen
2019-03-27  0:14             ` Sean Christopherson
2019-04-05 10:18             ` Jarkko Sakkinen
2019-04-05 13:53               ` Andy Lutomirski
2019-04-05 14:20                 ` Jarkko Sakkinen
2019-04-05 14:34                   ` Greg KH
2019-04-09 13:37                     ` Jarkko Sakkinen
2019-04-05 14:21                 ` Greg KH
2019-03-17 21:14 ` [PATCH v19 19/27] x86/sgx: ptrace() support for the SGX driver Jarkko Sakkinen
2019-03-19 22:22   ` Sean Christopherson
2019-03-21 15:02     ` Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 20/27] x86/vdso: Add support for exception fixup in vDSO functions Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 21/27] x86/fault: Add helper function to sanitize error code Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 22/27] x86/fault: Attempt to fixup unhandled #PF in vDSO before signaling Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 23/27] x86/traps: Attempt to fixup exceptions " Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 25/27] x86/sgx: SGX documentation Jarkko Sakkinen
2019-03-20 17:14   ` Sean Christopherson
2019-03-21 16:24     ` Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 26/27] selftests/x86: Add a selftest for SGX Jarkko Sakkinen
2019-03-17 21:14 ` [PATCH v19 27/27] x86/sgx: Update MAINTAINERS Jarkko Sakkinen
2019-03-19 17:12   ` Sean Christopherson
2019-03-21 14:42     ` Jarkko Sakkinen
     [not found] ` <20190317211456.13927-19-jarkko.sakkinen@linux.intel.com>
2019-03-19 22:09   ` [PATCH v19 18/27] x86/sgx: Add swapping code to the core and SGX driver Sean Christopherson
2019-03-21 14:59     ` Jarkko Sakkinen
2019-03-19 23:41 ` [PATCH v19 00/27] Intel SGX1 support Sean Christopherson
2019-03-19 23:52   ` Jethro Beekman
2019-03-20  0:22     ` Sean Christopherson
2019-03-21 16:20     ` Jarkko Sakkinen
2019-03-21 16:00   ` Jarkko Sakkinen

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190321173813.GF6519@linux.intel.com \
    --to=sean.j.christopherson@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=haitao.huang@intel.com \
    --cc=jarkko.sakkinen@linux.intel.com \
    --cc=josh@joshtriplett.org \
    --cc=kai.huang@intel.com \
    --cc=kai.svahn@intel.com \
    --cc=linux-sgx@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=nhorman@redhat.com \
    --cc=npmccallum@redhat.com \
    --cc=rientjes@google.com \
    --cc=serge.ayoun@intel.com \
    --cc=shay.katz-zamir@intel.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Sgx Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-sgx/0 linux-sgx/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-sgx linux-sgx/ https://lore.kernel.org/linux-sgx \
		linux-sgx@vger.kernel.org linux-sgx@archiver.kernel.org
	public-inbox-index linux-sgx


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-sgx


AGPL code for this site: git clone https://public-inbox.org/ public-inbox