linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kai Huang <kai.huang@intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>,
	Sean Christopherson <seanjc@google.com>,
	kvm@vger.kernel.org, x86@kernel.org, linux-sgx@vger.kernel.org,
	linux-kernel@vger.kernel.org, jarkko@kernel.org, luto@kernel.org,
	dave.hansen@intel.com, rick.p.edgecombe@intel.com,
	haitao.huang@intel.com, tglx@linutronix.de, mingo@redhat.com,
	hpa@zytor.com
Subject: Re: [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
Date: Wed, 24 Mar 2021 23:48:39 +1300	[thread overview]
Message-ID: <20210324234839.bf5bef54fd7a84030cf1bcf8@intel.com> (raw)
In-Reply-To: <236c0aa9-92f2-97c8-ab11-d55b9a98c931@redhat.com>

On Wed, 24 Mar 2021 11:09:20 +0100 Paolo Bonzini wrote:
> On 24/03/21 10:38, Kai Huang wrote:
> > Hi Sean, Boris, Paolo,
> > 
> > Thanks for the discussion. I tried to digest all your conversations and
> > hopefully I have understood you correctly. I pasted the new patch here
> > (not full patch, but relevant part only). I modified the error msg, added
> > some writeup to Documentation/x86/sgx.rst, and put Sean's explanation of this
> > bug to the commit msg (per Paolo). I am terrible Documentation writer, so
> > please help to check and give comments. Thanks!
> 
> I have some phrasing suggestions below but that was actually pretty good.
> 
> > ---
> > commit 1e297a535bcb4f51a08343c40207520017d85efe (HEAD)
> > Author: Kai Huang <kai.huang@intel.com>
> > Date:   Wed Jan 20 03:40:53 2021 +0200
> > 
> >      x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
> >      
> >      EREMOVE takes a page and removes any association between that page and
> >      an enclave.  It must be run on a page before it can be added into
> >      another enclave.  Currently, EREMOVE is run as part of pages being freed
> >      into the SGX page allocator.  It is not expected to fail.
> >      
> >      KVM does not track how guest pages are used, which means that SGX
> >      virtualization use of EREMOVE might fail.  Specifically, it is
> >      legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to
> >      KVM guest, because KVM/kernel doesn't track SECS pages.
> >
> >      Break out the EREMOVE call from the SGX page allocator.  This will allow
> >      the SGX virtualization code to use the allocator directly.  (SGX/KVM
> >      will also introduce a more permissive EREMOVE helper).
> 
> Ok, I think I got the source of my confusion.  The part in parentheses
> is the key.  It was not clear that KVM can deal with EREMOVE failures
> *without printing the error*.  Good!

Yes the "will also introduce a more premissive EREMOVE helper" is done in patch
5 (x86/sgx: Introduce virtual EPC for use by KVM guests).

> 
> >      Implement original sgx_free_epc_page() as sgx_encl_free_epc_page() to be
> >      more specific that it is used to free EPC page assigned host enclave.
> >      Replace sgx_free_epc_page() with sgx_encl_free_epc_page() in all call
> >      sites so there's no functional change.
> >      
> >      Improve error message when EREMOVE fails, and kernel is about to leak
> >      EPC page, which is likely a kernel bug.  This is effectively a kernel
> >      use-after-free of EPC, and due to the way SGX works, the bug is detected
> >      at freeing.  Rather than add the page back to the pool of available EPC,
> >      the kernel intentionally leaks the page to avoid additional errors in
> >      the future.
> >      
> >      Also add documentation to explain to user what is the bug and suggest
> >      user what to do when this bug happens, although extremely unlikely.
> 
> Rewritten:
> 
> EREMOVE takes a page and removes any association between that page and
> an enclave.  It must be run on a page before it can be added into
> another enclave.  Currently, EREMOVE is run as part of pages being freed
> into the SGX page allocator.  It is not expected to fail, as it would
> indicate a use-after-free of EPC.  Rather than add the page back to the
> pool of available EPC, the kernel intentionally leaks the page to avoid
> additional errors in the future.
> 
> However, KVM does not track how guest pages are used, which means that SGX
> virtualization use of EREMOVE might fail.  Specifically, it is
> legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to
> KVM guest, because KVM/kernel doesn't track SECS pages.
> 
> To allow SGX/KVM to introduce a more permissive EREMOVE helper and to
> let the SGX virtualization code use the allocator directly,
> break out the EREMOVE call from the SGX page allocator.  Rename the
> original sgx_free_epc_page() to sgx_encl_free_epc_page(),
> indicating that it is used to free EPC page assigned host enclave.
> Replace sgx_free_epc_page() with sgx_encl_free_epc_page() in all call
> sites so there's no functional change.
> 
> At the same time improve error message when EREMOVE fails, and add
> documentation to explain to user what is the bug and suggest user what
> to do when this bug happens, although extremely unlikely.

Thanks :)

> 
> > +Although extremely unlikely, EPC leaks can happen if kernel SGX bug happens,
> > +when a WARNING with below message is shown in dmesg:
> 
> Remove "Although extremely unlikely".

Will do.

> 
> > +"...EREMOVE returned ..., kernel bug likely.  EPC page leaked, SGX may become
> > +unusuable.  Please refer to Documentation/x86/sgx.rst for more information."
> > +
> > +This is effectively a kernel use-after-free of EPC, and due to the way SGX
> > +works, the bug is detected at freeing. Rather than add the page back to the pool
> > +of available EPC, the kernel intentionally leaks the page to avoid additional
> > +errors in the future.
> > +
> > +When this happens, kernel will likely soon leak majority of EPC pages, and SGX
> > +will likely become unusable. However while this may be fatal to SGX, other
> > +kernel functionalities are unlikely to be impacted, and should continue to work.
> > +
> > +As a result, when this happpens, user should stop running any new SGX workloads,
> > +(or just any new workloads), and migrate all valuable workloads, for instance,
> > +virtual machines, to other places.
> 
> Remove everything starting with "for instance".

Will do.

> 
>   Although a machine reboot can recover all
> > +EPC, debugging and fixing this bug is appreciated.
> 
> Replace the second part with "the bug should be reported to the Linux developers".
> The poor user is not expected to debug SGX. ;)

Hmm.. Makes sense :)

> 
> > +/* Error message for EREMOVE failure, when kernel is about to leak EPC page */
> > +#define EREMOVE_ERROR_MESSAGE \
> > +       "EREMOVE returned %d (0x%x), kernel bug likely.  EPC page leaked, SGX may become
> > unusuable.  Please refer to Documentation/x86/sgx.rst for more information."
> 
> Rewritten:
> 
> EREMOVE returned %d and an EPC page was leaked; SGX may become unusable.
> This is a kernel bug, refer to Documentation/x86/sgx.rst for more information.

Fine to me, although this would have %d (0x%x) -> %d change in the code.

> 
> Also please split it across multiple lines.
> 
> Paolo
> 

  reply	other threads:[~2021-03-24 10:49 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-19  7:29 [PATCH v3 00/25] KVM SGX virtualization support Kai Huang
2021-03-19  7:22 ` [PATCH v3 01/25] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Kai Huang
2021-03-19  7:22 ` [PATCH v3 02/25] x86/cpufeatures: Add SGX1 and SGX2 sub-features Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:22 ` [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page() Kai Huang
2021-03-22 18:16   ` Borislav Petkov
2021-03-22 18:56     ` Sean Christopherson
2021-03-22 19:11       ` Paolo Bonzini
2021-03-22 20:43         ` Kai Huang
2021-03-23 16:40           ` Paolo Bonzini
2021-03-22 19:15       ` Borislav Petkov
2021-03-22 19:37         ` Sean Christopherson
2021-03-22 20:36           ` Kai Huang
2021-03-22 21:06           ` Borislav Petkov
2021-03-22 22:06             ` Kai Huang
2021-03-22 22:37               ` Borislav Petkov
2021-03-22 23:16                 ` Kai Huang
2021-03-23 15:45                   ` Sean Christopherson
2021-03-23 16:06                     ` Borislav Petkov
2021-03-23 16:21                       ` Sean Christopherson
2021-03-23 16:32                         ` Borislav Petkov
2021-03-23 16:51                           ` Sean Christopherson
2021-03-24  9:38                           ` Kai Huang
2021-03-24 10:09                             ` Paolo Bonzini
2021-03-24 10:48                               ` Kai Huang [this message]
2021-03-24 11:24                                 ` Paolo Bonzini
2021-03-24 23:23                               ` Kai Huang
2021-03-24 23:39                                 ` Paolo Bonzini
2021-03-24 23:46                                   ` Kai Huang
2021-03-25  8:42                                     ` Borislav Petkov
2021-03-25  9:38                                       ` Kai Huang
2021-03-25 16:52                                         ` Borislav Petkov
2021-03-24  9:28                         ` Jarkko Sakkinen
2021-03-23 16:38                       ` Paolo Bonzini
2021-03-23 17:02                         ` Sean Christopherson
2021-03-23 17:06                           ` Paolo Bonzini
2021-03-23 17:16                             ` Sean Christopherson
2021-03-23 18:16                             ` Borislav Petkov
2021-03-24  9:26                       ` Jarkko Sakkinen
2021-03-22 22:23             ` Kai Huang
2021-03-25  9:30   ` [PATCH v4 " Kai Huang
2021-03-26 19:48     ` Jarkko Sakkinen
2021-03-26 20:38       ` Kai Huang
2021-03-26 21:39       ` Jarkko Sakkinen
2021-04-07 10:03     ` [tip: x86/sgx] " tip-bot2 for Kai Huang
2021-03-19  7:22 ` [PATCH v3 04/25] x86/sgx: Add SGX_CHILD_PRESENT hardware error code Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:22 ` [PATCH v3 05/25] x86/sgx: Introduce virtual EPC for use by KVM guests Kai Huang
2021-03-25  9:36   ` Kai Huang
2021-03-26 15:03   ` Borislav Petkov
2021-03-26 15:17     ` Dave Hansen
2021-03-26 15:29       ` Borislav Petkov
2021-03-26 15:35         ` Dave Hansen
2021-03-26 17:02           ` Borislav Petkov
2021-03-31  1:10     ` Kai Huang
2021-03-31  6:44       ` Boris Petkov
2021-03-31  6:51         ` Kai Huang
2021-03-31  7:44           ` Boris Petkov
2021-03-31  8:53             ` Kai Huang
2021-03-31 12:20               ` Kai Huang
2021-04-01 18:31                 ` Borislav Petkov
2021-04-01 23:38                   ` Kai Huang
2021-04-01  9:45               ` Kai Huang
2021-04-01  9:42   ` [PATCH v4 " Kai Huang
2021-04-05  9:01   ` [PATCH v3 " Borislav Petkov
2021-04-05 21:46     ` Kai Huang
2021-04-06  8:28       ` Borislav Petkov
2021-04-06  9:04         ` Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:22 ` [PATCH v3 06/25] x86/cpu/intel: Allow SGX virtualization without Launch Control support Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 07/25] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled Kai Huang
2021-04-02  9:48   ` Borislav Petkov
2021-04-02 11:08     ` Kai Huang
2021-04-02 11:22       ` Borislav Petkov
2021-04-02 11:38         ` Kai Huang
2021-04-02 15:42     ` Sean Christopherson
2021-04-02 19:08       ` Kai Huang
2021-04-02 19:19       ` Borislav Petkov
2021-04-02 19:30         ` Sean Christopherson
2021-04-02 19:46           ` Borislav Petkov
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Kai Huang
2021-03-19  7:23 ` [PATCH v3 08/25] x86/sgx: Expose SGX architectural definitions to the kernel Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 09/25] x86/sgx: Move ENCLS leaf definitions to sgx.h Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 10/25] x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT) Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 11/25] x86/sgx: Add encls_faulted() helper Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 12/25] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Kai Huang
2021-03-19  7:23 ` [PATCH v3 13/25] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM Kai Huang
2021-04-05  9:07   ` Borislav Petkov
2021-04-05 21:44     ` Kai Huang
2021-04-06  7:40       ` Borislav Petkov
2021-04-06  8:59         ` Kai Huang
2021-04-06  9:09           ` Borislav Petkov
2021-04-06  9:24             ` Kai Huang
2021-04-06  9:32               ` Borislav Petkov
2021-04-06  9:41                 ` Kai Huang
2021-04-06 17:08                   ` Borislav Petkov
2021-04-06 20:33                     ` Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 14/25] x86/sgx: Move provisioning device creation out of SGX driver Kai Huang
2021-04-07 10:03   ` [tip: x86/sgx] " tip-bot2 for Sean Christopherson
2021-03-19  7:23 ` [PATCH v3 15/25] KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX) Kai Huang
2021-03-19  7:23 ` [PATCH v3 16/25] KVM: x86: Define new #PF SGX error code bit Kai Huang
2021-03-19  7:23 ` [PATCH v3 17/25] KVM: x86: Add support for reverse CPUID lookup of scattered features Kai Huang
2021-03-19  7:23 ` [PATCH v3 18/25] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features Kai Huang
2021-03-19  7:23 ` [PATCH v3 19/25] KVM: VMX: Add basic handling of VM-Exit from SGX enclave Kai Huang
2021-03-19  7:23 ` [PATCH v3 20/25] KVM: VMX: Frame in ENCLS handler for SGX virtualization Kai Huang
2021-03-19  7:23 ` [PATCH v3 21/25] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions Kai Huang
2021-03-19  7:23 ` [PATCH v3 22/25] KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs Kai Huang
2021-03-19  7:23 ` [PATCH v3 23/25] KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC) Kai Huang
2021-03-19  7:23 ` [PATCH v3 24/25] KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC Kai Huang
2021-03-19  7:24 ` [PATCH v3 25/25] KVM: x86: Add capability to grant VM access to privileged SGX attribute Kai Huang
2021-03-19 14:52 ` [PATCH v3 00/25] KVM SGX virtualization support Jarkko Sakkinen
2021-03-22 10:03   ` Kai Huang
2021-03-22 10:31     ` Borislav Petkov
2021-03-26 22:46 ` Jarkko Sakkinen
2021-03-28 21:01   ` Huang, Kai
2021-03-31 23:23     ` Jarkko Sakkinen
  -- strict thread matches above, loose matches on Subject: below --
2021-03-09  1:39 [PATCH v2 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page() Kai Huang
2021-03-11  2:01 ` [PATCH v3 " Kai Huang
2021-03-12 21:21   ` Sean Christopherson
2021-03-13 10:45     ` Jarkko Sakkinen
2021-03-15  7:12       ` Kai Huang
2021-03-15 13:18         ` Jarkko Sakkinen
2021-03-15 13:19           ` Jarkko Sakkinen
2021-03-15 20:29             ` Kai Huang
2021-03-15 22:59               ` Jarkko Sakkinen
2021-03-15 23:50                 ` Kai Huang
2021-03-15 23:11               ` Jarkko Sakkinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210324234839.bf5bef54fd7a84030cf1bcf8@intel.com \
    --to=kai.huang@intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=haitao.huang@intel.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).