On Fri, Sep 10, 2021 at 22:21, Sean Christopherson wrote:
> > It's also possible that QEMU handles failure, but the kernel does two
> > passes; then QEMU can just do two passes.  The kernel will overall do four
> > passes, but:
> >
> > 1) the second (SECS pinned by children in the same vEPC) would be cheaper
> > than a full second pass
>
> The problem is that this would require a list_head (or temp allocations) to
> track the SECS pages that failed the first time 'round.  For vEPC
> destruction, the kernel can use sgx_epc_page.list because it can take the
> pages off the active/allocated list, but that's not an option in this case
> because the presumably-upcoming EPC cgroup needs to keep pages on the list
> to handle OOM.

Good point, so yeah: let's go for an ioctl that does full removal, returning
the number of failures.  I will try and cobble up a patch unless Kai beats me
to it.

Thanks for the quick discussion!

Paolo

> The kernel's ioctl/syscall/whatever could return the number of pages that
> were not freed, or maybe just -EAGAIN, and userspace could use that to know
> it needs to do another reset to free everything.
>
> My thought for QEMU was to do (bad pseudocode):
>
>         /* Retry to EREMOVE pinned SECS pages if necessary. */
>         ret = ioctl(SGX_VEPC_RESET, ...);
>         if (ret)
>                 ret = ioctl(SGX_VEPC_RESET, ...);
>
>         /*
>          * Tag the VM as needing yet another round of resets to EREMOVE SECS
>          * pages that were pinned across vEPC sections.
>          */
>         vm->sgx_epc_final_reset_needed = !!ret;
>
> > 2) the fourth would actually do nothing, because there would be no pages
> > failing the EREMOV'al.
> >
> > A hypothetical other SGX client that only uses one vEPC will do the right
> > thing with a single pass.
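
For concreteness, here is a minimal userspace-side sketch of the retry flow
being proposed above.  It assumes a hypothetical SGX_VEPC_REMOVE_ALL request
on the vEPC file descriptor that returns the number of pages it could not
EREMOVE; the name and request number below are placeholders, not whatever ABI
the eventual kernel patch defines.

    #include <sys/ioctl.h>

    #ifndef SGX_VEPC_REMOVE_ALL
    #define SGX_VEPC_REMOVE_ALL  _IO(0xa4, 0x04)   /* placeholder request number */
    #endif

    /*
     * Reset one vEPC section.  Returns 0 if every page was removed, a
     * positive count of pages still pinned by children in other sections,
     * or -1 with errno set on hard failure.
     */
    static int sgx_vepc_reset(int vepc_fd)
    {
        int failed;

        /* First pass: EREMOVE everything except SECS pages that still have children. */
        failed = ioctl(vepc_fd, SGX_VEPC_REMOVE_ALL);
        if (failed <= 0)
            return failed;

        /*
         * Second pass: the children that lived in this same section are now
         * gone, so their parent SECS pages can be EREMOVEd.  Anything still
         * left is pinned by a child in another vEPC section and needs one
         * more sweep after all sections have been reset.
         */
        return ioctl(vepc_fd, SGX_VEPC_REMOVE_ALL);
    }

On VM reset, QEMU would call this for every vEPC section and, if any call
returned a nonzero count, do one final sweep over all sections, mirroring the
two-pass pseudocode quoted above.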