On Fri, Sep 10, 2021 at 22:21, Sean Christopherson wrote:
> > It's also possible that QEMU handles failure, but the kernel does two
> > passes; then QEMU can just do two passes.  The kernel will overall do four
> > passes, but:
> >
> > 1) the second (SECS pinned by children in the same vEPC) would be cheaper
> > than a full second pass
>
> The problem is that this would require a list_head (or temp allocations) to
> track the SECS pages that failed the first time 'round.  For vEPC
> destruction, the kernel can use sgx_epc_page.list because it can take the
> pages off the active/allocated list, but that's not an option in this case
> because the presumably-upcoming EPC cgroup needs to keep pages on the list
> to handle OOM.

Good point, so yeah: let's go for an ioctl that does full removal, returning
the number of failures.  I will try and cobble up a patch unless Kai beats me
to it.

Thanks for the quick discussion!

Paolo

> The kernel's ioctl/syscall/whatever could return the number of pages that
> were not freed, or maybe just -EAGAIN, and userspace could use that to know
> it needs to do another reset to free everything.
>
> My thought for QEMU was to do (bad pseudocode):
>
>         /* Retry to EREMOVE pinned SECS pages if necessary. */
>         ret = ioctl(SGX_VEPC_RESET, ...);
>         if (ret)
>                 ret = ioctl(SGX_VEPC_RESET, ...);
>
>         /*
>          * Tag the VM as needing yet another round of resets to EREMOVE SECS
>          * pages that were pinned across vEPC sections.
>          */
>         vm->sgx_epc_final_reset_needed = !!ret;
>
> > 2) the fourth would actually do nothing, because there would be no pages
> > failing the EREMOV'al.
> >
> > A hypothetical other SGX client that only uses one vEPC will do the right
> > thing with a single pass.
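
For concreteness, here is a minimal userspace-side sketch of the retry flow
being proposed above.  It assumes a hypothetical SGX_VEPC_REMOVE_ALL request
on the vEPC file descriptor that returns the number of pages it could not
EREMOVE; the name and request number below are placeholders, not whatever ABI
the eventual kernel patch defines.

    #include <sys/ioctl.h>

    #ifndef SGX_VEPC_REMOVE_ALL
    #define SGX_VEPC_REMOVE_ALL  _IO(0xa4, 0x04)   /* placeholder request number */
    #endif

    /*
     * Reset one vEPC section.  Returns 0 if every page was removed, a
     * positive count of pages still pinned by children in other sections,
     * or -1 with errno set on hard failure.
     */
    static int sgx_vepc_reset(int vepc_fd)
    {
        int failed;

        /* First pass: EREMOVE everything except SECS pages that still have children. */
        failed = ioctl(vepc_fd, SGX_VEPC_REMOVE_ALL);
        if (failed <= 0)
            return failed;

        /*
         * Second pass: the children that lived in this same section are now
         * gone, so their parent SECS pages can be EREMOVEd.  Anything still
         * left is pinned by a child in another vEPC section and needs one
         * more sweep after all sections have been reset.
         */
        return ioctl(vepc_fd, SGX_VEPC_REMOVE_ALL);
    }

On VM reset, QEMU would call this for every vEPC section and, if any call
returned a nonzero count, do one final sweep over all sections, mirroring the
two-pass pseudocode quoted above.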