On Wed, 2021-04-14 at 09:34 +0200, Johannes Berg wrote:
> On Wed, 2021-04-14 at 08:22 +0100, Anton Ivanov wrote:
> > On 14/04/2021 06:52, Andrei Vagin wrote:
> > > We already have process_vm_readv and process_vm_writev to read and
> > > write to a process memory faster than we can do this with ptrace.
> > > And now it is time for process_vm_exec that allows executing code
> > > in an address space of another process. We can do this with ptrace
> > > but it is much slower.
> > >
> > > = Use-cases =
> > >
> > > Here are two known use-cases. The first one is “application kernel”
> > > sandboxes like User-mode Linux and gVisor. In this case, we have a
> > > process that runs the sandbox kernel and a set of stub processes
> > > that are used to manage guest address spaces. Guest code is
> > > executed in the context of stub processes but all system calls are
> > > intercepted and handled in the sandbox kernel. Right now, these
> > > sort of sandboxes use PTRACE_SYSEMU to trap system calls, but the
> > > process_vm_exec can significantly speed them up.
> >
> > Certainly interesting, but will require um to rework most of its
> > memory management and we will most likely need extra mm support to
> > make use of it in UML. We are not likely to get away just with one
> > syscall there.
>
> Might help the seccomp mode though:
>
> https://patchwork.ozlabs.org/project/linux-um/list/?series=231980

Hmm, to me it sounds like it replaces both ptrace and seccomp mode
while completely avoiding the scheduling overhead that these techniques
have. I think everything UML needs is covered:

 * The new API can do syscalls in the target memory space
   (we can modify the address space)
 * The new API can run code until the next syscall happens
   (or a signal happens, which means SIGALRM for scheduling works)
 * Single step tracing should work by setting EFLAGS

I think the memory management itself stays fundamentally the same. We
just do the initial clone() using CLONE_STOPPED. We don't need any stub
code/data and we have everything we need to modify the address space
and run the userspace process.

Benjamin
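
To illustrate the single-step point: on x86 this would amount to setting
the trap flag (TF, bit 8) in the EFLAGS value that gets loaded when the
target resumes. Below is a minimal sketch only. The saved_regs structure
and the set_single_step() helper are hypothetical stand-ins for whatever
register state the new API actually takes; the real layout is not given
in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trap flag: bit 8 of the x86 EFLAGS/RFLAGS register. */
#define X86_EFLAGS_TF (UINT64_C(1) << 8)

/*
 * Hypothetical saved register state for the target (assumption:
 * the real structure passed to the API will look different).
 */
struct saved_regs {
	uint64_t rip;
	uint64_t eflags;
};

/*
 * Turn single-stepping on or off by setting or clearing TF in the
 * saved flags. All other flag bits are left untouched.
 */
static inline void set_single_step(struct saved_regs *regs, int enable)
{
	assert(regs != NULL);

	if (enable)
		regs->eflags |= X86_EFLAGS_TF;
	else
		regs->eflags &= ~X86_EFLAGS_TF;
}
```

With TF set, the CPU raises a debug exception after one instruction,
which should show up as a SIGTRAP and so return control just like any
other signal would.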