From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754434AbcCNSf5 (ORCPT ); Mon, 14 Mar 2016 14:35:57 -0400
Received: from mail-ig0-f173.google.com ([209.85.213.173]:36214 "EHLO
        mail-ig0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753998AbcCNSfy (ORCPT );
        Mon, 14 Mar 2016 14:35:54 -0400
MIME-Version: 1.0
In-Reply-To: <20160314131500.21b9f6c5@jules-lenovo3>
References: <441AF596.F6E66BC9@tv-sign.ru>
        <20060317125607.78a5dbe4.akpm@osdl.org>
        <441C0741.3BC25010@tv-sign.ru>
        <441C2AA0.3080200@us.ibm.com>
        <441C4263.B779CDA8@tv-sign.ru>
        <20160314131500.21b9f6c5@jules-lenovo3>
Date: Mon, 14 Mar 2016 11:35:53 -0700
X-Google-Sender-Auth: AWQhSrcUtP7RUTDgrmA-M5l8QtY
Message-ID:
Subject: Re: unshare(CLONE_VM) Re: [PATCH] unshare: Use rcu_assign_pointer
        when setting sighand
From: Linus Torvalds
To: Julian Smith
Cc: Oleg Nesterov , Janak Desai , Andrew Morton ,
        "Eric W. Biederman" , Linux Kernel Mailing List
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 14, 2016 at 6:15 AM, Julian Smith wrote:
>
> I'm looking into whether it would be possible to extend the unshare
> syscall to support the CLONE_VM flag with multi-threaded processes,
> because this would allow us at Undo to record multi-threaded user
> processes much more efficiently than at present.

At the point where you want to unshare the VM, you should just use a
full clone() instead.

The thing is, unsharing the VM absolutely _requires_ you to also
unshare signals and some other state too (we require that thread
groups are in the same VM, for example, but also the child tid
information etc etc).

And the whole "copy VM" case is so expensive that at that point
there's no advantage to "unshare", you might as well just do a full
clone() (perhaps still sharing file descriptors and fs state).
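
To make the "full clone() that copies the VM but still shares file
descriptors and fs state" case concrete, here is a minimal sketch,
assuming Linux and glibc's clone() wrapper. The names
run_with_private_vm, child_fn, shared_probe and CHILD_STACK_SIZE are
invented for the illustration, and the flag set is just one plausible
choice rather than anything settled in this thread:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stack size for the child; any reasonable size works. */
#define CHILD_STACK_SIZE (256 * 1024)

/* Written only by the child. Without CLONE_VM the child writes to its
 * own copy, so the parent's copy must stay 0. */
static int shared_probe = 0;

/* Runs in the child. Its return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    int value = *(int *)arg;

    shared_probe = value;   /* touches the child's private copy only */
    return shared_probe;
}

/* Hypothetical helper: run child_fn in a child that gets a copy of the
 * address space (no CLONE_VM) but shares the fd table and fs state.
 * 'value' should be in 0..255 so it survives as an exit status.
 * Returns the child's exit status, or -1 on error. */
int run_with_private_vm(int value)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* No CLONE_VM here: the VM is copied, like fork(). CLONE_FILES and
     * CLONE_FS keep descriptors and cwd/umask shared. SIGCHLD makes the
     * child waitable with a plain waitpid(). */
    int flags = CLONE_FILES | CLONE_FS | SIGCHLD;

    /* The stack grows down on the usual architectures, so pass the top. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE, flags, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_with_private_vm(42) should hand back the child's exit
status while shared_probe in the caller stays 0, because the child's
write landed in its own copy of the VM. All the real cost is in that
copy and its later tear-down.
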

So while I think an

    unshare(CLONE_VM | CLONE_SIGHAND | CLONE_THREAD |
            CLONE_CHILD_CLEARTID | CLONE_CHILD_SETTID);

might be possible from a technical standpoint, I'm not seeing the huge
advantage to users vs just doing something like

    clone(new_vm_function, NULL,
          CLONE_VFORK | CLONE_FILES | CLONE_FS | CLONE_PARENT..);
    _exit();

(fixup details to actually work - the above is meant more as a
"something remotely like this" rather than actually equivalent)

The costs of forking and exiting a thread are almost all about just
the VM copying and tear-down, so an "unshare(CLONE_VM)" is
fundamentally not a cheap operation (and at the other end of the
spectrum: an exit of a thread where there are other sharing threads is
fundamentally quite cheap, because it just ends up decrementing
counters).

So my gut feel is that no, we really don't want unshare(CLONE_VM),
because it *is* a very complicated operation and doesn't actually
perform any better than just cloning.

And the "it is a very complicated operation" really comes not from the
fact that we can't copy the VM - we have that support already - but
from the fact that CLONE_VM really does go hand-in-hand with so many
special cases. Oleg pointed out that mm->core_waiters thing last time
around, but that just ends up being a detail: the whole VM sharing
just ends up being very central to a lot of small details..

              Linus