From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-yw0-f182.google.com ([209.85.161.182]:32831 "EHLO
	mail-yw0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756174AbcIRT6H (ORCPT );
	Sun, 18 Sep 2016 15:58:07 -0400
Received: by mail-yw0-f182.google.com with SMTP id i129so122449115ywb.0
	for ; Sun, 18 Sep 2016 12:58:07 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20160918184507.GT10601@decadent.org.uk>
References: <1474211117-16674-1-git-send-email-jann@thejh.net>
	<1474211117-16674-3-git-send-email-jann@thejh.net>
	<1474222407.2428.2.camel@decadent.org.uk>
	<20160918183137.GA17170@pc.thejh.net>
	<20160918184507.GT10601@decadent.org.uk>
From: Andy Lutomirski
Date: Sun, 18 Sep 2016 12:57:46 -0700
Message-ID:
Subject: Re: [PATCH 2/9] exec: turn self_exec_id into self_privunit_id
To: Ben Hutchings
Cc: Thomas Gleixner, Stephen Smalley, Andrew Morton,
	"security@kernel.org", James Morris, Janis Danisevskis,
	Casey Schaufler, Roland McGrath, Kees Cook, Alexander Viro,
	LSM List, "Serge E. Hallyn", Jann Horn, "Eric . Biederman",
	Paul Moore, Linux FS Devel, Oleg Nesterov, Benjamin LaHaise,
	Eric Paris, Seth Forshee, John Johansen
Content-Type: text/plain; charset=UTF-8
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Sep 18, 2016 8:45 AM, "Ben Hutchings" wrote:
>
> On Sun, Sep 18, 2016 at 08:31:37PM +0200, Jann Horn wrote:
> > On Sun, Sep 18, 2016 at 07:13:27PM +0100, Ben Hutchings wrote:
> > > On Sun, 2016-09-18 at 17:05 +0200, Jann Horn wrote:
> > > > This ensures that self_privunit_id ("privilege unit ID") is only shared by
> > > > processes that share the mm_struct and the signal_struct; not just
> > > > spatially, but also temporally. In other words, if you do execve() or
> > > > clone() without CLONE_THREAD, you get a new privunit_id that has never been
> > > > used before.
> > > [...]
> > > > +void increment_privunit_counter(void)
> > > > +{
> > > > +	BUILD_BUG_ON(NR_CPUS > (1 << 16));
> > > > +	current->self_privunit_id = this_cpu_add_return(exec_counter, NR_CPUS);
> > > > +}
> > > [...]
> > >
> > > This will wrap incorrectly if NR_CPUS is not a power of 2 (which is
> > > unusual but allowed).
> >
> > If this wraps, hell breaks loose permission-wise - processes that have
> > no relationship whatsoever with each other will suddenly be able to ptrace
> > each other.
> >
> > The idea is that it never wraps.
>
> That's what I suspected, but wasn't sure. In that case you can
> initialise each counter to U64_MAX/NR_CPUS*cpu and increment by
> 1 each time, which might be more efficient on some architectures.
>
> > It wraps after (2^64)/NR_CPUS execs or
> > forks on one CPU core. NR_CPUS is bounded to <=2^16, so in the worst case,
> > it wraps after 2^48 execs or forks.
> >
> > On my system with 3.7GHz per core, 2^16 minimal sequential non-thread clone()
> > calls need 1 second system time (and 2 seconds wall clock time, but let's
> > disregard that), so 2^48 non-thread clone() calls should need over 100 years.
> >
> > But I guess both the kernel and machines get faster - if you think the margin
> > might not be future-proof enough (or if you think I measured wrong and it's
> > actually much faster), I guess I could bump this to a 128bit number.
>
> Sequential execution speed isn't likely to get significantly faster so
> with those current numbers this seems to be quite safe.
>

But how big can NR_CPUS get before this gets uncomfortable? We could do:

struct luid {
	u64 count;
	unsigned cpu;
};

(LUID = locally unique ID).

IIRC my draft PCID code does something similar to uniquely identify
mms. If I accidentally reused a PCID without a flush, everything
would explode.

--Andy
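
For illustration only, here is a minimal userspace sketch of the
(count, cpu) pair suggested above. It is not code from the patch series
or from the PCID draft. DEMO_NR_CPUS, demo_counter, luid_alloc() and
luid_equal() are hypothetical names, and a plain array stands in for the
kernel's per-CPU data. What it is meant to show is that each CPU keeps
its full 64-bit counter range, so the wrap margin presumably no longer
depends on NR_CPUS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for NR_CPUS, kept small for the demo. */
#define DEMO_NR_CPUS 4

/* Locally unique ID: a per-CPU count tagged with the issuing CPU. */
struct luid {
	uint64_t count;
	unsigned int cpu;
};

/* One counter per CPU; the kernel would use real per-CPU storage here. */
static uint64_t demo_counter[DEMO_NR_CPUS];

/* Hand out a fresh ID on the given CPU by bumping that CPU's counter. */
static struct luid luid_alloc(unsigned int cpu)
{
	struct luid id;

	assert(cpu < DEMO_NR_CPUS);
	id.count = ++demo_counter[cpu];
	id.cpu = cpu;
	return id;
}

/* Two IDs match only if both the count and the CPU match. */
static bool luid_equal(struct luid a, struct luid b)
{
	return a.count == b.count && a.cpu == b.cpu;
}
```

Two IDs with the same count but issued on different CPUs compare as
different, which is the property the stride-by-NR_CPUS counter gets by
partitioning a single u64.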