In-Reply-To: <20140625165209.GA14720@redhat.com>
References: <1403642893-23107-1-git-send-email-keescook@chromium.org>
 <1403642893-23107-10-git-send-email-keescook@chromium.org>
 <20140625142121.GD7892@redhat.com>
 <20140625165209.GA14720@redhat.com>
Date: Wed, 25 Jun 2014 10:09:00 -0700
Subject: Re: [PATCH v8 9/9] seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
From: Kees Cook
To: Oleg Nesterov
Cc: LKML, Andy Lutomirski, "Michael Kerrisk (man-pages)",
 Alexei Starovoitov, Andrew Morton, Daniel Borkmann, Will Drewry,
 Julien Tinnes, David Drysdale, Linux API, "x86@kernel.org",
 "linux-arm-kernel@lists.infradead.org", linux-mips@linux-mips.org,
 linux-arch, linux-security-module
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 25, 2014 at 9:52 AM, Oleg Nesterov wrote:
> On 06/25, Kees Cook wrote:
>>
>> On Wed, Jun 25, 2014 at 7:21 AM, Oleg Nesterov wrote:
>> >
>> > But. Doesn't this change add a new security hole?
>> >
>> > Obviously, we should not allow to install a filter and then (say) exec
>> > a suid binary, that is why we have no_new_privs/LSM_UNSAFE_NO_NEW_PRIVS.
>> >
>> > But what if "thread->seccomp.filter = caller->seccomp.filter" races with
>> > any user of task_no_new_privs() ? Say, suppose this thread has already
>> > passed check_unsafe_exec/etc and it is going to exec the suid binary?
>>
>> Oh, ew. Yeah. It looks like there's a cred lock to be held to combat this?
>
> Yes, cred_guard_mutex looks like an obvious choice... Hmm, but somehow
> initially I thought that the fix won't be simple. Not sure why.
>
> Yes, at least this should close the race with suid-exec. And there are no
> other users. Except apparmor, and I hope you will check it because I simply
> do not know what it does ;)
>
>> I wonder if changes to nnp need to "flushed" during syscall entry
>> instead of getting updated externally/asynchronously? That way it
>> won't be out of sync with the seccomp mode/filters.
>>
>> Perhaps secure computing needs to check some (maybe seccomp-only)
>> atomic flags and flip on the "real" nnp if found?
>
> Not sure I understand you, could you clarify?

Instead of having TSYNC change the nnp bit, it can set a new flag, say:

    task->seccomp.flags |= SECCOMP_NEEDS_NNP;

This would be set along with seccomp.mode, seccomp.filter, and
TIF_SECCOMP. Then, during the next secure_computing() call that thread
makes, it would check the flag:

    if (task->seccomp.flags & SECCOMP_NEEDS_NNP)
        task->nnp = 1;

This means that nnp couldn't change in the middle of a running syscall.
Hmmm. Perhaps this doesn't solve anything, though?

Perhaps my proposal above would actually make things worse, since now
we'd have a thread with seccomp set up, and no nnp. If it was in the
middle of exec, we're still causing a problem.

I think we'd also need a way to either delay the seccomp changes, or
to notice this condition during exec. Bleh.

What actually happens when a multi-threaded process calls exec? I assume
all the other threads are destroyed?

> But I was also worried that task_no_new_privs(current) is no longer stable
> inside the syscall paths, perhaps this is what you meant? However I do not
> see something bad here... And this has nothing to do with the race above.
>
> Also.
> Even ignoring no_new_privs, SECCOMP_FILTER_FLAG_TSYNC is not atomic
> and we can do nothing with this fact (unless it try to freeze the thread
> group somehow), perhaps it makes sense to document this somehow.
>
> I mean, suppose you want to ensure write-to-file is not possible, so you
> do seccomp(SECCOMP_FILTER_FLAG_TSYNC, nack_write_to_file_filter). You can't
> assume that this has effect right after seccomp() returns, this can obviously
> race with a sub-thread which has already entered sys_write().
>
> Once again, I am not arguing, just I think it makes sense to at least mention
> the limitations during the discussion.

Right -- this is an accepted limitation. I will call it out
specifically in the man-page; that's a good idea.

-Kees

-- 
Kees Cook
Chrome OS Security
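
For illustration only, here is a rough userspace model of the deferred-nnp
idea discussed above, including the window that makes it questionable. It
is not kernel code: struct model_task, MODEL_SECCOMP_NEEDS_NNP and the
model_*() helpers are hypothetical names that only stand in for the
seccomp.flags / secure_computing() / nnp pieces mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proposed SECCOMP_NEEDS_NNP flag. */
#define MODEL_SECCOMP_NEEDS_NNP 0x1u

/* Hypothetical userspace model of one thread; not a kernel structure. */
struct model_task {
    bool seccomp_filter_set;    /* models seccomp.mode/filter + TIF_SECCOMP */
    unsigned int seccomp_flags; /* models the proposed seccomp.flags field */
    bool nnp;                   /* models the no_new_privs bit */
};

/* TSYNC side: attach the filter to a sibling and only *request* nnp. */
static void model_tsync_apply(struct model_task *t)
{
    assert(t != 0);
    t->seccomp_filter_set = true;
    t->seccomp_flags |= MODEL_SECCOMP_NEEDS_NNP;
}

/* Syscall-entry side: the sibling promotes the pending request itself. */
static void model_secure_computing(struct model_task *t)
{
    assert(t != 0);
    if (t->seccomp_flags & MODEL_SECCOMP_NEEDS_NNP) {
        t->nnp = true;
        t->seccomp_flags &= ~MODEL_SECCOMP_NEEDS_NNP;
    }
}

/* The worrying state: a filter is attached but nnp is not yet set. */
static bool model_in_unsafe_window(const struct model_task *t)
{
    return t->seccomp_filter_set && !t->nnp;
}
```

In this model, a thread stays in the "filter set, nnp clear" state from
model_tsync_apply() until its next model_secure_computing() call, which is
the same concern raised above about a thread that is already in the middle
of exec.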