From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: * X-Spam-Status: No, score=1.5 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, RDNS_NONE,SPF_HELO_NONE,SPF_NONE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from out01.mta.xmission.com ([166.70.13.231]:54052 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729675AbgCKQbm (ORCPT ); Wed, 11 Mar 2020 12:31:42 -0400 From: ebiederm@xmission.com (Eric W. Biederman) To: Bernd Edlinger Cc: Jann Horn , Christian Brauner , Kees Cook , Jonathan Corbet , Alexander Viro , Andrew Morton , Alexey Dobriyan , Thomas Gleixner , Oleg Nesterov , Frederic Weisbecker , Andrei Vagin , Ingo Molnar , "Peter Zijlstra \(Intel\)" , Yuyang Du , David Hildenbrand , Sebastian Andrzej Siewior , Anshuman Khandual , David Howells , James Morris , Greg Kroah-Hartman , Shakeel Butt , Jason Gunthorpe , Christian Kellner , Andrea Arcangeli , Aleksa Sarai , "Dmitry V. Levin" , "linux-doc\@vger.kernel.org" , "linux-kernel\@vger.kernel.org" , "linux-fsdevel\@vger.kernel.org" , "linux-mm\@kvack.org" , "stable\@vger.kernel.org" , "linux-api\@vger.kernel.org" References: <87v9nlii0b.fsf@x220.int.ebiederm.org> <87a74xi4kz.fsf@x220.int.ebiederm.org> <87r1y8dqqz.fsf@x220.int.ebiederm.org> <87tv32cxmf.fsf_-_@x220.int.ebiederm.org> <87v9ne5y4y.fsf_-_@x220.int.ebiederm.org> <87zhcq4jdj.fsf_-_@x220.int.ebiederm.org> <87wo7roq2c.fsf@x220.int.ebiederm.org> <87k13roigf.fsf@x220.int.ebiederm.org> Date: Wed, 11 Mar 2020 11:29:08 -0500 In-Reply-To: (Bernd Edlinger's message of "Wed, 11 Mar 2020 07:33:35 +0100") Message-ID: <87d09ionxn.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [PATCH v2 5/5] exec: Add a exec_update_mutex to replace cred_guard_mutex Sender: linux-kernel-owner@vger.kernel.org List-ID: Bernd Edlinger writes: > On 3/11/20 1:15 AM, Eric W. Biederman wrote: >> Jann Horn writes: >> >>> On Tue, Mar 10, 2020 at 10:33 PM Eric W. Biederman >>> wrote: >>>> Jann Horn writes: >>>>> On Sun, Mar 8, 2020 at 10:41 PM Eric W. Biederman wrote: >>>>>> The cred_guard_mutex is problematic. The cred_guard_mutex is held >>>>>> over the userspace accesses as the arguments from userspace are read. >>>>>> The cred_guard_mutex is held of PTRACE_EVENT_EXIT as the the other >>>>>> threads are killed. The cred_guard_mutex is held over >>>>>> "put_user(0, tsk->clear_child_tid)" in exit_mm(). >>>>>> >>>>>> Any of those can result in deadlock, as the cred_guard_mutex is held >>>>>> over a possible indefinite userspace waits for userspace. >>>>>> >>>>>> Add exec_update_mutex that is only held over exec updating process >>>>>> with the new contents of exec, so that code that needs not to be >>>>>> confused by exec changing the mm and the cred in ways that can not >>>>>> happen during ordinary execution of a process. >>>>>> >>>>>> The plan is to switch the users of cred_guard_mutex to >>>>>> exec_udpate_mutex one by one. This lets us move forward while still >>>>>> being careful and not introducing any regressions. >>>>> [...] >>>>>> @@ -1034,6 +1035,11 @@ static int exec_mmap(struct mm_struct *mm) >>>>>> return -EINTR; >>>>>> } >>>>>> } >>>>>> + >>>>>> + ret = mutex_lock_killable(&tsk->signal->exec_update_mutex); >>>>>> + if (ret) >>>>>> + return ret; >>>>> >>>>> We're already holding the old mmap_sem, and now nest the >>>>> exec_update_mutex inside it; but then while still holding the >>>>> exec_update_mutex, we do mmput(), which can e.g. end up in ksm_exit(), >>>>> which can do down_write(&mm->mmap_sem) from __ksm_exit(). So I think >>>>> at least lockdep will be unhappy, and I'm not sure whether it's an >>>>> actual problem or not. >>>> >>>> Good point. I should double check the lock ordering here with mmap_sem. >>>> It doesn't look like mmput takes mmap_sem >>> >>> You sure about that? mmput() -> __mmput() -> ksm_exit() -> >>> __ksm_exit() -> down_write(&mm->mmap_sem) >>> >>> Or also: mmput() -> __mmput() -> khugepaged_exit() -> >>> __khugepaged_exit() -> down_write(&mm->mmap_sem) >>> >>> Or is there a reason why those paths can't happen? >> >> Clearly I didn't look far enough. >> >> I will adjust this so that exec_update_mutex is taken before mmap_sem. >> Anything else is just asking for trouble. >> > > Note that vm_access does also mmput under the exec_update_mutex. > So I don't see a huge problem here. > But maybe I missed something. The issue is that to prevent deadlock locks must always be taken in the same order. Taking mmap_sem then exec_update_mutex at the start of the function, then taking exec_update_mutex then mmap_sem in mmput, takes the two locks in two different orders. Which means that in the right set or circumstances: thread1: thread2: obtain mmap_sem optain exec_update_mutex wait for exec_update_mutex wait for mmap_sem Which guarantees that neither thread will make progress. The fix is easy I just need to take exec_update_mutex a few lines earlier. Eric