From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44473C433E3 for ; Tue, 23 Mar 2021 17:50:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2DDF8619CB for ; Tue, 23 Mar 2021 17:50:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231279AbhCWRuS (ORCPT ); Tue, 23 Mar 2021 13:50:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231191AbhCWRtw (ORCPT ); Tue, 23 Mar 2021 13:49:52 -0400 Received: from mail-lf1-x131.google.com (mail-lf1-x131.google.com [IPv6:2a00:1450:4864:20::131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C0CBC061764 for ; Tue, 23 Mar 2021 10:49:51 -0700 (PDT) Received: by mail-lf1-x131.google.com with SMTP id n138so27920608lfa.3 for ; Tue, 23 Mar 2021 10:49:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=P7FG8frtqOnIaco1BBcfNWyiT6YFpbAkPyq6JjHD/SQ=; b=RqC83U3mrLkb+QFRLoON0Zw9/YzFosu29mToXdupbyes+eP8PIbUGwPqm8r7/KpwRR oZelC5ZmxImg8iBbM2gmLroSoYCiCEUvGtkQiOC4/PAEclQS7ugWWjIJNtnS3605Q2J8 ZIN/PuHLu1RX17VYbH9ddtJ1VMYZ2UYl/zac1wDW7HAuCdsaxKFv6RGRsTr5dG5anaRB vswno94C7eL1m8PiYJ70kLt+hevGKr4aTJ011awdWOMnAZwre4yQob6AqxNo1DVYfd7u GLlsD8eUCAS1xPvQV5CkZ/lmgLn5dFcI7bh1eIS+ioRFxMljDx1upkfbDIXRXDqwZEuq dslw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=P7FG8frtqOnIaco1BBcfNWyiT6YFpbAkPyq6JjHD/SQ=; b=auAkvuORvv4VRRtRQictaIvjJKs75JKgqcVuj1ibcyF/yf4cWxnA6HWMxTrnQSrz6J D4pnXJx9vRj47H9VynxCE1p8lN/iAQHc8fYgghbIitDA+ayrkH+YvXAJAVdL79qe7ATW idAgpkYeiX4zETrdOn1eS85QlxWqHp7aORQwb3dII0MoMZxriTAKb3Q4HoXRlOOkj0rl xbMgWYTugSGUICNJtBRSemqah5qtbdwg7PaaZmSN+tLUd+MIZxUSy2h9iakET/jc2Kzt tOsE1uvzMxwP13kXhiDuq57AasS+/lQFaWIZy1U2I5hA4rv8tgiBbB5V3FNda5c0LuQ1 PWjA== X-Gm-Message-State: AOAM532ItIk3+hZOd+Xuj2HKmUKeyHGXqq2UVvZ1UvD9El1WKcsWdcb4 xWG1Bg5kZMPlOmJ70zL/F6MfEllOZzUNJlSGVSW/vA== X-Google-Smtp-Source: ABdhPJwyenFneH4fs+z0uzL0vIEhWiSdc0Ge3Zq/4vWaO4unkEzFLjqdXBLZ4GR4Y3t93oZd2IPWbg1ktdyPCk9cf3w= X-Received: by 2002:a19:22d6:: with SMTP id i205mr3164515lfi.352.1616521789635; Tue, 23 Mar 2021 10:49:49 -0700 (PDT) MIME-Version: 1.0 References: <20210316204252.427806-1-mic@digikod.net> <20210316204252.427806-8-mic@digikod.net> In-Reply-To: From: Jann Horn Date: Tue, 23 Mar 2021 18:49:22 +0100 Message-ID: Subject: Re: [PATCH v30 07/12] landlock: Support filesystem access-control To: =?UTF-8?B?TWlja2HDq2wgU2FsYcO8bg==?= Cc: James Morris , "Serge E . Hallyn" , Al Viro , Andrew Morton , Andy Lutomirski , Anton Ivanov , Arnd Bergmann , Casey Schaufler , David Howells , Jeff Dike , Jonathan Corbet , Kees Cook , Michael Kerrisk , Richard Weinberger , Shuah Khan , Vincent Dagonneau , Kernel Hardening , Linux API , linux-arch , "open list:DOCUMENTATION" , linux-fsdevel , kernel list , "open list:KERNEL SELFTEST FRAMEWORK" , linux-security-module , "the arch/x86 maintainers" , =?UTF-8?B?TWlja2HDq2wgU2FsYcO8bg==?= Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Mar 23, 2021 at 4:54 PM Micka=C3=ABl Sala=C3=BCn = wrote: > On 23/03/2021 01:13, Jann Horn wrote: > > On Tue, Mar 16, 2021 at 9:43 PM Micka=C3=ABl Sala=C3=BCn wrote: > >> Using Landlock objects and ruleset, it is possible to tag inodes > >> according to a process's domain. > > [...] > >> +static void release_inode(struct landlock_object *const object) > >> + __releases(object->lock) > >> +{ > >> + struct inode *const inode =3D object->underobj; > >> + struct super_block *sb; > >> + > >> + if (!inode) { > >> + spin_unlock(&object->lock); > >> + return; > >> + } > >> + > >> + /* > >> + * Protects against concurrent use by hook_sb_delete() of the = reference > >> + * to the underlying inode. > >> + */ > >> + object->underobj =3D NULL; > >> + /* > >> + * Makes sure that if the filesystem is concurrently unmounted= , > >> + * hook_sb_delete() will wait for us to finish iput(). > >> + */ > >> + sb =3D inode->i_sb; > >> + atomic_long_inc(&landlock_superblock(sb)->inode_refs); > >> + spin_unlock(&object->lock); > >> + /* > >> + * Because object->underobj was not NULL, hook_sb_delete() and > >> + * get_inode_object() guarantee that it is safe to reset > >> + * landlock_inode(inode)->object while it is not NULL. It is = therefore > >> + * not necessary to lock inode->i_lock. > >> + */ > >> + rcu_assign_pointer(landlock_inode(inode)->object, NULL); > >> + /* > >> + * Now, new rules can safely be tied to @inode with get_inode_= object(). > >> + */ > >> + > >> + iput(inode); > >> + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_r= efs)) > >> + wake_up_var(&landlock_superblock(sb)->inode_refs); > >> +} > > [...] > >> +static struct landlock_object *get_inode_object(struct inode *const i= node) > >> +{ > >> + struct landlock_object *object, *new_object; > >> + struct landlock_inode_security *inode_sec =3D landlock_inode(i= node); > >> + > >> + rcu_read_lock(); > >> +retry: > >> + object =3D rcu_dereference(inode_sec->object); > >> + if (object) { > >> + if (likely(refcount_inc_not_zero(&object->usage))) { > >> + rcu_read_unlock(); > >> + return object; > >> + } > >> + /* > >> + * We are racing with release_inode(), the object is g= oing > >> + * away. Wait for release_inode(), then retry. > >> + */ > >> + spin_lock(&object->lock); > >> + spin_unlock(&object->lock); > >> + goto retry; > >> + } > >> + rcu_read_unlock(); > >> + > >> + /* > >> + * If there is no object tied to @inode, then create a new one= (without > >> + * holding any locks). > >> + */ > >> + new_object =3D landlock_create_object(&landlock_fs_underops, i= node); > >> + if (IS_ERR(new_object)) > >> + return new_object; > >> + > >> + /* Protects against concurrent get_inode_object() calls. */ > >> + spin_lock(&inode->i_lock); > >> + object =3D rcu_dereference_protected(inode_sec->object, > >> + lockdep_is_held(&inode->i_lock)); > > > > rcu_dereference_protected() requires that inode_sec->object is not > > concurrently changed, but I think another thread could call > > get_inode_object() while we're in landlock_create_object(), and then > > we could race with the NULL write in release_inode() here? (It > > wouldn't actually be a UAF though because we're not actually accessing > > `object` here.) Or am I missing a lock that prevents this? > > > > In v28 this wasn't an issue because release_inode() was holding > > inode->i_lock (and object->lock) during the NULL store; but in v29 and > > this version the NULL store in release_inode() moved out of the locked > > region. I think you could just move the NULL store in release_inode() > > back up (and maybe add a comment explaining the locking rules for > > landlock_inode(...)->object)? > > > > (Or alternatively you could use rcu_dereference_raw() with a comment > > explaining that the read pointer is only used to check for NULL-ness, > > and that it is guaranteed that the pointer can't change if it is NULL > > and we're holding the lock. But that'd be needlessly complicated, I > > think.) > > To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in > release_inode() or in hook_sb_delete(), the > landlock_inode(inode)->object need to be non-NULL, Yes. > which implies that a > call to get_inode_object(inode) either "retry" (because release_inode is > only called by landlock_put_object, which set object->usage to 0) until > it creates a new object, or reuses the existing referenced object (and > increments object->usage). But it can be that landlock_inode(inode)->object only becomes non-NULL after get_inode_object() has checked rcu_dereference(inode_sec->object). > The worse case would be if > get_inode_object(inode) is called just before the > rcu_assign_pointer(landlock_inode(inode)->object, NULL) from > hook_sb_delete(), which would result in an object with a NULL underobj, > which is the expected behavior (and checked by release_inode). The scenario I'm talking about doesn't involve hook_sb_delete(). > The line rcu_assign_pointer(inode_sec->object, new_object) from > get_inode_object() can only be reached if the underlying inode doesn't > reference an object, Yes. > in which case hook_sb_delete() will not reach the > rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this > same inode. > > This works because get_inode_object(inode) is mutually exclusive to > itself with the same inode (i.e. an inode can only point to an object > that references this same inode). To clarify: You can concurrently call get_inode_object() multiple times on the same inode, right? There are no locks held on entry to that function. > I tried to explain this with the comment "Protects against concurrent > get_inode_object() calls" in get_inode_object(), and the comments just > before both rcu_assign_pointer(landlock_inode(inode)->object, NULL). The scenario I'm talking about is: Initially the inode does not have an associated landlock_object. There are two threads A and B. Thread A is going to execute get_inode_object(). Thread B is going to execute get_inode_object() followed immediately by landlock_put_object(). thread A: enters get_inode_object() thread A: rcu_dereference(inode_sec->object) returns NULL thread A: enters landlock_create_object() thread B: enters get_inode_object() thread B: rcu_dereference(inode_sec->object) returns NULL thread B: calls landlock_create_object() thread B: sets inode_sec->object while holding inode->i_lock thread B: leaves get_inode_object() thread B: enters landlock_put_object() thread B: object->usage drops to 0, object->lock is taken thread B: calls release_inode() thread B: drops object->lock thread A: returns from landlock_create_object() thread A: takes inode->i_lock At this point, thread B will run: rcu_assign_pointer(landlock_inode(inode)->object, NULL); while thread A runs: rcu_dereference_protected(inode_sec->object, lockdep_is_held(&inode->i_lock)); meaning there is a (theoretical) data race, since rcu_dereference_protected() doesn't use READ_ONCE(). > >> + if (unlikely(object)) { > >> + /* Someone else just created the object, bail out and = retry. */ > >> + spin_unlock(&inode->i_lock); > >> + kfree(new_object); > >> + > >> + rcu_read_lock(); > >> + goto retry; > >> + } > >> + > >> + rcu_assign_pointer(inode_sec->object, new_object); > >> + /* > >> + * @inode will be released by hook_sb_delete() on its superblo= ck > >> + * shutdown. > >> + */ > >> + ihold(inode); > >> + spin_unlock(&inode->i_lock); > >> + return new_object; > >> +}