All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Hillf Danton <hdanton@sina.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Alexey Gladkov <legion@kernel.org>
Subject: Re: [GIT PULL] ucount fix for v5.14-rc
Date: Fri, 6 Aug 2021 10:37:52 -0700	[thread overview]
Message-ID: <CAHk-=wjdfLQ+z=uM3qUPSb1wgnugeN5+wyH11kmatUXskYqrCA@mail.gmail.com> (raw)
In-Reply-To: <20210806061458.3075-1-hdanton@sina.com>

On Thu, Aug 5, 2021 at 11:15 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On Thu, 05 Aug 2021 22:38:05 -0500 Eric W. Biederman wrote:
> >
> >I think you are saying if someone calls get_ucounts without knowing the
> >reference count is at least one a user after free will happen.  It is a
> >bug to call get_ucounts unless it is known the reference count is at
> >least one.
>
> Doubt it works because no atomicity is ensured by two atomic operations
> in tow.
>
>         if (atomic_read(&ucounts->count) >= 1)
>                 if (ucounts && atomic_add_negative(1, &ucounts->count))

Note that the first atomic_read() check is purely a debug check.

Eric's point is that you can't do "get_ucounts()" on an ucount that
you don't already have a reference to - that would be a bug even
*without* the "get_ucounts()", because you're clearly dealing with an
ucount that could be released at any time.

So you have only a couple of operations:

 (a) find_ucounts() looks up the ucount for a user that doesn't have a ref yet.

 (b) get_ucounts() increments a ref for somebody who already has a
ucount but is giving it away to somebody else too. We know the ref
can't go down to zero, because we are ourselves holding a ref to it.

 (c) put_ucounts() decrements a ref (and frees it when the refs go
down to zero).

Of these, (a) needs to be called under the lock, and needs to
increment the ref-count before the lock is released.

But (b) does *not* need a lock, exactly because the current getter
must already hold a ref, so it can't go away.

And (c) needs to hold a lock only for the "goes to zero" case, so you
can avoid the lock if the count started out higher than 1 (which is
why we have that atomic_dec_and_lock_irqsave() pattern)

The bug was in (a) and (c), but (b) is fine.

Side note: other data structures - not ucounts - can have slightly
more complex behavior, and in particular you can do the find/put
operations without locking if you

 - use RCU to free it

 - do the "find" in a RCU read-locked section

 - after finding it, you use "get_ref_not_zero()" to do an atomic "did
somebody free the last ref, if not increment it"

and the above pattern allows you to do all of a-c without any actual
locking, just local atomics.

That's what all the really critical kernel data structures tend to do.

(And the *really* critical data structures with complex topology - ie
dentries - have a local lock too, and use the lockref data structure,
but that's basically just dentries and the gfs2 gl/qd_lock - and I
have a sneaking suspicion that the gfs2 use is more of a "because I
can" rather than a "because I need to")

                 Linus

  parent reply	other threads:[~2021-08-06 17:38 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-05 17:15 [GIT PULL] ucount fix for v5.14-rc Eric W. Biederman
2021-08-05 19:26 ` pr-tracker-bot
     [not found] ` <20210806021052.3013-1-hdanton@sina.com>
2021-08-06  3:38   ` Eric W. Biederman
     [not found]     ` <20210806061458.3075-1-hdanton@sina.com>
2021-08-06 17:37       ` Linus Torvalds [this message]
     [not found]         ` <20210807050314.1807-1-hdanton@sina.com>
2021-08-07  8:23           ` Linus Torvalds
     [not found]             ` <20210807091128.1862-1-hdanton@sina.com>
2021-08-07 15:10               ` Linus Torvalds
     [not found]                 ` <20210808004243.2007-1-hdanton@sina.com>
2021-08-08  1:00                   ` Linus Torvalds
     [not found]                     ` <20210808012056.2067-1-hdanton@sina.com>
2021-08-08  1:45                       ` Linus Torvalds
2021-08-08  2:05                         ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wjdfLQ+z=uM3qUPSb1wgnugeN5+wyH11kmatUXskYqrCA@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=hdanton@sina.com \
    --cc=legion@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.