linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Hillf Danton <hdanton@sina.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Alexey Gladkov <legion@kernel.org>
Subject: Re: [GIT PULL] ucount fix for v5.14-rc
Date: Fri, 6 Aug 2021 10:37:52 -0700	[thread overview]
Message-ID: <CAHk-=wjdfLQ+z=uM3qUPSb1wgnugeN5+wyH11kmatUXskYqrCA@mail.gmail.com> (raw)
In-Reply-To: <20210806061458.3075-1-hdanton@sina.com>

On Thu, Aug 5, 2021 at 11:15 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On Thu, 05 Aug 2021 22:38:05 -0500 Eric W. Biederman wrote:
> >
> >I think you are saying if someone calls get_ucounts without knowing the
> >reference count is at least one a user after free will happen.  It is a
> >bug to call get_ucounts unless it is known the reference count is at
> >least one.
>
> Doubt it works because no atomicity is ensured by two atomic operations
> in tow.
>
>         if (atomic_read(&ucounts->count) >= 1)
>                 if (ucounts && atomic_add_negative(1, &ucounts->count))

Note that the first atomic_read() check is purely a debug check.

Eric's point is that you can't do "get_ucounts()" on an ucount that
you don't already have a reference to - that would be a bug even
*without* the "get_ucounts()", because you're clearly dealing with an
ucount that could be released at any time.

So you have only a couple of operations:

 (a) find_ucounts() looks up the ucount for a user that doesn't have a ref yet.

 (b) get_ucounts() increments a ref for somebody who already has a
ucount but is giving it away to somebody else too. We know the ref
can't go down to zero, because we are ourselves holding a ref to it.

 (c) put_ucounts() decrements a ref (and frees it when the refs go
down to zero).

Of these, (a) needs to be called under the lock, and needs to
increment the ref-count before the lock is released.

But (b) does *not* need a lock, exactly because the current getter
must already hold a ref, so it can't go away.

And (c) needs to hold a lock only for the "goes to zero" case, so you
can avoid the lock if the count started out higher than 1 (which is
why we have that atomic_dec_and_lock_irqsave() pattern)

The bug was in (a) and (c), but (b) is fine.

Side note: other data structures - not ucounts - can have slightly
more complex behavior, and in particular you can do the find/put
operations without locking if you

 - use RCU to free it

 - do the "find" in a RCU read-locked section

 - after finding it, you use "get_ref_not_zero()" to do an atomic "did
somebody free the last ref, if not increment it"

and the above pattern allows you to do all of a-c without any actual
locking, just local atomics.

That's what all the really critical kernel data structures tend to do.

(And the *really* critical data structures with complex topology - ie
dentries - have a local lock too, and use the lockref data structure,
but that's basically just dentries and the gfs2 gl/qd_lock - and I
have a sneaking suspicion that the gfs2 use is more of a "because I
can" rather than a "because I need to")

                 Linus

  parent reply	other threads:[~2021-08-06 17:38 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-05 17:15 [GIT PULL] ucount fix for v5.14-rc Eric W. Biederman
2021-08-05 19:26 ` pr-tracker-bot
     [not found] ` <20210806021052.3013-1-hdanton@sina.com>
2021-08-06  3:38   ` Eric W. Biederman
     [not found]     ` <20210806061458.3075-1-hdanton@sina.com>
2021-08-06 17:37       ` Linus Torvalds [this message]
     [not found]         ` <20210807050314.1807-1-hdanton@sina.com>
2021-08-07  8:23           ` Linus Torvalds
     [not found]             ` <20210807091128.1862-1-hdanton@sina.com>
2021-08-07 15:10               ` Linus Torvalds
     [not found]                 ` <20210808004243.2007-1-hdanton@sina.com>
2021-08-08  1:00                   ` Linus Torvalds
     [not found]                     ` <20210808012056.2067-1-hdanton@sina.com>
2021-08-08  1:45                       ` Linus Torvalds
2021-08-08  2:05                         ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wjdfLQ+z=uM3qUPSb1wgnugeN5+wyH11kmatUXskYqrCA@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=hdanton@sina.com \
    --cc=legion@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).