All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Jens Axboe <axboe@kernel.dk>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] block: switch to atomic_t for request references
Date: Wed, 8 Dec 2021 10:00:04 -0800	[thread overview]
Message-ID: <CAHk-=wiFLbv2M9gRkh6_Zkwiza17QP0gJLAL7AgDqDArGBGpSQ@mail.gmail.com> (raw)
In-Reply-To: <YbDmWFM5Kh1J2YqS@hirez.programming.kicks-ass.net>

On Wed, Dec 8, 2021 at 9:07 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> IOW, the effective range becomes: [1..INT_MIN], which is a bit
> counter-intuitive, but then so is most of this stuff.

I'd suggest not codifying it too strictly, because the exact range at
the upper end might depend on what is convenient for an architecture
to do.

For x86, 'xadd' has odd semantics in that the flags register is about
the *new* state, but the returned value is about the *old* state.

That means that on x86, some things are cheaper to test based on the
pre-inc/dec values, and other things are cheaper to test based on the
post-inc/dec ones.

It's also why for "page->_mapcount" we have the "free" value being -1,
not 0, and the refcount is "off by one". It makes the special cases of
"increment from zero" and "decrement to zero" be very easy and
straightforward to test for.

That might be an option for an "atomic_ref" type - with our existing
"page_mapcount()" code being the thing we'd convert first, and make be
the example for it.

I think it should also make the error cases be very easy to check for
without extra tests. If you make "decrement from zero" be the "ok, now
it's free", then that shows in the carry flag. But otherwise, if SF or
OF is set, it's an error.  That means we can use the regular atomics
and flags (although not "dec" and "inc", since we'd care about CF).

So on x86, I think "atomic_dec_ref()" could be

        lock subl $1,ptr
        jc now_its_free
        jl this_is_an_error

if we end up having that "off by one" model.

And importantly, "atomic_inc_ref()" would be just

        lock incl ptr
        jle this_is_an_error

and this avoids us having to have the value in a register and test it
separately.

So your suggestion is _close_, but note how you can't do the "inc_ofl"
without that "off-by-one" model.

And again - I might have gotten the exact flag test instructions
wrong. That's what you get for not actually doing serious assembly
language for a couple of decades.

            Linus

  reply	other threads:[~2021-12-08 18:00 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-03 15:35 [PATCH] block: switch to atomic_t for request references Jens Axboe
2021-12-03 15:56 ` Keith Busch
2021-12-06  6:53 ` Christoph Hellwig
2021-12-06  8:31   ` Peter Zijlstra
2021-12-06 16:32     ` Jens Axboe
2021-12-06 17:19       ` Peter Zijlstra
2021-12-06 17:35     ` Linus Torvalds
2021-12-06 18:13       ` Jens Axboe
2021-12-06 20:51         ` Kees Cook
2021-12-06 21:17           ` Linus Torvalds
2021-12-06 23:28             ` Kees Cook
2021-12-07  0:13               ` Linus Torvalds
2021-12-07  4:56                 ` Kees Cook
2021-12-07  9:34                 ` Peter Zijlstra
2021-12-07 16:03                   ` Linus Torvalds
2021-12-07 10:30                 ` Peter Zijlstra
2021-12-07 16:10                   ` Linus Torvalds
2021-12-07 16:23                     ` Peter Zijlstra
2021-12-06 16:31   ` Jens Axboe
2021-12-07 11:26   ` Peter Zijlstra
2021-12-07 13:28     ` Peter Zijlstra
2021-12-07 15:51       ` Peter Zijlstra
2021-12-07 16:13       ` Linus Torvalds
2021-12-07 16:52         ` Peter Zijlstra
2021-12-07 17:41           ` Peter Zijlstra
2021-12-07 17:43           ` Linus Torvalds
2021-12-07 17:45             ` Linus Torvalds
2021-12-07 20:28       ` Peter Zijlstra
2021-12-07 23:23         ` Linus Torvalds
2021-12-08 17:07           ` Peter Zijlstra
2021-12-08 18:00             ` Linus Torvalds [this message]
2021-12-08 18:44               ` Peter Zijlstra
2021-12-08 18:50                 ` Linus Torvalds
2021-12-08 20:32                   ` Peter Zijlstra
2021-12-10 10:57                   ` Peter Zijlstra
2021-12-10 12:38               ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wiFLbv2M9gRkh6_Zkwiza17QP0gJLAL7AgDqDArGBGpSQ@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=hch@infradead.org \
    --cc=keescook@chromium.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.