From: Linus Torvalds <torvalds@linux-foundation.org>
To: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Hugh Dickins <hughd@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/4] mm: introduce 'encoded' page pointers with embedded extra bits
Date: Wed, 9 Nov 2022 10:00:46 -0800
Message-ID: <CAHk-=wiqUpnAzD74H7wCvJ+L9jK+wY=H=8BiFFNkWP21r9uoUQ@mail.gmail.com>
In-Reply-To: <Y2tKixpO4RO6DgW5@tuxmaker.boeblingen.de.ibm.com>

On Tue, Nov 8, 2022 at 10:38 PM Alexander Gordeev
<agordeev@linux.ibm.com> wrote:
>
> On Tue, Nov 08, 2022 at 11:41:36AM -0800, Linus Torvalds wrote:
>
> > +static inline struct encoded_page *encode_page(struct page *page, unsigned long flags)
> > +{
>
> Any reaction in case ((flags & ~ENCODE_PAGE_BITS) != 0)?

Heh. I've actually had three different implementations for that during
the development series, and I think I even posted them all at one
point or another (although usually just as attachments). And none of
them are good.

Those three trivial versions are: (a) use VM_BUG_ON(), (b) just
silently mask the bits and (c) just silently add them.

And (c) is the least annoying option, and the one this latest patch
uses, because both (a) and (b) are just nasty.

Basically, all users are locally trivial to verify statically, so
VM_BUG_ON() is just conceptually wrong and generates extra pointless
code. And the silent masking - if it makes any difference - is just
another version of "just silently add the bits": regardless of whether
it clears them or not, it does the wrong thing if the bits don't fit.
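To make the three variants concrete, here is a minimal user-space sketch. It assumes ENCODE_PAGE_BITS is 3 (two low pointer bits), uses assert() as a stand-in for VM_BUG_ON(), and the `_checked`/`_masked`/`_plain` suffixes and the stub `struct page` are made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: two low bits are free because struct page is suitably aligned. */
#define ENCODE_PAGE_BITS 3ul

struct page { long dummy; };   /* stand-in for the kernel's struct page */
struct encoded_page;           /* opaque type: never dereferenced directly */

/* (a) check at run time; assert() stands in for VM_BUG_ON() */
static inline struct encoded_page *encode_page_checked(struct page *page,
							unsigned long flags)
{
	assert((flags & ~ENCODE_PAGE_BITS) == 0);
	return (struct encoded_page *)((uintptr_t)page | flags);
}

/* (b) silently drop any bits that do not fit */
static inline struct encoded_page *encode_page_masked(struct page *page,
						       unsigned long flags)
{
	return (struct encoded_page *)((uintptr_t)page | (flags & ENCODE_PAGE_BITS));
}

/* (c) silently OR in whatever the caller passed */
static inline struct encoded_page *encode_page_plain(struct page *page,
						      unsigned long flags)
{
	return (struct encoded_page *)((uintptr_t)page | flags);
}

/* Decoding helpers: split the encoded value back into flags and pointer. */
static inline unsigned long encoded_page_flags(struct encoded_page *ep)
{
	return (uintptr_t)ep & ENCODE_PAGE_BITS;
}

static inline struct page *encoded_page_ptr(struct encoded_page *ep)
{
	return (struct page *)((uintptr_t)ep & ~ENCODE_PAGE_BITS);
}
```

For in-range flags all three produce the same pointer. For out-of-range flags, (b) loses information and (c) corrupts the pointer, which is why neither silent variant is actually "right".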

So there are three bad options, I've gone back and forth between them
all, and I chose the least offensive one that is "invisible", in that
it at least doesn't do any extra pointless work.

Now, there are two non-offensive options too, which I actually
considered but never implemented. They both fix the problem
properly, by making it a *buildtime* check, but they have other
issues.

There's two ways to just make it a build-time check, and it's
annoyingly _close_ to being usable, but not quite there.

One is simply to require that the flags argument is always a plain
constant, and simply using BUILD_BUG_ON().

I actually almost went down that path - one of the things I considered
was to not add a 'flags' argument to __tlb_remove_page() at all, but
instead just have separate __tlb_remove_page() and
__tlb_remove_page_dirty() functions.

That would have meant that the flags argument to __tlb_remove_page_size()
would have always been a build-time constant, and then it would be
trivial to just have that BUILD_BUG_ON(). Problem solved.
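A rough sketch of that "constant flags only" shape, in user-space C. The flag names TLB_PAGE_CLEAN and TLB_PAGE_DIRTY, their values, the wrapper names and ENCODE_PAGE_BITS being 3 are all assumptions for illustration. Because the flags are plain constants, even an ordinary C11 static assertion is enough here.

```c
#include <assert.h>
#include <stdint.h>

#define ENCODE_PAGE_BITS 3ul   /* assumed value */

/* Hypothetical flag values; the real dirty-bit rules vary by configuration. */
#define TLB_PAGE_CLEAN 0ul
#define TLB_PAGE_DIRTY 1ul

/* Constant flags can be checked once, at build time, with no generated code. */
_Static_assert((TLB_PAGE_CLEAN & ~ENCODE_PAGE_BITS) == 0, "flag does not fit");
_Static_assert((TLB_PAGE_DIRTY & ~ENCODE_PAGE_BITS) == 0, "flag does not fit");

struct page { long dummy; };   /* stand-in for the kernel's struct page */
struct encoded_page;

static inline struct encoded_page *encode_page(struct page *page,
						unsigned long flags)
{
	return (struct encoded_page *)((uintptr_t)page | flags);
}

/* Two entry points instead of one run-time 'flags' argument. */
static inline struct encoded_page *tlb_encode_page(struct page *page)
{
	return encode_page(page, TLB_PAGE_CLEAN);
}

static inline struct encoded_page *tlb_encode_page_dirty(struct page *page)
{
	return encode_page(page, TLB_PAGE_DIRTY);
}
```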

But it turns out that it's just nasty, particularly with different
configurations wanting different rules for what the dirty bit is. So
forcing it to some constant value was really not acceptable.

The thing that I actually *wanted* to do, but didn't actually dare,
was to just say "I will trust the compiler to do the value range
tracking".

Because *technically* our BUILD_BUG_ON() doesn't need a compile-time
constant. Because our implementation of BUILD_BUG_ON() is not the
garbage that the compiler gives us in "_Static_assert()" that really
requires a syntactically pure integer constant expression.

So the kernel version of BUILD_BUG_ON() is actually something much
smarter: it depends on the compiler actually *optimizing* the
expression, and it's only that optimized value that needs to be
determined at compile-time to be either true or false. You can use
things like inline functions etc, just as long as the end result is
obvious enough that the compiler ends up saying "ok, that's never the
case".
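As a simplified illustration of that mechanism (not the kernel's actual macro), the check can be built from a declared-but-undefined function carrying the compiler's `error` attribute: if the optimizer cannot remove the call, the build fails. MY_BUILD_BUG_ON, build_bug_failed(), pick_flags() and checked_flags() are made-up names, ENCODE_PAGE_BITS being 3 is an assumption, and the sketch only exercises a case that simple inlining plus constant folding can resolve.

```c
#include <assert.h>

/*
 * Declared but never defined. With GCC (and recent clang) the attribute
 * turns any call that survives optimization into a compile error.
 */
extern void build_bug_failed(void)
	__attribute__((error("MY_BUILD_BUG_ON failed")));

#ifdef __OPTIMIZE__
# define MY_BUILD_BUG_ON(cond) do { if (cond) build_bug_failed(); } while (0)
#else
# define MY_BUILD_BUG_ON(cond) do { } while (0)   /* no optimizer, no check */
#endif

#define ENCODE_PAGE_BITS 3ul   /* assumed value */

/* A function call is not an integer constant expression, so
 * _Static_assert() would reject a condition built from it. */
static inline __attribute__((always_inline)) unsigned long pick_flags(int dirty)
{
	return dirty ? 1ul : 0ul;
}

static inline unsigned long checked_flags(void)
{
	unsigned long flags = pick_flags(1);

	/* After inlining, the condition folds to false and the call vanishes. */
	MY_BUILD_BUG_ON(flags > ENCODE_PAGE_BITS);
	return flags;
}
```

Changing `pick_flags()` to return something larger than ENCODE_PAGE_BITS should make an optimized build fail, while an unoptimized build would not notice at all.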

And *if* the compiler does any kind of reasonable range analysis, then a

        BUILD_BUG_ON(flags > ENCODE_PAGE_BITS);

should actually work. In theory.

In practice? Not so much.

Because while the argument isn't constant (not even in the caller),
the compiler *should* be smart enough to see that in the use in
mm/memory.c, 'flags' is always that

        unsigned int delay_rmap;

which then gets initialized to

        delay_rmap = 0;

and conditionally set to '1' later. So it's not a *constant*, but the
compiler can see that the value of flags is clearly never larger than
ENCODE_PAGE_BITS.

But right now the compiler cannot track that across the call to the
non-inline function __tlb_remove_page_size().

Maybe if the 'encode_page()' was done in the caller, and
__tlb_remove_page_size() were to just take an encoded_page as the
argument, then the compiler would always only see this all through
inlined functions, and it would work.
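A sketch of that restructuring, untested against any real kernel build: the encoding happens in an inline wrapper in the caller, and only the already-encoded pointer crosses the out-of-line call. The names tlb_queue_encoded() and tlb_remove_page_flags() are hypothetical, ENCODE_PAGE_BITS being 3 is an assumption, and the assert() only marks where a build-time check would sit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENCODE_PAGE_BITS 3ul   /* assumed value */

struct page { long dummy; };   /* stand-in for the kernel's struct page */
struct encoded_page;

static inline struct encoded_page *encode_page(struct page *page,
						unsigned long flags)
{
	/* A build-time check would sit here; assert() is a run-time stand-in. */
	assert(flags <= ENCODE_PAGE_BITS);
	return (struct encoded_page *)((uintptr_t)page | flags);
}

/* Hypothetical out-of-line callee: it only ever sees the encoded pointer. */
static struct encoded_page *last_queued;

__attribute__((noinline)) static bool tlb_queue_encoded(struct encoded_page *ep)
{
	last_queued = ep;
	return false;
}

/* Inline wrapper: flags get encoded while their value range is still
 * visible to the compiler in the caller. */
static inline bool tlb_remove_page_flags(struct page *page,
					  unsigned int delay_rmap)
{
	return tlb_queue_encoded(encode_page(page, delay_rmap));
}
```

Whether a given compiler then really proves the range is a separate question; the sketch only shows where the function boundary would move.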

But even if it were to work for me (I never tried), I'd have been much
too worried that some other compiler version, with some other config
options, on some other architecture, wouldn't make the required
optimizations.

We do require compiler optimizations to be on for 'BUILD_BUG_ON()' to
do anything at all:

   #ifdef __OPTIMIZE__
   # define __compiletime_assert(condition, msg, prefix, suffix)           \
   ..
   #else
   # define __compiletime_assert(condition, msg, prefix, suffix) do { } while (0)
   #endif

and we have a lot of places that depend on BUILD_BUG_ON() to do basic
constant folding and other fairly simple optimizations.

But while I think a BUILD_BUG_ON() would be the right thing to do
here, I do not feel confident enough to really put that to the test.

              Linus
