All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nico Pache <npache@redhat.com>
To: Matthew Wilcox <willy@infradead.org>,
	Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	muchun.song@linux.dev, mike.kravetz@oracle.com,
	akpm@linux-foundation.org, gerald.schaefer@linux.ibm.com
Subject: Re: [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound_order
Date: Wed, 14 Dec 2022 19:48:04 -0700	[thread overview]
Message-ID: <CAA1CXcA2dGeG2tzc+-OZ77eMVpnSN2SKkdtz9LqpLPywhJMOwA@mail.gmail.com> (raw)
In-Reply-To: <Y5oCD0gFV+Cq1JqJ@casper.infradead.org>

On Wed, Dec 14, 2022 at 10:04 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Dec 13, 2022 at 04:45:05PM -0700, Nico Pache wrote:
> > Since commit 1378a5ee451a ("mm: store compound_nr as well as
> > compound_order") the page[1].compound_nr must be explicitly set to 0 if
> > calling set_compound_order(page, 0).
> >
> > This can lead to bugs if the caller of set_compound_order(page, 0) forgets
> > to explicitly set compound_nr=0. An example of this is commit ba9c1201beaa
> > ("mm/hugetlb: clear compound_nr before freeing gigantic pages")
> >
> > Collapse these calls into the set_compound_order by utilizing branchless
> > bitmaths [1].
> >
> > [1] https://graphics.stanford.edu/~seander/bithacks.html#ConditionalSetOrClearBitsWithoutBranching
> >
> > V2: slight changes to commit log and remove extra '//' in the comments
>
> We don't usually use // comments anywhere in the kernel other than
> the SPDX header.

Whoops!

> >  static inline void set_compound_order(struct page *page, unsigned int order)
> >  {
> > +     unsigned long shift = (1U << order);
>
> Shift is a funny name for this variable.  order is the shift.  this is 'nr'.

Good point! Waiman found an even better/cleaner solution that would
avoid needing an extra variable.
    page[1].compound_nr = (1U << order) & ~1U;

> >       page[1].compound_order = order;
> >  #ifdef CONFIG_64BIT
> > -     page[1].compound_nr = 1U << order;
> > +     // Branchless conditional:
> > +     // order  > 0 --> compound_nr = shift
> > +     // order == 0 --> compound_nr = 0
> > +     page[1].compound_nr = shift ^ (-order  ^ shift) & shift;
>
> Can the compiler see through this?  Before, the compiler sees:
>
>         page[1].compound_order = 0;
>         page[1].compound_nr = 1U << 0;
> ...
>         page[1].compound_nr = 0;
>
> and it can eliminate the first store.

This may be the case at the moment, but with:
https://lore.kernel.org/linux-mm/20221213212053.106058-1-sidhartha.kumar@oracle.com/
we will have a branch instead. Sidhartha tested it and found no
regression; the concern is that if THPs get implemented using this
callpath then we may end up seeing a slowdown.

After doing my analysis below I dont think this is the case for the
destroy case(at least on x86).
In the destroy case for both the branch and branchless approach we see
the compiler optimizing away the bitmath and the branch and setting
the variable to zero.
In the prep case we see the introduction of a test and cmovne
instruction, implying a branch.

> Now the compiler sees:
>         unsigned long shift = (1U << 0);
>         page[1].compound_order = order;
>         page[1].compound_nr = shift ^ (0  ^ shift) & shift;
>
> Does it do the maths at compile-time, knowing that order is 0 at this
> callsite and deducing that it can just store a 0?
>
> I think it might, since shift is constant-1,
>
>         page[1].compound_nr = 1 ^ (0 ^ 1) & 1;
> ->      page[1].compound_nr = 1 ^ 1 & 1;
> ->      page[1].compound_nr = 0 & 1;
> ->      page[1].compound_nr = 0;
>
> But you should run it through the compiler and check the assembly
> output for __destroy_compound_gigantic_page().

Yep it does look like it gets optimized away for the destroy case:

Bitmath Case (destroy)
---------------------------------
Dump of assembler code for function __update_and_free_page:
...
mov    %rsi,%rbp //move 2nd arg (page) to rbp
...
movb   $0x0,0x51(%rbp) //page[1].compound_order = 0
movl   $0x0,0x5c(%rbp)  //page[1].compound_nr = 0
...

Math for movl : 0x5c (92) - 64 (sizeof page[0]) = 28
pahole page: unsigned int compound_nr;        /*    28     4 */

Bitmath Case (prep)
---------------------------------
In the case of prep_compound_gigantic_page the bitmath is being computed
   0xffffffff8134f17d <+13>:    mov    %rdi,%r12
   0xffffffff8134f180 <+16>:    push   %rbp
   0xffffffff8134f181 <+17>:    mov    $0x1,%ebp
   0xffffffff8134f186 <+22>:    shl    %cl,%ebp
   0xffffffff8134f188 <+24>:    neg    %ecx
   0xffffffff8134f18a <+26>:    push   %rbx
   0xffffffff8134f18b <+27>:    and    %ebp,%ecx
   0xffffffff8134f18d <+29>:    mov    %sil,0x51(%rdi)
   0xffffffff8134f191 <+33>:    mov    %ecx,0x5c(%rdi) //set page[1].compound_nr

Now to break down the approach with the branch:

Branch Case (destroy)
---------------------------------
  No branch utilized to determine the following instructions.
   0xffffffff813507bc <+236>:    movb   $0x0,0x51(%rbp)
   0xffffffff813507c0 <+240>:    movl   $0x0,0x5c(%rbp)

Branch  Case (prep)
---------------------------------
The branch is being computed with the introduction of a cmovne instruction.
   0xffffffff8134f15d <+13>:    mov    %rdi,%r12
   0xffffffff8134f160 <+16>:    push   %rbp
   0xffffffff8134f161 <+17>:    mov    $0x1,%ebp
   0xffffffff8134f166 <+22>:    shl    %cl,%ebp
   0xffffffff8134f168 <+24>:    test   %esi,%esi             //test
   0xffffffff8134f16a <+26>:    push   %rbx
   0xffffffff8134f16b <+27>:    cmovne %ebp,%ecx     //branch evaluation
   0xffffffff8134f16e <+30>:    mov    %sil,0x51(%rdi)
   0xffffffff8134f172 <+34>:    mov    %ecx,0x5c(%rdi)

So it looks like in the destruction of compound pages we'll see no
gain or loss between the bitmath or branch approach.
However, in the prep case we may see some performance loss once/if THP
utilizes this path due to the branch and the loss of CPU
parallelization that can be achieved utilizing the bitmath approach.

Cheers,
-- Nico



>


  reply	other threads:[~2022-12-15  2:49 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-13 23:45 [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound_order Nico Pache
2022-12-13 23:47 ` Mike Kravetz
2022-12-13 23:53   ` Nico Pache
2022-12-14  0:27     ` Nico Pache
2022-12-14  1:02       ` Mike Kravetz
2022-12-14  6:38         ` Sidhartha Kumar
2022-12-15  1:05           ` Nico Pache
2022-12-14  1:43 ` kernel test robot
2022-12-14  5:25 ` kernel test robot
2022-12-14 17:04 ` Matthew Wilcox
2022-12-15  2:48   ` Nico Pache [this message]
2022-12-15 21:38     ` Nico Pache
2022-12-15 21:47       ` Matthew Wilcox
2022-12-15 22:02         ` Nico Pache
2022-12-16  6:46 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA1CXcA2dGeG2tzc+-OZ77eMVpnSN2SKkdtz9LqpLPywhJMOwA@mail.gmail.com \
    --to=npache@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=sidhartha.kumar@oracle.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.