All of lore.kernel.org
 help / color / mirror / Atom feed
From: Will Deacon <will@kernel.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: linux-arch <linux-arch@vger.kernel.org>,
	tony.luck@intel.com, Catalin Marinas <catalin.marinas@arm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
	Jan Glauber <jan.glauber@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: lockref scalability on x86-64 vs cpu_relax
Date: Fri, 13 Jan 2023 09:46:25 +0000	[thread overview]
Message-ID: <20230113094625.GA12235@willie-the-truck> (raw)
In-Reply-To: <CAGudoHE0tzL8OAqvwpDR4Nn_g70a8qBdE_+-fmhXF-DEx_K6kg@mail.gmail.com>

On Fri, Jan 13, 2023 at 02:12:50AM +0100, Mateusz Guzik wrote:
> On 1/13/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > Side note on your access() changes - if it turns out that you can
> > remove all the cred games, we should possibly then revert my old
> > commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> > temporary subjective credentials") which avoided the biggest issue
> > with the unnecessary cred switching.
> >
> > I *think* access() is the only user of that special 'non_rcu' thing,
> > but it is possible that the whole 'non_rcu' thing ends up mattering
> > for cases where the cred actually does change because euid != uid (ie
> > suid programs), so this would need a bit more effort to do performance
> > testing on.
> >
> 
> I don't think the games are avoidable. For one I found non-root
> processes with non-empty cap_effective even on my laptop, albeit I did
> not check how often something like this is doing access().
> 
> Discussion for another time.
> 
> > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >> All that said, I think the thing to do here is to replace cpu_relax
> >> with a dedicated arch-dependent macro, akin to the following:
> >
> > I would actually prefer just removing it entirely and see if somebody
> > else hollers. You have the numbers to prove it hurts on real hardware,
> > and I don't think we have any numbers to the contrary.
> >
> > So I think it's better to trust the numbers and remove it as a
> > failure, than say "let's just remove it on x86-64 and leave everybody
> > else with the potentially broken code"
> >
> [snip]
> > Then other architectures can try to run their numbers, and only *if*
> > it then turns out that they have a reason to do something else should
> > we make this conditional and different on different architectures.
> >
> > Let's try to keep the code as common as possibly until we have hard
> > evidence for special cases, in other words.
> >
> 
> I did not want to make such a change without redoing the ThunderX2
> benchmark, or at least something else arm64-y. I may be able to bench it
> tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
> no arm64 people show up with their results.
> 
> Even then IMHO the safest route is to patch it out on x86-64 and give
> other people time to bench their archs as they get around to it, and
> ultimately whack the thing if it turns out nobody benefits from it.
> I would say beats backpedaling on the removal, but I'm not going to
> fight for it.
> 
> That said, does waiting for arm64 numbers and/or producing them for the
> removal commit message sound like a plan? If so, I'll post soon(tm).

Honestly, I wouldn't worry about us (arm64) here. I don't think any real
hardware implements the YIELD instruction (i.e. it behaves as a NOP in
practice). The only place I'm aware of where it _does_ something is in
QEMU, which was actually the motivation behind having it in cpu_relax() to
start with (see 1baa82f48030 ("arm64: Implement cpu_relax as yield")).

So, from the arm64 side of the fence, I'm perfectly happy just removing
the cpu_relax() calls from lockref.

Will

WARNING: multiple messages have this Message-ID (diff)
From: Will Deacon <will@kernel.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	Jan Glauber <jan.glauber@gmail.com>,
	tony.luck@intel.com,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: lockref scalability on x86-64 vs cpu_relax
Date: Fri, 13 Jan 2023 09:46:25 +0000	[thread overview]
Message-ID: <20230113094625.GA12235@willie-the-truck> (raw)
In-Reply-To: <CAGudoHE0tzL8OAqvwpDR4Nn_g70a8qBdE_+-fmhXF-DEx_K6kg@mail.gmail.com>

On Fri, Jan 13, 2023 at 02:12:50AM +0100, Mateusz Guzik wrote:
> On 1/13/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > Side note on your access() changes - if it turns out that you can
> > remove all the cred games, we should possibly then revert my old
> > commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> > temporary subjective credentials") which avoided the biggest issue
> > with the unnecessary cred switching.
> >
> > I *think* access() is the only user of that special 'non_rcu' thing,
> > but it is possible that the whole 'non_rcu' thing ends up mattering
> > for cases where the cred actually does change because euid != uid (ie
> > suid programs), so this would need a bit more effort to do performance
> > testing on.
> >
> 
> I don't think the games are avoidable. For one I found non-root
> processes with non-empty cap_effective even on my laptop, albeit I did
> not check how often something like this is doing access().
> 
> Discussion for another time.
> 
> > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >> All that said, I think the thing to do here is to replace cpu_relax
> >> with a dedicated arch-dependent macro, akin to the following:
> >
> > I would actually prefer just removing it entirely and see if somebody
> > else hollers. You have the numbers to prove it hurts on real hardware,
> > and I don't think we have any numbers to the contrary.
> >
> > So I think it's better to trust the numbers and remove it as a
> > failure, than say "let's just remove it on x86-64 and leave everybody
> > else with the potentially broken code"
> >
> [snip]
> > Then other architectures can try to run their numbers, and only *if*
> > it then turns out that they have a reason to do something else should
> > we make this conditional and different on different architectures.
> >
> > Let's try to keep the code as common as possibly until we have hard
> > evidence for special cases, in other words.
> >
> 
> I did not want to make such a change without redoing the ThunderX2
> benchmark, or at least something else arm64-y. I may be able to bench it
> tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
> no arm64 people show up with their results.
> 
> Even then IMHO the safest route is to patch it out on x86-64 and give
> other people time to bench their archs as they get around to it, and
> ultimately whack the thing if it turns out nobody benefits from it.
> I would say beats backpedaling on the removal, but I'm not going to
> fight for it.
> 
> That said, does waiting for arm64 numbers and/or producing them for the
> removal commit message sound like a plan? If so, I'll post soon(tm).

Honestly, I wouldn't worry about us (arm64) here. I don't think any real
hardware implements the YIELD instruction (i.e. it behaves as a NOP in
practice). The only place I'm aware of where it _does_ something is in
QEMU, which was actually the motivation behind having it in cpu_relax() to
start with (see 1baa82f48030 ("arm64: Implement cpu_relax as yield")).

So, from the arm64 side of the fence, I'm perfectly happy just removing
the cpu_relax() calls from lockref.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Will Deacon <will@kernel.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	Jan Glauber <jan.glauber@gmail.com>,
	tony.luck@intel.com,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: lockref scalability on x86-64 vs cpu_relax
Date: Fri, 13 Jan 2023 09:46:25 +0000	[thread overview]
Message-ID: <20230113094625.GA12235@willie-the-truck> (raw)
In-Reply-To: <CAGudoHE0tzL8OAqvwpDR4Nn_g70a8qBdE_+-fmhXF-DEx_K6kg@mail.gmail.com>

On Fri, Jan 13, 2023 at 02:12:50AM +0100, Mateusz Guzik wrote:
> On 1/13/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > Side note on your access() changes - if it turns out that you can
> > remove all the cred games, we should possibly then revert my old
> > commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> > temporary subjective credentials") which avoided the biggest issue
> > with the unnecessary cred switching.
> >
> > I *think* access() is the only user of that special 'non_rcu' thing,
> > but it is possible that the whole 'non_rcu' thing ends up mattering
> > for cases where the cred actually does change because euid != uid (ie
> > suid programs), so this would need a bit more effort to do performance
> > testing on.
> >
> 
> I don't think the games are avoidable. For one I found non-root
> processes with non-empty cap_effective even on my laptop, albeit I did
> not check how often something like this is doing access().
> 
> Discussion for another time.
> 
> > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >> All that said, I think the thing to do here is to replace cpu_relax
> >> with a dedicated arch-dependent macro, akin to the following:
> >
> > I would actually prefer just removing it entirely and see if somebody
> > else hollers. You have the numbers to prove it hurts on real hardware,
> > and I don't think we have any numbers to the contrary.
> >
> > So I think it's better to trust the numbers and remove it as a
> > failure, than say "let's just remove it on x86-64 and leave everybody
> > else with the potentially broken code"
> >
> [snip]
> > Then other architectures can try to run their numbers, and only *if*
> > it then turns out that they have a reason to do something else should
> > we make this conditional and different on different architectures.
> >
> > Let's try to keep the code as common as possibly until we have hard
> > evidence for special cases, in other words.
> >
> 
> I did not want to make such a change without redoing the ThunderX2
> benchmark, or at least something else arm64-y. I may be able to bench it
> tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
> no arm64 people show up with their results.
> 
> Even then IMHO the safest route is to patch it out on x86-64 and give
> other people time to bench their archs as they get around to it, and
> ultimately whack the thing if it turns out nobody benefits from it.
> I would say beats backpedaling on the removal, but I'm not going to
> fight for it.
> 
> That said, does waiting for arm64 numbers and/or producing them for the
> removal commit message sound like a plan? If so, I'll post soon(tm).

Honestly, I wouldn't worry about us (arm64) here. I don't think any real
hardware implements the YIELD instruction (i.e. it behaves as a NOP in
practice). The only place I'm aware of where it _does_ something is in
QEMU, which was actually the motivation behind having it in cpu_relax() to
start with (see 1baa82f48030 ("arm64: Implement cpu_relax as yield")).

So, from the arm64 side of the fence, I'm perfectly happy just removing
the cpu_relax() calls from lockref.

Will

  parent reply	other threads:[~2023-01-13  9:47 UTC|newest]

Thread overview: 108+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12 23:36 lockref scalability on x86-64 vs cpu_relax Mateusz Guzik
2023-01-13  0:13 ` Linus Torvalds
2023-01-13  0:13   ` Linus Torvalds
2023-01-13  0:13   ` Linus Torvalds
2023-01-13  0:30   ` Luck, Tony
2023-01-13  0:30     ` Luck, Tony
2023-01-13  0:30     ` Luck, Tony
2023-01-13  0:45     ` Linus Torvalds
2023-01-13  0:45       ` Linus Torvalds
2023-01-13  0:45       ` Linus Torvalds
2023-01-13  7:55     ` ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax) Ard Biesheuvel
2023-01-13  7:55       ` Ard Biesheuvel
2023-01-13  7:55       ` Ard Biesheuvel
2023-01-13 16:17       ` Luck, Tony
2023-01-13 16:17         ` Luck, Tony
2023-01-13 16:17         ` Luck, Tony
2023-01-13 20:49       ` Jessica Clarke
2023-01-13 20:49         ` Jessica Clarke
2023-01-13 20:49         ` Jessica Clarke
2023-01-13 21:03         ` Luck, Tony
2023-01-13 21:03           ` Luck, Tony
2023-01-13 21:03           ` Luck, Tony
2023-01-13 21:04           ` Jessica Clarke
2023-01-13 21:04             ` Jessica Clarke
2023-01-13 21:04             ` Jessica Clarke
2023-01-13 21:05       ` John Paul Adrian Glaubitz
2023-01-13 21:05         ` John Paul Adrian Glaubitz
2023-01-13 21:05         ` John Paul Adrian Glaubitz
2023-01-13 23:25         ` Ard Biesheuvel
2023-01-13 23:25           ` Ard Biesheuvel
2023-01-13 23:25           ` Ard Biesheuvel
2023-01-14 11:24           ` Sedat Dilek
2023-01-14 11:24             ` Sedat Dilek
2023-01-14 11:24             ` Sedat Dilek
2023-01-14 11:28             ` Sedat Dilek
2023-01-14 11:28               ` Sedat Dilek
2023-01-14 11:28               ` Sedat Dilek
2023-01-15  0:27               ` Matthew Wilcox
2023-01-15  0:27                 ` Matthew Wilcox
2023-01-15  0:27                 ` Matthew Wilcox
2023-01-15 12:04                 ` Sedat Dilek
2023-01-15 12:04                   ` Sedat Dilek
2023-01-15 12:04                   ` Sedat Dilek
2023-01-16  9:42                   ` John Paul Adrian Glaubitz
2023-01-16  9:42                     ` John Paul Adrian Glaubitz
2023-01-16  9:42                     ` John Paul Adrian Glaubitz
2023-01-16  9:41                 ` John Paul Adrian Glaubitz
2023-01-16  9:41                   ` John Paul Adrian Glaubitz
2023-01-16  9:41                   ` John Paul Adrian Glaubitz
2023-01-16 13:28                   ` Matthew Wilcox
2023-01-16 13:28                     ` Matthew Wilcox
2023-01-16 13:28                     ` Matthew Wilcox
2023-01-16  9:40               ` John Paul Adrian Glaubitz
2023-01-16  9:40                 ` John Paul Adrian Glaubitz
2023-01-16  9:40                 ` John Paul Adrian Glaubitz
2023-01-16  9:37             ` John Paul Adrian Glaubitz
2023-01-16  9:37               ` John Paul Adrian Glaubitz
2023-01-16  9:37               ` John Paul Adrian Glaubitz
2023-01-16  9:32           ` John Paul Adrian Glaubitz
2023-01-16  9:32             ` John Paul Adrian Glaubitz
2023-01-16  9:32             ` John Paul Adrian Glaubitz
2023-01-16 10:09             ` Ard Biesheuvel
2023-01-16 10:09               ` Ard Biesheuvel
2023-01-16 10:09               ` Ard Biesheuvel
2023-01-13  1:12   ` lockref scalability on x86-64 vs cpu_relax Mateusz Guzik
2023-01-13  1:12     ` Mateusz Guzik
2023-01-13  1:12     ` Mateusz Guzik
2023-01-13  4:08     ` Linus Torvalds
2023-01-13  4:08       ` Linus Torvalds
2023-01-13  4:08       ` Linus Torvalds
2023-01-13  9:46     ` Will Deacon [this message]
2023-01-13  9:46       ` Will Deacon
2023-01-13  9:46       ` Will Deacon
2023-01-13  3:20   ` Nicholas Piggin
2023-01-13  3:20     ` Nicholas Piggin
2023-01-13  3:20     ` Nicholas Piggin
2023-01-13  4:15     ` Linus Torvalds
2023-01-13  4:15       ` Linus Torvalds
2023-01-13  4:15       ` Linus Torvalds
2023-01-13  5:36       ` Nicholas Piggin
2023-01-13  5:36         ` Nicholas Piggin
2023-01-13  5:36         ` Nicholas Piggin
2023-01-16 14:08     ` Memory transaction instructions David Howells
2023-01-16 14:08       ` David Howells
2023-01-16 15:09       ` Matthew Wilcox
2023-01-16 15:09         ` Matthew Wilcox
2023-01-16 15:09         ` Matthew Wilcox
2023-01-16 16:59       ` Linus Torvalds
2023-01-16 16:59         ` Linus Torvalds
2023-01-16 16:59         ` Linus Torvalds
2023-01-18  9:05       ` David Howells
2023-01-18  9:05         ` David Howells
2023-01-18  9:05         ` David Howells
2023-01-19  1:41         ` Nicholas Piggin
2023-01-19  1:41           ` Nicholas Piggin
2023-01-19  1:41           ` Nicholas Piggin
2023-01-13 10:23   ` lockref scalability on x86-64 vs cpu_relax Peter Zijlstra
2023-01-13 10:23     ` Peter Zijlstra
2023-01-13 10:23     ` Peter Zijlstra
2023-01-13 18:44   ` [PATCH] lockref: stop doing cpu_relax in the cmpxchg loop Mateusz Guzik
2023-01-13 18:44     ` Mateusz Guzik
2023-01-13 18:44     ` Mateusz Guzik
2023-01-13 21:47     ` Luck, Tony
2023-01-13 21:47       ` Luck, Tony
2023-01-13 21:47       ` Luck, Tony
2023-01-13 23:31       ` Linus Torvalds
2023-01-13 23:31         ` Linus Torvalds
2023-01-13 23:31         ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230113094625.GA12235@willie-the-truck \
    --to=will@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=jan.glauber@gmail.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mjguzik@gmail.com \
    --cc=tony.luck@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.