From: Uros Bizjak <ubizjak@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
Nadav Amit <namit@vmware.com>, Andy Lutomirski <luto@kernel.org>,
Brian Gerst <brgerst@gmail.com>,
Denys Vlasenko <dvlasenk@redhat.com>,
"H . Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
Date: Wed, 11 Oct 2023 22:00:42 +0200
Message-ID: <CAFULd4ZSorEEkUZOobAyDzkyG+DujEoUOGiMPuiqd9V3C-a39w@mail.gmail.com>
In-Reply-To: <CAHk-=wiLyA0g3BvQ_nsF2PWi-FDtcNS5+4-ai1FX-xFzTBeTzg@mail.gmail.com>
On Wed, Oct 11, 2023 at 9:52 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, 11 Oct 2023 at 11:42, Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > The attached patch was tested on a target with fsgsbase CPUID and
> > without it. It works!
>
> .. I should clearly read all my emails before answering some of them.
>
> Yes, that patch looks good to me, and I'm happy to hear that you
> actually tested it unlike my "maybe something like this".
>
> > The patch improves amd_pmu_enable_virt() in the same way as reported
> > in the original patch submission and also reduces the number of percpu
> > offset reads (either from this_cpu_off or with rdgsbase) from 1663 to
> > 1571.
>
> Do you have any actual performance numbers? The patch looks good to
> me, and I *think* rdgsbase ends up being faster in practice due to
> avoiding a memory access, but that's very much a gut feel.
Unfortunately, I don't have any perf numbers, only those from Agner
Fog's instruction tables. Memory access performance depends on so many
parameters that gut feeling is the only guide besides real
case-by-case measurements. The rule of thumb in the compiler world is
also that memory accesses should be avoided.
Uros.
>
> > The only drawback is a larger binary size:
> >
> >     text    data    bss      dec     hex filename
> > 25546594 4387686 808452 30742732 1d518cc vmlinux-new.o
> > 25515256 4387814 808452 30711522 1d49ee2 vmlinux-old.o
> >
> > that increases by 31k (0.123%), probably due to 1578 rdgsbase alternatives.
>
> I'm actually surprised that it increases the text size. The 'rdgsbase'
> instruction should be smaller than a 'mov %gs', so I would have
> expected the *data* size to increase due to the alternatives tables,
> but not the text size.
>
> [ Looks around ]
>
> Oh. It's because we put the altinstructions into the text section.
> That's kind of silly, but whatever.
>
> So I think that increase in text-size is not "real" - yes, it
> increases our binary size because we obviously have two instructions,
> but the actual *executable* part likely stays the same, and it's just
> that we grow the altinstruction metadata.
>
> Linus
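
For reference, the size figures quoted above can be re-derived with a few lines of C. This is only a sketch of the arithmetic: the struct and function names are made up for illustration, and the per-site figure assumes that all of the text growth comes from the 1578 rdgsbase alternatives, which the thread only states as "probably".

```c
#include <assert.h>
#include <stdint.h>

/* Section sizes as printed by size(1). Hypothetical helper type. */
struct image_size {
	uint64_t text;
	uint64_t data;
	uint64_t bss;
};

/* Growth of the text section in bytes (new minus old). */
int64_t text_growth_bytes(struct image_size new_img, struct image_size old_img)
{
	return (int64_t)new_img.text - (int64_t)old_img.text;
}

/* The same growth as a percentage of the old text size. */
double text_growth_percent(struct image_size new_img, struct image_size old_img)
{
	assert(old_img.text > 0);
	return 100.0 * (double)text_growth_bytes(new_img, old_img) /
	       (double)old_img.text;
}

/*
 * Average growth per alternative site. ASSUMPTION: every added byte
 * is attributed to an alternative; nothing in the thread measures this.
 */
double growth_per_site(int64_t growth, unsigned int sites)
{
	assert(sites > 0);
	return (double)growth / (double)sites;
}
```

With the vmlinux-new.o and vmlinux-old.o numbers from the table, the text delta is 31338 bytes, about 0.123% of the old text size, or roughly 20 bytes per site under the assumption above.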