From: Peter Zijlstra <peterz@infradead.org>
To: linux-toolchains@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: jpoimboe@redhat.com, jbaron@akamai.com, rostedt@goodmis.org,
	ardb@kernel.org
Subject: static_branch/jump_label vs branch merging
Date: Thu, 8 Apr 2021 18:52:18 +0200
Message-ID: <YG80wg/2iZjXfCDJ@hirez.programming.kicks-ass.net>

Hi!

Given code like:

DEFINE_STATIC_KEY_FALSE(sched_schedstats);

#define   schedstat_enabled()		static_branch_unlikely(&sched_schedstats)
#define   schedstat_set(var, val)	do { if (schedstat_enabled()) { var = (val); } } while (0)
#define __schedstat_set(var, val)	do { var = (val); } while (0)

void foo(void)
{
	struct task_struct *p = current;

	schedstat_set(p->se.statistics.wait_start,  0);
	schedstat_set(p->se.statistics.sleep_start, 0);
	schedstat_set(p->se.statistics.block_start, 0);
}

Where the static_branch_unlikely() ends up being:

static __always_inline bool arch_static_branch(struct static_key * const key, const bool branch)
{
	asm_volatile_goto("1:"
		".byte " __stringify(BYTES_NOP5) "\n\t"
		".pushsection __jump_table,  \"aw\" \n\t"
		_ASM_ALIGN "\n\t"
		".long 1b - ., %l[l_yes] - . \n\t"
		_ASM_PTR "%c0 + %c1 - .\n\t"
		".popsection \n\t"
		: :  "i" (key), "i" (branch) : : l_yes);

	return false;
l_yes:
	return true;
}

The compiler gives us code like:

000000000000a290 <foo>:
    a290:       65 48 8b 04 25 00 00 00 00      mov    %gs:0x0,%rax
                        a295: R_X86_64_32S      current_task
    a299:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    a29e:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    a2a3:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    a2a8:       c3                      retq
    a2a9:       48 c7 80 28 01 00 00 00 00 00 00        movq   $0x0,0x128(%rax)
    a2b4:       eb e8                   jmp    a29e <foo+0xe>
    a2b6:       48 c7 80 58 01 00 00 00 00 00 00        movq   $0x0,0x158(%rax)
    a2c1:       eb e0                   jmp    a2a3 <foo+0x13>
    a2c3:       48 c7 80 70 01 00 00 00 00 00 00        movq   $0x0,0x170(%rax)
    a2ce:       c3                      retq


Now, in this case I can easily rewrite foo like:

void foo2(void)
{
	struct task_struct *p = current;

	if (schedstat_enabled()) {
		__schedstat_set(p->se.statistics.wait_start,  0);
		__schedstat_set(p->se.statistics.sleep_start, 0);
		__schedstat_set(p->se.statistics.block_start, 0);
	}
}

Which gives the far more reasonable:

000000000000a2d0 <foo2>:
    a2d0:       65 48 8b 04 25 00 00 00 00      mov    %gs:0x0,%rax
                        a2d5: R_X86_64_32S      current_task
    a2d9:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    a2de:       c3                      retq
    a2df:       48 c7 80 28 01 00 00 00 00 00 00        movq   $0x0,0x128(%rax)
    a2ea:       48 c7 80 58 01 00 00 00 00 00 00        movq   $0x0,0x158(%rax)
    a2f5:       48 c7 80 70 01 00 00 00 00 00 00        movq   $0x0,0x170(%rax)
    a300:       c3                      retq

But I've found a few sites where this isn't so trivial.
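
The message does not name those sites. As a purely hypothetical
illustration of a shape that makes hand-merging awkward, the userspace
sketch below puts unconditional work between two guarded stores.
sketch_key is an ordinary bool standing in for the static key, and every
name here is made up for the example; none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a static key: an ordinary bool, no code patching. */
bool sketch_key;

struct sketch_stats {
	unsigned long wait_start;
	unsigned long sleep_start;
	unsigned long nr_events;	/* updated unconditionally */
};

/* Two tests of the key with unconditional work in between. */
void sketch_interleaved(struct sketch_stats *s)
{
	if (sketch_key)
		s->wait_start = 0;
	s->nr_events++;
	if (sketch_key)
		s->sleep_start = 0;
}

/* One test of the key, at the cost of duplicating the unconditional work. */
void sketch_hand_merged(struct sketch_stats *s)
{
	if (sketch_key) {
		s->wait_start = 0;
		s->nr_events++;
		s->sleep_start = 0;
	} else {
		s->nr_events++;
	}
}

/* Returns true if both forms leave identical state for the given key value. */
bool sketch_same(bool key)
{
	struct sketch_stats a = { 1, 2, 3 }, b = { 1, 2, 3 };

	sketch_key = key;
	sketch_interleaved(&a);
	sketch_hand_merged(&b);

	return a.wait_start == b.wait_start &&
	       a.sleep_start == b.sleep_start &&
	       a.nr_events == b.nr_events;
}

/* Aborts via assert() if the two forms ever disagree. */
void sketch_selfcheck(void)
{
	assert(sketch_same(false));
	assert(sketch_same(true));
}
```

Merging by hand here means either duplicating the middle statement, as
above, or proving it can be moved past the guarded stores, which is less
obvious when the middle part is a function call.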

Is there *any* way in which we can have the compiler recognise that the
asm_goto only depends on its arguments and have it merge the branches
itself?
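
As a rough userspace illustration of the merge being asked for, the
sketch below replaces the static key with a plain bool (demo_key, a
hypothetical name) so it can be built outside the kernel. It only shows
that the foo() and foo2() shapes are equivalent when the key does not
change between the tests; it says nothing about how a compiler would
treat the real asm goto.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the static key: a plain bool, not a patched NOP/JMP. */
bool demo_key;

struct demo_stats {
	unsigned long wait_start;
	unsigned long sleep_start;
	unsigned long block_start;
};

#define demo_enabled()		(demo_key)
#define demo_set(var, val)	do { if (demo_enabled()) { var = (val); } } while (0)
#define demo_set_raw(var, val)	do { var = (val); } while (0)

/* Three independent tests of the key, shaped like foo(). */
void demo_unmerged(struct demo_stats *s)
{
	demo_set(s->wait_start, 0);
	demo_set(s->sleep_start, 0);
	demo_set(s->block_start, 0);
}

/* One test guarding all three stores, shaped like foo2(). */
void demo_merged(struct demo_stats *s)
{
	if (demo_enabled()) {
		demo_set_raw(s->wait_start, 0);
		demo_set_raw(s->sleep_start, 0);
		demo_set_raw(s->block_start, 0);
	}
}

/* Returns true if both forms leave identical state for the given key value. */
bool demo_equivalent(bool key)
{
	struct demo_stats a = { 1, 2, 3 }, b = { 1, 2, 3 };

	demo_key = key;
	demo_unmerged(&a);
	demo_merged(&b);

	return a.wait_start == b.wait_start &&
	       a.sleep_start == b.sleep_start &&
	       a.block_start == b.block_start;
}

/* Aborts via assert() if the two forms ever disagree. */
void demo_selfcheck(void)
{
	assert(demo_equivalent(false));
	assert(demo_equivalent(true));
}
```

With an ordinary bool the compiler is free to do this merge on its own;
the question above is whether it can be told the same thing about the
asm goto.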

I do realize that, with asm-goto being volatile, this is a fairly huge
ask, but I figured I should at least raise the issue, if only to raise
awareness.



