From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	"David S. Miller" <davem@davemloft.net>,
	David Daney <david.daney@cavium.com>,
	Michael Ellerman <michael@ellerman.id.au>,
	Jan Glauber <jang@linux.vnet.ibm.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	Xen Devel <xen-devel@lists.xensource.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>,
	peterz@infradead.org, hpa@zytor.com
Subject: Re: [PATCH RFC V2 3/5] jump_label: if a key has already been initialized, don't nop it out
Date: Tue, 04 Oct 2011 08:18:40 -0700
Message-ID: <4E8B23D0.6030503@goop.org>
In-Reply-To: <20111004141011.GA2520@redhat.com>

On 10/04/2011 07:10 AM, Jason Baron wrote:
> On Mon, Oct 03, 2011 at 09:27:56AM -0700, Jeremy Fitzhardinge wrote:
>> On 10/03/2011 08:02 AM, Jason Baron wrote:
>>> Hi,
>>>
>>> (Sorry for the late reply - I was away for a few days).
>>>
>>> The early enable is really nice - it means there are no restrictions on
>>> when jump_label_inc()/dec() can be called.
>>>
>>> comments below.
>>>
>>>
>>> On Sat, Oct 01, 2011 at 02:55:35PM -0700, Jeremy Fitzhardinge wrote:
>>>> From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>>>>
>>>> If a key has been enabled before jump_label_init() is called, don't
>>>> nop it out.
>>>>
>>>> This removes arch_jump_label_text_poke_early() (which can only nop
>>>> out a site) and uses arch_jump_label_transform() instead.
>>>>
>>>> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>>>> ---
>>>>  include/linux/jump_label.h |    3 ++-
>>>>  kernel/jump_label.c        |   20 ++++++++------------
>>>>  2 files changed, 10 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
>>>> index 1213e9d..c8fb1b3 100644
>>>> --- a/include/linux/jump_label.h
>>>> +++ b/include/linux/jump_label.h
>>>> @@ -45,7 +45,8 @@ extern void jump_label_lock(void);
>>>>  extern void jump_label_unlock(void);
>>>>  extern void arch_jump_label_transform(struct jump_entry *entry,
>>>>  				 enum jump_label_type type);
>>>> -extern void arch_jump_label_text_poke_early(jump_label_t addr);
>>>> +extern void arch_jump_label_transform_early(struct jump_entry *entry,
>>>> +				 enum jump_label_type type);
>>>>  extern int jump_label_text_reserved(void *start, void *end);
>>>>  extern void jump_label_inc(struct jump_label_key *key);
>>>>  extern void jump_label_dec(struct jump_label_key *key);
>>>> diff --git a/kernel/jump_label.c b/kernel/jump_label.c
>>>> index a8ce450..059202d5 100644
>>>> --- a/kernel/jump_label.c
>>>> +++ b/kernel/jump_label.c
>>>> @@ -121,13 +121,6 @@ static void __jump_label_update(struct jump_label_key *key,
>>>>  	}
>>>>  }
>>>>  
>>>> -/*
>>>> - * Not all archs need this.
>>>> - */
>>>> -void __weak arch_jump_label_text_poke_early(jump_label_t addr)
>>>> -{
>>>> -}
>>>> -
>>>>  static __init int jump_label_init(void)
>>>>  {
>>>>  	struct jump_entry *iter_start = __start___jump_table;
>>>> @@ -139,12 +132,15 @@ static __init int jump_label_init(void)
>>>>  	jump_label_sort_entries(iter_start, iter_stop);
>>>>  
>>>>  	for (iter = iter_start; iter < iter_stop; iter++) {
>>>> -		arch_jump_label_text_poke_early(iter->code);
>>>> -		if (iter->key == (jump_label_t)(unsigned long)key)
>>>> +		struct jump_label_key *iterk;
>>>> +
>>>> +		iterk = (struct jump_label_key *)(unsigned long)iter->key;
>>>> +		arch_jump_label_transform(iter, jump_label_enabled(iterk) ?
>>>> +					  JUMP_LABEL_ENABLE : JUMP_LABEL_DISABLE);
>>> The only reason I called this at boot-time was that the 'ideal' x86
>>> no-op isn't known until boot time. Thus, in the enabled case we could
>>> skip the arch_jump_label_transform() call. ie:
>>>
>>> if (!enabled)
>>> 	arch_jump_label_transform(iter, JUMP_LABEL_DISABLE);
>>
>> Yep, fair enough.
>>
>>>
>>>> +		if (iterk == key)
>>>>  			continue;
>>>>  
>>>> -		key = (struct jump_label_key *)(unsigned long)iter->key;
>>>> -		atomic_set(&key->enabled, 0);
>>>> +		key = iterk;
>>>>  		key->entries = iter;
>>>>  #ifdef CONFIG_MODULES
>>>>  		key->next = NULL;
>>>> @@ -212,7 +208,7 @@ void jump_label_apply_nops(struct module *mod)
>>>>  		return;
>>>>  
>>>>  	for (iter = iter_start; iter < iter_stop; iter++)
>>>> -		arch_jump_label_text_poke_early(iter->code);
>>>> +		arch_jump_label_transform(iter, JUMP_LABEL_DISABLE);
>>>>  }
>>>>  
>>>>  static int jump_label_add_module(struct module *mod)
>>>> -- 
>>>> 1.7.6.2
>>>>
>>> hmmm...this is used on module load in smp - so this would introduce a number of
>>> calls to stop_machine() where we didn't have them before. Yes, module
>>> load is a very slow path to begin with, but I think it's at least worth
>>> pointing out...
>> Ah, that explains it - the module stuff certainly isn't "early" except -
>> I guess - in the module's lifetime.
>>
>> Well, I suppose I could introduce either a second variant of the function,
>> or add a "live" flag (ie, may be updating code that a processor is
>> executing), which requires a stop_machine, or direct update if it doesn't.
>>
>> But is there any reason why we couldn't just generate a reasonably
>> efficient 5-byte atomic nop in the first place, and get rid of all that
>> fooling around?  It looks like x86 is the only arch where it makes any
>> difference at all, and how much difference does it really make?  Or is
>> there no one 5-byte atomic nop that works on all x86 variants, aside
>> from jmp +0?
>>
>>     J
> Yes, there are really two reasons for the initial no-op patching pass:
>
> 1) The jmp +0, is a 'safe' no-op that I know is going to initially
> boot for all x86. I'm not sure if there is a 5-byte nop that works on
> all x86 variants - but by using jmp +0, we make it much easier to debug
> cases where we may be using broken no-ops.
>
> 2) This optimization is about as close to a 0 cost off case as possible.
> I know there have been various no-op benchmarks posted on lkml in the
> past, so the choice of no-op does seem to make a difference. see:
> http://lkml.indiana.edu/hypermail/linux/kernel/0808.1/2416.html, for
> example. So at least to me, if we are not using the lowest cost no-op,
> we are at least in-part defeating the point of this optimization.
>
> I like the "live" flag suggestion mentioned above. Fewer functions are
> better, and non-x86 arches can simply ignore the flag.

I went the other way and added a second function,
arch_jump_label_transform_static(), which has a weak default
implementation that calls arch_jump_label_transform().  That way only
the architectures that really care about it need to implement a second
variant. I did x86 and s390 by adapting the patches I had from the other
series; the mips/sparc/power implementations didn't look heavyweight
enough to need one.
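
For illustration only, here is a rough userspace sketch of the weak-default
idea, not the actual patch. The types are simplified stand-ins for the
kernel ones, and live_patch_calls and apply_nops() are hypothetical names
made up for this example; apply_nops() loosely mirrors the loop in
jump_label_apply_nops(). It assumes GCC or Clang for __attribute__((weak)).

```c
#include <assert.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum jump_label_type { JUMP_LABEL_DISABLE = 0, JUMP_LABEL_ENABLE };

struct jump_entry {
	unsigned long code;
	unsigned long target;
	unsigned long key;
};

/* Hypothetical counter so the fallback path can be observed. */
static int live_patch_calls;

/*
 * Stand-in for the "live" variant, which may be patching code that
 * another CPU is executing (on x86 this is where stop_machine() comes in).
 */
void arch_jump_label_transform(struct jump_entry *entry,
			       enum jump_label_type type)
{
	(void)entry;
	(void)type;
	live_patch_calls++;
}

/*
 * Weak default for the "static" variant: just fall back to the live one.
 * An architecture that wants a cheaper direct update provides a strong
 * definition with the same name, which the linker picks instead.
 */
__attribute__((weak))
void arch_jump_label_transform_static(struct jump_entry *entry,
				      enum jump_label_type type)
{
	arch_jump_label_transform(entry, type);
}

/* Hypothetical helper: disable every site in [start, stop). */
int apply_nops(struct jump_entry *start, struct jump_entry *stop)
{
	struct jump_entry *iter;
	int patched = 0;

	assert(start <= stop);
	for (iter = start; iter < stop; iter++) {
		arch_jump_label_transform_static(iter, JUMP_LABEL_DISABLE);
		patched++;
	}
	return patched;
}
```

With no strong override linked in, every call goes through the live
variant, so architectures that don't care see no change in behaviour.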

    J
