All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] objtool: Avoid pointless modifications
@ 2021-10-07 21:22 Peter Zijlstra
  2021-10-07 21:22 ` [PATCH 1/2] objtool: Optimize re-writing jump_label Peter Zijlstra
  2021-10-07 21:22 ` [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2021-10-07 21:22 UTC (permalink / raw)
  To: jpoimboe; +Cc: linux-kernel, peterz, mbenes

While chasing that libelf warning, I started wondering why we're re-writing the
elf file to begin with, the second/noinstr pass shouldn't modify the image, so
we shouldn't end up in elf_write() in the first place.

These here two patches seem to cure this and should thereby also improve
performance somewhat.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] objtool: Optimize re-writing jump_label
  2021-10-07 21:22 [PATCH 0/2] objtool: Avoid pointless modifications Peter Zijlstra
@ 2021-10-07 21:22 ` Peter Zijlstra
  2021-10-08  6:55   ` Josh Poimboeuf
  2021-10-07 21:22 ` [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2021-10-07 21:22 UTC (permalink / raw)
  To: jpoimboe; +Cc: linux-kernel, peterz, mbenes

There's no point to re-write the jump_label NOP when it's already a NOP.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1397,7 +1397,7 @@ static int handle_jump_alt(struct objtoo
 		return -1;
 	}
 
-	if (special_alt->key_addend & 2) {
+	if ((special_alt->key_addend & 2) && orig_insn->type != INSN_NOP) {
 		struct reloc *reloc = insn_reloc(file, orig_insn);
 
 		if (reloc) {



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation
  2021-10-07 21:22 [PATCH 0/2] objtool: Avoid pointless modifications Peter Zijlstra
  2021-10-07 21:22 ` [PATCH 1/2] objtool: Optimize re-writing jump_label Peter Zijlstra
@ 2021-10-07 21:22 ` Peter Zijlstra
  2021-10-08  7:23   ` Josh Poimboeuf
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2021-10-07 21:22 UTC (permalink / raw)
  To: jpoimboe; +Cc: linux-kernel, peterz, mbenes

When re-running objtool it will generate alterantives for all
retpoline hunks, even if they are already present.

Discard the retpoline alternatives later so we can mark the
instructions as already having alternatives and subsequently skip
generating them. Use ->ignore_alts for this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c |    3 +++
 tools/objtool/check.c           |    8 ++++++++
 tools/objtool/special.c         |    8 --------
 3 files changed, 11 insertions(+), 8 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -806,6 +806,9 @@ int arch_rewrite_retpolines(struct objto
 		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
 			continue;
 
+		if (insn->ignore_alts)
+			continue;
+
 		reloc = insn->reloc;
 
 		sprintf(name, "__x86_indirect_alt_%s_%s",
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1468,6 +1468,14 @@ static int add_special_section_alts(stru
 				ret = -1;
 				goto out;
 			}
+			/*
+			 * Skip (but mark) the retpoline alternatives so that we
+			 * don't generate them again.
+			 */
+			if (new_insn->func && arch_is_retpoline(new_insn->func)) {
+				orig_insn->ignore_alts = true;
+				continue;
+			}
 		}
 
 		if (special_alt->group) {
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -109,14 +109,6 @@ static int get_alt_entry(struct elf *elf
 			return -1;
 		}
 
-		/*
-		 * Skip retpoline .altinstr_replacement... we already rewrite the
-		 * instructions for retpolines anyway, see arch_is_retpoline()
-		 * usage in add_{call,jump}_destinations().
-		 */
-		if (arch_is_retpoline(new_reloc->sym))
-			return 1;
-
 		reloc_to_sec_off(new_reloc, &alt->new_sec, &alt->new_off);
 
 		/* _ASM_EXTABLE_EX hack */



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] objtool: Optimize re-writing jump_label
  2021-10-07 21:22 ` [PATCH 1/2] objtool: Optimize re-writing jump_label Peter Zijlstra
@ 2021-10-08  6:55   ` Josh Poimboeuf
  2021-10-08 10:03     ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Josh Poimboeuf @ 2021-10-08  6:55 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mbenes

On Thu, Oct 07, 2021 at 11:22:12PM +0200, Peter Zijlstra wrote:
> There's no point to re-write the jump_label NOP when it's already a NOP.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  tools/objtool/check.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1397,7 +1397,7 @@ static int handle_jump_alt(struct objtoo
>  		return -1;
>  	}
>  
> -	if (special_alt->key_addend & 2) {
> +	if ((special_alt->key_addend & 2) && orig_insn->type != INSN_NOP) {
>  		struct reloc *reloc = insn_reloc(file, orig_insn);
>  
>  		if (reloc) {

While you're at it, a comment would be very helpful for that whole
clause.

-- 
Josh


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation
  2021-10-07 21:22 ` [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation Peter Zijlstra
@ 2021-10-08  7:23   ` Josh Poimboeuf
  2021-10-08 10:35     ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Josh Poimboeuf @ 2021-10-08  7:23 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mbenes

On Thu, Oct 07, 2021 at 11:22:13PM +0200, Peter Zijlstra wrote:
> When re-running objtool it will generate alterantives for all

"alternatives"

> retpoline hunks, even if they are already present.
> 
> Discard the retpoline alternatives later so we can mark the

Discard? or mark as ignored?

> +++ b/tools/objtool/check.c
> @@ -1468,6 +1468,14 @@ static int add_special_section_alts(stru
>  				ret = -1;
>  				goto out;
>  			}
> +			/*
> +			 * Skip (but mark) the retpoline alternatives so that we
> +			 * don't generate them again.
> +			 */

I'm having a lot of trouble following this comment.  In my half-sleeping
state I'm theorizing this serves two purposes:

  1) skip validating the alt (because why?)

     and

  2) if re-running objtool on the object, don't generate a duplicate
     alternative?  or maybe it also avoids duplicates for retpoline
     alternatives which were created in asm code?

Not sure if I'm right but either way the comment needs more content.

Also not sure about $SUBJECT, maybe it can be more specific.

BTW, this "re-running objtool" thing is probably a bigger problem that
can be handled more broadly.  When writing an object, we could write a
dummy discard section ".discard.objtool_wuz_here" which tells it not to
touch it a second time as weird things could happen.

-- 
Josh


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] objtool: Optimize re-writing jump_label
  2021-10-08  6:55   ` Josh Poimboeuf
@ 2021-10-08 10:03     ` Peter Zijlstra
  2021-10-08 16:28       ` Josh Poimboeuf
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2021-10-08 10:03 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: linux-kernel, mbenes

On Thu, Oct 07, 2021 at 11:55:50PM -0700, Josh Poimboeuf wrote:
> On Thu, Oct 07, 2021 at 11:22:12PM +0200, Peter Zijlstra wrote:
> > There's no point to re-write the jump_label NOP when it's already a NOP.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  tools/objtool/check.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -1397,7 +1397,7 @@ static int handle_jump_alt(struct objtoo
> >  		return -1;
> >  	}
> >  
> > -	if (special_alt->key_addend & 2) {
> > +	if ((special_alt->key_addend & 2) && orig_insn->type != INSN_NOP) {
> >  		struct reloc *reloc = insn_reloc(file, orig_insn);
> >  
> >  		if (reloc) {
> 
> While you're at it, a comment would be very helpful for that whole
> clause.

Like so?

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1397,7 +1397,16 @@ static int handle_jump_alt(struct objtoo
 		return -1;
 	}
 
-	if (special_alt->key_addend & 2) {
+	/*
+	 * When, for whatever reason, the jump-label site cannot emit a right
+	 * sized NOP, then it can use Bit-1 of the struct static_key pointer to
+	 * indicate this instruction should be NOP'ed by objtool.
+	 *
+	 * Also see arch/x86/include/asm/jump_label.h:arch_static_branch(),
+	 * where we leave the assembler to pick between jmp.d8 and jmp.d32
+	 * based on destination offset.
+	 */
+	if ((special_alt->key_addend & 2) && orig_insn->type != INSN_NOP) {
 		struct reloc *reloc = insn_reloc(file, orig_insn);
 
 		if (reloc) {

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation
  2021-10-08  7:23   ` Josh Poimboeuf
@ 2021-10-08 10:35     ` Peter Zijlstra
  2021-10-08 16:39       ` Josh Poimboeuf
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2021-10-08 10:35 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: linux-kernel, mbenes

On Fri, Oct 08, 2021 at 12:23:25AM -0700, Josh Poimboeuf wrote:
> On Thu, Oct 07, 2021 at 11:22:13PM +0200, Peter Zijlstra wrote:
> > When re-running objtool it will generate alterantives for all
> 
> "alternatives"
> 
> > retpoline hunks, even if they are already present.
> > 
> > Discard the retpoline alternatives later so we can mark the
> 
> Discard? or mark as ignored?

I used 'discard' since we don't actually generate insn->alts entries.

> > +++ b/tools/objtool/check.c
> > @@ -1468,6 +1468,14 @@ static int add_special_section_alts(stru
> >  				ret = -1;
> >  				goto out;
> >  			}
> > +			/*
> > +			 * Skip (but mark) the retpoline alternatives so that we
> > +			 * don't generate them again.
> > +			 */
> 
> I'm having a lot of trouble following this comment.  In my half-sleeping
> state I'm theorizing this serves two purposes:
> 
>   1) skip validating the alt (because why?)
> 
>      and
> 
>   2) if re-running objtool on the object, don't generate a duplicate
>      alternative?  or maybe it also avoids duplicates for retpoline
>      alternatives which were created in asm code?
> 
> Not sure if I'm right but either way the comment needs more content.
> 
> Also not sure about $SUBJECT, maybe it can be more specific.

Below better?

> BTW, this "re-running objtool" thing is probably a bigger problem that
> can be handled more broadly.  When writing an object, we could write a
> dummy discard section ".discard.objtool_wuz_here" which tells it not to
> touch it a second time as weird things could happen.

Section can't work, since we run the first pass on individual
translations units, so if we get the wuz_here tag from one TU we can't
tell if we perhaps forgot to run on another.

Better detecting if there's actual work to do seems safer to me.

What I actually did yesterday was hack up --noinstr to WARN if there was
an elf modification done, I could turn that into a --ro flag or
something, which we can set on vmlinux if it's supposed to be a pure
validation pass.

---
Subject: objtool: Optimize retpoline alternative generation
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu Oct 7 23:15:34 CEST 2021

When re-running objtool it will generate alternatives for all
retpoline hunks, even if they are already present.

Instead of early discarding the retpoline alterantives, hang onto them
a little longer such that the instructions can be marked as already
having an alternative, which then in turn enables avoiding generating
another one.

Having multiple alternatives for a single site is harmless, provided
they're identical, however it does waste time and space.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c |    3 +++
 tools/objtool/check.c           |   11 +++++++++++
 tools/objtool/special.c         |    8 --------
 3 files changed, 14 insertions(+), 8 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -806,6 +806,9 @@ int arch_rewrite_retpolines(struct objto
 		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
 			continue;
 
+		if (insn->ignore_alts)
+			continue;
+
 		reloc = insn->reloc;
 
 		sprintf(name, "__x86_indirect_alt_%s_%s",
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1477,6 +1477,17 @@ static int add_special_section_alts(stru
 				ret = -1;
 				goto out;
 			}
+
+			/*
+			 * Don't generate alternative instruction streams
+			 * (insn->alts) but instead mark the retpoline call as
+			 * already having an alternative, so that we can avoid
+			 * generating another instance.
+			 */
+			if (new_insn->func && arch_is_retpoline(new_insn->func)) {
+				orig_insn->ignore_alts = true;
+				continue;
+			}
 		}
 
 		if (special_alt->group) {
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -109,14 +109,6 @@ static int get_alt_entry(struct elf *elf
 			return -1;
 		}
 
-		/*
-		 * Skip retpoline .altinstr_replacement... we already rewrite the
-		 * instructions for retpolines anyway, see arch_is_retpoline()
-		 * usage in add_{call,jump}_destinations().
-		 */
-		if (arch_is_retpoline(new_reloc->sym))
-			return 1;
-
 		reloc_to_sec_off(new_reloc, &alt->new_sec, &alt->new_off);
 
 		/* _ASM_EXTABLE_EX hack */

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] objtool: Optimize re-writing jump_label
  2021-10-08 10:03     ` Peter Zijlstra
@ 2021-10-08 16:28       ` Josh Poimboeuf
  0 siblings, 0 replies; 10+ messages in thread
From: Josh Poimboeuf @ 2021-10-08 16:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mbenes

On Fri, Oct 08, 2021 at 12:03:31PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 07, 2021 at 11:55:50PM -0700, Josh Poimboeuf wrote:
> > On Thu, Oct 07, 2021 at 11:22:12PM +0200, Peter Zijlstra wrote:
> > > There's no point to re-write the jump_label NOP when it's already a NOP.
> > > 
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > ---
> > >  tools/objtool/check.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > --- a/tools/objtool/check.c
> > > +++ b/tools/objtool/check.c
> > > @@ -1397,7 +1397,7 @@ static int handle_jump_alt(struct objtoo
> > >  		return -1;
> > >  	}
> > >  
> > > -	if (special_alt->key_addend & 2) {
> > > +	if ((special_alt->key_addend & 2) && orig_insn->type != INSN_NOP) {
> > >  		struct reloc *reloc = insn_reloc(file, orig_insn);
> > >  
> > >  		if (reloc) {
> > 
> > While you're at it, a comment would be very helpful for that whole
> > clause.
> 
> Like so?
> 
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1397,7 +1397,16 @@ static int handle_jump_alt(struct objtoo
>  		return -1;
>  	}
>  
> -	if (special_alt->key_addend & 2) {
> +	/*
> +	 * When, for whatever reason, the jump-label site cannot emit a right
> +	 * sized NOP, then it can use Bit-1 of the struct static_key pointer to
> +	 * indicate this instruction should be NOP'ed by objtool.
> +	 *
> +	 * Also see arch/x86/include/asm/jump_label.h:arch_static_branch(),
> +	 * where we leave the assembler to pick between jmp.d8 and jmp.d32
> +	 * based on destination offset.
> +	 */
> +	if ((special_alt->key_addend & 2) && orig_insn->type != INSN_NOP) {
>  		struct reloc *reloc = insn_reloc(file, orig_insn);
>  
>  		if (reloc) {
> 

Much better!  Can you also mention why it's checking INSN_NOP?  The "run
objtool twice" case is far from obvious.

-- 
Josh


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation
  2021-10-08 10:35     ` Peter Zijlstra
@ 2021-10-08 16:39       ` Josh Poimboeuf
  2021-10-09 10:42         ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Josh Poimboeuf @ 2021-10-08 16:39 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mbenes

On Fri, Oct 08, 2021 at 12:35:25PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 08, 2021 at 12:23:25AM -0700, Josh Poimboeuf wrote:
> > On Thu, Oct 07, 2021 at 11:22:13PM +0200, Peter Zijlstra wrote:
> > > When re-running objtool it will generate alterantives for all
> > 
> > "alternatives"
> > 
> > > retpoline hunks, even if they are already present.
> > > 
> > > Discard the retpoline alternatives later so we can mark the
> > 
> > Discard? or mark as ignored?
> 
> I used 'discard' since we don't actually generate insn->alts entries.

I still don't like 'discard', it sounds like you're removing the
existing ALTERNATIVE from the file.

> > BTW, this "re-running objtool" thing is probably a bigger problem that
> > can be handled more broadly.  When writing an object, we could write a
> > dummy discard section ".discard.objtool_wuz_here" which tells it not to
> > touch it a second time as weird things could happen.
> 
> Section can't work, since we run the first pass on individual
> translations units, so if we get the wuz_here tag from one TU we can't
> tell if we perhaps forgot to run on another.
> 
> Better detecting if there's actual work to do seems safer to me.

I *really* don't like writing an ITU and then later writing it again as
part of bigger a linked object.  It's just going to introduce a lot of
bugs and a lot of individual "did I do this yet?" checks that we'll
forget to do half the time.

If we "perhaps forgot to run on another", and if that's a problem, then
shouldn't it be a warning when we detect it?

What specific scenarios were you thinking about?

> What I actually did yesterday was hack up --noinstr to WARN if there was
> an elf modification done, I could turn that into a --ro flag or
> something, which we can set on vmlinux if it's supposed to be a pure
> validation pass.

That might be useful, --dry-run or so.  Also useful for re-running
objtool with --backtrace to get more details about a warning.

> Subject: objtool: Optimize retpoline alternative generation
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Thu Oct 7 23:15:34 CEST 2021
> 
> When re-running objtool it will generate alternatives for all
> retpoline hunks, even if they are already present.
> 
> Instead of early discarding the retpoline alterantives, hang onto them

s/alterantives/alternatives/

> @@ -1477,6 +1477,17 @@ static int add_special_section_alts(stru
>  				ret = -1;
>  				goto out;
>  			}
> +
> +			/*
> +			 * Don't generate alternative instruction streams
> +			 * (insn->alts) but instead mark the retpoline call as
> +			 * already having an alternative, so that we can avoid
> +			 * generating another instance.
> +			 */

But this also means that branch validation will get skipped on this alt,
right?  Can you mention that here, and why it's not a problem?

> +			if (new_insn->func && arch_is_retpoline(new_insn->func)) {
> +				orig_insn->ignore_alts = true;
> +				continue;
> +			}

-- 
Josh


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation
  2021-10-08 16:39       ` Josh Poimboeuf
@ 2021-10-09 10:42         ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2021-10-09 10:42 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: linux-kernel, mbenes

On Fri, Oct 08, 2021 at 09:39:06AM -0700, Josh Poimboeuf wrote:
> On Fri, Oct 08, 2021 at 12:35:25PM +0200, Peter Zijlstra wrote:

> > I used 'discard' since we don't actually generate insn->alts entries.
> 
> I still don't like 'discard', it sounds like you're removing the
> existing ALTERNATIVE from the file.

OK, I'll go word-smith it more then :-)

> > > BTW, this "re-running objtool" thing is probably a bigger problem that
> > > can be handled more broadly.  When writing an object, we could write a
> > > dummy discard section ".discard.objtool_wuz_here" which tells it not to
> > > touch it a second time as weird things could happen.
> > 
> > Section can't work, since we run the first pass on individual
> > translations units, so if we get the wuz_here tag from one TU we can't
> > tell if we perhaps forgot to run on another.
> > 
> > Better detecting if there's actual work to do seems safer to me.
> 
> I *really* don't like writing an ITU and then later writing it again as
> part of bigger a linked object.  It's just going to introduce a lot of
> bugs and a lot of individual "did I do this yet?" checks that we'll
> forget to do half the time.
> 
> If we "perhaps forgot to run on another", and if that's a problem, then
> shouldn't it be a warning when we detect it?
> 
> What specific scenarios were you thinking about?

The obvious scenario (which I actually triggered when I did this
jump_label rewrite thingy) was that a file that uses jump_label wasn't
ran through objtool and stuff came unstuck.

So if we have this double-pass, then all modifications should be done at
the TU level, while the LD level pass should be validation only.

This R/O validation pass at LD level could detect the missed TU level
processing.

> > What I actually did yesterday was hack up --noinstr to WARN if there was
> > an elf modification done, I could turn that into a --ro flag or
> > something, which we can set on vmlinux if it's supposed to be a pure
> > validation pass.
> 
> That might be useful, --dry-run or so.  Also useful for re-running
> objtool with --backtrace to get more details about a warning.

What I had was a little more crude, it simply is a WARN for every elf.c
function that sets ->changed = true. It actually did do the change, but
it warned about it.

Also, --dry-run, to me, doesn't imply warning about the modifications,
simply not committing them, eg. it would skip elf_write(). (which should
still be easy enough to add).

> > Subject: objtool: Optimize retpoline alternative generation
> > From: Peter Zijlstra <peterz@infradead.org>
> > Date: Thu Oct 7 23:15:34 CEST 2021
> > 
> > When re-running objtool it will generate alternatives for all
> > retpoline hunks, even if they are already present.
> > 
> > Instead of early discarding the retpoline alterantives, hang onto them
> 
> s/alterantives/alternatives/
> 
> > @@ -1477,6 +1477,17 @@ static int add_special_section_alts(stru
> >  				ret = -1;
> >  				goto out;
> >  			}
> > +
> > +			/*
> > +			 * Don't generate alternative instruction streams
> > +			 * (insn->alts) but instead mark the retpoline call as
> > +			 * already having an alternative, so that we can avoid
> > +			 * generating another instance.
> > +			 */
> 
> But this also means that branch validation will get skipped on this alt,
> right?  Can you mention that here, and why it's not a problem?

I can. But I'm >.< close to getting objtool writing
.altinstr_replacement sections which would invalidate this patch.

The only teensy problem is that if objtool creates the
.altinstr_replacement section, it also needs to create a STT_SECTION
symbol for it, which I've also done... *except* STT_SECTION symbols need
to be STB_LOCAL.

And therin lies the rub, ELF wants all STB_LOCAL symbols grouped at the
start of the symbol table (and keeps the number of them in sh_info).
This means that objtool needs to go reshuffle the whole symbol table,
which means updating all references to the symbols :/

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-09 10:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-07 21:22 [PATCH 0/2] objtool: Avoid pointless modifications Peter Zijlstra
2021-10-07 21:22 ` [PATCH 1/2] objtool: Optimize re-writing jump_label Peter Zijlstra
2021-10-08  6:55   ` Josh Poimboeuf
2021-10-08 10:03     ` Peter Zijlstra
2021-10-08 16:28       ` Josh Poimboeuf
2021-10-07 21:22 ` [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation Peter Zijlstra
2021-10-08  7:23   ` Josh Poimboeuf
2021-10-08 10:35     ` Peter Zijlstra
2021-10-08 16:39       ` Josh Poimboeuf
2021-10-09 10:42         ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.