linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL] x86/urgent for v5.11-rc7
@ 2021-02-07 10:40 Borislav Petkov
  2021-02-07 17:49 ` Linus Torvalds
  2021-02-07 18:29 ` pr-tracker-bot
  0 siblings, 2 replies; 26+ messages in thread
From: Borislav Petkov @ 2021-02-07 10:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: x86-ml, lkml

Hi Linus,

I hope this is the last batch of x86/urgent updates for this round.

Pls pull,
thx.

---

The following changes since commit 6ee1d745b7c9fd573fba142a2efdad76a9f1cb04:

  Linux 5.11-rc5 (2021-01-24 16:47:14 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/x86_urgent_for_v5.11_rc7

for you to fetch changes up to 816ef8d7a2c4182e19bc06ab65751cb9e3951e94:

  x86/efi: Remove EFI PGD build time checks (2021-02-06 13:54:14 +0100)

----------------------------------------------------------------
- Remove superfluous EFI PGD range checks which lead to those assertions failing
with certain kernel configs and LLVM.

- Disable setting breakpoints on facilities involved in #DB exception handling
to avoid infinite loops.

- Add extra serialization to non-serializing MSRs (IA32_TSC_DEADLINE and
x2 APIC MSRs) to adhere to SDM's recommendation and avoid any theoretical
issues.

- Re-add the EPB MSR reading on turbostat so that it works on older
kernels which don't have the corresponding EPB sysfs file.

- Add Alder Lake to the list of CPUs which support split lock.

- Fix %dr6 register handling in order to be able to set watchpoints with gdb
again.

- Disable CET instrumentation in the kernel so that gcc doesn't add
ENDBR64 to kernel code and thus confuse tracing.

----------------------------------------------------------------
Borislav Petkov (2):
      tools/power/turbostat: Fallback to an MSR read for EPB
      x86/efi: Remove EFI PGD build time checks

Dave Hansen (1):
      x86/apic: Add extra serialization for non-serializing MSRs

Fenghua Yu (1):
      x86/split_lock: Enable the split lock feature on another Alder Lake CPU

Josh Poimboeuf (1):
      x86/build: Disable CET instrumentation in the kernel

Lai Jiangshan (2):
      x86/debug: Prevent data breakpoints on __per_cpu_offset
      x86/debug: Prevent data breakpoints on cpu_dr7

Peter Zijlstra (1):
      x86/debug: Fix DR6 handling

 Makefile                              |  6 ----
 arch/x86/Makefile                     |  3 ++
 arch/x86/include/asm/apic.h           | 10 ------
 arch/x86/include/asm/barrier.h        | 18 +++++++++++
 arch/x86/kernel/apic/apic.c           |  4 +++
 arch/x86/kernel/apic/x2apic_cluster.c |  6 ++--
 arch/x86/kernel/apic/x2apic_phys.c    |  9 ++++--
 arch/x86/kernel/cpu/intel.c           |  1 +
 arch/x86/kernel/hw_breakpoint.c       | 61 +++++++++++++++++++++++------------
 arch/x86/platform/efi/efi_64.c        | 19 -----------
 tools/power/x86/turbostat/turbostat.c | 10 +++++-
 11 files changed, 85 insertions(+), 62 deletions(-)

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 10:40 [GIT PULL] x86/urgent for v5.11-rc7 Borislav Petkov
@ 2021-02-07 17:49 ` Linus Torvalds
  2021-02-07 17:58   ` Borislav Petkov
  2021-02-07 18:29 ` pr-tracker-bot
  1 sibling, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2021-02-07 17:49 UTC (permalink / raw)
  To: Borislav Petkov, Steven Rostedt; +Cc: x86-ml, lkml

On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <bp@suse.de> wrote:
>
> - Disable CET instrumentation in the kernel so that gcc doesn't add
> ENDBR64 to kernel code and thus confuse tracing.

So this is clearly the right thing to do for now, but I wonder if
people have a plan for actually enabling CET and endbr at cpl0 at some
point?

            Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 17:49 ` Linus Torvalds
@ 2021-02-07 17:58   ` Borislav Petkov
  2021-02-07 18:15     ` Linus Torvalds
  2021-02-07 18:19     ` Dave Hansen
  0 siblings, 2 replies; 26+ messages in thread
From: Borislav Petkov @ 2021-02-07 17:58 UTC (permalink / raw)
  To: Linus Torvalds, Dave Hansen; +Cc: Steven Rostedt, x86-ml, lkml

On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <bp@suse.de> wrote:
> >
> > - Disable CET instrumentation in the kernel so that gcc doesn't add
> > ENDBR64 to kernel code and thus confuse tracing.
> 
> So this is clearly the right thing to do for now, but I wonder if
> people have a plan for actually enabling CET and endbr at cpl0 at some
> point?

It probably is an item on some Intel manager's to-enable list. So far,
the CET enablement concentrates only on userspace but dhansen might know
more about future plans. CCed.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 17:58   ` Borislav Petkov
@ 2021-02-07 18:15     ` Linus Torvalds
  2021-02-07 18:32       ` Dave Hansen
  2021-02-07 22:45       ` Josh Poimboeuf
  2021-02-07 18:19     ` Dave Hansen
  1 sibling, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2021-02-07 18:15 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, Steven Rostedt, x86-ml, lkml, Josh Poimboeuf,
	Alexei Starovoitov

On Sun, Feb 7, 2021 at 9:58 AM Borislav Petkov <bp@suse.de> wrote:
>
> It probably is an item on some Intel manager's to-enable list. So far,
> the CET enablement concentrates only on userspace but dhansen might know
> more about future plans. CCed.

I think the new Ryzen 5000 series also supports CET, but I don't have
any machines to check.

Hopefully somebody ends up with hardware that supports it and a urge
to try to make it work in kernel land too.

I do suspect involved people should start thinking about how they want
to deal with functions starting with

        endbr64
        call __fentry__

instead of the call being at the very top of the function.

I _assume_ it's mostly tracing, bpf and objtool that are going to
notice, and it's going to be largely invisible to anybody else.

So hopefully the involved people can at least just try to see how
their code looks when they turn off retpoline and add

  -fcf-protection=full

to the compiler command line (assuming they have a gcc that can do
it), even if they can't actually test the end result on hardware.

            Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 17:58   ` Borislav Petkov
  2021-02-07 18:15     ` Linus Torvalds
@ 2021-02-07 18:19     ` Dave Hansen
  2021-02-07 18:31       ` Andy Lutomirski
  2021-02-07 20:44       ` Alexei Starovoitov
  1 sibling, 2 replies; 26+ messages in thread
From: Dave Hansen @ 2021-02-07 18:19 UTC (permalink / raw)
  To: Borislav Petkov, Linus Torvalds; +Cc: Steven Rostedt, x86-ml, lkml

On 2/7/21 9:58 AM, Borislav Petkov wrote:
> On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
>> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <bp@suse.de> wrote:
>>> - Disable CET instrumentation in the kernel so that gcc doesn't add
>>> ENDBR64 to kernel code and thus confuse tracing.
>> So this is clearly the right thing to do for now, but I wonder if
>> people have a plan for actually enabling CET and endbr at cpl0 at some
>> point?
> It probably is an item on some Intel manager's to-enable list. So far,
> the CET enablement concentrates only on userspace but dhansen might know
> more about future plans. CCed.

It's definitely on our radar to look at after CET userspace.

The only question for me is whether it will be worth doing with the
exiting kernel entry/exit architecture.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 10:40 [GIT PULL] x86/urgent for v5.11-rc7 Borislav Petkov
  2021-02-07 17:49 ` Linus Torvalds
@ 2021-02-07 18:29 ` pr-tracker-bot
  1 sibling, 0 replies; 26+ messages in thread
From: pr-tracker-bot @ 2021-02-07 18:29 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Linus Torvalds, x86-ml, lkml

The pull request you sent on Sun, 7 Feb 2021 11:40:22 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/x86_urgent_for_v5.11_rc7

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e24f9c5f6e3127a0679d5ba5575a181b80f219c9

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 18:19     ` Dave Hansen
@ 2021-02-07 18:31       ` Andy Lutomirski
  2021-02-08 10:33         ` Peter Zijlstra
  2021-02-07 20:44       ` Alexei Starovoitov
  1 sibling, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2021-02-07 18:31 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Borislav Petkov, Linus Torvalds, Steven Rostedt, x86-ml, lkml


> On Feb 7, 2021, at 10:19 AM, Dave Hansen <dave.hansen@intel.com> wrote:
> 
> On 2/7/21 9:58 AM, Borislav Petkov wrote:
>>> On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
>>> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <bp@suse.de> wrote:
>>>> - Disable CET instrumentation in the kernel so that gcc doesn't add
>>>> ENDBR64 to kernel code and thus confuse tracing.
>>> So this is clearly the right thing to do for now, but I wonder if
>>> people have a plan for actually enabling CET and endbr at cpl0 at some
>>> point?
>> It probably is an item on some Intel manager's to-enable list. So far,
>> the CET enablement concentrates only on userspace but dhansen might know
>> more about future plans. CCed.
> 
> It's definitely on our radar to look at after CET userspace.
> 
> The only question for me is whether it will be worth doing with the
> exiting kernel entry/exit architecture.

I assume you mean: is anyone sufficiently inspired to try to handle NMI correctly?  I have a whole pile of nacks saved up for incorrect implementations, although I will try to wrap them in polite explanations of precisely what is wrong :)

(I’ve contemplated doing this myself, and it doesn’t sound fun at all.)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 18:15     ` Linus Torvalds
@ 2021-02-07 18:32       ` Dave Hansen
  2021-02-07 18:40         ` Linus Torvalds
  2021-02-07 22:45       ` Josh Poimboeuf
  1 sibling, 1 reply; 26+ messages in thread
From: Dave Hansen @ 2021-02-07 18:32 UTC (permalink / raw)
  To: Linus Torvalds, Borislav Petkov
  Cc: Steven Rostedt, x86-ml, lkml, Josh Poimboeuf, Alexei Starovoitov,
	Andy Lutomirski

On 2/7/21 10:15 AM, Linus Torvalds wrote:
> On Sun, Feb 7, 2021 at 9:58 AM Borislav Petkov <bp@suse.de> wrote:
>> It probably is an item on some Intel manager's to-enable list. So far,
>> the CET enablement concentrates only on userspace but dhansen might know
>> more about future plans. CCed.
> I think the new Ryzen 5000 series also supports CET, but I don't have
> any machines to check.

Intel wraps up Shadow Stacks and Indirect Branch Tracking (IBT) under
the CET umbrella, although they can be implemented totally independently.

I actually forget about the IBT half most of the time because the kernel
code to implement userspace support is a much lighter lift than shadow
stacks.

My understanding is that AMD has documented support for Shadow Stacks:

	https://www.amd.com/system/files/TechDocs/24592.pdf

But has not yet released any documentation about IBT.  IBT seems to be
Intel-only, at least in the short term.  There may be more, but the
"Tiger Lake" CPUs are the only ones I know of off the top of my head
that are in the wild:

> https://ark.intel.com/content/www/us/en/ark/products/208661/intel-core-i7-1160g7-processor-12m-cache-up-to-4-40-ghz-with-ipu.html


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 18:32       ` Dave Hansen
@ 2021-02-07 18:40         ` Linus Torvalds
  0 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2021-02-07 18:40 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Borislav Petkov, Steven Rostedt, x86-ml, lkml, Josh Poimboeuf,
	Alexei Starovoitov, Andy Lutomirski

On Sun, Feb 7, 2021 at 10:32 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> My understanding is that AMD has documented support for Shadow Stacks:
>
>         https://www.amd.com/system/files/TechDocs/24592.pdf
>
> But has not yet released any documentation about IBT.  IBT seems to be
> Intel-only, at least in the short term.  There may be more, but the
> "Tiger Lake" CPUs are the only ones I know of off the top of my head
> that are in the wild:

Ahh, ok. I clearly didn't look at the details, just the overview of
"supports CET".

           Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 18:19     ` Dave Hansen
  2021-02-07 18:31       ` Andy Lutomirski
@ 2021-02-07 20:44       ` Alexei Starovoitov
  2021-02-07 22:35         ` Dave Hansen
  1 sibling, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2021-02-07 20:44 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Borislav Petkov, Linus Torvalds, Steven Rostedt, x86-ml, lkml

On Sun, Feb 7, 2021 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 2/7/21 9:58 AM, Borislav Petkov wrote:
> > On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
> >> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <bp@suse.de> wrote:
> >>> - Disable CET instrumentation in the kernel so that gcc doesn't add
> >>> ENDBR64 to kernel code and thus confuse tracing.
> >> So this is clearly the right thing to do for now, but I wonder if
> >> people have a plan for actually enabling CET and endbr at cpl0 at some
> >> point?
> > It probably is an item on some Intel manager's to-enable list. So far,
> > the CET enablement concentrates only on userspace but dhansen might know
> > more about future plans. CCed.
>
> It's definitely on our radar to look at after CET userspace.

What is the desired timeline to enable CET in the kernel ?
I think for bpf and tracing it will be mostly straightforward to deal
with extra endbr64 insn in front of the fentry nop.
Just trying to figure when this work needs to be done.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 20:44       ` Alexei Starovoitov
@ 2021-02-07 22:35         ` Dave Hansen
  2021-02-08 16:11           ` Yu, Yu-cheng
  0 siblings, 1 reply; 26+ messages in thread
From: Dave Hansen @ 2021-02-07 22:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Borislav Petkov, Linus Torvalds, Steven Rostedt, x86-ml, lkml,
	Yu, Yu-cheng

On 2/7/21 12:44 PM, Alexei Starovoitov wrote:
>>> It probably is an item on some Intel manager's to-enable list. So far,
>>> the CET enablement concentrates only on userspace but dhansen might know
>>> more about future plans. CCed.
>> It's definitely on our radar to look at after CET userspace.
> What is the desired timeline to enable CET in the kernel ?
> I think for bpf and tracing it will be mostly straightforward to deal
> with extra endbr64 insn in front of the fentry nop.
> Just trying to figure when this work needs to be done.

Yu-cheng?  Any idea when you're going to start hacking on the in-kernel
IBT bits?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 18:15     ` Linus Torvalds
  2021-02-07 18:32       ` Dave Hansen
@ 2021-02-07 22:45       ` Josh Poimboeuf
  2021-02-08 15:02         ` Steven Rostedt
  1 sibling, 1 reply; 26+ messages in thread
From: Josh Poimboeuf @ 2021-02-07 22:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Borislav Petkov, Dave Hansen, Steven Rostedt, x86-ml, lkml,
	Alexei Starovoitov

On Sun, Feb 07, 2021 at 10:15:49AM -0800, Linus Torvalds wrote:
> On Sun, Feb 7, 2021 at 9:58 AM Borislav Petkov <bp@suse.de> wrote:
> >
> > It probably is an item on some Intel manager's to-enable list. So far,
> > the CET enablement concentrates only on userspace but dhansen might know
> > more about future plans. CCed.
> 
> I think the new Ryzen 5000 series also supports CET, but I don't have
> any machines to check.
> 
> Hopefully somebody ends up with hardware that supports it and a urge
> to try to make it work in kernel land too.
> 
> I do suspect involved people should start thinking about how they want
> to deal with functions starting with
> 
>         endbr64
>         call __fentry__
> 
> instead of the call being at the very top of the function.

FWIW, objtool's already fine with it (otherwise we would have discovered
the need to disable fcf-protection much sooner).

-- 
Josh


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 18:31       ` Andy Lutomirski
@ 2021-02-08 10:33         ` Peter Zijlstra
  0 siblings, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2021-02-08 10:33 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dave Hansen, Borislav Petkov, Linus Torvalds, Steven Rostedt,
	x86-ml, lkml

On Sun, Feb 07, 2021 at 10:31:32AM -0800, Andy Lutomirski wrote:
> 
> > On Feb 7, 2021, at 10:19 AM, Dave Hansen <dave.hansen@intel.com> wrote:
> > 
> > On 2/7/21 9:58 AM, Borislav Petkov wrote:
> >>> On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
> >>> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <bp@suse.de> wrote:
> >>>> - Disable CET instrumentation in the kernel so that gcc doesn't add
> >>>> ENDBR64 to kernel code and thus confuse tracing.
> >>> So this is clearly the right thing to do for now, but I wonder if
> >>> people have a plan for actually enabling CET and endbr at cpl0 at some
> >>> point?
> >> It probably is an item on some Intel manager's to-enable list. So far,
> >> the CET enablement concentrates only on userspace but dhansen might know
> >> more about future plans. CCed.
> > 
> > It's definitely on our radar to look at after CET userspace.
> > 
> > The only question for me is whether it will be worth doing with the
> > exiting kernel entry/exit architecture.
> 
> I assume you mean: is anyone sufficiently inspired to try to handle
> NMI correctly?  I have a whole pile of nacks saved up for incorrect
> implementations, although I will try to wrap them in polite
> explanations of precisely what is wrong :)

Yeah, the IST stack recursion possibilities are 'fun'. But IIRC CET-SS
has far more problems than just NMI. It will also run into all the ROP
tricks we pull for return tracing, CALL emulation and other lovely
things.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 22:45       ` Josh Poimboeuf
@ 2021-02-08 15:02         ` Steven Rostedt
  2021-02-08 15:33           ` Josh Poimboeuf
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2021-02-08 15:02 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Linus Torvalds, Borislav Petkov, Dave Hansen, x86-ml, lkml,
	Alexei Starovoitov

On Sun, 7 Feb 2021 16:45:40 -0600
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> > I do suspect involved people should start thinking about how they want
> > to deal with functions starting with
> > 
> >         endbr64
> >         call __fentry__
> > 
> > instead of the call being at the very top of the function.  
> 
> FWIW, objtool's already fine with it (otherwise we would have discovered
> the need to disable fcf-protection much sooner).

And this doesn't really affect tracing (note, another user that might be
affected is live kernel patching). The way this change was noticed, was
that there was a report of someone that was be able to connect a bpf
program to a function for one machine but not for another machine. The
other machine had this CET thingy.

The difference is, when you attach a probe to the start of a function,
kprobes will check if the probe address (start of function) is located at a
ftrace location (nop / __fentry__) and if it is, it would use the ftrace
infrastructure instead of attaching an int3 breakpoint. Because of the
enbr64 being at the start of the function, the check returned false (it was
not a ftrace location) and it attached an int3 breakpoint instead.

This uncovered another "bug". Peter Zijlstra made int3 handlers look like
NMIs (in_nmi() would return true in an int3 handler). The BPF programs would
not run in NMI context. But nobody noticed, because people usually attach
BPF programs to the start of a function using kprobes, and since kprobes
would use ftrace handlers (that don't set in_nmi() to true), everything
worked. But when the "endbr64" was added at the start of the program,
kprobes fell back to int3, and suddenly the BPF programs stopped working.

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-08 15:02         ` Steven Rostedt
@ 2021-02-08 15:33           ` Josh Poimboeuf
  2021-02-08 15:47             ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Josh Poimboeuf @ 2021-02-08 15:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Borislav Petkov, Dave Hansen, x86-ml, lkml,
	Alexei Starovoitov, live-patching

On Mon, Feb 08, 2021 at 10:02:06AM -0500, Steven Rostedt wrote:
> On Sun, 7 Feb 2021 16:45:40 -0600
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > > I do suspect involved people should start thinking about how they want
> > > to deal with functions starting with
> > > 
> > >         endbr64
> > >         call __fentry__
> > > 
> > > instead of the call being at the very top of the function.  
> > 
> > FWIW, objtool's already fine with it (otherwise we would have discovered
> > the need to disable fcf-protection much sooner).
> 
> And this doesn't really affect tracing (note, another user that might be
> affected is live kernel patching).

Good point, livepatch is indeed affected.  Is there a better way to get
the "call __fentry__" address for a given function?


/*
 * Convert a function address into the appropriate ftrace location.
 *
 * Usually this is just the address of the function, but on some architectures
 * it's more complicated so allow them to provide a custom behaviour.
 */
#ifndef klp_get_ftrace_location
static unsigned long klp_get_ftrace_location(unsigned long faddr)
{
	return faddr;
}
#endif

-- 
Josh


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-08 15:33           ` Josh Poimboeuf
@ 2021-02-08 15:47             ` Peter Zijlstra
  2021-02-08 16:15               ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2021-02-08 15:47 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Steven Rostedt, Linus Torvalds, Borislav Petkov, Dave Hansen,
	x86-ml, lkml, Alexei Starovoitov, live-patching

On Mon, Feb 08, 2021 at 09:33:00AM -0600, Josh Poimboeuf wrote:
> On Mon, Feb 08, 2021 at 10:02:06AM -0500, Steven Rostedt wrote:
> > On Sun, 7 Feb 2021 16:45:40 -0600
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > 
> > > > I do suspect involved people should start thinking about how they want
> > > > to deal with functions starting with
> > > > 
> > > >         endbr64
> > > >         call __fentry__
> > > > 
> > > > instead of the call being at the very top of the function.  
> > > 
> > > FWIW, objtool's already fine with it (otherwise we would have discovered
> > > the need to disable fcf-protection much sooner).
> > 
> > And this doesn't really affect tracing (note, another user that might be
> > affected is live kernel patching).
> 
> Good point, livepatch is indeed affected.  Is there a better way to get
> the "call __fentry__" address for a given function?
> 
> 
> /*
>  * Convert a function address into the appropriate ftrace location.
>  *
>  * Usually this is just the address of the function, but on some architectures
>  * it's more complicated so allow them to provide a custom behaviour.
>  */
> #ifndef klp_get_ftrace_location
> static unsigned long klp_get_ftrace_location(unsigned long faddr)
> {
> 	return faddr;
> }
> #endif

I suppose the trivial fix is to see if it points to endbr64 and if so,
increment the addr by the length of that.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-07 22:35         ` Dave Hansen
@ 2021-02-08 16:11           ` Yu, Yu-cheng
  0 siblings, 0 replies; 26+ messages in thread
From: Yu, Yu-cheng @ 2021-02-08 16:11 UTC (permalink / raw)
  To: Dave Hansen, Alexei Starovoitov
  Cc: Borislav Petkov, Linus Torvalds, Steven Rostedt, x86-ml, lkml

On 2/7/2021 2:35 PM, Dave Hansen wrote:
> On 2/7/21 12:44 PM, Alexei Starovoitov wrote:
>>>> It probably is an item on some Intel manager's to-enable list. So far,
>>>> the CET enablement concentrates only on userspace but dhansen might know
>>>> more about future plans. CCed.
>>> It's definitely on our radar to look at after CET userspace.
>> What is the desired timeline to enable CET in the kernel ?
>> I think for bpf and tracing it will be mostly straightforward to deal
>> with extra endbr64 insn in front of the fentry nop.
>> Just trying to figure when this work needs to be done.
> 
> Yu-cheng?  Any idea when you're going to start hacking on the in-kernel
> IBT bits?
> 

I have some kernel-mode enabling patches that I will soon send 
internally for comments.  My estimate is probably before the summer, I 
can send those to the mailing list.

--
Yu-cheng

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-08 15:47             ` Peter Zijlstra
@ 2021-02-08 16:15               ` Steven Rostedt
  2021-02-09  8:32                 ` Miroslav Benes
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2021-02-08 16:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Linus Torvalds, Borislav Petkov, Dave Hansen,
	x86-ml, lkml, Alexei Starovoitov, live-patching

On Mon, 8 Feb 2021 16:47:05 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> > /*
> >  * Convert a function address into the appropriate ftrace location.
> >  *
> >  * Usually this is just the address of the function, but on some architectures
> >  * it's more complicated so allow them to provide a custom behaviour.
> >  */
> > #ifndef klp_get_ftrace_location
> > static unsigned long klp_get_ftrace_location(unsigned long faddr)
> > {
> > 	return faddr;
> > }
> > #endif  
> 
> I suppose the trivial fix is to see if it points to endbr64 and if so,
> increment the addr by the length of that.

I thought of that too. But one thing that may be possible, is to use
kallsym. I believe you can get the range of a function (start and end of
the function) from kallsyms. Then ask ftrace for the addr in that range
(there should only be one).

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-08 16:15               ` Steven Rostedt
@ 2021-02-09  8:32                 ` Miroslav Benes
  2021-02-09 14:49                   ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Miroslav Benes @ 2021-02-09  8:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Josh Poimboeuf, Linus Torvalds, Borislav Petkov,
	Dave Hansen, x86-ml, lkml, Alexei Starovoitov, live-patching

On Mon, 8 Feb 2021, Steven Rostedt wrote:

> On Mon, 8 Feb 2021 16:47:05 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > /*
> > >  * Convert a function address into the appropriate ftrace location.
> > >  *
> > >  * Usually this is just the address of the function, but on some architectures
> > >  * it's more complicated so allow them to provide a custom behaviour.
> > >  */
> > > #ifndef klp_get_ftrace_location
> > > static unsigned long klp_get_ftrace_location(unsigned long faddr)
> > > {
> > > 	return faddr;
> > > }
> > > #endif  

powerpc has this

static inline unsigned long klp_get_ftrace_location(unsigned long faddr)                                               
{                                                                                                                      
        /*                                                                                                             
         * Live patch works only with -mprofile-kernel on PPC. In this case,                                           
         * the ftrace location is always within the first 16 bytes.                                                    
         */                                                                                                            
        return ftrace_location_range(faddr, faddr + 16);                                                               
}                                                                                                                      

> > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > increment the addr by the length of that.
> 
> I thought of that too. But one thing that may be possible, is to use
> kallsym. I believe you can get the range of a function (start and end of
> the function) from kallsyms. Then ask ftrace for the addr in that range
> (there should only be one).

And we can do this if a hard-coded value live above is not welcome. If I 
remember correctly, we used to have exactly this in the old versions of 
kGraft. We walked through all ftrace records, called 
kallsyms_lookup_size_offset() on every record's ip and if the offset+ip 
matched faddr (in this case), we returned the ip.

Miroslav

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09  8:32                 ` Miroslav Benes
@ 2021-02-09 14:49                   ` Steven Rostedt
  2021-02-09 15:16                     ` Miroslav Benes
  2021-02-09 16:45                     ` Alexei Starovoitov
  0 siblings, 2 replies; 26+ messages in thread
From: Steven Rostedt @ 2021-02-09 14:49 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Peter Zijlstra, Josh Poimboeuf, Linus Torvalds, Borislav Petkov,
	Dave Hansen, x86-ml, lkml, Alexei Starovoitov, live-patching

On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
Miroslav Benes <mbenes@suse.cz> wrote:

> powerpc has this
> 
> static inline unsigned long klp_get_ftrace_location(unsigned long faddr)                                               
> {                                                                                                                      
>         /*                                                                                                             
>          * Live patch works only with -mprofile-kernel on PPC. In this case,                                           
>          * the ftrace location is always within the first 16 bytes.                                                    
>          */                                                                                                            
>         return ftrace_location_range(faddr, faddr + 16);                                                               
> }                                                                                                                      
> 
> > > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > > increment the addr by the length of that.  
> > 
> > I thought of that too. But one thing that may be possible, is to use
> > kallsym. I believe you can get the range of a function (start and end of
> > the function) from kallsyms. Then ask ftrace for the addr in that range
> > (there should only be one).  
> 
> And we can do this if a hard-coded value live above is not welcome. If I 
> remember correctly, we used to have exactly this in the old versions of 
> kGraft. We walked through all ftrace records, called 
> kallsyms_lookup_size_offset() on every record's ip and if the offset+ip 
> matched faddr (in this case), we returned the ip.

Either way is fine. Question is, should we just wait till CET is
implemented for the kernel before making any of these changes? Just knowing
that we have a solution to handle it may be good enough for now.

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09 14:49                   ` Steven Rostedt
@ 2021-02-09 15:16                     ` Miroslav Benes
  2021-02-09 16:45                     ` Alexei Starovoitov
  1 sibling, 0 replies; 26+ messages in thread
From: Miroslav Benes @ 2021-02-09 15:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Josh Poimboeuf, Linus Torvalds, Borislav Petkov,
	Dave Hansen, x86-ml, lkml, Alexei Starovoitov, live-patching

On Tue, 9 Feb 2021, Steven Rostedt wrote:

> On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
> Miroslav Benes <mbenes@suse.cz> wrote:
> 
> > powerpc has this
> > 
> > static inline unsigned long klp_get_ftrace_location(unsigned long faddr)                                               
> > {                                                                                                                      
> >         /*                                                                                                             
> >          * Live patch works only with -mprofile-kernel on PPC. In this case,                                           
> >          * the ftrace location is always within the first 16 bytes.                                                    
> >          */                                                                                                            
> >         return ftrace_location_range(faddr, faddr + 16);                                                               
> > }                                                                                                                      
> > 
> > > > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > > > increment the addr by the length of that.  
> > > 
> > > I thought of that too. But one thing that may be possible, is to use
> > > kallsym. I believe you can get the range of a function (start and end of
> > > the function) from kallsyms. Then ask ftrace for the addr in that range
> > > (there should only be one).  
> > 
> > And we can do this if a hard-coded value live above is not welcome. If I 
> > remember correctly, we used to have exactly this in the old versions of 
> > kGraft. We walked through all ftrace records, called 
> > kallsyms_lookup_size_offset() on every record's ip and if the offset+ip 
> > matched faddr (in this case), we returned the ip.
> 
> Either way is fine. Question is, should we just wait till CET is
> implemented for the kernel before making any of these changes? Just knowing
> that we have a solution to handle it may be good enough for now.

I'd prefer it to be a part of CET enablement patch set.

Miroslav

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09 14:49                   ` Steven Rostedt
  2021-02-09 15:16                     ` Miroslav Benes
@ 2021-02-09 16:45                     ` Alexei Starovoitov
  2021-02-09 16:55                       ` Andy Lutomirski
  1 sibling, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2021-02-09 16:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Miroslav Benes, Peter Zijlstra, Josh Poimboeuf, Linus Torvalds,
	Borislav Petkov, Dave Hansen, x86-ml, lkml, Alexei Starovoitov,
	live-patching

On Tue, Feb 9, 2021 at 6:49 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
> Miroslav Benes <mbenes@suse.cz> wrote:
>
> > powerpc has this
> >
> > static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
> > {
> >         /*
> >          * Live patch works only with -mprofile-kernel on PPC. In this case,
> >          * the ftrace location is always within the first 16 bytes.
> >          */
> >         return ftrace_location_range(faddr, faddr + 16);
> > }
> >
> > > > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > > > increment the addr by the length of that.
> > >
> > > I thought of that too. But one thing that may be possible, is to use
> > > kallsym. I believe you can get the range of a function (start and end of
> > > the function) from kallsyms. Then ask ftrace for the addr in that range
> > > (there should only be one).
> >
> > And we can do this if a hard-coded value live above is not welcome. If I
> > remember correctly, we used to have exactly this in the old versions of
> > kGraft. We walked through all ftrace records, called
> > kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
> > matched faddr (in this case), we returned the ip.
>
> Either way is fine. Question is, should we just wait till CET is
> implemented for the kernel before making any of these changes? Just knowing
> that we have a solution to handle it may be good enough for now.

I think the issue is more fundamental than what appears on the surface.
According to endbr64 documentation it's not just any instruction.
The cpu will wait for it and if it's replaced with int3 or not seen at
the branch target the cpu will throw an exception.
If I understood the doc correctly it means that endbr64 can never be
replaced with a breakpoint. If that's the case text_poke_bp and kprobe
need to do extra safety checks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09 16:45                     ` Alexei Starovoitov
@ 2021-02-09 16:55                       ` Andy Lutomirski
  2021-02-09 18:09                         ` Linus Torvalds
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2021-02-09 16:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Steven Rostedt, Miroslav Benes, Peter Zijlstra, Josh Poimboeuf,
	Linus Torvalds, Borislav Petkov, Dave Hansen, x86-ml, lkml,
	Alexei Starovoitov, live-patching



> On Feb 9, 2021, at 8:45 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Tue, Feb 9, 2021 at 6:49 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>> 
>>> On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
>>> Miroslav Benes <mbenes@suse.cz> wrote:
>>> 
>>> powerpc has this
>>> 
>>> static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
>>> {
>>>        /*
>>>         * Live patch works only with -mprofile-kernel on PPC. In this case,
>>>         * the ftrace location is always within the first 16 bytes.
>>>         */
>>>        return ftrace_location_range(faddr, faddr + 16);
>>> }
>>> 
>>>>> I suppose the trivial fix is to see if it points to endbr64 and if so,
>>>>> increment the addr by the length of that.
>>>> 
>>>> I thought of that too. But one thing that may be possible, is to use
>>>> kallsym. I believe you can get the range of a function (start and end of
>>>> the function) from kallsyms. Then ask ftrace for the addr in that range
>>>> (there should only be one).
>>> 
>>> And we can do this if a hard-coded value live above is not welcome. If I
>>> remember correctly, we used to have exactly this in the old versions of
>>> kGraft. We walked through all ftrace records, called
>>> kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
>>> matched faddr (in this case), we returned the ip.
>> 
>> Either way is fine. Question is, should we just wait till CET is
>> implemented for the kernel before making any of these changes? Just knowing
>> that we have a solution to handle it may be good enough for now.
> 
> I think the issue is more fundamental than what appears on the surface.
> According to endbr64 documentation it's not just any instruction.
> The cpu will wait for it and if it's replaced with int3 or not seen at
> the branch target the cpu will throw an exception.
> If I understood the doc correctly it means that endbr64 can never be
> replaced with a breakpoint. If that's the case text_poke_bp and kprobe
> need to do extra safety checks.

Ugh.

Or we hack up #CP to handle this case. I don’t quite know how I feel about this.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09 16:55                       ` Andy Lutomirski
@ 2021-02-09 18:09                         ` Linus Torvalds
  2021-02-09 18:26                           ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2021-02-09 18:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Alexei Starovoitov, Steven Rostedt, Miroslav Benes,
	Peter Zijlstra, Josh Poimboeuf, Borislav Petkov, Dave Hansen,
	x86-ml, lkml, Alexei Starovoitov, live-patching

On Tue, Feb 9, 2021 at 8:55 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
> Or we hack up #CP to handle this case. I don’t quite know how I feel about this.

I think that's the sane model - if we've replaced the instruction with
'int3', and we end up getting #CP due to that, just do the #BP
handling.

Anything else would just be insanely complicated, I feel.

             Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09 18:09                         ` Linus Torvalds
@ 2021-02-09 18:26                           ` Andy Lutomirski
  2021-02-09 18:39                             ` Linus Torvalds
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2021-02-09 18:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexei Starovoitov, Steven Rostedt, Miroslav Benes,
	Peter Zijlstra, Josh Poimboeuf, Borislav Petkov, Dave Hansen,
	x86-ml, lkml, Alexei Starovoitov, live-patching



> On Feb 9, 2021, at 10:09 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> On Tue, Feb 9, 2021 at 8:55 AM Andy Lutomirski <luto@amacapital.net> wrote:
>> 
>> Or we hack up #CP to handle this case. I don’t quite know how I feel about this.
> 
> I think that's the sane model - if we've replaced the instruction with
> 'int3', and we end up getting #CP due to that, just do the #BP
> handling.
> 
> Anything else would just be insanely complicated, I feel.

The other model is “don’t do that then.”

I suppose a nice property of patching ENDBR to INT3 is that, not only is it atomic, but ENDBR is sort of a NOP, so we don’t need to replace the ENDBR with anything.

> 
>             Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [GIT PULL] x86/urgent for v5.11-rc7
  2021-02-09 18:26                           ` Andy Lutomirski
@ 2021-02-09 18:39                             ` Linus Torvalds
  0 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2021-02-09 18:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Alexei Starovoitov, Steven Rostedt, Miroslav Benes,
	Peter Zijlstra, Josh Poimboeuf, Borislav Petkov, Dave Hansen,
	x86-ml, lkml, Alexei Starovoitov, live-patching

On Tue, Feb 9, 2021 at 10:26 AM Andy Lutomirski <luto@amacapital.net> wrote:
> >
> > Anything else would just be insanely complicated, I feel.
>
> The other model is “don’t do that then.”

Hmm. I guess all the code that does int3 patching could just be taught
to always go to the next instruction instead.

I don't think advancing the rewriting is an option for the asm
alternative() logic or the static call infrastructure, but those
should never be about endbr anyway, so presumably that's not an issue.

So if it ends up being _only_ about kprobes, then the "don't do that
then" might work fine.

                 Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2021-02-09 20:56 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-07 10:40 [GIT PULL] x86/urgent for v5.11-rc7 Borislav Petkov
2021-02-07 17:49 ` Linus Torvalds
2021-02-07 17:58   ` Borislav Petkov
2021-02-07 18:15     ` Linus Torvalds
2021-02-07 18:32       ` Dave Hansen
2021-02-07 18:40         ` Linus Torvalds
2021-02-07 22:45       ` Josh Poimboeuf
2021-02-08 15:02         ` Steven Rostedt
2021-02-08 15:33           ` Josh Poimboeuf
2021-02-08 15:47             ` Peter Zijlstra
2021-02-08 16:15               ` Steven Rostedt
2021-02-09  8:32                 ` Miroslav Benes
2021-02-09 14:49                   ` Steven Rostedt
2021-02-09 15:16                     ` Miroslav Benes
2021-02-09 16:45                     ` Alexei Starovoitov
2021-02-09 16:55                       ` Andy Lutomirski
2021-02-09 18:09                         ` Linus Torvalds
2021-02-09 18:26                           ` Andy Lutomirski
2021-02-09 18:39                             ` Linus Torvalds
2021-02-07 18:19     ` Dave Hansen
2021-02-07 18:31       ` Andy Lutomirski
2021-02-08 10:33         ` Peter Zijlstra
2021-02-07 20:44       ` Alexei Starovoitov
2021-02-07 22:35         ` Dave Hansen
2021-02-08 16:11           ` Yu, Yu-cheng
2021-02-07 18:29 ` pr-tracker-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).