From: Xu Kuohai <xukuohai@huawei.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>, <bpf@vger.kernel.org>,
	Florent Revest <revest@chromium.org>,
	Will Deacon <will@kernel.org>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Zi Shen Lim <zlim.lnx@gmail.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Ard Biesheuvel <ardb@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Guo Ren <guoren@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>
Subject: Re: [PATCH bpf-next v2 0/4] Add ftrace direct call for arm64
Date: Thu, 6 Oct 2022 18:09:05 +0800
Message-ID: <ee451276-1101-0b83-9840-9c506986f91d@huawei.com>
In-Reply-To: <YzR5WSLux4mmFIXg@FVFF77S0Q05N>

On 9/29/2022 12:42 AM, Mark Rutland wrote:
> On Tue, Sep 27, 2022 at 12:49:58PM +0800, Xu Kuohai wrote:
>> On 9/27/2022 1:43 AM, Mark Rutland wrote:
>>> On Mon, Sep 26, 2022 at 03:40:20PM +0100, Catalin Marinas wrote:
>>>> On Thu, Sep 22, 2022 at 08:01:16PM +0200, Daniel Borkmann wrote:
>>>>> On 9/13/22 6:27 PM, Xu Kuohai wrote:
>>>>>> This series adds ftrace direct call support for arm64, which is required to
>>>>>> attach bpf trampolines to fentry.
>>>>>>
>>>>>> Although there is no agreement on how to support ftrace direct call on arm64,
>>>>>> no patch has been posted except the one I posted in [1], so this series
>>>>>> continues the work of [1] with the addition of long jump support. Now ftrace
>>>>>> direct call works regardless of the distance between the callsite and the
>>>>>> custom trampoline.
>>>>>>
>>>>>> [1] https://lore.kernel.org/bpf/20220518131638.3401509-2-xukuohai@huawei.com/
>>>>>>
>>>>>> v2:
>>>>>> - Fix compile and runtime errors caused by ftrace_rec_arch_init
>>>>>>
>>>>>> v1: https://lore.kernel.org/bpf/20220913063146.74750-1-xukuohai@huaweicloud.com/
>>>>>>
>>>>>> Xu Kuohai (4):
>>>>>>      ftrace: Allow users to disable ftrace direct call
>>>>>>      arm64: ftrace: Support long jump for ftrace direct call
>>>>>>      arm64: ftrace: Add ftrace direct call support
>>>>>>      ftrace: Fix dead loop caused by direct call in ftrace selftest
>>>>>
>>>>> Given there's just a tiny fraction touching BPF JIT and most are around core arm64,
>>>>> it probably makes sense that this series goes via Catalin/Will through arm64 tree
>>>>> instead of bpf-next if it looks good to them. Catalin/Will, thoughts (Ack + bpf-next
>>>>> could work too, but I'd presume this just results in merge conflicts)?
>>>>
>>>> I think it makes sense for the series to go via the arm64 tree but I'd
>>>> like Mark to have a look at the ftrace changes first.
>>>
>>> From a quick scan, I still don't think this is quite right, and as it stands I
>>> believe this will break backtracing (as the instructions before the function
>>> entry point will not be symbolized correctly, getting in the way of
>>> RELIABLE_STACKTRACE). I think I was insufficiently clear with my earlier
>>> feedback there, as I have a mechanism in mind that was a little simpler.
>>
>> Thanks for the review. I have some thoughts about reliable stacktrace.
>>
>> If PC is not in the range of literal_call, stacktrace works as before without
>> changes.
>>
>> If PC is in the range of literal_call, for example, interrupted by an
>> irq, I think there are 2 problems:
>>
>> 1. Caller LR is not pushed to the stack yet, so caller's address and name
>>     will be missing from the backtrace.
>>
>> 2. Since PC is not in func's address range, no symbol name will be found, so
>>     func name is also missing.
>>
>> Problem 1 is not introduced by this patchset, but its probability of occurring
>> may be increased by it. I think this problem should be addressed by
>> a reliable stacktrace scheme, such as ORC on x86.
> 
> I agree problem 1 is not introduced by this patch set; I have plans for how to
> address that for reliable stacktrace based on identifying the ftrace
> trampoline. This is one of the reasons I do not want direct calls, as
> identifying all direct call trampolines is going to be very painful and slow,
> whereas identifying a statically allocated ftrace trampoline is far simpler.
> 
>> Problem 2 is indeed introduced by this patchset. I think there are at least 3
>> ways to deal with it:
> 
> What I would like to do here, as mentioned previously in other threads, is to
> avoid direct calls, and implement "FTRACE_WITH_OPS", where we can associate
> each patch-site with a specific set of ops, and invoke that directly from the
> regular ftrace trampoline.
> 
> With that, the patch site would look like:
> 
> 	pre_func_literal:
> 		NOP		// Patched to a pointer to
> 		NOP		// ftrace_ops
> 	func:
> 		< optional BTI here >
> 		NOP		// Patched to MOV X9, LR
> 		NOP		// Patched to a BL to the ftrace trampoline
> 
> ... then in the ftrace trampoline we can recover the ops pointer at a negative
> offset from the LR based on the LR, and invoke the ops from there (passing a
> struct ftrace_regs with the saved regs).
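
For reference, here is a rough sketch of how the trampoline entry could
recover the ops pointer under this scheme. It assumes no BTI, so the BL sits
at func + 4 and the literal at func - 8, i.e. at LR - 16; the register choices
and the OPS_FUNC offset name are illustrative, not taken from your branch:

         ftrace_caller:
                 LDR  X11, [X30, #-16]     // ftrace_ops pointer from the pre_func literal
                 // ... save the AAPCS argument registers, X9/X30 and SP into a
                 // struct ftrace_regs frame on the stack ...
                 SUB  X0, X30, #8          // ip: address of func
                 MOV  X1, X9               // parent_ip: caller's LR, saved by MOV X9, LR
                 MOV  X2, X11              // ftrace_ops attached to this callsite
                 MOV  X3, SP               // struct ftrace_regs
                 LDR  X4, [X11, #OPS_FUNC] // ops->func
                 BLR  X4
                 // ... restore the saved registers and return into func ...

If that matches your intent, it does avoid the ops list walk without needing
any per-callsite trampoline.
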
> 
> That way the patch-site is less significantly affected, and there's no impact
> to backtracing. That gets most of the benefit of direct calls (avoiding the
> ftrace ops list traversal) without having to do anything special at all. That
> should be much easier to maintain, too.
> 
> I started implementing that before LPC (and you can find some branches on my
> kernel.org repo), but I haven't yet had the time to rebase those and sort out
> the remaining issues:
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace/per-callsite-ops
> 
> Note that as a prerequisite for that I also want to reduce the set of registers
> we save/restore down to the set required by our calling convention, as the
> existing pt_regs is both large and generally unsound (since we cannot and do
> not fill in many of the fields we only acquire at an exception boundary).
> That'll further reduce the ftrace overhead generally, and remove the need for
> the two trampolines we currently have. I have a WIP at:
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace/minimal-regs
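
Out of curiosity, would the reduced save/restore look roughly like the below?
Just a sketch, assuming we only keep the AAPCS argument/result registers plus
what the unwinder and the return path need; the frame layout is made up:

         ftrace_caller:
                 STP  X29, X30, [SP, #-96]!   // minimal frame instead of a full pt_regs
                 MOV  X29, SP
                 STP  X0, X1, [SP, #16]       // AAPCS argument/result registers
                 STP  X2, X3, [SP, #32]
                 STP  X4, X5, [SP, #48]
                 STP  X6, X7, [SP, #64]
                 STP  X8, X9, [SP, #80]       // X8 plus the traced function's LR in X9
                 // ... call the ops, then reload the pairs above and return ...
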
> 
> I intend to get back to both of those shortly (along with some related bits for
> kretprobes and stacktracing); I just haven't had much time recently due to
> other work and illness.
> 
>> 1. Add a symbol name for literal_call.
> 
> That'll require a number of invasive changes to make RELIABLE_STACKTRACE work,
> so I don't think we want to do that.
> 
>> 2. Hack the backtrace routine, if no symbol name found for a PC during backtrace,
>>     we can check if the PC is in literal_call, then adjust PC and try again.
> 
> The problem is that the existing symbolization code doesn't know the length of
> the prior symbol, so it will find *some* symbol associated with the previous
> function rather than finding no symbol.
> 
> To bodge around this we'd need to special-case each patchable-function-entry
> site in symbolization, which is going to be painful and slow down unwinding
> unless we try to fix this up at boot-time or compile time.
> 
>> 3. Move literal_call to the func's address range, for example:
>>
>>          a. Compile with -fpatchable-function-entry=7
>>          func:
>>                  BTI C
>>                  NOP
>>                  NOP
>>                  NOP
>>                  NOP
>>                  NOP
>>                  NOP
>>                  NOP
> 
> This is a non-starter. We are not going to add 7 NOPs at the start of every
> function.
> 

Looks like we could get away with adding just 3 NOPs at the function entry, like this (a short sketch of dummy_tramp follows case 3 below):

1. At startup, or when nothing is attached, patch the callsite to:

         literal:
                 .quad dummy_tramp
         func:
                 BTI C
                 MOV X9, LR
                 NOP
                 NOP
                 ...

2. When the target is in range, patch the callsite to:

         literal:
                 .quad dummy_tramp
         func:
                 BTI C
                 MOV X9, LR
                 NOP
                 BL custom_trampoline
                 ...


3. When the target is out of range, patch the callsite to:

         literal:
                 .quad custom_trampoline
         func:
                 BTI C
                 MOV X9, LR
                 LDR X16, literal
                 BLR X16
                 ...


> Thanks,
> Mark.
> 
> .


