From: Mark Rutland <mark.rutland@arm.com>
To: Xu Kuohai <xukuohai@huawei.com>
Cc: bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-kselftest@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Zi Shen Lim <zlim.lnx@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	David Ahern <dsahern@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, hpa@zytor.com, Shuah Khan <shuah@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Daniel Kiss <daniel.kiss@arm.com>,
	Steven Price <steven.price@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Peter Collingbourne <pcc@google.com>,
	Mark Brown <broonie@kernel.org>, Delyan Kratunov <delyank@fb.com>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Wang ShaoBo <bobo.shaobowang@huawei.com>,
	cj.chengjian@huawei.com, huawei.libin@huawei.com,
	xiexiuqi@huawei.com, liwei391@huawei.com
Subject: Re: [PATCH bpf-next v5 1/6] arm64: ftrace: Add ftrace direct call support
Date: Thu, 26 May 2022 11:06:09 +0100
Message-ID: <Yo9REdx3nsgbZunE@FVFF77S0Q05N>
In-Reply-To: <0f8fe661-c450-ccd8-761f-dbfff449c533@huawei.com>

On Thu, May 26, 2022 at 05:45:03PM +0800, Xu Kuohai wrote:
> On 5/25/2022 9:38 PM, Mark Rutland wrote:
> > On Wed, May 18, 2022 at 09:16:33AM -0400, Xu Kuohai wrote:
> >> Add ftrace direct support for arm64.
> >>
> >> 1. When there is custom trampoline only, replace the fentry nop to a
> >>    jump instruction that jumps directly to the custom trampoline.
> >>
> >> 2. When ftrace trampoline and custom trampoline coexist, jump from
> >>    fentry to ftrace trampoline first, then jump to custom trampoline
> >>    when ftrace trampoline exits. The current unused register
> >>    pt_regs->orig_x0 is used as an intermediary for jumping from ftrace
> >>    trampoline to custom trampoline.
> > 
> > For those of us not all that familiar with BPF, can you explain *why* you want
> > this? The above explains what the patch implements, but not why that's useful.
> > 
> > e.g. is this just to avoid the overhead of the ops list processing in the
> > regular ftrace code, or is the custom trampoline there to allow you to do
> > something special?
> 
> IIUC, ftrace direct call was designed to *remove* the unnecessary
> overhead of saving regs completely [1][2].

Ok. My plan is to get rid of most of the register saving generally, so I think
that aspect can be solved without direct calls.

> [1]
> https://lore.kernel.org/all/20191022175052.frjzlnjjfwwfov64@ast-mbp.dhcp.thefacebook.com/
> [2] https://lore.kernel.org/all/20191108212834.594904349@goodmis.org/
> 
> This patch itself is just a variant of [3].
> 
> [3] https://lore.kernel.org/all/20191108213450.891579507@goodmis.org/
> 
> > 
> > There is another patch series on the list from some of your colleagues which
> > uses dynamic trampolines to try to avoid that ops list overhead, and it's not
> > clear to me whether these are trying to solve the largely same problem or
> > something different. That other thread is at:
> > 
> >   https://lore.kernel.org/linux-arm-kernel/20220316100132.244849-1-bobo.shaobowang@huawei.com/
> > 
> > ... and I've added the relevant parties to CC here, since there doesn't seem to
> > be any overlap in the CC lists of the two threads.
> 
> We're not working to solve the same problem. The trampoline introduced
> in this series helps us to monitor kernel function or another bpf prog
> with bpf, and also helps us to use bpf prog like a normal kernel
> function pointer.

Ok, but why is it necessary to have a special trampoline?

Is that *just* to avoid overhead, or do you need to do something special that
the regular trampoline won't do?

> > 
> > In that other thread I've suggested a general approach we could follow at:
> >   
> >   https://lore.kernel.org/linux-arm-kernel/YmGF%2FOpIhAF8YeVq@lakrids/
> >
> 
> Is it possible for a kernel function to take a long jump to common
> trampoline when we get a huge kernel image?

It is possible, but only where the kernel Image itself is massive and the .text
section exceeds 128MiB, at which point other things break anyway. Practically
speaking, this doesn't happen for production kernels, or reasonable test
kernels.

I've been meaning to add some logic to detect this at boot time (or at build
time) and disable ftrace, since live patching would also be broken in that
case.
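
For reference, the 128MiB figure comes from the BL encoding: a 26-bit signed
immediate scaled by 4 bytes, so one BL reaches [-128MiB, +128MiB - 4] from the
call-site. A minimal sketch of that range check follows, assuming plain
user-space C; bl_in_range() and SZ_128M here are illustrative names for this
sketch, not the is_long_jump() helper from patch 5:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constant: reach of an AArch64 BL in either direction. */
#define SZ_128M (128LL * 1024 * 1024)

/*
 * Hypothetical helper: returns true if a single BL placed at 'pc' can
 * branch to 'target'. BL encodes a 26-bit signed word offset, so the
 * byte offset must be 4-byte aligned and lie in [-128MiB, +128MiB).
 */
static bool bl_in_range(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	if (offset & 3)
		return false;

	return offset >= -SZ_128M && offset < SZ_128M;
}
```

A call-site for which this returns false would need a PLT or veneer, or has to
be rejected, which is the inconsistency discussed below.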

> > As noted in that thread, I have a few concerns which equally apply here:
> > 
> > * Due to the limited range of BL instructions, it's not always possible to
> >   patch an ftrace call-site to branch to an arbitrary trampoline. The way this
> >   works for ftrace today relies upon knowing the set of trampolines at
> >   compile-time, and allocating module PLTs for those, and that approach cannot
> >   work reliably for dynamically allocated trampolines.
> 
> Currently patch 5 returns -ENOTSUPP when long jump is detected, so no
> bpf trampoline is constructed for out of range patch-site:
> 
> if (is_long_jump(orig_call, image))
> 	return -ENOTSUPP;

Sure, my point is that in practice that means that (from the user's PoV) this
may randomly fail to work, and I'd like something that we can ensure works
consistently.

> >   I'd strongly prefer to avoid custom trampolines unless they're strictly
> >   necessary for functional reasons, so that we can have this work reliably and
> >   consistently.
> 
> bpf trampoline is needed by bpf itself, not to replace ftrace trampolines.

As above, can you please let me know *why* specifically it is needed? Why can't
we invoke the BPF code through the usual ops mechanism?

Is that to avoid overhead, or are there other functional reasons you need a
special trampoline?

> > * If this is mostly about avoiding the ops list processing overhead, I believe
> >   we can implement some custom ops support more generally in ftrace which would
> >   still use a common trampoline but could directly call into those custom ops.
> >   I would strongly prefer this over custom trampolines.
> > 
> > * I'm looking to minimize the set of regs ftrace saves, and never save a full
> >   pt_regs, since today we (incompletely) fill that with bogus values and cannot
> >   acquire some state reliably (e.g. PSTATE). I'd like to avoid usage of pt_regs
> >   unless necessary, and I don't want to add additional reliance upon that
> >   structure.
> 
> Even if such a common trampoline is used, bpf trampoline is still
> necessary since we need to construct custom instructions to implement
> bpf functions, for example, to implement kernel function pointer with a
> bpf prog.

Sorry, but I'm struggling to understand this. What specifically do you need to
do that means this can't use the same calling convention as the regular ops
function pointers?

Thanks,
Mark.
