From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"x86@kernel.org" <x86@kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 02/18] bpf: Add bpf_arch_text_poke() helper
Date: Fri, 8 Nov 2019 15:05:25 -0800
Message-ID: <20191108230524.4j5jui2izyexxhkx@ast-mbp.dhcp.thefacebook.com>
In-Reply-To: <20191108213624.GM3079@worktop.programming.kicks-ass.net>

On Fri, Nov 08, 2019 at 10:36:24PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 08, 2019 at 11:32:41AM -0800, Alexei Starovoitov wrote:
> > On Fri, Nov 8, 2019 at 5:42 AM Alexei Starovoitov <ast@fb.com> wrote:
> > >
> > > On 11/8/19 1:36 AM, Peter Zijlstra wrote:
> > > > On Fri, Nov 08, 2019 at 10:11:56AM +0100, Peter Zijlstra wrote:
> > > >> On Thu, Nov 07, 2019 at 10:40:23PM -0800, Alexei Starovoitov wrote:
> > > >>> Add bpf_arch_text_poke() helper that is used by BPF trampoline logic to patch
> > > >>> nops/calls in kernel text into calls into BPF trampoline and to patch
> > > >>> calls/nops inside BPF programs too.
> > > >>
> > > >> This thing assumes the text is unused, right? That isn't spelled out
> > > >> anywhere. The implementation is very much unsafe vs concurrent execution
> > > >> of the text.
> > > >
> > > > Also, what NOP/CALL instructions will you be hijacking? If you're
> > > > planning on using the fentry nops, then what ensures this and ftrace
> > > > don't trample on one another? Similar for kprobes.
> > > >
> > > > In general, what ensures every instruction only has a single modifier?
> > >
> > > Looks like you didn't bother reading cover letter and missed a month
> 
> I did indeed not. A Changelog should be self sufficient and this one is
> sorely lacking. The cover letter is not preserved and should therefore
> not contain anything of value that is not also covered in the
> Changelogs.
> 
> > > of discussions between me and Steven regarding exactly this topic
> > > though you were directly cc-ed in all threads :(
> 
> I read some of it; it is a sad fact that I cannot read all email in my
> inbox, esp. not if, like in the last week or so, I'm busy hunting a
> regression.
> 
> And what I did remember of the emails I saw left me with the questions
> that were not answered by the changelog.
> 
> > > tldr for kernel fentry nops it will be converted to use
> > > register_ftrace_direct() whenever it's available.
> 
> So why the rush and not wait for that work to complete? It appears to me
> that without due coordination between bpf and ftrace badness could
> happen.
> 
> > > For all other nops, calls, jumps that are inside BPF programs BPF infra
> > > will continue modifying them through this helper.
> > > Daniel's upcoming bpf_tail_call() optimization will use text_poke as well.
> 
> This is probably off topic, but isn't tail-call optimization something
> done at JIT time and therefore not in need of text_poke()?

Not quite. bpf_tail_call() goes through prog_array, which is an indirect jmp
and so suffers from retpoline overhead. The verifier can see that in a lot of
cases the prog_array is used with a constant index into the array instead of a
variable. In such cases the indirect jmps can be replaced with direct jmps.
That is essentially what Daniel's patches are doing; they build on top of the
bpf_arch_text_poke() and trampoline that I'm introducing in this set.
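
To illustrate what turning a patch site into a direct jmp or call means at the
byte level, here is a minimal user-space sketch. It is not code from this patch
set. The names encode_rel32() and poke_site() are hypothetical, and the sketch
assumes the x86-64 5-byte forms: opcode 0xE9 (jmp) or 0xE8 (call) followed by a
32-bit little-endian displacement relative to the end of the instruction. It
patches an ordinary buffer, so it does nothing about concurrent execution of
the text, which is the concern raised earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATCH_SIZE 5     /* one opcode byte plus a 32-bit displacement */
#define OP_CALL    0xE8  /* x86 near call rel32 */
#define OP_JMP     0xE9  /* x86 near jmp rel32 */

/* A 5-byte x86 NOP commonly used as a patchable placeholder. */
static const uint8_t nop5[PATCH_SIZE] = { 0x0F, 0x1F, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper: encode a 5-byte call/jmp located at address 'ip'
 * that transfers control to 'target'. The displacement is measured from
 * the address of the next instruction (ip + 5).
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_rel32(uint8_t out[PATCH_SIZE], uint8_t opcode,
                        uint64_t ip, uint64_t target)
{
    int64_t disp;
    uint32_t rel;

    assert(out != NULL);
    disp = (int64_t)(target - (ip + PATCH_SIZE));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    rel = (uint32_t)(int32_t)disp;
    out[0] = opcode;
    /* Store the displacement byte by byte in little-endian order. */
    out[1] = (uint8_t)(rel & 0xff);
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
    return 0;
}

/*
 * Hypothetical helper: replace the 5 bytes at 'site' with 'new_insn', but
 * only if the site currently holds 'expected_old'. The check models the
 * "single modifier" concern: refuse to patch if someone else already
 * changed the instruction. Returns 0 on success, -1 on mismatch.
 */
static int poke_site(uint8_t *site, const uint8_t *expected_old,
                     const uint8_t *new_insn)
{
    if (memcmp(site, expected_old, PATCH_SIZE) != 0)
        return -1;
    memcpy(site, new_insn, PATCH_SIZE);
    return 0;
}
```

A caller would encode the new instruction with encode_rel32() and then call
poke_site() with nop5 as the expected old contents. Patching the same site a
second time with the same expectation fails, because the site no longer holds
the NOP.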

Another set is being prepared by Bjorn that also builds on top of
bpf_arch_text_poke() and trampoline. It serves the purpose of getting rid of
the indirect call when a driver calls into a BPF program for the first time.
We've looked at your static_call and concluded that it doesn't quite cut it for
this use case.

The third framework is being worked on by Martin, who is using the BPF
trampoline for BPF-based TCP extensions. This bit is not related to indirect
call/jmp optimization, but it needs the trampoline.

> > I was thinking more about this.
> > Peter,
> > do you mind we apply your first patch:
> > https://lore.kernel.org/lkml/20191007081944.88332264.2@infradead.org/
> > to both tip and bpf-next trees?
> 
> That would indeed be a much better solution. I'll repost much of that on
> Monday, and then we'll work on getting at the very least that one patch
> in a tip/branch we can share.

Awesome! I can certainly wait till next week. I just don't want to miss the
merge window for the work that is ready. More below.

> > Then I can use text_poke_bp() as-is without any additional ugliness
> > on my side that would need to be removed in few weeks.
> 
> This I do _NOT_ understand. Why are you willing to merge a known broken
> patch? What is the rush, why can't you wait for all the prerequisites to
> land?

People have deadlines, and here I'm not talking about fb deadlines. If it were
only up to me I could have waited until your and Steven's patches land in
Linus's tree. Then Dave would pick them up after the merge window into net-next
and bpf things would be ready for the next release, which is in 1.5 + 2 + 8 =
11.5 weeks (assuming 1.5 weeks until the merge window, 2 weeks of merge window,
and 8 weeks for the next release cycle).
But most of the bpf things are ready. I have one more follow up to do for
another feature. The first 4-5 patches of my set will enable Bjorn's, Daniel's,
and Martin's work. So I'm mainly looking for a way to converge three trees
during the merge window with no conflicts.

Just saw that Steven posted his set. That is great. If you land your first part
of text_poke_bp() next week into tip it will enable us to cherry-pick the first
few patches from tip and continue with bpf trampoline in net-next. Then during
the merge window tip, Steven's tree and net-next land in Linus's tree. Then I'll
send a small follow up to switch to Steven's register_ftrace_direct() in places
that can use it and the other bits of bpf will keep using your text_poke_bp()
because it's for the code inside generated bpf progs, various generated
trampolines and such. The conversion of some of the bpf bits to
register_ftrace_direct() can be delayed by a release if really necessary, since
the text_poke_bp() approach will work fine, just not as nicely as a full
integration via ftrace.
imo it's the best path for 3 trees to converge without delaying things for bpf
folks by a full release. In the end the deadlines are met and a bunch of people
are unblocked and happy. I hope that explains the rush.

