From: Song Liu <song@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Song Liu <songliubraving@fb.com>,
	Ilya Leoshkevich <iii@linux.ibm.com>, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kernel Team <Kernel-team@fb.com>,
	Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>
Subject: Re: [PATCH v6 bpf-next 6/7] bpf: introduce bpf_prog_pack allocator
Date: Tue, 25 Jan 2022 15:09:18 -0800
Message-ID: <CAPhsuW7AzQL5y+4stw_MZCg2sR3e5qe1YS0L1evxhCvfTWF5+Q@mail.gmail.com>
In-Reply-To: <CAADnVQJ8-XVYb21bFRgsaoj7hzd89NSbSOBj0suwsYSL89pxsg@mail.gmail.com>

On Tue, Jan 25, 2022 at 2:48 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 2:25 PM Song Liu <song@kernel.org> wrote:
> >
> > On Tue, Jan 25, 2022 at 12:00 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Jan 24, 2022 at 11:21 PM Song Liu <song@kernel.org> wrote:
> > > >
> > > > On Mon, Jan 24, 2022 at 9:21 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jan 24, 2022 at 10:27 AM Song Liu <songliubraving@fb.com> wrote:
> > > > > > >
> > > > > > > Are arches expected to allocate rw buffers in different ways? If not,
> > > > > > > I would consider putting this into the common code as well. Then
> > > > > > > arch-specific code would do something like
> > > > > > >
> > > > > > >  header = bpf_jit_binary_alloc_pack(size, &prg_buf, &prg_addr, ...);
> > > > > > >  ...
> > > > > > >  /*
> > > > > > >   * Generate code into prg_buf, the code should assume that its first
> > > > > > >   * byte is located at prg_addr.
> > > > > > >   */
> > > > > > >  ...
> > > > > > >  bpf_jit_binary_finalize_pack(header, prg_buf);
> > > > > > >
> > > > > > > where bpf_jit_binary_finalize_pack() would copy prg_buf to header and
> > > > > > > free it.
> > > > >
> > > > > It feels right, but bpf_jit_binary_finalize_pack() sounds 100% arch
> > > > > dependent. The only thing it will do is perform a copy via text_poke.
> > > > > What else?
> > > > >
> > > > > > I think this should work.
> > > > > >
> > > > > > We will need an API like: bpf_arch_text_copy, which uses text_poke_copy()
> > > > > > > for x86_64 and s390_kernel_write() for s390. We will use bpf_arch_text_copy
> > > > > > > to
> > > > > > >   1) write header->size;
> > > > > > >   2) do the final copy in bpf_jit_binary_finalize_pack().
> > > > >
> > > > > we can combine all text_poke operations into one.
> > > > >
> > > > > Can we add an 'image' pointer into struct bpf_binary_header ?
> > > >
> > > > There is a 4-byte hole in bpf_binary_header. How about we put
> > > > image_offset there? Actually we only need 2 bytes for offset.
> > > >
> > > > > Then do:
> > > > > int bpf_jit_binary_alloc_pack(size, &ro_hdr, &rw_hdr);
> > > > >
> > > > > ro_hdr->image would be the address used to compute offsets by JIT.
> > > >
> > > > If we only do one text_poke(), we cannot write ro_hdr->image yet. We
> > > > can use ro_hdr + rw_hdr->image_offset instead.
> > >
> > > Good points.
> > > Maybe let's go back to Ilya's suggestion and return 4 pointers
> > > from bpf_jit_binary_alloc_pack ?
> >
> > How about we use image_offset, like:
> >
> > struct bpf_binary_header {
> >         u32 size;
> >         u32 image_offset;
> >         u8 image[] __aligned(BPF_IMAGE_ALIGNMENT);
> > };
> >
> > Then we can use
> >
> > image = (void *)header + header->image_offset;
>
> I'm not excited about it, since it leaks header details into JITs.
> Looks like we don't need JIT to be aware of it.
> How about we do random() % roundup(sizeof(struct bpf_binary_header), 64)
> to pick the image start and populate
> image-sizeof(struct bpf_binary_header) range
> with 'int 3'.
> This way we can completely hide binary_header inside generic code.
> The bpf_jit_binary_alloc_pack() would return ro_image and rw_image only.
> And JIT would pass them back into bpf_jit_binary_finalize_pack().
> From the image pointer it would be trivial to get to binary_header with &63.
> The 128 byte offset that we use today was chosen arbitrarily.
> We were burning the whole page for a single program, so 128 bytes zone
> at the front was ok.
> Now we will be packing progs rounded up to 64 bytes, so it's better
> to avoid wasting those 128 bytes regardless.

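For reference, the layout proposed in the quoted text above could be
sketched in userspace C roughly as below. This is only my reading of it,
not kernel code: hdr_sketch, place_image() and image_to_hdr() are
hypothetical names, the header is assumed to sit at the start of a
64-byte-aligned chunk, and the image start is assumed to fall somewhere
between the end of the header and the end of that first chunk, with the
gap filled by 'int 3' (0xcc).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 64u   /* packing granularity mentioned in the thread */
#define INT3  0xcc  /* x86 'int 3' opcode byte */

/* Hypothetical stand-in for struct bpf_binary_header (size field only). */
struct hdr_sketch {
	uint32_t size;
};

/*
 * Write the header at the start of a 64-byte-aligned buffer, pick an image
 * start inside the first chunk, and pad the gap with int3 bytes.
 * 'rnd' stands in for the random value; 'buf' must be 64-byte aligned.
 */
static uint8_t *place_image(uint8_t *buf, uint32_t size, unsigned int rnd)
{
	struct hdr_sketch *hdr = (struct hdr_sketch *)buf;
	size_t span = CHUNK - sizeof(*hdr);
	size_t off = sizeof(*hdr) + rnd % span; /* sizeof(*hdr) <= off < CHUNK */

	hdr->size = size;
	memset(buf + sizeof(*hdr), INT3, off - sizeof(*hdr));
	return buf + off;
}

/* Recover the header from the image pointer by clearing the low 6 bits. */
static struct hdr_sketch *image_to_hdr(const uint8_t *image)
{
	return (struct hdr_sketch *)((uintptr_t)image & ~(uintptr_t)(CHUNK - 1));
}
```

With this, the JIT side only ever sees the image pointer; the generic
code gets back to the header with the mask.
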
In bpf_jit_binary_hdr(), we calculate the header as image & PAGE_MASK.
If we want s/PAGE_MASK/~63/ for x86_64, we will need different versions
of bpf_jit_binary_hdr(). It is not on any hot path, so we can make it
__weak and let x86_64 override it. Other than this, I think the solution
works fine.
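
A rough userspace sketch of the __weak idea, assuming GCC/Clang attribute
syntax. The names here (PAGE_MASK_SKETCH, bin_hdr_sketch,
jit_binary_hdr_sketch) are hypothetical, and the function takes the image
address directly, which simplifies the real bpf_jit_binary_hdr()
signature:

```c
#include <stdint.h>

/* Assumes 4 KiB pages for illustration. */
#define PAGE_MASK_SKETCH (~((uintptr_t)4096 - 1))

/* Hypothetical stand-in for struct bpf_binary_header. */
struct bin_hdr_sketch {
	uint32_t size;
};

/*
 * Generic default: the header sits at the start of the page that holds
 * the image. Marked weak so another object file can replace it.
 */
__attribute__((weak))
struct bin_hdr_sketch *jit_binary_hdr_sketch(const void *image)
{
	return (struct bin_hdr_sketch *)((uintptr_t)image & PAGE_MASK_SKETCH);
}

/*
 * An arch using the packed layout would provide a non-weak definition in
 * its own file, for example:
 *
 *   return (struct bin_hdr_sketch *)((uintptr_t)image & ~(uintptr_t)63);
 *
 * and the linker would pick that one over the weak default.
 */
```
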

Thanks,
Song
