All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Hengqi Chen <hengqi.chen@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	 andrii@kernel.org, alan.maguire@oracle.com, olsajiri@gmail.com,
	 Jiri Olsa <jolsa@kernel.org>
Subject: Re: [PATCH bpf-next v3 2/3] libbpf: Support symbol versioning for uprobe
Date: Fri, 15 Sep 2023 13:20:03 -0700	[thread overview]
Message-ID: <CAEf4BzZ=RoEQkVy6b4uW4Bi8xnx6TimJ8=hYLJVVq8reLkus4w@mail.gmail.com> (raw)
In-Reply-To: <CAEyhmHQcDYvnUkE2EAbhenMYoXai7cC=q5xNgMC3X-+hfx6agw@mail.gmail.com>

On Fri, Sep 15, 2023 at 12:30 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Fri, Sep 15, 2023 at 1:12 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Sep 14, 2023 at 5:37 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > >
> > > On Wed, Sep 13, 2023 at 7:14 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Sun, Sep 10, 2023 at 6:51 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > >
> > > > > In current implementation, we assume that symbol found in .dynsym section
> > > > > would have a version suffix and use it to compare with symbol user supplied.
> > > > > According to the spec ([0]), this assumption is incorrect, the version info
> > > > > of dynamic symbols are stored in .gnu.version and .gnu.version_d sections
> > > > > of ELF objects. For example:
> > > > >
> > > > >     $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
> > > > >     000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5
> > > > >     000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34
> > > > >     000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5
> > > > >
> > > > >     $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
> > > > >       706: 000000000009b1a0   878 FUNC    GLOBAL DEFAULT   15 __pthread_rwlock_wrlock@GLIBC_2.2.5
> > > > >       2568: 000000000009b1a0   878 FUNC    GLOBAL DEFAULT   15 pthread_rwlock_wrlock@@GLIBC_2.34
> > > > >       2571: 000000000009b1a0   878 FUNC    GLOBAL DEFAULT   15 pthread_rwlock_wrlock@GLIBC_2.2.5
> > > > >
> > > > > In this case, specify pthread_rwlock_wrlock@@GLIBC_2.34 or
> > > > > pthread_rwlock_wrlock@GLIBC_2.2.5 in bpf_uprobe_opts::func_name won't work.
> > > > > Because the qualified name does NOT match `pthread_rwlock_wrlock` (without
> > > > > version suffix) in .dynsym sections.
> > > > >
> > > > > This commit implements the symbol versioning for dynsym and allows user to
> > > > > specify symbol in the following forms:
> > > > >   - func
> > > > >   - func@LIB_VERSION
> > > > >   - func@@LIB_VERSION
> > > > >
> > > > > In case of symbol conflicts, error out and users should resolve it by
> > > > > specifying a qualified name.
> > > > >
> > > > >   [0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html
> > > > >
> > > > > Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
> > > > > Acked-by: Jiri Olsa <jolsa@kernel.org>
> > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> > > > > ---
> > > > >  tools/lib/bpf/elf.c    | 146 +++++++++++++++++++++++++++++++++++++----
> > > > >  tools/lib/bpf/libbpf.c |   2 +-
> > > > >  2 files changed, 134 insertions(+), 14 deletions(-)
> > > > >
> > > > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c
> > > > > index 5c9e588b17da..825da903a34c 100644
> > > > > --- a/tools/lib/bpf/elf.c
> > > > > +++ b/tools/lib/bpf/elf.c
> > > > > @@ -1,5 +1,8 @@
> > > > >  // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > > > >
> > > > > +#ifndef _GNU_SOURCE
> > > > > +#define _GNU_SOURCE
> > > > > +#endif
> > > > >  #include <libelf.h>
> > > > >  #include <gelf.h>
> > > > >  #include <fcntl.h>
> > > > > @@ -10,6 +13,17 @@
> > > > >
> > > > >  #define STRERR_BUFSIZE  128
> > > > >
> > > > > +/* A SHT_GNU_versym section holds 16-bit words. This bit is set if
> > > > > + * the symbol is hidden and can only be seen when referenced using an
> > > > > + * explicit version number. This is a GNU extension.
> > > > > + */
> > > > > +#define VERSYM_HIDDEN  0x8000
> > > > > +
> > > > > +/* This is the mask for the rest of the data in a word read from a
> > > > > + * SHT_GNU_versym section.
> > > > > + */
> > > > > +#define VERSYM_VERSION 0x7fff
> > > > > +
> > > > >  int elf_open(const char *binary_path, struct elf_fd *elf_fd)
> > > > >  {
> > > > >         char errmsg[STRERR_BUFSIZE];
> > > > > @@ -64,11 +78,14 @@ struct elf_sym {
> > > > >         const char *name;
> > > > >         GElf_Sym sym;
> > > > >         GElf_Shdr sh;
> > > > > +       int ver;
> > > > > +       bool hidden;
> > > > >  };
> > > > >
> > > > >  struct elf_sym_iter {
> > > > >         Elf *elf;
> > > > >         Elf_Data *syms;
> > > > > +       Elf_Data *versyms;
> > > > >         size_t nr_syms;
> > > > >         size_t strtabidx;
> > > > >         size_t next_sym_idx;
> > > > > @@ -111,6 +128,18 @@ static int elf_sym_iter_new(struct elf_sym_iter *iter,
> > > > >         iter->nr_syms = iter->syms->d_size / sh.sh_entsize;
> > > > >         iter->elf = elf;
> > > > >         iter->st_type = st_type;
> > > > > +
> > > > > +       /* Version symbol table is meaningful to dynsym only */
> > > > > +       if (sh_type != SHT_DYNSYM)
> > > > > +               return 0;
> > > > > +
> > > > > +       scn = elf_find_next_scn_by_type(elf, SHT_GNU_versym, NULL);
> > > > > +       if (!scn)
> > > > > +               return 0;
> > > > > +       if (!gelf_getshdr(scn, &sh))
> > > > > +               return -EINVAL;
> > > > > +       iter->versyms = elf_getdata(scn, 0);
> > > > > +
> > > > >         return 0;
> > > > >  }
> > > > >
> > > > > @@ -119,6 +148,7 @@ static struct elf_sym *elf_sym_iter_next(struct elf_sym_iter *iter)
> > > > >         struct elf_sym *ret = &iter->sym;
> > > > >         GElf_Sym *sym = &ret->sym;
> > > > >         const char *name = NULL;
> > > > > +       GElf_Versym versym;
> > > > >         Elf_Scn *sym_scn;
> > > > >         size_t idx;
> > > > >
> > > > > @@ -138,12 +168,113 @@ static struct elf_sym *elf_sym_iter_next(struct elf_sym_iter *iter)
> > > > >
> > > > >                 iter->next_sym_idx = idx + 1;
> > > > >                 ret->name = name;
> > > > > +               ret->ver = 0;
> > > > > +               ret->hidden = false;
> > > > > +
> > > > > +               if (iter->versyms) {
> > > > > +                       if (!gelf_getversym(iter->versyms, idx, &versym))
> > > > > +                               continue;
> > > > > +                       ret->ver = versym & VERSYM_VERSION;
> > > > > +                       ret->hidden = versym & VERSYM_HIDDEN;
> > > > > +               }
> > > > >                 return ret;
> > > > >         }
> > > > >
> > > > >         return NULL;
> > > > >  }
> > > > >
> > > > > +static const char *elf_get_vername(Elf *elf, int ver)
> > > > > +{
> > > > > +       GElf_Verdaux verdaux;
> > > > > +       GElf_Verdef verdef;
> > > > > +       Elf_Data *verdefs;
> > > > > +       size_t strtabidx;
> > > > > +       GElf_Shdr sh;
> > > > > +       Elf_Scn *scn;
> > > > > +       int offset;
> > > > > +
> > > > > +       scn = elf_find_next_scn_by_type(elf, SHT_GNU_verdef, NULL);
> > > >
> > > > so this is a linear search, right? And we'll be doing it for every
> > > > .dynsym symbol. Let's do this once at the creation time and remember a
> > > > pointer inside struct Elf?
> > > >
> > >
> > > We reach here only when the symbol part match, and likely we get the
> > > desired one.
> >
> > sure, but you can have multiple versions, so multiple hits of this.
> > And the ELF itself could be pretty big with lots of sections. So I
> > think we should try to minimize number of linear searches over ELF
> > sections, if possible.
> >
> > > if we store the pointers in struct Elf, then we have to involve
> > > dynamic allocations.
> >
> > I'm not sure why dynamic allocations are needed?
> >
>
> Your last comment says remember those version name pointer in struct Elf,
> I thought this would be allocate an array and save those pointers
> (i.e. store a mapping from version index to version name)
>
> > But also I had struct elf_sym_iter in mind, where we already cache
> > Elf_Data *versyms, so why not do that with verdefs as well? I think
>
> verdef is not per symbol.

I meant to just not do elf_find_next_scn_by_type() search every time,
and just store Elf_Scn * pointer for SHT_GNU_verdef in iter state.


>
> > all these helpers (symbol_match and elf_get_vername) are called only
> > from elf_sym_iter stuff right now, right?
> >
> >
> > >
> > > > > +       if (!scn)
> > > > > +               return NULL;
> > > > > +       if (!gelf_getshdr(scn, &sh))
> > > > > +               return NULL;
> > > > > +       strtabidx = sh.sh_link;
> > > > > +       verdefs =  elf_getdata(scn, 0);
> > > > > +
> > > > > +       offset = 0;
> > > > > +       while (gelf_getverdef(verdefs, offset, &verdef)) {
> > > > > +               if (verdef.vd_ndx != ver) {
> > > > > +                       if (!verdef.vd_next)
> > > > > +                               break;
> > > > > +
> > > > > +                       offset += verdef.vd_next;
> > > > > +                       continue;
> > > > > +               }
> > > > > +
> > > > > +               if (!gelf_getverdaux(verdefs, offset + verdef.vd_aux, &verdaux))
> > > > > +                       break;
> > > > > +
> > > > > +               return elf_strptr(elf, strtabidx, verdaux.vda_name);
> > > > > +
> > > > > +       }
> > > > > +       return NULL;
> > > > > +}

[...]

  reply	other threads:[~2023-09-15 20:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-11  1:50 [PATCH bpf-next v3 0/3] libbpf: Support symbol versioning for uprobe Hengqi Chen
2023-09-11  1:50 ` [PATCH bpf-next v3 1/3] libbpf: Resolve symbol conflicts at the same offset " Hengqi Chen
2023-09-11  1:50 ` [PATCH bpf-next v3 2/3] libbpf: Support symbol versioning " Hengqi Chen
2023-09-12 23:14   ` Andrii Nakryiko
2023-09-14 12:36     ` Hengqi Chen
2023-09-14 17:12       ` Andrii Nakryiko
2023-09-15  7:30         ` Hengqi Chen
2023-09-15 20:20           ` Andrii Nakryiko [this message]
2023-09-11  1:50 ` [PATCH bpf-next v3 3/3] selftests/bpf: Add tests for " Hengqi Chen
2023-09-12 23:19   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzZ=RoEQkVy6b4uW4Bi8xnx6TimJ8=hYLJVVq8reLkus4w@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=hengqi.chen@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=olsajiri@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.