From: Espen Grindhaug <espen.grindhaug@gmail.com>
To: Yonghong Song <yhs@meta.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Mykola Lysenko <mykolal@fb.com>,
	Shuah Khan <shuah@kernel.org>,
	bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2] libbpf: Improve version handling when attaching uprobe
Date: Mon, 1 May 2023 18:30:24 +0200
Message-ID: <ZE/pIM/z7x+35KQo@eg>
In-Reply-To: <533437a4-a76d-96e0-b04a-ab8eb7b5fb7f@meta.com>

On Mon, May 01, 2023 at 08:23:35AM -0700, Yonghong Song wrote:
>
>
> On 5/1/23 6:00 AM, Espen Grindhaug wrote:
> > On Thu, Apr 27, 2023 at 06:19:29PM -0700, Yonghong Song wrote:
> > >
> > >
> > > On 4/27/23 12:19 PM, Espen Grindhaug wrote:
> > > > On Wed, Apr 26, 2023 at 02:47:27PM -0700, Yonghong Song wrote:
> > > > >
> > > > >
> > > > > On 4/23/23 11:55 AM, Espen Grindhaug wrote:
> > > > > > This change fixes the handling of versions in elf_find_func_offset.
> > > > > > In the previous implementation, we incorrectly assumed that the
> > > > >
> > > > > Could you give more explanation/example in the commit message
> > > > > what does 'incorrectly' mean here? In which situations the
> > > > > current libbpf implementation will not be correct?
> > > > >
> > > >
> > > > How about something like this?
> > > >
> > > >
> > > > libbpf: Improve version handling when attaching uprobe
> > > >
> > > > This change fixes the handling of versions in elf_find_func_offset.
> > > >
> > > > > For example, let's assume we are trying to attach a uprobe to pthread_create in
> > > > > glibc. Prior to this commit, it would fail with an error message saying 'elf:
> > > > > ambiguous match [...]'. This is because there are two entries in the symbol
> > > > > table with that name.
> > > >
> > > > $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
> > > > 0000000000094cc0 T pthread_create@GLIBC_2.2.5
> > > > 0000000000094cc0 T pthread_create@@GLIBC_2.34
> > > >
> > > > So we go ahead and modify our code to attach to 'pthread_create@@GLIBC_2.34',
> > > > and this also fails, but this time with the error 'elf: failed to find symbol
> > > > [...]'. This fails because we incorrectly assumed that the version information
> > > > would be present in the string found in the string table, but there is only the
> > > > string 'pthread_create'.
> > >
> > > I tried one example with my centos8 libpthread library.
> > >
> > > $ llvm-readelf -s /lib64/libc-2.28.so | grep pthread_cond_signal
> > >      39: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14
> > > pthread_cond_signal@@GLIBC_2.3.2
> > >      40: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14
> > > pthread_cond_signal@GLIBC_2.2.5
> > >    3160: 0000000000096250    43 FUNC    LOCAL  DEFAULT    14
> > > __pthread_cond_signal_2_0
> > >    3589: 0000000000095f70    43 FUNC    LOCAL  DEFAULT    14
> > > __pthread_cond_signal
> > >    5522: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14
> > > pthread_cond_signal@@GLIBC_2.3.2
> > >    5545: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14
> > > pthread_cond_signal@GLIBC_2.2.5
> > > $ nm -D /lib64/libc-2.28.so | grep pthread_cond_signal
> > > 0000000000095f70 T pthread_cond_signal@@GLIBC_2.3.2
> > > 0000000000096250 T pthread_cond_signal@GLIBC_2.2.5
> > > $
> > >
> > > > Note that the two pthread_cond_signal functions have different addresses,
> > > > which is expected as they are implemented for different versions.
> > >
> > > But in your case,
> > > > $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
> > > > 0000000000094cc0 T pthread_create@GLIBC_2.2.5
> > > > 0000000000094cc0 T pthread_create@@GLIBC_2.34
> > >
> > > > The two functions have the same address, which is very weird. I suspect
> > > > there is an issue here that at least needs some investigation.
> > >
> >
> > I am no expert on this, but as far as I can tell, this is normal,
> > although much more common on my Ubuntu machine than my Fedora machine.
> >
> > Script to find duplicates:
> >
> > nm -D /usr/lib64/libc-2.33.so | awk '
> > {
> >      addr = $1;
> >      symbol = $3;
> >      sub(/[@].*$/, "", symbol);
> >
> >      if (addr == prev_addr && symbol == prev_symbol) {
> >          if (prev_symbol_printed == 0) {
> >              print prev_line;
> >              prev_symbol_printed = 1;
> >          }
> >          print;
> >      } else {
> >          prev_symbol_printed = 0;
> >      }
> >      prev_addr = addr;
> >      prev_symbol = symbol;
> >      prev_line = $0;
> > }'
> >
> >
> > > > Second, for the symbol table, the following is the ELF encoding:
> > >
> > > typedef struct {
> > >          Elf64_Word      st_name;
> > >          unsigned char   st_info;
> > >          unsigned char   st_other;
> > >          Elf64_Half      st_shndx;
> > >          Elf64_Addr      st_value;
> > >          Elf64_Xword     st_size;
> > > } Elf64_Sym;
> > >
> > > where
> > > st_name
> > >
> > >      An index into the object file's symbol string table, which holds the
> > > character representations of the symbol names. If the value is nonzero, the
> > > value represents a string table index that gives the symbol name. Otherwise,
> > > the symbol table entry has no name.
> > >
> > > > So, the function name (including @..., @@...) should be in the string table,
> > > > which is the case for the above two pthread_cond_signal symbols.
> > >
> > > I think it is worthwhile to debug why in your situation
> > > pthread_create@GLIBC_2.2.5 and pthread_create@@GLIBC_2.34 do not
> > > have them in the string table.
> > >
> >
> > I think you are mistaken here; the strings in the strings table don't contain
> > the version. Take a look at this partial dump of the strings table.
> >
> > 	$ readelf -W -p .dynstr /usr/lib64/libc-2.33.so
> >
> > 	String dump of section '.dynstr':
> > 		[     1]  xdrmem_create
> > 		[     f]  __wctomb_chk
> > 		[    1c]  getmntent
> > 		[    26]  __freelocale
> > 		[    33]  __rawmemchr
> > 		[    3f]  _IO_vsprintf
> > 		[    4c]  getutent
> > 		[    55]  __file_change_detection_for_path
> > 	(...)
> > 		[  350e]  memrchr
> > 		[  3516]  pthread_cond_signal
> > 		[  352a]  __close
> > 	(...)
> > 		[  61b6]  GLIBC_2.2.5
> > 		[  61c2]  GLIBC_2.2.6
> > 		[  61ce]  GLIBC_2.3
> > 		[  61d8]  GLIBC_2.3.2
> > 		[  61e4]  GLIBC_2.3.3
> >
> > As you can see, the strings have no versions, and the version strings
> > themselves are also in this table as entries at the end of the table.
>
> I see you searched the .dynstr section. Do you think we should
> search .strtab instead, since it contains versioned symbols?
>

I searched .dynstr since my libc files only have that section, but I do see
your point. If const char *binary_path points to an executable and not a
.so file, then we would find some versioned symbols in the .strtab section.
However, since libbpf supports using the .so as binary_path, would we not
need the functionality to build the complete name regardless?
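
For reference, a minimal sketch of how the '@' versus '@@' separator could
be derived from a symbol's .gnu.version entry when building the complete
name. The helper name and the SKETCH_* constants are made up for
illustration (binutils calls the masks VERSYM_HIDDEN and VERSYM_VERSION),
and resolving the index to a version string via the version definition
section is assumed to happen elsewhere. This is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit layout of one .gnu.version entry. These names are local to this
 * sketch; binutils calls them VERSYM_HIDDEN and VERSYM_VERSION. */
#define SKETCH_VERSYM_HIDDEN  0x8000u
#define SKETCH_VERSYM_VERSION 0x7fffu

/*
 * Build "name@version" or "name@@version" the way nm -D prints a defined
 * symbol. versym is the symbol's .gnu.version entry; version is the string
 * that entry's index resolves to, or NULL if the symbol is unversioned.
 * Hypothetical helper, not a libbpf function. Returns what snprintf returns.
 */
static int sketch_full_name(char *buf, size_t len, const char *name,
			    uint16_t versym, const char *version)
{
	bool hidden = (versym & SKETCH_VERSYM_HIDDEN) != 0;
	uint16_t idx = versym & SKETCH_VERSYM_VERSION;

	/* Index 0 (local) and 1 (global) mean "no version attached". */
	if (idx <= 1 || !version)
		return snprintf(buf, len, "%s", name);

	/* Hidden bit set: non-default version, single '@'. */
	return snprintf(buf, len, "%s%s%s", name, hidden ? "@" : "@@", version);
}
```

With the hidden bit set and "GLIBC_2.2.5" as the version string, this would
produce 'pthread_create@GLIBC_2.2.5'; with the bit clear and "GLIBC_2.34",
it would produce 'pthread_create@@GLIBC_2.34', matching the nm output quoted
above.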

Adding a check to not build the full name if it already contains an '@' is
probably a good idea, though.
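
Roughly, the check could look like the sketch below. The function name, its
parameters, and the fixed-size buffer are assumptions for illustration only;
the version string and the default flag are assumed to have been looked up
in the version tables already.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Decide whether a symbol table entry matches the name the user asked for.
 *   sym_name   - string from .dynstr or .strtab
 *   version    - version string from the version tables, may be NULL
 *   is_default - true if this is the default ('@@') version
 *   requested  - user-supplied name, optionally with @ver or @@ver
 * Hypothetical helper, not a libbpf function.
 */
static bool sketch_sym_matches(const char *sym_name, const char *version,
			       bool is_default, const char *requested)
{
	char full[256];
	int n;

	if (!strchr(requested, '@')) {
		/* Unqualified request: compare base names only. */
		size_t base_len = strcspn(sym_name, "@");

		return strlen(requested) == base_len &&
		       strncmp(sym_name, requested, base_len) == 0;
	}

	/* The string table entry is already qualified: use it as-is. */
	if (strchr(sym_name, '@'))
		return strcmp(sym_name, requested) == 0;

	/* Qualified request, but this symbol has no version. */
	if (!version)
		return false;

	n = snprintf(full, sizeof(full), "%s%s%s", sym_name,
		     is_default ? "@@" : "@", version);
	if (n < 0 || (size_t)n >= sizeof(full))
		return false;

	return strcmp(full, requested) == 0;
}
```

An unqualified request still matches every version of a symbol here, so the
caller would have to keep handling the ambiguous case itself.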

> >
> > > >
> > > > > This patch reworks how we compare the symbol name provided by the user if it is
> > > > > qualified with a version (using @ or @@). We now look up the correct version
> > > > > string in the version symbol table and construct the full name, as nm does
> > > > > above, before comparing.
> > > >
> > > > > > version information would be present in the string found in the
> > > > > > string table.
> > > > > >
> > > > > > We now look up the correct version string in the version symbol
> > > > > > table before constructing the full name and then comparing.
> > > > > >
> > > > > > This patch adds support for both name@version and name@@version to
> > > > > > match output of the various elf parsers.
> > > > > >
> > > > > > Signed-off-by: Espen Grindhaug <espen.grindhaug@gmail.com>
> > > > >
> > > > > [...]


Thread overview: 9+ messages
2023-04-23 18:55 [PATCH v2] libbpf: Improve version handling when attaching uprobe Espen Grindhaug
2023-04-26 21:47 ` Yonghong Song
2023-04-27 19:19   ` Espen Grindhaug
2023-04-28  1:19     ` Yonghong Song
2023-05-01 13:00       ` Espen Grindhaug
2023-05-01 15:23         ` Yonghong Song
2023-05-01 16:30           ` Espen Grindhaug [this message]
2023-05-01 17:20             ` Yonghong Song
2023-05-02  4:02 ` Andrii Nakryiko
