From: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: siddhesh-9JcytcrH/bA+uJoB2kUjGw@public.gmane.org,
	"libc-alpha-9JcytcrH/bA+uJoB2kUjGw@public.gmane.org"
	<libc-alpha-9JcytcrH/bA+uJoB2kUjGw@public.gmane.org>
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	linux-man <linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Carlos O'Donell <carlos-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Rich Felker <dalias-/miJ2pyFWUyWIDz0JBNUog@public.gmane.org>,
	"H.J. Lu" <hjl.tools-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: Documenting the (dynamic) linking rules for symbol versioning
Date: Thu, 20 Apr 2017 14:40:16 +0200
Message-ID: <c6736794-60bb-517e-0bcf-2e80331f0f72@gmail.com>
In-Reply-To: <22f26755-f7f0-898a-ac74-3f6df92a22d7-9JcytcrH/bA+uJoB2kUjGw@public.gmane.org>

Hello Siddhesh,

Thanks for your response!

On 04/20/2017 08:05 AM, Siddhesh Poyarekar wrote:
> On Wednesday 19 April 2017 08:37 PM, Michael Kerrisk (man-pages) wrote:
>> 1. If looking for a versioned symbol (NAME@VERSION), the DL will search
>>    starting from the start of the link map ("namespace") until it finds the
>>    first instance of either a matching unversioned NAME or an exact version
>>    match on NAME@VERSION. Preloading takes advantage of the former case to
>>    allow easy overriding of versioned symbols in a library that is loaded 
>>    later in the link map.
> 
> I believe it is the other way around, i.e. it looks for the versioned
> symbol first and if it is not found, link against the unversioned symbol
> provided that it is public.

I think that I have failed to provide enough detail for
you to understand what I meant. Consider the following:

1. We want to interpose some symbol in glibc (say, "malloc@GLIBC_2.0")
   with a symbol of our own (perhaps via a preloaded library).
2. In our preloaded shared library, the interposing "malloc"
   need not be a versioned symbol.
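
The search order in this scenario can be illustrated with a toy model.
The sketch below is not glibc's lookup code (which does considerably
more); it only encodes the rule as stated in point 1: walk the link map
in order and stop at the first object that defines either an
unversioned NAME or an exact NAME@VERSION match. The struct names, the
function first_match(), and the library name "libpreload.so" are all
hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One symbol definition exported by an object; version == NULL means
   the definition is unversioned. (Hypothetical model type.) */
struct sym_def {
    const char *name;
    const char *version;
};

/* One entry in the modelled link map, listed in search order. */
struct object {
    const char *soname;
    const struct sym_def *defs;
    size_t ndefs;
};

/* Return the index of the first object that defines either an
   unversioned NAME or an exact NAME@VERSION match; -1 if none does. */
int
first_match(const struct object *map, size_t nobjs,
            const char *name, const char *version)
{
    assert(map != NULL && name != NULL && version != NULL);

    for (size_t i = 0; i < nobjs; i++) {
        for (size_t j = 0; j < map[i].ndefs; j++) {
            const struct sym_def *d = &map[i].defs[j];

            if (strcmp(d->name, name) != 0)
                continue;
            if (d->version == NULL || strcmp(d->version, version) == 0)
                return (int) i;
        }
    }
    return -1;
}

/* Example link map: a preloaded library with an unversioned malloc,
   placed ahead of a libc that defines malloc@GLIBC_2.0. */
const struct sym_def preload_defs[] = { { "malloc", NULL } };
const struct sym_def libc_defs[] = {
    { "malloc", "GLIBC_2.0" },
    { "free", "GLIBC_2.0" },
};
const struct object example_map[] = {
    { "libpreload.so", preload_defs, 1 },
    { "libc.so.6", libc_defs, 2 },
};
```

In this model, a reference to malloc@GLIBC_2.0 stops at the preloaded
object (index 0) because of its unversioned definition, while a
reference to free@GLIBC_2.0 falls through to the libc entry (index 1).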

At least

>> 2. The version notation NAME@@VERSION denotes the default version
>>    for NAME. This default version is used in the following places:
>>
>>    a) At static link time, this is the version that the static
>>       linker will bind to when creating the relocation record
>>       that will be used by the DL.
>>    b) When doing a dlsym() look-up on the unversioned symbol NAME.
>>       (See check_match() in elf/dl-lookup.c)
>>
>>    Is the default version used in any other circumstance?
> 
> Only (a) afaict, where do you see (b) happening?  Unversioned symbol
> lookups seem to happen independent of the @@ version.

See the following (tarball of code attached):

$ cat sv_lib_v3.c
/*#* sv_lib_v3.c

   COPYRIGHT-NOTICE
*/

#include <stdio.h>

#ifndef DEF_XYZ_V2
__asm__(".symver xyz_newest,xyz@@VER_3");
__asm__(".symver xyz_new,xyz@VER_2");
#else
__asm__(".symver xyz_newest,xyz@VER_3");
__asm__(".symver xyz_new,xyz@@VER_2");
#endif
__asm__(".symver xyz_old,xyz@VER_1");

__asm__(".symver pqr_new,pqr@@VER_3");
__asm__(".symver pqr_old,pqr@VER_2");

__asm__(".symver tuv_newest,tuv@@VER_3");
__asm__(".symver tuv_new,tuv@VER_2");
__asm__(".symver tuv_old,tuv@VER_1");

void xyz_old(void) { printf("v1 xyz\n"); }

void xyz_new(void) { printf("v2 xyz\n"); }

void xyz_newest(void) { printf("v3 xyz\n"); }

void tuv_old(void) { printf("v1 tuv\n"); }

void tuv_new(void) { printf("v2 tuv\n"); }

void tuv_newest(void) { printf("v3 tuv\n"); }

void pqr_new(void) { printf("v3 pqr\n"); }

void pqr_old(void) { printf("v2 pqr\n"); }

void abc(void) { printf("v3 abc\n"); }
void v123(void) { printf("v3 v123\n"); }

$ cat sv_v3.map
VER_1 {
        global: xyz; tuv;
        local:  [a-uw-z]*;      # Hide all other symbols
};

VER_2 {
        global: pqr;
} VER_1;

VER_3 {
        global: abc;
} VER_2;

$ # Build version 3 of shared library, where the default (@@) version
$ # of xyz is VER_3
$ gcc -g -c -fPIC -Wall sv_lib_v3.c
$ gcc -g -shared -o libsv.so sv_lib_v3.o -Wl,--version-script,sv_v3.map

$ # Build version 3 of shared library, where the default (@@) version
$ # of xyz is VER_2
$ gcc -DDEF_XYZ_V2 -g -c -fPIC -Wall sv_lib_v3.c
$ gcc -g -shared -o libsv_def_xyz_v2.so sv_lib_v3.o -Wl,--version-script,sv_v3.map

$ # Verify symbol versions in the two DSOs:
$
$ readelf --dyn-syms libsv.so | grep xyz
    20: 0000000000000930    19 FUNC    GLOBAL DEFAULT   12 xyz@VER_1
    21: 0000000000000956    19 FUNC    GLOBAL DEFAULT   12 xyz@@VER_3
    22: 0000000000000943    19 FUNC    GLOBAL DEFAULT   12 xyz@VER_2
$ readelf --dyn-syms libsv_def_xyz_v2.so | grep xyz
    20: 0000000000000943    19 FUNC    GLOBAL DEFAULT   12 xyz@@VER_2
    21: 0000000000000956    19 FUNC    GLOBAL DEFAULT   12 xyz@VER_3
    22: 0000000000000930    19 FUNC    GLOBAL DEFAULT   12 xyz@VER_1
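
The NAME@VERSION / NAME@@VERSION strings in the last column of that
output can also be decoded mechanically, which is handy when checking
many symbols. The helper below is only a sketch: the struct
versioned_name and the function parse_versioned_name() are hypothetical
names made up here, and the 64-byte buffers are an arbitrary limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct versioned_name {
    char name[64];
    char version[64];   /* Empty string if the symbol is unversioned */
    bool is_default;    /* True for the NAME@@VERSION form */
};

/* Split a "NAME", "NAME@VERSION" or "NAME@@VERSION" string.
   Returns 0 on success, -1 if a component is empty or too long. */
int
parse_versioned_name(const char *spec, struct versioned_name *out)
{
    assert(spec != NULL && out != NULL);

    const char *at = strchr(spec, '@');
    size_t name_len = (at != NULL) ? (size_t) (at - spec) : strlen(spec);

    if (name_len == 0 || name_len >= sizeof(out->name))
        return -1;
    memcpy(out->name, spec, name_len);
    out->name[name_len] = '\0';

    out->version[0] = '\0';
    out->is_default = false;
    if (at == NULL)
        return 0;               /* Plain unversioned NAME */

    const char *ver = at + 1;
    if (*ver == '@') {          /* Second '@' marks the default version */
        out->is_default = true;
        ver++;
    }

    size_t ver_len = strlen(ver);
    if (ver_len == 0 || ver_len >= sizeof(out->version))
        return -1;
    memcpy(out->version, ver, ver_len + 1);   /* Include the '\0' */
    return 0;
}
```

Fed "xyz@@VER_3", this fills in name "xyz", version "VER_3", and sets
is_default; fed "xyz@VER_1", it leaves is_default false.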

$ cat dynload.c 
#include <dlfcn.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

int
main(int argc, char *argv[])
{
    void *libHandle;            /* Handle for shared library */
    void (*funcp)(void);        /* Pointer to function with no arguments */
    const char *err;

    if (argc != 3 || strcmp(argv[1], "--help") == 0) {
        fprintf(stderr, "Usage: %s lib-path func-name\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Load the shared library and get a handle for later use */

    libHandle = dlopen(argv[1], RTLD_LAZY);
    if (libHandle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        exit(EXIT_FAILURE);
    }

    /* Search library for symbol named in argv[2] */

    (void) dlerror();                           /* Clear dlerror() */
    *(void **) (&funcp) = dlsym(libHandle, argv[2]);
    err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym: %s\n", err);
        exit(EXIT_FAILURE);
    }

    /* Try calling the address returned by dlsym() as a function
       that takes no arguments */

    (*funcp)();

    dlclose(libHandle);                         /* Close the library */

    exit(EXIT_SUCCESS);
}

$ gcc -o dynload dynload.c -ldl
$ ./dynload ./libsv.so xyz
v3 xyz
$ ./dynload ./libsv_def_xyz_v2.so xyz
v2 xyz

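Note the last line: with libsv_def_xyz_v2.so, dlsym() resolved the
unversioned name xyz to the default version, xyz@@VER_2 (not to the
newest version, xyz@VER_3).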
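
For comparison, a specific version can be requested explicitly with
dlvsym(), a GNU extension declared in <dlfcn.h> when _GNU_SOURCE is
defined. The following is a minimal sketch along the lines of
dynload.c; the function name call_versioned() is hypothetical, and on
older glibc versions the program needs to be linked with -ldl.

```c
#define _GNU_SOURCE             /* Needed for the dlvsym() declaration */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Call the no-argument function NAME@VERSION from the library at
   lib_path. Returns 0 on success, or -1 if the library cannot be
   loaded or the requested symbol version is not found. */
int
call_versioned(const char *lib_path, const char *name, const char *version)
{
    assert(lib_path != NULL && name != NULL && version != NULL);

    void *handle = dlopen(lib_path, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* Unlike dlsym(), dlvsym() takes the version string as well */
    void (*funcp)(void);
    *(void **) (&funcp) = dlvsym(handle, name, version);
    if (funcp == NULL) {
        fprintf(stderr, "dlvsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    (*funcp)();
    dlclose(handle);
    return 0;
}
```

A call such as call_versioned("./libsv_def_xyz_v2.so", "xyz", "VER_3")
asks for xyz@VER_3 by name, regardless of which version carries the
default (@@) marker.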

>> 3. There can of course be only one NAME@@VERSION definition.
> 
> Right.
> 
>> 4. The version notation NAME@VERSION denotes a "hidden" version of the
>>    symbol. Such versions are not directly accessible, but can be
>>    accessed via asm(".symver") magic. There can be multiple "hidden"
>>    versions of a symbol.
> 
> It is hidden only to the static linker, i.e. it links against either
> unversioned or @@ versions of a symbol.

Yes.

>> 5. When resolving a reference to an unversioned symbol, NAME,
>>    in an executable that was linked against a non-symbol-versioned
>>    library, the DL will, if it finds a symbol-versioned library
>>    in the link map, use the earliest version of the symbol provided
>>    by that library.
>>
>>    I presume that this behavior exists to allow easy migration
>>    of a non-symbol-versioned application onto a system with
>>    a symbol-versioned library that uses the same major
>>    version real name for the library as was formerly used by
>>    a non-symbol-versioned library. (My knowledge of this area
>>    was pretty much nonexistent at that time, but presumably this 
>>    is what was done in the transition from glibc 2.0 to glibc 2.1.)
>>    
>>    To clarify the scenario I am talking about:
>>
>>    a) We have prog.c which calls xyz() and is linked against a
>>       non-symbol-versioned libxyz.so.2.
>>
>>    b) Later, a symbol-versioned libxyz.so.2 is created that defines
>>       (for example):
>>           
>>           xyz@@VER_3
>>           xyz@VER_2
>>           xyz@VER_1
>>
>>       (Alternatively, we preload a shared library that defines
>>       these three versions of 'xyz'.)
>>
>>    c) If we run the ancient binary 'prog', which refers
>>       to an unversioned 'xyz', that reference will resolve to xyz@VER_1.
> 
> That seems correct.  The VERSYM section orders the versions by index
> (which seems to be based on ordering of the symbols in the version
> script) and the oldest in that sequence seems to win for unversioned
> lookup.  For a dlsym(), the newest one wins.

Thanks for the confirmation.

>> 6. [An additional detail to 5, which surprised me at first, but
>>    I can sort of convince myself it makes sense...]
>>
>>    In the scenario described in point 5, an unversioned
>>    reference to NAME will be resolved to the earliest versioned
>>    symbol NAME inside a symbol-versioned library if there
>>    is a version of NAME in the *lowest* version provided
>>    by the library. Otherwise, it will resolve to the *latest*
>>    version of NAME (and *not* to the default NAME@@VERSION
>>    version of the symbol).
>>
>>    To clarify with an example:
>>
>>    We have prog.c that calls abc() and xyz(), and is linked
>>    against a non-symbol-versioned library, lib_nonver.so,
>>    that provides definitions of abc() and xyz().
>>
>>    Then, we have a symbol-versioned library, lib_ver.so,
>>    that has three versions, VER_1, VER_2, and VER_3, and defines
>>    the following symbols:
>>
>>        xyz@@VER_3
>>        xyz@VER_2
>>        xyz@VER_1
>>
>>        abc@@VER_3
>>        abc@VER_2
>>
>>    Then we run 'prog' using:
>>
>>        LD_PRELOAD=./lib_ver.so ./prog
>>
>>    In this case, 'prog' will call xyz@VER_1 and abc@@VER_3
>>    (*not* abc@VER_2) from lib_ver.so.
>>
>>    I can convince myself (sort of) that this makes some sense by
>>    thinking about things from the perspective of the scenario of
>>    migrating from the non-symbol-versioned shared library to the
>>    symbol-versioned shared library: the old non-symbol-versioned library
>>    never provided a symbol 'abc()' so in this scenario, use the latest
>>    version of 'abc'. This applies even if the latest version is not
>>    the 'default'.  In other words, even if the versions of 'abc'
>>    provided by lib_ver.so were the following, it would still be the
>>    VER_3 of abc() that is called:
>>
>>        abc@VER_3
>>        abc@@VER_2
>>
>>    Am I right about my rough guess for the rationale for point 6,
>>    or is there something else I should know/write about?
> 
> This seems odd; I hope someone here knows why this really is and can
> (hopefully) point to resources.

Florian commented on this point already. See his mail.

> Documentation about the dynamic linker
> is generally very hard to find,

It sure is...

> so I'm glad you're doing this.

Let's see if I can make something useful...

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

[-- Attachment #2: symver_default.tar.gz --]
[-- Type: application/gzip, Size: 1069 bytes --]
