BPF Archive on lore.kernel.org
* pahole and LTO
@ 2020-01-10 16:44 Stanislav Fomichev
  2020-01-10 16:47 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 4+ messages in thread
From: Stanislav Fomichev @ 2020-01-10 16:44 UTC
  To: acme, bpf; +Cc: netdev, daniel, ast, andriin, morbo

tl;dr - building the kernel with clang and lto breaks BTF generation because
pahole doesn't seem to understand cross-cu references.

Can be reproduced with the following:
    $ cat a.c
    struct s;

    void f1() {}

    __attribute__((always_inline)) void f2(struct s *p)
    {
            if (p)
                    f1();
    }
    $ cat b.c
    struct s {
            int x;
    };

    void f2(struct s *p);

    int main()
    {
            struct s s = { 10 };
            f2(&s);
    }
    $ clang -fuse-ld=lld -flto {a,b}.c -g

    $ pahole a.out
    tag__recode_dwarf_type: couldn't find 0x3f type for 0x99 (inlined_subroutine)!
    lexblock__recode_dwarf_types: couldn't find 0x3f type for 0x99 (inlined_subroutine)!
    struct s {
            int                        x;                    /*     0     4 */

            /* size: 4, cachelines: 1, members: 1 */
            /* last cacheline: 4 bytes */
    };

From what I can tell, pahole internally loops over each cu and resolves only
local references, while the dwarf spec (table 2.3) states the following
about 'reference':
"Refers to one of the debugging information entries that describe the program.
There are four types of reference. The first is an offset relative to the
beginning of the compilation unit in which the reference occurs and must
refer to an entry within that same compilation unit. The second type of
reference is the offset of a debugging information entry in any compilation
unit, including one different from the unit containing the reference. The
third type of reference is an indirect reference to a type definition using
an 8-byte signature for that type. The fourth type of reference is a reference
from within the .debug_info section of the executable or shared object file to
a debugging information entry in the .debug_info section of a supplementary
object file."

In particular: "The second type of reference is the offset of a debugging
information entry in any compilation unit, including one different from the
unit containing the reference."
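
To make the distinction concrete, here is a minimal sketch in C (not pahole
code) that decides whether a reference crosses a CU boundary. The struct and
function names are hypothetical. The CU ranges are derived from the readelf
dump in this message, assuming 32-bit DWARF, where the unit length does not
count its own 4-byte field. readelf prints reference targets as section-global
offsets, so <0x3f> can be compared directly against the CU ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one compilation unit in .debug_info. */
struct cu_range {
    uint64_t start; /* section offset of the CU header */
    uint64_t end;   /* one past the last byte of the CU */
};

/*
 * Return the index of the CU whose byte range contains the section-global
 * offset `off`, or -1 if no CU contains it.
 */
int cu_index_for_offset(const struct cu_range *cus, size_t n, uint64_t off)
{
    assert(cus != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (off >= cus[i].start && off < cus[i].end)
            return (int)i;
    return -1;
}

/*
 * A reference is cross-CU when its target lives in a different CU than
 * the DIE that holds the reference. Returns 1 if so, 0 otherwise.
 */
int is_cross_cu_ref(const struct cu_range *cus, size_t n,
                    uint64_t die_off, uint64_t target_off)
{
    int from = cu_index_for_offset(cus, n, die_off);
    int to = cu_index_for_offset(cus, n, target_off);

    return from >= 0 && to >= 0 && from != to;
}

/* The two CUs from the dump: start, then start + 4 + unit length. */
const struct cu_range example_cus[] = {
    { 0x00, 0x00 + 4 + 0x44 }, /* a.c: [0x00, 0x48) */
    { 0x48, 0x48 + 4 + 0x7f }, /* b.c: [0x48, 0xcb) */
};
```

With these ranges, is_cross_cu_ref(example_cus, 2, 0x99, 0x3f) is true: the
inlined_subroutine at 0x99 sits in the b.c unit while its abstract origin at
0x3f sits in the a.c unit, which matches the two pahole errors above.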


So the question is: is this a (known) issue? Is it something that's omitted
on purpose? Or is it not implemented because LTO is not (yet) widely used?


Here is the dwarf:

$ readelf --debug-dump=info a.out
Contents of the .debug_info section:

  Compilation Unit @ offset 0x0:
   Length:        0x44 (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_producer    : (indirect string, offset: 0x11): clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5fe4679cc9cfb4941b766db07bf3cd928075d204)
    <10>   DW_AT_language    : 12	(ANSI C99)
    <12>   DW_AT_name        : (indirect string, offset: 0x0): a.c
    <16>   DW_AT_stmt_list   : 0x0
    <1a>   DW_AT_comp_dir    : (indirect string, offset: 0x7a): /usr/local/google/home/sdf/tmp/lto
    <1e>   DW_AT_low_pc      : 0x201730
    <26>   DW_AT_high_pc     : 0x6
 <1><2a>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2b>   DW_AT_low_pc      : 0x201730
    <33>   DW_AT_high_pc     : 0x6
    <37>   DW_AT_frame_base  : 1 byte block: 56 	(DW_OP_reg6 (rbp))
    <39>   DW_AT_name        : (indirect string, offset: 0xa4): f1
    <3d>   DW_AT_decl_file   : 1
    <3e>   DW_AT_decl_line   : 3
    <3f>   DW_AT_external    : 1
 <1><3f>: Abbrev Number: 3 (DW_TAG_subprogram)
    <40>   DW_AT_name        : (indirect string, offset: 0x4): f2
    <44>   DW_AT_decl_file   : 1
    <45>   DW_AT_decl_line   : 5
    <46>   DW_AT_prototyped  : 1
    <46>   DW_AT_external    : 1
    <46>   DW_AT_inline      : 1	(inlined)
 <1><47>: Abbrev Number: 0
  Compilation Unit @ offset 0x48:
   Length:        0x7f (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><53>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <54>   DW_AT_producer    : (indirect string, offset: 0x11): clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5fe4679cc9cfb4941b766db07bf3cd928075d204)
    <58>   DW_AT_language    : 12	(ANSI C99)
    <5a>   DW_AT_name        : (indirect string, offset: 0x7): b.c
    <5e>   DW_AT_stmt_list   : 0x3a
    <62>   DW_AT_comp_dir    : (indirect string, offset: 0x7a): /usr/local/google/home/sdf/tmp/lto
    <66>   DW_AT_low_pc      : 0x201740
    <6e>   DW_AT_high_pc     : 0x1f
 <1><72>: Abbrev Number: 4 (DW_TAG_subprogram)
    <73>   DW_AT_low_pc      : 0x201740
    <7b>   DW_AT_high_pc     : 0x1f
    <7f>   DW_AT_frame_base  : 1 byte block: 56 	(DW_OP_reg6 (rbp))
    <81>   DW_AT_name        : (indirect string, offset: 0x9d): main
    <85>   DW_AT_decl_file   : 1
    <86>   DW_AT_decl_line   : 7
    <87>   DW_AT_type        : <0xae>
    <8b>   DW_AT_external    : 1
 <2><8b>: Abbrev Number: 5 (DW_TAG_variable)
    <8c>   DW_AT_location    : 2 byte block: 91 78 	(DW_OP_fbreg: -8)
    <8f>   DW_AT_name        : (indirect string, offset: 0xb): s
    <93>   DW_AT_decl_file   : 1
    <94>   DW_AT_decl_line   : 9
    <95>   DW_AT_type        : <0xb5>
 <2><99>: Abbrev Number: 6 (DW_TAG_inlined_subroutine)
    <9a>   DW_AT_abstract_origin: <0x3f>
    <9e>   DW_AT_low_pc      : 0x201752
    <a6>   DW_AT_high_pc     : 0x5
    <aa>   DW_AT_call_file   : 1
    <ab>   DW_AT_call_line   : 10
    <ac>   DW_AT_call_column : 
 <2><ad>: Abbrev Number: 0
 <1><ae>: Abbrev Number: 7 (DW_TAG_base_type)
    <af>   DW_AT_name        : (indirect string, offset: 0xd): int
    <b3>   DW_AT_encoding    : 5	(signed)
    <b4>   DW_AT_byte_size   : 4
 <1><b5>: Abbrev Number: 8 (DW_TAG_structure_type)
    <b6>   DW_AT_name        : (indirect string, offset: 0xb): s
    <ba>   DW_AT_byte_size   : 4
    <bb>   DW_AT_decl_file   : 1
    <bc>   DW_AT_decl_line   : 1
 <2><bd>: Abbrev Number: 9 (DW_TAG_member)
    <be>   DW_AT_name        : (indirect string, offset: 0xa2): x
    <c2>   DW_AT_type        : <0xae>
    <c6>   DW_AT_decl_file   : 1
    <c7>   DW_AT_decl_line   : 2
    <c8>   DW_AT_data_member_location: 0
 <2><c9>: Abbrev Number: 0
 <1><ca>: Abbrev Number: 0


* Re: pahole and LTO
  2020-01-10 16:44 pahole and LTO Stanislav Fomichev
@ 2020-01-10 16:47 ` Arnaldo Carvalho de Melo
  2020-01-10 17:22   ` Stanislav Fomichev
  0 siblings, 1 reply; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-01-10 16:47 UTC
  To: Stanislav Fomichev; +Cc: bpf, netdev, daniel, ast, andriin, morbo

On Fri, Jan 10, 2020 at 08:44:10AM -0800, Stanislav Fomichev wrote:
> tl;dr - building the kernel with clang and lto breaks BTF generation because
> pahole doesn't seem to understand cross-cu references.

tl;dr response:

Yeah, so it may be the time to fix that, elfutils has interfaces for it,
and the tools that come with it handle cross-cu references, so we need
to study that and make pahole understand it.

- Arnaldo
 

* Re: pahole and LTO
  2020-01-10 16:47 ` Arnaldo Carvalho de Melo
@ 2020-01-10 17:22   ` Stanislav Fomichev
  2020-01-10 17:29     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 4+ messages in thread
From: Stanislav Fomichev @ 2020-01-10 17:22 UTC
  To: Arnaldo Carvalho de Melo; +Cc: bpf, netdev, daniel, ast, andriin, morbo

On 01/10, Arnaldo Carvalho de Melo wrote:
> On Fri, Jan 10, 2020 at 08:44:10AM -0800, Stanislav Fomichev wrote:
> > tl;dr - building the kernel with clang and lto breaks BTF generation because
> > pahole doesn't seem to understand cross-cu references.
> 
> tl;dr response:
> 
> Yeah, so it may be the time to fix that, elfutils has interfaces for it,
> and the tools that come with it handle cross-cu references, so we need
> to study that and make pahole understand it.
Sure, we can definitely help with the implementation unless someone
is already actively working on it. Just wanted to make sure that's
a known problem.

From my (limited) reading of the pahole sources, it seems that building
an index on the first pass and doing a second pass to resolve
cross-CU references would be relatively easy to implement. Am I missing
anything? (I'm not a DWARF expert in any sense.)
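
A rough sketch of that two-pass idea, in C. This is not pahole code: struct
die, NO_REF and resolve_refs() are hypothetical names, and pahole's real tag
structures are more involved. The first pass builds one global index over the
DIEs of all CUs, keyed by section-global offset; the second pass looks every
reference up in that index instead of in a per-CU table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_REF UINT64_MAX

/* Hypothetical, flattened DIE record. */
struct die {
    uint64_t offset;     /* section-global offset in .debug_info */
    uint64_t ref_offset; /* target of e.g. DW_AT_abstract_origin, or NO_REF */
    struct die *ref;     /* filled in by the second pass */
};

/* Order two index entries (pointers to DIEs) by DIE offset. */
static int cmp_die_ptr(const void *a, const void *b)
{
    const struct die *x = *(const struct die *const *)a;
    const struct die *y = *(const struct die *const *)b;

    return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * Pass 1: build a sorted index of pointers to every DIE, across all CUs.
 * Pass 2: resolve each reference against that index with a binary search.
 * Returns the number of references left unresolved, or -1 if the index
 * could not be allocated.
 */
int resolve_refs(struct die *dies, size_t n)
{
    struct die **index;
    int unresolved = 0;

    assert(dies != NULL || n == 0);
    index = malloc((n ? n : 1) * sizeof(*index));
    if (!index)
        return -1;

    for (size_t i = 0; i < n; i++)
        index[i] = &dies[i];
    qsort(index, n, sizeof(*index), cmp_die_ptr);

    for (size_t i = 0; i < n; i++) {
        struct die key = { .offset = dies[i].ref_offset };
        const struct die *keyp = &key;
        struct die **hit;

        dies[i].ref = NULL;
        if (dies[i].ref_offset == NO_REF)
            continue;
        hit = bsearch(&keyp, index, n, sizeof(*index), cmp_die_ptr);
        if (hit)
            dies[i].ref = *hit;
        else
            unresolved++;
    }

    free(index);
    return unresolved;
}
```

Fed the DIEs at 0x3f (f2, in the a.c unit) and 0x99 (the inlined_subroutine
in the b.c unit, with ref_offset 0x3f), the second pass links 0x99 to 0x3f
regardless of which CU each one came from. A sorted array keeps the sketch
short; a hash table keyed by offset would serve the same purpose.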

And where do the patches for pahole go? I don't see any pahole patches
in bpf/netdev mailing lists.


* Re: pahole and LTO
  2020-01-10 17:22   ` Stanislav Fomichev
@ 2020-01-10 17:29     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-01-10 17:29 UTC
  To: Stanislav Fomichev, Arnaldo Carvalho de Melo
  Cc: bpf, netdev, daniel, ast, andriin, morbo

On January 10, 2020 2:22:31 PM GMT-03:00, Stanislav Fomichev <sdf@fomichev.me> wrote:
>On 01/10, Arnaldo Carvalho de Melo wrote:
>> On Fri, Jan 10, 2020 at 08:44:10AM -0800, Stanislav Fomichev wrote:
>> > tl;dr - building the kernel with clang and lto breaks BTF generation because
>> > pahole doesn't seem to understand cross-cu references.
>> 
>> tl;dr response:
>> 
>> Yeah, so it may be the time to fix that, elfutils has interfaces for it,
>> and the tools that come with it handle cross-cu references, so we need
>> to study that and make pahole understand it.
>Sure, we can definitely help with the implementation unless someone
>is already actively working on it. Just wanted to make sure that's
>a known problem.
>
>From my (limited) looking at pahole sources, it seems that building
>an index on the first pass and doing a second pass to resolve
>cross-cu references is relatively easy to implement. Am I missing
>anything? (not a dwarf expert in any sense).

Give it a try, please 

>And where do the patches for pahole go? I don't see any pahole patches
>in bpf/netdev mailing lists.

Send them to me, cc dwarves@vger.kernel.org

- Arnaldo
