bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: Jiri Olsa <olsajiri@gmail.com>
Cc: acme@kernel.org, ast@kernel.org, yhs@fb.com, andrii@kernel.org,
	daniel@iogearbox.net, laoar.shao@gmail.com, martin.lau@linux.dev,
	song@kernel.org, john.fastabend@gmail.com, kpsingh@kernel.org,
	sdf@google.com, haoluo@google.com, bpf@vger.kernel.org
Subject: Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
Date: Thu, 18 May 2023 14:23:34 +0100	[thread overview]
Message-ID: <35213852-1d29-e21f-e3f8-d3f164e97294@oracle.com> (raw)
In-Reply-To: <ZGXkN2TeEJZHMSG8@krava>

On 18/05/2023 09:39, Jiri Olsa wrote:
> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>> By making sorting function for our ELF function list match on
>> both name and function, we ensure that the set of ELF functions
>> includes multiple copies for functions which have multiple instances
>> across CUs.  For example, cpumask_weight has 22 instances in
>> System.map/kallsyms:
>>
>> ffffffff8103b530 t cpumask_weight
>> ffffffff8103e300 t cpumask_weight
>> ffffffff81040d30 t cpumask_weight
>> ffffffff8104fa00 t cpumask_weight
>> ffffffff81064300 t cpumask_weight
>> ffffffff81082ba0 t cpumask_weight
>> ffffffff81084f50 t cpumask_weight
>> ffffffff810a4ad0 t cpumask_weight
>> ffffffff810bb740 t cpumask_weight
>> ffffffff8110a6c0 t cpumask_weight
>> ffffffff81118ab0 t cpumask_weight
>> ffffffff81129b50 t cpumask_weight
>> ffffffff81137dc0 t cpumask_weight
>> ffffffff811aead0 t cpumask_weight
>> ffffffff811d6800 t cpumask_weight
>> ffffffff811e1370 t cpumask_weight
>> ffffffff812fae80 t cpumask_weight
>> ffffffff81375c50 t cpumask_weight
>> ffffffff81634b60 t cpumask_weight
>> ffffffff817ba540 t cpumask_weight
>> ffffffff819abf30 t cpumask_weight
>> ffffffff81a7cb60 t cpumask_weight
>>
>> With ELF representations for each address, and DWARF info about
>> addresses (low_pc) we can match DWARF with ELF accurately.
>> The result for the BTF representation is that we end up with
>> a single de-duped function:
>>
>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>
>> ...and 22 DECL_TAGs for each address that point at it:
>>
>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1
> 
> right, Yonghong pointed this out in:
>   https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
> 
> it's problem, because we pass btf id as attach id during bpf program load,
> and kernel does not have a way to figure out which address from the associated
> DECL_TAGs to use
> 
> if we could change dedup algo to take the function address into account and
> make it not de-duplicate equal functions with different addresses, then we
> could:
> 
>   - find btf id that properly and uniquely identifies the function we
>     want to trace
> 

So maybe a more natural approach would be to extend BTF_KIND_FUNC
(I think Alexei suggested something this earlier but I could be
misremembering) as follows:


2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
-  * ``info.kind_flag``: 0
+  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` follows
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
                   or BTF_FUNC_EXTERN)
  * ``type``: a BTF_KIND_FUNC_PROTO type

- No additional type data follow ``btf_type``.
+ If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
+
+	struct btf_func {
+		__u64 addr;
+	};
+ Otherwise no additional type data follows ``btf_type``.


With the above, dedup could be made to fail when functions have non-
identical associated addresses. Judging by the number of DECL_TAGs in
the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
space for struct btf_funcs would require roughly 400k. We'd still get
dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB.



>   - store the vmlinux base address and treat stored function addresses as
>     offsets, so the verifier can get proper address even if the kernel
>     is relocated
>

yep; when we read kernel/module BTF in we could hopefully carry out
this recalculation and update the vmlinux/module BTF addresses
accordingly.

Thanks!

Alan

  reply	other threads:[~2023-05-18 13:24 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 1/6] btf_encoder: record function address and if it is local Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 2/6] dwarf_loader: store address in function low_pc if available Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 3/6] dwarf_loader: transfer low_pc info from subtroutine to its abstract origin Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 4/6] btf_encoder: add "addr=0x<addr>" function declaration tag if --btf_gen_func_addr specified Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address Alan Maguire
2023-05-18  8:39   ` Jiri Olsa
2023-05-18 13:23     ` Alan Maguire [this message]
2023-05-18 15:20       ` Yonghong Song
2023-05-18 16:22       ` Jiri Olsa
2023-05-18 18:25         ` Yonghong Song
2023-05-19  0:23           ` Yonghong Song
2023-05-19  0:26           ` Alexei Starovoitov
2023-05-22 21:31             ` Andrii Nakryiko
2023-05-25  8:52               ` Jiri Olsa
2023-05-25 10:14                 ` Alan Maguire
2023-05-25 17:29                   ` Andrii Nakryiko
2023-05-17 16:16 ` [RFC dwarves 6/6] pahole: document --btf_gen_func_addr Alan Maguire
2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
2023-05-19 14:46   ` Yonghong Song
2023-05-19 15:08   ` Alan Maguire

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35213852-1d29-e21f-e3f8-d3f164e97294@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=acme@kernel.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=kpsingh@kernel.org \
    --cc=laoar.shao@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=olsajiri@gmail.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).