bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@meta.com>
To: Jiri Olsa <olsajiri@gmail.com>, Alan Maguire <alan.maguire@oracle.com>
Cc: acme@kernel.org, ast@kernel.org, yhs@fb.com, andrii@kernel.org,
	daniel@iogearbox.net, laoar.shao@gmail.com, martin.lau@linux.dev,
	song@kernel.org, john.fastabend@gmail.com, kpsingh@kernel.org,
	sdf@google.com, haoluo@google.com, bpf@vger.kernel.org
Subject: Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
Date: Thu, 18 May 2023 17:23:58 -0700	[thread overview]
Message-ID: <32d01399-22a4-8ddc-379d-4ca2f41506f6@meta.com> (raw)
In-Reply-To: <ee0a24c9-1106-c847-2c91-0d828ec7fba3@meta.com>



On 5/18/23 11:25 AM, Yonghong Song wrote:
> 
> 
> On 5/18/23 9:22 AM, Jiri Olsa wrote:
>> On Thu, May 18, 2023 at 02:23:34PM +0100, Alan Maguire wrote:
>>> On 18/05/2023 09:39, Jiri Olsa wrote:
>>>> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>>>>> By making sorting function for our ELF function list match on
>>>>> both name and function, we ensure that the set of ELF functions
>>>>> includes multiple copies for functions which have multiple instances
>>>>> across CUs.  For example, cpumask_weight has 22 instances in
>>>>> System.map/kallsyms:
>>>>>
>>>>> ffffffff8103b530 t cpumask_weight
>>>>> ffffffff8103e300 t cpumask_weight
>>>>> ffffffff81040d30 t cpumask_weight
>>>>> ffffffff8104fa00 t cpumask_weight
>>>>> ffffffff81064300 t cpumask_weight
>>>>> ffffffff81082ba0 t cpumask_weight
>>>>> ffffffff81084f50 t cpumask_weight
>>>>> ffffffff810a4ad0 t cpumask_weight
>>>>> ffffffff810bb740 t cpumask_weight
>>>>> ffffffff8110a6c0 t cpumask_weight
>>>>> ffffffff81118ab0 t cpumask_weight
>>>>> ffffffff81129b50 t cpumask_weight
>>>>> ffffffff81137dc0 t cpumask_weight
>>>>> ffffffff811aead0 t cpumask_weight
>>>>> ffffffff811d6800 t cpumask_weight
>>>>> ffffffff811e1370 t cpumask_weight
>>>>> ffffffff812fae80 t cpumask_weight
>>>>> ffffffff81375c50 t cpumask_weight
>>>>> ffffffff81634b60 t cpumask_weight
>>>>> ffffffff817ba540 t cpumask_weight
>>>>> ffffffff819abf30 t cpumask_weight
>>>>> ffffffff81a7cb60 t cpumask_weight
>>>>>
>>>>> With ELF representations for each address, and DWARF info about
>>>>> addresses (low_pc) we can match DWARF with ELF accurately.
>>>>> The result for the BTF representation is that we end up with
>>>>> a single de-duped function:
>>>>>
>>>>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>>>>
>>>>> ...and 22 DECL_TAGs for each address that point at it:
>>>>>
>>>>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 
>>>>> component_idx=-1
>>>>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 
>>>>> component_idx=-1
>>>>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 
>>>>> component_idx=-1
>>>>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 
>>>>> component_idx=-1
>>>>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 
>>>>> component_idx=-1
>>>>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 
>>>>> component_idx=-1
>>>>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 
>>>>> component_idx=-1
>>>>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 
>>>>> component_idx=-1
>>>>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 
>>>>> component_idx=-1
>>>>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 
>>>>> component_idx=-1
>>>>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 
>>>>> component_idx=-1
>>>>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 
>>>>> component_idx=-1
>>>>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 
>>>>> component_idx=-1
>>>>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 
>>>>> component_idx=-1
>>>>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 
>>>>> component_idx=-1
>>>>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 
>>>>> component_idx=-1
>>>>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 
>>>>> component_idx=-1
>>>>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 
>>>>> component_idx=-1
>>>>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 
>>>>> component_idx=-1
>>>>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 
>>>>> component_idx=-1
>>>>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 
>>>>> component_idx=-1
>>>>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 
>>>>> component_idx=-1
>>>>
>>>> right, Yonghong pointed this out in:
>>>>    
>>>> https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
>>>>
>>>> it's problem, because we pass btf id as attach id during bpf program 
>>>> load,
>>>> and kernel does not have a way to figure out which address from the 
>>>> associated
>>>> DECL_TAGs to use
>>>>
>>>> if we could change dedup algo to take the function address into 
>>>> account and
>>>> make it not de-duplicate equal functions with different addresses, 
>>>> then we
>>>> could:
>>>>
>>>>    - find btf id that properly and uniquely identifies the function we
>>>>      want to trace
>>>>
>>>
>>> So maybe a more natural approach would be to extend BTF_KIND_FUNC
>>> (I think Alexei suggested something this earlier but I could be
>>> misremembering) as follows:
>>>
>>>
>>> 2.2.12 BTF_KIND_FUNC
>>> ~~~~~~~~~~~~~~~~~~~~
>>>
>>> ``struct btf_type`` encoding requirement:
>>>    * ``name_off``: offset to a valid C identifier
>>> -  * ``info.kind_flag``: 0
>>> +  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` 
>>> follows
>>>    * ``info.kind``: BTF_KIND_FUNC
>>>    * ``info.vlen``: linkage information (BTF_FUNC_STATIC, 
>>> BTF_FUNC_GLOBAL
>>>                     or BTF_FUNC_EXTERN)
>>>    * ``type``: a BTF_KIND_FUNC_PROTO type
>>>
>>> - No additional type data follow ``btf_type``.
>>> + If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
>>> +
>>> +    struct btf_func {
>>> +        __u64 addr;
>>> +    };
>>> + Otherwise no additional type data follows ``btf_type``.
>>>
>>>
>>> With the above, dedup could be made to fail when functions have non-
>>> identical associated addresses. Judging by the number of DECL_TAGs in
>>> the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
>>> space for struct btf_funcs would require roughly 400k. We'd still get
>>> dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB
>> nice, I think it's better solution
>>
>>>
>>>
>>>
>>>>    - store the vmlinux base address and treat stored function 
>>>> addresses as
>>>>      offsets, so the verifier can get proper address even if the kernel
>>>>      is relocated
>>>>
>>>
>>> yep; when we read kernel/module BTF in we could hopefully carry out
>>> this recalculation and update the vmlinux/module BTF addresses
>>> accordingly.
>>
>> I wonder now when the address will be stored as number (not string) we
>> could somehow generate relocation records and have the module loader
>> do the relocation automatically

This is an interesting idea if bpf subsystem does not want to do it...

>>
>> not sure how that works for vmlinux when it's loaded/relocated on boot

If no KASLR, address in the vmlinux dwarf should match the one in BTF
based on my experience. With KASLR, yes, relocation needs to be
done for those addresses in BTF as they won't match the actual func
address in the dwarf.

Do most distributions enable KASLR (CONFIG_RANDOMIZE_BASE)?

> 
> Right, actual module address will mostly not match the one in dwarf.
> Some during module btf load, we should modify btf address as well
> for later use? Yes, may need to reuse some routines used in initial
> module relocation.
> 
>>
>> jirka

  reply	other threads:[~2023-05-19  0:25 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 1/6] btf_encoder: record function address and if it is local Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 2/6] dwarf_loader: store address in function low_pc if available Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 3/6] dwarf_loader: transfer low_pc info from subtroutine to its abstract origin Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 4/6] btf_encoder: add "addr=0x<addr>" function declaration tag if --btf_gen_func_addr specified Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address Alan Maguire
2023-05-18  8:39   ` Jiri Olsa
2023-05-18 13:23     ` Alan Maguire
2023-05-18 15:20       ` Yonghong Song
2023-05-18 16:22       ` Jiri Olsa
2023-05-18 18:25         ` Yonghong Song
2023-05-19  0:23           ` Yonghong Song [this message]
2023-05-19  0:26           ` Alexei Starovoitov
2023-05-22 21:31             ` Andrii Nakryiko
2023-05-25  8:52               ` Jiri Olsa
2023-05-25 10:14                 ` Alan Maguire
2023-05-25 17:29                   ` Andrii Nakryiko
2023-05-17 16:16 ` [RFC dwarves 6/6] pahole: document --btf_gen_func_addr Alan Maguire
2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
2023-05-19 14:46   ` Yonghong Song
2023-05-19 15:08   ` Alan Maguire

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=32d01399-22a4-8ddc-379d-4ca2f41506f6@meta.com \
    --to=yhs@meta.com \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=kpsingh@kernel.org \
    --cc=laoar.shao@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=olsajiri@gmail.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).