From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7AA98C43460 for ; Fri, 2 Apr 2021 14:04:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4711D6112F for ; Fri, 2 Apr 2021 14:04:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235535AbhDBOEY (ORCPT ); Fri, 2 Apr 2021 10:04:24 -0400 Received: from mail.kernel.org ([198.145.29.99]:42264 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234161AbhDBOEW (ORCPT ); Fri, 2 Apr 2021 10:04:22 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id C891460C3D; Fri, 2 Apr 2021 14:04:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1617372261; bh=NW+Dz3Y9ztnjEMC1f9yZ5Jl6uR0eGqUzs2GjKs+wxyM=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ubynTIaAi/uGVnX0n7TqiomAgE0SrtBjO46SUOMg20HFvkrrsUayth1lkWsJ+FNzU ZV64LJhwjEeVb71IXRr8jM33llOSo1zQo4DsbUX5BfQp4Tx33Rs6PC8gfu6Q1I1wNx jQwo46XLd1DIhTgsanbqJEDWL/gw75BC/2+7JcGdjIEy8FqFfe8nVsDKjFDMBKxzXz 1AH9urFKG6lnFdSQfRDtxL7oaxL+nc0KqDRA/X4zq1kAbY0IWu63O520+HOPI/G/2R I6OTsdAuN5RVk9+pP2CtI38U/xosou2e43AVgF/AJU1rNfqhRq5KaK9lBy9K7NkA+f YuVFad971Q5tQ== Received: by quaco.ghostprotocols.net (Postfix, from userid 1000) id 8824440647; Fri, 2 Apr 2021 11:04:18 -0300 (-03) Date: Fri, 2 Apr 2021 11:04:18 -0300 From: Arnaldo Carvalho de Melo To: David Blaikie Cc: Yonghong Song , Arnaldo Carvalho de Melo , dwarves@vger.kernel.org, Alexei Starovoitov , Bill Wendling , bpf , kernel-team@fb.com, Nick Desaulniers Subject: Re: [PATCH dwarves] dwarf_loader: handle subprogram ret type with abstract_origin properly Message-ID: References: <20210401213620.3056084-1-yhs@fb.com> <1ef31dd8-2385-1da1-2c95-54429c895d8a@fb.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Url: http://acmel.wordpress.com Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Em Thu, Apr 01, 2021 at 05:00:46PM -0700, David Blaikie escreveu: > On Thu, Apr 1, 2021 at 4:41 PM Yonghong Song wrote: > > > > > > > > On 4/1/21 3:27 PM, David Blaikie wrote: > > > On Thu, Apr 1, 2021 at 2:41 PM Yonghong Song > > > wrote: > > > > > > > > > > > > > > > > On 4/1/21 2:36 PM, Yonghong Song wrote: > > > > > With latest bpf-next built with clang lto (thin or full), I hit one > > > test > > > > > failures: > > > > > $ ./test_progs -t tcp > > > > > ... > > > > > libbpf: extern (func ksym) 'tcp_slow_start': func_proto [23] > > > incompatible with kernel [115303] > > > > > libbpf: failed to load object 'bpf_cubic' > > > > > libbpf: failed to load BPF skeleton 'bpf_cubic': -22 > > > > > test_cubic:FAIL:bpf_cubic__open_and_load failed > > > > > #9/2 cubic:FAIL > > > > > ... > > > > > > > > > > The reason of the failure is due to bpf program 'tcp_slow_start' > > > > > func signature is different from vmlinux BTF. bpf program uses > > > > > the following signature: > > > > > extern __u32 tcp_slow_start(struct tcp_sock *tp, __u32 acked); > > > > > which is identical to the kernel definition in linux:include/net/tcp.h: > > > > > u32 tcp_slow_start(struct tcp_sock *tp, u32 acked); > > > > > While vmlinux BTF definition like: > > > > > [115303] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2 > > > > > 'tp' type_id=39373 > > > > > 'acked' type_id=18 > > > > > [115304] FUNC 'tcp_slow_start' type_id=115303 linkage=static > > > > > The above is dumped with `bpftool btf dump file vmlinux`. > > > > > You can see the ret_type_id is 0 and this caused the problem. > > > > > > > > > > Looking at dwarf, we have: > > > > > > > > > > 0x11f2ec67: DW_TAG_subprogram > > > > > DW_AT_low_pc (0xffffffff81ed2330) > > > > > DW_AT_high_pc (0xffffffff81ed235c) > > > > > DW_AT_frame_base () > > > > > DW_AT_GNU_all_call_sites (true) > > > > > DW_AT_abstract_origin (0x11f2ed66 "tcp_slow_start") > > > > > ... > > > > > 0x11f2ed66: DW_TAG_subprogram > > > > > DW_AT_name ("tcp_slow_start") > > > > > DW_AT_decl_file > > > ("/home/yhs/work/bpf-next/net/ipv4/tcp_cong.c") > > > > > DW_AT_decl_line (392) > > > > > DW_AT_prototyped (true) > > > > > DW_AT_type (0x11f130c2 "u32") > > > > > DW_AT_external (true) > > > > > DW_AT_inline (DW_INL_inlined) > > > > > > > > David, > > > > > > > > Could you help confirm whether DW_AT_abstract_origin at a > > > > DW_TAG_subprogram always points to another DW_TAG_subprogram, > > > > or there are possible other cases? > > > > > > That's correct, so far as I understand the spec, specifically DWARFv5 > > > > > > 3.3.8.3 says: > > > > > > "The root entry for a concrete out-of-line instance of a given inlined > > > subroutine has the same tag as does its associated (abstract) inlined > > > subroutine entry (that is, tag DW_TAG_subprogram rather than > > > DW_TAG_inlined_subroutine)." > > > > Thanks. That means that some of my codes in the patch is > > dead code. > > > > > > > > Though people may come up with novel uses of DWARF features. What would > > > happen if this constraint were violated/what's your motivation for > > > asking (I don't quite understand the connection between test_progs > > > failure description, and this question) > > > > I have some codes to check the tag associated with abstract_origin > > for a subprogram must be a subprogram. Through experiment, I didn't > > see a violation, so I wonder that I can get confirmation from you > > and then I may delete that code. > > > > The test_progs failure exposed the bug, that is all. > > > > pahole cannot handle all weird usages of dwarf, so I think pahole > > is fine only to support well-formed dwarf. > > Sounds good. Thanks for the context! David, since you took the time to go thru the changes and to agree that Yonghong's fix is good, can I add a: Acked-by: David Blaikie to this patch? Maybe even a: Reviewed-by: David Blaikie Thanks, - Arnaldo > > > > > > > > - David > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > We have a subprogram which has an abstract_origin pointing to > > > > > the subprogram prototype with return type. Current one pass > > > > > recoding cannot easily resolve this easily since > > > > > at the time recoding for 0x11f2ec67, the return type in > > > > > 0x11f2ed66 has not been resolved. > > > > > > > > > > To simplify implementation, I just added another pass to > > > > > go through all functions after recoding pass. This should > > > > > resolve the above issue. > > > > > > > > > > With this patch, among total 250999 functions in vmlinux, > > > > > 4821 functions needs return type adjustment from type id 0 > > > > > to correct values. The above failed bpf selftest passed too. > > > > > > > > > > Signed-off-by: Yonghong Song > > > > > > --- > > > > > dwarf_loader.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > > 1 file changed, 46 insertions(+) > > > > > > > > > > Arnaldo, this is the last known pahole bug in my hand w.r.t. clang > > > > > LTO. With this, all self tests are passed except ones due > > > > > to global function inlining, static variable promotion etc, which > > > > > are not related to pahole. > > > > > > > > > > diff --git a/dwarf_loader.c b/dwarf_loader.c > > > > > index 026d137..367ac06 100644 > > > > > --- a/dwarf_loader.c > > > > > +++ b/dwarf_loader.c > > > > > @@ -2198,6 +2198,42 @@ out: > > > > > return 0; > > > > > } > > > > > > > > > > +static int cu__resolve_func_ret_types(struct cu *cu) > > > > > +{ > > > > > + struct ptr_table *pt = &cu->functions_table; > > > > > + uint32_t i; > > > > > + > > > > > + for (i = 0; i < pt->nr_entries; ++i) { > > > > > + struct tag *tag = pt->entries[i]; > > > > > + > > > > > + if (tag == NULL || tag->type != 0) > > > > > + continue; > > > > > + > > > > > + struct function *fn = tag__function(tag); > > > > > + if (!fn->abstract_origin) > > > > > + continue; > > > > > + > > > > > + struct dwarf_tag *dtag = tag->priv; > > > > > + struct dwarf_tag *dfunc; > > > > > + dfunc = dwarf_cu__find_tag_by_ref(cu->priv, > > > &dtag->abstract_origin); > > > > > + if (dfunc == NULL) { > > > > > + tag__print_abstract_origin_not_found(tag); > > > > > + return -1; > > > > > + } > > > > > + > > > > > + /* > > > > > + * Based on what I see it should be a subprogram, > > > > > + * but double check anyway to ensure I won't mess up > > > > > + * now and in the future. > > > > > + */ > > > > > + if (dfunc->tag->tag != DW_TAG_subprogram) > > > > > + continue; > > > > > + > > > > > + tag->type = dfunc->tag->type; > > > > > + } > > > > > + return 0; > > > > > +} > > > > > + > > > > > static int cu__recode_dwarf_types_table(struct cu *cu, > > > > > struct ptr_table *pt, > > > > > uint32_t i) > > > > > @@ -2637,6 +2673,16 @@ static int cus__merge_and_process_cu(struct > > > cus *cus, struct conf_load *conf, > > > > > /* process merged cu */ > > > > > if (cu__recode_dwarf_types(cu) != LSK__KEEPIT) > > > > > return DWARF_CB_ABORT; > > > > > + > > > > > + /* > > > > > + * for lto build, the function return type may not be > > > > > + * resolved due to the return type of a subprogram is > > > > > + * encoded in another subprogram through abstract_origin > > > > > + * tag. Let us visit all subprograms again to resolve this. > > > > > + */ > > > > > + if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT) > > > > > + return DWARF_CB_ABORT; > > > > > + > > > > > if (finalize_cu_immediately(cus, cu, dcu, conf) > > > > > == LSK__STOP_LOADING) > > > > > return DWARF_CB_ABORT; > > > > > -- - Arnaldo