Date: Tue, 15 Mar 2022 14:58:59 -0300
From: Arnaldo Carvalho de Melo
To: Stephen Brennan
Cc:
 Yonghong Song, Shung-Hsi Yu, bpf@vger.kernel.org, Omar Sandoval,
 Arnaldo Carvalho de Melo, Stephen Brennan
Subject: Re: Question: missing vmlinux BTF variable declarations
References: <586a6288-704a-f7a7-b256-e18a675927df@oracle.com> <8735jjw4rp.fsf@brennan.io>
In-Reply-To: <8735jjw4rp.fsf@brennan.io>

On Tue, Mar 15, 2022 at 09:37:46AM -0700, Stephen Brennan wrote:
> Yonghong Song writes:
> > On 3/14/22 12:09 AM, Shung-Hsi Yu wrote:
> >> On Wed, Mar 09, 2022 at 03:20:47PM -0800, Stephen Brennan wrote:
> >>> I've been recently learning about BTF with a keen interest in using it
> >>> as a fallback source of debug information. On the face of it, Linux
> >>> kernels these days have a lot of introspection information. BTF provides
> >>> information about types. kallsyms provides information about symbol
> >>> locations. ORC allows us to reliably unwind stack traces. So together,
> >>> these could enable a debugger (either postmortem, or live) to do a lot
> >>> without needing to read the (very large) DWARF debuginfo files. For
> >>> example, we could format backtraces with function names, we could
> >
> > For backtraces with function names, you probably still need ksyms since
> > BTF won't encode address => symbol translation.
>
> Yes, kallsyms is definitely required in this scheme. In practice, it
> seems very common for distributions to be compiled not just with
> CONFIG_KALLSYMS, but CONFIG_KALLSYMS_ALL.
>
> Kallsyms is critical for mapping names to addresses (and vice versa).
>
> >>> pretty-print global variables and data structures, etc. This is nice
> >
> > This indeed is a potential use case.
> > We discussed this during adding per-cpu
> > global variables.
> > Ultimately we just added per-cpu global variables
> > since we didn't have a use case or request for other global variables.
> > But I still would like to know beyond this whether you have other needs
> > which BPF may or may not help. It would be good to know since if
> > ultimately you still need dwarf, then it might be undesirable to
> > add general global variables to BTF.
>
> I think that kallsyms, BTF, and ORC together will be enough to provide a
> lite debugging experience. Some things will be missing:
>
> - mapping backtrace addresses to source code lines

BTF has provisions for that, and the line info is present for eBPF
programs. perf annotate uses it; see symbol__disassemble_bpf() in
tools/perf/util/annotate.c, which goes like:

	struct bpf_prog_linfo *prog_linfo = NULL;

	info_node = perf_env__find_bpf_prog_info(dso->bpf_prog.env,
						 dso->bpf_prog.id);
	if (!info_node) {
		ret = SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF;
		goto out;
	}
	info_linear = info_node->info_linear;
	sub_id = dso->bpf_prog.sub_id;

	info.buffer = (void *)(uintptr_t)(info_linear->info.jited_prog_insns);
	info.buffer_length = info_linear->info.jited_prog_len;

	/* Line info is optional: only build the lookup state if it exists */
	if (info_linear->info.nr_line_info)
		prog_linfo = bpf_prog_linfo__new(&info_linear->info);

		addr = pc + ((u64 *)(uintptr_t)(info_linear->info.jited_ksyms))[sub_id];
		count = disassemble(pc, &info);

		if (prog_linfo)
			linfo = bpf_prog_linfo__lfind_addr_func(prog_linfo,
								addr, sub_id,
								nr_skip);

		/* The source line text lives in the BTF string section */
		if (linfo && btf) {
			srcline = btf__name_by_offset(btf, linfo->line_off);
			nr_skip++;
		} else
			srcline = NULL;

etc.

Having this for the kernel proper is thus doable, but then the BTF info
keeps growing. Perhaps this could be optional: distros or appliances that
want a kernel with this extra info would add it, and tools would use it
if available?

> - intelligent stack frame information from DWARF CFI (e.g.
>   register/variable values)
> - probably other things, I'm not a DWARF expert.
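As a rough illustration of the address-to-line mapping discussed above,
here is a minimal sketch of the lookup itself: given records sorted by
address, pick the last one that starts at or before the pc. This is not
the libbpf implementation; struct line_rec and line_lookup() are
hypothetical names made up for this example. The same shape of search
would apply to resolving an address to a kallsyms symbol.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record; not the kernel's struct bpf_line_info. */
struct line_rec {
	uint64_t addr;		/* first address covered by this source line */
	const char *srcline;	/* source text for that line */
};

/*
 * Return the record with the largest addr <= pc, or NULL when pc is
 * below the first record. recs[] must be sorted by ascending addr.
 */
static const struct line_rec *line_lookup(const struct line_rec *recs,
					  size_t n, uint64_t pc)
{
	size_t lo = 0, hi = n;

	/* Binary search for the first record whose addr is > pc */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].addr <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* The record just before that one covers pc, if there is one */
	return lo ? &recs[lo - 1] : NULL;
}
```

A caller would build the table once from whatever line info is
available and then call line_lookup(table, nr, pc) per backtrace frame.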
> However, I do have two interesting branches of drgn which demonstrate
> the utility of just BTF+kallsyms:
>
> 1. https://github.com/osandov/drgn/pull/162
> 2. https://github.com/brenns10/drgn/tree/kallsyms_plus_btf
>
> #1 adds preliminary BTF support, and #2 adds basic kallsyms support,
> building on #1. Finally, I have some unpublished patches which add some
> symbols into vmcoreinfo, which help us locate kallsyms info. From there,
> drgn is able to take a core dump, and look up symbols and get their
> corresponding type info!
>
> The only real blocker I see here is that the BTF data is mainly limited
> to functions, so most of what you're doing is looking up function names
> and viewing their signatures :)
>
> >>> given that depending on your distro, it might be tough to get debuginfo,
> >>> and it is quite large to download or install.
> >>>
> >>> As I've worked toward this goal, I discovered that while the
> >>> BTF_KIND_VAR exists [1], the BTF included in the core kernel only has
> >>> declarations for percpu variables. This makes BTF much less useful for
> >>> this (admittedly odd) use case. Without a way to bind a name found in
> >>> kallsyms to its type, we can't interpret global variables. It looks like
> >>> the restriction for percpu-only variables is baked into the pahole BTF
> >>> encoder [2].
> >>>
> >>> [1]: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-var
> >>> [2]: https://github.com/acmel/dwarves/blob/master/btf_encoder.c
> >>>
> >>> I wonder what the BPF / BTF community's thoughts are on including more
> >>> of these global variable declarations? Perhaps behind a
> >>> CONFIG_DEBUG_INFO_BTF_ALL, like how kallsyms does it? I'm aware that
> >
> > Currently on my local machine, the vmlinux BTF's size is 4.2MB and
> > adding 1MB would be a big increase. CONFIG_DEBUG_INFO_BTF_ALL is a good
> > idea. But we might be able to just add global variables without this
> > new config if we have strong use case.
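For a rough sense of the record overhead being discussed: a
BTF_KIND_VAR entry is a struct btf_type (12 bytes) followed by a struct
btf_var (4 bytes), which is where the 16 bytes per declaration come
from. A small sketch of the arithmetic follows. The mirror_* structs
are local copies of that layout so the example builds without kernel
headers, btf_var_record_bytes() is a made-up helper, and the variable
count is just an assumed input.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Local mirrors of the BTF_KIND_VAR record layout; redefined here with
 * a mirror_ prefix so the sketch does not depend on <linux/btf.h>.
 */
struct mirror_btf_type {
	uint32_t name_off;	/* offset into the BTF string section */
	uint32_t info;		/* kind, kind_flag and vlen bits */
	uint32_t type;		/* type id (a union with size in the real struct) */
};

struct mirror_btf_var {
	uint32_t linkage;	/* static or global */
};

/*
 * Bytes of BTF records needed for nr_vars variable declarations.
 * Strings and any newly referenced type data are NOT included.
 */
static uint64_t btf_var_record_bytes(uint64_t nr_vars)
{
	return nr_vars * (sizeof(struct mirror_btf_type) +
			  sizeof(struct mirror_btf_var));
}
```

For, say, 70,000 variables this gives 1,120,000 bytes, slightly over
1MiB, before counting strings and type data.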
> And unfortunately 1MiB is really just a shot in the dark, guessing
> around 70k variables with no string data.

Maybe we can have a separate BTF file with all this extra info that
could be fetched from somewhere, keyed by build-id, as is now possible
with debuginfod and DWARF?

> I'd love to use kallsyms to avoid adding new strings into BTF. If the
> "all variables BTF" config added a dependency on "CONFIG_KALLSYMS_ALL",
> then we could use the BTF "kind_flag" to indicate that string values
> should be looked up in the kallsyms table, not the BTF strings section.
> This could even be used to reduce the string footprint for BTF
> function names.
>
> Of course it's a more complex change to dwarves :(
>
> >>> each declaration costs at least 16 bytes of BTF records, plus the
> >>> strings and any necessary type data. The string cost could be mitigated
> >>> by allowing "name_off" to refer to the kallsyms offset for variable or
> >>> function declaration. But the additional records could cost around 1MiB
> >>> for common distribution configurations.
> >>>
> >>> I know this isn't the designed use case for BTF, but I think it's very
> >>> exciting.
> >>
> >> I've been wondering about the same (possibility of using BTF for postmortem
> >> debugging without debuginfo), though not to the extent that you've
> >> researched.
> >>
> >> I find the idea exciting as well, and quite useful for distros where the
> >> kernel package changes quite often that the debuginfo package may be long
> >> gone by the time a crash dump for such kernel is captured.
> >
> > I would love to use BTF (including global variables in BTF) for crash
> > dump. But I suspect we may still have some gaps. Maybe you can
> > explore a little bit more on this?
>
> Hopefully my above explanation gives more context here. There is code
> (not production-ready) which can make use of these features together.
> The next step for me has been trying to get the dwarves/pahole BTF
> encoder to output *all* functions but I've hit some issues with it. If I
> can get that to work, then I can present a full demo of these pieces
> working together and we can be confident that there are no gaps.
>
> Maybe this is a topic worth discussing at LSF/MM/BPF conference? Though
> it's quite late for that...
>
> Thanks,
> Stephen
>
> >>
> >> Shung-Hsi
> >>
> >>> Thanks for your attention!
> >>> Stephen

--

- Arnaldo