All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: acme@kernel.org
Cc: ast@kernel.org, andrii@kernel.org, daniel@iogearbox.net,
	eddyz87@gmail.com, haoluo@google.com, jolsa@kernel.org,
	john.fastabend@gmail.com, kpsingh@chromium.org,
	sinquersw@gmail.com, martin.lau@kernel.org,
	songliubraving@fb.com, sdf@google.com, timo@incline.eu,
	yhs@fb.com, bpf@vger.kernel.org,
	Alan Maguire <alan.maguire@oracle.com>
Subject: [PATCH v3 dwarves 1/8] dwarf_loader: Help spotting functions with optimized-out parameters
Date: Tue,  7 Feb 2023 17:14:55 +0000	[thread overview]
Message-ID: <1675790102-23037-2-git-send-email-alan.maguire@oracle.com> (raw)
In-Reply-To: <1675790102-23037-1-git-send-email-alan.maguire@oracle.com>

Compilation generates DWARF at several stages, and often the later DWARF
representations more accurately represent optimizations that have
occurred during compilation.

In particular, parameter representations can be spotted by their
abstract origin references to the original parameter, but they often
have more accurate location information.  In most cases, the parameter
locations will match calling conventions, and be registers for the first
6 parameters on x86_64, first 8 on ARM64 etc.  If the parameter is not a
register when it should be however, it is likely passed via the stack or
the compiler has used a constant representation instead.  The latter can
often be spotted by checking for a DW_AT_const_value attribute, as noted
by Eduard.

In addition, absence of a location tag (either across the abstract
origin reference and the original parameter, or in the standalone
parameter description) is evidence of an optimized-out parameter.
Presence of a location tag is stored in the parameter description and
shared between abstract tags and their original referents.

This change adds a field to parameters and their associated ftype to
note if a parameter has been optimized out.  Having this information
allows us to skip such functions, as their presence in CUs makes BTF
encoding impossible.

Committer notes:

Changed the NR_REGISTER_PARAMS definition from a if/elif/endif for the
native architecture into a function that uses the ELF header e_machine
to find the target architecture, to allow for cross builds.

Also avoided looking at location expression in parameter__new() when the
param_idx argument is -1, as is the case when creating 'struct
parameter' instances for DW_TAG_subroutine_type, since we don't have
such info, only for the 'struct parameter' instances created from
DW_TAG_subprogram.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/1675088985-20300-2-git-send-email-alan.maguire@oracle.com
Link: https://lore.kernel.org/r/9c330c78-e668-fa4c-e0ab-52aa445ccc00@oracle.com # DW_OP_reg0 is the first register on aarch64
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 dwarf_loader.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++---
 dwarves.h      |   6 ++-
 2 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 5a74035..7aaf1d4 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -52,6 +52,10 @@
 #define DW_OP_addrx 0xa1
 #endif
 
+#ifndef EM_RISCV
+#define EM_RISCV	243
+#endif
+
 static pthread_mutex_t libdw__lock = PTHREAD_MUTEX_INITIALIZER;
 
 static uint32_t hashtags__bits = 12;
@@ -992,13 +996,98 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
 	return member;
 }
 
-static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
+/* How many function parameters are passed via registers?  Used below in
+ * determining if an argument has been optimized out or if it is simply
+ * an argument > cu__nr_register_params().  Making cu__nr_register_params()
+ * return 0 allows unsupported architectures to skip tagging optimized-out
+ * values.
+ */
+static int arch__nr_register_params(const GElf_Ehdr *ehdr)
+{
+	switch (ehdr->e_machine) {
+	case EM_S390:	 return 5;
+	case EM_SPARC:
+	case EM_SPARCV9:
+	case EM_X86_64:	 return 6;
+	case EM_AARCH64:
+	case EM_ARC:
+	case EM_ARM:
+	case EM_MIPS:
+	case EM_PPC:
+	case EM_PPC64:
+	case EM_RISCV:	 return 8;
+	default:	 break;
+	}
+
+	return 0;
+}
+
+static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
+					struct conf_load *conf, int param_idx)
 {
 	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
 
 	if (parm != NULL) {
+		bool has_const_value;
+		Dwarf_Attribute attr;
+		struct location loc;
+
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
+
+		if (param_idx >= cu->nr_register_params || param_idx < 0)
+			return parm;
+		/* Parameters which use DW_AT_abstract_origin to point at
+		 * the original parameter definition (with no name in the DIE)
+		 * are the result of later DWARF generation during compilation
+		 * so often better take into account if arguments were
+		 * optimized out.
+		 *
+		 * By checking that locations for parameters that are expected
+		 * to be passed as registers are actually passed as registers,
+		 * we can spot optimized-out parameters.
+		 *
+		 * It can also be the case that a parameter DIE has
+		 * a constant value attribute reflecting optimization or
+		 * has no location attribute.
+		 *
+		 * From the DWARF spec:
+		 *
+		 * "4.1.10
+		 *
+		 * A DW_AT_const_value attribute for an entry describing a
+		 * variable or formal parameter whose value is constant and not
+		 * represented by an object in the address space of the program,
+		 * or an entry describing a named constant. (Note
+		 * that such an entry does not have a location attribute.)"
+		 *
+		 * So we can also use the absence of a location for a parameter
+		 * as evidence it has been optimized out.  This info will
+		 * need to be shared between a parameter and any abstract
+		 * origin references however, since gcc can have location
+		 * information in the parameter that refers back to the original
+		 * via abstract origin, so we need to share location presence
+		 * between these parameter representations.  See
+		 * ftype__recode_dwarf_types() below for how this is handled.
+		 */
+		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
+		has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
+		if (parm->has_loc &&
+		    attr_location(die, &loc.expr, &loc.exprlen) == 0 &&
+			loc.exprlen != 0) {
+			Dwarf_Op *expr = loc.expr;
+
+			switch (expr->atom) {
+			case DW_OP_reg0 ... DW_OP_reg31:
+			case DW_OP_breg0 ... DW_OP_breg31:
+				break;
+			default:
+				parm->optimized = 1;
+				break;
+			}
+		} else if (has_const_value) {
+			parm->optimized = 1;
+		}
 	}
 
 	return parm;
@@ -1450,7 +1539,7 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
 					     struct cu *cu, struct conf_load *conf,
 					     int param_idx)
 {
-	struct parameter *parm = parameter__new(die, cu, conf);
+	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
 
 	if (parm == NULL)
 		return NULL;
@@ -2194,6 +2283,7 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 
 	ftype__for_each_parameter(type, pos) {
 		struct dwarf_tag *dpos = pos->tag.priv;
+		struct parameter *opos;
 		struct dwarf_tag *dtype;
 
 		if (dpos->type.off == 0) {
@@ -2207,8 +2297,18 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 				tag__print_abstract_origin_not_found(&pos->tag);
 				continue;
 			}
-			pos->name = tag__parameter(dtype->tag)->name;
+			opos = tag__parameter(dtype->tag);
+			pos->name = opos->name;
 			pos->tag.type = dtype->tag->type;
+			/* share location information between parameter and
+			 * abstract origin; if neither have location, we will
+			 * mark the parameter as optimized out.
+			 */
+			if (pos->has_loc)
+				opos->has_loc = pos->has_loc;
+
+			if (pos->optimized)
+				opos->optimized = pos->optimized;
 			continue;
 		}
 
@@ -2478,18 +2578,33 @@ out:
 	return 0;
 }
 
-static int cu__resolve_func_ret_types(struct cu *cu)
+static int cu__resolve_func_ret_types_optimized(struct cu *cu)
 {
 	struct ptr_table *pt = &cu->functions_table;
 	uint32_t i;
 
 	for (i = 0; i < pt->nr_entries; ++i) {
 		struct tag *tag = pt->entries[i];
+		struct parameter *pos;
+		struct function *fn = tag__function(tag);
+
+		/* mark function as optimized if parameter is, or
+		 * if parameter does not have a location; at this
+		 * point location presence has been marked in
+		 * abstract origins for cases where a parameter
+		 * location is not stored in the original function
+		 * parameter tag.
+		 */
+		ftype__for_each_parameter(&fn->proto, pos) {
+			if (pos->optimized || !pos->has_loc) {
+				fn->proto.optimized_parms = 1;
+				break;
+			}
+		}
 
 		if (tag == NULL || tag->type != 0)
 			continue;
 
-		struct function *fn = tag__function(tag);
 		if (!fn->abstract_origin)
 			continue;
 
@@ -2612,7 +2727,7 @@ static int die__process_and_recode(Dwarf_Die *die, struct cu *cu, struct conf_lo
 	if (ret != 0)
 		return ret;
 
-	return cu__resolve_func_ret_types(cu);
+	return cu__resolve_func_ret_types_optimized(cu);
 }
 
 static int class_member__cache_byte_size(struct tag *tag, struct cu *cu,
@@ -2753,6 +2868,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
 		return DWARF_CB_ABORT;
 
 	cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
+	cu->nr_register_params = arch__nr_register_params(&ehdr);
 	return 0;
 }
 
@@ -3132,7 +3248,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
 	 * encoded in another subprogram through abstract_origin
 	 * tag. Let us visit all subprograms again to resolve this.
 	 */
-	if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT)
+	if (cu__resolve_func_ret_types_optimized(cu) != LSK__KEEPIT)
 		goto out_abort;
 
 	if (cus__finalize(cus, cu, conf, NULL) == LSK__STOP_LOADING)
diff --git a/dwarves.h b/dwarves.h
index 589588e..1cd95f7 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -262,6 +262,7 @@ struct cu {
 	uint8_t		 has_addr_info:1;
 	uint8_t		 uses_global_strings:1;
 	uint8_t		 little_endian:1;
+	uint8_t		 nr_register_params;
 	uint16_t	 language;
 	unsigned long	 nr_inline_expansions;
 	size_t		 size_inline_expansions;
@@ -808,6 +809,8 @@ size_t lexblock__fprintf(const struct lexblock *lexblock, const struct cu *cu,
 struct parameter {
 	struct tag tag;
 	const char *name;
+	uint8_t optimized:1;
+	uint8_t has_loc:1;
 };
 
 static inline struct parameter *tag__parameter(const struct tag *tag)
@@ -827,7 +830,8 @@ struct ftype {
 	struct tag	 tag;
 	struct list_head parms;
 	uint16_t	 nr_parms;
-	uint8_t		 unspec_parms; /* just one bit is needed */
+	uint8_t		 unspec_parms:1; /* just one bit is needed */
+	uint8_t		 optimized_parms:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
-- 
2.31.1


  reply	other threads:[~2023-02-07 17:16 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
2023-02-07 17:14 ` Alan Maguire [this message]
2023-02-07 17:14 ` [PATCH v3 dwarves 2/8] btf_encoder: store type_id_off, unspecified type in encoder Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 3/8] btf_encoder: Refactor function addition into dedicated btf_encoder__add_func Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 4/8] btf_encoder: Rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
2023-02-08 13:19   ` Jiri Olsa
2023-02-08 14:43     ` Arnaldo Carvalho de Melo
2023-02-08 20:51       ` Jiri Olsa
2023-02-08 22:57         ` Alan Maguire
2023-02-07 17:15 ` [PATCH v3 dwarves 6/8] btf_encoder: support delaying function addition to check for function prototype inconsistencies Alan Maguire
2023-02-07 17:15 ` [PATCH v3 dwarves 7/8] dwarves: document --btf_gen_optimized option Alan Maguire
2023-02-07 17:15 ` [PATCH v3 dwarves 8/8] dwarves: document --skip_encoding_btf_inconsistent_proto option Alan Maguire
2023-02-08 13:20 ` [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Jiri Olsa
2023-02-08 15:25   ` Alan Maguire
2023-02-08 16:20 ` Arnaldo Carvalho de Melo
2023-02-08 16:50   ` Arnaldo Carvalho de Melo
2023-02-09  9:36     ` Jiri Olsa
2023-02-09 12:22       ` Arnaldo Carvalho de Melo
     [not found]     ` <3c021d56-8818-2464-f7e0-889e769c0311@oracle.com>
2023-02-09 13:09       ` [PATCH bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25 Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1675790102-23037-2-git-send-email-alan.maguire@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=acme@kernel.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@chromium.org \
    --cc=martin.lau@kernel.org \
    --cc=sdf@google.com \
    --cc=sinquersw@gmail.com \
    --cc=songliubraving@fb.com \
    --cc=timo@incline.eu \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.