* [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
@ 2023-01-30 14:29 Alan Maguire
  2023-01-30 14:29 ` [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters Alan Maguire
                   ` (4 more replies)
  0 siblings, 5 replies; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 14:29 UTC (permalink / raw)
  To: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo
  Cc: daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf, Alan Maguire

At optimization level -O2 or higher in gcc, static functions may be
optimized such that they have suffixes like .isra.0, .constprop.0 etc.
These represent:

- constant propagation (.constprop.0);
- interprocedural scalar replacement of aggregates, removal of
  unused parameters and replacement of parameters passed by
  reference by parameters passed by value (.isra.0).

See [1] for details.

Currently BTF encoding does not handle such optimized functions
that get renamed with a "." suffix such as ".isra.0", ".constprop.0".
This is safer because such suffixes can often indicate parameters have
been optimized out.  This series addresses this by matching a
function to a suffixed version ("foo" matching "foo.isra.0") while
ensuring that the function signature does not contain optimized-out
parameters.  Note that if the function is found ("foo") it will
be preferred, only falling back to "foo.isra.0" if lookup of the
function fails.  Addition to BTF is skipped if the function has
optimized-out parameters, since the expected function signature
will not match. BTF encoding does not include the "."-suffix to
be consistent with DWARF. In addition, the kernel currently does
not allow a "." suffix in a BTF function name.
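
As a rough illustration of the name matching described above, here is
a minimal sketch.  It is not the code from this series:
func_name_matches() and find_func_symbol() are hypothetical helpers,
and the symbol table is assumed to be a plain array of strings rather
than the ELF symbol data the encoder actually works with.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from pahole): true when symbol name 'sym' is
 * exactly 'func', or 'func' followed by a "." suffix such as ".isra.0".
 */
static bool func_name_matches(const char *func, const char *sym)
{
	size_t len = strlen(func);

	if (strncmp(func, sym, len) != 0)
		return false;
	return sym[len] == '\0' || sym[len] == '.';
}

/* Hypothetical lookup: prefer an exact match ("foo"); only fall back to
 * the first "."-suffixed match ("foo.isra.0") if no exact match exists.
 * Returns the index of the chosen symbol, or -1 if nothing matches.
 */
static int find_func_symbol(const char *func, const char *const *syms,
			    size_t nr_syms)
{
	int fallback = -1;
	size_t i;

	for (i = 0; i < nr_syms; i++) {
		if (strcmp(func, syms[i]) == 0)
			return (int)i;
		if (fallback < 0 && func_name_matches(func, syms[i]))
			fallback = (int)i;
	}
	return fallback;
}
```

Note that the check on the character following the prefix keeps
"foobar" from matching "foo"; only an exact name or a "." boundary
qualifies.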

A problem with this approach, however, is that BTF encoding is
carried out in parallel across multiple CUs, and sometimes
a function has optimized-out parameters in one CU but not others;
we see this for NF_HOOK.constprop.0 for example.  So in order to
determine if the function has optimized-out parameters in any
CU, its addition is not carried out until we have processed all
CUs and are about to merge BTF.  At this point we know if any
such optimizations have occurred.  Patches 1-4 handle the
optimized-out parameter identification and matching "."-suffixed
functions with the original function to facilitate BTF
encoding.
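
The deferred decision can be pictured with the following sketch.  It
is an assumption-laden simplification: struct func_obs and
func_safe_to_add() are hypothetical names, and a flat array stands in
for the per-encoder trees that the series actually merges.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what one CU observed for one function. */
struct func_obs {
	const char *name;
	bool optimized_parms;	/* some parameter was optimized out in this CU */
};

/* Once all CUs have been processed, a function is only safe to add if
 * it was observed at least once and no CU saw optimized-out parameters.
 */
static bool func_safe_to_add(const char *name, const struct func_obs *obs,
			     size_t nr_obs)
{
	bool seen = false;
	size_t i;

	for (i = 0; i < nr_obs; i++) {
		if (strcmp(obs[i].name, name) != 0)
			continue;
		if (obs[i].optimized_parms)
			return false;
		seen = true;
	}
	return seen;
}
```

A single CU reporting optimized-out parameters is enough to veto the
function, which is why the answer is not known until every CU has
been seen.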

Patch 5 addresses a related problem - it is entirely possible
for a static function of the same name to exist in different
CUs with different function signatures.  Because BTF does not
currently encode any information that would help disambiguate
which BTF function specification matches which static function
(in the case of multiple different function signatures), it is
best to eliminate such functions from BTF for now.  The same
mechanism that is used to compare static "."-suffixed functions
is re-used for the static function comparison.  A superficial
comparison of number of parameters/parameter names is done to
see if such representations are consistent, and if inconsistent
prototypes are observed, the function is flagged for exclusion
from BTF.
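
A minimal sketch of such a superficial comparison follows.  The
struct proto_sketch type, the protos_consistent() helper and the
SKETCH_MAX_PARAMS cap are all assumptions made for illustration; the
real limit in the series is BTF_ENCODER_MAX_PARAMETERS, whose value
is not restated here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_PARAMS 10	/* assumed cap, for illustration only */

/* Hypothetical summary of a function prototype as seen in one CU. */
struct proto_sketch {
	size_t nr_parms;
	const char *parm_names[SKETCH_MAX_PARAMS];	/* NULL if unnamed */
};

/* Superficial comparison: same parameter count and same parameter
 * names.  The loop is bounded by SKETCH_MAX_PARAMS so that prototypes
 * with more parameters than we store never index past the array.
 */
static bool protos_consistent(const struct proto_sketch *a,
			      const struct proto_sketch *b)
{
	size_t i;

	if (a->nr_parms != b->nr_parms)
		return false;

	for (i = 0; i < a->nr_parms && i < SKETCH_MAX_PARAMS; i++) {
		const char *an = a->parm_names[i];
		const char *bn = b->parm_names[i];

		if (an == NULL || bn == NULL) {
			if (an != bn)
				return false;
			continue;
		}
		if (strcmp(an, bn) != 0)
			return false;
	}
	return true;
}
```

Types are deliberately not compared in this sketch, mirroring the
"superficial" nature of the check described above.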

When these methods are combined - the additive encoding of
"."-suffixed functions and the subtractive elimination of
functions with inconsistent parameters - we see an overall
drop in the number of functions in vmlinux BTF, from
51150 to 49871.

Changes since v1 [2]

- Eduard noted that a DW_AT_const_value attribute can signal
  an optimized-out parameter, and that the lack of a location
  attribute signals optimization; ensure we handle those cases
  also (Eduard, patch 1).
- Jiri noted we can have inconsistencies between a static
  and non-static function; apply the comparison process to
  all functions (Jiri, patch 5)
- A segmentation fault was observed when handling functions with
  more than 10 parameters; the parameter comparison loop needed
  to exit at BTF_ENCODER_MAX_PARAMETERS (patch 5)
- Kui-Feng Lee pointed out that having a global shared function
  tree would lead to a lot of contention; here a per-encoder 
  tree is used, and once the threads are collected the trees
  are merged. Performance numbers are provided in patch 5 
  (Kui-Feng Lee, patches 4/5)

Alan Maguire (5):
  dwarves: help dwarf loader spot functions with optimized-out
    parameters
  btf_encoder: refactor function addition into dedicated
    btf_encoder__add_func
  btf_encoder: rework btf_encoders__*() API to allow traversal of
    encoders
  btf_encoder: represent "."-suffixed functions (".isra.0") in BTF
  btf_encoder: delay function addition to check for function prototype
    inconsistencies

 btf_encoder.c  | 392 ++++++++++++++++++++++++++++++++++++++++++++++++---------
 btf_encoder.h  |   6 -
 dwarf_loader.c | 125 ++++++++++++++++--
 dwarves.h      |   8 +-
 pahole.c       |  14 +--
 5 files changed, 464 insertions(+), 81 deletions(-)

-- 
1.8.3.1



* [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-30 14:29 [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
@ 2023-01-30 14:29 ` Alan Maguire
  2023-01-30 18:36   ` Arnaldo Carvalho de Melo
  2023-01-30 14:29 ` [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func Alan Maguire
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 14:29 UTC (permalink / raw)
  To: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo
  Cc: daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf, Alan Maguire

Compilation generates DWARF at several stages, and often the
later DWARF representations more accurately represent optimizations
that have occurred during compilation.

In particular, parameter representations can be spotted by their
abstract origin references to the original parameter, but they
often have more accurate location information.  In most cases,
the parameter locations will match calling conventions, and be
registers for the first 6 parameters on x86_64, first 8 on ARM64
etc.  If the parameter is not in a register when it should be, however,
it is likely passed via the stack or the compiler has used a
constant representation instead.  The latter can often be
spotted by checking for a DW_AT_const_value attribute,
as noted by Eduard.

In addition, absence of a location attribute (either across
the abstract origin reference and the original parameter,
or in the standalone parameter description) is evidence of
an optimized-out parameter.  Presence of a location attribute
is recorded in the parameter description and shared between
abstract origin references and the parameters they refer to.
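
The rules above can be summarized in the following sketch.  It is a
simplification under stated assumptions: it collapses the per-DIE
check and the later sharing between abstract origin and original
parameter into one step, and enum loc_kind, struct param_sketch,
param_optimized_out() and SKETCH_NR_REGISTER_PARAMS are hypothetical
names that do not appear in the patch.

```c
#include <stdbool.h>

/* Assumed number of register-passed parameters (the x86_64 value). */
#define SKETCH_NR_REGISTER_PARAMS 6

/* Hypothetical summary of what DWARF says about a parameter's location. */
enum loc_kind {
	LOC_NONE,	/* no location attribute at all */
	LOC_REGISTER,	/* location expression names a register */
	LOC_OTHER,	/* any other location expression */
};

struct param_sketch {
	enum loc_kind loc;
	bool has_const_value;	/* DW_AT_const_value present */
};

/* Returns true if parameter number 'idx' (0-based) looks optimized out. */
static bool param_optimized_out(const struct param_sketch *p, unsigned int idx)
{
	/* Parameters beyond the register-passed ones are expected to be
	 * on the stack, so no judgment is made about them here.
	 */
	if (idx >= SKETCH_NR_REGISTER_PARAMS)
		return false;

	if (p->has_const_value)
		return true;

	switch (p->loc) {
	case LOC_REGISTER:
		return false;
	case LOC_OTHER:		/* expected a register, found something else */
	case LOC_NONE:		/* no location anywhere */
	default:
		return true;
	}
}
```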

This change adds a field to parameters and their associated
ftype to note if a parameter has been optimized out.  Having
this information allows us to skip such functions, as their
presence in CUs makes BTF encoding impossible.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 dwarf_loader.c | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
 dwarves.h      |   5 ++-
 2 files changed, 122 insertions(+), 8 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 5a74035..93c2307 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -992,13 +992,98 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
 	return member;
 }
 
-static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
+/* How many function parameters are passed via registers?  Used below in
+ * determining if an argument has been optimized out or if it is simply
+ * an argument > NR_REGISTER_PARAMS.  Setting NR_REGISTER_PARAMS to 0
+ * allows unsupported architectures to skip tagging optimized-out
+ * values.
+ */
+#if defined(__x86_64__)
+#define NR_REGISTER_PARAMS      6
+#elif defined(__s390__)
+#define NR_REGISTER_PARAMS	5
+#elif defined(__aarch64__)
+#define NR_REGISTER_PARAMS      8
+#elif defined(__mips__)
+#define NR_REGISTER_PARAMS	8
+#elif defined(__powerpc__)
+#define NR_REGISTER_PARAMS	8
+#elif defined(__sparc__)
+#define NR_REGISTER_PARAMS	6
+#elif defined(__riscv) && __riscv_xlen == 64
+#define NR_REGISTER_PARAMS	8
+#elif defined(__arc__)
+#define NR_REGISTER_PARAMS	8
+#else
+#define NR_REGISTER_PARAMS      0
+#endif
+
+static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
+					struct conf_load *conf, int param_idx)
 {
 	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
 
 	if (parm != NULL) {
+		bool has_const_value;
+		Dwarf_Attribute attr;
+		struct location loc;
+
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
+
+		if (param_idx >= NR_REGISTER_PARAMS)
+			return parm;
+		/* Parameters which use DW_AT_abstract_origin to point at
+		 * the original parameter definition (with no name in the DIE)
+		 * are the result of later DWARF generation during compilation
+		 * so often better take into account if arguments were
+		 * optimized out.
+		 *
+		 * By checking that locations for parameters that are expected
+		 * to be passed as registers are actually passed as registers,
+		 * we can spot optimized-out parameters.
+		 *
+		 * It can also be the case that a parameter DIE has
+		 * a constant value attribute reflecting optimization or
+		 * has no location attribute.
+		 *
+		 * From the DWARF spec:
+		 *
+		 * "4.1.10
+		 *
+		 * A DW_AT_const_value attribute for an entry describing a
+		 * variable or formal parameter whose value is constant and not
+		 * represented by an object in the address space of the program,
+		 * or an entry describing a named constant. (Note
+		 * that such an entry does not have a location attribute.)"
+		 *
+		 * So we can also use the absence of a location for a parameter
+		 * as evidence it has been optimized out.  This info will
+		 * need to be shared between a parameter and any abstract
+		 * origin references however, since gcc can have location
+		 * information in the parameter that refers back to the original
+		 * via abstract origin, so we need to share location presence
+		 * between these parameter representations.  See
+		 * ftype__recode_dwarf_types() below for how this is handled.
+		 */
+		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
+		has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
+		if (parm->has_loc &&
+		    attr_location(die, &loc.expr, &loc.exprlen) == 0 &&
+			loc.exprlen != 0) {
+			Dwarf_Op *expr = loc.expr;
+
+			switch (expr->atom) {
+			case DW_OP_reg1 ... DW_OP_reg31:
+			case DW_OP_breg0 ... DW_OP_breg31:
+				break;
+			default:
+				parm->optimized = 1;
+				break;
+			}
+		} else if (has_const_value) {
+			parm->optimized = 1;
+		}
 	}
 
 	return parm;
@@ -1450,7 +1535,7 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
 					     struct cu *cu, struct conf_load *conf,
 					     int param_idx)
 {
-	struct parameter *parm = parameter__new(die, cu, conf);
+	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
 
 	if (parm == NULL)
 		return NULL;
@@ -2194,6 +2279,7 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 
 	ftype__for_each_parameter(type, pos) {
 		struct dwarf_tag *dpos = pos->tag.priv;
+		struct parameter *opos;
 		struct dwarf_tag *dtype;
 
 		if (dpos->type.off == 0) {
@@ -2207,8 +2293,18 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 				tag__print_abstract_origin_not_found(&pos->tag);
 				continue;
 			}
-			pos->name = tag__parameter(dtype->tag)->name;
+			opos = tag__parameter(dtype->tag);
+			pos->name = opos->name;
 			pos->tag.type = dtype->tag->type;
+			/* share location information between parameter and
+			 * abstract origin; if neither have location, we will
+			 * mark the parameter as optimized out.
+			 */
+			if (pos->has_loc)
+				opos->has_loc = pos->has_loc;
+
+			if (pos->optimized)
+				opos->optimized = pos->optimized;
 			continue;
 		}
 
@@ -2478,18 +2574,33 @@ out:
 	return 0;
 }
 
-static int cu__resolve_func_ret_types(struct cu *cu)
+static int cu__resolve_func_ret_types_optimized(struct cu *cu)
 {
 	struct ptr_table *pt = &cu->functions_table;
 	uint32_t i;
 
 	for (i = 0; i < pt->nr_entries; ++i) {
 		struct tag *tag = pt->entries[i];
+		struct parameter *pos;
+		struct function *fn = tag__function(tag);
+
+		/* mark function as optimized if parameter is, or
+		 * if parameter does not have a location; at this
+		 * point location presence has been marked in
+		 * abstract origins for cases where a parameter
+		 * location is not stored in the original function
+		 * parameter tag.
+		 */
+		ftype__for_each_parameter(&fn->proto, pos) {
+			if (pos->optimized || !pos->has_loc) {
+				fn->proto.optimized_parms = 1;
+				break;
+			}
+		}
 
 		if (tag == NULL || tag->type != 0)
 			continue;
 
-		struct function *fn = tag__function(tag);
 		if (!fn->abstract_origin)
 			continue;
 
@@ -2612,7 +2723,7 @@ static int die__process_and_recode(Dwarf_Die *die, struct cu *cu, struct conf_lo
 	if (ret != 0)
 		return ret;
 
-	return cu__resolve_func_ret_types(cu);
+	return cu__resolve_func_ret_types_optimized(cu);
 }
 
 static int class_member__cache_byte_size(struct tag *tag, struct cu *cu,
@@ -3132,7 +3243,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
 	 * encoded in another subprogram through abstract_origin
 	 * tag. Let us visit all subprograms again to resolve this.
 	 */
-	if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT)
+	if (cu__resolve_func_ret_types_optimized(cu) != LSK__KEEPIT)
 		goto out_abort;
 
 	if (cus__finalize(cus, cu, conf, NULL) == LSK__STOP_LOADING)
diff --git a/dwarves.h b/dwarves.h
index 589588e..2723466 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -808,6 +808,8 @@ size_t lexblock__fprintf(const struct lexblock *lexblock, const struct cu *cu,
 struct parameter {
 	struct tag tag;
 	const char *name;
+	uint8_t optimized:1;
+	uint8_t has_loc:1;
 };
 
 static inline struct parameter *tag__parameter(const struct tag *tag)
@@ -827,7 +829,8 @@ struct ftype {
 	struct tag	 tag;
 	struct list_head parms;
 	uint16_t	 nr_parms;
-	uint8_t		 unspec_parms; /* just one bit is needed */
+	uint8_t		 unspec_parms:1; /* just one bit is needed */
+	uint8_t		 optimized_parms:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
-- 
1.8.3.1



* [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func
  2023-01-30 14:29 [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
  2023-01-30 14:29 ` [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters Alan Maguire
@ 2023-01-30 14:29 ` Alan Maguire
  2023-02-01 17:19   ` Arnaldo Carvalho de Melo
  2023-01-30 14:29 ` [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 14:29 UTC (permalink / raw)
  To: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo
  Cc: daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf, Alan Maguire

This will be useful for postponing local function addition later on.
As part of this, store the type id offset and unspecified type in
the encoder, as this will simplify late addition of local functions.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 101 +++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 57 insertions(+), 44 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index a5fa04a..44f1905 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -54,6 +54,8 @@ struct btf_encoder {
 	struct gobuffer   percpu_secinfo;
 	const char	  *filename;
 	struct elf_symtab *symtab;
+	uint32_t	  type_id_off;
+	uint32_t	  unspecified_type;
 	bool		  has_index_type,
 			  need_index_type,
 			  skip_encoding_vars,
@@ -593,20 +595,20 @@ static int32_t btf_encoder__add_func_param(struct btf_encoder *encoder, const ch
 	}
 }
 
-static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t type_id_off, uint32_t tag_type)
+static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_type)
 {
 	if (tag_type == 0)
 		return 0;
 
-	if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
+	if (tag_type == encoder->unspecified_type) {
 		// No provision for encoding this, turn it into void.
 		return 0;
 	}
 
-	return type_id_off + tag_type;
+	return encoder->type_id_off + tag_type;
 }
 
-static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype, uint32_t type_id_off)
+static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype)
 {
 	struct btf *btf = encoder->btf;
 	const struct btf_type *t;
@@ -616,7 +618,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
 
 	/* add btf_type for func_proto */
 	nr_params = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
-	type_id = btf_encoder__tag_type(encoder, type_id_off, ftype->tag.type);
+	type_id = btf_encoder__tag_type(encoder, ftype->tag.type);
 
 	id = btf__add_func_proto(btf, type_id);
 	if (id > 0) {
@@ -634,7 +636,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
 	ftype__for_each_parameter(ftype, param) {
 		const char *name = parameter__name(param);
 
-		type_id = param->tag.type == 0 ? 0 : type_id_off + param->tag.type;
+		type_id = param->tag.type == 0 ? 0 : encoder->type_id_off + param->tag.type;
 		++param_idx;
 		if (btf_encoder__add_func_param(encoder, name, type_id, param_idx == nr_params))
 			return -1;
@@ -762,6 +764,31 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
 	return id;
 }
 
+static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
+{
+	int btf_fnproto_id, btf_fn_id, tag_type_id;
+	struct llvm_annotation *annot;
+	const char *name;
+
+	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
+	name = function__name(fn);
+	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
+	if (btf_fnproto_id < 0 || btf_fn_id < 0) {
+		printf("error: failed to encode function '%s'\n", function__name(fn));
+		return -1;
+	}
+	list_for_each_entry(annot, &fn->annots, node) {
+		tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id,
+							annot->component_idx);
+		if (tag_type_id < 0) {
+			fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
+				annot->value, name, annot->component_idx);
+			return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * This corresponds to the same macro defined in
  * include/linux/kallsyms.h
@@ -859,22 +886,21 @@ static void dump_invalid_symbol(const char *msg, const char *sym,
 	fprintf(stderr, "PAHOLE: Error: Use '--btf_encode_force' to ignore such symbols and force emit the btf.\n");
 }
 
-static int tag__check_id_drift(const struct tag *tag,
-			       uint32_t core_id, uint32_t btf_type_id,
-			       uint32_t type_id_off)
+static int tag__check_id_drift(struct btf_encoder *encoder, const struct tag *tag,
+			       uint32_t core_id, uint32_t btf_type_id)
 {
-	if (btf_type_id != (core_id + type_id_off)) {
+	if (btf_type_id != (core_id + encoder->type_id_off)) {
 		fprintf(stderr,
 			"%s: %s id drift, core_id: %u, btf_type_id: %u, type_id_off: %u\n",
 			__func__, dwarf_tag_name(tag->tag),
-			core_id, btf_type_id, type_id_off);
+			core_id, btf_type_id, encoder->type_id_off);
 		return -1;
 	}
 
 	return 0;
 }
 
-static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off)
+static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag)
 {
 	struct type *type = tag__type(tag);
 	struct class_member *pos;
@@ -896,7 +922,8 @@ static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct
 		 * is required.
 		 */
 		name = class_member__name(pos);
-		if (btf_encoder__add_field(encoder, name, type_id_off + pos->tag.type, pos->bitfield_size, pos->bit_offset))
+		if (btf_encoder__add_field(encoder, name, encoder->type_id_off + pos->tag.type,
+					   pos->bitfield_size, pos->bit_offset))
 			return -1;
 	}
 
@@ -936,11 +963,11 @@ static int32_t btf_encoder__add_enum_type(struct btf_encoder *encoder, struct ta
 	return type_id;
 }
 
-static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off,
+static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
 				   struct conf_load *conf_load)
 {
 	/* single out type 0 as it represents special type "void" */
-	uint32_t ref_type_id = tag->type == 0 ? 0 : type_id_off + tag->type;
+	uint32_t ref_type_id = tag->type == 0 ? 0 : encoder->type_id_off + tag->type;
 	struct base_type *bt;
 	const char *name;
 
@@ -970,7 +997,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
 		if (tag__type(tag)->declaration)
 			return btf_encoder__add_ref_type(encoder, BTF_KIND_FWD, 0, name, tag->tag == DW_TAG_union_type);
 		else
-			return btf_encoder__add_struct_type(encoder, tag, type_id_off);
+			return btf_encoder__add_struct_type(encoder, tag);
 	case DW_TAG_array_type:
 		/* TODO: Encode one dimension at a time. */
 		encoder->need_index_type = true;
@@ -978,7 +1005,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
 	case DW_TAG_enumeration_type:
 		return btf_encoder__add_enum_type(encoder, tag, conf_load);
 	case DW_TAG_subroutine_type:
-		return btf_encoder__add_func_proto(encoder, tag__ftype(tag), type_id_off);
+		return btf_encoder__add_func_proto(encoder, tag__ftype(tag));
         case DW_TAG_unspecified_type:
 		/* Just don't encode this for now, converting anything with this type to void (0) instead.
 		 *
@@ -1281,7 +1308,7 @@ static bool ftype__has_arg_names(const struct ftype *ftype)
 	return true;
 }
 
-static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_t type_id_off)
+static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
 {
 	struct cu *cu = encoder->cu;
 	uint32_t core_id;
@@ -1366,7 +1393,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_
 			continue;
 		}
 
-		type = var->ip.tag.type + type_id_off;
+		type = var->ip.tag.type + encoder->type_id_off;
 		linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC;
 
 		if (encoder->verbose) {
@@ -1507,7 +1534,6 @@ void btf_encoder__delete(struct btf_encoder *encoder)
 
 int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load)
 {
-	uint32_t type_id_off = btf__type_cnt(encoder->btf) - 1;
 	struct llvm_annotation *annot;
 	int btf_type_id, tag_type_id, skipped_types = 0;
 	uint32_t core_id;
@@ -1516,21 +1542,24 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	int err = 0;
 
 	encoder->cu = cu;
+	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
+	if (encoder->cu->unspecified_type.tag)
+		encoder->unspecified_type = encoder->cu->unspecified_type.type;
 
 	if (!encoder->has_index_type) {
 		/* cu__find_base_type_by_name() takes "type_id_t *id" */
 		type_id_t id;
 		if (cu__find_base_type_by_name(cu, "int", &id)) {
 			encoder->has_index_type = true;
-			encoder->array_index_id = type_id_off + id;
+			encoder->array_index_id = encoder->type_id_off + id;
 		} else {
 			encoder->has_index_type = false;
-			encoder->array_index_id = type_id_off + cu->types_table.nr_entries;
+			encoder->array_index_id = encoder->type_id_off + cu->types_table.nr_entries;
 		}
 	}
 
 	cu__for_each_type(cu, core_id, pos) {
-		btf_type_id = btf_encoder__encode_tag(encoder, pos, type_id_off, conf_load);
+		btf_type_id = btf_encoder__encode_tag(encoder, pos, conf_load);
 
 		if (btf_type_id == 0) {
 			++skipped_types;
@@ -1538,7 +1567,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 		}
 
 		if (btf_type_id < 0 ||
-		    tag__check_id_drift(pos, core_id, btf_type_id + skipped_types, type_id_off)) {
+		    tag__check_id_drift(encoder, pos, core_id, btf_type_id + skipped_types)) {
 			err = -1;
 			goto out;
 		}
@@ -1572,7 +1601,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 			continue;
 		}
 
-		btf_type_id = type_id_off + core_id;
+		btf_type_id = encoder->type_id_off + core_id;
 		ns = tag__namespace(pos);
 		list_for_each_entry(annot, &ns->annots, node) {
 			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_type_id, annot->component_idx);
@@ -1585,8 +1614,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	}
 
 	cu__for_each_function(cu, core_id, fn) {
-		int btf_fnproto_id, btf_fn_id;
-		const char *name;
 
 		/*
 		 * Skip functions that:
@@ -1616,27 +1643,13 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				continue;
 		}
 
-		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto, type_id_off);
-		name = function__name(fn);
-		btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
-		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
-			err = -1;
-			printf("error: failed to encode function '%s'\n", function__name(fn));
+		err = btf_encoder__add_func(encoder, fn);
+		if (err)
 			goto out;
-		}
-
-		list_for_each_entry(annot, &fn->annots, node) {
-			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id, annot->component_idx);
-			if (tag_type_id < 0) {
-				fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
-					annot->value, name, annot->component_idx);
-				goto out;
-			}
-		}
 	}
 
 	if (!encoder->skip_encoding_vars)
-		err = btf_encoder__encode_cu_variables(encoder, type_id_off);
+		err = btf_encoder__encode_cu_variables(encoder);
 out:
 	encoder->cu = NULL;
 	return err;
-- 
1.8.3.1



* [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders
  2023-01-30 14:29 [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
  2023-01-30 14:29 ` [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters Alan Maguire
  2023-01-30 14:29 ` [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func Alan Maguire
@ 2023-01-30 14:29 ` Alan Maguire
  2023-01-30 22:04   ` Jiri Olsa
  2023-01-30 14:29 ` [PATCH v2 dwarves 4/5] btf_encoder: represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
  2023-01-30 14:29 ` [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies Alan Maguire
  4 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 14:29 UTC (permalink / raw)
  To: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo
  Cc: daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf, Alan Maguire

To coordinate across multiple encoders at collection time, there
will be a need to access the set of encoders.  Rework the unused
btf_encoders__*() API to facilitate this.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 30 ++++++++++++++++++++++--------
 btf_encoder.h |  6 ------
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 44f1905..e20b628 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -30,6 +30,7 @@
 
 #include <errno.h>
 #include <stdint.h>
+#include <pthread.h>
 
 struct elf_function {
 	const char	*name;
@@ -79,21 +80,32 @@ struct btf_encoder {
 	} functions;
 };
 
-void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder)
-{
-	list_add_tail(&encoder->node, encoders);
-}
+static LIST_HEAD(encoders);
+static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
 
-struct btf_encoder *btf_encoders__first(struct list_head *encoders)
+/* mutex only needed for add/delete, as this can happen in multiple encoding
+ * threads.  Traversal of the list is currently confined to thread collection.
+ */
+static void btf_encoders__add(struct btf_encoder *encoder)
 {
-	return list_first_entry(encoders, struct btf_encoder, node);
+	pthread_mutex_lock(&encoders__lock);
+	list_add_tail(&encoder->node, &encoders);
+	pthread_mutex_unlock(&encoders__lock);
 }
 
-struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder)
+#define btf_encoders__for_each_encoder(encoder)		\
+	list_for_each_entry(encoder, &encoders, node)
+
+static void btf_encoders__delete(struct btf_encoder *encoder)
 {
-	return list_next_entry(encoder, node);
+	pthread_mutex_lock(&encoders__lock);
+	list_del(&encoder->node);
+	pthread_mutex_unlock(&encoders__lock);
 }
 
+#define btf_encoders__for_each_encoder(encoder)			\
+	list_for_each_entry(encoder, &encoders, node)
+
 #define PERCPU_SECTION ".data..percpu"
 
 /*
@@ -1505,6 +1517,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
 
 		if (encoder->verbose)
 			printf("File %s:\n", cu->filename);
+		btf_encoders__add(encoder);
 	}
 out:
 	return encoder;
@@ -1519,6 +1532,7 @@ void btf_encoder__delete(struct btf_encoder *encoder)
 	if (encoder == NULL)
 		return;
 
+	btf_encoders__delete(encoder);
 	__gobuffer__delete(&encoder->percpu_secinfo);
 	zfree(&encoder->filename);
 	btf__free(encoder->btf);
diff --git a/btf_encoder.h b/btf_encoder.h
index a65120c..34516bb 100644
--- a/btf_encoder.h
+++ b/btf_encoder.h
@@ -23,12 +23,6 @@ int btf_encoder__encode(struct btf_encoder *encoder);
 
 int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
 
-void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder);
-
-struct btf_encoder *btf_encoders__first(struct list_head *encoders);
-
-struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder);
-
 struct btf *btf_encoder__btf(struct btf_encoder *encoder);
 
 int btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other);
-- 
1.8.3.1



* [PATCH v2 dwarves 4/5] btf_encoder: represent "."-suffixed functions (".isra.0") in BTF
  2023-01-30 14:29 [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (2 preceding siblings ...)
  2023-01-30 14:29 ` [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
@ 2023-01-30 14:29 ` Alan Maguire
  2023-01-30 14:29 ` [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies Alan Maguire
  4 siblings, 0 replies; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 14:29 UTC (permalink / raw)
  To: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo
  Cc: daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf, Alan Maguire

At gcc optimization level O2 or higher, many function optimizations
occur such as

- constant propagation (.constprop.0);
- interprocedural scalar replacement of aggregates, removal of
  unused parameters and replacement of parameters passed by
  reference by parameters passed by value (.isra.0)

See [1] for details.

Currently BTF encoding does not handle such optimized functions
that get renamed with a "." suffix such as ".isra.0", ".constprop.0".
Skipping them is the safer choice, because such suffixes often indicate
that parameters have been optimized out.  Since we can now spot this,
support matching to a "." suffix and represent the function in BTF if
it does not have optimized-out parameters.  First an attempt to match by
exact name is made; if that fails we fall back to checking
for a "."-suffixed name.  The BTF representation will use the
original function name "foo" not "foo.isra.0" for consistency
with DWARF representation.
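
As a rough illustration of this two-step lookup (a sketch, not the
patch code itself), the fragment below searches a name-sorted symbol
table first by exact name and then by prefix.  struct sym_entry,
sym_entry_make() and find_symbol() are hypothetical names used only
here; the comparator follows the same idea as functions_cmp() in the
diff below.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the patch's struct elf_function. */
struct sym_entry {
	const char *name;      /* ELF symbol name, e.g. "foo.isra.0" */
	size_t      prefixlen; /* length of the text before the first '.', else 0 */
};

/* Build a table entry, recording the prefix length for "."-suffixed names. */
static struct sym_entry sym_entry_make(const char *name)
{
	const char *dot = strchr(name, '.');
	struct sym_entry e = {
		.name = name,
		.prefixlen = dot ? (size_t)(dot - name) : 0,
	};

	return e;
}

/* The key is always the first argument when called from bsearch().
 * A key with a non-zero prefixlen matches any entry whose prefix has
 * the same length and the same characters; otherwise compare exactly.
 */
static int sym_entry_cmp(const void *_a, const void *_b)
{
	const struct sym_entry *a = _a, *b = _b;

	if (a->prefixlen && a->prefixlen == b->prefixlen)
		return strncmp(a->name, b->name, b->prefixlen);
	return strcmp(a->name, b->name);
}

/* tab must be sorted by name.  Prefer an exact match and only fall
 * back to a "name.<suffix>" match if the exact lookup fails.
 */
static const struct sym_entry *find_symbol(const struct sym_entry *tab,
					   size_t cnt, const char *name)
{
	struct sym_entry key = { .name = name, .prefixlen = 0 };
	const struct sym_entry *e;

	e = bsearch(&key, tab, cnt, sizeof(*tab), sym_entry_cmp);
	if (e)
		return e;
	key.prefixlen = strlen(name);
	return bsearch(&key, tab, cnt, sizeof(*tab), sym_entry_cmp);
}
```

With a table holding "bar", "baz.constprop.0", "foo.isra.0" and "qux",
looking up "foo" finds no exact match and falls back to "foo.isra.0",
while "bar" is matched exactly.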

There is a complication however, and this arises because we process
each CU separately and merge BTF when complete.  Different CUs
may optimize differently, so in one CU, a function may have
optimized-out parameters - and thus be ineligible for BTF -
while in another it does not have optimized-out parameters -
making it eligible for BTF.  The NF_HOOK function is an
example of this.

To avoid disrupting BTF generation parallelism, the approach
taken is to save such functions in a per-encoder binary tree
for later addition.  That way, at thread collection time,
observations about optimizations can be merged across
encoders and we know whether it is safe to add a "."-suffixed
function or not.
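
The save-then-merge idea can be sketched with the POSIX tsearch()
and tfind() calls.  This is a minimal illustration under simplifying
assumptions: struct saved_fn, save_fn() and merge_fn() are
hypothetical names, and only the "optimized-out parameters" flag is
tracked, not the encoder state the real patch records.

```c
#include <search.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for pahole's struct function. */
struct saved_fn {
	const char *name;
	bool        optimized_parms; /* seen with optimized-out parameters */
};

static int saved_fn_cmp(const void *a, const void *b)
{
	const struct saved_fn *fa = a, *fb = b;

	return strcmp(fa->name, fb->name);
}

/* Save fn in one encoder's tree.  If a function of the same name is
 * already present, fold this observation into the existing entry.
 * Returns 0 on success, -1 if tsearch() could not allocate a node.
 */
static int save_fn(void **tree, struct saved_fn *fn)
{
	struct saved_fn **nodep = tsearch(fn, tree, saved_fn_cmp);

	if (!nodep)
		return -1;
	if (*nodep != fn)
		(*nodep)->optimized_parms |= fn->optimized_parms;
	return 0;
}

/* At collection time, merge what another encoder's tree observed
 * about the same function name, in both directions.
 */
static void merge_fn(void *const *other_tree, struct saved_fn *fn)
{
	struct saved_fn **nodep = tfind(fn, other_tree, saved_fn_cmp);

	if (!nodep)
		return;
	fn->optimized_parms |= (*nodep)->optimized_parms;
	(*nodep)->optimized_parms |= fn->optimized_parms;
}
```

After merging across all encoders, a function is only added to BTF if
its optimized_parms flag is still clear, so a single CU that drops a
parameter is enough to exclude the function everywhere.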

The result of this is we add 602 "."-suffixed functions to
the BTF representation.

However, note that the optimization checks are applied to
both "."-suffixed and normal functions.  They also find 1428
of the latter with optimized-out parameters, and these are
dropped from the BTF representation.  For example,
bad_inode_permission() is skipped because no location
information is supplied for any of its parameters;
disassembling it we see why this might be:

(gdb) disassemble bad_inode_permission
Dump of assembler code for function bad_inode_permission:
   0xffffffff813ef180 <+0>:	callq  0xffffffff81088c70 <__fentry__>
   0xffffffff813ef185 <+5>:	push   %rbp
   0xffffffff813ef186 <+6>:	mov    $0xfffffffb,%eax
   0xffffffff813ef18b <+11>:	mov    %rsp,%rbp
   0xffffffff813ef18e <+14>:	pop    %rbp
   0xffffffff813ef18f <+15>:	jmpq   0xffffffff81c6e600 <__x86_return_thunk>
End of assembler dump.

...since the function is simply:

static int bad_inode_permission(struct user_namespace *mnt_userns,
				struct inode *inode, int mask)
{
	return -EIO;
}

So these changes lead to a net decrease of 826 functions in
vmlinux BTF.

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 197 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 dwarves.h     |   2 +
 pahole.c      |  14 ++---
 3 files changed, 200 insertions(+), 13 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index e20b628..f36150e 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -30,11 +30,13 @@
 
 #include <errno.h>
 #include <stdint.h>
+#include <search.h> /* for tsearch(), tfind() and tdestroy() */
 #include <pthread.h>
 
 struct elf_function {
 	const char	*name;
 	bool		 generated;
+	size_t		prefixlen;
 };
 
 #define MAX_PERCPU_VAR_CNT 4096
@@ -57,6 +59,8 @@ struct btf_encoder {
 	struct elf_symtab *symtab;
 	uint32_t	  type_id_off;
 	uint32_t	  unspecified_type;
+	void		  *saved_func_tree;
+	int		  saved_func_cnt;
 	bool		  has_index_type,
 			  need_index_type,
 			  skip_encoding_vars,
@@ -77,12 +81,15 @@ struct btf_encoder {
 		struct elf_function *entries;
 		int		    allocated;
 		int		    cnt;
+		int		    suffix_cnt; /* number of .isra, .part etc */
 	} functions;
 };
 
 static LIST_HEAD(encoders);
 static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
 
+static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder);
+
 /* mutex only needed for add/delete, as this can happen in multiple encoding
  * threads.  Traversal of the list is currently confined to thread collection.
  */
@@ -701,6 +708,10 @@ int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder
 	int32_t i, id;
 	struct btf_var_secinfo *vsi;
 
+	btf_encoder__add_saved_funcs(other);
+	if (encoder == other)
+		return 0;
+
 	for (i = 0; i < nr_var_secinfo; i++) {
 		vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i;
 		type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */
@@ -776,6 +787,70 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
 	return id;
 }
 
+
+static int function__compare(const void *a, const void *b)
+{
+	struct function *fa = (struct function *)a, *fb = (struct function *)b;
+
+	return strcmp(function__name(fa), function__name(fb));
+}
+
+struct btf_encoder_state {
+	struct btf_encoder *encoder;
+	uint32_t type_id_off;
+};
+
+static void btf_encoder__merge_func(struct btf_encoder *encoder, struct function *fn)
+{
+	struct function **nodep;
+
+	nodep = tfind(fn, &encoder->saved_func_tree, function__compare);
+	if (!nodep || !*nodep)
+		return;
+	/* merge characteristics across different encoder representations
+	 * of functions.
+	 */
+	fn->proto.optimized_parms |= (*nodep)->proto.optimized_parms;
+	(*nodep)->proto.optimized_parms |= fn->proto.optimized_parms;
+	(*nodep)->proto.processed = 1;
+}
+
+static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct function *fn)
+{
+	const char *name = function__name(fn);
+	struct function **nodep;
+
+	nodep = tsearch(fn, &encoder->saved_func_tree, function__compare);
+	if (nodep == NULL) {
+		fprintf(stderr, "error: out of memory adding static function '%s'\n",
+			name);
+		return -1;
+	}
+	/* If saving and we find an existing entry, we want to merge
+	 * observations across both functions, checking that the
+	 * "seen optimized parameters" status is reflected in our tree entry.
+	 * If the entry is new, record encoder state required
+	 * to add the local function later (encoder + type_id_off)
+	 * such that we can add the function later.
+	 */
+	if (*nodep != fn) {
+		(*nodep)->proto.optimized_parms |= fn->proto.optimized_parms;
+	} else {
+		struct btf_encoder_state *state = zalloc(sizeof(*state));
+
+		if (state == NULL) {
+			fprintf(stderr, "error: out of memory adding local function '%s'\n",
+				name);
+			return -1;
+		}
+		state->encoder = encoder;
+		state->type_id_off = encoder->type_id_off;
+		fn->priv = state;
+		encoder->saved_func_cnt++;
+	}
+	return 0;
+}
+
 static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
 {
 	int btf_fnproto_id, btf_fn_id, tag_type_id;
@@ -801,6 +876,67 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
 	return 0;
 }
 
+/* visit each node once, adding associated function. */
+static void btf_encoder__add_saved_func(const void *nodep, const VISIT which,
+					const int depth __maybe_unused)
+{
+	struct btf_encoder *encoder, *other_encoder;
+	struct btf_encoder_state *state;
+	struct function *fn = NULL;
+
+	switch (which) {
+	case preorder:
+	case endorder:
+		break;
+	case postorder:
+	case leaf:
+		fn = *((struct function **)nodep);
+		break;
+	}
+	if (!fn || !fn->priv || fn->proto.processed)
+		return;
+	state = (struct btf_encoder_state *)fn->priv;
+	encoder = state->encoder;
+	encoder->type_id_off = state->type_id_off;
+
+	/* merge optimized-out status across encoders */
+	btf_encoders__for_each_encoder(other_encoder) {
+		if (other_encoder != encoder)
+			btf_encoder__merge_func(other_encoder, fn);
+	}
+
+	if (fn->proto.optimized_parms) {
+		if (encoder->verbose) {
+			const char *name = function__name(fn);
+
+			printf("skipping addition of '%s'(%s) due to optimized-out parameters\n",
+			       name, fn->alias ?: name);
+		}
+	} else {
+		btf_encoder__add_func(encoder, fn);
+		fn->proto.processed = 1;
+	}
+}
+
+static void saved_func__free(void *node)
+{
+	struct function *fn = node;
+
+	if (fn->priv)
+		free(fn->priv);
+}
+
+void btf_encoder__add_saved_funcs(struct btf_encoder *encoder)
+{
+	if (!encoder->saved_func_tree)
+		return;
+
+	encoder->type_id_off = 0;
+	twalk(encoder->saved_func_tree, btf_encoder__add_saved_func);
+	tdestroy(encoder->saved_func_tree, saved_func__free);
+	encoder->saved_func_tree = NULL;
+}
+
 /*
  * This corresponds to the same macro defined in
  * include/linux/kallsyms.h
@@ -812,6 +948,11 @@ static int functions_cmp(const void *_a, const void *_b)
 	const struct elf_function *a = _a;
 	const struct elf_function *b = _b;
 
+	/* if search key allows prefix match, verify target has matching
+	 * prefix len and prefix matches.
+	 */
+	if (a->prefixlen && a->prefixlen == b->prefixlen)
+		return strncmp(a->name, b->name, b->prefixlen);
 	return strcmp(a->name, b->name);
 }
 
@@ -844,14 +985,21 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
 	}
 
 	encoder->functions.entries[encoder->functions.cnt].name = name;
+	if (strchr(name, '.')) {
+		const char *suffix = strchr(name, '.');
+
+		encoder->functions.suffix_cnt++;
+		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
+	}
 	encoder->functions.entries[encoder->functions.cnt].generated = false;
 	encoder->functions.cnt++;
 	return 0;
 }
 
-static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name)
+static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
+						       const char *name, size_t prefixlen)
 {
-	struct elf_function key = { .name = name };
+	struct elf_function key = { .name = name, .prefixlen = prefixlen };
 
 	return bsearch(&key, encoder->functions.entries, encoder->functions.cnt, sizeof(key), functions_cmp);
 }
@@ -1181,6 +1329,9 @@ int btf_encoder__encode(struct btf_encoder *encoder)
 {
 	int err;
 
+	/* for single-threaded case, saved funcs are added here */
+	btf_encoder__add_saved_funcs(encoder);
+
 	if (gobuffer__size(&encoder->percpu_secinfo) != 0)
 		btf_encoder__add_datasec(encoder, PERCPU_SECTION);
 
@@ -1628,6 +1779,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	}
 
 	cu__for_each_function(cu, core_id, fn) {
+		bool save = false;
 
 		/*
 		 * Skip functions that:
@@ -1648,22 +1800,55 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 			if (!name)
 				continue;
 
-			func = btf_encoder__find_function(encoder, name);
-			if (!func || func->generated)
+			/* prefer exact function name match... */
+			func = btf_encoder__find_function(encoder, name, 0);
+			if (func) {
+				if (func->generated)
+					continue;
+				func->generated = true;
+			} else if (encoder->functions.suffix_cnt) {
+				/* falling back to name.isra.0 match if no exact
+				 * match is found; only bother if we found any
+				 * .suffix function names.  The function
+				 * will be saved and added once we ensure
+				 * it does not have optimized-out parameters
+				 * in any cu.
+				 */
+				func = btf_encoder__find_function(encoder, name,
+								  strlen(name));
+				if (func) {
+					save = true;
+					if (encoder->verbose)
+						printf("matched function '%s' with '%s'%s\n",
+						       name, func->name,
+						       fn->proto.optimized_parms ?
+						       ", has optimized-out parameters" : "");
+				}
+			}
+			if (!func)
 				continue;
-			func->generated = true;
+			fn->alias = func->name;
 		} else {
 			if (!fn->external)
 				continue;
 		}
 
-		err = btf_encoder__add_func(encoder, fn);
+		if (save)
+			err = btf_encoder__save_func(encoder, fn);
+		else
+			err = btf_encoder__add_func(encoder, fn);
 		if (err)
 			goto out;
 	}
 
 	if (!encoder->skip_encoding_vars)
 		err = btf_encoder__encode_cu_variables(encoder);
+
+	/* It is only safe to delete this CU if we have not stashed any static
+	 * functions for later addition.
+	 */
+	if (!err)
+		err = encoder->saved_func_cnt > 0 ? LSK__KEEPIT : LSK__DELETE;
 out:
 	encoder->cu = NULL;
 	return err;
diff --git a/dwarves.h b/dwarves.h
index 2723466..64c7c56 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -831,6 +831,7 @@ struct ftype {
 	uint16_t	 nr_parms;
 	uint8_t		 unspec_parms:1; /* just one bit is needed */
 	uint8_t		 optimized_parms:1;
+	uint8_t		 processed:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
@@ -883,6 +884,7 @@ struct function {
 	struct rb_node	 rb_node;
 	const char	 *name;
 	const char	 *linkage_name;
+	const char	 *alias;	/* name.isra.0 */
 	uint32_t	 cu_total_size_inline_expansions;
 	uint16_t	 cu_total_nr_inline_expansions;
 	uint8_t		 inlined:2;
diff --git a/pahole.c b/pahole.c
index 6f4f87c..bc120cb 100644
--- a/pahole.c
+++ b/pahole.c
@@ -2980,20 +2980,20 @@ static int pahole_threads_collect(struct conf_load *conf, int nr_threads, void *
 		 * Merge content of the btf instances of worker threads to the btf
 		 * instance of the primary btf_encoder.
                 */
-		if (!threads[i]->btf || threads[i]->encoder == btf_encoder)
-			continue; /* The primary btf_encoder */
+		if (!threads[i]->btf)
+			continue;
 		err = btf_encoder__add_encoder(btf_encoder, threads[i]->encoder);
 		if (err < 0)
 			goto out;
-		btf_encoder__delete(threads[i]->encoder);
-		threads[i]->encoder = NULL;
 	}
 	err = 0;
 
 out:
 	for (i = 0; i < nr_threads; i++) {
-		if (threads[i]->encoder && threads[i]->encoder != btf_encoder)
+		if (threads[i]->encoder && threads[i]->encoder != btf_encoder) {
 			btf_encoder__delete(threads[i]->encoder);
+			threads[i]->encoder = NULL;
+		}
 	}
 	free(threads[0]);
 
@@ -3077,11 +3077,11 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
 			encoder = btf_encoder;
 		}
 
-		if (btf_encoder__encode_cu(encoder, cu, conf_load)) {
+		ret = btf_encoder__encode_cu(encoder, cu, conf_load);
+		if (ret < 0) {
 			fprintf(stderr, "Encountered error while encoding BTF.\n");
 			exit(1);
 		}
-		ret = LSK__DELETE;
 out_btf:
 		return ret;
 	}
-- 
1.8.3.1



* [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies
  2023-01-30 14:29 [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (3 preceding siblings ...)
  2023-01-30 14:29 ` [PATCH v2 dwarves 4/5] btf_encoder: represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
@ 2023-01-30 14:29 ` Alan Maguire
  2023-01-30 17:20   ` Alexei Starovoitov
  4 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 14:29 UTC (permalink / raw)
  To: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo
  Cc: daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf, Alan Maguire

There are multiple sources of inconsistency that can result in
functions of the same name having multiple prototypes:

- multiple static functions in different CUs share the same name
- static and external functions share the same name

Here we attempt to catch such cases by finding inconsistencies
across CUs using the save/compare/merge mechanisms that were
previously introduced to handle optimized-out parameters,
using it for all functions.

For two instances of a function to be considered consistent:

- number of parameters must match
- parameter names must match

The latter is a weaker check than a full type
comparison but suffices to match functions.

With these changes, we see 278 functions removed due to
prototype inconsistency.  For example, wakeup_show()
has two distinct prototypes:

static ssize_t wakeup_show(struct kobject *kobj,
                           struct kobj_attribute *attr, char *buf)
(from kernel/irq/irqdesc.c)

static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
                           char *buf)
(from drivers/base/power/sysfs.c)
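
The two consistency rules above can be sketched as follows.  This is
an illustration only: struct proto_sketch and protos_match() are
hypothetical names, and parameter names are held in a plain array
rather than being walked from the DWARF-derived parameter list as
funcs__match() does in the diff below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 12 /* same limit as BTF_ENCODER_MAX_PARAMETERS */

/* Hypothetical flattened view of a function prototype. */
struct proto_sketch {
	unsigned int nr_parms;
	const char  *parm_names[MAX_PARAMS]; /* NULL for an unnamed parameter */
};

/* Two prototypes are considered consistent if they have the same
 * number of parameters and each pair of parameter names matches.
 * Two unnamed parameters match each other; a named and an unnamed
 * parameter do not.  Types are deliberately not compared.
 */
static bool protos_match(const struct proto_sketch *p1,
			 const struct proto_sketch *p2)
{
	unsigned int i;

	if (p1->nr_parms != p2->nr_parms)
		return false;

	for (i = 0; i < p1->nr_parms && i < MAX_PARAMS; i++) {
		const char *n1 = p1->parm_names[i];
		const char *n2 = p2->parm_names[i];

		if (!n1 && !n2)
			continue;
		if (!n1 || !n2 || strcmp(n1, n2) != 0)
			return false;
	}
	return true;
}
```

For the two wakeup_show() variants, the parameter counts agree (3) but
the first names differ ("kobj" versus "dev"), so the prototypes are
flagged as inconsistent.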

In some other cases, the parameter comparisons weed out additional
inconsistencies in "."-suffixed functions across CUs.

We also see a large number of functions eliminated due to
optimized-out parameters; 2542 functions are eliminated for this
reason, both "."-suffixed (1007) and otherwise (1535).

Because the save/compare/merge process occurs for all functions
it is important to assess performance effects.  In addition,
prior to these changes the number of functions ultimately
represented in BTF was non-deterministic when pahole was
run with multiple threads.  This was due to the fact that
functions were marked as generated on a per-encoder basis
when first added, and as such the same function could
be added multiple times for different encoders, and if they
encountered inconsistent function prototypes, deduplication
could leave multiple entries in place for the same name.
When run in a single thread, the "generated" state associated
with the name would prevent this.

Here we assess both BTF encoding performance and determinism
of the function representation, comparing the baseline with
these changes.  Determinism is assessed by counting the
number of functions in BTF.  Comparisons are done for 1,
4 and 8 threads.

Baseline

$ time LLVM_OBJCOPY=objcopy pahole -J vmlinux

real	0m18.160s
user	0m17.179s
sys	0m0.757s

$ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
51150
$ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
51150

$ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux

real	0m8.078s
user	0m17.978s
sys	0m0.732s

$ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
51592
$ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
51150

$ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux

real	0m7.075s
user	0m19.010s
sys	0m0.587s

$ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
51683
$ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
51150

Test:

$ time LLVM_OBJCOPY=objcopy pahole -J  vmlinux

real	0m19.039s
user	0m17.617s
sys	0m1.419s
$ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
49871
$ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
49871

$ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux

real	0m8.482s
user	0m18.233s
sys	0m2.412s
$ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
49871
$ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
49871

$ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux

real	0m7.614s
user	0m19.384s
sys	0m3.739s
$ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
49871
$ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l

So there is a small cost in performance, but we improve determinism
and the consistency of representation.

Future work could support maintaining multiple inconsistent
prototypes; having a way to associate the function BTF representation
with the function site would be needed however, and the BPF
infrastructure would need to ensure that an fentry program was
attached to the right site with the right prototype for example.
BTF declaration tags specifying the function address(es)
that a prototype refers to could help here, but edge cases like
KASLR (where addresses change dynamically at boot-time) would
have to be considered to make this work well.

Similarly, future work could potentially accommodate function
prototypes with optimized-out parameters, again using
tagging to identify them.  The kernel would have to
be aware of such tagging and handle it.

For now it is better to have an incomplete representation
that more accurately reflects the actual function parameters
used, removing inconsistencies that could otherwise do harm.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++------------
 dwarves.h     |   1 +
 2 files changed, 81 insertions(+), 20 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index f36150e..44739a9 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -35,7 +35,6 @@
 
 struct elf_function {
 	const char	*name;
-	bool		 generated;
 	size_t		prefixlen;
 };
 
@@ -708,6 +707,9 @@ int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder
 	int32_t i, id;
 	struct btf_var_secinfo *vsi;
 
+	/* saved functions are added to each encoder's BTF prior to it
+	 * being merged with the parent encoder.
+	 */
 	btf_encoder__add_saved_funcs(other);
 	if (encoder == other)
 		return 0;
@@ -795,11 +797,72 @@ static int function__compare(const void *a, const void *b)
 	return strcmp(function__name(fa), function__name(fb));
 }
 
+#define BTF_ENCODER_MAX_PARAMETERS	12
+
 struct btf_encoder_state {
 	struct btf_encoder *encoder;
 	uint32_t type_id_off;
+	bool got_parameter_names;
+	const char *parameter_names[BTF_ENCODER_MAX_PARAMETERS];
 };
 
+static void parameter_names__get(struct ftype *ftype, size_t nr_parameters,
+		     const char **parameter_names)
+{
+	struct parameter *parameter;
+	int i = 0;
+
+	ftype__for_each_parameter(ftype, parameter) {
+		if (i >= nr_parameters)
+			break;
+		parameter_names[i++] = parameter__name(parameter);
+	}
+}
+
+static bool funcs__match(struct function *f1, struct function *f2)
+{
+
+	const char *parameter_names[BTF_ENCODER_MAX_PARAMETERS];
+	struct btf_encoder_state *state = f1->priv;
+	const char *name = function__name(f1);
+	int i;
+
+	if (!state)
+		return false;
+
+	if (f1->proto.nr_parms != f2->proto.nr_parms) {
+		if (state->encoder->verbose)
+			printf("function mismatch for '%s'(%s): %d params != %d params\n",
+			       name, f1->alias ?: name,
+			       f1->proto.nr_parms, f2->proto.nr_parms);
+		return false;
+	}
+
+	if (!state->got_parameter_names) {
+		parameter_names__get(&f1->proto, BTF_ENCODER_MAX_PARAMETERS,
+				     state->parameter_names);
+		state->got_parameter_names = true;
+	}
+	parameter_names__get(&f2->proto, BTF_ENCODER_MAX_PARAMETERS, parameter_names);
+	for (i = 0; i < f1->proto.nr_parms && i < BTF_ENCODER_MAX_PARAMETERS; i++) {
+		if (!state->parameter_names[i]) {
+			if (!parameter_names[i])
+				continue;
+		} else if (parameter_names[i]) {
+			if (strcmp(state->parameter_names[i], parameter_names[i]) == 0)
+				continue;
+		}
+		if (state->encoder->verbose)
+			printf("function mismatch for '%s'(%s): parameter #%d '%s' != '%s'\n",
+			       name, f1->alias ?: name, i,
+			       state->parameter_names[i] ?: "<null>",
+			       parameter_names[i] ?: "<null>");
+
+		return false;
+	}
+	return true;
+}
+
 static void btf_encoder__merge_func(struct btf_encoder *encoder, struct function *fn)
 {
 	struct function **nodep;
@@ -812,6 +875,9 @@ static void btf_encoder__merge_func(struct btf_encoder *encoder, struct function
 	 */
 	fn->proto.optimized_parms |= (*nodep)->proto.optimized_parms;
 	(*nodep)->proto.optimized_parms |= fn->proto.optimized_parms;
+	if ((fn->proto.inconsistent_proto || (*nodep)->proto.inconsistent_proto) ||
+	    !funcs__match(fn, *nodep))
+		(*nodep)->proto.inconsistent_proto = fn->proto.inconsistent_proto = 1;
 	(*nodep)->proto.processed = 1;
 }
 
@@ -822,19 +888,22 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 
 	nodep = tsearch(fn, &encoder->saved_func_tree, function__compare);
 	if (nodep == NULL) {
-		fprintf(stderr, "error: out of memory adding static function '%s'\n",
+		fprintf(stderr, "error: out of memory adding function '%s'\n",
 			name);
 		return -1;
 	}
 	/* If saving and we find an existing entry, we want to merge
 	 * observations across both functions, checking that the
-	 * "seen optimized parameters" status is reflected in our tree entry.
+	 * "seen optimized parameters" and inconsistent prototype
+	 * status is reflected in our tree entry.
 	 * If the entry is new, record encoder state required
 	 * to add the local function later (encoder + type_id_off)
 	 * such that we can add the function later.
 	 */
 	if (*nodep != fn) {
 		(*nodep)->proto.optimized_parms |= fn->proto.optimized_parms;
+		if (!funcs__match(*nodep, fn))
+			(*nodep)->proto.inconsistent_proto = fn->proto.inconsistent_proto = 1;
 	} else {
 		struct btf_encoder_state *state = zalloc(sizeof(*state));
 
@@ -905,12 +974,14 @@ static void btf_encoder__add_saved_func(const void *nodep, const VISIT which,
 			btf_encoder__merge_func(other_encoder, fn);
 	}
 
-	if (fn->proto.optimized_parms) {
+	if (fn->proto.optimized_parms || fn->proto.inconsistent_proto) {
 		if (encoder->verbose) {
 			const char *name = function__name(fn);
 
-			printf("skipping addition of '%s'(%s) due to optimized-out parameters\n",
-			       name, fn->alias ?: name);
+			printf("skipping addition of '%s'(%s) due to %s\n",
+			       name, fn->alias ?: name,
+			       fn->proto.optimized_parms ? "optimized-out parameters" :
+							   "multiple inconsistent function prototypes");
 		}
 	} else {
 		btf_encoder__add_func(encoder, fn);
@@ -991,7 +1062,6 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
 		encoder->functions.suffix_cnt++;
 		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
 	}
-	encoder->functions.entries[encoder->functions.cnt].generated = false;
 	encoder->functions.cnt++;
 	return 0;
 }
@@ -1779,8 +1849,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	}
 
 	cu__for_each_function(cu, core_id, fn) {
-		bool save = false;
-
 		/*
 		 * Skip functions that:
 		 *   - are marked as declarations
@@ -1802,11 +1870,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 
 			/* prefer exact function name match... */
 			func = btf_encoder__find_function(encoder, name, 0);
-			if (func) {
-				if (func->generated)
-					continue;
-				func->generated = true;
-			} else if (encoder->functions.suffix_cnt) {
+			if (!func && encoder->functions.suffix_cnt) {
 				/* falling back to name.isra.0 match if no exact
 				 * match is found; only bother if we found any
 				 * .suffix function names.  The function
@@ -1817,7 +1881,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				func = btf_encoder__find_function(encoder, name,
 								  strlen(name));
 				if (func) {
-					save = true;
+					fn->alias = func->name;
 					if (encoder->verbose)
 						printf("matched function '%s' with '%s'%s\n",
 						       name, func->name,
@@ -1827,16 +1891,12 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 			}
 			if (!func)
 				continue;
-			fn->alias = func->name;
 		} else {
 			if (!fn->external)
 				continue;
 		}
 
-		if (save)
-			err = btf_encoder__save_func(encoder, fn);
-		else
-			err = btf_encoder__add_func(encoder, fn);
+		err = btf_encoder__save_func(encoder, fn);
 		if (err)
 			goto out;
 	}
diff --git a/dwarves.h b/dwarves.h
index 64c7c56..ba94573 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -832,6 +832,7 @@ struct ftype {
 	uint8_t		 unspec_parms:1; /* just one bit is needed */
 	uint8_t		 optimized_parms:1;
 	uint8_t		 processed:1;
+	uint8_t		 inconsistent_proto:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
-- 
1.8.3.1



* Re: [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies
  2023-01-30 14:29 ` [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies Alan Maguire
@ 2023-01-30 17:20   ` Alexei Starovoitov
  2023-01-30 18:08     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2023-01-30 17:20 UTC (permalink / raw)
  To: Alan Maguire
  Cc: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel,
	andrii, songliubraving, john.fastabend, kpsingh, sdf, haoluo,
	martin.lau, bpf

On Mon, Jan 30, 2023 at 02:29:45PM +0000, Alan Maguire wrote:
> There are multiple sources of inconsistency that can result in
> functions of the same name having multiple prototypes:
> 
> - multiple static functions in different CUs share the same name
> - static and external functions share the same name
> 
> Here we attempt to catch such cases by finding inconsistencies
> across CUs using the save/compare/merge mechanisms that were
> previously introduced to handle optimized-out parameters,
> using it for all functions.
> 
> For two instances of a function to be considered consistent:
> 
> - number of parameters must match
> - parameter names must match
> 
> The latter is a less strong method than a full type
> comparison but suffices to match functions.
> 
> With these changes, we see 278 functions removed due to
> prototype inconsistency.  For example, wakeup_show()
> has two distinct prototypes:
> 
> static ssize_t wakeup_show(struct kobject *kobj,
>                            struct kobj_attribute *attr, char *buf)
> (from kernel/irq/irqdesc.c)
> 
> static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>                            char *buf)
> (from drivers/base/power/sysfs.c)
> 
> In some other cases, the parameter comparisons weed out additional
> inconsistencies in "."-suffixed functions across CUs.
> 
> We also see a large number of functions eliminated due to
> optimized-out parameters; 2542 functions are eliminated for this
> reason, both "."-suffixed (1007) and otherwise (1535).

imo it's a good thing.

> Because the save/compare/merge process occurs for all functions
> it is important to assess performance effects.  In addition,
> prior to these changes the number of functions ultimately
> represented in BTF was non-deterministic when pahole was
> run with multiple threads.  This was due to the fact that
> functions were marked as generated on a per-encoder basis
> when first added, and as such the same function could
> be added multiple times for different encoders, and if they
> encountered inconsistent function prototypes, deduplication
> could leave multiple entries in place for the same name.
> When run in a single thread, the "generated" state associated
> with the name would prevent this.
> 
> Here we assess both BTF encoding performance and determinism
> of the function representation in baseline compared to with
> these changes.  Determinism is assessed by counting the
> number of functions in BTF.  Comparisons are done for 1,
> 4 and 8 threads.
> 
> Baseline
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J vmlinux
> 
> real	0m18.160s
> user	0m17.179s
> sys	0m0.757s
> 
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> 51150
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> 51150
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux
> 
> real	0m8.078s
> user	0m17.978s
> sys	0m0.732s
> 
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> 51592
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> 51150
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux
> 
> real	0m7.075s
> user	0m19.010s
> sys	0m0.587s
> 
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> 51683
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> 51150

Ouch. I didn't realize it is so random currently.

> Test:
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J  vmlinux
> 
> real	0m19.039s
> user	0m17.617s
> sys	0m1.419s
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> 49871
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> 49871
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux
> 
> real	0m8.482s
> user	0m18.233s
> sys	0m2.412s
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> 49871
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> 49871
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux
> 
> real	0m7.614s
> user	0m19.384s
> sys	0m3.739s
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> 49871
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> 
> So there is a small cost in performance, but we improve determinism
> and the consistency of representation.

This is a great fix.

I'm not an expert in this code base, but patches look good to me.
Thank you for fixing it.

> For now it is better to have an incomplete representation
> that more accurately reflects the actual function parameters
> used, removing inconsistencies that could otherwise do harm.

+1


* Re: [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies
  2023-01-30 17:20   ` Alexei Starovoitov
@ 2023-01-30 18:08     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-30 18:08 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alan Maguire, yhs, ast, olsajiri, eddyz87, sinquersw, timo,
	daniel, andrii, songliubraving, john.fastabend, kpsingh, sdf,
	haoluo, martin.lau, bpf

Em Mon, Jan 30, 2023 at 09:20:37AM -0800, Alexei Starovoitov escreveu:
> On Mon, Jan 30, 2023 at 02:29:45PM +0000, Alan Maguire wrote:
> > There are multiple sources of inconsistency that can result in
> > functions of the same name having multiple prototypes:
> > 
> > - multiple static functions in different CUs share the same name
> > - static and external functions share the same name
> > 
> > Here we attempt to catch such cases by finding inconsistencies
> > across CUs using the save/compare/merge mechanisms that were
> > previously introduced to handle optimized-out parameters,
> > using it for all functions.
> > 
> > For two instances of a function to be considered consistent:
> > 
> > - number of parameters must match
> > - parameter names must match
> > 
> > The latter is a less strong method than a full type
> > comparison but suffices to match functions.
> > 
> > With these changes, we see 278 functions removed due to
> > > prototype inconsistency.  For example, wakeup_show()
> > has two distinct prototypes:
> > 
> > static ssize_t wakeup_show(struct kobject *kobj,
> >                            struct kobj_attribute *attr, char *buf)
> > (from kernel/irq/irqdesc.c)
> > 
> > static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
> >                            char *buf)
> > (from drivers/base/power/sysfs.c)
> > 
> > In some other cases, the parameter comparisons weed out additional
> > inconsistencies in "."-suffixed functions across CUs.
> > 
> > We also see a large number of functions eliminated due to
> > optimized-out parameters; 2542 functions are eliminated for this
> > reason, both "."-suffixed (1007) and otherwise (1535).
> 
> imo it's a good thing.
> 
> > Because the save/compare/merge process occurs for all functions
> > it is important to assess performance effects.  In addition,
> > prior to these changes the number of functions ultimately
> > represented in BTF was non-deterministic when pahole was
> > run with multiple threads.  This was due to the fact that
> > functions were marked as generated on a per-encoder basis
> > when first added, and as such the same function could
> > be added multiple times for different encoders, and if they
> > encountered inconsistent function prototypes, deduplication
> > could leave multiple entries in place for the same name.
> > When run in a single thread, the "generated" state associated
> > with the name would prevent this.
> > 
> > Here we assess both BTF encoding performance and determinism
> > of the function representation in baseline compared to with
> > these changes.  Determinism is assessed by counting the
> > number of functions in BTF.  Comparisons are done for 1,
> > 4 and 8 threads.
> > 
> > Baseline
> > 
> > $ time LLVM_OBJCOPY=objcopy pahole -J vmlinux
> > 
> > real	0m18.160s
> > user	0m17.179s
> > sys	0m0.757s
> > 
> > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> > 51150
> > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> > 51150
> > 
> > $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux
> > 
> > real	0m8.078s
> > user	0m17.978s
> > sys	0m0.732s
> > 
> > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> > 51592
> > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> > 51150
> > 
> > $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux
> > 
> > real	0m7.075s
> > user	0m19.010s
> > sys	0m0.587s
> > 
> > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> > 51683
> > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> > 51150
> 
> Ouch. I didn't realize it is so random currently.
> 
> > Test:
> > 
> > $ time LLVM_OBJCOPY=objcopy pahole -J  vmlinux
> > 
> > real	0m19.039s
> > user	0m17.617s
> > sys	0m1.419s
> > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> > 49871
> > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> > 49871
> > 
> > $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux
> > 
> > real	0m8.482s
> > user	0m18.233s
> > sys	0m2.412s
> > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> > 49871
> > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> > 49871
> > 
> > $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux
> > 
> > real	0m7.614s
> > user	0m19.384s
> > sys	0m3.739s
> > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> > 49871
> > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> > 
> > So there is a small cost in performance, but we improve determinism
> > and the consistency of representation.
> 
> This is a great fix.
> 
> I'm not an expert in this code base, but patches look good to me.
> Thank you for fixing it.

And all the description of the problem and of the solution, limitations,
together with a summary of the review comments: it's a pleasure to
process a patch series like this one :-)

Doing that now and performing the usual tests,

Thanks,

- Arnaldo
 
> > For now it is better to have an incomplete representation
> > that more accurately reflects the actual function parameters
> > used, removing inconsistencies that could otherwise do harm.
> 
> +1

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-30 14:29 ` [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters Alan Maguire
@ 2023-01-30 18:36   ` Arnaldo Carvalho de Melo
  2023-01-30 20:10     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-30 18:36 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 02:29:41PM +0000, Alan Maguire escreveu:
> Compilation generates DWARF at several stages, and often the
> later DWARF representations more accurately represent optimizations
> that have occurred during compilation.
> 
> In particular, parameter representations can be spotted by their
> abstract origin references to the original parameter, but they
> often have more accurate location information.  In most cases,
> the parameter locations will match calling conventions, and be
> registers for the first 6 parameters on x86_64, first 8 on ARM64
> etc.  If the parameter is not a register when it should be however,
> it is likely passed via the stack or the compiler has used a
> constant representation instead.  The latter can often be
> spotted by checking for a DW_AT_const_value attribute,
> as noted by Eduard.
> 
> In addition, absence of a location tag (either across
> the abstract origin reference and the original parameter,
> or in the standalone parameter description) is evidence of
> an optimized-out parameter.  Presence of a location tag
> is stored in the parameter description and shared between
> abstract tags and their original referents.
> 
> This change adds a field to parameters and their associated
> ftype to note if a parameter has been optimized out.  Having
> this information allows us to skip such functions, as their
> presence in CUs makes BTF encoding impossible.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  dwarf_loader.c | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
>  dwarves.h      |   5 ++-
>  2 files changed, 122 insertions(+), 8 deletions(-)
> 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 5a74035..93c2307 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -992,13 +992,98 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
>  	return member;
>  }
>  
> -static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
> +/* How many function parameters are passed via registers?  Used below in
> + * determining if an argument has been optimized out or if it is simply
> + * an argument > NR_REGISTER_PARAMS.  Setting NR_REGISTER_PARAMS to 0
> + * allows unsupported architectures to skip tagging optimized-out
> + * values.
> + */
> +#if defined(__x86_64__)
> +#define NR_REGISTER_PARAMS      6
> +#elif defined(__s390__)
> +#define NR_REGISTER_PARAMS	5
> +#elif defined(__aarch64__)
> +#define NR_REGISTER_PARAMS      8
> +#elif defined(__mips__)
> +#define NR_REGISTER_PARAMS	8
> +#elif defined(__powerpc__)
> +#define NR_REGISTER_PARAMS	8
> +#elif defined(__sparc__)
> +#define NR_REGISTER_PARAMS	6
> +#elif defined(__riscv) && __riscv_xlen == 64
> +#define NR_REGISTER_PARAMS	8
> +#elif defined(__arc__)
> +#define NR_REGISTER_PARAMS	8
> +#else
> +#define NR_REGISTER_PARAMS      0
> +#endif

This should be done as a function, something like:

int cu__nr_register_params(struct cu *cu)
{
	GElf_Ehdr ehdr;

	gelf_getehdr(cu->elf, &ehdr);

	switch (ehdr.e_machine) {
	...

}

I'm coding that now, will send the diff shortly.

This is to support cross-builds.

- Arnaldo
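
To illustrate why the ELF header is the right source for cross-builds: the
target machine is recorded in the object being processed, independent of the
host that pahole was compiled on. Below is a minimal sketch, assuming only
<elf.h>; elf_image__machine() is a hypothetical helper name and is not part
of pahole, which would go through libelf's gelf_getehdr() instead.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper, not part of pahole: return e_machine from the start
 * of an ELF image, or -1 if the buffer does not hold a usable ELF header.
 */
static int elf_image__machine(const unsigned char *buf, size_t len)
{
	/* e_ident (16 bytes) and e_type (2 bytes) precede e_machine (2 bytes),
	 * so e_machine sits at offset 18 for both ELF32 and ELF64.
	 */
	if (len < 20 ||
	    buf[EI_MAG0] != ELFMAG0 || buf[EI_MAG1] != ELFMAG1 ||
	    buf[EI_MAG2] != ELFMAG2 || buf[EI_MAG3] != ELFMAG3)
		return -1;

	/* The field is stored in the byte order of the target, not the host. */
	switch (buf[EI_DATA]) {
	case ELFDATA2LSB: return buf[18] | (buf[19] << 8);
	case ELFDATA2MSB: return (buf[18] << 8) | buf[19];
	default:	  return -1;
	}
}
```

A big-endian s390 vmlinux inspected on an x86_64 host would yield EM_S390
here, whereas a compile-time check such as defined(__x86_64__) only
describes the host.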

> +
> +static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
> +					struct conf_load *conf, int param_idx)
>  {
>  	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
>  
>  	if (parm != NULL) {
> +		bool has_const_value;
> +		Dwarf_Attribute attr;
> +		struct location loc;
> +
>  		tag__init(&parm->tag, cu, die);
>  		parm->name = attr_string(die, DW_AT_name, conf);
> +
> +		if (param_idx >= NR_REGISTER_PARAMS)
> +			return parm;
> +		/* Parameters which use DW_AT_abstract_origin to point at
> +		 * the original parameter definition (with no name in the DIE)
> +		 * are the result of later DWARF generation during compilation
> +		 * so often better take into account if arguments were
> +		 * optimized out.
> +		 *
> +		 * By checking that locations for parameters that are expected
> +		 * to be passed as registers are actually passed as registers,
> +		 * we can spot optimized-out parameters.
> +		 *
> +		 * It can also be the case that a parameter DIE has
> +		 * a constant value attribute reflecting optimization or
> +		 * has no location attribute.
> +		 *
> +		 * From the DWARF spec:
> +		 *
> +		 * "4.1.10
> +		 *
> +		 * A DW_AT_const_value attribute for an entry describing a
> +		 * variable or formal parameter whose value is constant and not
> +		 * represented by an object in the address space of the program,
> +		 * or an entry describing a named constant. (Note
> +		 * that such an entry does not have a location attribute.)"
> +		 *
> +		 * So we can also use the absence of a location for a parameter
> +		 * as evidence it has been optimized out.  This info will
> +		 * need to be shared between a parameter and any abstract
> +		 * origin references however, since gcc can have location
> +		 * information in the parameter that refers back to the original
> +		 * via abstract origin, so we need to share location presence
> +		 * between these parameter representations.  See
> +		 * ftype__recode_dwarf_types() below for how this is handled.
> +		 */
> +		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
> +		has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
> +		if (parm->has_loc &&
> +		    attr_location(die, &loc.expr, &loc.exprlen) == 0 &&
> +			loc.exprlen != 0) {
> +			Dwarf_Op *expr = loc.expr;
> +
> +			switch (expr->atom) {
> +			case DW_OP_reg1 ... DW_OP_reg31:
> +			case DW_OP_breg0 ... DW_OP_breg31:
> +				break;
> +			default:
> +				parm->optimized = 1;
> +				break;
> +			}
> +		} else if (has_const_value) {
> +			parm->optimized = 1;
> +		}
>  	}
>  
>  	return parm;
> @@ -1450,7 +1535,7 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
>  					     struct cu *cu, struct conf_load *conf,
>  					     int param_idx)
>  {
> -	struct parameter *parm = parameter__new(die, cu, conf);
> +	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
>  
>  	if (parm == NULL)
>  		return NULL;
> @@ -2194,6 +2279,7 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
>  
>  	ftype__for_each_parameter(type, pos) {
>  		struct dwarf_tag *dpos = pos->tag.priv;
> +		struct parameter *opos;
>  		struct dwarf_tag *dtype;
>  
>  		if (dpos->type.off == 0) {
> @@ -2207,8 +2293,18 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
>  				tag__print_abstract_origin_not_found(&pos->tag);
>  				continue;
>  			}
> -			pos->name = tag__parameter(dtype->tag)->name;
> +			opos = tag__parameter(dtype->tag);
> +			pos->name = opos->name;
>  			pos->tag.type = dtype->tag->type;
> +			/* share location information between parameter and
> +			 * abstract origin; if neither have location, we will
> +			 * mark the parameter as optimized out.
> +			 */
> +			if (pos->has_loc)
> +				opos->has_loc = pos->has_loc;
> +
> +			if (pos->optimized)
> +				opos->optimized = pos->optimized;
>  			continue;
>  		}
>  
> @@ -2478,18 +2574,33 @@ out:
>  	return 0;
>  }
>  
> -static int cu__resolve_func_ret_types(struct cu *cu)
> +static int cu__resolve_func_ret_types_optimized(struct cu *cu)
>  {
>  	struct ptr_table *pt = &cu->functions_table;
>  	uint32_t i;
>  
>  	for (i = 0; i < pt->nr_entries; ++i) {
>  		struct tag *tag = pt->entries[i];
> +		struct parameter *pos;
> +		struct function *fn = tag__function(tag);
> +
> +		/* mark function as optimized if parameter is, or
> +		 * if parameter does not have a location; at this
> +		 * point location presence has been marked in
> +		 * abstract origins for cases where a parameter
> +		 * location is not stored in the original function
> +		 * parameter tag.
> +		 */
> +		ftype__for_each_parameter(&fn->proto, pos) {
> +			if (pos->optimized || !pos->has_loc) {
> +				fn->proto.optimized_parms = 1;
> +				break;
> +			}
> +		}
>  
>  		if (tag == NULL || tag->type != 0)
>  			continue;
>  
> -		struct function *fn = tag__function(tag);
>  		if (!fn->abstract_origin)
>  			continue;
>  
> @@ -2612,7 +2723,7 @@ static int die__process_and_recode(Dwarf_Die *die, struct cu *cu, struct conf_lo
>  	if (ret != 0)
>  		return ret;
>  
> -	return cu__resolve_func_ret_types(cu);
> +	return cu__resolve_func_ret_types_optimized(cu);
>  }
>  
>  static int class_member__cache_byte_size(struct tag *tag, struct cu *cu,
> @@ -3132,7 +3243,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
>  	 * encoded in another subprogram through abstract_origin
>  	 * tag. Let us visit all subprograms again to resolve this.
>  	 */
> -	if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT)
> +	if (cu__resolve_func_ret_types_optimized(cu) != LSK__KEEPIT)
>  		goto out_abort;
>  
>  	if (cus__finalize(cus, cu, conf, NULL) == LSK__STOP_LOADING)
> diff --git a/dwarves.h b/dwarves.h
> index 589588e..2723466 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -808,6 +808,8 @@ size_t lexblock__fprintf(const struct lexblock *lexblock, const struct cu *cu,
>  struct parameter {
>  	struct tag tag;
>  	const char *name;
> +	uint8_t optimized:1;
> +	uint8_t has_loc:1;
>  };
>  
>  static inline struct parameter *tag__parameter(const struct tag *tag)
> @@ -827,7 +829,8 @@ struct ftype {
>  	struct tag	 tag;
>  	struct list_head parms;
>  	uint16_t	 nr_parms;
> -	uint8_t		 unspec_parms; /* just one bit is needed */
> +	uint8_t		 unspec_parms:1; /* just one bit is needed */
> +	uint8_t		 optimized_parms:1;
>  };
>  
>  static inline struct ftype *tag__ftype(const struct tag *tag)
> -- 
> 1.8.3.1
> 

-- 

- Arnaldo
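
For reference, the per-function check that the quoted hunk adds in
cu__resolve_func_ret_types_optimized() can be sketched in isolation as
below. This is a simplified illustration, not the pahole code: struct
param_info and proto__has_optimized_parms() are hypothetical names, and the
array stands in for the parameter list walked by ftype__for_each_parameter().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two bits the patch adds to struct parameter. */
struct param_info {
	bool optimized;	/* location is not a register, or DW_AT_const_value seen */
	bool has_loc;	/* DW_AT_location present (possibly via abstract origin) */
};

/* A prototype is flagged as soon as one parameter is marked optimized
 * or has no location at all.
 */
static bool proto__has_optimized_parms(const struct param_info *parms,
				       size_t nr_parms)
{
	for (size_t i = 0; i < nr_parms; i++) {
		if (parms[i].optimized || !parms[i].has_loc)
			return true;
	}
	return false;
}
```

A function with no parameters is never flagged by this rule, since the loop
body does not run.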


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-30 18:36   ` Arnaldo Carvalho de Melo
@ 2023-01-30 20:10     ` Arnaldo Carvalho de Melo
  2023-01-30 20:23       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-30 20:10 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 03:36:09PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 30, 2023 at 02:29:41PM +0000, Alan Maguire escreveu:
> > Compilation generates DWARF at several stages, and often the
> > later DWARF representations more accurately represent optimizations
> > that have occurred during compilation.
> > 
> > In particular, parameter representations can be spotted by their
> > abstract origin references to the original parameter, but they
> > often have more accurate location information.  In most cases,
> > the parameter locations will match calling conventions, and be
> > registers for the first 6 parameters on x86_64, first 8 on ARM64
> > etc.  If the parameter is not a register when it should be however,
> > it is likely passed via the stack or the compiler has used a
> > constant representation instead.  The latter can often be
> > spotted by checking for a DW_AT_const_value attribute,
> > as noted by Eduard.
> > 
> > In addition, absence of a location tag (either across
> > the abstract origin reference and the original parameter,
> > or in the standalone parameter description) is evidence of
> > an optimized-out parameter.  Presence of a location tag
> > is stored in the parameter description and shared between
> > abstract tags and their original referents.
> > 
> > This change adds a field to parameters and their associated
> > ftype to note if a parameter has been optimized out.  Having
> > this information allows us to skip such functions, as their
> > presence in CUs makes BTF encoding impossible.
> > 
> > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > ---
> >  dwarf_loader.c | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
> >  dwarves.h      |   5 ++-
> >  2 files changed, 122 insertions(+), 8 deletions(-)
> > 
> > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > index 5a74035..93c2307 100644
> > --- a/dwarf_loader.c
> > +++ b/dwarf_loader.c
> > @@ -992,13 +992,98 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
> >  	return member;
> >  }
> >  
> > -static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
> > +/* How many function parameters are passed via registers?  Used below in
> > + * determining if an argument has been optimized out or if it is simply
> > + * an argument > NR_REGISTER_PARAMS.  Setting NR_REGISTER_PARAMS to 0
> > + * allows unsupported architectures to skip tagging optimized-out
> > + * values.
> > + */
> > +#if defined(__x86_64__)
> > +#define NR_REGISTER_PARAMS      6
> > +#elif defined(__s390__)
> > +#define NR_REGISTER_PARAMS	5
> > +#elif defined(__aarch64__)
> > +#define NR_REGISTER_PARAMS      8
> > +#elif defined(__mips__)
> > +#define NR_REGISTER_PARAMS	8
> > +#elif defined(__powerpc__)
> > +#define NR_REGISTER_PARAMS	8
> > +#elif defined(__sparc__)
> > +#define NR_REGISTER_PARAMS	6
> > +#elif defined(__riscv) && __riscv_xlen == 64
> > +#define NR_REGISTER_PARAMS	8
> > +#elif defined(__arc__)
> > +#define NR_REGISTER_PARAMS	8
> > +#else
> > +#define NR_REGISTER_PARAMS      0
> > +#endif
> 
> This should be done as a function, something like:
> 
> int cu__nr_register_params(struct cu *cu)
> {
> 	GElf_Ehdr ehdr;
> 
> 	gelf_getehdr(cu->elf, &ehdr);
> 
> 	switch (ehdr.e_machine) {
> 	...
> 
> }
> 
> I'm coding that now, will send the diff shortly.
> 
> This is to support cross-builds.

I made this change to this patch, please check.

Thanks,

- Arnaldo

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 752a3c1afc4494f2..81963e71715c8435 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -994,29 +994,29 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
 
 /* How many function parameters are passed via registers?  Used below in
  * determining if an argument has been optimized out or if it is simply
- * an argument > NR_REGISTER_PARAMS.  Setting NR_REGISTER_PARAMS to 0
- * allows unsupported architectures to skip tagging optimized-out
+ * an argument > cu__nr_register_params().  Making cu__nr_register_params()
+ * return 0 allows unsupported architectures to skip tagging optimized-out
  * values.
  */
-#if defined(__x86_64__)
-#define NR_REGISTER_PARAMS      6
-#elif defined(__s390__)
-#define NR_REGISTER_PARAMS	5
-#elif defined(__aarch64__)
-#define NR_REGISTER_PARAMS      8
-#elif defined(__mips__)
-#define NR_REGISTER_PARAMS	8
-#elif defined(__powerpc__)
-#define NR_REGISTER_PARAMS	8
-#elif defined(__sparc__)
-#define NR_REGISTER_PARAMS	6
-#elif defined(__riscv) && __riscv_xlen == 64
-#define NR_REGISTER_PARAMS	8
-#elif defined(__arc__)
-#define NR_REGISTER_PARAMS	8
-#else
-#define NR_REGISTER_PARAMS      0
-#endif
+static int arch__nr_register_params(const GElf_Ehdr *ehdr)
+{
+	switch (ehdr->e_machine) {
+	case EM_S390:	 return 5;
+	case EM_SPARC:
+	case EM_SPARCV9:
+	case EM_X86_64:	 return 6;
+	case EM_AARCH64:
+	case EM_ARC:
+	case EM_ARM:
+	case EM_MIPS:
+	case EM_PPC:
+	case EM_PPC64:
+	case EM_RISCV:	 return 8;
+	default:	 break;
+	}
+
+	return 0;
+}
 
 static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 					struct conf_load *conf, int param_idx)
@@ -1031,7 +1031,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
 
-		if (param_idx >= NR_REGISTER_PARAMS)
+		if (param_idx >= cu->nr_register_params)
 			return parm;
 		/* Parameters which use DW_AT_abstract_origin to point at
 		 * the original parameter definition (with no name in the DIE)
@@ -2870,6 +2870,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
 		return DWARF_CB_ABORT;
 
 	cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
+	cu->nr_register_params = arch__nr_register_params(&ehdr);
 	return 0;
 }
 
diff --git a/dwarves.h b/dwarves.h
index fd1ca3ae9f4ab531..ddf56f0124e0ec03 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -262,6 +262,7 @@ struct cu {
 	uint8_t		 has_addr_info:1;
 	uint8_t		 uses_global_strings:1;
 	uint8_t		 little_endian:1;
+	uint8_t		 nr_register_params;
 	uint16_t	 language;
 	unsigned long	 nr_inline_expansions;
 	size_t		 size_inline_expansions;


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-30 20:10     ` Arnaldo Carvalho de Melo
@ 2023-01-30 20:23       ` Arnaldo Carvalho de Melo
  2023-01-30 22:37         ` Alan Maguire
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-30 20:23 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 30, 2023 at 03:36:09PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > +#define NR_REGISTER_PARAMS	8
> > > +#elif defined(__arc__)
> > > +#define NR_REGISTER_PARAMS	8
> > > +#else
> > > +#define NR_REGISTER_PARAMS      0
> > > +#endif
> > 
> > This should be done as a function, something like:
> > 
> > int cu__nr_register_params(struct cu *cu)
> > {
> > 	GElf_Ehdr ehdr;
> > 
> > 	gelf_getehdr(cu->elf, &ehdr);
> > 
> > 	switch (ehdr.machine) {
> > 	...
> > 
> > }
> > 
> > I'm coding that now, will send the diff shortly.
> > 
> > This is to support cross-builds.
> 
> I made this change to this patch, please check.

And added this to that cset:

Committer notes:

Changed the NR_REGISTER_PARAMS definition from an #if/#elif/#endif chain
for the native architecture into a function that uses the ELF header
e_machine field to find the target architecture, to allow for cross builds.

---

- Arnaldo

> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 752a3c1afc4494f2..81963e71715c8435 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -994,29 +994,29 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
>  
>  /* How many function parameters are passed via registers?  Used below in
>   * determining if an argument has been optimized out or if it is simply
> - * an argument > NR_REGISTER_PARAMS.  Setting NR_REGISTER_PARAMS to 0
> - * allows unsupported architectures to skip tagging optimized-out
> + * an argument > cu__nr_register_params().  Making cu__nr_register_params()
> + * return 0 allows unsupported architectures to skip tagging optimized-out
>   * values.
>   */
> -#if defined(__x86_64__)
> -#define NR_REGISTER_PARAMS      6
> -#elif defined(__s390__)
> -#define NR_REGISTER_PARAMS	5
> -#elif defined(__aarch64__)
> -#define NR_REGISTER_PARAMS      8
> -#elif defined(__mips__)
> -#define NR_REGISTER_PARAMS	8
> -#elif defined(__powerpc__)
> -#define NR_REGISTER_PARAMS	8
> -#elif defined(__sparc__)
> -#define NR_REGISTER_PARAMS	6
> -#elif defined(__riscv) && __riscv_xlen == 64
> -#define NR_REGISTER_PARAMS	8
> -#elif defined(__arc__)
> -#define NR_REGISTER_PARAMS	8
> -#else
> -#define NR_REGISTER_PARAMS      0
> -#endif
> +static int arch__nr_register_params(const GElf_Ehdr *ehdr)
> +{
> +	switch (ehdr->e_machine) {
> +	case EM_S390:	 return 5;
> +	case EM_SPARC:
> +	case EM_SPARCV9:
> +	case EM_X86_64:	 return 6;
> +	case EM_AARCH64:
> +	case EM_ARC:
> +	case EM_ARM:
> +	case EM_MIPS:
> +	case EM_PPC:
> +	case EM_PPC64:
> +	case EM_RISCV:	 return 8;
> +	default:	 break;
> +	}
> +
> +	return 0;
> +}
>  
>  static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>  					struct conf_load *conf, int param_idx)
> @@ -1031,7 +1031,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>  		tag__init(&parm->tag, cu, die);
>  		parm->name = attr_string(die, DW_AT_name, conf);
>  
> -		if (param_idx >= NR_REGISTER_PARAMS)
> +		if (param_idx >= cu->nr_register_params)
>  			return parm;
>  		/* Parameters which use DW_AT_abstract_origin to point at
>  		 * the original parameter definition (with no name in the DIE)
> @@ -2870,6 +2870,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
>  		return DWARF_CB_ABORT;
>  
>  	cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
> +	cu->nr_register_params = arch__nr_register_params(&ehdr);
>  	return 0;
>  }
>  
> diff --git a/dwarves.h b/dwarves.h
> index fd1ca3ae9f4ab531..ddf56f0124e0ec03 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -262,6 +262,7 @@ struct cu {
>  	uint8_t		 has_addr_info:1;
>  	uint8_t		 uses_global_strings:1;
>  	uint8_t		 little_endian:1;
> +	uint8_t		 nr_register_params;
>  	uint16_t	 language;
>  	unsigned long	 nr_inline_expansions;
>  	size_t		 size_inline_expansions;

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders
  2023-01-30 14:29 ` [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
@ 2023-01-30 22:04   ` Jiri Olsa
  2023-01-31  0:24     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Jiri Olsa @ 2023-01-30 22:04 UTC (permalink / raw)
  To: Alan Maguire
  Cc: acme, yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel,
	andrii, songliubraving, john.fastabend, kpsingh, sdf, haoluo,
	martin.lau, bpf

On Mon, Jan 30, 2023 at 02:29:43PM +0000, Alan Maguire wrote:
> To coordinate across multiple encoders at collection time, there
> will be a need to access the set of encoders.  Rework the unused
> btf_encoders__*() API to facilitate this.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  btf_encoder.c | 30 ++++++++++++++++++++++--------
>  btf_encoder.h |  6 ------
>  2 files changed, 22 insertions(+), 14 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 44f1905..e20b628 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -30,6 +30,7 @@
>  
>  #include <errno.h>
>  #include <stdint.h>
> +#include <pthread.h>
>  
>  struct elf_function {
>  	const char	*name;
> @@ -79,21 +80,32 @@ struct btf_encoder {
>  	} functions;
>  };
>  
> -void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder)
> -{
> -	list_add_tail(&encoder->node, encoders);
> -}
> +static LIST_HEAD(encoders);
> +static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
>  
> -struct btf_encoder *btf_encoders__first(struct list_head *encoders)
> +/* mutex only needed for add/delete, as this can happen in multiple encoding
> + * threads.  Traversal of the list is currently confined to thread collection.
> + */
> +static void btf_encoders__add(struct btf_encoder *encoder)
>  {
> -	return list_first_entry(encoders, struct btf_encoder, node);
> +	pthread_mutex_lock(&encoders__lock);
> +	list_add_tail(&encoder->node, &encoders);
> +	pthread_mutex_unlock(&encoders__lock);
>  }
>  
> -struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder)
> +#define btf_encoders__for_each_encoder(encoder)		\
> +	list_for_each_entry(encoder, &encoders, node)
> +
> +static void btf_encoders__delete(struct btf_encoder *encoder)
>  {
> -	return list_next_entry(encoder, node);
> +	pthread_mutex_lock(&encoders__lock);
> +	list_del(&encoder->node);
> +	pthread_mutex_unlock(&encoders__lock);
>  }
>  
> +#define btf_encoders__for_each_encoder(encoder)			\
> +	list_for_each_entry(encoder, &encoders, node)
> +

there's an extra btf_encoders__for_each_encoder define

hmm, I'm scratching my head over how this compiles, probably because the two definitions are identical

jirka
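
That guess matches the C rules as far as I know: a function-like macro may
be redefined if the new definition has the same parameters and an identical
replacement list, with differences in whitespace amount not counting. A
minimal sketch with a made-up macro (for_each_index and sum_below are
illustrative names, not from pahole):

```c
/* First definition. */
#define for_each_index(i, n)		\
	for ((i) = 0; (i) < (n); ++(i))

/* Identical redefinition: same parameters, same replacement list, only the
 * whitespace before the line continuation differs.  Accepted silently.
 */
#define for_each_index(i, n)			\
	for ((i) = 0; (i) < (n); ++(i))

/* Sum of 0 .. n-1, using the macro to show it still expands as expected. */
static int sum_below(int n)
{
	int i, sum = 0;

	for_each_index(i, n)
		sum += i;

	return sum;
}
```

Changing a token in the second definition (for example ++(i) to (i)++)
would make it an incompatible redefinition, which compilers diagnose.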


>  #define PERCPU_SECTION ".data..percpu"
>  
>  /*
> @@ -1505,6 +1517,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
>  
>  		if (encoder->verbose)
>  			printf("File %s:\n", cu->filename);
> +		btf_encoders__add(encoder);
>  	}
>  out:
>  	return encoder;
> @@ -1519,6 +1532,7 @@ void btf_encoder__delete(struct btf_encoder *encoder)
>  	if (encoder == NULL)
>  		return;
>  
> +	btf_encoders__delete(encoder);
>  	__gobuffer__delete(&encoder->percpu_secinfo);
>  	zfree(&encoder->filename);
>  	btf__free(encoder->btf);
> diff --git a/btf_encoder.h b/btf_encoder.h
> index a65120c..34516bb 100644
> --- a/btf_encoder.h
> +++ b/btf_encoder.h
> @@ -23,12 +23,6 @@ int btf_encoder__encode(struct btf_encoder *encoder);
>  
>  int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
>  
> -void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder);
> -
> -struct btf_encoder *btf_encoders__first(struct list_head *encoders);
> -
> -struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder);
> -
>  struct btf *btf_encoder__btf(struct btf_encoder *encoder);
>  
>  int btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other);
> -- 
> 1.8.3.1
> 


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-30 20:23       ` Arnaldo Carvalho de Melo
@ 2023-01-30 22:37         ` Alan Maguire
  2023-01-31  0:25           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-01-30 22:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Jan 30, 2023 at 03:36:09PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>> +#define NR_REGISTER_PARAMS	8
>>>> +#elif defined(__arc__)
>>>> +#define NR_REGISTER_PARAMS	8
>>>> +#else
>>>> +#define NR_REGISTER_PARAMS      0
>>>> +#endif
>>>
>>> This should be done as a function, something like:
>>>
>>> int cu__nr_register_params(struct cu *cu)
>>> {
>>> 	GElf_Ehdr ehdr;
>>>
>>> 	gelf_getehdr(cu->elf, &ehdr);
>>>
>>> 	switch (ehdr.e_machine) {
>>> 	...
>>>
>>> }
>>>
>>> I'm coding that now, will send the diff shortly.
>>>
>>> This is to support cross-builds.
>>
>> I made this change to this patch, please check.
> 
> And added this to that cset:
> 
> Committer notes:
> 
> Changed the NR_REGISTER_PARAMS definition from a if/elif/endif for the
> native architecture into a function that uses the ELF header e_machine
> to find the target architecture, to allow for cross builds. 
> 
> ---
> 
> - Arnaldo
> 
>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>> index 752a3c1afc4494f2..81963e71715c8435 100644
>> --- a/dwarf_loader.c
>> +++ b/dwarf_loader.c
>> @@ -994,29 +994,29 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
>>  
>>  /* How many function parameters are passed via registers?  Used below in
>>   * determining if an argument has been optimized out or if it is simply
>> - * an argument > NR_REGISTER_PARAMS.  Setting NR_REGISTER_PARAMS to 0
>> - * allows unsupported architectures to skip tagging optimized-out
>> + * an argument > cu__nr_register_params().  Making cu__nr_register_params()
>> + * return 0 allows unsupported architectures to skip tagging optimized-out
>>   * values.
>>   */
>> -#if defined(__x86_64__)
>> -#define NR_REGISTER_PARAMS      6
>> -#elif defined(__s390__)
>> -#define NR_REGISTER_PARAMS	5
>> -#elif defined(__aarch64__)
>> -#define NR_REGISTER_PARAMS      8
>> -#elif defined(__mips__)
>> -#define NR_REGISTER_PARAMS	8
>> -#elif defined(__powerpc__)
>> -#define NR_REGISTER_PARAMS	8
>> -#elif defined(__sparc__)
>> -#define NR_REGISTER_PARAMS	6
>> -#elif defined(__riscv) && __riscv_xlen == 64
>> -#define NR_REGISTER_PARAMS	8
>> -#elif defined(__arc__)
>> -#define NR_REGISTER_PARAMS	8
>> -#else
>> -#define NR_REGISTER_PARAMS      0
>> -#endif
>> +static int arch__nr_register_params(const GElf_Ehdr *ehdr)
>> +{
>> +	switch (ehdr->e_machine) {
>> +	case EM_S390:	 return 5;
>> +	case EM_SPARC:
>> +	case EM_SPARCV9:
>> +	case EM_X86_64:	 return 6;
>> +	case EM_AARCH64:
>> +	case EM_ARC:
>> +	case EM_ARM:
>> +	case EM_MIPS:
>> +	case EM_PPC:
>> +	case EM_PPC64:
>> +	case EM_RISCV:	 return 8;
>> +	default:	 break;
>> +	}
>> +
>> +	return 0;
>> +}
>>  
>>  static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>  					struct conf_load *conf, int param_idx)
>> @@ -1031,7 +1031,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>  		tag__init(&parm->tag, cu, die);
>>  		parm->name = attr_string(die, DW_AT_name, conf);
>>  
>> -		if (param_idx >= NR_REGISTER_PARAMS)
>> +		if (param_idx >= cu->nr_register_params)
>>  			return parm;
>>  		/* Parameters which use DW_AT_abstract_origin to point at
>>  		 * the original parameter definition (with no name in the DIE)
>> @@ -2870,6 +2870,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
>>  		return DWARF_CB_ABORT;
>>  
>>  	cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
>> +	cu->nr_register_params = arch__nr_register_params(&ehdr);
>>  	return 0;
>>  }
>>  
>> diff --git a/dwarves.h b/dwarves.h
>> index fd1ca3ae9f4ab531..ddf56f0124e0ec03 100644
>> --- a/dwarves.h
>> +++ b/dwarves.h
>> @@ -262,6 +262,7 @@ struct cu {
>>  	uint8_t		 has_addr_info:1;
>>  	uint8_t		 uses_global_strings:1;
>>  	uint8_t		 little_endian:1;
>> +	uint8_t		 nr_register_params;
>>  	uint16_t	 language;
>>  	unsigned long	 nr_inline_expansions;
>>  	size_t		 size_inline_expansions;
> 

Thanks for this, never thought of cross-builds to be honest!
Tested just now on x86_64 and aarch64 at my end, just ran
into one small thing on one system; turns out EM_RISCV isn't
defined if using a very old elf.h; below works around this
(dwarves otherwise builds fine on this system).

diff --git a/dwarf_loader.c b/dwarf_loader.c
index dba2d37..47a3bc2 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -992,6 +992,11 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *c
        return member;
 }
 
+/* for older elf.h */
+#ifndef EM_RISCV
+#define EM_RISCV       243
+#endif
+
 /* How many function parameters are passed via registers?  Used below in
  * determining if an argument has been optimized out or if it is simply
  * an argument > cu__nr_register_params().  Making cu__nr_register_params()
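
Putting the fallback define and the e_machine lookup together, here is
a standalone sketch. It assumes only <elf.h>; the helper name
nr_register_params_for() is made up for illustration, and it takes the
raw e_machine value rather than a GElf_Ehdr so that it builds without
libelf:

```c
#include <elf.h>
#include <stdint.h>

/* for older elf.h */
#ifndef EM_RISCV
#define EM_RISCV	243
#endif

/* Hypothetical helper mirroring the switch in the quoted patch: map the
 * ELF target machine to the number of parameters passed in registers.
 * Returning 0 for unknown machines disables optimized-out tagging.
 */
static int nr_register_params_for(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_S390:	 return 5;
	case EM_SPARC:
	case EM_SPARCV9:
	case EM_X86_64:	 return 6;
	case EM_AARCH64:
	case EM_ARC:
	case EM_ARM:
	case EM_MIPS:
	case EM_PPC:
	case EM_PPC64:
	case EM_RISCV:	 return 8;
	default:	 break;
	}

	return 0;
}
```

Because the lookup keys off the target's ELF header rather than the
host's predefined macros, the same binary gives the right answer for a
cross-built object.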

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders
  2023-01-30 22:04   ` Jiri Olsa
@ 2023-01-31  0:24     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-31  0:24 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alan Maguire, yhs, ast, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 11:04:20PM +0100, Jiri Olsa escreveu:
> On Mon, Jan 30, 2023 at 02:29:43PM +0000, Alan Maguire wrote:
> > To coordinate across multiple encoders at collection time, there
> > will be a need to access the set of encoders.  Rework the unused
> > btf_encoders__*() API to facilitate this.
> > 
> > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > ---
> >  btf_encoder.c | 30 ++++++++++++++++++++++--------
> >  btf_encoder.h |  6 ------
> >  2 files changed, 22 insertions(+), 14 deletions(-)
> > 
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 44f1905..e20b628 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -30,6 +30,7 @@
> >  
> >  #include <errno.h>
> >  #include <stdint.h>
> > +#include <pthread.h>
> >  
> >  struct elf_function {
> >  	const char	*name;
> > @@ -79,21 +80,32 @@ struct btf_encoder {
> >  	} functions;
> >  };
> >  
> > -void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder)
> > -{
> > -	list_add_tail(&encoder->node, encoders);
> > -}
> > +static LIST_HEAD(encoders);
> > +static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
> >  
> > -struct btf_encoder *btf_encoders__first(struct list_head *encoders)
> > +/* mutex only needed for add/delete, as this can happen in multiple encoding
> > + * threads.  Traversal of the list is currently confined to thread collection.
> > + */
> > +static void btf_encoders__add(struct btf_encoder *encoder)
> >  {
> > -	return list_first_entry(encoders, struct btf_encoder, node);
> > +	pthread_mutex_lock(&encoders__lock);
> > +	list_add_tail(&encoder->node, &encoders);
> > +	pthread_mutex_unlock(&encoders__lock);
> >  }
> >  
> > -struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder)
> > +#define btf_encoders__for_each_encoder(encoder)		\
> > +	list_for_each_entry(encoder, &encoders, node)
> > +
> > +static void btf_encoders__delete(struct btf_encoder *encoder)
> >  {
> > -	return list_next_entry(encoder, node);
> > +	pthread_mutex_lock(&encoders__lock);
> > +	list_del(&encoder->node);
> > +	pthread_mutex_unlock(&encoders__lock);
> >  }
> >  
> > +#define btf_encoders__for_each_encoder(encoder)			\
> > +	list_for_each_entry(encoder, &encoders, node)
> > +
> 
> there's extra btf_encoders__for_each_encoder define
> 
> hum I'm scratching my head how this compile, probably because it's identical

I removed it, thanks!
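
For reference, the add/delete/iterate scheme in the quoted patch boils
down to the pattern below. This is only a sketch with a hand-rolled
list and hypothetical names (struct list_node, struct encoder,
encoders_add() and friends), not the pahole code itself:

```c
#include <pthread.h>
#include <stddef.h>

/* Minimal stand-in for the kernel-style doubly linked list. */
struct list_node {
	struct list_node *prev, *next;
};

struct encoder {
	int id;
	struct list_node node;
};

/* Empty circular list: the head points at itself. */
static struct list_node encoders = { &encoders, &encoders };
static pthread_mutex_t encoders_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append under the lock, since several threads may create encoders. */
static void encoders_add(struct encoder *e)
{
	pthread_mutex_lock(&encoders_lock);
	e->node.prev = encoders.prev;
	e->node.next = &encoders;
	encoders.prev->next = &e->node;
	encoders.prev = &e->node;
	pthread_mutex_unlock(&encoders_lock);
}

/* Unlink under the lock for the same reason. */
static void encoders_delete(struct encoder *e)
{
	pthread_mutex_lock(&encoders_lock);
	e->node.prev->next = e->node.next;
	e->node.next->prev = e->node.prev;
	pthread_mutex_unlock(&encoders_lock);
}

/* Unlocked traversal: only safe while no thread is adding or deleting. */
static int encoders_count(void)
{
	int n = 0;

	for (struct list_node *p = encoders.next; p != &encoders; p = p->next)
		n++;

	return n;
}
```

The unlocked traversal relies on the assumption stated in the patch
comment, namely that iteration only happens at thread collection time.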

- Arnaldo
 
> jirka
> 
> 
> >  #define PERCPU_SECTION ".data..percpu"
> >  
> >  /*
> > @@ -1505,6 +1517,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
> >  
> >  		if (encoder->verbose)
> >  			printf("File %s:\n", cu->filename);
> > +		btf_encoders__add(encoder);
> >  	}
> >  out:
> >  	return encoder;
> > @@ -1519,6 +1532,7 @@ void btf_encoder__delete(struct btf_encoder *encoder)
> >  	if (encoder == NULL)
> >  		return;
> >  
> > +	btf_encoders__delete(encoder);
> >  	__gobuffer__delete(&encoder->percpu_secinfo);
> >  	zfree(&encoder->filename);
> >  	btf__free(encoder->btf);
> > diff --git a/btf_encoder.h b/btf_encoder.h
> > index a65120c..34516bb 100644
> > --- a/btf_encoder.h
> > +++ b/btf_encoder.h
> > @@ -23,12 +23,6 @@ int btf_encoder__encode(struct btf_encoder *encoder);
> >  
> >  int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
> >  
> > -void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder);
> > -
> > -struct btf_encoder *btf_encoders__first(struct list_head *encoders);
> > -
> > -struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder);
> > -
> >  struct btf *btf_encoder__btf(struct btf_encoder *encoder);
> >  
> >  int btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other);
> > -- 
> > 1.8.3.1
> > 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-30 22:37         ` Alan Maguire
@ 2023-01-31  0:25           ` Arnaldo Carvalho de Melo
  2023-01-31  1:04             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-31  0:25 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> >> +++ b/dwarves.h
> >> @@ -262,6 +262,7 @@ struct cu {
> >>  	uint8_t		 has_addr_info:1;
> >>  	uint8_t		 uses_global_strings:1;
> >>  	uint8_t		 little_endian:1;
> >> +	uint8_t		 nr_register_params;
> >>  	uint16_t	 language;
> >>  	unsigned long	 nr_inline_expansions;
> >>  	size_t		 size_inline_expansions;
> > 
 
> Thanks for this, never thought of cross-builds to be honest!

> Tested just now on x86_64 and aarch64 at my end, just ran
> into one small thing on one system; turns out EM_RISCV isn't
> defined if using a very old elf.h; below works around this
> (dwarves otherwise builds fine on this system).

Ok, will add it and will test with containers for older distros too.

- Arnaldo
 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index dba2d37..47a3bc2 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -992,6 +992,11 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *c
>         return member;
>  }
>  
> +/* for older elf.h */
> +#ifndef EM_RISCV
> +#define EM_RISCV       243
> +#endif
> +
>  /* How many function parameters are passed via registers?  Used below in
>   * determining if an argument has been optimized out or if it is simply
>   * an argument > cu__nr_register_params().  Making cu__nr_register_params()

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31  0:25           ` Arnaldo Carvalho de Melo
@ 2023-01-31  1:04             ` Arnaldo Carvalho de Melo
  2023-01-31 12:14               ` Alan Maguire
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-31  1:04 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >> +++ b/dwarves.h
> > >> @@ -262,6 +262,7 @@ struct cu {
> > >>  	uint8_t		 has_addr_info:1;
> > >>  	uint8_t		 uses_global_strings:1;
> > >>  	uint8_t		 little_endian:1;
> > >> +	uint8_t		 nr_register_params;
> > >>  	uint16_t	 language;
> > >>  	unsigned long	 nr_inline_expansions;
> > >>  	size_t		 size_inline_expansions;
> > > 
>  
> > Thanks for this, never thought of cross-builds to be honest!
> 
> > Tested just now on x86_64 and aarch64 at my end, just ran
> > into one small thing on one system; turns out EM_RISCV isn't
> > defined if using a very old elf.h; below works around this
> > (dwarves otherwise builds fine on this system).
> 
> Ok, will add it and will test with containers for older distros too.

It's on the 'next' branch, so that it gets tested in the libbpf github
repo at:

https://github.com/libbpf/libbpf/actions/workflows/pahole.yml

That workflow failed yesterday and today due to problems with the
installation of llvm; it will probably be back working tomorrow, as I
saw some notifications floating by.

I added the conditional EM_RISCV definition and removed the duplicate
iterator macro that Jiri noticed.

Thanks,

- Arnaldo
 
> > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > index dba2d37..47a3bc2 100644
> > --- a/dwarf_loader.c
> > +++ b/dwarf_loader.c
> > @@ -992,6 +992,11 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *c
> >         return member;
> >  }
> >  
> > +/* for older elf.h */
> > +#ifndef EM_RISCV
> > +#define EM_RISCV       243
> > +#endif
> > +
> >  /* How many function parameters are passed via registers?  Used below in
> >   * determining if an argument has been optimized out or if it is simply
> >   * an argument > cu__nr_register_params().  Making cu__nr_register_params()
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31  1:04             ` Arnaldo Carvalho de Melo
@ 2023-01-31 12:14               ` Alan Maguire
  2023-01-31 12:33                 ` Arnaldo Carvalho de Melo
  2023-01-31 17:43                 ` Alexei Starovoitov
  0 siblings, 2 replies; 40+ messages in thread
From: Alan Maguire @ 2023-01-31 12:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>> +++ b/dwarves.h
>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>  	uint8_t		 has_addr_info:1;
>>>>>  	uint8_t		 uses_global_strings:1;
>>>>>  	uint8_t		 little_endian:1;
>>>>> +	uint8_t		 nr_register_params;
>>>>>  	uint16_t	 language;
>>>>>  	unsigned long	 nr_inline_expansions;
>>>>>  	size_t		 size_inline_expansions;
>>>>
>>  
>>> Thanks for this, never thought of cross-builds to be honest!
>>
>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>> into one small thing on one system; turns out EM_RISCV isn't
>>> defined if using a very old elf.h; below works around this
>>> (dwarves otherwise builds fine on this system).
>>
>> Ok, will add it and will test with containers for older distros too.
> 
> Its on the 'next' branch, so that it gets tested in the libbpf github
> repo at:
> 
> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> 
> It failed yesterday and today due to problems with the installation of
> llvm, probably tomorrow it'll be back working as I saw some
> notifications floating by.
> 
> I added the conditional EM_RISCV definition as well as removed the dup
> iterator that Jiri noticed.
>

Thanks again Arnaldo! I've hit an issue with this series in
BTF encoding of kfuncs; specifically we see some kfuncs missing
from the BTF representation, and as a result:

WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
WARN: resolve_btfids: unresolved symbol bpf_ct_change_status

Not sure why I didn't notice this previously.

The problem is the DWARF - and therefore BTF - generated for a function like

int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
        return -EOPNOTSUPP;
}

looks like this:

    <8af83a2>   DW_AT_external    : 1
    <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
    <8af83a6>   DW_AT_decl_file   : 5
    <8af83a7>   DW_AT_decl_line   : 737
    <8af83a9>   DW_AT_decl_column : 5
    <8af83aa>   DW_AT_prototyped  : 1
    <8af83aa>   DW_AT_type        : <0x8ad8547>
    <8af83ae>   DW_AT_sibling     : <0x8af83cd>
 <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
    <8af83b3>   DW_AT_name        : ctx
    <8af83b7>   DW_AT_decl_file   : 5
    <8af83b8>   DW_AT_decl_line   : 737
    <8af83ba>   DW_AT_decl_column : 51
    <8af83bb>   DW_AT_type        : <0x8af421d>
 <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
    <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
    <8af83c4>   DW_AT_decl_file   : 5
    <8af83c5>   DW_AT_decl_line   : 737
    <8af83c7>   DW_AT_decl_column : 61
    <8af83c8>   DW_AT_type        : <0x8adc424>

...and because neither parameter has a DW_AT_location, and there are
no further abstract origin references with location information
either, we classify the function as lacking locations for (some of)
its parameters, and as a result we skip BTF encoding. We can work
around that by doing this:

__attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
	return -EOPNOTSUPP;
}

Should we #define some kind of "kfunc" prefix equivalent to the
above to handle these cases in include/linux/bpf.h perhaps?
If that makes sense, I'll send bpf-next patches to cover the
set of kfuncs.
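
As a rough sketch of what such a prefix could look like: the macro
name kfunc_noopt, the demo function name and the stand-in types below
are all hypothetical, and since optimize("O0") is a GCC attribute,
other compilers get an empty definition here:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefix; not an existing kernel macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define kfunc_noopt __attribute__((optimize("O0")))
#else
#define kfunc_noopt
#endif

/* Stand-ins so the sketch builds outside the kernel tree. */
struct xdp_md;
typedef uint32_t u32;

/* Stub kfunc: both parameters are unused, which is the case where
 * parameter locations can go missing from the DWARF.
 */
kfunc_noopt int demo_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
	(void)ctx;
	(void)hash;
	return -EOPNOTSUPP;
}
```

Keeping the attribute behind a single macro would mean the set of
kfuncs only needs to be annotated once if the mechanism changes later.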

The other thing we might want to do is bump the libbpf version
for dwarves 1.25, what do you think? I've tested with libbpf 1.1
and aside from the above issue all looks good (there are a few dedup
improvements that this version will give us). I can send a patch for
the libbpf update if that makes sense.

Thanks!

Alan
 
> Thanks,
> 
> - Arnaldo
>  
>>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>>> index dba2d37..47a3bc2 100644
>>> --- a/dwarf_loader.c
>>> +++ b/dwarf_loader.c
>>> @@ -992,6 +992,11 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *c
>>>         return member;
>>>  }
>>>  
>>> +/* for older elf.h */
>>> +#ifndef EM_RISCV
>>> +#define EM_RISCV       243
>>> +#endif
>>> +
>>>  /* How many function parameters are passed via registers?  Used below in
>>>   * determining if an argument has been optimized out or if it is simply
>>>   * an argument > cu__nr_register_params().  Making cu__nr_register_params()
>>
>> -- 
>>
>> - Arnaldo
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 12:14               ` Alan Maguire
@ 2023-01-31 12:33                 ` Arnaldo Carvalho de Melo
  2023-01-31 13:35                   ` Jiri Olsa
  2023-01-31 17:43                 ` Alexei Starovoitov
  1 sibling, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-01-31 12:33 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf



On January 31, 2023 9:14:05 AM GMT-03:00, Alan Maguire <alan.maguire@oracle.com> wrote:
>On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>> +++ b/dwarves.h
>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>  	uint8_t		 has_addr_info:1;
>>>>>>  	uint8_t		 uses_global_strings:1;
>>>>>>  	uint8_t		 little_endian:1;
>>>>>> +	uint8_t		 nr_register_params;
>>>>>>  	uint16_t	 language;
>>>>>>  	unsigned long	 nr_inline_expansions;
>>>>>>  	size_t		 size_inline_expansions;
>>>>>
>>>  
>>>> Thanks for this, never thought of cross-builds to be honest!
>>>
>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>> defined if using a very old elf.h; below works around this
>>>> (dwarves otherwise builds fine on this system).
>>>
>>> Ok, will add it and will test with containers for older distros too.
>> 
>> Its on the 'next' branch, so that it gets tested in the libbpf github
>> repo at:
>> 
>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>> 
>> It failed yesterday and today due to problems with the installation of
>> llvm, probably tomorrow it'll be back working as I saw some
>> notifications floating by.
>> 
>> I added the conditional EM_RISCV definition as well as removed the dup
>> iterator that Jiri noticed.
>>
>
>Thanks again Arnaldo! I've hit an issue with this series in
>BTF encoding of kfuncs; specifically we see some kfuncs missing
>from the BTF representation, and as a result:
>
>WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>
>Not sure why I didn't notice this previously.
>
>The problem is the DWARF - and therefore BTF - generated for a function like
>
>int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>{
>        return -EOPNOTSUPP;
>}
>
>looks like this:
>
>   <8af83a2>   DW_AT_external    : 1
>    <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>    <8af83a6>   DW_AT_decl_file   : 5
>    <8af83a7>   DW_AT_decl_line   : 737
>    <8af83a9>   DW_AT_decl_column : 5
>    <8af83aa>   DW_AT_prototyped  : 1
>    <8af83aa>   DW_AT_type        : <0x8ad8547>
>    <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>    <8af83b3>   DW_AT_name        : ctx
>    <8af83b7>   DW_AT_decl_file   : 5
>    <8af83b8>   DW_AT_decl_line   : 737
>    <8af83ba>   DW_AT_decl_column : 51
>    <8af83bb>   DW_AT_type        : <0x8af421d>
> <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>    <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>    <8af83c4>   DW_AT_decl_file   : 5
>    <8af83c5>   DW_AT_decl_line   : 737
>    <8af83c7>   DW_AT_decl_column : 61
>    <8af83c8>   DW_AT_type        : <0x8adc424>
>
>...and because there are no further abstract origin references
>with location information either, we classify it as lacking 
>locations for (some of) the parameters, and as a result
>we skip BTF encoding. We can work around that by doing this:
>
>__attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>{
>	return -EOPNOTSUPP;
>}
>
>Should we #define some kind of "kfunc" prefix equivalent to the
>above to handle these cases in include/linux/bpf.h perhaps?
>If that makes sense, I'll send bpf-next patches to cover the
>set of kfuncs.

Jiri?

>The other thing we might want to do is bump the libbpf version
>for dwarves 1.25, what do you think? I've tested with libbpf 1.1
>and aside from the above issue all looks good (there's a few dedup
>improvements that this version will give us). I can send a patch for
>the libbpf update if that makes sense.


Please send it, then we'll give it a few more days of wider testing.

Yonghong, Andrii, comments on updating libbpf in the pahole submodule?

- Arnaldo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 12:33                 ` Arnaldo Carvalho de Melo
@ 2023-01-31 13:35                   ` Jiri Olsa
  0 siblings, 0 replies; 40+ messages in thread
From: Jiri Olsa @ 2023-01-31 13:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alan Maguire, Arnaldo Carvalho de Melo, yhs, ast, olsajiri,
	eddyz87, sinquersw, timo, daniel, andrii, songliubraving,
	john.fastabend, kpsingh, sdf, haoluo, martin.lau, bpf

On Tue, Jan 31, 2023 at 09:33:49AM -0300, Arnaldo Carvalho de Melo wrote:
> 
> 
> On January 31, 2023 9:14:05 AM GMT-03:00, Alan Maguire <alan.maguire@oracle.com> wrote:
> >On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> >> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> >>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> >>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>>> +++ b/dwarves.h
> >>>>>> @@ -262,6 +262,7 @@ struct cu {
> >>>>>>  	uint8_t		 has_addr_info:1;
> >>>>>>  	uint8_t		 uses_global_strings:1;
> >>>>>>  	uint8_t		 little_endian:1;
> >>>>>> +	uint8_t		 nr_register_params;
> >>>>>>  	uint16_t	 language;
> >>>>>>  	unsigned long	 nr_inline_expansions;
> >>>>>>  	size_t		 size_inline_expansions;
> >>>>>
> >>>  
> >>>> Thanks for this, never thought of cross-builds to be honest!
> >>>
> >>>> Tested just now on x86_64 and aarch64 at my end, just ran
> >>>> into one small thing on one system; turns out EM_RISCV isn't
> >>>> defined if using a very old elf.h; below works around this
> >>>> (dwarves otherwise builds fine on this system).
> >>>
> >>> Ok, will add it and will test with containers for older distros too.
> >> 
> >> Its on the 'next' branch, so that it gets tested in the libbpf github
> >> repo at:
> >> 
> >> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> >> 
> >> It failed yesterday and today due to problems with the installation of
> >> llvm, probably tomorrow it'll be back working as I saw some
> >> notifications floating by.
> >> 
> >> I added the conditional EM_RISCV definition as well as removed the dup
> >> iterator that Jiri noticed.
> >>
> >
> >Thanks again Arnaldo! I've hit an issue with this series in
> >BTF encoding of kfuncs; specifically we see some kfuncs missing
> >from the BTF representation, and as a result:
> >
> >WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> >WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> >WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> >
> >Not sure why I didn't notice this previously.
> >
> >The problem is the DWARF - and therefore BTF - generated for a function like
> >
> >int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> >{
> >        return -EOPNOTSUPP;
> >}
> >
> >looks like this:
> >
> >   <8af83a2>   DW_AT_external    : 1
> >    <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> >    <8af83a6>   DW_AT_decl_file   : 5
> >    <8af83a7>   DW_AT_decl_line   : 737
> >    <8af83a9>   DW_AT_decl_column : 5
> >    <8af83aa>   DW_AT_prototyped  : 1
> >    <8af83aa>   DW_AT_type        : <0x8ad8547>
> >    <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> >    <8af83b3>   DW_AT_name        : ctx
> >    <8af83b7>   DW_AT_decl_file   : 5
> >    <8af83b8>   DW_AT_decl_line   : 737
> >    <8af83ba>   DW_AT_decl_column : 51
> >    <8af83bb>   DW_AT_type        : <0x8af421d>
> > <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> >    <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> >    <8af83c4>   DW_AT_decl_file   : 5
> >    <8af83c5>   DW_AT_decl_line   : 737
> >    <8af83c7>   DW_AT_decl_column : 61
> >    <8af83c8>   DW_AT_type        : <0x8adc424>
> >
> >...and because there are no further abstract origin references
> >with location information either, we classify it as lacking 
> >locations for (some of) the parameters, and as a result
> >we skip BTF encoding. We can work around that by doing this:
> >
> >__attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> >{
> >	return -EOPNOTSUPP;
> >}
> >
> >Should we #define some kind of "kfunc" prefix equivalent to the
> >above to handle these cases in include/linux/bpf.h perhaps?
> >If that makes sense, I'll send bpf-next patches to cover the
> >set of kfuncs.
> 
> Jiri?

hum, I wonder what the point of the kfunc is if it just returns
-EOPNOTSUPP; at least I can't see any other version of it. maybe it's
some temporary stuff, like for bpf_task_kptr_get

but I think it's a good idea to make sure it does not get optimized out,
so some kfunc macro seems like a good idea. or maybe we could also use
a declaration tag for kfuncs

David already sent a patchset for a BPF_KFUNC macro, this could be part
of that

https://lore.kernel.org/bpf/20230123171506.71995-1-void@manifault.com/

jirka

> 
> >The other thing we might want to do is bump the libbpf version
> >for dwarves 1.25, what do you think? I've tested with libbpf 1.1
> >and aside from the above issue all looks good (there's a few dedup
> >improvements that this version will give us). I can send a patch for
> >the libbpf update if that makes sense.
> 
> 
> Please send it, then we give it some more days of wider testing,
> 
> Yonghong, Andrii, comments on updating libbpf in the pahole submodule?
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 12:14               ` Alan Maguire
  2023-01-31 12:33                 ` Arnaldo Carvalho de Melo
@ 2023-01-31 17:43                 ` Alexei Starovoitov
  2023-01-31 18:16                   ` Alexei Starovoitov
  1 sibling, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2023-01-31 17:43 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Arnaldo Carvalho de Melo, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> >> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> >>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> >>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>> +++ b/dwarves.h
> >>>>> @@ -262,6 +262,7 @@ struct cu {
> >>>>>   uint8_t          has_addr_info:1;
> >>>>>   uint8_t          uses_global_strings:1;
> >>>>>   uint8_t          little_endian:1;
> >>>>> + uint8_t          nr_register_params;
> >>>>>   uint16_t         language;
> >>>>>   unsigned long    nr_inline_expansions;
> >>>>>   size_t           size_inline_expansions;
> >>>>
> >>
> >>> Thanks for this, never thought of cross-builds to be honest!
> >>
> >>> Tested just now on x86_64 and aarch64 at my end, just ran
> >>> into one small thing on one system; turns out EM_RISCV isn't
> >>> defined if using a very old elf.h; below works around this
> >>> (dwarves otherwise builds fine on this system).
> >>
> >> Ok, will add it and will test with containers for older distros too.
> >
> > It's on the 'next' branch, so that it gets tested in the libbpf github
> > repo at:
> >
> > https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> >
> > It failed yesterday and today due to problems with the installation of
> > llvm, probably tomorrow it'll be back working as I saw some
> > notifications floating by.
> >
> > I added the conditional EM_RISCV definition as well as removed the dup
> > iterator that Jiri noticed.
> >
>
> Thanks again Arnaldo! I've hit an issue with this series in
> BTF encoding of kfuncs; specifically we see some kfuncs missing
> from the BTF representation, and as a result:
>
> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>
> Not sure why I didn't notice this previously.
>
> The problem is the DWARF - and therefore BTF - generated for a function like
>
> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> {
>         return -EOPNOTSUPP;
> }
>
> looks like this:
>
>    <8af83a2>   DW_AT_external    : 1
>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>     <8af83a6>   DW_AT_decl_file   : 5
>     <8af83a7>   DW_AT_decl_line   : 737
>     <8af83a9>   DW_AT_decl_column : 5
>     <8af83aa>   DW_AT_prototyped  : 1
>     <8af83aa>   DW_AT_type        : <0x8ad8547>
>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>     <8af83b3>   DW_AT_name        : ctx
>     <8af83b7>   DW_AT_decl_file   : 5
>     <8af83b8>   DW_AT_decl_line   : 737
>     <8af83ba>   DW_AT_decl_column : 51
>     <8af83bb>   DW_AT_type        : <0x8af421d>
>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>     <8af83c4>   DW_AT_decl_file   : 5
>     <8af83c5>   DW_AT_decl_line   : 737
>     <8af83c7>   DW_AT_decl_column : 61
>     <8af83c8>   DW_AT_type        : <0x8adc424>
>
> ...and because there are no further abstract origin references
> with location information either, we classify it as lacking
> locations for (some of) the parameters, and as a result
> we skip BTF encoding. We can work around that by doing this:
>
> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)

Replied in the other thread. This attr is broken and discouraged by gcc.

For kfuncs where args are unused, please try __used and __maybe_unused
applied to arguments.
If that won't work, please add barrier_var(arg) to the body of kfunc
the way we do in selftests.
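
As a minimal sketch of the barrier_var() suggestion (not taken from any
posted patch): the macro definition below is assumed, modelled on the
helper the BPF selftests get from libbpf's bpf_helpers.h, and the
struct xdp_md and u32 declarations are stand-ins so the fragment builds
outside the kernel tree. Whether this is enough for gcc to emit location
info for the parameters is exactly what would need testing.

```c
#include <errno.h>

/* Assumed definition, modelled on barrier_var() in libbpf's bpf_helpers.h:
 * an empty asm statement that claims to read and write the variable. */
#define barrier_var(var) __asm__ volatile("" : "+r"(var))

/* Stand-in declarations so the sketch builds outside the kernel tree. */
struct xdp_md;
typedef unsigned int u32;

int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
	/* Make the compiler treat both otherwise-unused arguments as live. */
	barrier_var(ctx);
	barrier_var(hash);
	return -EOPNOTSUPP;
}
```

The empty asm statement generates no instructions; it only tells the
compiler that each argument is read and possibly modified, so the
argument cannot be treated as dead.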

> {
>         return -EOPNOTSUPP;
> }
>
> Should we #define some kind of "kfunc" prefix equivalent to the
> above to handle these cases in include/linux/bpf.h perhaps?
> If that makes sense, I'll send bpf-next patches to cover the
> set of kfuncs.
>
> The other thing we might want to do is bump the libbpf version
> for dwarves 1.25, what do you think? I've tested with libbpf 1.1
> and aside from the above issue all looks good (there's a few dedup
> improvements that this version will give us). I can send a patch for
> the libbpf update if that makes sense.

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 17:43                 ` Alexei Starovoitov
@ 2023-01-31 18:16                   ` Alexei Starovoitov
  2023-01-31 23:45                     ` Alan Maguire
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2023-01-31 18:16 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Arnaldo Carvalho de Melo, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > >>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > >>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >>>>> +++ b/dwarves.h
> > >>>>> @@ -262,6 +262,7 @@ struct cu {
> > >>>>>   uint8_t          has_addr_info:1;
> > >>>>>   uint8_t          uses_global_strings:1;
> > >>>>>   uint8_t          little_endian:1;
> > >>>>> + uint8_t          nr_register_params;
> > >>>>>   uint16_t         language;
> > >>>>>   unsigned long    nr_inline_expansions;
> > >>>>>   size_t           size_inline_expansions;
> > >>>>
> > >>
> > >>> Thanks for this, never thought of cross-builds to be honest!
> > >>
> > >>> Tested just now on x86_64 and aarch64 at my end, just ran
> > >>> into one small thing on one system; turns out EM_RISCV isn't
> > >>> defined if using a very old elf.h; below works around this
> > >>> (dwarves otherwise builds fine on this system).
> > >>
> > >> Ok, will add it and will test with containers for older distros too.
> > >
> > > It's on the 'next' branch, so that it gets tested in the libbpf github
> > > repo at:
> > >
> > > https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > >
> > > It failed yesterday and today due to problems with the installation of
> > > llvm, probably tomorrow it'll be back working as I saw some
> > > notifications floating by.
> > >
> > > I added the conditional EM_RISCV definition as well as removed the dup
> > > iterator that Jiri noticed.
> > >
> >
> > Thanks again Arnaldo! I've hit an issue with this series in
> > BTF encoding of kfuncs; specifically we see some kfuncs missing
> > from the BTF representation, and as a result:
> >
> > WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> >
> > Not sure why I didn't notice this previously.
> >
> > The problem is the DWARF - and therefore BTF - generated for a function like
> >
> > int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > {
> >         return -EOPNOTSUPP;
> > }
> >
> > looks like this:
> >
> >    <8af83a2>   DW_AT_external    : 1
> >     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> >     <8af83a6>   DW_AT_decl_file   : 5
> >     <8af83a7>   DW_AT_decl_line   : 737
> >     <8af83a9>   DW_AT_decl_column : 5
> >     <8af83aa>   DW_AT_prototyped  : 1
> >     <8af83aa>   DW_AT_type        : <0x8ad8547>
> >     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> >  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> >     <8af83b3>   DW_AT_name        : ctx
> >     <8af83b7>   DW_AT_decl_file   : 5
> >     <8af83b8>   DW_AT_decl_line   : 737
> >     <8af83ba>   DW_AT_decl_column : 51
> >     <8af83bb>   DW_AT_type        : <0x8af421d>
> >  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> >     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> >     <8af83c4>   DW_AT_decl_file   : 5
> >     <8af83c5>   DW_AT_decl_line   : 737
> >     <8af83c7>   DW_AT_decl_column : 61
> >     <8af83c8>   DW_AT_type        : <0x8adc424>
> >
> > ...and because there are no further abstract origin references
> > with location information either, we classify it as lacking
> > locations for (some of) the parameters, and as a result
> > we skip BTF encoding. We can work around that by doing this:
> >
> > __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>
> replied in the other thread. This attr is broken and discouraged by gcc.
>
> For kfuncs where args are unused, please try __used and __maybe_unused
> applied to arguments.
> If that won't work, please add barrier_var(arg) to the body of kfunc
> the way we do in selftests.

There is also
# define __visible __attribute__((__externally_visible__))
that probably fits the best here.

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 18:16                   ` Alexei Starovoitov
@ 2023-01-31 23:45                     ` Alan Maguire
  2023-01-31 23:58                       ` David Vernet
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-01-31 23:45 UTC (permalink / raw)
  To: Alexei Starovoitov, David Vernet
  Cc: Arnaldo Carvalho de Melo, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On 31/01/2023 18:16, Alexei Starovoitov wrote:
> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>
>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>> +++ b/dwarves.h
>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>>>   uint8_t          has_addr_info:1;
>>>>>>>>   uint8_t          uses_global_strings:1;
>>>>>>>>   uint8_t          little_endian:1;
>>>>>>>> + uint8_t          nr_register_params;
>>>>>>>>   uint16_t         language;
>>>>>>>>   unsigned long    nr_inline_expansions;
>>>>>>>>   size_t           size_inline_expansions;
>>>>>>>
>>>>>
>>>>>> Thanks for this, never thought of cross-builds to be honest!
>>>>>
>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>>>> defined if using a very old elf.h; below works around this
>>>>>> (dwarves otherwise builds fine on this system).
>>>>>
>>>>> Ok, will add it and will test with containers for older distros too.
>>>>
>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
>>>> repo at:
>>>>
>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>>>>
>>>> It failed yesterday and today due to problems with the installation of
>>>> llvm, probably tomorrow it'll be back working as I saw some
>>>> notifications floating by.
>>>>
>>>> I added the conditional EM_RISCV definition as well as removed the dup
>>>> iterator that Jiri noticed.
>>>>
>>>
>>> Thanks again Arnaldo! I've hit an issue with this series in
>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
>>> from the BTF representation, and as a result:
>>>
>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>>>
>>> Not sure why I didn't notice this previously.
>>>
>>> The problem is the DWARF - and therefore BTF - generated for a function like
>>>
>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>> {
>>>         return -EOPNOTSUPP;
>>> }
>>>
>>> looks like this:
>>>
>>>    <8af83a2>   DW_AT_external    : 1
>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>>>     <8af83a6>   DW_AT_decl_file   : 5
>>>     <8af83a7>   DW_AT_decl_line   : 737
>>>     <8af83a9>   DW_AT_decl_column : 5
>>>     <8af83aa>   DW_AT_prototyped  : 1
>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>>>     <8af83b3>   DW_AT_name        : ctx
>>>     <8af83b7>   DW_AT_decl_file   : 5
>>>     <8af83b8>   DW_AT_decl_line   : 737
>>>     <8af83ba>   DW_AT_decl_column : 51
>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>>>     <8af83c4>   DW_AT_decl_file   : 5
>>>     <8af83c5>   DW_AT_decl_line   : 737
>>>     <8af83c7>   DW_AT_decl_column : 61
>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
>>>
>>> ...and because there are no further abstract origin references
>>> with location information either, we classify it as lacking
>>> locations for (some of) the parameters, and as a result
>>> we skip BTF encoding. We can work around that by doing this:
>>>
>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>
>> replied in the other thread. This attr is broken and discouraged by gcc.
>>
>> For kfuncs where args are unused, please try __used and __maybe_unused
>> applied to arguments.
>> If that won't work, please add barrier_var(arg) to the body of kfunc
>> the way we do in selftests.
> 
> There is also
> # define __visible __attribute__((__externally_visible__))
> that probably fits the best here.
> 

Testing thus far seems to show that for x86_64, David's series
(using __used noinline in the BPF_KFUNC() wrapper and extended
to cover recently-arrived kfuncs like cpumask) is sufficient
to avoid resolve_btfids warnings.
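
For reference, the wrapper approach can be sketched roughly as follows.
This is an assumed shape only: the exact BPF_KFUNC() definition in
David's series may differ, bpf_example_kfunc is a hypothetical name, and
the attributes are spelled out in full because the kernel's __used and
noinline macros are not available outside the tree.

```c
/* Assumed shape of the wrapper: prefix the prototype with the attributes
 * that the kernel spells as __used and noinline. */
#define BPF_KFUNC(proto) __attribute__((__used__, __noinline__)) proto

/* Hypothetical kfunc, used only to show how the wrapper is applied. */
BPF_KFUNC(int bpf_example_kfunc(int a, int b))
{
	return a + b;
}
```

The used attribute asks gcc to emit the function even if it appears
unreferenced, and noinline keeps it as a standalone function rather than
folding it into its callers.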

We need to update the LSM_HOOK() definition for BPF LSM too; otherwise
the LSM hooks will also cause problems with missing BTF IDs.

With all that done, I'm not seeing resolve_btfids complaints
for x86_64 (tested with gcc 9 and 11). I also tried using __visible, but
using that in the kfunc wrapper causes problems for the static TCP
congestion control functions. We see warnings like these if __visible
is used in BPF_KFUNC():

net/ipv4/tcp_dctcp.c:79:1: warning: ‘externally_visible’ attribute have effect only on public objects [-Wattributes]
   79 | {

However, for aarch64 with the same changes we see a bunch of complaints
from resolve_btfids for BPF_KFUNC()-wrapped kfuncs and LSM hooks:

  BTFIDS  vmlinux
WARN: resolve_btfids: unresolved symbol tcp_cong_avoid_ai
WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
WARN: resolve_btfids: unresolved symbol bpf_lsm_xfrm_state_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_xfrm_policy_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_tun_dev_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_task_to_inode
WARN: resolve_btfids: unresolved symbol bpf_lsm_task_getsecid_obj
WARN: resolve_btfids: unresolved symbol bpf_lsm_task_free
WARN: resolve_btfids: unresolved symbol bpf_lsm_sock_graft
WARN: resolve_btfids: unresolved symbol bpf_lsm_sk_getsecid
WARN: resolve_btfids: unresolved symbol bpf_lsm_sk_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_sk_clone_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_shm_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_sem_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_sctp_sk_clone
WARN: resolve_btfids: unresolved symbol bpf_lsm_sb_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_sb_free_mnt_opts
WARN: resolve_btfids: unresolved symbol bpf_lsm_sb_delete
WARN: resolve_btfids: unresolved symbol bpf_lsm_req_classify_flow
WARN: resolve_btfids: unresolved symbol bpf_lsm_release_secctx
WARN: resolve_btfids: unresolved symbol bpf_lsm_perf_event_free
WARN: resolve_btfids: unresolved symbol bpf_lsm_msg_queue_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_msg_msg_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_key_free
WARN: resolve_btfids: unresolved symbol bpf_lsm_ipc_getsecid
WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_post_setxattr
WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_invalidate_secctx
WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_getsecid
WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_inet_csk_clone
WARN: resolve_btfids: unresolved symbol bpf_lsm_inet_conn_established
WARN: resolve_btfids: unresolved symbol bpf_lsm_ib_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_file_set_fowner
WARN: resolve_btfids: unresolved symbol bpf_lsm_file_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_d_instantiate
WARN: resolve_btfids: unresolved symbol bpf_lsm_current_getsecid_subj
WARN: resolve_btfids: unresolved symbol bpf_lsm_cred_transfer
WARN: resolve_btfids: unresolved symbol bpf_lsm_cred_getsecid
WARN: resolve_btfids: unresolved symbol bpf_lsm_cred_free
WARN: resolve_btfids: unresolved symbol bpf_lsm_bprm_committing_creds
WARN: resolve_btfids: unresolved symbol bpf_lsm_bprm_committed_creds
WARN: resolve_btfids: unresolved symbol bpf_lsm_bpf_prog_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_bpf_map_free_security
WARN: resolve_btfids: unresolved symbol bpf_lsm_audit_rule_free
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass_ctx
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_pass1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail3
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test3
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb_release
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb1_release
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_int_mem_release
WARN: resolve_btfids: unresolved symbol bpf_cpumask_any
WARN: resolve_btfids: unresolved symbol bpf_cgroup_acquire
WARN: resolve_btfids: unresolved symbol bpf_cast_to_kern_ctx
  NM      System.map

Thanks!

Alan

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 23:45                     ` Alan Maguire
@ 2023-01-31 23:58                       ` David Vernet
  2023-02-01  0:14                         ` Alexei Starovoitov
  0 siblings, 1 reply; 40+ messages in thread
From: David Vernet @ 2023-01-31 23:58 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >>
> >> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>>
> >>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> >>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> >>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> >>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>>>>> +++ b/dwarves.h
> >>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> >>>>>>>>   uint8_t          has_addr_info:1;
> >>>>>>>>   uint8_t          uses_global_strings:1;
> >>>>>>>>   uint8_t          little_endian:1;
> >>>>>>>> + uint8_t          nr_register_params;
> >>>>>>>>   uint16_t         language;
> >>>>>>>>   unsigned long    nr_inline_expansions;
> >>>>>>>>   size_t           size_inline_expansions;
> >>>>>>>
> >>>>>
> >>>>>> Thanks for this, never thought of cross-builds to be honest!
> >>>>>
> >>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> >>>>>> into one small thing on one system; turns out EM_RISCV isn't
> >>>>>> defined if using a very old elf.h; below works around this
> >>>>>> (dwarves otherwise builds fine on this system).
> >>>>>
> >>>>> Ok, will add it and will test with containers for older distros too.
> >>>>
> >>>> It's on the 'next' branch, so that it gets tested in the libbpf github
> >>>> repo at:
> >>>>
> >>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> >>>>
> >>>> It failed yesterday and today due to problems with the installation of
> >>>> llvm, probably tomorrow it'll be back working as I saw some
> >>>> notifications floating by.
> >>>>
> >>>> I added the conditional EM_RISCV definition as well as removed the dup
> >>>> iterator that Jiri noticed.
> >>>>
> >>>
> >>> Thanks again Arnaldo! I've hit an issue with this series in
> >>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> >>> from the BTF representation, and as a result:
> >>>
> >>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> >>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> >>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> >>>
> >>> Not sure why I didn't notice this previously.
> >>>
> >>> The problem is the DWARF - and therefore BTF - generated for a function like
> >>>
> >>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> >>> {
> >>>         return -EOPNOTSUPP;
> >>> }
> >>>
> >>> looks like this:
> >>>
> >>>    <8af83a2>   DW_AT_external    : 1
> >>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> >>>     <8af83a6>   DW_AT_decl_file   : 5
> >>>     <8af83a7>   DW_AT_decl_line   : 737
> >>>     <8af83a9>   DW_AT_decl_column : 5
> >>>     <8af83aa>   DW_AT_prototyped  : 1
> >>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> >>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> >>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> >>>     <8af83b3>   DW_AT_name        : ctx
> >>>     <8af83b7>   DW_AT_decl_file   : 5
> >>>     <8af83b8>   DW_AT_decl_line   : 737
> >>>     <8af83ba>   DW_AT_decl_column : 51
> >>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> >>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> >>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> >>>     <8af83c4>   DW_AT_decl_file   : 5
> >>>     <8af83c5>   DW_AT_decl_line   : 737
> >>>     <8af83c7>   DW_AT_decl_column : 61
> >>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> >>>
> >>> ...and because there are no further abstract origin references
> >>> with location information either, we classify it as lacking
> >>> locations for (some of) the parameters, and as a result
> >>> we skip BTF encoding. We can work around that by doing this:
> >>>
> >>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> >>
> >> replied in the other thread. This attr is broken and discouraged by gcc.
> >>
> >> For kfuncs where args are unused, please try __used and __maybe_unused
> >> applied to arguments.
> >> If that won't work, please add barrier_var(arg) to the body of kfunc
> >> the way we do in selftests.
> > 
> > There is also
> > # define __visible __attribute__((__externally_visible__))
> > that probably fits the best here.
> > 
> 
> Testing thus far seems to show that for x86_64, David's series
> (using __used noinline in the BPF_KFUNC() wrapper and extended
> to cover recently-arrived kfuncs like cpumask) is sufficient
> to avoid resolve_btfids warnings.

Nice. Alexei -- lmk how you want to proceed. I think using the
__bpf_kfunc macro in the short term (with __used and noinline) is
probably the least controversial way to unblock this, but am open to
other suggestions.

> 
> We need to update the LSM_HOOK() definition for BPF LSM too; otherwise
> the LSM hooks will also cause problems with missing BTF IDs.
> 
> With all that done, I'm not seeing resolve_btfids complaints
> for x86_64 (tested with gcc 9 and 11). I also tried using __visible, but
> using that in the kfunc wrapper causes problems for the static TCP
> congestion control functions. We see warnings like these if __visible
> is used in BPF_KFUNC():
> 
> net/ipv4/tcp_dctcp.c:79:1: warning: ‘externally_visible’ attribute have effect only on public objects [-Wattributes]
>    79 | {

Yeah, I tend to think we should try to avoid using hidden / visible
attributes given that (to my knowledge) they're really more meant for
controlling whether a symbol is exported from a shared object rather
than controlling what the compiler is doing when it creates the
compilation unit. One could imagine that in an LTO build, the compiler
would still optimize the function regardless of its visibility for that
reason, though it's possible I don't have the full picture.

> 
> However, for aarch64 with the same changes we see a bunch of complaints
> from resolve_btfids for BPF_KFUNC()-wrapped kfuncs and LSM hooks:
> 
>   BTFIDS  vmlinux
> WARN: resolve_btfids: unresolved symbol tcp_cong_avoid_ai
> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
> WARN: resolve_btfids: unresolved symbol bpf_lsm_xfrm_state_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_xfrm_policy_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_tun_dev_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_task_to_inode
> WARN: resolve_btfids: unresolved symbol bpf_lsm_task_getsecid_obj
> WARN: resolve_btfids: unresolved symbol bpf_lsm_task_free
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sock_graft
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sk_getsecid
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sk_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sk_clone_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_shm_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sem_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sctp_sk_clone
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sb_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sb_free_mnt_opts
> WARN: resolve_btfids: unresolved symbol bpf_lsm_sb_delete
> WARN: resolve_btfids: unresolved symbol bpf_lsm_req_classify_flow
> WARN: resolve_btfids: unresolved symbol bpf_lsm_release_secctx
> WARN: resolve_btfids: unresolved symbol bpf_lsm_perf_event_free
> WARN: resolve_btfids: unresolved symbol bpf_lsm_msg_queue_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_msg_msg_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_key_free
> WARN: resolve_btfids: unresolved symbol bpf_lsm_ipc_getsecid
> WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_post_setxattr
> WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_invalidate_secctx
> WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_getsecid
> WARN: resolve_btfids: unresolved symbol bpf_lsm_inode_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_inet_csk_clone
> WARN: resolve_btfids: unresolved symbol bpf_lsm_inet_conn_established
> WARN: resolve_btfids: unresolved symbol bpf_lsm_ib_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_file_set_fowner
> WARN: resolve_btfids: unresolved symbol bpf_lsm_file_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_d_instantiate
> WARN: resolve_btfids: unresolved symbol bpf_lsm_current_getsecid_subj
> WARN: resolve_btfids: unresolved symbol bpf_lsm_cred_transfer
> WARN: resolve_btfids: unresolved symbol bpf_lsm_cred_getsecid
> WARN: resolve_btfids: unresolved symbol bpf_lsm_cred_free
> WARN: resolve_btfids: unresolved symbol bpf_lsm_bprm_committing_creds
> WARN: resolve_btfids: unresolved symbol bpf_lsm_bprm_committed_creds
> WARN: resolve_btfids: unresolved symbol bpf_lsm_bpf_prog_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_bpf_map_free_security
> WARN: resolve_btfids: unresolved symbol bpf_lsm_audit_rule_free
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass_ctx
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_pass1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail3
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test3
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb_release
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb1_release
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_int_mem_release
> WARN: resolve_btfids: unresolved symbol bpf_cpumask_any
> WARN: resolve_btfids: unresolved symbol bpf_cgroup_acquire
> WARN: resolve_btfids: unresolved symbol bpf_cast_to_kern_ctx
>   NM      System.map

Is that all of them? It's surprising that we'd fail to resolve only a
seemingly random subset of the kfuncs, e.g. bpf_cpumask_any(). There's
nothing whatsoever special about it.

> 
> Thanks!
> 
> Alan

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-01-31 23:58                       ` David Vernet
@ 2023-02-01  0:14                         ` Alexei Starovoitov
  2023-02-01  3:02                           ` David Vernet
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2023-02-01  0:14 UTC (permalink / raw)
  To: David Vernet
  Cc: Alan Maguire, Arnaldo Carvalho de Melo, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
>
> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> > On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > > On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > >>
> > >> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >>>
> > >>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > >>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > >>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > >>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >>>>>>>> +++ b/dwarves.h
> > >>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> > >>>>>>>>   uint8_t          has_addr_info:1;
> > >>>>>>>>   uint8_t          uses_global_strings:1;
> > >>>>>>>>   uint8_t          little_endian:1;
> > >>>>>>>> + uint8_t          nr_register_params;
> > >>>>>>>>   uint16_t         language;
> > >>>>>>>>   unsigned long    nr_inline_expansions;
> > >>>>>>>>   size_t           size_inline_expansions;
> > >>>>>>>
> > >>>>>
> > >>>>>> Thanks for this, never thought of cross-builds to be honest!
> > >>>>>
> > >>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> > >>>>>> into one small thing on one system; turns out EM_RISCV isn't
> > >>>>>> defined if using a very old elf.h; below works around this
> > >>>>>> (dwarves otherwise builds fine on this system).
> > >>>>>
> > >>>>> Ok, will add it and will test with containers for older distros too.
> > >>>>
> > >>>> Its on the 'next' branch, so that it gets tested in the libbpf github
> > >>>> repo at:
> > >>>>
> > >>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > >>>>
> > >>>> It failed yesterday and today due to problems with the installation of
> > >>>> llvm, probably tomorrow it'll be back working as I saw some
> > >>>> notifications floating by.
> > >>>>
> > >>>> I added the conditional EM_RISCV definition as well as removed the dup
> > >>>> iterator that Jiri noticed.
> > >>>>
> > >>>
> > >>> Thanks again Arnaldo! I've hit an issue with this series in
> > >>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> > >>> from the BTF representation, and as a result:
> > >>>
> > >>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > >>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > >>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> > >>>
> > >>> Not sure why I didn't notice this previously.
> > >>>
> > >>> The problem is the DWARF - and therefore BTF - generated for a function like
> > >>>
> > >>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > >>> {
> > >>>         return -EOPNOTSUPP;
> > >>> }
> > >>>
> > >>> looks like this:
> > >>>
> > >>>    <8af83a2>   DW_AT_external    : 1
> > >>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> > >>>     <8af83a6>   DW_AT_decl_file   : 5
> > >>>     <8af83a7>   DW_AT_decl_line   : 737
> > >>>     <8af83a9>   DW_AT_decl_column : 5
> > >>>     <8af83aa>   DW_AT_prototyped  : 1
> > >>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> > >>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > >>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> > >>>     <8af83b3>   DW_AT_name        : ctx
> > >>>     <8af83b7>   DW_AT_decl_file   : 5
> > >>>     <8af83b8>   DW_AT_decl_line   : 737
> > >>>     <8af83ba>   DW_AT_decl_column : 51
> > >>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> > >>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> > >>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> > >>>     <8af83c4>   DW_AT_decl_file   : 5
> > >>>     <8af83c5>   DW_AT_decl_line   : 737
> > >>>     <8af83c7>   DW_AT_decl_column : 61
> > >>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> > >>>
> > >>> ...and because there are no further abstract origin references
> > >>> with location information either, we classify it as lacking
> > >>> locations for (some of) the parameters, and as a result
> > >>> we skip BTF encoding. We can work around that by doing this:
> > >>>
> > >>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > >>
> > >> replied in the other thread. This attr is broken and discouraged by gcc.
> > >>
> > >> For kfuncs where aregs are unused, please try __used and __may_unused
> > >> applied to arguments.
> > >> If that won't work, please add barrier_var(arg) to the body of kfunc
> > >> the way we do in selftests.
> > >
> > > There is also
> > > # define __visible __attribute__((__externally_visible__))
> > > that probably fits the best here.
> > >
> >
> > testing thus for seems to show that for x86_64, David's series
> > (using __used noinline in the BPF_KFUNC() wrapper and extended
> > to cover recently-arrived kfuncs like cpumask) is sufficient
> > to avoid resolve_btfids warnings.
>
> Nice. Alexei -- lmk how you want to proceed. I think using the
> __bpf_kfunc macro in the short term (with __used and noinline) is
> probably the least controversial way to unblock this, but am open to
> other suggestions.

Sounds good to me, but sounds like __used and noinline are not
enough to address the issues on aarch64?

> Yeah, I tend to think we should try to avoid using hidden / visible
> attributes given that (to my knowledge) they're really more meant for
> controlling whether a symbol is exported from a shared object rather
> than controlling what the compiler is doing when it creates the
> compilation unit. One could imagine that in an LTO build, the compiler
> would still optimize the function regardless of its visibility for that
> reason, though it's possible I don't have the full picture.

__visible is specifically done to prevent optimization of
functions that are externally visible. That should address LTO concerns.
We haven't seen LTO messing up anything. Just something to keep in mind.


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01  0:14                         ` Alexei Starovoitov
@ 2023-02-01  3:02                           ` David Vernet
  2023-02-01 13:59                             ` Alan Maguire
  2023-02-03  1:09                             ` Yonghong Song
  0 siblings, 2 replies; 40+ messages in thread
From: David Vernet @ 2023-02-01  3:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alan Maguire, Arnaldo Carvalho de Melo, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
> >
> > On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> > > On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > > > On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > >>
> > > >> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > >>>
> > > >>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > > >>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > >>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > > >>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > > >>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > >>>>>>>> +++ b/dwarves.h
> > > >>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> > > >>>>>>>>   uint8_t          has_addr_info:1;
> > > >>>>>>>>   uint8_t          uses_global_strings:1;
> > > >>>>>>>>   uint8_t          little_endian:1;
> > > >>>>>>>> + uint8_t          nr_register_params;
> > > >>>>>>>>   uint16_t         language;
> > > >>>>>>>>   unsigned long    nr_inline_expansions;
> > > >>>>>>>>   size_t           size_inline_expansions;
> > > >>>>>>>
> > > >>>>>
> > > >>>>>> Thanks for this, never thought of cross-builds to be honest!
> > > >>>>>
> > > >>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> > > >>>>>> into one small thing on one system; turns out EM_RISCV isn't
> > > >>>>>> defined if using a very old elf.h; below works around this
> > > >>>>>> (dwarves otherwise builds fine on this system).
> > > >>>>>
> > > >>>>> Ok, will add it and will test with containers for older distros too.
> > > >>>>
> > > >>>> Its on the 'next' branch, so that it gets tested in the libbpf github
> > > >>>> repo at:
> > > >>>>
> > > >>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > > >>>>
> > > >>>> It failed yesterday and today due to problems with the installation of
> > > >>>> llvm, probably tomorrow it'll be back working as I saw some
> > > >>>> notifications floating by.
> > > >>>>
> > > >>>> I added the conditional EM_RISCV definition as well as removed the dup
> > > >>>> iterator that Jiri noticed.
> > > >>>>
> > > >>>
> > > >>> Thanks again Arnaldo! I've hit an issue with this series in
> > > >>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> > > >>> from the BTF representation, and as a result:
> > > >>>
> > > >>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > > >>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > > >>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> > > >>>
> > > >>> Not sure why I didn't notice this previously.
> > > >>>
> > > >>> The problem is the DWARF - and therefore BTF - generated for a function like
> > > >>>
> > > >>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > >>> {
> > > >>>         return -EOPNOTSUPP;
> > > >>> }
> > > >>>
> > > >>> looks like this:
> > > >>>
> > > >>>    <8af83a2>   DW_AT_external    : 1
> > > >>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> > > >>>     <8af83a6>   DW_AT_decl_file   : 5
> > > >>>     <8af83a7>   DW_AT_decl_line   : 737
> > > >>>     <8af83a9>   DW_AT_decl_column : 5
> > > >>>     <8af83aa>   DW_AT_prototyped  : 1
> > > >>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> > > >>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > > >>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> > > >>>     <8af83b3>   DW_AT_name        : ctx
> > > >>>     <8af83b7>   DW_AT_decl_file   : 5
> > > >>>     <8af83b8>   DW_AT_decl_line   : 737
> > > >>>     <8af83ba>   DW_AT_decl_column : 51
> > > >>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> > > >>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> > > >>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> > > >>>     <8af83c4>   DW_AT_decl_file   : 5
> > > >>>     <8af83c5>   DW_AT_decl_line   : 737
> > > >>>     <8af83c7>   DW_AT_decl_column : 61
> > > >>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> > > >>>
> > > >>> ...and because there are no further abstract origin references
> > > >>> with location information either, we classify it as lacking
> > > >>> locations for (some of) the parameters, and as a result
> > > >>> we skip BTF encoding. We can work around that by doing this:
> > > >>>
> > > >>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > >>
> > > >> replied in the other thread. This attr is broken and discouraged by gcc.
> > > >>
> > > >> For kfuncs where aregs are unused, please try __used and __may_unused
> > > >> applied to arguments.
> > > >> If that won't work, please add barrier_var(arg) to the body of kfunc
> > > >> the way we do in selftests.
> > > >
> > > > There is also
> > > > # define __visible __attribute__((__externally_visible__))
> > > > that probably fits the best here.
> > > >
> > >
> > > testing thus for seems to show that for x86_64, David's series
> > > (using __used noinline in the BPF_KFUNC() wrapper and extended
> > > to cover recently-arrived kfuncs like cpumask) is sufficient
> > > to avoid resolve_btfids warnings.
> >
> > Nice. Alexei -- lmk how you want to proceed. I think using the
> > __bpf_kfunc macro in the short term (with __used and noinline) is
> > probably the least controversial way to unblock this, but am open to
> > other suggestions.
> 
> Sounds good to me, but sounds like __used and noinline are not
> enough to address the issues on aarch64?

Indeed, we'll have to make sure that's also addressed. Alan -- did you
try Alexei's suggestion to use __weak? Does that fix the issue for
aarch64? I'm still confused as to why it's only complaining for a small
subset of kfuncs, which include those that have external linkage.

> 
> > Yeah, I tend to think we should try to avoid using hidden / visible
> > attributes given that (to my knowledge) they're really more meant for
> > controlling whether a symbol is exported from a shared object rather
> > than controlling what the compiler is doing when it creates the
> > compilation unit. One could imagine that in an LTO build, the compiler
> > would still optimize the function regardless of its visibility for that
> > reason, though it's possible I don't have the full picture.
> 
> __visible is specifically done to prevent optimization of
> functions that are externally visible. That should address LTO concerns.
> We haven't seen LTO messing up anything. Just something to keep in mind.

Ah, fair enough. I was conflating that with the visibility("...")
attribute. As you pointed out, __visible is something else entirely, and
is meant to avoid possible issues with LTO.

One other option we could consider is enforcing that kfuncs must have
global linkage and can't be static. If we did that, it seems like
__visible would be a viable option. Though we'd have to verify that it
addresses the issue w/ aarch64.


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01  3:02                           ` David Vernet
@ 2023-02-01 13:59                             ` Alan Maguire
  2023-02-01 15:02                               ` Arnaldo Carvalho de Melo
  2023-02-03  1:09                             ` Yonghong Song
  1 sibling, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-02-01 13:59 UTC (permalink / raw)
  To: David Vernet, Alexei Starovoitov
  Cc: Arnaldo Carvalho de Melo, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On 01/02/2023 03:02, David Vernet wrote:
> On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
>> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
>>>
>>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
>>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
>>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>
>>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>>>
>>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>> +++ b/dwarves.h
>>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>>>>>>>   uint8_t          has_addr_info:1;
>>>>>>>>>>>>   uint8_t          uses_global_strings:1;
>>>>>>>>>>>>   uint8_t          little_endian:1;
>>>>>>>>>>>> + uint8_t          nr_register_params;
>>>>>>>>>>>>   uint16_t         language;
>>>>>>>>>>>>   unsigned long    nr_inline_expansions;
>>>>>>>>>>>>   size_t           size_inline_expansions;
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
>>>>>>>>>
>>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>>>>>>>> defined if using a very old elf.h; below works around this
>>>>>>>>>> (dwarves otherwise builds fine on this system).
>>>>>>>>>
>>>>>>>>> Ok, will add it and will test with containers for older distros too.
>>>>>>>>
>>>>>>>> Its on the 'next' branch, so that it gets tested in the libbpf github
>>>>>>>> repo at:
>>>>>>>>
>>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>>>>>>>>
>>>>>>>> It failed yesterday and today due to problems with the installation of
>>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
>>>>>>>> notifications floating by.
>>>>>>>>
>>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
>>>>>>>> iterator that Jiri noticed.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
>>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
>>>>>>> from the BTF representation, and as a result:
>>>>>>>
>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>>>>>>>
>>>>>>> Not sure why I didn't notice this previously.
>>>>>>>
>>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
>>>>>>>
>>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>> {
>>>>>>>         return -EOPNOTSUPP;
>>>>>>> }
>>>>>>>
>>>>>>> looks like this:
>>>>>>>
>>>>>>>    <8af83a2>   DW_AT_external    : 1
>>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
>>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
>>>>>>>     <8af83a9>   DW_AT_decl_column : 5
>>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
>>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
>>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>>>>>>>     <8af83b3>   DW_AT_name        : ctx
>>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
>>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
>>>>>>>     <8af83ba>   DW_AT_decl_column : 51
>>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
>>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
>>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
>>>>>>>     <8af83c7>   DW_AT_decl_column : 61
>>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
>>>>>>>
>>>>>>> ...and because there are no further abstract origin references
>>>>>>> with location information either, we classify it as lacking
>>>>>>> locations for (some of) the parameters, and as a result
>>>>>>> we skip BTF encoding. We can work around that by doing this:
>>>>>>>
>>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>
>>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
>>>>>>
>>>>>> For kfuncs where aregs are unused, please try __used and __may_unused
>>>>>> applied to arguments.
>>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
>>>>>> the way we do in selftests.
>>>>>
>>>>> There is also
>>>>> # define __visible __attribute__((__externally_visible__))
>>>>> that probably fits the best here.
>>>>>
>>>>
>>>> testing thus for seems to show that for x86_64, David's series
>>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
>>>> to cover recently-arrived kfuncs like cpumask) is sufficient
>>>> to avoid resolve_btfids warnings.
>>>
>>> Nice. Alexei -- lmk how you want to proceed. I think using the
>>> __bpf_kfunc macro in the short term (with __used and noinline) is
>>> probably the least controversial way to unblock this, but am open to
>>> other suggestions.
>>
>> Sounds good to me, but sounds like __used and noinline are not
>> enough to address the issues on aarch64?
> 
> Indeed, we'll have to make sure that's also addressed. Alan -- did you
> try Alexei's suggestion to use __weak? Does that fix the issue for
> aarch64? I'm still confused as to why it's only complaining for a small
> subset of kfuncs, which include those that have external linkage.
> 

I finally got to the bottom of the aarch64 issues; there was a 1-line bug
in the changes I made to the DWARF handling code which leads to BTF generation;
it was excluding a bunch of functions incorrectly, marking them as optimized out.
The fix is:

diff --git a/dwarf_loader.c b/dwarf_loader.c
index dba2d37..8364e17 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
                        Dwarf_Op *expr = loc.expr;
 
                        switch (expr->atom) {
-                       case DW_OP_reg1 ... DW_OP_reg31:
+                       case DW_OP_reg0 ... DW_OP_reg31:
                        case DW_OP_breg0 ... DW_OP_breg31:
                                break;
                        default:

...and because reg0 holds the first parameter on aarch64, we were
incorrectly landing in the "default:" of the switch statement
and marking a bunch of functions as optimized out
because we thought their first argument was. Sorry about this,
and thanks for all the suggestions!
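
To illustrate the off-by-one above, here is a minimal standalone sketch
(not the dwarf_loader.c code itself). The opcode values are the ones
assigned by the DWARF standard; the helper names in_register_buggy() and
in_register_fixed() are made up for this example, and the case ranges
are the same GNU C extension the patch uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* DWARF location opcodes, with the values assigned by the DWARF standard. */
enum {
	DW_OP_reg0   = 0x50,
	DW_OP_reg1   = 0x51,
	DW_OP_reg31  = 0x6f,
	DW_OP_breg0  = 0x70,
	DW_OP_breg31 = 0x8f,
};

/* Hypothetical helper: range starts at reg1, so reg0 falls to "default". */
static bool in_register_buggy(uint8_t atom)
{
	switch (atom) {
	case DW_OP_reg1 ... DW_OP_reg31:
	case DW_OP_breg0 ... DW_OP_breg31:
		return true;
	default:
		return false;
	}
}

/* Hypothetical helper: range starts at reg0, covering every register op. */
static bool in_register_fixed(uint8_t atom)
{
	switch (atom) {
	case DW_OP_reg0 ... DW_OP_reg31:
	case DW_OP_breg0 ... DW_OP_breg31:
		return true;
	default:
		return false;
	}
}
```

DWARF register 0 is x0 on aarch64, which carries the first argument,
whereas on x86_64 the first argument is passed in rdi (DWARF register 5).
That would explain why only aarch64 tripped over the missing reg0 case.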

Arnaldo, will I send a v3 series incorporating the above fix
to patch 1?

With this fix in place, prefixing the kfunc functions with

__used noinline

...did the trick to ensure kfuncs were not excluded on x86_64
and aarch64.
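
As a rough sketch of what such a prefix looks like when spelled out as
compiler attributes: the macro EXAMPLE_KFUNC, the type
struct xdp_md_example and the function example_rx_hash() below are
hypothetical names for illustration only, not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a "__used noinline" style kfunc prefix. */
#define EXAMPLE_KFUNC __attribute__((__used__, __noinline__))

/* Opaque placeholder type; the function body never dereferences it. */
struct xdp_md_example;

/*
 * A stub in the style of the kfunc discussed in this thread: both
 * parameters are unused. The attributes ask the compiler to keep the
 * function emitted and out of line.
 */
EXAMPLE_KFUNC int example_rx_hash(const struct xdp_md_example *ctx,
				  unsigned int *hash)
{
	return -EOPNOTSUPP;
}
```

Whether this is enough to keep parameter locations in the DWARF depends
on the compiler and architecture; the testing reported here covers
x86_64 and aarch64 only.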

>>
>>> Yeah, I tend to think we should try to avoid using hidden / visible
>>> attributes given that (to my knowledge) they're really more meant for
>>> controlling whether a symbol is exported from a shared object rather
>>> than controlling what the compiler is doing when it creates the
>>> compilation unit. One could imagine that in an LTO build, the compiler
>>> would still optimize the function regardless of its visibility for that
>>> reason, though it's possible I don't have the full picture.
>>
>> __visible is specifically done to prevent optimization of
>> functions that are externally visible. That should address LTO concerns.
>> We haven't seen LTO messing up anything. Just something to keep in mind.
> 
> Ah, fair enough. I was conflating that with the visibility("...")
> attribute. As you pointed out, __visible is something else entirely, and
> is meant to avoid possible issues with LTO.
> 
> One other option we could consider is enforcing that kfuncs must have
> global linkage and can't be static. If we did that, it seems like
> __visible would be a viable option. Though we'd have to verify that it
> addresses the issue w/ aarch64.
> 


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 13:59                             ` Alan Maguire
@ 2023-02-01 15:02                               ` Arnaldo Carvalho de Melo
  2023-02-01 15:13                                 ` Alan Maguire
  2023-02-01 15:19                                 ` David Vernet
  0 siblings, 2 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-01 15:02 UTC (permalink / raw)
  To: Alan Maguire
  Cc: David Vernet, Alexei Starovoitov, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
> On 01/02/2023 03:02, David Vernet wrote:
> > On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
> >> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
> >>>
> >>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> >>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
> >>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> >>>>> <alexei.starovoitov@gmail.com> wrote:
> >>>>>>
> >>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>>>>>>
> >>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> >>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> >>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> >>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>>>>>>>>> +++ b/dwarves.h
> >>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> >>>>>>>>>>>>   uint8_t          has_addr_info:1;
> >>>>>>>>>>>>   uint8_t          uses_global_strings:1;
> >>>>>>>>>>>>   uint8_t          little_endian:1;
> >>>>>>>>>>>> + uint8_t          nr_register_params;
> >>>>>>>>>>>>   uint16_t         language;
> >>>>>>>>>>>>   unsigned long    nr_inline_expansions;
> >>>>>>>>>>>>   size_t           size_inline_expansions;
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
> >>>>>>>>>
> >>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> >>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
> >>>>>>>>>> defined if using a very old elf.h; below works around this
> >>>>>>>>>> (dwarves otherwise builds fine on this system).
> >>>>>>>>>
> >>>>>>>>> Ok, will add it and will test with containers for older distros too.
> >>>>>>>>
> >>>>>>>> Its on the 'next' branch, so that it gets tested in the libbpf github
> >>>>>>>> repo at:
> >>>>>>>>
> >>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> >>>>>>>>
> >>>>>>>> It failed yesterday and today due to problems with the installation of
> >>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
> >>>>>>>> notifications floating by.
> >>>>>>>>
> >>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
> >>>>>>>> iterator that Jiri noticed.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
> >>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> >>>>>>> from the BTF representation, and as a result:
> >>>>>>>
> >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> >>>>>>>
> >>>>>>> Not sure why I didn't notice this previously.
> >>>>>>>
> >>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
> >>>>>>>
> >>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> >>>>>>> {
> >>>>>>>         return -EOPNOTSUPP;
> >>>>>>> }
> >>>>>>>
> >>>>>>> looks like this:
> >>>>>>>
> >>>>>>>    <8af83a2>   DW_AT_external    : 1
> >>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> >>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
> >>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
> >>>>>>>     <8af83a9>   DW_AT_decl_column : 5
> >>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
> >>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> >>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> >>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> >>>>>>>     <8af83b3>   DW_AT_name        : ctx
> >>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
> >>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
> >>>>>>>     <8af83ba>   DW_AT_decl_column : 51
> >>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> >>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> >>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> >>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
> >>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
> >>>>>>>     <8af83c7>   DW_AT_decl_column : 61
> >>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> >>>>>>>
> >>>>>>> ...and because there are no further abstract origin references
> >>>>>>> with location information either, we classify it as lacking
> >>>>>>> locations for (some of) the parameters, and as a result
> >>>>>>> we skip BTF encoding. We can work around that by doing this:
> >>>>>>>
> >>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> >>>>>>
> >>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
> >>>>>>
> >>>>>> For kfuncs where aregs are unused, please try __used and __may_unused
> >>>>>> applied to arguments.
> >>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
> >>>>>> the way we do in selftests.
> >>>>>
> >>>>> There is also
> >>>>> # define __visible __attribute__((__externally_visible__))
> >>>>> that probably fits the best here.
> >>>>>
> >>>>
> >>>> testing thus for seems to show that for x86_64, David's series
> >>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
> >>>> to cover recently-arrived kfuncs like cpumask) is sufficient
> >>>> to avoid resolve_btfids warnings.
> >>>
> >>> Nice. Alexei -- lmk how you want to proceed. I think using the
> >>> __bpf_kfunc macro in the short term (with __used and noinline) is
> >>> probably the least controversial way to unblock this, but am open to
> >>> other suggestions.
> >>
> >> Sounds good to me, but sounds like __used and noinline are not
> >> enough to address the issues on aarch64?
> > 
> > Indeed, we'll have to make sure that's also addressed. Alan -- did you
> > try Alexei's suggestion to use __weak? Does that fix the issue for
> > aarch64? I'm still confused as to why it's only complaining for a small
> > subset of kfuncs, which include those that have external linkage.
> > 
> 
> I finally got to the bottom of the aarch64 issues; there was a 1-line bug
> in the changes I made to the DWARF handling code which leads to BTF generation;
> it was excluding a bunch of functions incorrectly, marking them as optimized out.
> The fix is:
> 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index dba2d37..8364e17 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>                         Dwarf_Op *expr = loc.expr;
>  
>                         switch (expr->atom) {
> -                       case DW_OP_reg1 ... DW_OP_reg31:
> +                       case DW_OP_reg0 ... DW_OP_reg31:
>                         case DW_OP_breg0 ... DW_OP_breg31:
>                                 break;
>                         default:
> 
> ..and because reg0 is the first parameter for aarch64, we were
> incorrectly landing in the "default:" of the switch statement
> and marking a bunch of functions as optimized out
> because we thought the first argument was. Sorry about this,
> and thanks for all the suggestions!
> 
> Arnaldo, will I send a v3 series incorporating the above fix
> to patch 1?

I can fix it here. Done, I'll force push it to the 'next' branch.

Also, I noticed the param_idx usage in parameter__new(): it can be -1 when
processing:

 <1><2eb2>: Abbrev Number: 18 (DW_TAG_subroutine_type)
    <2eb3>   DW_AT_prototyped  : 1
    <2eb3>   DW_AT_sibling     : <0x2ec2>
 <2><2eb7>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <2eb8>   DW_AT_type        : <0x414>
 <2><2ebc>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <2ebd>   DW_AT_type        : <0x69>
 <2><2ec1>: Abbrev Number: 0

And in that case we don't have the location expression, unlike with an
actual function such as:

  <1><af36>: Abbrev Number: 77 (DW_TAG_subprogram)
    <af37>   DW_AT_external    : 1
    <af37>   DW_AT_name        : (indirect string, offset: 0x4ff7): startup_64_setup_env
    <af3b>   DW_AT_decl_file   : 1
    <af3b>   DW_AT_decl_line   : 592
    <af3d>   DW_AT_decl_column : 13
    <af3e>   DW_AT_prototyped  : 1
    <af3e>   DW_AT_low_pc      : 0xffffffff81000570
    <af46>   DW_AT_high_pc     : 0x6d
    <af4e>   DW_AT_frame_base  : 1 byte block: 9c       (DW_OP_call_frame_cfa)
    <af50>   DW_AT_call_all_calls: 1
    <af50>   DW_AT_sibling     : <0xb11f>
 <2><af54>: Abbrev Number: 67 (DW_TAG_formal_parameter)
    <af55>   DW_AT_name        : (indirect string, offset: 0x2a50d): physbase
    <af59>   DW_AT_decl_file   : 1
    <af59>   DW_AT_decl_line   : 592
    <af5b>   DW_AT_decl_column : 48
    <af5c>   DW_AT_type        : <0x4c>
    <af60>   DW_AT_location    : 0x10 (location list)
    <af64>   DW_AT_GNU_locviews: 0xc

I.e. it's just a function _type_, not an actual function, so I'm applying
this on top of that first patch, ok?

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 7e05fde8a5c3ac26..253c5efaf3b55a93 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1035,7 +1035,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
 
-		if (param_idx >= cu->nr_register_params)
+		if (param_idx >= cu->nr_register_params || param_idx < 0)
 			return parm;
 		/* Parameters which use DW_AT_abstract_origin to point at
 		 * the original parameter definition (with no name in the DIE)


- Arnaldo
 
> With this fix in place, prefixing the kfunc functions with
> 
> __used noinline
> 
> ...did the trick to ensure kfuncs were not excluded on x86_64
> and aarch64.
> 
> >>
> >>> Yeah, I tend to think we should try to avoid using hidden / visible
> >>> attributes given that (to my knowledge) they're really more meant for
> >>> controlling whether a symbol is exported from a shared object rather
> >>> than controlling what the compiler is doing when it creates the
> >>> compilation unit. One could imagine that in an LTO build, the compiler
> >>> would still optimize the function regardless of its visibility for that
> >>> reason, though it's possible I don't have the full picture.
> >>
> >> __visible is specifically done to prevent optimization of
> >> functions that are externally visible. That should address LTO concerns.
> >> We haven't seen LTO messing up anything. Just something to keep in mind.
> > 
> > Ah, fair enough. I was conflating that with the visibility("...")
> > attribute. As you pointed out, __visible is something else entirely, and
> > is meant to avoid possible issues with LTO.
> > 
> > One other option we could consider is enforcing that kfuncs must have
> > global linkage and can't be static. If we did that, it seems like
> > __visible would be a viable option. Though we'd have to verify that it
> > addresses the issue w/ aarch64.
> > 

-- 

- Arnaldo

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 15:02                               ` Arnaldo Carvalho de Melo
@ 2023-02-01 15:13                                 ` Alan Maguire
  2023-02-01 15:19                                 ` David Vernet
  1 sibling, 0 replies; 40+ messages in thread
From: Alan Maguire @ 2023-02-01 15:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: David Vernet, Alexei Starovoitov, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On 01/02/2023 15:02, Arnaldo Carvalho de Melo wrote:
> Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
>> On 01/02/2023 03:02, David Vernet wrote:
>>> On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
>>>> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
>>>>>
>>>>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
>>>>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
>>>>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
>>>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>>>>>
>>>>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>>>> +++ b/dwarves.h
>>>>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>>>>>>>>>   uint8_t          has_addr_info:1;
>>>>>>>>>>>>>>   uint8_t          uses_global_strings:1;
>>>>>>>>>>>>>>   uint8_t          little_endian:1;
>>>>>>>>>>>>>> + uint8_t          nr_register_params;
>>>>>>>>>>>>>>   uint16_t         language;
>>>>>>>>>>>>>>   unsigned long    nr_inline_expansions;
>>>>>>>>>>>>>>   size_t           size_inline_expansions;
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
>>>>>>>>>>>
>>>>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>>>>>>>>>> defined if using a very old elf.h; below works around this
>>>>>>>>>>>> (dwarves otherwise builds fine on this system).
>>>>>>>>>>>
>>>>>>>>>>> Ok, will add it and will test with containers for older distros too.
>>>>>>>>>>
>>>>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
>>>>>>>>>> repo at:
>>>>>>>>>>
>>>>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>>>>>>>>>>
>>>>>>>>>> It failed yesterday and today due to problems with the installation of
>>>>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
>>>>>>>>>> notifications floating by.
>>>>>>>>>>
>>>>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
>>>>>>>>>> iterator that Jiri noticed.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
>>>>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
>>>>>>>>> from the BTF representation, and as a result:
>>>>>>>>>
>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>>>>>>>>>
>>>>>>>>> Not sure why I didn't notice this previously.
>>>>>>>>>
>>>>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
>>>>>>>>>
>>>>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>>>> {
>>>>>>>>>         return -EOPNOTSUPP;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> looks like this:
>>>>>>>>>
>>>>>>>>>    <8af83a2>   DW_AT_external    : 1
>>>>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>>>>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
>>>>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
>>>>>>>>>     <8af83a9>   DW_AT_decl_column : 5
>>>>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
>>>>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
>>>>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>>>>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>>>>>>>>>     <8af83b3>   DW_AT_name        : ctx
>>>>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
>>>>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
>>>>>>>>>     <8af83ba>   DW_AT_decl_column : 51
>>>>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
>>>>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>>>>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>>>>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
>>>>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
>>>>>>>>>     <8af83c7>   DW_AT_decl_column : 61
>>>>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
>>>>>>>>>
>>>>>>>>> ...and because there are no further abstract origin references
>>>>>>>>> with location information either, we classify it as lacking
>>>>>>>>> locations for (some of) the parameters, and as a result
>>>>>>>>> we skip BTF encoding. We can work around that by doing this:
>>>>>>>>>
>>>>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>>>
>>>>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
>>>>>>>>
>>>>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
>>>>>>>> applied to arguments.
>>>>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
>>>>>>>> the way we do in selftests.
>>>>>>>
>>>>>>> There is also
>>>>>>> # define __visible __attribute__((__externally_visible__))
>>>>>>> that probably fits the best here.
>>>>>>>
>>>>>>
>>>>>> testing thus far seems to show that for x86_64, David's series
>>>>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
>>>>>> to cover recently-arrived kfuncs like cpumask) is sufficient
>>>>>> to avoid resolve_btfids warnings.
>>>>>
>>>>> Nice. Alexei -- lmk how you want to proceed. I think using the
>>>>> __bpf_kfunc macro in the short term (with __used and noinline) is
>>>>> probably the least controversial way to unblock this, but am open to
>>>>> other suggestions.
>>>>
>>>> Sounds good to me, but sounds like __used and noinline are not
>>>> enough to address the issues on aarch64?
>>>
>>> Indeed, we'll have to make sure that's also addressed. Alan -- did you
>>> try Alexei's suggestion to use __weak? Does that fix the issue for
>>> aarch64? I'm still confused as to why it's only complaining for a small
>>> subset of kfuncs, which include those that have external linkage.
>>>
>>
>> I finally got to the bottom of the aarch64 issues; there was a 1-line bug
>> in the changes I made to the DWARF handling code which leads to BTF generation;
>> it was excluding a bunch of functions incorrectly, marking them as optimized out.
>> The fix is:
>>
>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>> index dba2d37..8364e17 100644
>> --- a/dwarf_loader.c
>> +++ b/dwarf_loader.c
>> @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>                         Dwarf_Op *expr = loc.expr;
>>  
>>                         switch (expr->atom) {
>> -                       case DW_OP_reg1 ... DW_OP_reg31:
>> +                       case DW_OP_reg0 ... DW_OP_reg31:
>>                         case DW_OP_breg0 ... DW_OP_breg31:
>>                                 break;
>>                         default:
>>
>> ..and because reg0 is the first parameter for aarch64, we were
>> incorrectly landing in the "default:" of the switch statement
>> and marking a bunch of functions as optimized out
>> because we thought the first argument was. Sorry about this,
>> and thanks for all the suggestions!
>>
>> Arnaldo, will I send a v3 series incorporating the above fix
>> to patch 1?
> 
> I can fix it here. Done, I'll force push it to the 'next' branch.
> 
> Also I noted the param_idx usage in parameter__new(), it can be -1 when
> processing:
> 
>  <1><2eb2>: Abbrev Number: 18 (DW_TAG_subroutine_type)
>     <2eb3>   DW_AT_prototyped  : 1
>     <2eb3>   DW_AT_sibling     : <0x2ec2>
>  <2><2eb7>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <2eb8>   DW_AT_type        : <0x414>
>  <2><2ebc>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <2ebd>   DW_AT_type        : <0x69>
>  <2><2ec1>: Abbrev Number: 0
> 
>  And in that case we don't have the location expression:
> 
>   <1><af36>: Abbrev Number: 77 (DW_TAG_subprogram)
>     <af37>   DW_AT_external    : 1
>     <af37>   DW_AT_name        : (indirect string, offset: 0x4ff7): startup_64_setup_env
>     <af3b>   DW_AT_decl_file   : 1
>     <af3b>   DW_AT_decl_line   : 592
>     <af3d>   DW_AT_decl_column : 13
>     <af3e>   DW_AT_prototyped  : 1
>     <af3e>   DW_AT_low_pc      : 0xffffffff81000570
>     <af46>   DW_AT_high_pc     : 0x6d
>     <af4e>   DW_AT_frame_base  : 1 byte block: 9c       (DW_OP_call_frame_cfa)
>     <af50>   DW_AT_call_all_calls: 1
>     <af50>   DW_AT_sibling     : <0xb11f>
>  <2><af54>: Abbrev Number: 67 (DW_TAG_formal_parameter)
>     <af55>   DW_AT_name        : (indirect string, offset: 0x2a50d): physbase
>     <af59>   DW_AT_decl_file   : 1
>     <af59>   DW_AT_decl_line   : 592
>     <af5b>   DW_AT_decl_column : 48
>     <af5c>   DW_AT_type        : <0x4c>
>     <af60>   DW_AT_location    : 0x10 (location list)
>     <af64>   DW_AT_GNU_locviews: 0xc
> 
> I.e. it's just a function _type_, not an actual function, so I'm applying
> this on top of that first patch, ok?
> 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 7e05fde8a5c3ac26..253c5efaf3b55a93 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -1035,7 +1035,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>  		tag__init(&parm->tag, cu, die);
>  		parm->name = attr_string(die, DW_AT_name, conf);
>  
> -		if (param_idx >= cu->nr_register_params)
> +		if (param_idx >= cu->nr_register_params || param_idx < 0)
>  			return parm;
>  		/* Parameters which use DW_AT_abstract_origin to point at
>  		 * the original parameter definition (with no name in the DIE)
> 
>

Ah, great catch. Thanks again!

Alan
 
> - Arnaldo
>  
>> With this fix in place, prefixing the kfunc functions with
>>
>> __used noinline
>>
>> ...did the trick to ensure kfuncs were not excluded on x86_64
>> and aarch64.
>>
>>>>
>>>>> Yeah, I tend to think we should try to avoid using hidden / visible
>>>>> attributes given that (to my knowledge) they're really more meant for
>>>>> controlling whether a symbol is exported from a shared object rather
>>>>> than controlling what the compiler is doing when it creates the
>>>>> compilation unit. One could imagine that in an LTO build, the compiler
>>>>> would still optimize the function regardless of its visibility for that
>>>>> reason, though it's possible I don't have the full picture.
>>>>
>>>> __visible is specifically done to prevent optimization of
>>>> functions that are externally visible. That should address LTO concerns.
>>>> We haven't seen LTO messing up anything. Just something to keep in mind.
>>>
>>> Ah, fair enough. I was conflating that with the visibility("...")
>>> attribute. As you pointed out, __visible is something else entirely, and
>>> is meant to avoid possible issues with LTO.
>>>
>>> One other option we could consider is enforcing that kfuncs must have
>>> global linkage and can't be static. If we did that, it seems like
>>> __visible would be a viable option. Though we'd have to verify that it
>>> addresses the issue w/ aarch64.
>>>
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 15:02                               ` Arnaldo Carvalho de Melo
  2023-02-01 15:13                                 ` Alan Maguire
@ 2023-02-01 15:19                                 ` David Vernet
  2023-02-01 16:49                                   ` Alexei Starovoitov
  1 sibling, 1 reply; 40+ messages in thread
From: David Vernet @ 2023-02-01 15:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alan Maguire, Alexei Starovoitov, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Wed, Feb 01, 2023 at 12:02:07PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
> > On 01/02/2023 03:02, David Vernet wrote:
> > > On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
> > >> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
> > >>>
> > >>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> > >>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > >>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > >>>>> <alexei.starovoitov@gmail.com> wrote:
> > >>>>>>
> > >>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >>>>>>>
> > >>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > >>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > >>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > >>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > >>>>>>>>>>>> +++ b/dwarves.h
> > >>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> > >>>>>>>>>>>>   uint8_t          has_addr_info:1;
> > >>>>>>>>>>>>   uint8_t          uses_global_strings:1;
> > >>>>>>>>>>>>   uint8_t          little_endian:1;
> > >>>>>>>>>>>> + uint8_t          nr_register_params;
> > >>>>>>>>>>>>   uint16_t         language;
> > >>>>>>>>>>>>   unsigned long    nr_inline_expansions;
> > >>>>>>>>>>>>   size_t           size_inline_expansions;
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
> > >>>>>>>>>
> > >>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> > >>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
> > >>>>>>>>>> defined if using a very old elf.h; below works around this
> > >>>>>>>>>> (dwarves otherwise builds fine on this system).
> > >>>>>>>>>
> > >>>>>>>>> Ok, will add it and will test with containers for older distros too.
> > >>>>>>>>
> > >>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
> > >>>>>>>> repo at:
> > >>>>>>>>
> > >>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > >>>>>>>>
> > >>>>>>>> It failed yesterday and today due to problems with the installation of
> > >>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
> > >>>>>>>> notifications floating by.
> > >>>>>>>>
> > >>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
> > >>>>>>>> iterator that Jiri noticed.
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
> > >>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> > >>>>>>> from the BTF representation, and as a result:
> > >>>>>>>
> > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> > >>>>>>>
> > >>>>>>> Not sure why I didn't notice this previously.
> > >>>>>>>
> > >>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
> > >>>>>>>
> > >>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > >>>>>>> {
> > >>>>>>>         return -EOPNOTSUPP;
> > >>>>>>> }
> > >>>>>>>
> > >>>>>>> looks like this:
> > >>>>>>>
> > >>>>>>>    <8af83a2>   DW_AT_external    : 1
> > >>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> > >>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
> > >>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
> > >>>>>>>     <8af83a9>   DW_AT_decl_column : 5
> > >>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
> > >>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> > >>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > >>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> > >>>>>>>     <8af83b3>   DW_AT_name        : ctx
> > >>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
> > >>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
> > >>>>>>>     <8af83ba>   DW_AT_decl_column : 51
> > >>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> > >>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> > >>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> > >>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
> > >>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
> > >>>>>>>     <8af83c7>   DW_AT_decl_column : 61
> > >>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> > >>>>>>>
> > >>>>>>> ...and because there are no further abstract origin references
> > >>>>>>> with location information either, we classify it as lacking
> > >>>>>>> locations for (some of) the parameters, and as a result
> > >>>>>>> we skip BTF encoding. We can work around that by doing this:
> > >>>>>>>
> > >>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > >>>>>>
> > >>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
> > >>>>>>
> > >>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
> > >>>>>> applied to arguments.
> > >>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
> > >>>>>> the way we do in selftests.
> > >>>>>
> > >>>>> There is also
> > >>>>> # define __visible __attribute__((__externally_visible__))
> > >>>>> that probably fits the best here.
> > >>>>>
> > >>>>
> > >>>> testing thus far seems to show that for x86_64, David's series
> > >>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
> > >>>> to cover recently-arrived kfuncs like cpumask) is sufficient
> > >>>> to avoid resolve_btfids warnings.
> > >>>
> > >>> Nice. Alexei -- lmk how you want to proceed. I think using the
> > >>> __bpf_kfunc macro in the short term (with __used and noinline) is
> > >>> probably the least controversial way to unblock this, but am open to
> > >>> other suggestions.
> > >>
> > >> Sounds good to me, but sounds like __used and noinline are not
> > >> enough to address the issues on aarch64?
> > > 
> > > Indeed, we'll have to make sure that's also addressed. Alan -- did you
> > > try Alexei's suggestion to use __weak? Does that fix the issue for
> > > aarch64? I'm still confused as to why it's only complaining for a small
> > > subset of kfuncs, which include those that have external linkage.
> > > 
> > 
> > I finally got to the bottom of the aarch64 issues; there was a 1-line bug
> > in the changes I made to the DWARF handling code which leads to BTF generation;
> > it was excluding a bunch of functions incorrectly, marking them as optimized out.
> > The fix is:
> > 
> > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > index dba2d37..8364e17 100644
> > --- a/dwarf_loader.c
> > +++ b/dwarf_loader.c
> > @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
> >                         Dwarf_Op *expr = loc.expr;
> >  
> >                         switch (expr->atom) {
> > -                       case DW_OP_reg1 ... DW_OP_reg31:
> > +                       case DW_OP_reg0 ... DW_OP_reg31:
> >                         case DW_OP_breg0 ... DW_OP_breg31:
> >                                 break;
> >                         default:
> > 
> > ..and because reg0 is the first parameter for aarch64, we were
> > incorrectly landing in the "default:" of the switch statement
> > and marking a bunch of functions as optimized out
> > because we thought the first argument was. Sorry about this,
> > and thanks for all the suggestions!
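
For anyone following along, a standalone sketch of the off-by-one in
that case range. The opcode values are the ones assigned by the DWARF
standard, defined locally so this builds without dwarf.h; the two
function names are hypothetical and this is not the pahole code. Case
ranges are a GNU C extension, so this assumes gcc or clang. On x86_64
the first argument register (%rdi) is DWARF register 5, which would
explain why only aarch64, where x0 is DWARF register 0, tripped over
it:

```c
#include <stdbool.h>
#include <stdint.h>

/* DWARF location opcodes, values as assigned by the DWARF standard. */
enum {
	DW_OP_reg0   = 0x50,
	DW_OP_reg1   = 0x51,
	DW_OP_reg31  = 0x6f,
	DW_OP_breg0  = 0x70,
	DW_OP_breg31 = 0x8f,
};

/* Range starts at reg1: a parameter living in register 0 lands in
 * "default" and is reported as not being in a register. */
bool in_register_buggy(uint8_t atom)
{
	switch (atom) {
	case DW_OP_reg1 ... DW_OP_reg31:
	case DW_OP_breg0 ... DW_OP_breg31:
		return true;
	default:
		return false;
	}
}

/* Range starts at reg0, covering all 32 register opcodes. */
bool in_register_fixed(uint8_t atom)
{
	switch (atom) {
	case DW_OP_reg0 ... DW_OP_reg31:
	case DW_OP_breg0 ... DW_OP_breg31:
		return true;
	default:
		return false;
	}
}
```

The two functions differ only for DW_OP_reg0, which is exactly the
opcode a first argument in x0 would use.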

Great, so noinline and __used with __bpf_kfunc sounds like the way forward
in the short term. Arnaldo / Alexei -- how do you want to resolve the
dependency here? Going through bpf-next is probably a good idea so that
we get proper CI coverage, and any kfuncs added to bpf-next after this
can use the macro. Does that work for you?

> > 
> > Arnaldo, will I send a v3 series incorporating the above fix
> > to patch 1?
> 
> I can fix it here. Done, I'll force push it to the 'next' branch.
> 
> Also I noted the param_idx usage in parameter__new(), it can be -1 when
> processing:
> 
>  <1><2eb2>: Abbrev Number: 18 (DW_TAG_subroutine_type)
>     <2eb3>   DW_AT_prototyped  : 1
>     <2eb3>   DW_AT_sibling     : <0x2ec2>
>  <2><2eb7>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <2eb8>   DW_AT_type        : <0x414>
>  <2><2ebc>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <2ebd>   DW_AT_type        : <0x69>
>  <2><2ec1>: Abbrev Number: 0
> 
>  And in that case we don't have the location expression:
> 
>   <1><af36>: Abbrev Number: 77 (DW_TAG_subprogram)
>     <af37>   DW_AT_external    : 1
>     <af37>   DW_AT_name        : (indirect string, offset: 0x4ff7): startup_64_setup_env
>     <af3b>   DW_AT_decl_file   : 1
>     <af3b>   DW_AT_decl_line   : 592
>     <af3d>   DW_AT_decl_column : 13
>     <af3e>   DW_AT_prototyped  : 1
>     <af3e>   DW_AT_low_pc      : 0xffffffff81000570
>     <af46>   DW_AT_high_pc     : 0x6d
>     <af4e>   DW_AT_frame_base  : 1 byte block: 9c       (DW_OP_call_frame_cfa)
>     <af50>   DW_AT_call_all_calls: 1
>     <af50>   DW_AT_sibling     : <0xb11f>
>  <2><af54>: Abbrev Number: 67 (DW_TAG_formal_parameter)
>     <af55>   DW_AT_name        : (indirect string, offset: 0x2a50d): physbase
>     <af59>   DW_AT_decl_file   : 1
>     <af59>   DW_AT_decl_line   : 592
>     <af5b>   DW_AT_decl_column : 48
>     <af5c>   DW_AT_type        : <0x4c>
>     <af60>   DW_AT_location    : 0x10 (location list)
>     <af64>   DW_AT_GNU_locviews: 0xc
> 
> I.e. it's just a function _type_, not an actual function, so I'm applying
> this on top of that first patch, ok?
> 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 7e05fde8a5c3ac26..253c5efaf3b55a93 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -1035,7 +1035,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>  		tag__init(&parm->tag, cu, die);
>  		parm->name = attr_string(die, DW_AT_name, conf);
>  
> -		if (param_idx >= cu->nr_register_params)
> +		if (param_idx >= cu->nr_register_params || param_idx < 0)
>  			return parm;
>  		/* Parameters which use DW_AT_abstract_origin to point at
>  		 * the original parameter definition (with no name in the DIE)
> 
> 
> - Arnaldo
>  
> > With this fix in place, prefixing the kfunc functions with
> > 
> > __used noinline
> > 
> > ...did the trick to ensure kfuncs were not excluded on x86_64
> > and aarch64.

[...]

Thanks,
David

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 15:19                                 ` David Vernet
@ 2023-02-01 16:49                                   ` Alexei Starovoitov
  2023-02-01 17:01                                     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Alexei Starovoitov @ 2023-02-01 16:49 UTC (permalink / raw)
  To: David Vernet
  Cc: Arnaldo Carvalho de Melo, Alan Maguire, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On Wed, Feb 1, 2023 at 7:19 AM David Vernet <void@manifault.com> wrote:
>
> On Wed, Feb 01, 2023 at 12:02:07PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
> > > On 01/02/2023 03:02, David Vernet wrote:
> > > > On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
> > > >> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
> > > >>>
> > > >>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> > > >>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > > >>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > > >>>>> <alexei.starovoitov@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > >>>>>>>
> > > >>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > > >>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > >>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > > >>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > > >>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > >>>>>>>>>>>> +++ b/dwarves.h
> > > >>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> > > >>>>>>>>>>>>   uint8_t          has_addr_info:1;
> > > >>>>>>>>>>>>   uint8_t          uses_global_strings:1;
> > > >>>>>>>>>>>>   uint8_t          little_endian:1;
> > > >>>>>>>>>>>> + uint8_t          nr_register_params;
> > > >>>>>>>>>>>>   uint16_t         language;
> > > >>>>>>>>>>>>   unsigned long    nr_inline_expansions;
> > > >>>>>>>>>>>>   size_t           size_inline_expansions;
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
> > > >>>>>>>>>
> > > >>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> > > >>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
> > > >>>>>>>>>> defined if using a very old elf.h; below works around this
> > > >>>>>>>>>> (dwarves otherwise builds fine on this system).
> > > >>>>>>>>>
> > > >>>>>>>>> Ok, will add it and will test with containers for older distros too.
> > > >>>>>>>>
> > > >>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
> > > >>>>>>>> repo at:
> > > >>>>>>>>
> > > >>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > > >>>>>>>>
> > > >>>>>>>> It failed yesterday and today due to problems with the installation of
> > > >>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
> > > >>>>>>>> notifications floating by.
> > > >>>>>>>>
> > > >>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
> > > >>>>>>>> iterator that Jiri noticed.
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
> > > >>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> > > >>>>>>> from the BTF representation, and as a result:
> > > >>>>>>>
> > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> > > >>>>>>>
> > > >>>>>>> Not sure why I didn't notice this previously.
> > > >>>>>>>
> > > >>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
> > > >>>>>>>
> > > >>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > >>>>>>> {
> > > >>>>>>>         return -EOPNOTSUPP;
> > > >>>>>>> }
> > > >>>>>>>
> > > >>>>>>> looks like this:
> > > >>>>>>>
> > > >>>>>>>    <8af83a2>   DW_AT_external    : 1
> > > >>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> > > >>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
> > > >>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
> > > >>>>>>>     <8af83a9>   DW_AT_decl_column : 5
> > > >>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
> > > >>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> > > >>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > > >>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> > > >>>>>>>     <8af83b3>   DW_AT_name        : ctx
> > > >>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
> > > >>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
> > > >>>>>>>     <8af83ba>   DW_AT_decl_column : 51
> > > >>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> > > >>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> > > >>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> > > >>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
> > > >>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
> > > >>>>>>>     <8af83c7>   DW_AT_decl_column : 61
> > > >>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> > > >>>>>>>
> > > >>>>>>> ...and because there are no further abstract origin references
> > > >>>>>>> with location information either, we classify it as lacking
> > > >>>>>>> locations for (some of) the parameters, and as a result
> > > >>>>>>> we skip BTF encoding. We can work around that by doing this:
> > > >>>>>>>
> > > >>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > >>>>>>
> > > >>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
> > > >>>>>>
> > > >>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
> > > >>>>>> applied to arguments.
> > > >>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
> > > >>>>>> the way we do in selftests.
> > > >>>>>
> > > >>>>> There is also
> > > >>>>> # define __visible __attribute__((__externally_visible__))
> > > >>>>> that probably fits the best here.
> > > >>>>>
> > > >>>>
> > > >>>> testing thus far seems to show that for x86_64, David's series
> > > >>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
> > > >>>> to cover recently-arrived kfuncs like cpumask) is sufficient
> > > >>>> to avoid resolve_btfids warnings.
> > > >>>
> > > >>> Nice. Alexei -- lmk how you want to proceed. I think using the
> > > >>> __bpf_kfunc macro in the short term (with __used and noinline) is
> > > >>> probably the least controversial way to unblock this, but am open to
> > > >>> other suggestions.
> > > >>
> > > >> Sounds good to me, but sounds like __used and noinline are not
> > > >> enough to address the issues on aarch64?
> > > >
> > > > Indeed, we'll have to make sure that's also addressed. Alan -- did you
> > > > try Alexei's suggestion to use __weak? Does that fix the issue for
> > > > aarch64? I'm still confused as to why it's only complaining for a small
> > > > subset of kfuncs, which include those that have external linkage.
> > > >
> > >
> > > I finally got to the bottom of the aarch64 issues; there was a 1-line bug
> > > in the changes I made to the DWARF handling code which leads to BTF generation;
> > > it was excluding a bunch of functions incorrectly, marking them as optimized out.
> > > The fix is:
> > >
> > > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > > index dba2d37..8364e17 100644
> > > --- a/dwarf_loader.c
> > > +++ b/dwarf_loader.c
> > > @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
> > >                         Dwarf_Op *expr = loc.expr;
> > >
> > >                         switch (expr->atom) {
> > > -                       case DW_OP_reg1 ... DW_OP_reg31:
> > > +                       case DW_OP_reg0 ... DW_OP_reg31:
> > >                         case DW_OP_breg0 ... DW_OP_breg31:
> > >                                 break;
> > >                         default:
> > >
> > > ..and because reg0 is the first parameter for aarch64, we were
> > > incorrectly landing in the "default:" of the switch statement
> > > and marking a bunch of functions as optimized out
> > > because we thought the first argument was. Sorry about this,
> > > and thanks for all the suggestions!
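
A minimal sketch of why the one-line change above matters, assuming the
opcode values from the DWARF specification (DW_OP_reg0 is 0x50). The helper
names are hypothetical stand-ins for the switch in parameter__new(), not
pahole code, and the case ranges are the same GNU extension the original uses:

```c
#include <assert.h>
#include <stdbool.h>

/* Opcode values as assigned by the DWARF specification. */
enum {
	DW_OP_reg0   = 0x50,
	DW_OP_reg1   = 0x51,
	DW_OP_reg31  = 0x6f,
	DW_OP_breg0  = 0x70,
	DW_OP_breg31 = 0x8f,
};

/* Hypothetical helper: the range before the fix, which starts at reg1. */
static bool in_register_buggy(unsigned int atom)
{
	switch (atom) {
	case DW_OP_reg1 ... DW_OP_reg31:
	case DW_OP_breg0 ... DW_OP_breg31:
		return true;
	default:
		return false;	/* parameter would be treated as optimized out */
	}
}

/* Hypothetical helper: the range after the fix, which includes reg0. */
static bool in_register_fixed(unsigned int atom)
{
	switch (atom) {
	case DW_OP_reg0 ... DW_OP_reg31:
	case DW_OP_breg0 ... DW_OP_breg31:
		return true;
	default:
		return false;
	}
}

/* Usage check: only the reg0 case differs between the two ranges. */
static void reg_range_selfcheck(void)
{
	assert(!in_register_buggy(DW_OP_reg0));
	assert(in_register_fixed(DW_OP_reg0));
	assert(in_register_buggy(DW_OP_reg1) && in_register_fixed(DW_OP_reg1));
}
```

On x86_64 the first integer argument lives in rdi, which is DWARF register 5,
so the narrower range would not have misclassified it there.
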
>
> Great, so noinline and __used with __bpf_kfunc sounds like the way forward
> in the short term. Arnaldo / Alexei -- how do you want to resolve the
> dependency here? Going through bpf-next is probably a good idea so that
> we get proper CI coverage, and any kfuncs added to bpf-next after this
> can use the macro. Does that work for you?

It feels like the pahole fix should go in under some flag;
otherwise, when people update pahole, existing and older
kernels might stop building, with warnings like:
WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
...

Arnaldo, could you check which warnings you see with this fixed pahole
in the bpf tree?
If there are only a few warnings, we can manually add __used noinline
in those places, push to the bpf tree, and push to stable.

Then in bpf-next we can clean up everything with __bpf_kfunc.
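
For reference, a sketch of what manually adding the attributes to one of the
kfuncs named above could look like. The macro expansions are written out so
the snippet builds outside the kernel tree (an assumption; in the kernel they
come from the compiler attribute headers), struct xdp_md is left opaque, and
the self-check function is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Written out here for a standalone build; the kernel provides its own. */
#ifndef __used
#define __used   __attribute__((__used__))
#endif
#ifndef noinline
#define noinline __attribute__((__noinline__))
#endif

struct xdp_md;			/* opaque: only a pointer is passed */
typedef unsigned int u32;

/* Ask the compiler to emit the function and to keep it out of line. */
__used noinline
int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
	return -EOPNOTSUPP;
}

/* Usage check: the stub is still callable and reports "not supported". */
static void kfunc_stub_selfcheck(void)
{
	assert(bpf_xdp_metadata_rx_hash(NULL, NULL) == -EOPNOTSUPP);
}
```

Whether this is enough to keep parameter locations in the DWARF depends on
the compiler and architecture; the x86_64 testing reported earlier in the
thread suggests it is sufficient there.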


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 16:49                                   ` Alexei Starovoitov
@ 2023-02-01 17:01                                     ` Arnaldo Carvalho de Melo
  2023-02-01 17:18                                       ` Alan Maguire
  2023-02-01 22:32                                       ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-01 17:01 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Alan Maguire, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

Em Wed, Feb 01, 2023 at 08:49:07AM -0800, Alexei Starovoitov escreveu:
> On Wed, Feb 1, 2023 at 7:19 AM David Vernet <void@manifault.com> wrote:
> >
> > On Wed, Feb 01, 2023 at 12:02:07PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
> > > > On 01/02/2023 03:02, David Vernet wrote:
> > > > > On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
> > > > >> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
> > > > >>>
> > > > >>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> > > > >>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > > > >>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > > > >>>>> <alexei.starovoitov@gmail.com> wrote:
> > > > >>>>>>
> > > > >>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > > >>>>>>>
> > > > >>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > > > >>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > >>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > > > >>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > > > >>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > >>>>>>>>>>>> +++ b/dwarves.h
> > > > >>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> > > > >>>>>>>>>>>>   uint8_t          has_addr_info:1;
> > > > >>>>>>>>>>>>   uint8_t          uses_global_strings:1;
> > > > >>>>>>>>>>>>   uint8_t          little_endian:1;
> > > > >>>>>>>>>>>> + uint8_t          nr_register_params;
> > > > >>>>>>>>>>>>   uint16_t         language;
> > > > >>>>>>>>>>>>   unsigned long    nr_inline_expansions;
> > > > >>>>>>>>>>>>   size_t           size_inline_expansions;
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> > > > >>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
> > > > >>>>>>>>>> defined if using a very old elf.h; below works around this
> > > > >>>>>>>>>> (dwarves otherwise builds fine on this system).
> > > > >>>>>>>>>
> > > > >>>>>>>>> Ok, will add it and will test with containers for older distros too.
> > > > >>>>>>>>
> > > > >>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
> > > > >>>>>>>> repo at:
> > > > >>>>>>>>
> > > > >>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > > > >>>>>>>>
> > > > >>>>>>>> It failed yesterday and today due to problems with the installation of
> > > > >>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
> > > > >>>>>>>> notifications floating by.
> > > > >>>>>>>>
> > > > >>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
> > > > >>>>>>>> iterator that Jiri noticed.
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
> > > > >>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> > > > >>>>>>> from the BTF representation, and as a result:
> > > > >>>>>>>
> > > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> > > > >>>>>>>
> > > > >>>>>>> Not sure why I didn't notice this previously.
> > > > >>>>>>>
> > > > >>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
> > > > >>>>>>>
> > > > >>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > > >>>>>>> {
> > > > >>>>>>>         return -EOPNOTSUPP;
> > > > >>>>>>> }
> > > > >>>>>>>
> > > > >>>>>>> looks like this:
> > > > >>>>>>>
> > > > >>>>>>>    <8af83a2>   DW_AT_external    : 1
> > > > >>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> > > > >>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
> > > > >>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
> > > > >>>>>>>     <8af83a9>   DW_AT_decl_column : 5
> > > > >>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
> > > > >>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> > > > >>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > > > >>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> > > > >>>>>>>     <8af83b3>   DW_AT_name        : ctx
> > > > >>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
> > > > >>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
> > > > >>>>>>>     <8af83ba>   DW_AT_decl_column : 51
> > > > >>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> > > > >>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> > > > >>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> > > > >>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
> > > > >>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
> > > > >>>>>>>     <8af83c7>   DW_AT_decl_column : 61
> > > > >>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> > > > >>>>>>>
> > > > >>>>>>> ...and because there are no further abstract origin references
> > > > >>>>>>> with location information either, we classify it as lacking
> > > > >>>>>>> locations for (some of) the parameters, and as a result
> > > > >>>>>>> we skip BTF encoding. We can work around that by doing this:
> > > > >>>>>>>
> > > > >>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > > >>>>>>
> > > > >>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
> > > > >>>>>>
> > > > >>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
> > > > >>>>>> applied to arguments.
> > > > >>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
> > > > >>>>>> the way we do in selftests.
> > > > >>>>>
> > > > >>>>> There is also
> > > > >>>>> # define __visible __attribute__((__externally_visible__))
> > > > >>>>> that probably fits the best here.
> > > > >>>>>
> > > > >>>>
> > > > >>>> Testing thus far seems to show that for x86_64, David's series
> > > > >>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
> > > > >>>> to cover recently-arrived kfuncs like cpumask) is sufficient
> > > > >>>> to avoid resolve_btfids warnings.
> > > > >>>
> > > > >>> Nice. Alexei -- lmk how you want to proceed. I think using the
> > > > >>> __bpf_kfunc macro in the short term (with __used and noinline) is
> > > > >>> probably the least controversial way to unblock this, but am open to
> > > > >>> other suggestions.
> > > > >>
> > > > >> Sounds good to me, but sounds like __used and noinline are not
> > > > >> enough to address the issues on aarch64?
> > > > >
> > > > > Indeed, we'll have to make sure that's also addressed. Alan -- did you
> > > > > try Alexei's suggestion to use __weak? Does that fix the issue for
> > > > > aarch64? I'm still confused as to why it's only complaining for a small
> > > > > subset of kfuncs, which include those that have external linkage.
> > > > >
> > > >
> > > > I finally got to the bottom of the aarch64 issues; there was a 1-line bug
> > > > in the changes I made to the DWARF handling code which leads to BTF generation;
> > > > it was excluding a bunch of functions incorrectly, marking them as optimized out.
> > > > The fix is:
> > > >
> > > > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > > > index dba2d37..8364e17 100644
> > > > --- a/dwarf_loader.c
> > > > +++ b/dwarf_loader.c
> > > > @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
> > > >                         Dwarf_Op *expr = loc.expr;
> > > >
> > > >                         switch (expr->atom) {
> > > > -                       case DW_OP_reg1 ... DW_OP_reg31:
> > > > +                       case DW_OP_reg0 ... DW_OP_reg31:
> > > >                         case DW_OP_breg0 ... DW_OP_breg31:
> > > >                                 break;
> > > >                         default:
> > > >
> > > > ..and because reg0 is the first parameter for aarch64, we were
> > > > incorrectly landing in the "default:" of the switch statement
> > > > and marking a bunch of functions as optimized out
> > > > because we thought the first argument was. Sorry about this,
> > > > and thanks for all the suggestions!
> >
> > Great, so noinline and __used with __bpf_kfunc sounds like the way forward
> > in the short term. Arnaldo / Alexei -- how do you want to resolve the
> > dependency here? Going through bpf-next is probably a good idea so that
> > we get proper CI coverage, and any kfuncs added to bpf-next after this
> > can use the macro. Does that work for you?
> 
> It feels like the pahole fix should go in under some flag;
> otherwise, when people update pahole, existing and older
> kernels might stop building, with warnings like:
> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> ...
> 
> Arnaldo, could you check which warnings you see with this fixed pahole
> in the bpf tree?

Sure.

> If there are only a few warnings, we can manually add __used noinline
> in those places, push to the bpf tree, and push to stable.
> 
> Then in bpf-next we can clean up everything with __bpf_kfunc.

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 17:01                                     ` Arnaldo Carvalho de Melo
@ 2023-02-01 17:18                                       ` Alan Maguire
  2023-02-01 18:54                                         ` Arnaldo Carvalho de Melo
  2023-02-01 22:33                                         ` Alan Maguire
  2023-02-01 22:32                                       ` Arnaldo Carvalho de Melo
  1 sibling, 2 replies; 40+ messages in thread
From: Alan Maguire @ 2023-02-01 17:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Alexei Starovoitov
  Cc: David Vernet, Yonghong Song, Alexei Starovoitov, Jiri Olsa,
	Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On 01/02/2023 17:01, Arnaldo Carvalho de Melo wrote:
> Em Wed, Feb 01, 2023 at 08:49:07AM -0800, Alexei Starovoitov escreveu:
>> On Wed, Feb 1, 2023 at 7:19 AM David Vernet <void@manifault.com> wrote:
>>>
>>> On Wed, Feb 01, 2023 at 12:02:07PM -0300, Arnaldo Carvalho de Melo wrote:
>>>> Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
>>>>> On 01/02/2023 03:02, David Vernet wrote:
>>>>>> On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
>>>>>>> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
>>>>>>>>
>>>>>>>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
>>>>>>>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
>>>>>>>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
>>>>>>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>>>>>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>>>>>>> +++ b/dwarves.h
>>>>>>>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>>>>>>>>>>>>   uint8_t          has_addr_info:1;
>>>>>>>>>>>>>>>>>   uint8_t          uses_global_strings:1;
>>>>>>>>>>>>>>>>>   uint8_t          little_endian:1;
>>>>>>>>>>>>>>>>> + uint8_t          nr_register_params;
>>>>>>>>>>>>>>>>>   uint16_t         language;
>>>>>>>>>>>>>>>>>   unsigned long    nr_inline_expansions;
>>>>>>>>>>>>>>>>>   size_t           size_inline_expansions;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>>>>>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>>>>>>>>>>>>> defined if using a very old elf.h; below works around this
>>>>>>>>>>>>>>> (dwarves otherwise builds fine on this system).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ok, will add it and will test with containers for older distros too.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
>>>>>>>>>>>>> repo at:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>>>>>>>>>>>>>
>>>>>>>>>>>>> It failed yesterday and today due to problems with the installation of
>>>>>>>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
>>>>>>>>>>>>> notifications floating by.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
>>>>>>>>>>>>> iterator that Jiri noticed.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
>>>>>>>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
>>>>>>>>>>>> from the BTF representation, and as a result:
>>>>>>>>>>>>
>>>>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>>>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>>>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>>>>>>>>>>>>
>>>>>>>>>>>> Not sure why I didn't notice this previously.
>>>>>>>>>>>>
>>>>>>>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
>>>>>>>>>>>>
>>>>>>>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>>>>>>> {
>>>>>>>>>>>>         return -EOPNOTSUPP;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> looks like this:
>>>>>>>>>>>>
>>>>>>>>>>>>    <8af83a2>   DW_AT_external    : 1
>>>>>>>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>>>>>>>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
>>>>>>>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
>>>>>>>>>>>>     <8af83a9>   DW_AT_decl_column : 5
>>>>>>>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
>>>>>>>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
>>>>>>>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>>>>>>>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>>>>>>>>>>>>     <8af83b3>   DW_AT_name        : ctx
>>>>>>>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
>>>>>>>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
>>>>>>>>>>>>     <8af83ba>   DW_AT_decl_column : 51
>>>>>>>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
>>>>>>>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>>>>>>>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>>>>>>>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
>>>>>>>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
>>>>>>>>>>>>     <8af83c7>   DW_AT_decl_column : 61
>>>>>>>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
>>>>>>>>>>>>
>>>>>>>>>>>> ...and because there are no further abstract origin references
>>>>>>>>>>>> with location information either, we classify it as lacking
>>>>>>>>>>>> locations for (some of) the parameters, and as a result
>>>>>>>>>>>> we skip BTF encoding. We can work around that by doing this:
>>>>>>>>>>>>
>>>>>>>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>>>>>>
>>>>>>>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
>>>>>>>>>>>
>>>>>>>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
>>>>>>>>>>> applied to arguments.
>>>>>>>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
>>>>>>>>>>> the way we do in selftests.
>>>>>>>>>>
>>>>>>>>>> There is also
>>>>>>>>>> # define __visible __attribute__((__externally_visible__))
>>>>>>>>>> that probably fits the best here.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Testing thus far seems to show that for x86_64, David's series
>>>>>>>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
>>>>>>>>> to cover recently-arrived kfuncs like cpumask) is sufficient
>>>>>>>>> to avoid resolve_btfids warnings.
>>>>>>>>
>>>>>>>> Nice. Alexei -- lmk how you want to proceed. I think using the
>>>>>>>> __bpf_kfunc macro in the short term (with __used and noinline) is
>>>>>>>> probably the least controversial way to unblock this, but am open to
>>>>>>>> other suggestions.
>>>>>>>
>>>>>>> Sounds good to me, but sounds like __used and noinline are not
>>>>>>> enough to address the issues on aarch64?
>>>>>>
>>>>>> Indeed, we'll have to make sure that's also addressed. Alan -- did you
>>>>>> try Alexei's suggestion to use __weak? Does that fix the issue for
>>>>>> aarch64? I'm still confused as to why it's only complaining for a small
>>>>>> subset of kfuncs, which include those that have external linkage.
>>>>>>
>>>>>
>>>>> I finally got to the bottom of the aarch64 issues; there was a 1-line bug
>>>>> in the changes I made to the DWARF handling code which leads to BTF generation;
>>>>> it was excluding a bunch of functions incorrectly, marking them as optimized out.
>>>>> The fix is:
>>>>>
>>>>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>>>>> index dba2d37..8364e17 100644
>>>>> --- a/dwarf_loader.c
>>>>> +++ b/dwarf_loader.c
>>>>> @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>>>>                         Dwarf_Op *expr = loc.expr;
>>>>>
>>>>>                         switch (expr->atom) {
>>>>> -                       case DW_OP_reg1 ... DW_OP_reg31:
>>>>> +                       case DW_OP_reg0 ... DW_OP_reg31:
>>>>>                         case DW_OP_breg0 ... DW_OP_breg31:
>>>>>                                 break;
>>>>>                         default:
>>>>>
>>>>> ..and because reg0 is the first parameter for aarch64, we were
>>>>> incorrectly landing in the "default:" of the switch statement
>>>>> and marking a bunch of functions as optimized out
>>>>> because we thought the first argument was. Sorry about this,
>>>>> and thanks for all the suggestions!
>>>
>>> Great, so noinline and __used with __bpf_kfunc sounds like the way forward
>>> in the short term. Arnaldo / Alexei -- how do you want to resolve the
>>> dependency here? Going through bpf-next is probably a good idea so that
>>> we get proper CI coverage, and any kfuncs added to bpf-next after this
>>> can use the macro. Does that work for you?
>>
>> It feels like the pahole fix should go in under some flag;
>> otherwise, when people update pahole, existing and older
>> kernels might stop building, with warnings like:
>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>> ...
>>

Good point, something like

--skip_inconsistent_proto	Skip functions that have multiple inconsistent
				function prototypes sharing the same name, or
				have optimized-out parameters.

? The implementation needs a bit of thought though, because we're
not really doing the same thing that we were before. Previously we
added the first instance of a function that we came across in a CU.
It's probably safest to resurrect that behaviour for the legacy
(non-skip-inconsistent-proto) case. The final patch, which handles
inconsistent function prototypes, will need to be reworked a bit to
support this, since we dropped that approach in favour of saving and
merging multiple instances in the tree. Once I've built the bpf trees
I'll have a go at getting this working.
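
A rough sketch of the two behaviours described above, under stated
assumptions: struct func_instance and pick_instance() are hypothetical names
and not pahole's data structures, and "inconsistent" is reduced to a differing
parameter count purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one instance of a function, as seen in one CU. */
struct func_instance {
	unsigned int nr_params;
	bool optimized_out_params;
};

/*
 * Return the index of the instance to encode, or -1 to skip the function.
 * Legacy mode: the first instance seen wins.
 * Skip mode: encode only if all instances agree and none lost parameters.
 */
static int pick_instance(const struct func_instance *inst, size_t n,
			 bool skip_inconsistent_proto)
{
	size_t i;

	if (n == 0)
		return -1;
	if (!skip_inconsistent_proto)
		return 0;
	for (i = 0; i < n; i++) {
		if (inst[i].optimized_out_params)
			return -1;
		if (inst[i].nr_params != inst[0].nr_params)
			return -1;
	}
	return 0;
}

/* Usage check: the same input is encoded in legacy mode, skipped otherwise. */
static void pick_instance_selfcheck(void)
{
	const struct func_instance mixed[] = {
		{ .nr_params = 2, .optimized_out_params = false },
		{ .nr_params = 1, .optimized_out_params = false },
	};

	assert(pick_instance(mixed, 2, false) == 0);
	assert(pick_instance(mixed, 2, true) == -1);
}
```

The point of the sketch is only that the legacy path needs no cross-CU
comparison, which is why it would have to be kept separate from the
saving/merging approach.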

>> Arnaldo, could you check which warnings you see with this fixed pahole
>> in the bpf tree?
> 
> Sure.
> 

I can collect this for x86_64/aarch64 too; might take a few hours
before I have the results.

>> If there are only a few warnings, we can manually add __used noinline
>> in those places, push to the bpf tree, and push to stable.
>>
>> Then in bpf-next we can clean up everything with __bpf_kfunc.
> 


* Re: [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func
  2023-01-30 14:29 ` [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func Alan Maguire
@ 2023-02-01 17:19   ` Arnaldo Carvalho de Melo
  2023-02-01 17:50     ` Alan Maguire
  0 siblings, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-01 17:19 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Mon, Jan 30, 2023 at 02:29:42PM +0000, Alan Maguire escreveu:
> This will be useful for postponing local function addition later on.
> As part of this, store the type id offset and unspecified type in
> the encoder, as this will simplify late addition of local functions.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  btf_encoder.c | 101 +++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 57 insertions(+), 44 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index a5fa04a..44f1905 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -54,6 +54,8 @@ struct btf_encoder {
>  	struct gobuffer   percpu_secinfo;
>  	const char	  *filename;
>  	struct elf_symtab *symtab;
> +	uint32_t	  type_id_off;
> +	uint32_t	  unspecified_type;
>  	bool		  has_index_type,
>  			  need_index_type,
>  			  skip_encoding_vars,
> @@ -593,20 +595,20 @@ static int32_t btf_encoder__add_func_param(struct btf_encoder *encoder, const ch
>  	}
>  }
>  
> -static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t type_id_off, uint32_t tag_type)
> +static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_type)
>  {
>  	if (tag_type == 0)
>  		return 0;
>  
> -	if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
> +	if (tag_type == encoder->unspecified_type) {
>  		// No provision for encoding this, turn it into void.
>  		return 0;
>  	}

Humm, are those two lines (above) really equivalent? IIRC,
encoder->cu->unspecified_type.tag being zero means we haven't set it
yet, not that it is void (zero), right?

So if we're passing a zero tag_type, i.e. void, we'll return 0, i.e. turn
it into void, so it seems equivalent. Please try not to combine changes
like this in the future; from a quick glance, I would expect to have:

-     if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
+     if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {

I.e. just the removal of the indirection thru encoder->cu. Or am I
missing something here?
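
To make the question concrete, here is a small sketch comparing the two
checks. It assumes the encoder field starts at zero and models a single CU
only; struct unspec and the helper names are hypothetical, not the dwarves
types. Both checks are only reached after the early return for
tag_type == 0, so only non-zero values are compared:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cu->unspecified_type; tag == 0 means "not set". */
struct unspec {
	uint32_t tag;
	uint32_t type;
};

/* Check before the patch: guarded by the tag being set. */
static bool is_unspecified_old(const struct unspec *u, uint32_t tag_type)
{
	return u->tag && tag_type == u->type;
}

/* Check after the patch: compares against the value cached in the encoder. */
static bool is_unspecified_new(uint32_t cached, uint32_t tag_type)
{
	return tag_type == cached;
}

/* Mirrors the assignment in btf_encoder__encode_cu(), starting from zero. */
static uint32_t cache_unspecified(const struct unspec *u)
{
	return u->tag ? u->type : 0;
}

/* Usage check: both forms agree for every non-zero tag_type tried. */
static void unspecified_selfcheck(void)
{
	const struct unspec unset = { .tag = 0, .type = 7 };
	const struct unspec set = { .tag = 0x3b, .type = 7 };
	uint32_t t;

	for (t = 1; t < 16; t++) {
		assert(is_unspecified_old(&unset, t) ==
		       is_unspecified_new(cache_unspecified(&unset), t));
		assert(is_unspecified_old(&set, t) ==
		       is_unspecified_new(cache_unspecified(&set), t));
	}
}
```

One case the sketch does not cover: the hunk in btf_encoder__encode_cu() only
assigns the field when the CU has an unspecified type, so a value left over
from an earlier CU would fall outside the assumption made here.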

- Arnaldo

>  
> -	return type_id_off + tag_type;
> +	return encoder->type_id_off + tag_type;
>  }
>  
> -static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype, uint32_t type_id_off)
> +static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype)
>  {
>  	struct btf *btf = encoder->btf;
>  	const struct btf_type *t;
> @@ -616,7 +618,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
>  
>  	/* add btf_type for func_proto */
>  	nr_params = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
> -	type_id = btf_encoder__tag_type(encoder, type_id_off, ftype->tag.type);
> +	type_id = btf_encoder__tag_type(encoder, ftype->tag.type);
>  
>  	id = btf__add_func_proto(btf, type_id);
>  	if (id > 0) {
> @@ -634,7 +636,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
>  	ftype__for_each_parameter(ftype, param) {
>  		const char *name = parameter__name(param);
>  
> -		type_id = param->tag.type == 0 ? 0 : type_id_off + param->tag.type;
> +		type_id = param->tag.type == 0 ? 0 : encoder->type_id_off + param->tag.type;
>  		++param_idx;
>  		if (btf_encoder__add_func_param(encoder, name, type_id, param_idx == nr_params))
>  			return -1;
> @@ -762,6 +764,31 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
>  	return id;
>  }
>  
> +static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
> +{
> +	int btf_fnproto_id, btf_fn_id, tag_type_id;
> +	struct llvm_annotation *annot;
> +	const char *name;
> +
> +	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
> +	name = function__name(fn);
> +	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> +	if (btf_fnproto_id < 0 || btf_fn_id < 0) {
> +		printf("error: failed to encode function '%s'\n", function__name(fn));
> +		return -1;
> +	}
> +	list_for_each_entry(annot, &fn->annots, node) {
> +		tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id,
> +							annot->component_idx);
> +		if (tag_type_id < 0) {
> +			fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
> +				annot->value, name, annot->component_idx);
> +			return -1;
> +		}
> +	}
> +	return 0;
> +}
> +
>  /*
>   * This corresponds to the same macro defined in
>   * include/linux/kallsyms.h
> @@ -859,22 +886,21 @@ static void dump_invalid_symbol(const char *msg, const char *sym,
>  	fprintf(stderr, "PAHOLE: Error: Use '--btf_encode_force' to ignore such symbols and force emit the btf.\n");
>  }
>  
> -static int tag__check_id_drift(const struct tag *tag,
> -			       uint32_t core_id, uint32_t btf_type_id,
> -			       uint32_t type_id_off)
> +static int tag__check_id_drift(struct btf_encoder *encoder, const struct tag *tag,
> +			       uint32_t core_id, uint32_t btf_type_id)
>  {
> -	if (btf_type_id != (core_id + type_id_off)) {
> +	if (btf_type_id != (core_id + encoder->type_id_off)) {
>  		fprintf(stderr,
>  			"%s: %s id drift, core_id: %u, btf_type_id: %u, type_id_off: %u\n",
>  			__func__, dwarf_tag_name(tag->tag),
> -			core_id, btf_type_id, type_id_off);
> +			core_id, btf_type_id, encoder->type_id_off);
>  		return -1;
>  	}
>  
>  	return 0;
>  }
>  
> -static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off)
> +static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag)
>  {
>  	struct type *type = tag__type(tag);
>  	struct class_member *pos;
> @@ -896,7 +922,8 @@ static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct
>  		 * is required.
>  		 */
>  		name = class_member__name(pos);
> -		if (btf_encoder__add_field(encoder, name, type_id_off + pos->tag.type, pos->bitfield_size, pos->bit_offset))
> +		if (btf_encoder__add_field(encoder, name, encoder->type_id_off + pos->tag.type,
> +					   pos->bitfield_size, pos->bit_offset))
>  			return -1;
>  	}
>  
> @@ -936,11 +963,11 @@ static int32_t btf_encoder__add_enum_type(struct btf_encoder *encoder, struct ta
>  	return type_id;
>  }
>  
> -static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off,
> +static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
>  				   struct conf_load *conf_load)
>  {
>  	/* single out type 0 as it represents special type "void" */
> -	uint32_t ref_type_id = tag->type == 0 ? 0 : type_id_off + tag->type;
> +	uint32_t ref_type_id = tag->type == 0 ? 0 : encoder->type_id_off + tag->type;
>  	struct base_type *bt;
>  	const char *name;
>  
> @@ -970,7 +997,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
>  		if (tag__type(tag)->declaration)
>  			return btf_encoder__add_ref_type(encoder, BTF_KIND_FWD, 0, name, tag->tag == DW_TAG_union_type);
>  		else
> -			return btf_encoder__add_struct_type(encoder, tag, type_id_off);
> +			return btf_encoder__add_struct_type(encoder, tag);
>  	case DW_TAG_array_type:
>  		/* TODO: Encode one dimension at a time. */
>  		encoder->need_index_type = true;
> @@ -978,7 +1005,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
>  	case DW_TAG_enumeration_type:
>  		return btf_encoder__add_enum_type(encoder, tag, conf_load);
>  	case DW_TAG_subroutine_type:
> -		return btf_encoder__add_func_proto(encoder, tag__ftype(tag), type_id_off);
> +		return btf_encoder__add_func_proto(encoder, tag__ftype(tag));
>          case DW_TAG_unspecified_type:
>  		/* Just don't encode this for now, converting anything with this type to void (0) instead.
>  		 *
> @@ -1281,7 +1308,7 @@ static bool ftype__has_arg_names(const struct ftype *ftype)
>  	return true;
>  }
>  
> -static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_t type_id_off)
> +static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
>  {
>  	struct cu *cu = encoder->cu;
>  	uint32_t core_id;
> @@ -1366,7 +1393,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_
>  			continue;
>  		}
>  
> -		type = var->ip.tag.type + type_id_off;
> +		type = var->ip.tag.type + encoder->type_id_off;
>  		linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC;
>  
>  		if (encoder->verbose) {
> @@ -1507,7 +1534,6 @@ void btf_encoder__delete(struct btf_encoder *encoder)
>  
>  int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load)
>  {
> -	uint32_t type_id_off = btf__type_cnt(encoder->btf) - 1;
>  	struct llvm_annotation *annot;
>  	int btf_type_id, tag_type_id, skipped_types = 0;
>  	uint32_t core_id;
> @@ -1516,21 +1542,24 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  	int err = 0;
>  
>  	encoder->cu = cu;
> +	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
> +	if (encoder->cu->unspecified_type.tag)
> +		encoder->unspecified_type = encoder->cu->unspecified_type.type;
>  
>  	if (!encoder->has_index_type) {
>  		/* cu__find_base_type_by_name() takes "type_id_t *id" */
>  		type_id_t id;
>  		if (cu__find_base_type_by_name(cu, "int", &id)) {
>  			encoder->has_index_type = true;
> -			encoder->array_index_id = type_id_off + id;
> +			encoder->array_index_id = encoder->type_id_off + id;
>  		} else {
>  			encoder->has_index_type = false;
> -			encoder->array_index_id = type_id_off + cu->types_table.nr_entries;
> +			encoder->array_index_id = encoder->type_id_off + cu->types_table.nr_entries;
>  		}
>  	}
>  
>  	cu__for_each_type(cu, core_id, pos) {
> -		btf_type_id = btf_encoder__encode_tag(encoder, pos, type_id_off, conf_load);
> +		btf_type_id = btf_encoder__encode_tag(encoder, pos, conf_load);
>  
>  		if (btf_type_id == 0) {
>  			++skipped_types;
> @@ -1538,7 +1567,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  		}
>  
>  		if (btf_type_id < 0 ||
> -		    tag__check_id_drift(pos, core_id, btf_type_id + skipped_types, type_id_off)) {
> +		    tag__check_id_drift(encoder, pos, core_id, btf_type_id + skipped_types)) {
>  			err = -1;
>  			goto out;
>  		}
> @@ -1572,7 +1601,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  			continue;
>  		}
>  
> -		btf_type_id = type_id_off + core_id;
> +		btf_type_id = encoder->type_id_off + core_id;
>  		ns = tag__namespace(pos);
>  		list_for_each_entry(annot, &ns->annots, node) {
>  			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_type_id, annot->component_idx);
> @@ -1585,8 +1614,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  	}
>  
>  	cu__for_each_function(cu, core_id, fn) {
> -		int btf_fnproto_id, btf_fn_id;
> -		const char *name;
>  
>  		/*
>  		 * Skip functions that:
> @@ -1616,27 +1643,13 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  				continue;
>  		}
>  
> -		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto, type_id_off);
> -		name = function__name(fn);
> -		btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> -		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
> -			err = -1;
> -			printf("error: failed to encode function '%s'\n", function__name(fn));
> +		err = btf_encoder__add_func(encoder, fn);
> +		if (err)
>  			goto out;
> -		}
> -
> -		list_for_each_entry(annot, &fn->annots, node) {
> -			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id, annot->component_idx);
> -			if (tag_type_id < 0) {
> -				fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
> -					annot->value, name, annot->component_idx);
> -				goto out;
> -			}
> -		}
>  	}
>  
>  	if (!encoder->skip_encoding_vars)
> -		err = btf_encoder__encode_cu_variables(encoder, type_id_off);
> +		err = btf_encoder__encode_cu_variables(encoder);
>  out:
>  	encoder->cu = NULL;
>  	return err;
> -- 
> 1.8.3.1
> 

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func
  2023-02-01 17:19   ` Arnaldo Carvalho de Melo
@ 2023-02-01 17:50     ` Alan Maguire
  2023-02-01 18:59       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Maguire @ 2023-02-01 17:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

On 01/02/2023 17:19, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 30, 2023 at 02:29:42PM +0000, Alan Maguire escreveu:
>> This will be useful for postponing local function addition later on.
>> As part of this, store the type id offset and unspecified type in
>> the encoder, as this will simplify late addition of local functions.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>  btf_encoder.c | 101 +++++++++++++++++++++++++++++++++-------------------------
>>  1 file changed, 57 insertions(+), 44 deletions(-)
>>
>> diff --git a/btf_encoder.c b/btf_encoder.c
>> index a5fa04a..44f1905 100644
>> --- a/btf_encoder.c
>> +++ b/btf_encoder.c
>> @@ -54,6 +54,8 @@ struct btf_encoder {
>>  	struct gobuffer   percpu_secinfo;
>>  	const char	  *filename;
>>  	struct elf_symtab *symtab;
>> +	uint32_t	  type_id_off;
>> +	uint32_t	  unspecified_type;
>>  	bool		  has_index_type,
>>  			  need_index_type,
>>  			  skip_encoding_vars,
>> @@ -593,20 +595,20 @@ static int32_t btf_encoder__add_func_param(struct btf_encoder *encoder, const ch
>>  	}
>>  }
>>  
>> -static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t type_id_off, uint32_t tag_type)
>> +static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_type)
>>  {
>>  	if (tag_type == 0)
>>  		return 0;
>>  
>> -	if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
>> +	if (tag_type == encoder->unspecified_type) {
>>  		// No provision for encoding this, turn it into void.
>>  		return 0;
>>  	}
> 
> Humm, are those two lines (above) really equivalent? IIRC I read that as
> encoder->cu->unspecified_type.tag being zero means we still didn't set
> it, not that it is void (zero), right?
> 
> So if we're passing a tag_type zero, void, we'll return 0, i.e. turn
> into a void, so seems equivalent, try not to combine patches like this
> in the future, i.e. I would expect, from a quick glance, to have:
> 
> -     if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
> +     if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
> 
> I.e. just the removal of the indirection thru encoder->cu. Or am I
> missing something here?
>
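
The equivalence asked about above can be checked with a small model.
This is only a sketch: struct model_cu, struct model_encoder and the
helper names are hypothetical stand-ins, not the real pahole types. It
assumes the encoder starts zeroed and caches the CU's unspecified type
the way the patch does at the top of btf_encoder__encode_cu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields involved; not the real pahole structs. */
struct model_cu {
	bool     unspec_tag_set;   /* models cu->unspecified_type.tag != 0 */
	uint32_t unspec_type;      /* models cu->unspecified_type.type */
};

struct model_encoder {
	uint32_t type_id_off;
	uint32_t unspecified_type; /* 0 means "nothing recorded" */
};

/* Pre-patch logic: consult the CU directly. */
static int32_t tag_type_old(const struct model_cu *cu, uint32_t type_id_off,
			    uint32_t tag_type)
{
	if (tag_type == 0)
		return 0;
	if (cu->unspec_tag_set && tag_type == cu->unspec_type)
		return 0;
	return type_id_off + tag_type;
}

/* Post-patch logic: consult the value cached in the encoder. */
static int32_t tag_type_new(const struct model_encoder *enc, uint32_t tag_type)
{
	if (tag_type == 0)
		return 0;
	if (tag_type == enc->unspecified_type)
		return 0;
	return enc->type_id_off + tag_type;
}

/* Returns true if both versions agree for every tag_type in [0, limit). */
static bool versions_agree(const struct model_cu *cu, uint32_t type_id_off,
			   uint32_t limit)
{
	struct model_encoder enc = { .type_id_off = type_id_off };

	assert(cu != NULL);

	/* Mirrors the caching done at the top of btf_encoder__encode_cu(). */
	if (cu->unspec_tag_set)
		enc.unspecified_type = cu->unspec_type;

	for (uint32_t t = 0; t < limit; t++) {
		if (tag_type_old(cu, type_id_off, t) != tag_type_new(&enc, t))
			return false;
	}
	return true;
}
```

In this model the dropped "is it set" test is covered by the early
return for tag_type == 0: an unrecorded value of 0 can never match a
non-zero tag_type.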

No, I don't think you're missing anything. I should have separated
out the changes that record encoder info such that we don't need to
rely on the current CU; we need those because now we interact with
functions potentially much later on, and the current CU can be
different. Ideally that would have come before this patch
refactoring function addition.
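
To illustrate why the encoder needs its own copy of this state, here is
a toy model (struct toy_cu, struct toy_encoder and both helpers are
hypothetical, and the type counting is simplified). As in the patch,
encoder->cu is reset to NULL once a CU has been encoded, so anything
done later can only rely on what was recorded in the encoder itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified CU: only what this sketch needs. */
struct toy_cu {
	uint32_t nr_types;          /* types this CU contributes */
	uint32_t unspecified_type;  /* 0 if the CU has none */
};

struct toy_encoder {
	const struct toy_cu *cu;    /* only valid while a CU is being encoded */
	uint32_t nr_types;          /* types emitted so far, void (id 0) excluded */
	uint32_t type_id_off;       /* recorded when the CU was started */
	uint32_t unspecified_type;  /* recorded copy, survives cu == NULL */
};

static void toy_encoder__encode_cu(struct toy_encoder *enc, const struct toy_cu *cu)
{
	assert(enc != NULL && cu != NULL);

	enc->cu = cu;
	enc->type_id_off = enc->nr_types;
	if (cu->unspecified_type)
		enc->unspecified_type = cu->unspecified_type;

	enc->nr_types += cu->nr_types;  /* stands in for encoding the types */

	enc->cu = NULL;                 /* as at the "out:" label in the patch */
}

/* Usable after encode_cu() has returned: never dereferences enc->cu. */
static uint32_t toy_encoder__late_type_id(const struct toy_encoder *enc, uint32_t tag_type)
{
	assert(enc != NULL);

	if (tag_type == 0 || tag_type == enc->unspecified_type)
		return 0;
	return enc->type_id_off + tag_type;
}
```

A version of toy_encoder__late_type_id() that read enc->cu instead would
dereference NULL when called after the CU has been processed.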

I can rework the series to do that if you like? Patch 5 will
need a bit of work too so that we can continue to support the
legacy behaviour, and we'll need an additional patch to support
switching the inconsistent prototype handling on also.
 
> - Arnaldo
> 
>>  
>> -	return type_id_off + tag_type;
>> +	return encoder->type_id_off + tag_type;
>>  }
>>  
>> -static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype, uint32_t type_id_off)
>> +static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype)
>>  {
>>  	struct btf *btf = encoder->btf;
>>  	const struct btf_type *t;
>> @@ -616,7 +618,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
>>  
>>  	/* add btf_type for func_proto */
>>  	nr_params = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
>> -	type_id = btf_encoder__tag_type(encoder, type_id_off, ftype->tag.type);
>> +	type_id = btf_encoder__tag_type(encoder, ftype->tag.type);
>>  
>>  	id = btf__add_func_proto(btf, type_id);
>>  	if (id > 0) {
>> @@ -634,7 +636,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
>>  	ftype__for_each_parameter(ftype, param) {
>>  		const char *name = parameter__name(param);
>>  
>> -		type_id = param->tag.type == 0 ? 0 : type_id_off + param->tag.type;
>> +		type_id = param->tag.type == 0 ? 0 : encoder->type_id_off + param->tag.type;
>>  		++param_idx;
>>  		if (btf_encoder__add_func_param(encoder, name, type_id, param_idx == nr_params))
>>  			return -1;
>> @@ -762,6 +764,31 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
>>  	return id;
>>  }
>>  
>> +static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
>> +{
>> +	int btf_fnproto_id, btf_fn_id, tag_type_id;
>> +	struct llvm_annotation *annot;
>> +	const char *name;
>> +
>> +	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>> +	name = function__name(fn);
>> +	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>> +	if (btf_fnproto_id < 0 || btf_fn_id < 0) {
>> +		printf("error: failed to encode function '%s'\n", function__name(fn));
>> +		return -1;
>> +	}
>> +	list_for_each_entry(annot, &fn->annots, node) {
>> +		tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id,
>> +							annot->component_idx);
>> +		if (tag_type_id < 0) {
>> +			fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
>> +				annot->value, name, annot->component_idx);
>> +			return -1;
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>>  /*
>>   * This corresponds to the same macro defined in
>>   * include/linux/kallsyms.h
>> @@ -859,22 +886,21 @@ static void dump_invalid_symbol(const char *msg, const char *sym,
>>  	fprintf(stderr, "PAHOLE: Error: Use '--btf_encode_force' to ignore such symbols and force emit the btf.\n");
>>  }
>>  
>> -static int tag__check_id_drift(const struct tag *tag,
>> -			       uint32_t core_id, uint32_t btf_type_id,
>> -			       uint32_t type_id_off)
>> +static int tag__check_id_drift(struct btf_encoder *encoder, const struct tag *tag,
>> +			       uint32_t core_id, uint32_t btf_type_id)
>>  {
>> -	if (btf_type_id != (core_id + type_id_off)) {
>> +	if (btf_type_id != (core_id + encoder->type_id_off)) {
>>  		fprintf(stderr,
>>  			"%s: %s id drift, core_id: %u, btf_type_id: %u, type_id_off: %u\n",
>>  			__func__, dwarf_tag_name(tag->tag),
>> -			core_id, btf_type_id, type_id_off);
>> +			core_id, btf_type_id, encoder->type_id_off);
>>  		return -1;
>>  	}
>>  
>>  	return 0;
>>  }
>>  
>> -static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off)
>> +static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag)
>>  {
>>  	struct type *type = tag__type(tag);
>>  	struct class_member *pos;
>> @@ -896,7 +922,8 @@ static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct
>>  		 * is required.
>>  		 */
>>  		name = class_member__name(pos);
>> -		if (btf_encoder__add_field(encoder, name, type_id_off + pos->tag.type, pos->bitfield_size, pos->bit_offset))
>> +		if (btf_encoder__add_field(encoder, name, encoder->type_id_off + pos->tag.type,
>> +					   pos->bitfield_size, pos->bit_offset))
>>  			return -1;
>>  	}
>>  
>> @@ -936,11 +963,11 @@ static int32_t btf_encoder__add_enum_type(struct btf_encoder *encoder, struct ta
>>  	return type_id;
>>  }
>>  
>> -static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off,
>> +static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
>>  				   struct conf_load *conf_load)
>>  {
>>  	/* single out type 0 as it represents special type "void" */
>> -	uint32_t ref_type_id = tag->type == 0 ? 0 : type_id_off + tag->type;
>> +	uint32_t ref_type_id = tag->type == 0 ? 0 : encoder->type_id_off + tag->type;
>>  	struct base_type *bt;
>>  	const char *name;
>>  
>> @@ -970,7 +997,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
>>  		if (tag__type(tag)->declaration)
>>  			return btf_encoder__add_ref_type(encoder, BTF_KIND_FWD, 0, name, tag->tag == DW_TAG_union_type);
>>  		else
>> -			return btf_encoder__add_struct_type(encoder, tag, type_id_off);
>> +			return btf_encoder__add_struct_type(encoder, tag);
>>  	case DW_TAG_array_type:
>>  		/* TODO: Encode one dimension at a time. */
>>  		encoder->need_index_type = true;
>> @@ -978,7 +1005,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
>>  	case DW_TAG_enumeration_type:
>>  		return btf_encoder__add_enum_type(encoder, tag, conf_load);
>>  	case DW_TAG_subroutine_type:
>> -		return btf_encoder__add_func_proto(encoder, tag__ftype(tag), type_id_off);
>> +		return btf_encoder__add_func_proto(encoder, tag__ftype(tag));
>>          case DW_TAG_unspecified_type:
>>  		/* Just don't encode this for now, converting anything with this type to void (0) instead.
>>  		 *
>> @@ -1281,7 +1308,7 @@ static bool ftype__has_arg_names(const struct ftype *ftype)
>>  	return true;
>>  }
>>  
>> -static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_t type_id_off)
>> +static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
>>  {
>>  	struct cu *cu = encoder->cu;
>>  	uint32_t core_id;
>> @@ -1366,7 +1393,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_
>>  			continue;
>>  		}
>>  
>> -		type = var->ip.tag.type + type_id_off;
>> +		type = var->ip.tag.type + encoder->type_id_off;
>>  		linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC;
>>  
>>  		if (encoder->verbose) {
>> @@ -1507,7 +1534,6 @@ void btf_encoder__delete(struct btf_encoder *encoder)
>>  
>>  int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load)
>>  {
>> -	uint32_t type_id_off = btf__type_cnt(encoder->btf) - 1;
>>  	struct llvm_annotation *annot;
>>  	int btf_type_id, tag_type_id, skipped_types = 0;
>>  	uint32_t core_id;
>> @@ -1516,21 +1542,24 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>>  	int err = 0;
>>  
>>  	encoder->cu = cu;
>> +	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
>> +	if (encoder->cu->unspecified_type.tag)
>> +		encoder->unspecified_type = encoder->cu->unspecified_type.type;
>>  
>>  	if (!encoder->has_index_type) {
>>  		/* cu__find_base_type_by_name() takes "type_id_t *id" */
>>  		type_id_t id;
>>  		if (cu__find_base_type_by_name(cu, "int", &id)) {
>>  			encoder->has_index_type = true;
>> -			encoder->array_index_id = type_id_off + id;
>> +			encoder->array_index_id = encoder->type_id_off + id;
>>  		} else {
>>  			encoder->has_index_type = false;
>> -			encoder->array_index_id = type_id_off + cu->types_table.nr_entries;
>> +			encoder->array_index_id = encoder->type_id_off + cu->types_table.nr_entries;
>>  		}
>>  	}
>>  
>>  	cu__for_each_type(cu, core_id, pos) {
>> -		btf_type_id = btf_encoder__encode_tag(encoder, pos, type_id_off, conf_load);
>> +		btf_type_id = btf_encoder__encode_tag(encoder, pos, conf_load);
>>  
>>  		if (btf_type_id == 0) {
>>  			++skipped_types;
>> @@ -1538,7 +1567,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>>  		}
>>  
>>  		if (btf_type_id < 0 ||
>> -		    tag__check_id_drift(pos, core_id, btf_type_id + skipped_types, type_id_off)) {
>> +		    tag__check_id_drift(encoder, pos, core_id, btf_type_id + skipped_types)) {
>>  			err = -1;
>>  			goto out;
>>  		}
>> @@ -1572,7 +1601,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>>  			continue;
>>  		}
>>  
>> -		btf_type_id = type_id_off + core_id;
>> +		btf_type_id = encoder->type_id_off + core_id;
>>  		ns = tag__namespace(pos);
>>  		list_for_each_entry(annot, &ns->annots, node) {
>>  			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_type_id, annot->component_idx);
>> @@ -1585,8 +1614,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>>  	}
>>  
>>  	cu__for_each_function(cu, core_id, fn) {
>> -		int btf_fnproto_id, btf_fn_id;
>> -		const char *name;
>>  
>>  		/*
>>  		 * Skip functions that:
>> @@ -1616,27 +1643,13 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>>  				continue;
>>  		}
>>  
>> -		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto, type_id_off);
>> -		name = function__name(fn);
>> -		btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>> -		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
>> -			err = -1;
>> -			printf("error: failed to encode function '%s'\n", function__name(fn));
>> +		err = btf_encoder__add_func(encoder, fn);
>> +		if (err)
>>  			goto out;
>> -		}
>> -
>> -		list_for_each_entry(annot, &fn->annots, node) {
>> -			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id, annot->component_idx);
>> -			if (tag_type_id < 0) {
>> -				fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
>> -					annot->value, name, annot->component_idx);
>> -				goto out;
>> -			}
>> -		}
>>  	}
>>  
>>  	if (!encoder->skip_encoding_vars)
>> -		err = btf_encoder__encode_cu_variables(encoder, type_id_off);
>> +		err = btf_encoder__encode_cu_variables(encoder);
>>  out:
>>  	encoder->cu = NULL;
>>  	return err;
>> -- 
>> 1.8.3.1
>>
> 


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 17:18                                       ` Alan Maguire
@ 2023-02-01 18:54                                         ` Arnaldo Carvalho de Melo
  2023-02-01 22:33                                         ` Alan Maguire
  1 sibling, 0 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-01 18:54 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, David Vernet, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

Em Wed, Feb 01, 2023 at 05:18:29PM +0000, Alan Maguire escreveu:
> On 01/02/2023 17:01, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Feb 01, 2023 at 08:49:07AM -0800, Alexei Starovoitov escreveu:
> >> It feels fixed pahole should be done under some flag
> >> otherwise when people update the pahole the existing and older
> >> kernels might stop building with warns:
> >> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> >> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> >> ...
 
> Good point, something like
 
> --skip_inconsistent_proto	Skip functions that have multiple inconsistent
> 				function prototypes sharing the same name, or
> 				have optimized-out parameters.

We have:

⬢[acme@toolbox pahole]$ grep '"skip_encoding.*' pahole.c
		.name = "skip_encoding_btf_vars",
		.name = "skip_encoding_btf_decl_tag",
		.name = "skip_encoding_btf_type_tag",
		.name = "skip_encoding_btf_enum64",
⬢[acme@toolbox pahole]$

Perhaps, even being long, we should be consistent and name it:

	--skip_encoding_btf_inconsistent_proto

?
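
A rough sketch of what recognising such a long option could look like.
Note the assumptions: this uses plain getopt_long() as a stand-in rather
than pahole's actual option table, and the key value and helper name are
made up for illustration.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key; pahole's real table and key values are not shown in this thread. */
enum { OPT_SKIP_ENCODING_BTF_INCONSISTENT_PROTO = 1000 };

static const struct option long_opts[] = {
	{ .name = "skip_encoding_btf_inconsistent_proto", .has_arg = no_argument,
	  .flag = NULL, .val = OPT_SKIP_ENCODING_BTF_INCONSISTENT_PROTO },
	{ 0 }
};

/* Returns true if the flag was given; unknown options are silently ignored. */
static bool skip_inconsistent_proto_requested(int argc, char **argv)
{
	bool skip = false;
	int c;

	assert(argv != NULL);

	optind = 1;  /* restart scanning so the helper can be called repeatedly */
	opterr = 0;  /* no diagnostics for options this sketch does not know */
	while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
		if (c == OPT_SKIP_ENCODING_BTF_INCONSISTENT_PROTO)
			skip = true;
	}
	return skip;
}
```

With the option off by default, the existing behaviour stays in place
unless the flag is passed.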
 
> ? Implementation needs a bit of thought though because we're
> not really doing the same thing that we were before. Previously we
> were adding the first instance of a function in the CU we came across.
> Probably safest to resurrect that behaviour for the legacy
> non-skip-inconsistent-proto case I think. The final patch handling

Consider getting what I have now in my next branch, that has the fixups
I made while reviewing, as discussed in this thread:

⬢[acme@toolbox pahole]$ git log --oneline -6
b1576cf15106efd7 (HEAD -> master) pahole: Sync with libbpf-1.1
e9db5622d97395b7 btf_encoder: Delay function addition to check for function prototype inconsistencies
74675488e8ed5718 btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
be470fa5757e5915 btf_encoder: Rework btf_encoders__*() API to allow traversal of encoders
d6e0778f6b5912da btf_encoder: Refactor function addition into dedicated btf_encoder__add_func
f77b5ae93844b5c4 dwarf_loader: Help spotting functions with optimized-out parameters
⬢[acme@toolbox pahole]$

And at the point where you change the behaviour you introduce the
option, so that we don't have to remove it and then resurrect it.
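
The two behaviours under discussion could be sketched roughly like this.
It is a simplification under stated assumptions: struct fn_instance and
pick_instance() are hypothetical, prototypes are compared as plain
strings, and a linear scan replaces the saving/merging of instances in a
tree that the series actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record of one function instance seen in some CU. */
struct fn_instance {
	const char *name;
	const char *proto;         /* textual prototype, compared verbatim */
	bool optimized_out_parms;  /* DWARF shows optimized-out parameters */
};

/*
 * Decide whether function `name` should be encoded, and from which instance.
 * Returns the index of the instance to encode, or -1 to skip the function.
 */
static int pick_instance(const struct fn_instance *fns, size_t nr,
			 const char *name, bool skip_inconsistent)
{
	int first = -1;

	assert(fns != NULL && name != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (strcmp(fns[i].name, name) != 0)
			continue;
		if (first < 0) {
			first = (int)i;
			if (!skip_inconsistent)
				return first;  /* legacy: first instance wins */
		}
		/* New behaviour: any bad or mismatching instance drops the function. */
		if (fns[i].optimized_out_parms ||
		    strcmp(fns[i].proto, fns[first].proto) != 0)
			return -1;
	}
	return first;
}
```

Introducing the flag in the same patch that adds the skip_inconsistent
branch keeps the legacy path intact at every point in the series.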

- Arnaldo

> inconsistent function prototypes will need to be reworked a bit to 
> support this, since we tossed this approach and used saving/merging 
> multiple instances in the tree instead.  Once I've built bpf trees I'll
> have a go at getting this working.
> 
> >> Arnaldo, could you check what warns do you see with this fixed pahole
> >> in bpf tree ?
> > 
> > Sure.
> > 
> 
> I can collect this for x86_64/aarch64 too; might take a few hours
> before I have the results.
> 
> >> If there are only few warns then we can manually add __used noinline
> >> to these places, push to bpf tree and push to stable.
> >>
> >> Then in bpf-next we can clean up everything with __bpf_kfunc.
> > 

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func
  2023-02-01 17:50     ` Alan Maguire
@ 2023-02-01 18:59       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-01 18:59 UTC (permalink / raw)
  To: Alan Maguire
  Cc: yhs, ast, olsajiri, eddyz87, sinquersw, timo, daniel, andrii,
	songliubraving, john.fastabend, kpsingh, sdf, haoluo, martin.lau,
	bpf

Em Wed, Feb 01, 2023 at 05:50:45PM +0000, Alan Maguire escreveu:
> On 01/02/2023 17:19, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jan 30, 2023 at 02:29:42PM +0000, Alan Maguire escreveu:
> >> This will be useful for postponing local function addition later on.
> >> As part of this, store the type id offset and unspecified type in
> >> the encoder, as this will simplify late addition of local functions.
> >>
> >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> >> ---
> >>  btf_encoder.c | 101 +++++++++++++++++++++++++++++++++-------------------------
> >>  1 file changed, 57 insertions(+), 44 deletions(-)
> >>
> >> diff --git a/btf_encoder.c b/btf_encoder.c
> >> index a5fa04a..44f1905 100644
> >> --- a/btf_encoder.c
> >> +++ b/btf_encoder.c
> >> @@ -54,6 +54,8 @@ struct btf_encoder {
> >>  	struct gobuffer   percpu_secinfo;
> >>  	const char	  *filename;
> >>  	struct elf_symtab *symtab;
> >> +	uint32_t	  type_id_off;
> >> +	uint32_t	  unspecified_type;
> >>  	bool		  has_index_type,
> >>  			  need_index_type,
> >>  			  skip_encoding_vars,
> >> @@ -593,20 +595,20 @@ static int32_t btf_encoder__add_func_param(struct btf_encoder *encoder, const ch
> >>  	}
> >>  }
> >>  
> >> -static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t type_id_off, uint32_t tag_type)
> >> +static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_type)
> >>  {
> >>  	if (tag_type == 0)
> >>  		return 0;
> >>  
> >> -	if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
> >> +	if (tag_type == encoder->unspecified_type) {
> >>  		// No provision for encoding this, turn it into void.
> >>  		return 0;
> >>  	}
> > 
> > Humm, are those two lines (above) really equivalent? IIRC I read that as
> > encoder->cu->unspecified_type.tag being zero means we still didn't set
> > it, not that it is void (zero), right?
> > 
> > So if we're passing a tag_type zero, void, we'll return 0, i.e. turn
> > into a void, so seems equivalent, try not to combine patches like this
> > in the future, i.e. I would expect, from a quick glance, to have:
> > 
> > -     if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
> > +     if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
> > 
> > I.e. just the removal of the indirection thru encoder->cu. Or am I
> > missing something here?
> >
> 
> No, I don't think you're missing anything. I should have separated
> out the changes that record encoder info such that we don't need to
> rely on the current CU; we need those because now we interact with
> functions potentially much later on, and the current CU can be
> different. Ideally that would have come before this patch
> refactoring function addition.

That would be great. I have another branch, 'alt_dwarf', that supports
.dwz files (alternate DWARF), DW_TAG_partial_unit, etc. I got
confirmation that it works both with userspace code (openvswitch) and
with openSUSE's kernel, which uses alternate DWARF (via the dwz tool).

There is also work on supporting C atomics (yeah, there are different
DWARF tags for that, like DW_TAG_const_type -> DW_TAG_pointer_type...),
and there are also implications for encoding such things in BTF, as
some people would like, so I need to find time to test it more after we
cut 1.25.

I mention this because there are merge problems with what we're doing
here, so the more granular we get the patches in this series, the less
difficult it will be to merge that other work.
 
> I can rework the series to do that if you like? Patch 5 will

Please.

> need a bit of work too so that we can continue to support the

Yeah, I actually suggested that in another message :-)

> legacy behaviour, and we'll need an additional patch to support
> switching the inconsistent prototype handling on also.

Great. Meanwhile I'll do tests with what we have so far, as suggested by
Alexei, on bpf-next.

Thanks!

- Arnaldo
  
> > - Arnaldo
> > 
> >>  
> >> -	return type_id_off + tag_type;
> >> +	return encoder->type_id_off + tag_type;
> >>  }
> >>  
> >> -static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype, uint32_t type_id_off)
> >> +static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype)
> >>  {
> >>  	struct btf *btf = encoder->btf;
> >>  	const struct btf_type *t;
> >> @@ -616,7 +618,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
> >>  
> >>  	/* add btf_type for func_proto */
> >>  	nr_params = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
> >> -	type_id = btf_encoder__tag_type(encoder, type_id_off, ftype->tag.type);
> >> +	type_id = btf_encoder__tag_type(encoder, ftype->tag.type);
> >>  
> >>  	id = btf__add_func_proto(btf, type_id);
> >>  	if (id > 0) {
> >> @@ -634,7 +636,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
> >>  	ftype__for_each_parameter(ftype, param) {
> >>  		const char *name = parameter__name(param);
> >>  
> >> -		type_id = param->tag.type == 0 ? 0 : type_id_off + param->tag.type;
> >> +		type_id = param->tag.type == 0 ? 0 : encoder->type_id_off + param->tag.type;
> >>  		++param_idx;
> >>  		if (btf_encoder__add_func_param(encoder, name, type_id, param_idx == nr_params))
> >>  			return -1;
> >> @@ -762,6 +764,31 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
> >>  	return id;
> >>  }
> >>  
> >> +static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
> >> +{
> >> +	int btf_fnproto_id, btf_fn_id, tag_type_id;
> >> +	struct llvm_annotation *annot;
> >> +	const char *name;
> >> +
> >> +	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
> >> +	name = function__name(fn);
> >> +	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> >> +	if (btf_fnproto_id < 0 || btf_fn_id < 0) {
> >> +		printf("error: failed to encode function '%s'\n", function__name(fn));
> >> +		return -1;
> >> +	}
> >> +	list_for_each_entry(annot, &fn->annots, node) {
> >> +		tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id,
> >> +							annot->component_idx);
> >> +		if (tag_type_id < 0) {
> >> +			fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
> >> +				annot->value, name, annot->component_idx);
> >> +			return -1;
> >> +		}
> >> +	}
> >> +	return 0;
> >> +}
> >> +
> >>  /*
> >>   * This corresponds to the same macro defined in
> >>   * include/linux/kallsyms.h
> >> @@ -859,22 +886,21 @@ static void dump_invalid_symbol(const char *msg, const char *sym,
> >>  	fprintf(stderr, "PAHOLE: Error: Use '--btf_encode_force' to ignore such symbols and force emit the btf.\n");
> >>  }
> >>  
> >> -static int tag__check_id_drift(const struct tag *tag,
> >> -			       uint32_t core_id, uint32_t btf_type_id,
> >> -			       uint32_t type_id_off)
> >> +static int tag__check_id_drift(struct btf_encoder *encoder, const struct tag *tag,
> >> +			       uint32_t core_id, uint32_t btf_type_id)
> >>  {
> >> -	if (btf_type_id != (core_id + type_id_off)) {
> >> +	if (btf_type_id != (core_id + encoder->type_id_off)) {
> >>  		fprintf(stderr,
> >>  			"%s: %s id drift, core_id: %u, btf_type_id: %u, type_id_off: %u\n",
> >>  			__func__, dwarf_tag_name(tag->tag),
> >> -			core_id, btf_type_id, type_id_off);
> >> +			core_id, btf_type_id, encoder->type_id_off);
> >>  		return -1;
> >>  	}
> >>  
> >>  	return 0;
> >>  }
> >>  
> >> -static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off)
> >> +static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag)
> >>  {
> >>  	struct type *type = tag__type(tag);
> >>  	struct class_member *pos;
> >> @@ -896,7 +922,8 @@ static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct
> >>  		 * is required.
> >>  		 */
> >>  		name = class_member__name(pos);
> >> -		if (btf_encoder__add_field(encoder, name, type_id_off + pos->tag.type, pos->bitfield_size, pos->bit_offset))
> >> +		if (btf_encoder__add_field(encoder, name, encoder->type_id_off + pos->tag.type,
> >> +					   pos->bitfield_size, pos->bit_offset))
> >>  			return -1;
> >>  	}
> >>  
> >> @@ -936,11 +963,11 @@ static int32_t btf_encoder__add_enum_type(struct btf_encoder *encoder, struct ta
> >>  	return type_id;
> >>  }
> >>  
> >> -static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off,
> >> +static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
> >>  				   struct conf_load *conf_load)
> >>  {
> >>  	/* single out type 0 as it represents special type "void" */
> >> -	uint32_t ref_type_id = tag->type == 0 ? 0 : type_id_off + tag->type;
> >> +	uint32_t ref_type_id = tag->type == 0 ? 0 : encoder->type_id_off + tag->type;
> >>  	struct base_type *bt;
> >>  	const char *name;
> >>  
> >> @@ -970,7 +997,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
> >>  		if (tag__type(tag)->declaration)
> >>  			return btf_encoder__add_ref_type(encoder, BTF_KIND_FWD, 0, name, tag->tag == DW_TAG_union_type);
> >>  		else
> >> -			return btf_encoder__add_struct_type(encoder, tag, type_id_off);
> >> +			return btf_encoder__add_struct_type(encoder, tag);
> >>  	case DW_TAG_array_type:
> >>  		/* TODO: Encode one dimension at a time. */
> >>  		encoder->need_index_type = true;
> >> @@ -978,7 +1005,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
> >>  	case DW_TAG_enumeration_type:
> >>  		return btf_encoder__add_enum_type(encoder, tag, conf_load);
> >>  	case DW_TAG_subroutine_type:
> >> -		return btf_encoder__add_func_proto(encoder, tag__ftype(tag), type_id_off);
> >> +		return btf_encoder__add_func_proto(encoder, tag__ftype(tag));
> >>          case DW_TAG_unspecified_type:
> >>  		/* Just don't encode this for now, converting anything with this type to void (0) instead.
> >>  		 *
> >> @@ -1281,7 +1308,7 @@ static bool ftype__has_arg_names(const struct ftype *ftype)
> >>  	return true;
> >>  }
> >>  
> >> -static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_t type_id_off)
> >> +static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
> >>  {
> >>  	struct cu *cu = encoder->cu;
> >>  	uint32_t core_id;
> >> @@ -1366,7 +1393,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_
> >>  			continue;
> >>  		}
> >>  
> >> -		type = var->ip.tag.type + type_id_off;
> >> +		type = var->ip.tag.type + encoder->type_id_off;
> >>  		linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC;
> >>  
> >>  		if (encoder->verbose) {
> >> @@ -1507,7 +1534,6 @@ void btf_encoder__delete(struct btf_encoder *encoder)
> >>  
> >>  int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load)
> >>  {
> >> -	uint32_t type_id_off = btf__type_cnt(encoder->btf) - 1;
> >>  	struct llvm_annotation *annot;
> >>  	int btf_type_id, tag_type_id, skipped_types = 0;
> >>  	uint32_t core_id;
> >> @@ -1516,21 +1542,24 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
> >>  	int err = 0;
> >>  
> >>  	encoder->cu = cu;
> >> +	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
> >> +	if (encoder->cu->unspecified_type.tag)
> >> +		encoder->unspecified_type = encoder->cu->unspecified_type.type;
> >>  
> >>  	if (!encoder->has_index_type) {
> >>  		/* cu__find_base_type_by_name() takes "type_id_t *id" */
> >>  		type_id_t id;
> >>  		if (cu__find_base_type_by_name(cu, "int", &id)) {
> >>  			encoder->has_index_type = true;
> >> -			encoder->array_index_id = type_id_off + id;
> >> +			encoder->array_index_id = encoder->type_id_off + id;
> >>  		} else {
> >>  			encoder->has_index_type = false;
> >> -			encoder->array_index_id = type_id_off + cu->types_table.nr_entries;
> >> +			encoder->array_index_id = encoder->type_id_off + cu->types_table.nr_entries;
> >>  		}
> >>  	}
> >>  
> >>  	cu__for_each_type(cu, core_id, pos) {
> >> -		btf_type_id = btf_encoder__encode_tag(encoder, pos, type_id_off, conf_load);
> >> +		btf_type_id = btf_encoder__encode_tag(encoder, pos, conf_load);
> >>  
> >>  		if (btf_type_id == 0) {
> >>  			++skipped_types;
> >> @@ -1538,7 +1567,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
> >>  		}
> >>  
> >>  		if (btf_type_id < 0 ||
> >> -		    tag__check_id_drift(pos, core_id, btf_type_id + skipped_types, type_id_off)) {
> >> +		    tag__check_id_drift(encoder, pos, core_id, btf_type_id + skipped_types)) {
> >>  			err = -1;
> >>  			goto out;
> >>  		}
> >> @@ -1572,7 +1601,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
> >>  			continue;
> >>  		}
> >>  
> >> -		btf_type_id = type_id_off + core_id;
> >> +		btf_type_id = encoder->type_id_off + core_id;
> >>  		ns = tag__namespace(pos);
> >>  		list_for_each_entry(annot, &ns->annots, node) {
> >>  			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_type_id, annot->component_idx);
> >> @@ -1585,8 +1614,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
> >>  	}
> >>  
> >>  	cu__for_each_function(cu, core_id, fn) {
> >> -		int btf_fnproto_id, btf_fn_id;
> >> -		const char *name;
> >>  
> >>  		/*
> >>  		 * Skip functions that:
> >> @@ -1616,27 +1643,13 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
> >>  				continue;
> >>  		}
> >>  
> >> -		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto, type_id_off);
> >> -		name = function__name(fn);
> >> -		btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> >> -		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
> >> -			err = -1;
> >> -			printf("error: failed to encode function '%s'\n", function__name(fn));
> >> +		err = btf_encoder__add_func(encoder, fn);
> >> +		if (err)
> >>  			goto out;
> >> -		}
> >> -
> >> -		list_for_each_entry(annot, &fn->annots, node) {
> >> -			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id, annot->component_idx);
> >> -			if (tag_type_id < 0) {
> >> -				fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
> >> -					annot->value, name, annot->component_idx);
> >> -				goto out;
> >> -			}
> >> -		}
> >>  	}
> >>  
> >>  	if (!encoder->skip_encoding_vars)
> >> -		err = btf_encoder__encode_cu_variables(encoder, type_id_off);
> >> +		err = btf_encoder__encode_cu_variables(encoder);
> >>  out:
> >>  	encoder->cu = NULL;
> >>  	return err;
> >> -- 
> >> 1.8.3.1
> >>
> > 

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 17:01                                     ` Arnaldo Carvalho de Melo
  2023-02-01 17:18                                       ` Alan Maguire
@ 2023-02-01 22:32                                       ` Arnaldo Carvalho de Melo
  2023-02-02  1:09                                         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-01 22:32 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Alan Maguire, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

Em Wed, Feb 01, 2023 at 02:01:47PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Feb 01, 2023 at 08:49:07AM -0800, Alexei Starovoitov escreveu:
> > On Wed, Feb 1, 2023 at 7:19 AM David Vernet <void@manifault.com> wrote:
> > >
> > > On Wed, Feb 01, 2023 at 12:02:07PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
> > > > > On 01/02/2023 03:02, David Vernet wrote:
> > > > > > On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
> > > > > >> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
> > > > > >>>
> > > > > >>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
> > > > > >>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
> > > > > >>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
> > > > > >>>>> <alexei.starovoitov@gmail.com> wrote:
> > > > > >>>>>>
> > > > > >>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
> > > > > >>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > >>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
> > > > > >>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
> > > > > >>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > >>>>>>>>>>>> +++ b/dwarves.h
> > > > > >>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
> > > > > >>>>>>>>>>>>   uint8_t          has_addr_info:1;
> > > > > >>>>>>>>>>>>   uint8_t          uses_global_strings:1;
> > > > > >>>>>>>>>>>>   uint8_t          little_endian:1;
> > > > > >>>>>>>>>>>> + uint8_t          nr_register_params;
> > > > > >>>>>>>>>>>>   uint16_t         language;
> > > > > >>>>>>>>>>>>   unsigned long    nr_inline_expansions;
> > > > > >>>>>>>>>>>>   size_t           size_inline_expansions;
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
> > > > > >>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
> > > > > >>>>>>>>>> defined if using a very old elf.h; below works around this
> > > > > >>>>>>>>>> (dwarves otherwise builds fine on this system).
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Ok, will add it and will test with containers for older distros too.
> > > > > >>>>>>>>
> > > > > > >>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
> > > > > >>>>>>>> repo at:
> > > > > >>>>>>>>
> > > > > >>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
> > > > > >>>>>>>>
> > > > > >>>>>>>> It failed yesterday and today due to problems with the installation of
> > > > > >>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
> > > > > >>>>>>>> notifications floating by.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
> > > > > >>>>>>>> iterator that Jiri noticed.
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
> > > > > >>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
> > > > > >>>>>>> from the BTF representation, and as a result:
> > > > > >>>>>>>
> > > > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > > > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > > > > >>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> > > > > >>>>>>>
> > > > > >>>>>>> Not sure why I didn't notice this previously.
> > > > > >>>>>>>
> > > > > >>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
> > > > > >>>>>>>
> > > > > >>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > > > >>>>>>> {
> > > > > >>>>>>>         return -EOPNOTSUPP;
> > > > > >>>>>>> }
> > > > > >>>>>>>
> > > > > >>>>>>> looks like this:
> > > > > >>>>>>>
> > > > > >>>>>>>    <8af83a2>   DW_AT_external    : 1
> > > > > >>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
> > > > > >>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
> > > > > >>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
> > > > > >>>>>>>     <8af83a9>   DW_AT_decl_column : 5
> > > > > >>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
> > > > > >>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
> > > > > >>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
> > > > > >>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
> > > > > >>>>>>>     <8af83b3>   DW_AT_name        : ctx
> > > > > >>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
> > > > > >>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
> > > > > >>>>>>>     <8af83ba>   DW_AT_decl_column : 51
> > > > > >>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
> > > > > >>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
> > > > > >>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
> > > > > >>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
> > > > > >>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
> > > > > >>>>>>>     <8af83c7>   DW_AT_decl_column : 61
> > > > > >>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
> > > > > >>>>>>>
> > > > > >>>>>>> ...and because there are no further abstract origin references
> > > > > >>>>>>> with location information either, we classify it as lacking
> > > > > >>>>>>> locations for (some of) the parameters, and as a result
> > > > > >>>>>>> we skip BTF encoding. We can work around that by doing this:
> > > > > >>>>>>>
> > > > > >>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> > > > > >>>>>>
> > > > > >>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
> > > > > >>>>>>
> > > > > > >>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
> > > > > >>>>>> applied to arguments.
> > > > > >>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
> > > > > >>>>>> the way we do in selftests.
> > > > > >>>>>
> > > > > >>>>> There is also
> > > > > >>>>> # define __visible __attribute__((__externally_visible__))
> > > > > >>>>> that probably fits the best here.
> > > > > >>>>>
> > > > > >>>>
> > > > > > >>>> testing thus far seems to show that for x86_64, David's series
> > > > > >>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
> > > > > >>>> to cover recently-arrived kfuncs like cpumask) is sufficient
> > > > > >>>> to avoid resolve_btfids warnings.
> > > > > >>>
> > > > > >>> Nice. Alexei -- lmk how you want to proceed. I think using the
> > > > > >>> __bpf_kfunc macro in the short term (with __used and noinline) is
> > > > > >>> probably the least controversial way to unblock this, but am open to
> > > > > >>> other suggestions.
> > > > > >>
> > > > > >> Sounds good to me, but sounds like __used and noinline are not
> > > > > >> enough to address the issues on aarch64?
> > > > > >
> > > > > > Indeed, we'll have to make sure that's also addressed. Alan -- did you
> > > > > > try Alexei's suggestion to use __weak? Does that fix the issue for
> > > > > > aarch64? I'm still confused as to why it's only complaining for a small
> > > > > > subset of kfuncs, which include those that have external linkage.
> > > > > >
> > > > >
> > > > > I finally got to the bottom of the aarch64 issues; there was a 1-line bug
> > > > > in the changes I made to the DWARF handling code which leads to BTF generation;
> > > > > it was excluding a bunch of functions incorrectly, marking them as optimized out.
> > > > > The fix is:
> > > > >
> > > > > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > > > > index dba2d37..8364e17 100644
> > > > > --- a/dwarf_loader.c
> > > > > +++ b/dwarf_loader.c
> > > > > @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
> > > > >                         Dwarf_Op *expr = loc.expr;
> > > > >
> > > > >                         switch (expr->atom) {
> > > > > -                       case DW_OP_reg1 ... DW_OP_reg31:
> > > > > +                       case DW_OP_reg0 ... DW_OP_reg31:
> > > > >                         case DW_OP_breg0 ... DW_OP_breg31:
> > > > >                                 break;
> > > > >                         default:
> > > > >
> > > > > ..and because reg0 is the first parameter for aarch64, we were
> > > > > incorrectly landing in the "default:" of the switch statement
> > > > > and marking a bunch of functions as optimized out
> > > > > because we thought the first argument was. Sorry about this,
> > > > > and thanks for all the suggestions!
> > >
> > > > Great, so noinline and __used with __bpf_kfunc sounds like the way forward
> > > in the short term. Arnaldo / Alexei -- how do you want to resolve the
> > > dependency here? Going through bpf-next is probably a good idea so that
> > > we get proper CI coverage, and any kfuncs added to bpf-next after this
> > > can use the macro. Does that work for you?
> > 
> > It feels fixed pahole should be done under some flag
> > otherwise when people update the pahole the existing and older
> > kernels might stop building with warns:
> > WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > ...
> > 
> > Arnaldo, could you check what warns do you see with this fixed pahole
> > in bpf tree ?
> 
> Sure.

These appeared with a distro-like .config:

  BTFIDS  vmlinux
WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
WARN: resolve_btfids: unresolved symbol bpf_cpumask_any
WARN: resolve_btfids: unresolved symbol bpf_ct_change_status

I'll do it with allmodconfig
 
> > If there are only few warns then we can manually add __used noinline
> > to these places, push to bpf tree and push to stable.
> > 
> > Then in bpf-next we can clean up everything with __bpf_kfunc.
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 17:18                                       ` Alan Maguire
  2023-02-01 18:54                                         ` Arnaldo Carvalho de Melo
@ 2023-02-01 22:33                                         ` Alan Maguire
  1 sibling, 0 replies; 40+ messages in thread
From: Alan Maguire @ 2023-02-01 22:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Alexei Starovoitov
  Cc: David Vernet, Yonghong Song, Alexei Starovoitov, Jiri Olsa,
	Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

On 01/02/2023 17:18, Alan Maguire wrote:
> On 01/02/2023 17:01, Arnaldo Carvalho de Melo wrote:
>> Em Wed, Feb 01, 2023 at 08:49:07AM -0800, Alexei Starovoitov escreveu:
>>> On Wed, Feb 1, 2023 at 7:19 AM David Vernet <void@manifault.com> wrote:
>>>>
>>>> On Wed, Feb 01, 2023 at 12:02:07PM -0300, Arnaldo Carvalho de Melo wrote:
>>>>> Em Wed, Feb 01, 2023 at 01:59:30PM +0000, Alan Maguire escreveu:
>>>>>> On 01/02/2023 03:02, David Vernet wrote:
>>>>>>> On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
>>>>>>>> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
>>>>>>>>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
>>>>>>>>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
>>>>>>>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>>>>>>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>>>>>>>> +++ b/dwarves.h
>>>>>>>>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>>>>>>>>>>>>>   uint8_t          has_addr_info:1;
>>>>>>>>>>>>>>>>>>   uint8_t          uses_global_strings:1;
>>>>>>>>>>>>>>>>>>   uint8_t          little_endian:1;
>>>>>>>>>>>>>>>>>> + uint8_t          nr_register_params;
>>>>>>>>>>>>>>>>>>   uint16_t         language;
>>>>>>>>>>>>>>>>>>   unsigned long    nr_inline_expansions;
>>>>>>>>>>>>>>>>>>   size_t           size_inline_expansions;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>>>>>>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>>>>>>>>>>>>>> defined if using a very old elf.h; below works around this
>>>>>>>>>>>>>>>> (dwarves otherwise builds fine on this system).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ok, will add it and will test with containers for older distros too.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
>>>>>>>>>>>>>> repo at:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It failed yesterday and today due to problems with the installation of
>>>>>>>>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
>>>>>>>>>>>>>> notifications floating by.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
>>>>>>>>>>>>>> iterator that Jiri noticed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
>>>>>>>>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
>>>>>>>>>>>>> from the BTF representation, and as a result:
>>>>>>>>>>>>>
>>>>>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>>>>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>>>>>>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>>>>>>>>>>>>>
>>>>>>>>>>>>> Not sure why I didn't notice this previously.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
>>>>>>>>>>>>>
>>>>>>>>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>>>>>>>> {
>>>>>>>>>>>>>         return -EOPNOTSUPP;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> looks like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    <8af83a2>   DW_AT_external    : 1
>>>>>>>>>>>>>     <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>>>>>>>>>>>>>     <8af83a6>   DW_AT_decl_file   : 5
>>>>>>>>>>>>>     <8af83a7>   DW_AT_decl_line   : 737
>>>>>>>>>>>>>     <8af83a9>   DW_AT_decl_column : 5
>>>>>>>>>>>>>     <8af83aa>   DW_AT_prototyped  : 1
>>>>>>>>>>>>>     <8af83aa>   DW_AT_type        : <0x8ad8547>
>>>>>>>>>>>>>     <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>>>>>>>>>>>>>  <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>>>>>>>>>>>>>     <8af83b3>   DW_AT_name        : ctx
>>>>>>>>>>>>>     <8af83b7>   DW_AT_decl_file   : 5
>>>>>>>>>>>>>     <8af83b8>   DW_AT_decl_line   : 737
>>>>>>>>>>>>>     <8af83ba>   DW_AT_decl_column : 51
>>>>>>>>>>>>>     <8af83bb>   DW_AT_type        : <0x8af421d>
>>>>>>>>>>>>>  <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>>>>>>>>>>>>>     <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>>>>>>>>>>>>>     <8af83c4>   DW_AT_decl_file   : 5
>>>>>>>>>>>>>     <8af83c5>   DW_AT_decl_line   : 737
>>>>>>>>>>>>>     <8af83c7>   DW_AT_decl_column : 61
>>>>>>>>>>>>>     <8af83c8>   DW_AT_type        : <0x8adc424>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...and because there are no further abstract origin references
>>>>>>>>>>>>> with location information either, we classify it as lacking
>>>>>>>>>>>>> locations for (some of) the parameters, and as a result
>>>>>>>>>>>>> we skip BTF encoding. We can work around that by doing this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>>>>>>>
>>>>>>>>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
>>>>>>>>>>>>
>>>>>>>>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
>>>>>>>>>>>> applied to arguments.
>>>>>>>>>>>> If that won't work, please add barrier_var(arg) to the body of kfunc
>>>>>>>>>>>> the way we do in selftests.
>>>>>>>>>>>
>>>>>>>>>>> There is also
>>>>>>>>>>> # define __visible __attribute__((__externally_visible__))
>>>>>>>>>>> that probably fits the best here.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> testing thus far seems to show that for x86_64, David's series
>>>>>>>>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
>>>>>>>>>> to cover recently-arrived kfuncs like cpumask) is sufficient
>>>>>>>>>> to avoid resolve_btfids warnings.
>>>>>>>>>
>>>>>>>>> Nice. Alexei -- lmk how you want to proceed. I think using the
>>>>>>>>> __bpf_kfunc macro in the short term (with __used and noinline) is
>>>>>>>>> probably the least controversial way to unblock this, but am open to
>>>>>>>>> other suggestions.
>>>>>>>>
>>>>>>>> Sounds good to me, but sounds like __used and noinline are not
>>>>>>>> enough to address the issues on aarch64?
>>>>>>>
>>>>>>> Indeed, we'll have to make sure that's also addressed. Alan -- did you
>>>>>>> try Alexei's suggestion to use __weak? Does that fix the issue for
>>>>>>> aarch64? I'm still confused as to why it's only complaining for a small
>>>>>>> subset of kfuncs, which include those that have external linkage.
>>>>>>>
>>>>>>
>>>>>> I finally got to the bottom of the aarch64 issues; there was a 1-line bug
>>>>>> in the changes I made to the DWARF handling code which leads to BTF generation;
>>>>>> it was excluding a bunch of functions incorrectly, marking them as optimized out.
>>>>>> The fix is:
>>>>>>
>>>>>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>>>>>> index dba2d37..8364e17 100644
>>>>>> --- a/dwarf_loader.c
>>>>>> +++ b/dwarf_loader.c
>>>>>> @@ -1074,7 +1074,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>>>>>                         Dwarf_Op *expr = loc.expr;
>>>>>>
>>>>>>                         switch (expr->atom) {
>>>>>> -                       case DW_OP_reg1 ... DW_OP_reg31:
>>>>>> +                       case DW_OP_reg0 ... DW_OP_reg31:
>>>>>>                         case DW_OP_breg0 ... DW_OP_breg31:
>>>>>>                                 break;
>>>>>>                         default:
>>>>>>
>>>>>> ..and because reg0 is the first parameter for aarch64, we were
>>>>>> incorrectly landing in the "default:" of the switch statement
>>>>>> and marking a bunch of functions as optimized out
>>>>>> because we thought the first argument was. Sorry about this,
>>>>>> and thanks for all the suggestions!
>>>>
>>>> Great, so noinline and __used with __bpf_kfunc sounds like the way forward
>>>> in the short term. Arnaldo / Alexei -- how do you want to resolve the
>>>> dependency here? Going through bpf-next is probably a good idea so that
>>>> we get proper CI coverage, and any kfuncs added to bpf-next after this
>>>> can use the macro. Does that work for you?
>>>
>>> It feels fixed pahole should be done under some flag
>>> otherwise when people update the pahole the existing and older
>>> kernels might stop building with warns:
>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>> ...
>>>
> 
> Good point, something like
> 
> --skip_inconsistent_proto	Skip functions that have multiple inconsistent
> 				function prototypes sharing the same name, or
> 				have optimized-out parameters.
> 
> ? Implementation needs a bit of thought though because we're
> not really doing the same thing that we were before. Previously we
> were adding the first instance of a function in the CU we came across.
> Probably safest to resurrect that behaviour for the legacy
> non-skip-inconsistent-proto case I think. The final patch handling
> inconsistent function prototypes will need to be reworked a bit to 
> support this, since we tossed this approach and used saving/merging 
> multiple instances in the tree instead.  Once I've built bpf trees I'll
> have a go at getting this working.
> 
>>> Arnaldo, could you check what warns do you see with this fixed pahole
>>> in bpf tree ?
>>
>> Sure.
>>
> 
> I can collect this for x86_64/aarch64 too; might take a few hours
> before I have the results.
>

The results I'm seeing with the bpf tree across x86_64 and aarch64 are 
consistent using the updated pahole:

WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
WARN: resolve_btfids: unresolved symbol bpf_ct_change_status

 
>>> If there are only few warns then we can manually add __used noinline
>>> to these places, push to bpf tree and push to stable.
>>>
>>> Then in bpf-next we can clean up everything with __bpf_kfunc.
>>

If the skipping of inconsistent prototype functions happens under 
a flag and not by default, presumably we'd have something like
a 3-patch series for bpf; one patch with an update to scripts/pahole-flags.sh

if [ "${pahole_ver}" -ge "125" ]; then
	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto"
fi

...so that we can enable building with 1.25, and then two additional patches adding
__used noinline prefixes to bpf_ct_change_status() and bpf_task_kptr_get(); splitting
these out into separate patches would probably make sense, as different stable trees
might need one but not the other. I _think_ that's what you have in mind, is that
right?


* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01 22:32                                       ` Arnaldo Carvalho de Melo
@ 2023-02-02  1:09                                         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 40+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-02  1:09 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Alan Maguire, Yonghong Song, Alexei Starovoitov,
	Jiri Olsa, Eddy Z, sinquersw, Timo Beckers, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf

Em Wed, Feb 01, 2023 at 07:32:04PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Feb 01, 2023 at 02:01:47PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Feb 01, 2023 at 08:49:07AM -0800, Alexei Starovoitov escreveu:
> > > > > Great, so noinline and __used with __bpf_kfunc sounds like the way forward
> > > > in the short term. Arnaldo / Alexei -- how do you want to resolve the
> > > > dependency here? Going through bpf-next is probably a good idea so that
> > > > we get proper CI coverage, and any kfuncs added to bpf-next after this
> > > > can use the macro. Does that work for you?
> > > 
> > > It feels fixed pahole should be done under some flag
> > > otherwise when people update the pahole the existing and older
> > > kernels might stop building with warns:
> > > WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> > > WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> > > ...
> > > 
> > > Arnaldo, could you check what warns do you see with this fixed pahole
> > > in bpf tree ?
> > 
> > Sure.
> 
> These appeared on a distro like .config:
> 
>   BTFIDS  vmlinux
> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> WARN: resolve_btfids: unresolved symbol bpf_cpumask_any
> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
> 
> I'll do it with allmodconfig

^C[1]+  Done                    nohup make -j32 O=../build/allmodconfig

⬢[acme@toolbox bpf-next]$
⬢[acme@toolbox bpf-next]$ grep "^WARN: resolve_btfids: " nohup.out
WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
WARN: resolve_btfids: unresolved symbol bpf_cpumask_any
WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
⬢[acme@toolbox bpf-next]$



* Re: [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters
  2023-02-01  3:02                           ` David Vernet
  2023-02-01 13:59                             ` Alan Maguire
@ 2023-02-03  1:09                             ` Yonghong Song
  1 sibling, 0 replies; 40+ messages in thread
From: Yonghong Song @ 2023-02-03  1:09 UTC (permalink / raw)
  To: David Vernet, Alexei Starovoitov
  Cc: Alan Maguire, Arnaldo Carvalho de Melo, Yonghong Song,
	Alexei Starovoitov, Jiri Olsa, Eddy Z, sinquersw, Timo Beckers,
	Daniel Borkmann, Andrii Nakryiko, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Martin KaFai Lau, bpf



On 1/31/23 7:02 PM, David Vernet wrote:
> On Tue, Jan 31, 2023 at 04:14:13PM -0800, Alexei Starovoitov wrote:
>> On Tue, Jan 31, 2023 at 3:59 PM David Vernet <void@manifault.com> wrote:
>>>
>>> On Tue, Jan 31, 2023 at 11:45:29PM +0000, Alan Maguire wrote:
>>>> On 31/01/2023 18:16, Alexei Starovoitov wrote:
>>>>> On Tue, Jan 31, 2023 at 9:43 AM Alexei Starovoitov
>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>
>>>>>> On Tue, Jan 31, 2023 at 4:14 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>>>
>>>>>>> On 31/01/2023 01:04, Arnaldo Carvalho de Melo wrote:
>>>>>>>> Em Mon, Jan 30, 2023 at 09:25:17PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>> Em Mon, Jan 30, 2023 at 10:37:56PM +0000, Alan Maguire escreveu:
>>>>>>>>>> On 30/01/2023 20:23, Arnaldo Carvalho de Melo wrote:
>>>>>>>>>>> Em Mon, Jan 30, 2023 at 05:10:51PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>>>>>>>>>> +++ b/dwarves.h
>>>>>>>>>>>> @@ -262,6 +262,7 @@ struct cu {
>>>>>>>>>>>>    uint8_t          has_addr_info:1;
>>>>>>>>>>>>    uint8_t          uses_global_strings:1;
>>>>>>>>>>>>    uint8_t          little_endian:1;
>>>>>>>>>>>> + uint8_t          nr_register_params;
>>>>>>>>>>>>    uint16_t         language;
>>>>>>>>>>>>    unsigned long    nr_inline_expansions;
>>>>>>>>>>>>    size_t           size_inline_expansions;
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks for this, never thought of cross-builds to be honest!
>>>>>>>>>
>>>>>>>>>> Tested just now on x86_64 and aarch64 at my end, just ran
>>>>>>>>>> into one small thing on one system; turns out EM_RISCV isn't
>>>>>>>>>> defined if using a very old elf.h; below works around this
>>>>>>>>>> (dwarves otherwise builds fine on this system).
>>>>>>>>>
>>>>>>>>> Ok, will add it and will test with containers for older distros too.
>>>>>>>>
>>>>>>>> It's on the 'next' branch, so that it gets tested in the libbpf github
>>>>>>>> repo at:
>>>>>>>>
>>>>>>>> https://github.com/libbpf/libbpf/actions/workflows/pahole.yml
>>>>>>>>
>>>>>>>> It failed yesterday and today due to problems with the installation of
>>>>>>>> llvm, probably tomorrow it'll be back working as I saw some
>>>>>>>> notifications floating by.
>>>>>>>>
>>>>>>>> I added the conditional EM_RISCV definition as well as removed the dup
>>>>>>>> iterator that Jiri noticed.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks again Arnaldo! I've hit an issue with this series in
>>>>>>> BTF encoding of kfuncs; specifically we see some kfuncs missing
>>>>>>> from the BTF representation, and as a result:
>>>>>>>
>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>>>>>>> WARN: resolve_btfids: unresolved symbol bpf_ct_change_status
>>>>>>>
>>>>>>> Not sure why I didn't notice this previously.
>>>>>>>
>>>>>>> The problem is the DWARF - and therefore BTF - generated for a function like
>>>>>>>
>>>>>>> int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>> {
>>>>>>>          return -EOPNOTSUPP;
>>>>>>> }
>>>>>>>
>>>>>>> looks like this:
>>>>>>>
>>>>>>>      <8af83a2>   DW_AT_external    : 1
>>>>>>>      <8af83a2>   DW_AT_name        : (indirect string, offset: 0x358bdc): bpf_xdp_metadata_rx_hash
>>>>>>>      <8af83a6>   DW_AT_decl_file   : 5
>>>>>>>      <8af83a7>   DW_AT_decl_line   : 737
>>>>>>>      <8af83a9>   DW_AT_decl_column : 5
>>>>>>>      <8af83aa>   DW_AT_prototyped  : 1
>>>>>>>      <8af83aa>   DW_AT_type        : <0x8ad8547>
>>>>>>>      <8af83ae>   DW_AT_sibling     : <0x8af83cd>
>>>>>>>   <2><8af83b2>: Abbrev Number: 38 (DW_TAG_formal_parameter)
>>>>>>>      <8af83b3>   DW_AT_name        : ctx
>>>>>>>      <8af83b7>   DW_AT_decl_file   : 5
>>>>>>>      <8af83b8>   DW_AT_decl_line   : 737
>>>>>>>      <8af83ba>   DW_AT_decl_column : 51
>>>>>>>      <8af83bb>   DW_AT_type        : <0x8af421d>
>>>>>>>   <2><8af83bf>: Abbrev Number: 35 (DW_TAG_formal_parameter)
>>>>>>>      <8af83c0>   DW_AT_name        : (indirect string, offset: 0x27f6a2): hash
>>>>>>>      <8af83c4>   DW_AT_decl_file   : 5
>>>>>>>      <8af83c5>   DW_AT_decl_line   : 737
>>>>>>>      <8af83c7>   DW_AT_decl_column : 61
>>>>>>>      <8af83c8>   DW_AT_type        : <0x8adc424>
>>>>>>>
>>>>>>> ...and because there are no further abstract origin references
>>>>>>> with location information either, we classify it as lacking
>>>>>>> locations for (some of) the parameters, and as a result
>>>>>>> we skip BTF encoding. We can work around that by doing this:
>>>>>>>
>>>>>>> __attribute__ ((optimize("O0"))) int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>>>>>>
>>>>>> replied in the other thread. This attr is broken and discouraged by gcc.
>>>>>>
>>>>>> For kfuncs where args are unused, please try __used and __maybe_unused
>>>>>> applied to arguments.
>>>>>> If that won't work, please add barrier_var(arg) to the body of the kfunc
>>>>>> the way we do in selftests.
>>>>>
>>>>> There is also
>>>>> # define __visible __attribute__((__externally_visible__))
>>>>> that probably fits the best here.
>>>>>
>>>>
>>>> Testing thus far seems to show that for x86_64, David's series
>>>> (using __used noinline in the BPF_KFUNC() wrapper and extended
>>>> to cover recently-arrived kfuncs like cpumask) is sufficient
>>>> to avoid resolve_btfids warnings.
>>>
>>> Nice. Alexei -- lmk how you want to proceed. I think using the
>>> __bpf_kfunc macro in the short term (with __used and noinline) is
>>> probably the least controversial way to unblock this, but am open to
>>> other suggestions.
>>
>> Sounds good to me, but sounds like __used and noinline are not
>> enough to address the issues on aarch64?
> 
> Indeed, we'll have to make sure that's also addressed. Alan -- did you
> try Alexei's suggestion to use __weak? Does that fix the issue for
> aarch64? I'm still confused as to why it's only complaining for a small
> subset of kfuncs, which include those that have external linkage.
> 
>>
>>> Yeah, I tend to think we should try to avoid using hidden / visible
>>> attributes given that (to my knowledge) they're really more meant for
>>> controlling whether a symbol is exported from a shared object rather
>>> than controlling what the compiler is doing when it creates the
>>> compilation unit. One could imagine that in an LTO build, the compiler
>>> would still optimize the function regardless of its visibility for that
>>> reason, though it's possible I don't have the full picture.
>>
>> __visible is specifically done to prevent optimization of
>> functions that are externally visible. That should address LTO concerns.
>> We haven't seen LTO messing up anything. Just something to keep in mind.
> 
> Ah, fair enough. I was conflating that with the visibility("...")
> attribute. As you pointed out, __visible is something else entirely, and
> is meant to avoid possible issues with LTO.
> 
> One other option we could consider is enforcing that kfuncs must have
> global linkage and can't be static. If we did that, it seems like

Do we really want static functions to be kfuncs? It may work if we ensure
the same function name is not used in other files. But it sounds weird,
since a kfunc can be considered an 'exported' function (to be used by
bpf programs), which in general should have global linkage?

> __visible would be a viable option. Though we'd have to verify that it
> addresses the issue w/ aarch64.
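
For illustration only, the combination discussed above (global linkage plus
"__used noinline" on a kfunc whose arguments are unused) could be sketched
in plain userspace C roughly as below. The names KFUNC_SKETCH,
struct xdp_md_sketch and rx_hash_sketch are hypothetical stand-ins, not
kernel identifiers, and whether these attributes are enough to keep
parameter locations in DWARF on every architecture (aarch64 in particular)
is the question this thread leaves open.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a kfunc annotation (assumption, not the
 * kernel's macro): "used" asks the compiler to emit the function even if
 * nothing in this file references it, and "noinline" asks it to keep an
 * out-of-line copy instead of inlining it into callers.
 */
#define KFUNC_SKETCH __attribute__((used, noinline))

/* Placeholder type for illustration; not the real struct xdp_md. */
struct xdp_md_sketch {
	uint32_t data;
};

/*
 * Global (non-static) linkage, mirroring the stub quoted earlier in the
 * thread: both parameters are unused and the body only returns an error.
 */
KFUNC_SKETCH int rx_hash_sketch(const struct xdp_md_sketch *ctx, uint32_t *hash)
{
	/* The casts only silence unused-parameter warnings. */
	(void)ctx;
	(void)hash;
	return -EOPNOTSUPP;
}
```

Calling rx_hash_sketch(NULL, NULL) returns -EOPNOTSUPP; the point of the
sketch is only the placement of the attributes and the non-static linkage.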


end of thread, other threads:[~2023-02-03  1:10 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
2023-01-30 14:29 [PATCH v2 dwarves 0/5] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
2023-01-30 14:29 ` [PATCH v2 dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters Alan Maguire
2023-01-30 18:36   ` Arnaldo Carvalho de Melo
2023-01-30 20:10     ` Arnaldo Carvalho de Melo
2023-01-30 20:23       ` Arnaldo Carvalho de Melo
2023-01-30 22:37         ` Alan Maguire
2023-01-31  0:25           ` Arnaldo Carvalho de Melo
2023-01-31  1:04             ` Arnaldo Carvalho de Melo
2023-01-31 12:14               ` Alan Maguire
2023-01-31 12:33                 ` Arnaldo Carvalho de Melo
2023-01-31 13:35                   ` Jiri Olsa
2023-01-31 17:43                 ` Alexei Starovoitov
2023-01-31 18:16                   ` Alexei Starovoitov
2023-01-31 23:45                     ` Alan Maguire
2023-01-31 23:58                       ` David Vernet
2023-02-01  0:14                         ` Alexei Starovoitov
2023-02-01  3:02                           ` David Vernet
2023-02-01 13:59                             ` Alan Maguire
2023-02-01 15:02                               ` Arnaldo Carvalho de Melo
2023-02-01 15:13                                 ` Alan Maguire
2023-02-01 15:19                                 ` David Vernet
2023-02-01 16:49                                   ` Alexei Starovoitov
2023-02-01 17:01                                     ` Arnaldo Carvalho de Melo
2023-02-01 17:18                                       ` Alan Maguire
2023-02-01 18:54                                         ` Arnaldo Carvalho de Melo
2023-02-01 22:33                                         ` Alan Maguire
2023-02-01 22:32                                       ` Arnaldo Carvalho de Melo
2023-02-02  1:09                                         ` Arnaldo Carvalho de Melo
2023-02-03  1:09                             ` Yonghong Song
2023-01-30 14:29 ` [PATCH v2 dwarves 2/5] btf_encoder: refactor function addition into dedicated btf_encoder__add_func Alan Maguire
2023-02-01 17:19   ` Arnaldo Carvalho de Melo
2023-02-01 17:50     ` Alan Maguire
2023-02-01 18:59       ` Arnaldo Carvalho de Melo
2023-01-30 14:29 ` [PATCH v2 dwarves 3/5] btf_encoder: rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
2023-01-30 22:04   ` Jiri Olsa
2023-01-31  0:24     ` Arnaldo Carvalho de Melo
2023-01-30 14:29 ` [PATCH v2 dwarves 4/5] btf_encoder: represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
2023-01-30 14:29 ` [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies Alan Maguire
2023-01-30 17:20   ` Alexei Starovoitov
2023-01-30 18:08     ` Arnaldo Carvalho de Melo
