All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
@ 2023-02-07 17:14 Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 1/8] dwarf_loader: Help spotting functions with optimized-out parameters Alan Maguire
                   ` (9 more replies)
  0 siblings, 10 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:14 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

At optimization level -O2 or higher in gcc, static functions may be
optimized such that they have suffixes like .isra.0, .constprop.0 etc.
These represent 
    
- constant propagation (.constprop.0);
- interprocedural scalar replacement of aggregates, removal of
  unused parameters and replacement of parameters passed by
  reference by parameters passed by value (.isra.0)
  
See [1] for details. 
    
Currently BTF encoding does not handle such optimized functions
that get renamed with a "." suffix such as ".isra.0", ".constprop.0".
This is safer because such suffixes can often indicate parameters have
been optimized out.  This series addresses this by matching a
function to a suffixed version ("foo" matching "foo.isra.0") while
ensuring that the function signature does not contain optimized-out
parameters.  Note that if the function is found ("foo") it will
be preferred, only falling back to "foo.isra.0" if lookup of the
function fails.  Addition to BTF is skipped if the function has
optimized-out parameters, since the expected function signature
will not match. BTF encoding does not include the "."-suffix to
be consistent with DWARF. In addition, the kernel currently does
not allow a "." suffix in a BTF function name.

A problem with this approach however is that BTF carries out the
encoding process in parallel across multiple CUs, and sometimes
a function has optimized-out parameters in one CU but not others;
we see this for NF_HOOK.constprop.0 for example.  So in order to
determine if the function has optimized-out parameters in any
CU, its addition is not carried out until we have processed all
CUs and are about to merge BTF.  At this point we know if any
such optimizations have occurred.  Patches 1-5 handle the
optimized-out parameter identification and matching "."-suffixed
functions with the original function to facilitate BTF
encoding.  This feature can be enabled via the
"--btf_gen_optimized" option.

Patch 6 addresses a related problem - it is entirely possible
for a static function of the same name to exist in different
CUs with different function signatures.  Because BTF does not
currently encode any information that would help disambiguate
which BTF function specification matches which static function
(in the case of multiple different function signatures), it is
best to eliminate such functions from BTF for now.  The same
mechanism that is used to compare static "."-suffixed functions
is re-used for the static function comparison.  A superficial
comparison of number of parameters/parameter names is done to
see if such representations are consistent, and if inconsistent
prototypes are observed, the function is flagged for exclusion
from BTF.

When these methods are combined - the additive encoding of
"."-suffixed functions and the subtractive elimination of
functions with inconsistent parameters - we see an overall
drop in the number of functions in vmlinux BTF, from
51529 to 50246.  Skipping inconsistent functions is enabled
via "--skip_encoding_btf_inconsistent_proto".

Changes since v2 [2]
- Arnaldo incorporated some of the suggestions in the v2 thread;
  these patches are based on those; the relevant changes are
  noted as committer changes.
- Patch 1 is unchanged from v2, but the rest of the patches
  have been updated:
- Patch 2 separates out the changes to the struct btf_encoder
  that better support later addition of functions.
- Patch 3 then is changed insofar as these changes are no
  longer needed for the function addition refactoring.
- Patch 4 has a small change; we need to verify that an
  encoder has actually been added to the encoders list
  prior to removal
- Patch 5 changed significantly; when attempting to measure
  performance the relatively good numbers attained when using
  delayed function addition were not reproducible.
  Further analysis revealed that the large number of lookups
  caused by the presence of the separate function tree was
  a major cause of performance degradation in the multi
  threaded case.  So instead of maintaining a separate tree,
  we use the ELF function list which we already need to look
  up to match ELF -> DWARF function descriptions to store
  the function representation.  This has 2 benefits; firstly
  as mentioned, we already look up the ELF function so no
  additional lookup is required to save the function.
  Secondly, the ELF representation is identical for each
  encoder, so we can index the same function across multiple
  encoder function arrays - this greatly speeds up the
  processing of comparing function representations across
  encoders.  There is still a performance cost in this
  approach however; more details are provided in patch 6.
  An option specific to adding functions with "." suffixes
  is added "--btf_gen_optimized"
- Patch 6 builds on patch 5 in applying the save/merge/add
  approach for all functions using the same mechanisms.
  In addition the "--skip_encoding_btf_inconsistent_proto"
  option is introduced.
- Patches 7/8 document the new options in the pahole manual
  page.
  
Changes since v1 [3]

- Eduard noted that a DW_AT_const_value attribute can signal
  an optimized-out parameter, and that the lack of a location
  attribute signals optimization; ensure we handle those cases
  also (Eduard, patch 1).
- Jiri noted we can have inconsistencies between a static
  and non-static function; apply the comparison process to
  all functions (Jiri, patch 5)
- segmentation fault was observed when handling functions with
  > 10 parameters; needed parameter comparison loop to exit
  at BTF_ENCODER_MAX_PARAMETERS (patch 5)
- Kui-Feng Lee pointed out that having a global shared function
  tree would lead to a lot of contention; here a per-encoder 
  tree is used, and once the threads are collected the trees
  are merged. Performance numbers are provided in patch 5 
  (Kui-Feng Lee, patches 4/5)

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2] https://lore.kernel.org/bpf/1675088985-20300-1-git-send-email-alan.maguire@oracle.com/
[3] https://lore.kernel.org/bpf/1674567931-26458-1-git-send-email-alan.maguire@oracle.com/

Alan Maguire (8):
  dwarf_loader: Help spotting functions with optimized-out parameters
  btf_encoder: store type_id_off, unspecified type in encoder
  btf_encoder: Refactor function addition into dedicated
    btf_encoder__add_func
  btf_encoder: Rework btf_encoders__*() API to allow traversal of
    encoders
  btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
  btf_encoder: support delaying function addition to check for function
    prototype inconsistencies
  dwarves: document --btf_gen_optimized option
  dwarves: document --skip_encoding_btf_inconsistent_proto option

 btf_encoder.c      | 360 +++++++++++++++++++++++++++++++++++++--------
 btf_encoder.h      |   6 -
 dwarf_loader.c     | 130 +++++++++++++++-
 dwarves.h          |  11 +-
 man-pages/pahole.1 |  10 ++
 pahole.c           |  30 +++-
 6 files changed, 468 insertions(+), 79 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 1/8] dwarf_loader: Help spotting functions with optimized-out parameters
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
@ 2023-02-07 17:14 ` Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 2/8] btf_encoder: store type_id_off, unspecified type in encoder Alan Maguire
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:14 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

Compilation generates DWARF at several stages, and often the later DWARF
representations more accurately represent optimizations that have
occurred during compilation.

In particular, parameter representations can be spotted by their
abstract origin references to the original parameter, but they often
have more accurate location information.  In most cases, the parameter
locations will match calling conventions, and be registers for the first
6 parameters on x86_64, first 8 on ARM64 etc.  If the parameter is not a
register when it should be however, it is likely passed via the stack or
the compiler has used a constant representation instead.  The latter can
often be spotted by checking for a DW_AT_const_value attribute, as noted
by Eduard.

In addition, absence of a location tag (either across the abstract
origin reference and the original parameter, or in the standalone
parameter description) is evidence of an optimized-out parameter.
Presence of a location tag is stored in the parameter description and
shared between abstract tags and their original referents.

This change adds a field to parameters and their associated ftype to
note if a parameter has been optimized out.  Having this information
allows us to skip such functions, as their presence in CUs makes BTF
encoding impossible.

Committer notes:

Changed the NR_REGISTER_PARAMS definition from a if/elif/endif for the
native architecture into a function that uses the ELF header e_machine
to find the target architecture, to allow for cross builds.

Also avoided looking at location expression in parameter__new() when the
param_idx argument is -1, as is the case when creating 'struct
parameter' instances for DW_TAG_subroutine_type, since we don't have
such info, only for the 'struct parameter' instances created from
DW_TAG_subprogram.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/1675088985-20300-2-git-send-email-alan.maguire@oracle.com
Link: https://lore.kernel.org/r/9c330c78-e668-fa4c-e0ab-52aa445ccc00@oracle.com # DW_OP_reg0 is the first register on aarch64
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 dwarf_loader.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++---
 dwarves.h      |   6 ++-
 2 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 5a74035..7aaf1d4 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -52,6 +52,10 @@
 #define DW_OP_addrx 0xa1
 #endif
 
+#ifndef EM_RISCV
+#define EM_RISCV	243
+#endif
+
 static pthread_mutex_t libdw__lock = PTHREAD_MUTEX_INITIALIZER;
 
 static uint32_t hashtags__bits = 12;
@@ -992,13 +996,98 @@ static struct class_member *class_member__new(Dwarf_Die *die, struct cu *cu,
 	return member;
 }
 
-static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
+/* How many function parameters are passed via registers?  Used below in
+ * determining if an argument has been optimized out or if it is simply
+ * an argument > cu__nr_register_params().  Making cu__nr_register_params()
+ * return 0 allows unsupported architectures to skip tagging optimized-out
+ * values.
+ */
+static int arch__nr_register_params(const GElf_Ehdr *ehdr)
+{
+	switch (ehdr->e_machine) {
+	case EM_S390:	 return 5;
+	case EM_SPARC:
+	case EM_SPARCV9:
+	case EM_X86_64:	 return 6;
+	case EM_AARCH64:
+	case EM_ARC:
+	case EM_ARM:
+	case EM_MIPS:
+	case EM_PPC:
+	case EM_PPC64:
+	case EM_RISCV:	 return 8;
+	default:	 break;
+	}
+
+	return 0;
+}
+
+static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
+					struct conf_load *conf, int param_idx)
 {
 	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
 
 	if (parm != NULL) {
+		bool has_const_value;
+		Dwarf_Attribute attr;
+		struct location loc;
+
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
+
+		if (param_idx >= cu->nr_register_params || param_idx < 0)
+			return parm;
+		/* Parameters which use DW_AT_abstract_origin to point at
+		 * the original parameter definition (with no name in the DIE)
+		 * are the result of later DWARF generation during compilation
+		 * so often better take into account if arguments were
+		 * optimized out.
+		 *
+		 * By checking that locations for parameters that are expected
+		 * to be passed as registers are actually passed as registers,
+		 * we can spot optimized-out parameters.
+		 *
+		 * It can also be the case that a parameter DIE has
+		 * a constant value attribute reflecting optimization or
+		 * has no location attribute.
+		 *
+		 * From the DWARF spec:
+		 *
+		 * "4.1.10
+		 *
+		 * A DW_AT_const_value attribute for an entry describing a
+		 * variable or formal parameter whose value is constant and not
+		 * represented by an object in the address space of the program,
+		 * or an entry describing a named constant. (Note
+		 * that such an entry does not have a location attribute.)"
+		 *
+		 * So we can also use the absence of a location for a parameter
+		 * as evidence it has been optimized out.  This info will
+		 * need to be shared between a parameter and any abstract
+		 * origin references however, since gcc can have location
+		 * information in the parameter that refers back to the original
+		 * via abstract origin, so we need to share location presence
+		 * between these parameter representations.  See
+		 * ftype__recode_dwarf_types() below for how this is handled.
+		 */
+		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
+		has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
+		if (parm->has_loc &&
+		    attr_location(die, &loc.expr, &loc.exprlen) == 0 &&
+			loc.exprlen != 0) {
+			Dwarf_Op *expr = loc.expr;
+
+			switch (expr->atom) {
+			case DW_OP_reg0 ... DW_OP_reg31:
+			case DW_OP_breg0 ... DW_OP_breg31:
+				break;
+			default:
+				parm->optimized = 1;
+				break;
+			}
+		} else if (has_const_value) {
+			parm->optimized = 1;
+		}
 	}
 
 	return parm;
@@ -1450,7 +1539,7 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
 					     struct cu *cu, struct conf_load *conf,
 					     int param_idx)
 {
-	struct parameter *parm = parameter__new(die, cu, conf);
+	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
 
 	if (parm == NULL)
 		return NULL;
@@ -2194,6 +2283,7 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 
 	ftype__for_each_parameter(type, pos) {
 		struct dwarf_tag *dpos = pos->tag.priv;
+		struct parameter *opos;
 		struct dwarf_tag *dtype;
 
 		if (dpos->type.off == 0) {
@@ -2207,8 +2297,18 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 				tag__print_abstract_origin_not_found(&pos->tag);
 				continue;
 			}
-			pos->name = tag__parameter(dtype->tag)->name;
+			opos = tag__parameter(dtype->tag);
+			pos->name = opos->name;
 			pos->tag.type = dtype->tag->type;
+			/* share location information between parameter and
+			 * abstract origin; if neither have location, we will
+			 * mark the parameter as optimized out.
+			 */
+			if (pos->has_loc)
+				opos->has_loc = pos->has_loc;
+
+			if (pos->optimized)
+				opos->optimized = pos->optimized;
 			continue;
 		}
 
@@ -2478,18 +2578,33 @@ out:
 	return 0;
 }
 
-static int cu__resolve_func_ret_types(struct cu *cu)
+static int cu__resolve_func_ret_types_optimized(struct cu *cu)
 {
 	struct ptr_table *pt = &cu->functions_table;
 	uint32_t i;
 
 	for (i = 0; i < pt->nr_entries; ++i) {
 		struct tag *tag = pt->entries[i];
+		struct parameter *pos;
+		struct function *fn = tag__function(tag);
+
+		/* mark function as optimized if parameter is, or
+		 * if parameter does not have a location; at this
+		 * point location presence has been marked in
+		 * abstract origins for cases where a parameter
+		 * location is not stored in the original function
+		 * parameter tag.
+		 */
+		ftype__for_each_parameter(&fn->proto, pos) {
+			if (pos->optimized || !pos->has_loc) {
+				fn->proto.optimized_parms = 1;
+				break;
+			}
+		}
 
 		if (tag == NULL || tag->type != 0)
 			continue;
 
-		struct function *fn = tag__function(tag);
 		if (!fn->abstract_origin)
 			continue;
 
@@ -2612,7 +2727,7 @@ static int die__process_and_recode(Dwarf_Die *die, struct cu *cu, struct conf_lo
 	if (ret != 0)
 		return ret;
 
-	return cu__resolve_func_ret_types(cu);
+	return cu__resolve_func_ret_types_optimized(cu);
 }
 
 static int class_member__cache_byte_size(struct tag *tag, struct cu *cu,
@@ -2753,6 +2868,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
 		return DWARF_CB_ABORT;
 
 	cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
+	cu->nr_register_params = arch__nr_register_params(&ehdr);
 	return 0;
 }
 
@@ -3132,7 +3248,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
 	 * encoded in another subprogram through abstract_origin
 	 * tag. Let us visit all subprograms again to resolve this.
 	 */
-	if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT)
+	if (cu__resolve_func_ret_types_optimized(cu) != LSK__KEEPIT)
 		goto out_abort;
 
 	if (cus__finalize(cus, cu, conf, NULL) == LSK__STOP_LOADING)
diff --git a/dwarves.h b/dwarves.h
index 589588e..1cd95f7 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -262,6 +262,7 @@ struct cu {
 	uint8_t		 has_addr_info:1;
 	uint8_t		 uses_global_strings:1;
 	uint8_t		 little_endian:1;
+	uint8_t		 nr_register_params;
 	uint16_t	 language;
 	unsigned long	 nr_inline_expansions;
 	size_t		 size_inline_expansions;
@@ -808,6 +809,8 @@ size_t lexblock__fprintf(const struct lexblock *lexblock, const struct cu *cu,
 struct parameter {
 	struct tag tag;
 	const char *name;
+	uint8_t optimized:1;
+	uint8_t has_loc:1;
 };
 
 static inline struct parameter *tag__parameter(const struct tag *tag)
@@ -827,7 +830,8 @@ struct ftype {
 	struct tag	 tag;
 	struct list_head parms;
 	uint16_t	 nr_parms;
-	uint8_t		 unspec_parms; /* just one bit is needed */
+	uint8_t		 unspec_parms:1; /* just one bit is needed */
+	uint8_t		 optimized_parms:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 2/8] btf_encoder: store type_id_off, unspecified type in encoder
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 1/8] dwarf_loader: Help spotting functions with optimized-out parameters Alan Maguire
@ 2023-02-07 17:14 ` Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 3/8] btf_encoder: Refactor function addition into dedicated btf_encoder__add_func Alan Maguire
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:14 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

Store the type id offset and unspecified type in the
encoder.

This will be useful for postponing local function addition
since to support function addition later on, CU references
will not work.  Provision will have to be made to save the
current type_id_off to support later addition of a function
by setting the type_id_off for the encoder to the saved
value.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 btf_encoder.c | 59 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 32 insertions(+), 27 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index a5fa04a..9063342 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -54,6 +54,8 @@ struct btf_encoder {
 	struct gobuffer   percpu_secinfo;
 	const char	  *filename;
 	struct elf_symtab *symtab;
+	uint32_t	  type_id_off;
+	uint32_t	  unspecified_type;
 	bool		  has_index_type,
 			  need_index_type,
 			  skip_encoding_vars,
@@ -593,20 +595,20 @@ static int32_t btf_encoder__add_func_param(struct btf_encoder *encoder, const ch
 	}
 }
 
-static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t type_id_off, uint32_t tag_type)
+static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_type)
 {
 	if (tag_type == 0)
 		return 0;
 
-	if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) {
+	if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
 		// No provision for encoding this, turn it into void.
 		return 0;
 	}
 
-	return type_id_off + tag_type;
+	return encoder->type_id_off + tag_type;
 }
 
-static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype, uint32_t type_id_off)
+static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct ftype *ftype)
 {
 	struct btf *btf = encoder->btf;
 	const struct btf_type *t;
@@ -616,7 +618,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
 
 	/* add btf_type for func_proto */
 	nr_params = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
-	type_id = btf_encoder__tag_type(encoder, type_id_off, ftype->tag.type);
+	type_id = btf_encoder__tag_type(encoder, ftype->tag.type);
 
 	id = btf__add_func_proto(btf, type_id);
 	if (id > 0) {
@@ -634,7 +636,7 @@ static int32_t btf_encoder__add_func_proto(struct btf_encoder *encoder, struct f
 	ftype__for_each_parameter(ftype, param) {
 		const char *name = parameter__name(param);
 
-		type_id = param->tag.type == 0 ? 0 : type_id_off + param->tag.type;
+		type_id = param->tag.type == 0 ? 0 : encoder->type_id_off + param->tag.type;
 		++param_idx;
 		if (btf_encoder__add_func_param(encoder, name, type_id, param_idx == nr_params))
 			return -1;
@@ -859,22 +861,21 @@ static void dump_invalid_symbol(const char *msg, const char *sym,
 	fprintf(stderr, "PAHOLE: Error: Use '--btf_encode_force' to ignore such symbols and force emit the btf.\n");
 }
 
-static int tag__check_id_drift(const struct tag *tag,
-			       uint32_t core_id, uint32_t btf_type_id,
-			       uint32_t type_id_off)
+static int tag__check_id_drift(struct btf_encoder *encoder, const struct tag *tag,
+			       uint32_t core_id, uint32_t btf_type_id)
 {
-	if (btf_type_id != (core_id + type_id_off)) {
+	if (btf_type_id != (core_id + encoder->type_id_off)) {
 		fprintf(stderr,
 			"%s: %s id drift, core_id: %u, btf_type_id: %u, type_id_off: %u\n",
 			__func__, dwarf_tag_name(tag->tag),
-			core_id, btf_type_id, type_id_off);
+			core_id, btf_type_id, encoder->type_id_off);
 		return -1;
 	}
 
 	return 0;
 }
 
-static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off)
+static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct tag *tag)
 {
 	struct type *type = tag__type(tag);
 	struct class_member *pos;
@@ -896,7 +897,8 @@ static int32_t btf_encoder__add_struct_type(struct btf_encoder *encoder, struct
 		 * is required.
 		 */
 		name = class_member__name(pos);
-		if (btf_encoder__add_field(encoder, name, type_id_off + pos->tag.type, pos->bitfield_size, pos->bit_offset))
+		if (btf_encoder__add_field(encoder, name, encoder->type_id_off + pos->tag.type,
+					   pos->bitfield_size, pos->bit_offset))
 			return -1;
 	}
 
@@ -936,11 +938,11 @@ static int32_t btf_encoder__add_enum_type(struct btf_encoder *encoder, struct ta
 	return type_id;
 }
 
-static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag, uint32_t type_id_off,
+static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
 				   struct conf_load *conf_load)
 {
 	/* single out type 0 as it represents special type "void" */
-	uint32_t ref_type_id = tag->type == 0 ? 0 : type_id_off + tag->type;
+	uint32_t ref_type_id = tag->type == 0 ? 0 : encoder->type_id_off + tag->type;
 	struct base_type *bt;
 	const char *name;
 
@@ -970,7 +972,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
 		if (tag__type(tag)->declaration)
 			return btf_encoder__add_ref_type(encoder, BTF_KIND_FWD, 0, name, tag->tag == DW_TAG_union_type);
 		else
-			return btf_encoder__add_struct_type(encoder, tag, type_id_off);
+			return btf_encoder__add_struct_type(encoder, tag);
 	case DW_TAG_array_type:
 		/* TODO: Encode one dimension at a time. */
 		encoder->need_index_type = true;
@@ -978,7 +980,7 @@ static int btf_encoder__encode_tag(struct btf_encoder *encoder, struct tag *tag,
 	case DW_TAG_enumeration_type:
 		return btf_encoder__add_enum_type(encoder, tag, conf_load);
 	case DW_TAG_subroutine_type:
-		return btf_encoder__add_func_proto(encoder, tag__ftype(tag), type_id_off);
+		return btf_encoder__add_func_proto(encoder, tag__ftype(tag));
         case DW_TAG_unspecified_type:
 		/* Just don't encode this for now, converting anything with this type to void (0) instead.
 		 *
@@ -1281,7 +1283,7 @@ static bool ftype__has_arg_names(const struct ftype *ftype)
 	return true;
 }
 
-static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_t type_id_off)
+static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
 {
 	struct cu *cu = encoder->cu;
 	uint32_t core_id;
@@ -1366,7 +1368,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder, uint32_
 			continue;
 		}
 
-		type = var->ip.tag.type + type_id_off;
+		type = var->ip.tag.type + encoder->type_id_off;
 		linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC;
 
 		if (encoder->verbose) {
@@ -1507,7 +1509,6 @@ void btf_encoder__delete(struct btf_encoder *encoder)
 
 int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load)
 {
-	uint32_t type_id_off = btf__type_cnt(encoder->btf) - 1;
 	struct llvm_annotation *annot;
 	int btf_type_id, tag_type_id, skipped_types = 0;
 	uint32_t core_id;
@@ -1516,21 +1517,24 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	int err = 0;
 
 	encoder->cu = cu;
+	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
+	if (encoder->cu->unspecified_type.tag)
+		encoder->unspecified_type = encoder->cu->unspecified_type.type;
 
 	if (!encoder->has_index_type) {
 		/* cu__find_base_type_by_name() takes "type_id_t *id" */
 		type_id_t id;
 		if (cu__find_base_type_by_name(cu, "int", &id)) {
 			encoder->has_index_type = true;
-			encoder->array_index_id = type_id_off + id;
+			encoder->array_index_id = encoder->type_id_off + id;
 		} else {
 			encoder->has_index_type = false;
-			encoder->array_index_id = type_id_off + cu->types_table.nr_entries;
+			encoder->array_index_id = encoder->type_id_off + cu->types_table.nr_entries;
 		}
 	}
 
 	cu__for_each_type(cu, core_id, pos) {
-		btf_type_id = btf_encoder__encode_tag(encoder, pos, type_id_off, conf_load);
+		btf_type_id = btf_encoder__encode_tag(encoder, pos, conf_load);
 
 		if (btf_type_id == 0) {
 			++skipped_types;
@@ -1538,7 +1542,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 		}
 
 		if (btf_type_id < 0 ||
-		    tag__check_id_drift(pos, core_id, btf_type_id + skipped_types, type_id_off)) {
+		    tag__check_id_drift(encoder, pos, core_id, btf_type_id + skipped_types)) {
 			err = -1;
 			goto out;
 		}
@@ -1572,7 +1576,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 			continue;
 		}
 
-		btf_type_id = type_id_off + core_id;
+		btf_type_id = encoder->type_id_off + core_id;
 		ns = tag__namespace(pos);
 		list_for_each_entry(annot, &ns->annots, node) {
 			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_type_id, annot->component_idx);
@@ -1616,7 +1620,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				continue;
 		}
 
-		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto, type_id_off);
+		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
 		name = function__name(fn);
 		btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
 		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
@@ -1633,10 +1637,11 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				goto out;
 			}
 		}
+
 	}
 
 	if (!encoder->skip_encoding_vars)
-		err = btf_encoder__encode_cu_variables(encoder, type_id_off);
+		err = btf_encoder__encode_cu_variables(encoder);
 out:
 	encoder->cu = NULL;
 	return err;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 3/8] btf_encoder: Refactor function addition into dedicated btf_encoder__add_func
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 1/8] dwarf_loader: Help spotting functions with optimized-out parameters Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 2/8] btf_encoder: store type_id_off, unspecified type in encoder Alan Maguire
@ 2023-02-07 17:14 ` Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 4/8] btf_encoder: Rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:14 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

This will be useful for postponing local function addition later on.
As part of this, store the type id offset and unspecified type in
the encoder, as this will simplify late addition of local functions.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 btf_encoder.c | 46 +++++++++++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 9063342..71f67ae 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -764,6 +764,31 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
 	return id;
 }
 
+static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
+{
+	int btf_fnproto_id, btf_fn_id, tag_type_id;
+	struct llvm_annotation *annot;
+	const char *name;
+
+	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
+	name = function__name(fn);
+	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
+	if (btf_fnproto_id < 0 || btf_fn_id < 0) {
+		printf("error: failed to encode function '%s'\n", function__name(fn));
+		return -1;
+	}
+	list_for_each_entry(annot, &fn->annots, node) {
+		tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id,
+							annot->component_idx);
+		if (tag_type_id < 0) {
+			fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
+				annot->value, name, annot->component_idx);
+			return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * This corresponds to the same macro defined in
  * include/linux/kallsyms.h
@@ -1589,8 +1614,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	}
 
 	cu__for_each_function(cu, core_id, fn) {
-		int btf_fnproto_id, btf_fn_id;
-		const char *name;
 
 		/*
 		 * Skip functions that:
@@ -1620,24 +1643,9 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				continue;
 		}
 
-		btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
-		name = function__name(fn);
-		btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
-		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
-			err = -1;
-			printf("error: failed to encode function '%s'\n", function__name(fn));
+		err = btf_encoder__add_func(encoder, fn);
+		if (err)
 			goto out;
-		}
-
-		list_for_each_entry(annot, &fn->annots, node) {
-			tag_type_id = btf_encoder__add_decl_tag(encoder, annot->value, btf_fn_id, annot->component_idx);
-			if (tag_type_id < 0) {
-				fprintf(stderr, "error: failed to encode tag '%s' to func %s with component_idx %d\n",
-					annot->value, name, annot->component_idx);
-				goto out;
-			}
-		}
-
 	}
 
 	if (!encoder->skip_encoding_vars)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 4/8] btf_encoder: Rework btf_encoders__*() API to allow traversal of encoders
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (2 preceding siblings ...)
  2023-02-07 17:14 ` [PATCH v3 dwarves 3/8] btf_encoder: Refactor function addition into dedicated btf_encoder__add_func Alan Maguire
@ 2023-02-07 17:14 ` Alan Maguire
  2023-02-07 17:14 ` [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:14 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

To coordinate across multiple encoders at collection time, there will be
a need to access the set of encoders.  Rework the unused
btf_encoders__*() API to facilitate this.

Committer notes:

Removed btf_encoders__for_each_encoder() duplicate define, pointed out
by Jiri Olsa.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 btf_encoder.c | 36 ++++++++++++++++++++++++++++--------
 btf_encoder.h |  6 ------
 2 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 71f67ae..74ab61b 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -30,6 +30,7 @@
 
 #include <errno.h>
 #include <stdint.h>
+#include <pthread.h>
 
 struct elf_function {
 	const char	*name;
@@ -79,19 +80,36 @@ struct btf_encoder {
 	} functions;
 };
 
-void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder)
-{
-	list_add_tail(&encoder->node, encoders);
-}
+static LIST_HEAD(encoders);
+static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
+
+/* mutex only needed for add/delete, as this can happen in multiple encoding
+ * threads.  Traversal of the list is currently confined to thread collection.
+ */
 
-struct btf_encoder *btf_encoders__first(struct list_head *encoders)
+#define btf_encoders__for_each_encoder(encoder)		\
+	list_for_each_entry(encoder, &encoders, node)
+
+static void btf_encoders__add(struct btf_encoder *encoder)
 {
-	return list_first_entry(encoders, struct btf_encoder, node);
+	pthread_mutex_lock(&encoders__lock);
+	list_add_tail(&encoder->node, &encoders);
+	pthread_mutex_unlock(&encoders__lock);
 }
 
-struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder)
+static void btf_encoders__delete(struct btf_encoder *encoder)
 {
-	return list_next_entry(encoder, node);
+	struct btf_encoder *existing = NULL;
+
+	pthread_mutex_lock(&encoders__lock);
+	/* encoder may not have been added to list yet; check. */
+	btf_encoders__for_each_encoder(existing) {
+		if (encoder == existing)
+			break;
+	}
+	if (encoder == existing)
+		list_del(&encoder->node);
+	pthread_mutex_unlock(&encoders__lock);
 }
 
 #define PERCPU_SECTION ".data..percpu"
@@ -1505,6 +1523,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
 
 		if (encoder->verbose)
 			printf("File %s:\n", cu->filename);
+		btf_encoders__add(encoder);
 	}
 out:
 	return encoder;
@@ -1519,6 +1538,7 @@ void btf_encoder__delete(struct btf_encoder *encoder)
 	if (encoder == NULL)
 		return;
 
+	btf_encoders__delete(encoder);
 	__gobuffer__delete(&encoder->percpu_secinfo);
 	zfree(&encoder->filename);
 	btf__free(encoder->btf);
diff --git a/btf_encoder.h b/btf_encoder.h
index a65120c..34516bb 100644
--- a/btf_encoder.h
+++ b/btf_encoder.h
@@ -23,12 +23,6 @@ int btf_encoder__encode(struct btf_encoder *encoder);
 
 int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
 
-void btf_encoders__add(struct list_head *encoders, struct btf_encoder *encoder);
-
-struct btf_encoder *btf_encoders__first(struct list_head *encoders);
-
-struct btf_encoder *btf_encoders__next(struct btf_encoder *encoder);
-
 struct btf *btf_encoder__btf(struct btf_encoder *encoder);
 
 int btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (3 preceding siblings ...)
  2023-02-07 17:14 ` [PATCH v3 dwarves 4/8] btf_encoder: Rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
@ 2023-02-07 17:14 ` Alan Maguire
  2023-02-08 13:19   ` Jiri Olsa
  2023-02-07 17:15 ` [PATCH v3 dwarves 6/8] btf_encoder: support delaying function addition to check for function prototype inconsistencies Alan Maguire
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:14 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

At gcc optimization level O2 or higher, many function optimizations
occur such as

- constant propagation (.constprop.0);
- interprocedural scalar replacement of aggregates, removal of
  unused parameters and replacement of parameters passed by
  reference by parameters passed by value (.isra.0)

See [1] for details.

Currently BTF encoding does not handle such optimized functions that get
renamed with a "." suffix such as ".isra.0", ".constprop.0".  This is
safer because such suffixes can often indicate parameters have been
optimized out.  Since we can now spot this, support matching to a "."
suffix and represent the function in BTF if it does not have
optimized-out parameters.  First an attempt to match by exact name is
made; if that fails we fall back to checking for a "."-suffixed name.
The BTF representation will use the original function name "foo" not
"foo.isra.0" for consistency with DWARF representation.

There is a complication however, and this arises because we process each
CU separately and merge BTF when complete.  Different CUs may optimize
differently, so in one CU, a function may have optimized-out parameters
- and thus be ineligible for BTF - while in another it does not have
  optimized-out parameters - making it eligible for BTF.  The NF_HOOK
function is an example of this.

To avoid disrupting BTF generation parallelism, the approach taken is to
save pointers to the function representations in the ELF function table;
it is per-encoder, but the same representation is used across encoders,
so we can use the same array index to find the same function when
merging for efficiency.  Using the ELF function is ideal also as we
already have to carry out a name lookup for matching from DWARF to
ELF.

At thread collection time, observations about optimizations are
merged across encoders and we know whether it is safe to add a
"."-suffixed function or not.

The result of this is we add 1180 "."-suffixed functions to the BTF
representation.

Encoding of "."-suffixed functions is done only if the
"--btf_gen_optimized" function is specified.

Because we need to ensure consistency across encoders, there is
a performance cost to the save/merge/apply approach. Comparing
baseline to test times:

$ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux

real	0m7.788s
user	0m17.629s
sys	0m0.652s

$ time LLVM_OBJCOPY=objcopy pahole -J -j4 --btf_gen_optimized vmlinux

real	0m10.839s
user	0m19.473s
sys	0m5.932s

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 btf_encoder.c | 148 +++++++++++++++++++++++++++++++++++++++++++++++---
 dwarves.h     |   3 +
 pahole.c      |  22 +++++---
 3 files changed, 159 insertions(+), 14 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 74ab61b..cb50401 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -30,11 +30,20 @@
 
 #include <errno.h>
 #include <stdint.h>
+#include <search.h> /* for tsearch(), tfind() and tdestroy() */
 #include <pthread.h>
 
+/* state used to do later encoding of saved functions */
+struct btf_encoder_state {
+	uint32_t type_id_off;
+};
+
 struct elf_function {
 	const char	*name;
 	bool		 generated;
+	size_t		prefixlen;
+	struct function	*function;
+	struct btf_encoder_state state;
 };
 
 #define MAX_PERCPU_VAR_CNT 4096
@@ -57,6 +66,7 @@ struct btf_encoder {
 	struct elf_symtab *symtab;
 	uint32_t	  type_id_off;
 	uint32_t	  unspecified_type;
+	int		  saved_func_cnt;
 	bool		  has_index_type,
 			  need_index_type,
 			  skip_encoding_vars,
@@ -77,12 +87,15 @@ struct btf_encoder {
 		struct elf_function *entries;
 		int		    allocated;
 		int		    cnt;
+		int		    suffix_cnt; /* number of .isra, .part etc */
 	} functions;
 };
 
 static LIST_HEAD(encoders);
 static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
 
+static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder);
+
 /* mutex only needed for add/delete, as this can happen in multiple encoding
  * threads.  Traversal of the list is currently confined to thread collection.
  */
@@ -707,6 +720,11 @@ int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder
 	int32_t i, id;
 	struct btf_var_secinfo *vsi;
 
+	if (encoder == other)
+		return 0;
+
+	btf_encoder__add_saved_funcs(other);
+
 	for (i = 0; i < nr_var_secinfo; i++) {
 		vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i;
 		type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */
@@ -782,6 +800,27 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
 	return id;
 }
 
+static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct function *fn, struct elf_function *func)
+{
+	if (func->function) {
+		struct function *existing = func->function;
+
+		/* If saving and we find an existing entry, we want to merge
+		 * observations across both functions, checking that the
+		 * "seen optimized parameters" status is reflected in the func
+		 * entry. If the entry is new, record encoder state required
+		 * to add the local function later (encoder + type_id_off)
+		 * such that we can add the function later.
+		 */
+		existing->proto.optimized_parms |= fn->proto.optimized_parms;
+	} else {
+		func->state.type_id_off = encoder->type_id_off;
+		func->function = fn;
+		encoder->saved_func_cnt++;
+	}
+	return 0;
+}
+
 static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
 {
 	int btf_fnproto_id, btf_fn_id, tag_type_id;
@@ -807,6 +846,50 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
 	return 0;
 }
 
+static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder)
+{
+	int i;
+
+	for (i = 0; i < encoder->functions.cnt; i++) {
+		struct function *fn = encoder->functions.entries[i].function;
+		struct btf_encoder *other_encoder;
+
+		if (!fn || fn->proto.processed)
+			continue;
+
+		/* merge optimized-out status across encoders; since each
+		 * encoder has the same elf symbol table we can use the
+		 * same index to access the same elf symbol.
+		 */
+		btf_encoders__for_each_encoder(other_encoder) {
+			struct function *other_fn;
+
+			if (other_encoder == encoder)
+				continue;
+
+			other_fn = other_encoder->functions.entries[i].function;
+			if (!other_fn)
+				continue;
+			fn->proto.optimized_parms |= other_fn->proto.optimized_parms;
+			other_fn->proto.processed = 1;
+		}
+		if (fn->proto.optimized_parms) {
+			if (encoder->verbose) {
+				const char *name = function__name(fn);
+
+				printf("skipping addition of '%s'(%s) due to optimized-out parameters\n",
+				       name, fn->alias ?: name);
+			}
+		} else {
+			struct elf_function *func = &encoder->functions.entries[i];
+
+			encoder->type_id_off = func->state.type_id_off;
+			btf_encoder__add_func(encoder, fn);
+		}
+		fn->proto.processed = 1;
+	}
+}
+
 /*
  * This corresponds to the same macro defined in
  * include/linux/kallsyms.h
@@ -818,6 +901,11 @@ static int functions_cmp(const void *_a, const void *_b)
 	const struct elf_function *a = _a;
 	const struct elf_function *b = _b;
 
+	/* if search key allows prefix match, verify target has matching
+	 * prefix len and prefix matches.
+	 */
+	if (a->prefixlen && a->prefixlen == b->prefixlen)
+		return strncmp(a->name, b->name, b->prefixlen);
 	return strcmp(a->name, b->name);
 }
 
@@ -850,14 +938,22 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
 	}
 
 	encoder->functions.entries[encoder->functions.cnt].name = name;
+	if (strchr(name, '.')) {
+		const char *suffix = strchr(name, '.');
+
+		encoder->functions.suffix_cnt++;
+		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
+	}
 	encoder->functions.entries[encoder->functions.cnt].generated = false;
+	encoder->functions.entries[encoder->functions.cnt].function = NULL;
 	encoder->functions.cnt++;
 	return 0;
 }
 
-static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name)
+static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
+						       const char *name, size_t prefixlen)
 {
-	struct elf_function key = { .name = name };
+	struct elf_function key = { .name = name, .prefixlen = prefixlen };
 
 	return bsearch(&key, encoder->functions.entries, encoder->functions.cnt, sizeof(key), functions_cmp);
 }
@@ -1187,6 +1283,9 @@ int btf_encoder__encode(struct btf_encoder *encoder)
 {
 	int err;
 
+	/* for single-threaded case, saved funcs are added here */
+	btf_encoder__add_saved_funcs(encoder);
+
 	if (gobuffer__size(&encoder->percpu_secinfo) != 0)
 		btf_encoder__add_datasec(encoder, PERCPU_SECTION);
 
@@ -1634,6 +1733,8 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 	}
 
 	cu__for_each_function(cu, core_id, fn) {
+		struct elf_function *func = NULL;
+		bool save = false;
 
 		/*
 		 * Skip functions that:
@@ -1647,29 +1748,62 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 		if (!ftype__has_arg_names(&fn->proto))
 			continue;
 		if (encoder->functions.cnt) {
-			struct elf_function *func;
 			const char *name;
 
 			name = function__name(fn);
 			if (!name)
 				continue;
 
-			func = btf_encoder__find_function(encoder, name);
-			if (!func || func->generated)
+			/* prefer exact function name match... */
+			func = btf_encoder__find_function(encoder, name, 0);
+			if (func) {
+				if (func->generated)
+					continue;
+				func->generated = true;
+			} else if (encoder->functions.suffix_cnt &&
+				   conf_load->btf_gen_optimized) {
+				/* falling back to name.isra.0 match if no exact
+				 * match is found; only bother if we found any
+				 * .suffix function names.  The function
+				 * will be saved and added once we ensure
+				 * it does not have optimized-out parameters
+				 * in any cu.
+				 */
+				func = btf_encoder__find_function(encoder, name,
+								  strlen(name));
+				if (func) {
+					save = true;
+					if (encoder->verbose)
+						printf("matched function '%s' with '%s'%s\n",
+						       name, func->name,
+						       fn->proto.optimized_parms ?
+						       ", has optimized-out parameters" : "");
+					fn->alias = func->name;
+				}
+			}
+			if (!func)
 				continue;
-			func->generated = true;
 		} else {
 			if (!fn->external)
 				continue;
 		}
 
-		err = btf_encoder__add_func(encoder, fn);
+		if (save)
+			err = btf_encoder__save_func(encoder, fn, func);
+		else
+			err = btf_encoder__add_func(encoder, fn);
 		if (err)
 			goto out;
 	}
 
 	if (!encoder->skip_encoding_vars)
 		err = btf_encoder__encode_cu_variables(encoder);
+
+	/* It is only safe to delete this CU if we have not stashed any static
+	 * functions for later addition.
+	 */
+	if (!err)
+		err = encoder->saved_func_cnt > 0 ? LSK__KEEPIT : LSK__DELETE;
 out:
 	encoder->cu = NULL;
 	return err;
diff --git a/dwarves.h b/dwarves.h
index 1cd95f7..bb2c3bb 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -66,6 +66,7 @@ struct conf_load {
 	bool			skip_missing;
 	bool			skip_encoding_btf_type_tag;
 	bool			skip_encoding_btf_enum64;
+	bool			btf_gen_optimized;
 	uint8_t			hashtable_bits;
 	uint8_t			max_hashtable_bits;
 	uint16_t		kabi_prefix_len;
@@ -832,6 +833,7 @@ struct ftype {
 	uint16_t	 nr_parms;
 	uint8_t		 unspec_parms:1; /* just one bit is needed */
 	uint8_t		 optimized_parms:1;
+	uint8_t		 processed:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
@@ -884,6 +886,7 @@ struct function {
 	struct rb_node	 rb_node;
 	const char	 *name;
 	const char	 *linkage_name;
+	const char	 *alias;	/* name.isra.0 */
 	uint32_t	 cu_total_size_inline_expansions;
 	uint16_t	 cu_total_nr_inline_expansions;
 	uint8_t		 inlined:2;
diff --git a/pahole.c b/pahole.c
index 6f4f87c..f48b66d 100644
--- a/pahole.c
+++ b/pahole.c
@@ -1221,6 +1221,7 @@ ARGP_PROGRAM_VERSION_HOOK_DEF = dwarves_print_version;
 #define ARGP_languages_exclude	   336
 #define ARGP_skip_encoding_btf_enum64 337
 #define ARGP_skip_emitting_atomic_typedefs 338
+#define ARGP_btf_gen_optimized  339
 
 static const struct argp_option pahole__options[] = {
 	{
@@ -1633,6 +1634,11 @@ static const struct argp_option pahole__options[] = {
 		.key  = ARGP_skip_emitting_atomic_typedefs,
 		.doc  = "Do not emit 'typedef _Atomic int atomic_int' & friends."
 	},
+	{
+		.name = "btf_gen_optimized",
+		.key  = ARGP_btf_gen_optimized,
+		.doc  = "Generate BTF for functions with optimization-related suffixes (.isra, .constprop).  BTF will only be generated if a function does not optimize out parameters."
+	},
 	{
 		.name = NULL,
 	}
@@ -1802,6 +1808,8 @@ static error_t pahole__options_parser(int key, char *arg,
 		conf_load.skip_encoding_btf_enum64 = true;	break;
 	case ARGP_skip_emitting_atomic_typedefs:
 		conf.skip_emitting_atomic_typedefs = true;	break;
+	case ARGP_btf_gen_optimized:
+		conf_load.btf_gen_optimized = true;		break;
 	default:
 		return ARGP_ERR_UNKNOWN;
 	}
@@ -2980,20 +2988,20 @@ static int pahole_threads_collect(struct conf_load *conf, int nr_threads, void *
 		 * Merge content of the btf instances of worker threads to the btf
 		 * instance of the primary btf_encoder.
                 */
-		if (!threads[i]->btf || threads[i]->encoder == btf_encoder)
-			continue; /* The primary btf_encoder */
+		if (!threads[i]->btf)
+			continue;
 		err = btf_encoder__add_encoder(btf_encoder, threads[i]->encoder);
 		if (err < 0)
 			goto out;
-		btf_encoder__delete(threads[i]->encoder);
-		threads[i]->encoder = NULL;
 	}
 	err = 0;
 
 out:
 	for (i = 0; i < nr_threads; i++) {
-		if (threads[i]->encoder && threads[i]->encoder != btf_encoder)
+		if (threads[i]->encoder && threads[i]->encoder != btf_encoder) {
 			btf_encoder__delete(threads[i]->encoder);
+			threads[i]->encoder = NULL;
+		}
 	}
 	free(threads[0]);
 
@@ -3077,11 +3085,11 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
 			encoder = btf_encoder;
 		}
 
-		if (btf_encoder__encode_cu(encoder, cu, conf_load)) {
+		ret = btf_encoder__encode_cu(encoder, cu, conf_load);
+		if (ret < 0) {
 			fprintf(stderr, "Encountered error while encoding BTF.\n");
 			exit(1);
 		}
-		ret = LSK__DELETE;
 out_btf:
 		return ret;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 6/8] btf_encoder: support delaying function addition to check for function prototype inconsistencies
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (4 preceding siblings ...)
  2023-02-07 17:14 ` [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
@ 2023-02-07 17:15 ` Alan Maguire
  2023-02-07 17:15 ` [PATCH v3 dwarves 7/8] dwarves: document --btf_gen_optimized option Alan Maguire
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:15 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

There are multiple sources of inconsistency that can result in functions
of the same name having multiple prototypes:

- multiple static functions in different CUs share the same name
- static and external functions share the same name

In addition a function may have optimized-out parameters in some
CUs but not others.

Here we attempt to catch such cases by finding inconsistencies across
CUs using the save/compare/merge mechanisms that were previously
introduced to handle optimized-out parameters.

For two instances of a function to be considered consistent:

- number of parameters must match
- parameter names must match

The latter is a less strong method than a full type comparison but
suffices to match functions.

To enable inconsistency checking, the --skip_encoding_btf_inconsistent_proto
option is introduced.

With it, and the --btf_gen_optimized options in place:

- 285 functions are omitted due to inconsistent function prototypes
- 2495 functions are omitted due to optimized-out parameters, of these
  803 are functions with optimization-related suffixes ".isra", etc,
  leaving 1692 other functions without such suffixes.

Below performance effects and variability in encoded BTF are detailed.
It can be seen that due to the approach used - where functions are
marked "generated" on a per-encoder basis, we see quite variable
numbers of multiply-defined functions in the baseline case, some
with inconsistent prototypes.  With --skip_encoding_btf_inconsistent_proto
specified, this variability disappears, at the cost of a longer time
to carry out encoding due to the need to compare representations across
encoders at thread collection time.

Baseline:

Single-threaded:

$ time LLVM_OBJCOPY=objcopy pahole -J vmlinux

real    0m17.534s
user    0m17.019s
sys     0m0.514s
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |wc -l
51529
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |uniq|wc -l
51529

2 threads:

$ time LLVM_OBJCOPY=objcopy pahole -J -j2 vmlinux

real    0m10.942s
user    0m17.309s
sys     0m0.592s
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |wc -l
51798
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |uniq|wc -l
51529

4 threads:

$ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux

real    0m7.890s
user    0m18.067s
sys     0m0.661s
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |wc -l
52028
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |uniq|wc -l
51529

Test:

Single-threaded:

$ time LLVM_OBJCOPY=objcopy pahole -J --skip_encoding_btf_inconsistent_proto --btf_gen_optimized vmlinux

real    0m19.216s
user    0m17.590s
sys     0m1.624s
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |wc -l
50246
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |uniq|wc -l
50246

2 threads:

$ time LLVM_OBJCOPY=objcopy pahole -J -j2 --skip_encoding_btf_inconsistent_proto --btf_gen_optimized vmlinux

real    0m13.147s
user    0m18.179s
sys     0m3.486s
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |wc -l
50246
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |uniq|wc -l
50246

4 threads:

$ time LLVM_OBJCOPY=objcopy pahole -J -j4 --skip_encoding_btf_inconsistent_proto --btf_gen_optimized vmlinux

real    0m11.090s
user    0m19.613s
sys     0m5.895s
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |wc -l
50246
$ bpftool btf dump file vmlinux|grep " FUNC "| awk '{print $3}'|sort |uniq|wc -l
50246

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 btf_encoder.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++-----
 dwarves.h     |  2 ++
 pahole.c      |  8 +++++
 3 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index cb50401..35fb60a 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -33,9 +33,13 @@
 #include <search.h> /* for tsearch(), tfind() and tdestroy() */
 #include <pthread.h>
 
+#define BTF_ENCODER_MAX_PARAMETERS	12
+
 /* state used to do later encoding of saved functions */
 struct btf_encoder_state {
 	uint32_t type_id_off;
+	bool got_parameter_names;
+	const char *parameter_names[BTF_ENCODER_MAX_PARAMETERS];
 };
 
 struct elf_function {
@@ -800,6 +804,66 @@ static int32_t btf_encoder__add_decl_tag(struct btf_encoder *encoder, const char
 	return id;
 }
 
+static void parameter_names__get(struct ftype *ftype, size_t nr_parameters,
+				 const char **parameter_names)
+{
+	struct parameter *parameter;
+	int i = 0;
+
+	ftype__for_each_parameter(ftype, parameter) {
+		if (i >= nr_parameters)
+			return;
+		parameter_names[i++] = parameter__name(parameter);
+	}
+}
+
+static bool funcs__match(struct btf_encoder *encoder, struct elf_function *func, struct function *f2)
+{
+	const char *parameter_names[BTF_ENCODER_MAX_PARAMETERS];
+	struct function *f1 = func->function;
+	const char *name;
+	int i;
+
+	if (!f1)
+		return false;
+
+	name = function__name(f1);
+
+	if (f1->proto.nr_parms != f2->proto.nr_parms) {
+		if (encoder->verbose)
+			printf("function mismatch for '%s'(%s): %d params != %d params\n",
+			       name, f1->alias ?: name,
+			       f1->proto.nr_parms, f2->proto.nr_parms);
+		return false;
+	}
+	if (f1->proto.nr_parms == 0)
+		return true;
+
+	if (!func->state.got_parameter_names) {
+		parameter_names__get(&f1->proto, BTF_ENCODER_MAX_PARAMETERS,
+				     func->state.parameter_names);
+		func->state.got_parameter_names = true;
+	}
+	parameter_names__get(&f2->proto, BTF_ENCODER_MAX_PARAMETERS, parameter_names);
+	for (i = 0; i < f1->proto.nr_parms && i < BTF_ENCODER_MAX_PARAMETERS; i++) {
+		if (!func->state.parameter_names[i]) {
+			if (!parameter_names[i])
+				continue;
+		} else if (parameter_names[i]) {
+			if (strcmp(func->state.parameter_names[i], parameter_names[i]) == 0)
+				continue;
+		}
+		if (encoder->verbose) {
+			printf("function mismatch for '%s'(%s): parameter #%d '%s' != '%s'\n",
+			       name, f1->alias ?: name, i,
+			       func->state.parameter_names[i] ?: "<null>",
+			       parameter_names[i] ?: "<null>");
+		}
+		return false;
+	}
+	return true;
+}
+
 static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct function *fn, struct elf_function *func)
 {
 	if (func->function) {
@@ -807,12 +871,16 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 
 		/* If saving and we find an existing entry, we want to merge
 		 * observations across both functions, checking that the
-		 * "seen optimized parameters" status is reflected in the func
-		 * entry. If the entry is new, record encoder state required
+		 * "seen optimized parameters" and "inconsistent prototype"
+		 * status is reflected in the func entry.
+		 * If the entry is new, record encoder state required
 		 * to add the local function later (encoder + type_id_off)
 		 * such that we can add the function later.
 		 */
 		existing->proto.optimized_parms |= fn->proto.optimized_parms;
+		if (!existing->proto.optimized_parms && !existing->proto.inconsistent_proto &&
+		     !funcs__match(encoder, func, fn))
+			existing->proto.inconsistent_proto = 1;
 	} else {
 		func->state.type_id_off = encoder->type_id_off;
 		func->function = fn;
@@ -851,7 +919,8 @@ static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder)
 	int i;
 
 	for (i = 0; i < encoder->functions.cnt; i++) {
-		struct function *fn = encoder->functions.entries[i].function;
+		struct elf_function *func = &encoder->functions.entries[i];
+		struct function *fn = func->function;
 		struct btf_encoder *other_encoder;
 
 		if (!fn || fn->proto.processed)
@@ -871,18 +940,23 @@ static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder)
 			if (!other_fn)
 				continue;
 			fn->proto.optimized_parms |= other_fn->proto.optimized_parms;
+			if (other_fn->proto.inconsistent_proto)
+				fn->proto.inconsistent_proto = 1;
+			if (!fn->proto.optimized_parms && !fn->proto.inconsistent_proto &&
+			    !funcs__match(encoder, func, other_fn))
+				fn->proto.inconsistent_proto = 1;
 			other_fn->proto.processed = 1;
 		}
-		if (fn->proto.optimized_parms) {
+		if (fn->proto.optimized_parms || fn->proto.inconsistent_proto) {
 			if (encoder->verbose) {
 				const char *name = function__name(fn);
 
-				printf("skipping addition of '%s'(%s) due to optimized-out parameters\n",
-				       name, fn->alias ?: name);
+				printf("skipping addition of '%s'(%s) due to %s\n",
+				       name, fn->alias ?: name,
+				       fn->proto.optimized_parms ? "optimized-out parameters" :
+								   "multiple inconsistent function prototypes");
 			}
 		} else {
-			struct elf_function *func = &encoder->functions.entries[i];
-
 			encoder->type_id_off = func->state.type_id_off;
 			btf_encoder__add_func(encoder, fn);
 		}
@@ -1759,7 +1833,10 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 			if (func) {
 				if (func->generated)
 					continue;
-				func->generated = true;
+				if (conf_load->skip_encoding_btf_inconsistent_proto)
+					save = true;
+				else
+					func->generated = true;
 			} else if (encoder->functions.suffix_cnt &&
 				   conf_load->btf_gen_optimized) {
 				/* falling back to name.isra.0 match if no exact
diff --git a/dwarves.h b/dwarves.h
index bb2c3bb..c9d2bf9 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -67,6 +67,7 @@ struct conf_load {
 	bool			skip_encoding_btf_type_tag;
 	bool			skip_encoding_btf_enum64;
 	bool			btf_gen_optimized;
+	bool			skip_encoding_btf_inconsistent_proto;
 	uint8_t			hashtable_bits;
 	uint8_t			max_hashtable_bits;
 	uint16_t		kabi_prefix_len;
@@ -834,6 +835,7 @@ struct ftype {
 	uint8_t		 unspec_parms:1; /* just one bit is needed */
 	uint8_t		 optimized_parms:1;
 	uint8_t		 processed:1;
+	uint8_t		 inconsistent_proto:1;
 };
 
 static inline struct ftype *tag__ftype(const struct tag *tag)
diff --git a/pahole.c b/pahole.c
index f48b66d..2992f43 100644
--- a/pahole.c
+++ b/pahole.c
@@ -1222,6 +1222,7 @@ ARGP_PROGRAM_VERSION_HOOK_DEF = dwarves_print_version;
 #define ARGP_skip_encoding_btf_enum64 337
 #define ARGP_skip_emitting_atomic_typedefs 338
 #define ARGP_btf_gen_optimized  339
+#define ARGP_skip_encoding_btf_inconsistent_proto 340
 
 static const struct argp_option pahole__options[] = {
 	{
@@ -1639,6 +1640,11 @@ static const struct argp_option pahole__options[] = {
 		.key  = ARGP_btf_gen_optimized,
 		.doc  = "Generate BTF for functions with optimization-related suffixes (.isra, .constprop).  BTF will only be generated if a function does not optimize out parameters."
 	},
+	{
+		.name = "skip_encoding_btf_inconsistent_proto",
+		.key = ARGP_skip_encoding_btf_inconsistent_proto,
+		.doc = "Skip functions that have multiple inconsistent function prototypes sharing the same name, or have optimized-out parameters."
+	},
 	{
 		.name = NULL,
 	}
@@ -1810,6 +1816,8 @@ static error_t pahole__options_parser(int key, char *arg,
 		conf.skip_emitting_atomic_typedefs = true;	break;
 	case ARGP_btf_gen_optimized:
 		conf_load.btf_gen_optimized = true;		break;
+	case ARGP_skip_encoding_btf_inconsistent_proto:
+		conf_load.skip_encoding_btf_inconsistent_proto = true; break;
 	default:
 		return ARGP_ERR_UNKNOWN;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 7/8] dwarves: document --btf_gen_optimized option
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (5 preceding siblings ...)
  2023-02-07 17:15 ` [PATCH v3 dwarves 6/8] btf_encoder: support delaying function addition to check for function prototype inconsistencies Alan Maguire
@ 2023-02-07 17:15 ` Alan Maguire
  2023-02-07 17:15 ` [PATCH v3 dwarves 8/8] dwarves: document --skip_encoding_btf_inconsistent_proto option Alan Maguire
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:15 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

Describe the option in the manual page.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 man-pages/pahole.1 | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
index 7460104..1a85f6d 100644
--- a/man-pages/pahole.1
+++ b/man-pages/pahole.1
@@ -261,6 +261,11 @@ to "/sys/kernel/btf/vmlinux".
 Allow producing BTF_KIND_FLOAT entries in systems where the vmlinux DWARF
 information has float types.
 
+.TP
+.B \-\-btf_gen_optimized
+Generate BTF for functions with optimization-related suffixes (.isra, .constprop).
+BTF will only be generated if a function does not optimize out parameters.
+
 .TP
 .B \-\-btf_gen_all
 Allow using all the BTF features supported by pahole.
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 dwarves 8/8] dwarves: document --skip_encoding_btf_inconsistent_proto option
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (6 preceding siblings ...)
  2023-02-07 17:15 ` [PATCH v3 dwarves 7/8] dwarves: document --btf_gen_optimized option Alan Maguire
@ 2023-02-07 17:15 ` Alan Maguire
  2023-02-08 13:20 ` [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Jiri Olsa
  2023-02-08 16:20 ` Arnaldo Carvalho de Melo
  9 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-07 17:15 UTC (permalink / raw)
  To: acme
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf, Alan Maguire

Describe the option in the manual page.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Kui-Feng Lee <sinquersw@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Timo Beckers <timo@incline.eu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
---
 man-pages/pahole.1 | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
index 1a85f6d..bfa20dc 100644
--- a/man-pages/pahole.1
+++ b/man-pages/pahole.1
@@ -229,6 +229,11 @@ Do not encode enum64 in BTF.
 .B \-\-skip_encoding_btf_type_tag
 Do not encode type tags in BTF.
 
+.TP
+.B \-\-skip_encoding_btf_inconsistent_proto
+Do not encode functions with multiple inconsistent prototypes or optimized-out
+parameters.
+
 .TP
 .B \-j, \-\-jobs=N
 Run N jobs in parallel. Defaults to number of online processors + 10% (like
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
  2023-02-07 17:14 ` [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
@ 2023-02-08 13:19   ` Jiri Olsa
  2023-02-08 14:43     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2023-02-08 13:19 UTC (permalink / raw)
  To: Alan Maguire
  Cc: acme, ast, andrii, daniel, eddyz87, haoluo, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf

On Tue, Feb 07, 2023 at 05:14:59PM +0000, Alan Maguire wrote:

SNIP

> +
>  /*
>   * This corresponds to the same macro defined in
>   * include/linux/kallsyms.h
> @@ -818,6 +901,11 @@ static int functions_cmp(const void *_a, const void *_b)
>  	const struct elf_function *a = _a;
>  	const struct elf_function *b = _b;
>  
> +	/* if search key allows prefix match, verify target has matching
> +	 * prefix len and prefix matches.
> +	 */
> +	if (a->prefixlen && a->prefixlen == b->prefixlen)
> +		return strncmp(a->name, b->name, b->prefixlen);
>  	return strcmp(a->name, b->name);
>  }
>  
> @@ -850,14 +938,22 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
>  	}
>  
>  	encoder->functions.entries[encoder->functions.cnt].name = name;
> +	if (strchr(name, '.')) {
> +		const char *suffix = strchr(name, '.');
> +
> +		encoder->functions.suffix_cnt++;
> +		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
> +	}
>  	encoder->functions.entries[encoder->functions.cnt].generated = false;
> +	encoder->functions.entries[encoder->functions.cnt].function = NULL;

should we zero functions.state in here? next patch adds other stuff
like got_parameter_names and parameter_names in it, so looks like it
could actually matter

jirka

>  	encoder->functions.cnt++;
>  	return 0;
>  }
>  
> -static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name)
> +static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
> +						       const char *name, size_t prefixlen)
>  {
> -	struct elf_function key = { .name = name };
> +	struct elf_function key = { .name = name, .prefixlen = prefixlen };
>  

SNIP

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (7 preceding siblings ...)
  2023-02-07 17:15 ` [PATCH v3 dwarves 8/8] dwarves: document --skip_encoding_btf_inconsistent_proto option Alan Maguire
@ 2023-02-08 13:20 ` Jiri Olsa
  2023-02-08 15:25   ` Alan Maguire
  2023-02-08 16:20 ` Arnaldo Carvalho de Melo
  9 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2023-02-08 13:20 UTC (permalink / raw)
  To: Alan Maguire
  Cc: acme, ast, andrii, daniel, eddyz87, haoluo, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf

On Tue, Feb 07, 2023 at 05:14:54PM +0000, Alan Maguire wrote:

SNIP

> 
> Changes since v2 [2]
> - Arnaldo incorporated some of the suggestions in the v2 thread;
>   these patches are based on those; the relevant changes are
>   noted as committer changes.
> - Patch 1 is unchanged from v2, but the rest of the patches
>   have been updated:
> - Patch 2 separates out the changes to the struct btf_encoder
>   that better support later addition of functions.
> - Patch 3 then is changed insofar as these changes are no
>   longer needed for the function addition refactoring.
> - Patch 4 has a small change; we need to verify that an
>   encoder has actually been added to the encoders list
>   prior to removal
> - Patch 5 changed significantly; when attempting to measure
>   performance the relatively good numbers attained when using
>   delayed function addition were not reproducible.
>   Further analysis revealed that the large number of lookups
>   caused by the presence of the separate function tree was
>   a major cause of performance degradation in the multi
>   threaded case.  So instead of maintaining a separate tree,
>   we use the ELF function list which we already need to look
>   up to match ELF -> DWARF function descriptions to store
>   the function representation.  This has 2 benefits; firstly
>   as mentioned, we already look up the ELF function so no
>   additional lookup is required to save the function.
>   Secondly, the ELF representation is identical for each
>   encoder, so we can index the same function across multiple
>   encoder function arrays - this greatly speeds up the
>   processing of comparing function representations across
>   encoders.  There is still a performance cost in this

awesome.. great we can do it without the extra tree

I wonder we could save some cycles just by memdup-ing the encoder->functions
array for the subsequent encoders, but that's ok for another patch ;-)

thanks,
jirka

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
  2023-02-08 13:19   ` Jiri Olsa
@ 2023-02-08 14:43     ` Arnaldo Carvalho de Melo
  2023-02-08 20:51       ` Jiri Olsa
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-08 14:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alan Maguire, ast, andrii, daniel, eddyz87, haoluo,
	john.fastabend, kpsingh, sinquersw, martin.lau, songliubraving,
	sdf, timo, yhs, bpf

Em Wed, Feb 08, 2023 at 02:19:20PM +0100, Jiri Olsa escreveu:
> On Tue, Feb 07, 2023 at 05:14:59PM +0000, Alan Maguire wrote:
> 
> SNIP
> 
> > +
> >  /*
> >   * This corresponds to the same macro defined in
> >   * include/linux/kallsyms.h
> > @@ -818,6 +901,11 @@ static int functions_cmp(const void *_a, const void *_b)
> >  	const struct elf_function *a = _a;
> >  	const struct elf_function *b = _b;
> >  
> > +	/* if search key allows prefix match, verify target has matching
> > +	 * prefix len and prefix matches.
> > +	 */
> > +	if (a->prefixlen && a->prefixlen == b->prefixlen)
> > +		return strncmp(a->name, b->name, b->prefixlen);
> >  	return strcmp(a->name, b->name);
> >  }
> >  
> > @@ -850,14 +938,22 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
> >  	}
> >  
> >  	encoder->functions.entries[encoder->functions.cnt].name = name;
> > +	if (strchr(name, '.')) {
> > +		const char *suffix = strchr(name, '.');
> > +
> > +		encoder->functions.suffix_cnt++;
> > +		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
> > +	}
> >  	encoder->functions.entries[encoder->functions.cnt].generated = false;
> > +	encoder->functions.entries[encoder->functions.cnt].function = NULL;
> 
> should we zero functions.state in here? next patch adds other stuff
> like got_parameter_names and parameter_names in it, so looks like it
> could actually matter

Probably, but that can come as a followup patch, right?

I've applied the patches, combining the patches documenting the two new
command line options with the patches where those options are
introduced.

Testing everything now.

Thanks,

- Arnaldo
 
> jirka
> 
> >  	encoder->functions.cnt++;
> >  	return 0;
> >  }
> >  
> > -static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name)
> > +static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
> > +						       const char *name, size_t prefixlen)
> >  {
> > -	struct elf_function key = { .name = name };
> > +	struct elf_function key = { .name = name, .prefixlen = prefixlen };
> >  
> 
> SNIP

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
  2023-02-08 13:20 ` [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Jiri Olsa
@ 2023-02-08 15:25   ` Alan Maguire
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-08 15:25 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, ast, andrii, daniel, eddyz87, haoluo, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf

On 08/02/2023 13:20, Jiri Olsa wrote:
> On Tue, Feb 07, 2023 at 05:14:54PM +0000, Alan Maguire wrote:
> 
> SNIP
> 
>>
>> Changes since v2 [2]
>> - Arnaldo incorporated some of the suggestions in the v2 thread;
>>   these patches are based on those; the relevant changes are
>>   noted as committer changes.
>> - Patch 1 is unchanged from v2, but the rest of the patches
>>   have been updated:
>> - Patch 2 separates out the changes to the struct btf_encoder
>>   that better support later addition of functions.
>> - Patch 3 then is changed insofar as these changes are no
>>   longer needed for the function addition refactoring.
>> - Patch 4 has a small change; we need to verify that an
>>   encoder has actually been added to the encoders list
>>   prior to removal
>> - Patch 5 changed significantly; when attempting to measure
>>   performance the relatively good numbers attained when using
>>   delayed function addition were not reproducible.
>>   Further analysis revealed that the large number of lookups
>>   caused by the presence of the separate function tree was
>>   a major cause of performance degradation in the multi
>>   threaded case.  So instead of maintaining a separate tree,
>>   we use the ELF function list which we already need to look
>>   up to match ELF -> DWARF function descriptions to store
>>   the function representation.  This has 2 benefits; firstly
>>   as mentioned, we already look up the ELF function so no
>>   additional lookup is required to save the function.
>>   Secondly, the ELF representation is identical for each
>>   encoder, so we can index the same function across multiple
>>   encoder function arrays - this greatly speeds up the
>>   processing of comparing function representations across
>>   encoders.  There is still a performance cost in this
> 
> awesome.. great we can do it without the extra tree
> 
> I wonder we could save some cycles just by memdup-ing the encoder->functions
> array for the subsequent encoders, but that's ok for another patch ;-)
> 

great idea; also provides extra assurance the layout of the
ELF function arrays are identical! I'd started to explore having
ELF info allocated once in main encoder thread and just duped
for other threads; should definitely save some time. thanks!

Alan

> thanks,
> jirka
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
  2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
                   ` (8 preceding siblings ...)
  2023-02-08 13:20 ` [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Jiri Olsa
@ 2023-02-08 16:20 ` Arnaldo Carvalho de Melo
  2023-02-08 16:50   ` Arnaldo Carvalho de Melo
  9 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-08 16:20 UTC (permalink / raw)
  To: Alan Maguire
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf

Em Tue, Feb 07, 2023 at 05:14:54PM +0000, Alan Maguire escreveu:
> At optimization level -O2 or higher in gcc, static functions may be
> optimized such that they have suffixes like .isra.0, .constprop.0 etc.
> These represent 
>     
> - constant propagation (.constprop.0);
> - interprocedural scalar replacement of aggregates, removal of
>   unused parameters and replacement of parameters passed by
>   reference by parameters passed by value (.isra.0)

Initial test, without using the new options:

[acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | tail
      3 start_show
      3 timeout_show
      3 uuid_show
      4 m_next
      4 parse_options
      4 sk_diag_fill
      4 state_show
      4 state_store
      5 status_show
      6 type_show
[acme@pumpkin ~]$

Now I'll use --skip_encoding_btf_inconsistent_proto and --btf_gen_optimized

- Arnaldo
   
> See [1] for details. 
>     
> Currently BTF encoding does not handle such optimized functions
> that get renamed with a "." suffix such as ".isra.0", ".constprop.0".
> This is safer because such suffixes can often indicate parameters have
> been optimized out.  This series addresses this by matching a
> function to a suffixed version ("foo" matching "foo.isra.0") while
> ensuring that the function signature does not contain optimized-out
> parameters.  Note that if the function is found ("foo") it will
> be preferred, only falling back to "foo.isra.0" if lookup of the
> function fails.  Addition to BTF is skipped if the function has
> optimized-out parameters, since the expected function signature
> will not match. BTF encoding does not include the "."-suffix to
> be consistent with DWARF. In addition, the kernel currently does
> not allow a "." suffix in a BTF function name.
> 
> A problem with this approach however is that BTF carries out the
> encoding process in parallel across multiple CUs, and sometimes
> a function has optimized-out parameters in one CU but not others;
> we see this for NF_HOOK.constprop.0 for example.  So in order to
> determine if the function has optimized-out parameters in any
> CU, its addition is not carried out until we have processed all
> CUs and are about to merge BTF.  At this point we know if any
> such optimizations have occurred.  Patches 1-5 handle the
> optimized-out parameter identification and matching "."-suffixed
> functions with the original function to facilitate BTF
> encoding.  This feature can be enabled via the
> "--btf_gen_optimized" option.
> 
> Patch 6 addresses a related problem - it is entirely possible
> for a static function of the same name to exist in different
> CUs with different function signatures.  Because BTF does not
> currently encode any information that would help disambiguate
> which BTF function specification matches which static function
> (in the case of multiple different function signatures), it is
> best to eliminate such functions from BTF for now.  The same
> mechanism that is used to compare static "."-suffixed functions
> is re-used for the static function comparison.  A superficial
> comparison of number of parameters/parameter names is done to
> see if such representations are consistent, and if inconsistent
> prototypes are observed, the function is flagged for exclusion
> from BTF.
> 
> When these methods are combined - the additive encoding of
> "."-suffixed functions and the subtractive elimination of
> functions with inconsistent parameters - we see an overall
> drop in the number of functions in vmlinux BTF, from
> 51529 to 50246.  Skipping inconsistent functions is enabled
> via "--skip_encoding_btf_inconsistent_proto".
> 
> Changes since v2 [2]
> - Arnaldo incorporated some of the suggestions in the v2 thread;
>   these patches are based on those; the relevant changes are
>   noted as committer changes.
> - Patch 1 is unchanged from v2, but the rest of the patches
>   have been updated:
> - Patch 2 separates out the changes to the struct btf_encoder
>   that better support later addition of functions.
> - Patch 3 then is changed insofar as these changes are no
>   longer needed for the function addition refactoring.
> - Patch 4 has a small change; we need to verify that an
>   encoder has actually been added to the encoders list
>   prior to removal
> - Patch 5 changed significantly; when attempting to measure
>   performance the relatively good numbers attained when using
>   delayed function addition were not reproducible.
>   Further analysis revealed that the large number of lookups
>   caused by the presence of the separate function tree was
>   a major cause of performance degradation in the multi
>   threaded case.  So instead of maintaining a separate tree,
>   we use the ELF function list which we already need to look
>   up to match ELF -> DWARF function descriptions to store
>   the function representation.  This has 2 benefits; firstly
>   as mentioned, we already look up the ELF function so no
>   additional lookup is required to save the function.
>   Secondly, the ELF representation is identical for each
>   encoder, so we can index the same function across multiple
>   encoder function arrays - this greatly speeds up the
>   processing of comparing function representations across
>   encoders.  There is still a performance cost in this
>   approach however; more details are provided in patch 6.
>   An option specific to adding functions with "." suffixes
>   is added "--btf_gen_optimized"
> - Patch 6 builds on patch 5 in applying the save/merge/add
>   approach for all functions using the same mechanisms.
>   In addition the "--skip_encoding_btf_inconsistent_proto"
>   option is introduced.
> - Patches 7/8 document the new options in the pahole manual
>   page.
>   
> Changes since v1 [3]
> 
> - Eduard noted that a DW_AT_const_value attribute can signal
>   an optimized-out parameter, and that the lack of a location
>   attribute signals optimization; ensure we handle those cases
>   also (Eduard, patch 1).
> - Jiri noted we can have inconsistencies between a static
>   and non-static function; apply the comparison process to
>   all functions (Jiri, patch 5)
> - segmentation fault was observed when handling functions with
>   > 10 parameters; needed parameter comparison loop to exit
>   at BTF_ENCODER_MAX_PARAMETERS (patch 5)
> - Kui-Feng Lee pointed out that having a global shared function
>   tree would lead to a lot of contention; here a per-encoder 
>   tree is used, and once the threads are collected the trees
>   are merged. Performance numbers are provided in patch 5 
>   (Kui-Feng Lee, patches 4/5)
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> [2] https://lore.kernel.org/bpf/1675088985-20300-1-git-send-email-alan.maguire@oracle.com/
> [3] https://lore.kernel.org/bpf/1674567931-26458-1-git-send-email-alan.maguire@oracle.com/
> 
> Alan Maguire (8):
>   dwarf_loader: Help spotting functions with optimized-out parameters
>   btf_encoder: store type_id_off, unspecified type in encoder
>   btf_encoder: Refactor function addition into dedicated
>     btf_encoder__add_func
>   btf_encoder: Rework btf_encoders__*() API to allow traversal of
>     encoders
>   btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
>   btf_encoder: support delaying function addition to check for function
>     prototype inconsistencies
>   dwarves: document --btf_gen_optimized option
>   dwarves: document --skip_encoding_btf_inconsistent_proto option
> 
>  btf_encoder.c      | 360 +++++++++++++++++++++++++++++++++++++--------
>  btf_encoder.h      |   6 -
>  dwarf_loader.c     | 130 +++++++++++++++-
>  dwarves.h          |  11 +-
>  man-pages/pahole.1 |  10 ++
>  pahole.c           |  30 +++-
>  6 files changed, 468 insertions(+), 79 deletions(-)
> 
> -- 
> 2.31.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
  2023-02-08 16:20 ` Arnaldo Carvalho de Melo
@ 2023-02-08 16:50   ` Arnaldo Carvalho de Melo
  2023-02-09  9:36     ` Jiri Olsa
       [not found]     ` <3c021d56-8818-2464-f7e0-889e769c0311@oracle.com>
  0 siblings, 2 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-08 16:50 UTC (permalink / raw)
  To: Alan Maguire
  Cc: ast, andrii, daniel, eddyz87, haoluo, jolsa, john.fastabend,
	kpsingh, sinquersw, martin.lau, songliubraving, sdf, timo, yhs,
	bpf

Em Wed, Feb 08, 2023 at 01:20:39PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Feb 07, 2023 at 05:14:54PM +0000, Alan Maguire escreveu:
> > At optimization level -O2 or higher in gcc, static functions may be
> > optimized such that they have suffixes like .isra.0, .constprop.0 etc.
> > These represent 
> >     
> > - constant propagation (.constprop.0);
> > - interprocedural scalar replacement of aggregates, removal of
> >   unused parameters and replacement of parameters passed by
> >   reference by parameters passed by value (.isra.0)
> 
> Initial test, without using the new options:
> 
> [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | tail
>       3 start_show
>       3 timeout_show
>       3 uuid_show
>       4 m_next
>       4 parse_options
>       4 sk_diag_fill
>       4 state_show
>       4 state_store
>       5 status_show
>       6 type_show
> [acme@pumpkin ~]$
> 
> Now I'll use --skip_encoding_btf_inconsistent_proto and --btf_gen_optimized

With:

⬢[acme@toolbox linux]$ git diff
diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
index 1f1f1d397c399afe..9dd185fb1ff1fc3b 100755
--- a/scripts/pahole-flags.sh
+++ b/scripts/pahole-flags.sh
@@ -21,7 +21,7 @@ if [ "${pahole_ver}" -ge "122" ]; then
 fi
 if [ "${pahole_ver}" -ge "124" ]; then
        # see PAHOLE_HAS_LANG_EXCLUDE
-       extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
+       extra_paholeopt="${extra_paholeopt} --lang_exclude=rust --btf_gen_optimized --skip_encoding_btf_inconsistent_proto"
 fi

 echo ${extra_paholeopt}
⬢[acme@toolbox linux]$

I get:

[acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | tail
      1 zswap_writeback_entry
      1 zswap_zpool_param_set
      1 zs_zpool_create
      1 zs_zpool_destroy
      1 zs_zpool_free
      1 zs_zpool_malloc
      1 zs_zpool_map
      1 zs_zpool_shrink
      1 zs_zpool_total_size
      1 zs_zpool_unmap
[acme@pumpkin ~]$

No functions with more than one entry:

[acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | grep -v ' 1 '
[acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | grep ' 1 ' | wc -l
54558
[acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | wc -l
54558
[acme@pumpkin ~]$

So I'll bump the release as we did in the past when testing features
that we need to test against a release on the pahole-flags.sh script so
that we can do further tests.

- Arnaldo
 
> - Arnaldo
>    
> > See [1] for details. 
> >     
> > Currently BTF encoding does not handle such optimized functions
> > that get renamed with a "." suffix such as ".isra.0", ".constprop.0".
> > This is safer because such suffixes can often indicate parameters have
> > been optimized out.  This series addresses this by matching a
> > function to a suffixed version ("foo" matching "foo.isra.0") while
> > ensuring that the function signature does not contain optimized-out
> > parameters.  Note that if the function is found ("foo") it will
> > be preferred, only falling back to "foo.isra.0" if lookup of the
> > function fails.  Addition to BTF is skipped if the function has
> > optimized-out parameters, since the expected function signature
> > will not match. BTF encoding does not include the "."-suffix to
> > be consistent with DWARF. In addition, the kernel currently does
> > not allow a "." suffix in a BTF function name.
> > 
> > A problem with this approach however is that BTF carries out the
> > encoding process in parallel across multiple CUs, and sometimes
> > a function has optimized-out parameters in one CU but not others;
> > we see this for NF_HOOK.constprop.0 for example.  So in order to
> > determine if the function has optimized-out parameters in any
> > CU, its addition is not carried out until we have processed all
> > CUs and are about to merge BTF.  At this point we know if any
> > such optimizations have occurred.  Patches 1-5 handle the
> > optimized-out parameter identification and matching "."-suffixed
> > functions with the original function to facilitate BTF
> > encoding.  This feature can be enabled via the
> > "--btf_gen_optimized" option.
> > 
> > Patch 6 addresses a related problem - it is entirely possible
> > for a static function of the same name to exist in different
> > CUs with different function signatures.  Because BTF does not
> > currently encode any information that would help disambiguate
> > which BTF function specification matches which static function
> > (in the case of multiple different function signatures), it is
> > best to eliminate such functions from BTF for now.  The same
> > mechanism that is used to compare static "."-suffixed functions
> > is re-used for the static function comparison.  A superficial
> > comparison of number of parameters/parameter names is done to
> > see if such representations are consistent, and if inconsistent
> > prototypes are observed, the function is flagged for exclusion
> > from BTF.
> > 
> > When these methods are combined - the additive encoding of
> > "."-suffixed functions and the subtractive elimination of
> > functions with inconsistent parameters - we see an overall
> > drop in the number of functions in vmlinux BTF, from
> > 51529 to 50246.  Skipping inconsistent functions is enabled
> > via "--skip_encoding_btf_inconsistent_proto".
> > 
> > Changes since v2 [2]
> > - Arnaldo incorporated some of the suggestions in the v2 thread;
> >   these patches are based on those; the relevant changes are
> >   noted as committer changes.
> > - Patch 1 is unchanged from v2, but the rest of the patches
> >   have been updated:
> > - Patch 2 separates out the changes to the struct btf_encoder
> >   that better support later addition of functions.
> > - Patch 3 then is changed insofar as these changes are no
> >   longer needed for the function addition refactoring.
> > - Patch 4 has a small change; we need to verify that an
> >   encoder has actually been added to the encoders list
> >   prior to removal
> > - Patch 5 changed significantly; when attempting to measure
> >   performance the relatively good numbers attained when using
> >   delayed function addition were not reproducible.
> >   Further analysis revealed that the large number of lookups
> >   caused by the presence of the separate function tree was
> >   a major cause of performance degradation in the multi
> >   threaded case.  So instead of maintaining a separate tree,
> >   we use the ELF function list which we already need to look
> >   up to match ELF -> DWARF function descriptions to store
> >   the function representation.  This has 2 benefits; firstly
> >   as mentioned, we already look up the ELF function so no
> >   additional lookup is required to save the function.
> >   Secondly, the ELF representation is identical for each
> >   encoder, so we can index the same function across multiple
> >   encoder function arrays - this greatly speeds up the
> >   processing of comparing function representations across
> >   encoders.  There is still a performance cost in this
> >   approach however; more details are provided in patch 6.
> >   An option specific to adding functions with "." suffixes
> >   is added "--btf_gen_optimized"
> > - Patch 6 builds on patch 5 in applying the save/merge/add
> >   approach for all functions using the same mechanisms.
> >   In addition the "--skip_encoding_btf_inconsistent_proto"
> >   option is introduced.
> > - Patches 7/8 document the new options in the pahole manual
> >   page.
> >   
> > Changes since v1 [3]
> > 
> > - Eduard noted that a DW_AT_const_value attribute can signal
> >   an optimized-out parameter, and that the lack of a location
> >   attribute signals optimization; ensure we handle those cases
> >   also (Eduard, patch 1).
> > - Jiri noted we can have inconsistencies between a static
> >   and non-static function; apply the comparison process to
> >   all functions (Jiri, patch 5)
> > - segmentation fault was observed when handling functions with
> >   > 10 parameters; needed parameter comparison loop to exit
> >   at BTF_ENCODER_MAX_PARAMETERS (patch 5)
> > - Kui-Feng Lee pointed out that having a global shared function
> >   tree would lead to a lot of contention; here a per-encoder 
> >   tree is used, and once the threads are collected the trees
> >   are merged. Performance numbers are provided in patch 5 
> >   (Kui-Feng Lee, patches 4/5)
> > 
> > [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> > [2] https://lore.kernel.org/bpf/1675088985-20300-1-git-send-email-alan.maguire@oracle.com/
> > [3] https://lore.kernel.org/bpf/1674567931-26458-1-git-send-email-alan.maguire@oracle.com/
> > 
> > Alan Maguire (8):
> >   dwarf_loader: Help spotting functions with optimized-out parameters
> >   btf_encoder: store type_id_off, unspecified type in encoder
> >   btf_encoder: Refactor function addition into dedicated
> >     btf_encoder__add_func
> >   btf_encoder: Rework btf_encoders__*() API to allow traversal of
> >     encoders
> >   btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
> >   btf_encoder: support delaying function addition to check for function
> >     prototype inconsistencies
> >   dwarves: document --btf_gen_optimized option
> >   dwarves: document --skip_encoding_btf_inconsistent_proto option
> > 
> >  btf_encoder.c      | 360 +++++++++++++++++++++++++++++++++++++--------
> >  btf_encoder.h      |   6 -
> >  dwarf_loader.c     | 130 +++++++++++++++-
> >  dwarves.h          |  11 +-
> >  man-pages/pahole.1 |  10 ++
> >  pahole.c           |  30 +++-
> >  6 files changed, 468 insertions(+), 79 deletions(-)
> > 
> > -- 
> > 2.31.1
> > 
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
  2023-02-08 14:43     ` Arnaldo Carvalho de Melo
@ 2023-02-08 20:51       ` Jiri Olsa
  2023-02-08 22:57         ` Alan Maguire
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2023-02-08 20:51 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Alan Maguire, ast, andrii, daniel, eddyz87, haoluo,
	john.fastabend, kpsingh, sinquersw, martin.lau, songliubraving,
	sdf, timo, yhs, bpf

On Wed, Feb 08, 2023 at 11:43:18AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Feb 08, 2023 at 02:19:20PM +0100, Jiri Olsa escreveu:
> > On Tue, Feb 07, 2023 at 05:14:59PM +0000, Alan Maguire wrote:
> > 
> > SNIP
> > 
> > > +
> > >  /*
> > >   * This corresponds to the same macro defined in
> > >   * include/linux/kallsyms.h
> > > @@ -818,6 +901,11 @@ static int functions_cmp(const void *_a, const void *_b)
> > >  	const struct elf_function *a = _a;
> > >  	const struct elf_function *b = _b;
> > >  
> > > +	/* if search key allows prefix match, verify target has matching
> > > +	 * prefix len and prefix matches.
> > > +	 */
> > > +	if (a->prefixlen && a->prefixlen == b->prefixlen)
> > > +		return strncmp(a->name, b->name, b->prefixlen);
> > >  	return strcmp(a->name, b->name);
> > >  }
> > >  
> > > @@ -850,14 +938,22 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
> > >  	}
> > >  
> > >  	encoder->functions.entries[encoder->functions.cnt].name = name;
> > > +	if (strchr(name, '.')) {
> > > +		const char *suffix = strchr(name, '.');
> > > +
> > > +		encoder->functions.suffix_cnt++;
> > > +		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
> > > +	}
> > >  	encoder->functions.entries[encoder->functions.cnt].generated = false;
> > > +	encoder->functions.entries[encoder->functions.cnt].function = NULL;
> > 
> > should we zero functions.state in here? next patch adds other stuff
> > like got_parameter_names and parameter_names in it, so looks like it
> > could actually matter
> 
> Probably, but that can come as a followup patch, right?

sure, if Alan is ok with that

jirka

> 
> I've applied the patches, combining the patches documenting the two new
> command line options with the patches where those options are
> introduced.
> 
> Testing everything now.
> 
> Thanks,
> 
> - Arnaldo
>  
> > jirka
> > 
> > >  	encoder->functions.cnt++;
> > >  	return 0;
> > >  }
> > >  
> > > -static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name)
> > > +static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
> > > +						       const char *name, size_t prefixlen)
> > >  {
> > > -	struct elf_function key = { .name = name };
> > > +	struct elf_function key = { .name = name, .prefixlen = prefixlen };
> > >  
> > 
> > SNIP
> 
> -- 
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF
  2023-02-08 20:51       ` Jiri Olsa
@ 2023-02-08 22:57         ` Alan Maguire
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Maguire @ 2023-02-08 22:57 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo
  Cc: ast, andrii, daniel, eddyz87, haoluo, john.fastabend, kpsingh,
	sinquersw, martin.lau, songliubraving, sdf, timo, yhs, bpf

On 08/02/2023 20:51, Jiri Olsa wrote:
> On Wed, Feb 08, 2023 at 11:43:18AM -0300, Arnaldo Carvalho de Melo wrote:
>> Em Wed, Feb 08, 2023 at 02:19:20PM +0100, Jiri Olsa escreveu:
>>> On Tue, Feb 07, 2023 at 05:14:59PM +0000, Alan Maguire wrote:
>>>
>>> SNIP
>>>
>>>> +
>>>>  /*
>>>>   * This corresponds to the same macro defined in
>>>>   * include/linux/kallsyms.h
>>>> @@ -818,6 +901,11 @@ static int functions_cmp(const void *_a, const void *_b)
>>>>  	const struct elf_function *a = _a;
>>>>  	const struct elf_function *b = _b;
>>>>  
>>>> +	/* if search key allows prefix match, verify target has matching
>>>> +	 * prefix len and prefix matches.
>>>> +	 */
>>>> +	if (a->prefixlen && a->prefixlen == b->prefixlen)
>>>> +		return strncmp(a->name, b->name, b->prefixlen);
>>>>  	return strcmp(a->name, b->name);
>>>>  }
>>>>  
>>>> @@ -850,14 +938,22 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
>>>>  	}
>>>>  
>>>>  	encoder->functions.entries[encoder->functions.cnt].name = name;
>>>> +	if (strchr(name, '.')) {
>>>> +		const char *suffix = strchr(name, '.');
>>>> +
>>>> +		encoder->functions.suffix_cnt++;
>>>> +		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
>>>> +	}
>>>>  	encoder->functions.entries[encoder->functions.cnt].generated = false;
>>>> +	encoder->functions.entries[encoder->functions.cnt].function = NULL;
>>>
>>> should we zero functions.state in here? next patch adds other stuff
>>> like got_parameter_names and parameter_names in it, so looks like it
>>> could actually matter
>>
>> Probably, but that can come as a followup patch, right?
> 
> sure, if Alan is ok with that
> 

it's a great catch; I sent:

https://lore.kernel.org/bpf/1675896868-26339-1-git-send-email-alan.maguire@oracle.com/

...as a followup. Thanks!

> jirka
> 
>>
>> I've applied the patches, combining the patches documenting the two new
>> command line options with the patches where those options are
>> introduced.
>>
>> Testing everything now.
>>
>> Thanks,
>>
>> - Arnaldo
>>  
>>> jirka
>>>
>>>>  	encoder->functions.cnt++;
>>>>  	return 0;
>>>>  }
>>>>  
>>>> -static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name)
>>>> +static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
>>>> +						       const char *name, size_t prefixlen)
>>>>  {
>>>> -	struct elf_function key = { .name = name };
>>>> +	struct elf_function key = { .name = name, .prefixlen = prefixlen };
>>>>  
>>>
>>> SNIP
>>
>> -- 
>>
>> - Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
  2023-02-08 16:50   ` Arnaldo Carvalho de Melo
@ 2023-02-09  9:36     ` Jiri Olsa
  2023-02-09 12:22       ` Arnaldo Carvalho de Melo
       [not found]     ` <3c021d56-8818-2464-f7e0-889e769c0311@oracle.com>
  1 sibling, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2023-02-09  9:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alan Maguire, ast, andrii, daniel, eddyz87, haoluo,
	john.fastabend, kpsingh, sinquersw, martin.lau, songliubraving,
	sdf, timo, yhs, bpf

On Wed, Feb 08, 2023 at 01:50:27PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Feb 08, 2023 at 01:20:39PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Tue, Feb 07, 2023 at 05:14:54PM +0000, Alan Maguire escreveu:
> > > At optimization level -O2 or higher in gcc, static functions may be
> > > optimized such that they have suffixes like .isra.0, .constprop.0 etc.
> > > These represent 
> > >     
> > > - constant propagation (.constprop.0);
> > > - interprocedural scalar replacement of aggregates, removal of
> > >   unused parameters and replacement of parameters passed by
> > >   reference by parameters passed by value (.isra.0)
> > 
> > Initial test, without using the new options:
> > 
> > [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | tail
> >       3 start_show
> >       3 timeout_show
> >       3 uuid_show
> >       4 m_next
> >       4 parse_options
> >       4 sk_diag_fill
> >       4 state_show
> >       4 state_store
> >       5 status_show
> >       6 type_show
> > [acme@pumpkin ~]$
> > 
> > Now I'll use --skip_encoding_btf_inconsistent_proto and --btf_gen_optimized
> 
> With:
> 
> ⬢[acme@toolbox linux]$ git diff
> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> index 1f1f1d397c399afe..9dd185fb1ff1fc3b 100755
> --- a/scripts/pahole-flags.sh
> +++ b/scripts/pahole-flags.sh
> @@ -21,7 +21,7 @@ if [ "${pahole_ver}" -ge "122" ]; then
>  fi
>  if [ "${pahole_ver}" -ge "124" ]; then
>         # see PAHOLE_HAS_LANG_EXCLUDE
> -       extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> +       extra_paholeopt="${extra_paholeopt} --lang_exclude=rust --btf_gen_optimized --skip_encoding_btf_inconsistent_proto"
>  fi
> 
>  echo ${extra_paholeopt}
> ⬢[acme@toolbox linux]$
> 
> I get:
> 
> [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | tail
>       1 zswap_writeback_entry
>       1 zswap_zpool_param_set
>       1 zs_zpool_create
>       1 zs_zpool_destroy
>       1 zs_zpool_free
>       1 zs_zpool_malloc
>       1 zs_zpool_map
>       1 zs_zpool_shrink
>       1 zs_zpool_total_size
>       1 zs_zpool_unmap
> [acme@pumpkin ~]$
> 
> No functions with more than one entry:
> 
> [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | grep -v ' 1 '
> [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | grep ' 1 ' | wc -l
> 54558
> [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | wc -l
> 54558
> [acme@pumpkin ~]$
> 
> So I'll bump the release as we did in the past when testing features
> that we need to test against a release on the pahole-flags.sh script so
> that we can do further tests.

I did similar test and ran bpf selftests built with new pahole,
all looks good

Acked/Tested-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions
  2023-02-09  9:36     ` Jiri Olsa
@ 2023-02-09 12:22       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-09 12:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alan Maguire, ast, andrii, daniel, eddyz87, haoluo,
	john.fastabend, kpsingh, sinquersw, martin.lau, songliubraving,
	sdf, timo, yhs, bpf

Em Thu, Feb 09, 2023 at 10:36:51AM +0100, Jiri Olsa escreveu:
> On Wed, Feb 08, 2023 at 01:50:27PM -0300, Arnaldo Carvalho de Melo wrote:
> > No functions with more than one entry:

> > [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | grep -v ' 1 '
> > [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | sort | uniq -c | sort -n | grep ' 1 ' | wc -l
> > 54558
> > [acme@pumpkin ~]$ pfunct /sys/kernel/btf/vmlinux  | wc -l
> > 54558
> > [acme@pumpkin ~]$

> > So I'll bump the release as we did in the past when testing features
> > that we need to test against a release on the pahole-flags.sh script so
> > that we can do further tests.

> I did similar test and ran bpf selftests built with new pahole,
> all looks good

> Acked/Tested-by: Jiri Olsa <jolsa@kernel.org>

Thanks, added to the patches in this series.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25
       [not found]     ` <3c021d56-8818-2464-f7e0-889e769c0311@oracle.com>
@ 2023-02-09 13:09       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-02-09 13:09 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf,
	Jiri Olsa, dwarves

Em Wed, Feb 08, 2023 at 05:17:02PM +0000, Alan Maguire escreveu:
> From: Alan Maguire <alan.maguire@oracle.com>
> Date: Thu, 2 Feb 2023 12:26:20 +0000
> Subject: [PATCH bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto,
>  --btf_gen_optimized to pahole flags for v1.25
> 
> v1.25 of pahole supports filtering out functions with multiple
> inconsistent function prototypes or optimized-out parameters
> from the BTF representation.  These present problems because
> there is no additional info in BTF saying which inconsistent
> prototype matches which function instance to help guide
> attachment, and functions with optimized-out parameters can
> lead to incorrect assumptions about register contents.
> 
> So for now, filter out such functions while adding BTF
> representations for functions that have "."-suffixes
> (foo.isra.0) but not optimized-out parameters.
> 
> This patch assumes changes in [1] land and pahole is bumped
> to v1.25.
> 
> [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  scripts/pahole-flags.sh | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> index 1f1f1d3..728d551 100755
> --- a/scripts/pahole-flags.sh
> +++ b/scripts/pahole-flags.sh
> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>  	# see PAHOLE_HAS_LANG_EXCLUDE
>  	extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>  fi
> +if [ "${pahole_ver}" -ge "125" ]; then
> +	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> +fi
>  
>  echo ${extra_paholeopt}
> -- 
> 1.8.3.1


I applied the patch and it works as advertised using what is in pahole's
'next' branch, that should go to 'master' later today.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-02-09 13:09 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-07 17:14 [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 1/8] dwarf_loader: Help spotting functions with optimized-out parameters Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 2/8] btf_encoder: store type_id_off, unspecified type in encoder Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 3/8] btf_encoder: Refactor function addition into dedicated btf_encoder__add_func Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 4/8] btf_encoder: Rework btf_encoders__*() API to allow traversal of encoders Alan Maguire
2023-02-07 17:14 ` [PATCH v3 dwarves 5/8] btf_encoder: Represent "."-suffixed functions (".isra.0") in BTF Alan Maguire
2023-02-08 13:19   ` Jiri Olsa
2023-02-08 14:43     ` Arnaldo Carvalho de Melo
2023-02-08 20:51       ` Jiri Olsa
2023-02-08 22:57         ` Alan Maguire
2023-02-07 17:15 ` [PATCH v3 dwarves 6/8] btf_encoder: support delaying function addition to check for function prototype inconsistencies Alan Maguire
2023-02-07 17:15 ` [PATCH v3 dwarves 7/8] dwarves: document --btf_gen_optimized option Alan Maguire
2023-02-07 17:15 ` [PATCH v3 dwarves 8/8] dwarves: document --skip_encoding_btf_inconsistent_proto option Alan Maguire
2023-02-08 13:20 ` [PATCH v3 dwarves 0/8] dwarves: support encoding of optimized-out parameters, removal of inconsistent static functions Jiri Olsa
2023-02-08 15:25   ` Alan Maguire
2023-02-08 16:20 ` Arnaldo Carvalho de Melo
2023-02-08 16:50   ` Arnaldo Carvalho de Melo
2023-02-09  9:36     ` Jiri Olsa
2023-02-09 12:22       ` Arnaldo Carvalho de Melo
     [not found]     ` <3c021d56-8818-2464-f7e0-889e769c0311@oracle.com>
2023-02-09 13:09       ` [PATCH bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25 Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.