bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs
@ 2023-05-17 16:16 Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 1/6] btf_encoder: record function address and if it is local Alan Maguire
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

As a means to continue the discussion in [1], which is
concerned with finding the best long-term solution to
having a BPF Type Format (BTF) representation of
functions that is usable for tracing of edge cases, this
proof-of-concept series is intended to explore one approach
to adding information to help make tracing more accurate.

A key problem today is that there is no matching from function
description to the actual instances of a function.

When that function only has one description, that is
not an issue, but if we have multiple inconsistent
static functions in different CUs such as

From kernel/irq/irqdesc.c
    
    static ssize_t wakeup_show(struct kobject *kobj,
                               struct kobj_attribute *attr, char *buf)
    
...and from drivers/base/power/sysfs.c
    
    static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
                               char *buf);

...this becomes a problem.  If I am attaching,
which do I want?  And even if I know which one
I want, which instance in kallsyms is which?

This series is a proof-of-concept that supports encoding
function addresses and associating them with BTF FUNC
descriptions using BTF declaration tags.

More work would need to be done on the kernel side
to _use_ this representation, but hopefully having a
rough approach outlined will help make that more feasible.

[1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/

Alan Maguire (6):
  btf_encoder: record function address and if it is local
  dwarf_loader: store address in function low_pc if available
  dwarf_loader: transfer low_pc info from subtroutine to its abstract
    origin
  btf_encoder: add "addr=0x<addr>" function declaration tag if
    --btf_gen_func_addr specified
  btf_encoder: store ELF function representations sorted by name _and_
    address
  pahole: document --btf_gen_func_addr

 btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
 btf_encoder.h      |  4 +--
 dwarf_loader.c     | 16 +++++++++---
 dwarves.h          |  3 +++
 man-pages/pahole.1 |  8 ++++++
 pahole.c           | 12 +++++++--
 6 files changed, 85 insertions(+), 22 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC dwarves 1/6] btf_encoder: record function address and if it is local
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
@ 2023-05-17 16:16 ` Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 2/6] dwarf_loader: store address in function low_pc if available Alan Maguire
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

this will help our encoding disambiguate functions.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/btf_encoder.c b/btf_encoder.c
index 65f6e71..edf72e6 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -45,7 +45,9 @@ struct btf_encoder_state {
 struct elf_function {
 	const char	*name;
 	bool		 generated;
+	bool		local;
 	size_t		prefixlen;
+	uint64_t	addr;
 	struct function	*function;
 	struct btf_encoder_state state;
 };
@@ -1015,6 +1017,8 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
 		encoder->functions.entries[encoder->functions.cnt].prefixlen = suffix - name;
 	}
 	encoder->functions.entries[encoder->functions.cnt].generated = false;
+	encoder->functions.entries[encoder->functions.cnt].local = elf_sym__is_local_function(sym);
+	encoder->functions.entries[encoder->functions.cnt].addr = elf_sym__value(sym);
 	encoder->functions.entries[encoder->functions.cnt].function = NULL;
 	encoder->functions.entries[encoder->functions.cnt].state.got_proto = false;
 	encoder->functions.entries[encoder->functions.cnt].state.proto[0] = '\0';
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC dwarves 2/6] dwarf_loader: store address in function low_pc if available
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 1/6] btf_encoder: record function address and if it is local Alan Maguire
@ 2023-05-17 16:16 ` Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 3/6] dwarf_loader: transfer low_pc info from subtroutine to its abstract origin Alan Maguire
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

low_pc will be useful when we want to compare functions
by name and address.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 dwarf_loader.c | 3 +++
 dwarves.h      | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index ccf3194..9271ac0 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1319,6 +1319,9 @@ static struct function *function__new(Dwarf_Die *die, struct cu *cu, struct conf
 				    attr_type(die, DW_AT_specification));
 		func->accessibility   = attr_numeric(die, DW_AT_accessibility);
 		func->virtuality      = attr_numeric(die, DW_AT_virtuality);
+		func->has_low_pc      = dwarf_hasattr(die, DW_AT_low_pc);
+		if (func->has_low_pc)
+			dwarf_lowpc(die, &func->low_pc);
 		INIT_LIST_HEAD(&func->vtable_node);
 		INIT_LIST_HEAD(&func->annots);
 		INIT_LIST_HEAD(&func->tool_node);
diff --git a/dwarves.h b/dwarves.h
index eb1a6df..9cf13dd 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -937,7 +937,9 @@ struct function {
 	uint8_t		 virtuality:2; /* DW_VIRTUALITY_{none,virtual,pure_virtual} */
 	uint8_t		 declaration:1;
 	uint8_t		 btf:1;
+	uint8_t		 has_low_pc:1;
 	int32_t		 vtable_entry;
+	uint64_t	 low_pc;
 	struct list_head vtable_node;
 	struct list_head annots;
 	/* fields used by tools */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC dwarves 3/6] dwarf_loader: transfer low_pc info from subtroutine to its abstract origin
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 1/6] btf_encoder: record function address and if it is local Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 2/6] dwarf_loader: store address in function low_pc if available Alan Maguire
@ 2023-05-17 16:16 ` Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 4/6] btf_encoder: add "addr=0x<addr>" function declaration tag if --btf_gen_func_addr specified Alan Maguire
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

when recoding, often the subroutine DIE contains low_pc for a function
whereas the abstract origin it refers to does not; in such cases
copy the low_pc info into the abstract origin also.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 dwarf_loader.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 9271ac0..ed94873 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -2594,9 +2594,16 @@ static int tag__recode_dwarf_type(struct tag *tag, struct cu *cu)
 			dtype = dwarf_cu__find_tag_by_ref(cu->priv, &dtag->abstract_origin);
 			if (dtype == NULL)
 				dtype = dwarf_cu__find_tag_by_ref(cu->priv, &specification);
-			if (dtype != NULL)
-				fn->name = tag__function(dtype->tag)->name;
-			else {
+			if (dtype != NULL) {
+				struct function *ofn = tag__function(dtype->tag);
+
+				fn->name = ofn->name;
+				/* abstract origin may not have low_pc... */
+				if (fn->has_low_pc && !ofn->has_low_pc) {
+					ofn->has_low_pc = fn->has_low_pc;
+					ofn->low_pc = fn->low_pc;
+				}
+			} else {
 				fprintf(stderr,
 					"%s: couldn't find name for "
 					"function %#llx, abstract_origin=%#llx,"
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC dwarves 4/6] btf_encoder: add "addr=0x<addr>" function declaration tag if --btf_gen_func_addr specified
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
                   ` (2 preceding siblings ...)
  2023-05-17 16:16 ` [RFC dwarves 3/6] dwarf_loader: transfer low_pc info from subtroutine to its abstract origin Alan Maguire
@ 2023-05-17 16:16 ` Alan Maguire
  2023-05-17 16:16 ` [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address Alan Maguire
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

Now that we have address information, use BTF declaration tag
of form

[105031] DECL_TAG 'address=0xffffffff81777410' type_id=105030 component_idx=-1

...which points at the function in question.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 34 +++++++++++++++++++++++++---------
 btf_encoder.h |  4 ++--
 dwarves.h     |  1 +
 pahole.c      | 12 ++++++++++--
 4 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index edf72e6..3bd0fe0 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -99,7 +99,7 @@ struct btf_encoder {
 static LIST_HEAD(encoders);
 static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER;
 
-static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder);
+static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder, struct conf_load *conf_load);
 
 /* mutex only needed for add/delete, as this can happen in multiple encoding
  * threads.  Traversal of the list is currently confined to thread collection.
@@ -710,7 +710,7 @@ static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, uint32_
 	return gobuffer__add(&encoder->percpu_secinfo, &si, sizeof(si));
 }
 
-int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other)
+int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other, struct conf_load *conf_load)
 {
 	struct gobuffer *var_secinfo_buf = &other->percpu_secinfo;
 	size_t sz = gobuffer__size(var_secinfo_buf);
@@ -723,7 +723,7 @@ int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder
 	if (encoder == other)
 		return 0;
 
-	btf_encoder__add_saved_funcs(other);
+	btf_encoder__add_saved_funcs(other, conf_load);
 
 	for (i = 0; i < nr_var_secinfo; i++) {
 		vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i;
@@ -881,7 +881,7 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 	return 0;
 }
 
-static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn)
+static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct function *fn, struct conf_load *conf_load)
 {
 	int btf_fnproto_id, btf_fn_id, tag_type_id;
 	struct llvm_annotation *annot;
@@ -903,10 +903,26 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
 			return -1;
 		}
 	}
+	if (conf_load->btf_gen_func_addr) {
+		if (fn->low_pc) {
+			char addr_tag[64];
+
+			snprintf(addr_tag, sizeof(addr_tag), "address=0x%lx", fn->low_pc);
+			tag_type_id = btf_encoder__add_decl_tag(encoder, addr_tag, btf_fn_id, -1);
+			if (tag_type_id < 0) {
+				fprintf(stderr, "error: failed to encode tag '%s' to func %s\n",
+					addr_tag, name);
+				return -1;
+			}
+		} else {
+			if (encoder->verbose)
+				printf("no addr info for func '%s'\n", name);
+		}
+	}
 	return 0;
 }
 
-static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder)
+static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder, struct conf_load *conf_load)
 {
 	int i;
 
@@ -956,7 +972,7 @@ static void btf_encoder__add_saved_funcs(struct btf_encoder *encoder)
 			}
 		} else {
 			encoder->type_id_off = func->state.type_id_off;
-			btf_encoder__add_func(encoder, fn);
+			btf_encoder__add_func(encoder, fn, conf_load);
 		}
 		fn->proto.processed = 1;
 	}
@@ -1356,12 +1372,12 @@ out:
 	return err;
 }
 
-int btf_encoder__encode(struct btf_encoder *encoder)
+int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf_load)
 {
 	int err;
 
 	/* for single-threaded case, saved funcs are added here */
-	btf_encoder__add_saved_funcs(encoder);
+	btf_encoder__add_saved_funcs(encoder, conf_load);
 
 	if (gobuffer__size(&encoder->percpu_secinfo) != 0)
 		btf_encoder__add_datasec(encoder, PERCPU_SECTION);
@@ -1871,7 +1887,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 		if (save)
 			err = btf_encoder__save_func(encoder, fn, func);
 		else
-			err = btf_encoder__add_func(encoder, fn);
+			err = btf_encoder__add_func(encoder, fn, conf_load);
 		if (err)
 			goto out;
 	}
diff --git a/btf_encoder.h b/btf_encoder.h
index 34516bb..b5440cd 100644
--- a/btf_encoder.h
+++ b/btf_encoder.h
@@ -19,12 +19,12 @@ struct list_head;
 struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool skip_encoding_vars, bool force, bool gen_floats, bool verbose);
 void btf_encoder__delete(struct btf_encoder *encoder);
 
-int btf_encoder__encode(struct btf_encoder *encoder);
+int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf_load);
 
 int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
 
 struct btf *btf_encoder__btf(struct btf_encoder *encoder);
 
-int btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other);
+int btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other, struct conf_load *conf_load);
 
 #endif /* _BTF_ENCODER_H_ */
diff --git a/dwarves.h b/dwarves.h
index 9cf13dd..8486cfc 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -68,6 +68,7 @@ struct conf_load {
 	bool			skip_encoding_btf_enum64;
 	bool			btf_gen_optimized;
 	bool			skip_encoding_btf_inconsistent_proto;
+	bool			btf_gen_func_addr;
 	uint8_t			hashtable_bits;
 	uint8_t			max_hashtable_bits;
 	uint16_t		kabi_prefix_len;
diff --git a/pahole.c b/pahole.c
index 6fc4ed6..b799ea6 100644
--- a/pahole.c
+++ b/pahole.c
@@ -1232,6 +1232,7 @@ ARGP_PROGRAM_VERSION_HOOK_DEF = dwarves_print_version;
 #define ARGP_skip_emitting_atomic_typedefs 338
 #define ARGP_btf_gen_optimized  339
 #define ARGP_skip_encoding_btf_inconsistent_proto 340
+#define ARGP_btf_gen_func_addr 341
 
 static const struct argp_option pahole__options[] = {
 	{
@@ -1654,6 +1655,11 @@ static const struct argp_option pahole__options[] = {
 		.key = ARGP_skip_encoding_btf_inconsistent_proto,
 		.doc = "Skip functions that have multiple inconsistent function prototypes sharing the same name, or that use unexpected registers for parameter values."
 	},
+	{
+		.name = "btf_gen_func_addr",
+		.key = ARGP_btf_gen_func_addr,
+		.doc = "Encode function addresses as a BTF declaration tag pointing at the function; this allows easier matching of BTF function descriptions to kernel function addresses."
+	},
 	{
 		.name = NULL,
 	}
@@ -1829,6 +1835,8 @@ static error_t pahole__options_parser(int key, char *arg,
 		conf_load.btf_gen_optimized = true;		break;
 	case ARGP_skip_encoding_btf_inconsistent_proto:
 		conf_load.skip_encoding_btf_inconsistent_proto = true; break;
+	case ARGP_btf_gen_func_addr:
+		conf_load.btf_gen_func_addr = true;		break;
 	default:
 		return ARGP_ERR_UNKNOWN;
 	}
@@ -3009,7 +3017,7 @@ static int pahole_threads_collect(struct conf_load *conf, int nr_threads, void *
                 */
 		if (!threads[i]->btf)
 			continue;
-		err = btf_encoder__add_encoder(btf_encoder, threads[i]->encoder);
+		err = btf_encoder__add_encoder(btf_encoder, threads[i]->encoder, conf);
 		if (err < 0)
 			goto out;
 	}
@@ -3567,7 +3575,7 @@ try_sole_arg_as_class_names:
 	header = NULL;
 
 	if (btf_encode && btf_encoder) { // maybe all CUs were filtered out and thus we don't have an encoder?
-		err = btf_encoder__encode(btf_encoder);
+		err = btf_encoder__encode(btf_encoder, &conf_load);
 		if (err) {
 			fputs("Failed to encode BTF\n", stderr);
 			goto out_cus_delete;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
                   ` (3 preceding siblings ...)
  2023-05-17 16:16 ` [RFC dwarves 4/6] btf_encoder: add "addr=0x<addr>" function declaration tag if --btf_gen_func_addr specified Alan Maguire
@ 2023-05-17 16:16 ` Alan Maguire
  2023-05-18  8:39   ` Jiri Olsa
  2023-05-17 16:16 ` [RFC dwarves 6/6] pahole: document --btf_gen_func_addr Alan Maguire
  2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
  6 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

By making sorting function for our ELF function list match on
both name and function, we ensure that the set of ELF functions
includes multiple copies for functions which have multiple instances
across CUs.  For example, cpumask_weight has 22 instances in
System.map/kallsyms:

ffffffff8103b530 t cpumask_weight
ffffffff8103e300 t cpumask_weight
ffffffff81040d30 t cpumask_weight
ffffffff8104fa00 t cpumask_weight
ffffffff81064300 t cpumask_weight
ffffffff81082ba0 t cpumask_weight
ffffffff81084f50 t cpumask_weight
ffffffff810a4ad0 t cpumask_weight
ffffffff810bb740 t cpumask_weight
ffffffff8110a6c0 t cpumask_weight
ffffffff81118ab0 t cpumask_weight
ffffffff81129b50 t cpumask_weight
ffffffff81137dc0 t cpumask_weight
ffffffff811aead0 t cpumask_weight
ffffffff811d6800 t cpumask_weight
ffffffff811e1370 t cpumask_weight
ffffffff812fae80 t cpumask_weight
ffffffff81375c50 t cpumask_weight
ffffffff81634b60 t cpumask_weight
ffffffff817ba540 t cpumask_weight
ffffffff819abf30 t cpumask_weight
ffffffff81a7cb60 t cpumask_weight

With ELF representations for each address, and DWARF info about
addresses (low_pc) we can match DWARF with ELF accurately.
The result for the BTF representation is that we end up with
a single de-duped function:

[9287] FUNC 'cpumask_weight' type_id=9286 linkage=static

...and 22 DECL_TAGs for each address that point at it:

9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
[9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
[9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
[11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
[13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
[15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
[15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
[18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
[20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
[25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
[25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
[26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
[27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
[32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
[35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
[35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
[48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
[54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
[87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
[108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
[132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
[143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1

Consider another case where the same name - wakeup_show() - is
used for two different function signatures:

From kernel/irq/irqdesc.c

static ssize_t wakeup_show(struct kobject *kobj,
 			   struct kobj_attribute *attr, char *buf)

...and from drivers/base/power/sysfs.c

static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
                           char *buf);

We see both defined in BTF, along with the addresses that
tell us which is which:

[28472] FUNC 'wakeup_show' type_id=11214 linkage=static

specifies

[11214] FUNC_PROTO '(anon)' ret_type_id=76 vlen=3
        'kobj' type_id=877
        'attr' type_id=11200
        'buf' type_id=56

...and has declaration tag

[28473] DECL_TAG 'address=0xffffffff8115eab0' type_id=28472 component_idx=-1

which identifies

ffffffff8115eab0 t wakeup_show

...as the function with the first signature.

Similarly,

[114375] FUNC 'wakeup_show' type_id=4750 linkage=static

[4750] FUNC_PROTO '(anon)' ret_type_id=76 vlen=3
        'dev' type_id=1488
        'attr' type_id=3909
        'buf' type_id=56
...and

[114376] DECL_TAG 'address=0xffffffff8181eac0' type_id=114375 component_idx=-1

...tell us that

ffffffff8181eac0 t wakeup_show

...has the second signature.  So we can accommodate multiple
functions with conflicting signatures in BTF, since we have
added extra info to map from function description in BTF
to address.

In total for vmlinux 52006 DECL_TAGs are added, and these add
2MB to the BTF representation.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 btf_encoder.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 3bd0fe0..315053d 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -988,13 +988,25 @@ static int functions_cmp(const void *_a, const void *_b)
 {
 	const struct elf_function *a = _a;
 	const struct elf_function *b = _b;
+	int ret;
 
 	/* if search key allows prefix match, verify target has matching
 	 * prefix len and prefix matches.
 	 */
 	if (a->prefixlen && a->prefixlen == b->prefixlen)
-		return strncmp(a->name, b->name, b->prefixlen);
-	return strcmp(a->name, b->name);
+		ret = strncmp(a->name, b->name, b->prefixlen);
+	else
+		ret = strcmp(a->name, b->name);
+
+	if (ret || !b->addr)
+		return ret;
+
+	/* secondarily sort/search by address. */
+	if (a->addr < b->addr)
+		return -1;
+	if (a->addr > b->addr)
+		return 1;
+	return 0;
 }
 
 #ifndef max
@@ -1044,9 +1056,11 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
 }
 
 static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
-						       const char *name, size_t prefixlen)
+						       struct function *fn, size_t prefixlen)
 {
-	struct elf_function key = { .name = name, .prefixlen = prefixlen };
+	struct elf_function key = { .name = function__name(fn),
+				    .addr = fn->low_pc,
+				    .prefixlen = prefixlen };
 
 	return bsearch(&key, encoder->functions.entries, encoder->functions.cnt, sizeof(key), functions_cmp);
 }
@@ -1846,7 +1860,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				continue;
 
 			/* prefer exact function name match... */
-			func = btf_encoder__find_function(encoder, name, 0);
+			func = btf_encoder__find_function(encoder, fn, 0);
 			if (func) {
 				if (func->generated)
 					continue;
@@ -1863,7 +1877,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 				 * it does not have optimized-out parameters
 				 * in any cu.
 				 */
-				func = btf_encoder__find_function(encoder, name,
+				func = btf_encoder__find_function(encoder, fn,
 								  strlen(name));
 				if (func) {
 					save = true;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC dwarves 6/6] pahole: document --btf_gen_func_addr
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
                   ` (4 preceding siblings ...)
  2023-05-17 16:16 ` [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address Alan Maguire
@ 2023-05-17 16:16 ` Alan Maguire
  2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
  6 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-17 16:16 UTC (permalink / raw)
  To: acme, ast, jolsa, yhs, andrii
  Cc: daniel, laoar.shao, martin.lau, song, john.fastabend, kpsingh,
	sdf, haoluo, bpf, Alan Maguire

Document how we can generate declaration tags specifying
function address.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 man-pages/pahole.1 | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
index c1b48de..aa0a3a4 100644
--- a/man-pages/pahole.1
+++ b/man-pages/pahole.1
@@ -265,6 +265,14 @@ to "/sys/kernel/btf/vmlinux".
 Allow producing BTF_KIND_FLOAT entries in systems where the vmlinux DWARF
 information has float types.
 
+.TP
+.B \-\-btf_gen_func_addr
+Generate DECL_TAG references that specify "address=0x<address of function>"
+tags for each instance of a function.
+This allows us to map from a function description in BTF to instances of its use
+in code;
+this is helpful if a function has multiple static definitions in different CUs that are incompatible.
+
 .TP
 .B \-\-btf_gen_optimized
 Generate BTF for functions with optimization-related suffixes (.isra, .constprop).
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-17 16:16 ` [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address Alan Maguire
@ 2023-05-18  8:39   ` Jiri Olsa
  2023-05-18 13:23     ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2023-05-18  8:39 UTC (permalink / raw)
  To: Alan Maguire
  Cc: acme, ast, yhs, andrii, daniel, laoar.shao, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf

On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
> By making sorting function for our ELF function list match on
> both name and function, we ensure that the set of ELF functions
> includes multiple copies for functions which have multiple instances
> across CUs.  For example, cpumask_weight has 22 instances in
> System.map/kallsyms:
> 
> ffffffff8103b530 t cpumask_weight
> ffffffff8103e300 t cpumask_weight
> ffffffff81040d30 t cpumask_weight
> ffffffff8104fa00 t cpumask_weight
> ffffffff81064300 t cpumask_weight
> ffffffff81082ba0 t cpumask_weight
> ffffffff81084f50 t cpumask_weight
> ffffffff810a4ad0 t cpumask_weight
> ffffffff810bb740 t cpumask_weight
> ffffffff8110a6c0 t cpumask_weight
> ffffffff81118ab0 t cpumask_weight
> ffffffff81129b50 t cpumask_weight
> ffffffff81137dc0 t cpumask_weight
> ffffffff811aead0 t cpumask_weight
> ffffffff811d6800 t cpumask_weight
> ffffffff811e1370 t cpumask_weight
> ffffffff812fae80 t cpumask_weight
> ffffffff81375c50 t cpumask_weight
> ffffffff81634b60 t cpumask_weight
> ffffffff817ba540 t cpumask_weight
> ffffffff819abf30 t cpumask_weight
> ffffffff81a7cb60 t cpumask_weight
> 
> With ELF representations for each address, and DWARF info about
> addresses (low_pc) we can match DWARF with ELF accurately.
> The result for the BTF representation is that we end up with
> a single de-duped function:
> 
> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
> 
> ...and 22 DECL_TAGs for each address that point at it:
> 
> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1

right, Yonghong pointed this out in:
  https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/

it's problem, because we pass btf id as attach id during bpf program load,
and kernel does not have a way to figure out which address from the associated
DECL_TAGs to use

if we could change dedup algo to take the function address into account and
make it not de-duplicate equal functions with different addresses, then we
could:

  - find btf id that properly and uniquely identifies the function we
    want to trace

  - store the vmlinux base address and treat stored function addresses as
    offsets, so the verifier can get proper address even if the kernel
    is relocated

    or

  - add support for kernel's kallsyms to return address for given btf id,
    I plan to check on this one

jirka

> 
> Consider another case where the same name - wakeup_show() - is
> used for two different function signatures:
> 
> From kernel/irq/irqdesc.c
> 
> static ssize_t wakeup_show(struct kobject *kobj,
>  			   struct kobj_attribute *attr, char *buf)
> 
> ...and from drivers/base/power/sysfs.c
> 
> static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>                            char *buf);
> 
> We see both defined in BTF, along with the addresses that
> tell us which is which:
> 
> [28472] FUNC 'wakeup_show' type_id=11214 linkage=static
> 
> specifies
> 
> [11214] FUNC_PROTO '(anon)' ret_type_id=76 vlen=3
>         'kobj' type_id=877
>         'attr' type_id=11200
>         'buf' type_id=56
> 
> ...and has declaration tag
> 
> [28473] DECL_TAG 'address=0xffffffff8115eab0' type_id=28472 component_idx=-1
> 
> which identifies
> 
> ffffffff8115eab0 t wakeup_show
> 
> ...as the function with the first signature.
> 
> Similarly,
> 
> [114375] FUNC 'wakeup_show' type_id=4750 linkage=static
> 
> [4750] FUNC_PROTO '(anon)' ret_type_id=76 vlen=3
>         'dev' type_id=1488
>         'attr' type_id=3909
>         'buf' type_id=56
> ...and
> 
> [114376] DECL_TAG 'address=0xffffffff8181eac0' type_id=114375 component_idx=-1
> 
> ...tell us that
> 
> ffffffff8181eac0 t wakeup_show
> 
> ...has the second signature.  So we can accommodate multiple
> functions with conflicting signatures in BTF, since we have
> added extra info to map from function description in BTF
> to address.
> 
> In total for vmlinux 52006 DECL_TAGs are added, and these add
> 2MB to the BTF representation.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  btf_encoder.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 3bd0fe0..315053d 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -988,13 +988,25 @@ static int functions_cmp(const void *_a, const void *_b)
>  {
>  	const struct elf_function *a = _a;
>  	const struct elf_function *b = _b;
> +	int ret;
>  
>  	/* if search key allows prefix match, verify target has matching
>  	 * prefix len and prefix matches.
>  	 */
>  	if (a->prefixlen && a->prefixlen == b->prefixlen)
> -		return strncmp(a->name, b->name, b->prefixlen);
> -	return strcmp(a->name, b->name);
> +		ret = strncmp(a->name, b->name, b->prefixlen);
> +	else
> +		ret = strcmp(a->name, b->name);
> +
> +	if (ret || !b->addr)
> +		return ret;
> +
> +	/* secondarily sort/search by address. */
> +	if (a->addr < b->addr)
> +		return -1;
> +	if (a->addr > b->addr)
> +		return 1;
> +	return 0;
>  }
>  
>  #ifndef max
> @@ -1044,9 +1056,11 @@ static int btf_encoder__collect_function(struct btf_encoder *encoder, GElf_Sym *
>  }
>  
>  static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder,
> -						       const char *name, size_t prefixlen)
> +						       struct function *fn, size_t prefixlen)
>  {
> -	struct elf_function key = { .name = name, .prefixlen = prefixlen };
> +	struct elf_function key = { .name = function__name(fn),
> +				    .addr = fn->low_pc,
> +				    .prefixlen = prefixlen };
>  
>  	return bsearch(&key, encoder->functions.entries, encoder->functions.cnt, sizeof(key), functions_cmp);
>  }
> @@ -1846,7 +1860,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  				continue;
>  
>  			/* prefer exact function name match... */
> -			func = btf_encoder__find_function(encoder, name, 0);
> +			func = btf_encoder__find_function(encoder, fn, 0);
>  			if (func) {
>  				if (func->generated)
>  					continue;
> @@ -1863,7 +1877,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  				 * it does not have optimized-out parameters
>  				 * in any cu.
>  				 */
> -				func = btf_encoder__find_function(encoder, name,
> +				func = btf_encoder__find_function(encoder, fn,
>  								  strlen(name));
>  				if (func) {
>  					save = true;
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-18  8:39   ` Jiri Olsa
@ 2023-05-18 13:23     ` Alan Maguire
  2023-05-18 15:20       ` Yonghong Song
  2023-05-18 16:22       ` Jiri Olsa
  0 siblings, 2 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-18 13:23 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, ast, yhs, andrii, daniel, laoar.shao, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf

On 18/05/2023 09:39, Jiri Olsa wrote:
> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>> By making sorting function for our ELF function list match on
>> both name and function, we ensure that the set of ELF functions
>> includes multiple copies for functions which have multiple instances
>> across CUs.  For example, cpumask_weight has 22 instances in
>> System.map/kallsyms:
>>
>> ffffffff8103b530 t cpumask_weight
>> ffffffff8103e300 t cpumask_weight
>> ffffffff81040d30 t cpumask_weight
>> ffffffff8104fa00 t cpumask_weight
>> ffffffff81064300 t cpumask_weight
>> ffffffff81082ba0 t cpumask_weight
>> ffffffff81084f50 t cpumask_weight
>> ffffffff810a4ad0 t cpumask_weight
>> ffffffff810bb740 t cpumask_weight
>> ffffffff8110a6c0 t cpumask_weight
>> ffffffff81118ab0 t cpumask_weight
>> ffffffff81129b50 t cpumask_weight
>> ffffffff81137dc0 t cpumask_weight
>> ffffffff811aead0 t cpumask_weight
>> ffffffff811d6800 t cpumask_weight
>> ffffffff811e1370 t cpumask_weight
>> ffffffff812fae80 t cpumask_weight
>> ffffffff81375c50 t cpumask_weight
>> ffffffff81634b60 t cpumask_weight
>> ffffffff817ba540 t cpumask_weight
>> ffffffff819abf30 t cpumask_weight
>> ffffffff81a7cb60 t cpumask_weight
>>
>> With ELF representations for each address, and DWARF info about
>> addresses (low_pc) we can match DWARF with ELF accurately.
>> The result for the BTF representation is that we end up with
>> a single de-duped function:
>>
>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>
>> ...and 22 DECL_TAGs for each address that point at it:
>>
>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1
> 
> right, Yonghong pointed this out in:
>   https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
> 
> it's problem, because we pass btf id as attach id during bpf program load,
> and kernel does not have a way to figure out which address from the associated
> DECL_TAGs to use
> 
> if we could change dedup algo to take the function address into account and
> make it not de-duplicate equal functions with different addresses, then we
> could:
> 
>   - find btf id that properly and uniquely identifies the function we
>     want to trace
> 

So maybe a more natural approach would be to extend BTF_KIND_FUNC
(I think Alexei suggested something this earlier but I could be
misremembering) as follows:


2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
-  * ``info.kind_flag``: 0
+  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` follows
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
                   or BTF_FUNC_EXTERN)
  * ``type``: a BTF_KIND_FUNC_PROTO type

- No additional type data follow ``btf_type``.
+ If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
+
+	struct btf_func {
+		__u64 addr;
+	};
+ Otherwise no additional type data follows ``btf_type``.


With the above, dedup could be made to fail when functions have non-
identical associated addresses. Judging by the number of DECL_TAGs in
the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
space for struct btf_funcs would require roughly 400k. We'd still get
dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB.



>   - store the vmlinux base address and treat stored function addresses as
>     offsets, so the verifier can get proper address even if the kernel
>     is relocated
>

yep; when we read kernel/module BTF in we could hopefully carry out
this recalculation and update the vmlinux/module BTF addresses
accordingly.

Thanks!

Alan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-18 13:23     ` Alan Maguire
@ 2023-05-18 15:20       ` Yonghong Song
  2023-05-18 16:22       ` Jiri Olsa
  1 sibling, 0 replies; 21+ messages in thread
From: Yonghong Song @ 2023-05-18 15:20 UTC (permalink / raw)
  To: Alan Maguire, Jiri Olsa
  Cc: acme, ast, yhs, andrii, daniel, laoar.shao, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf



On 5/18/23 6:23 AM, Alan Maguire wrote:
> On 18/05/2023 09:39, Jiri Olsa wrote:
>> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>>> By making sorting function for our ELF function list match on
>>> both name and function, we ensure that the set of ELF functions
>>> includes multiple copies for functions which have multiple instances
>>> across CUs.  For example, cpumask_weight has 22 instances in
>>> System.map/kallsyms:
>>>
>>> ffffffff8103b530 t cpumask_weight
>>> ffffffff8103e300 t cpumask_weight
>>> ffffffff81040d30 t cpumask_weight
>>> ffffffff8104fa00 t cpumask_weight
>>> ffffffff81064300 t cpumask_weight
>>> ffffffff81082ba0 t cpumask_weight
>>> ffffffff81084f50 t cpumask_weight
>>> ffffffff810a4ad0 t cpumask_weight
>>> ffffffff810bb740 t cpumask_weight
>>> ffffffff8110a6c0 t cpumask_weight
>>> ffffffff81118ab0 t cpumask_weight
>>> ffffffff81129b50 t cpumask_weight
>>> ffffffff81137dc0 t cpumask_weight
>>> ffffffff811aead0 t cpumask_weight
>>> ffffffff811d6800 t cpumask_weight
>>> ffffffff811e1370 t cpumask_weight
>>> ffffffff812fae80 t cpumask_weight
>>> ffffffff81375c50 t cpumask_weight
>>> ffffffff81634b60 t cpumask_weight
>>> ffffffff817ba540 t cpumask_weight
>>> ffffffff819abf30 t cpumask_weight
>>> ffffffff81a7cb60 t cpumask_weight
>>>
>>> With ELF representations for each address, and DWARF info about
>>> addresses (low_pc) we can match DWARF with ELF accurately.
>>> The result for the BTF representation is that we end up with
>>> a single de-duped function:
>>>
>>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>>
>>> ...and 22 DECL_TAGs for each address that point at it:
>>>
>>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
>>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
>>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
>>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
>>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
>>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
>>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
>>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
>>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
>>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
>>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
>>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
>>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
>>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
>>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
>>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
>>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
>>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
>>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
>>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
>>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
>>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1
>>
>> right, Yonghong pointed this out in:
>>    https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
>>
>> it's problem, because we pass btf id as attach id during bpf program load,
>> and kernel does not have a way to figure out which address from the associated
>> DECL_TAGs to use
>>
>> if we could change dedup algo to take the function address into account and
>> make it not de-duplicate equal functions with different addresses, then we
>> could:
>>
>>    - find btf id that properly and uniquely identifies the function we
>>      want to trace
>>
> 
> So maybe a more natural approach would be to extend BTF_KIND_FUNC
> (I think Alexei suggested something this earlier but I could be
> misremembering) as follows:
> 
> 
> 2.2.12 BTF_KIND_FUNC
> ~~~~~~~~~~~~~~~~~~~~
> 
> ``struct btf_type`` encoding requirement:
>    * ``name_off``: offset to a valid C identifier
> -  * ``info.kind_flag``: 0
> +  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` follows
>    * ``info.kind``: BTF_KIND_FUNC
>    * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
>                     or BTF_FUNC_EXTERN)
>    * ``type``: a BTF_KIND_FUNC_PROTO type
> 
> - No additional type data follow ``btf_type``.
> + If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
> +
> +	struct btf_func {
> +		__u64 addr;
> +	};
> + Otherwise no additional type data follows ``btf_type``.
> 
> 
> With the above, dedup could be made to fail when functions have non-
> identical associated addresses. Judging by the number of DECL_TAGs in
> the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
> space for struct btf_funcs would require roughly 400k. We'd still get
> dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB.

Agree that this seems a better idea to save space and also not impacting
existing dedup algorithm. The only weird thing is previously we use
KIND + option vlen to decide overall size of the type, but now we need
KIND + kflag to decide FUNC type size. But this should be okay.

We need to add an option to pahole to enable this feature and
in kernel to enable this option only if the kernel supports new format.

Do we want to add addresses to all functions in which case we will have
FUNC types with both kflag 0 and kflag 1? Do you imagine whether
we potentially need to encode other additional information to FUNC type?

> 
> 
> 
>>    - store the vmlinux base address and treat stored function addresses as
>>      offsets, so the verifier can get proper address even if the kernel
>>      is relocated
>>
> 
> yep; when we read kernel/module BTF in we could hopefully carry out
> this recalculation and update the vmlinux/module BTF addresses
> accordingly.
> 
> Thanks!
> 
> Alan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-18 13:23     ` Alan Maguire
  2023-05-18 15:20       ` Yonghong Song
@ 2023-05-18 16:22       ` Jiri Olsa
  2023-05-18 18:25         ` Yonghong Song
  1 sibling, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2023-05-18 16:22 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Jiri Olsa, acme, ast, yhs, andrii, daniel, laoar.shao,
	martin.lau, song, john.fastabend, kpsingh, sdf, haoluo, bpf

On Thu, May 18, 2023 at 02:23:34PM +0100, Alan Maguire wrote:
> On 18/05/2023 09:39, Jiri Olsa wrote:
> > On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
> >> By making sorting function for our ELF function list match on
> >> both name and function, we ensure that the set of ELF functions
> >> includes multiple copies for functions which have multiple instances
> >> across CUs.  For example, cpumask_weight has 22 instances in
> >> System.map/kallsyms:
> >>
> >> ffffffff8103b530 t cpumask_weight
> >> ffffffff8103e300 t cpumask_weight
> >> ffffffff81040d30 t cpumask_weight
> >> ffffffff8104fa00 t cpumask_weight
> >> ffffffff81064300 t cpumask_weight
> >> ffffffff81082ba0 t cpumask_weight
> >> ffffffff81084f50 t cpumask_weight
> >> ffffffff810a4ad0 t cpumask_weight
> >> ffffffff810bb740 t cpumask_weight
> >> ffffffff8110a6c0 t cpumask_weight
> >> ffffffff81118ab0 t cpumask_weight
> >> ffffffff81129b50 t cpumask_weight
> >> ffffffff81137dc0 t cpumask_weight
> >> ffffffff811aead0 t cpumask_weight
> >> ffffffff811d6800 t cpumask_weight
> >> ffffffff811e1370 t cpumask_weight
> >> ffffffff812fae80 t cpumask_weight
> >> ffffffff81375c50 t cpumask_weight
> >> ffffffff81634b60 t cpumask_weight
> >> ffffffff817ba540 t cpumask_weight
> >> ffffffff819abf30 t cpumask_weight
> >> ffffffff81a7cb60 t cpumask_weight
> >>
> >> With ELF representations for each address, and DWARF info about
> >> addresses (low_pc) we can match DWARF with ELF accurately.
> >> The result for the BTF representation is that we end up with
> >> a single de-duped function:
> >>
> >> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
> >>
> >> ...and 22 DECL_TAGs for each address that point at it:
> >>
> >> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
> >> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
> >> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
> >> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
> >> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
> >> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
> >> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
> >> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
> >> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
> >> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
> >> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
> >> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
> >> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
> >> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
> >> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
> >> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
> >> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
> >> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
> >> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
> >> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
> >> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
> >> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1
> > 
> > right, Yonghong pointed this out in:
> >   https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
> > 
> > it's problem, because we pass btf id as attach id during bpf program load,
> > and kernel does not have a way to figure out which address from the associated
> > DECL_TAGs to use
> > 
> > if we could change dedup algo to take the function address into account and
> > make it not de-duplicate equal functions with different addresses, then we
> > could:
> > 
> >   - find btf id that properly and uniquely identifies the function we
> >     want to trace
> > 
> 
> So maybe a more natural approach would be to extend BTF_KIND_FUNC
> (I think Alexei suggested something this earlier but I could be
> misremembering) as follows:
> 
> 
> 2.2.12 BTF_KIND_FUNC
> ~~~~~~~~~~~~~~~~~~~~
> 
> ``struct btf_type`` encoding requirement:
>   * ``name_off``: offset to a valid C identifier
> -  * ``info.kind_flag``: 0
> +  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` follows
>   * ``info.kind``: BTF_KIND_FUNC
>   * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
>                    or BTF_FUNC_EXTERN)
>   * ``type``: a BTF_KIND_FUNC_PROTO type
> 
> - No additional type data follow ``btf_type``.
> + If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
> +
> +	struct btf_func {
> +		__u64 addr;
> +	};
> + Otherwise no additional type data follows ``btf_type``.
> 
> 
> With the above, dedup could be made to fail when functions have non-
> identical associated addresses. Judging by the number of DECL_TAGs in
> the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
> space for struct btf_funcs would require roughly 400k. We'd still get
> dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB.

nice, I think it's better solution

> 
> 
> 
> >   - store the vmlinux base address and treat stored function addresses as
> >     offsets, so the verifier can get proper address even if the kernel
> >     is relocated
> >
> 
> yep; when we read kernel/module BTF in we could hopefully carry out
> this recalculation and update the vmlinux/module BTF addresses
> accordingly.

I wonder now when the address will be stored as number (not string) we
could somehow generate relocation records and have the module loader
do the relocation automatically

not sure how that works for vmlinux when it's loaded/relocated on boot

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-18 16:22       ` Jiri Olsa
@ 2023-05-18 18:25         ` Yonghong Song
  2023-05-19  0:23           ` Yonghong Song
  2023-05-19  0:26           ` Alexei Starovoitov
  0 siblings, 2 replies; 21+ messages in thread
From: Yonghong Song @ 2023-05-18 18:25 UTC (permalink / raw)
  To: Jiri Olsa, Alan Maguire
  Cc: acme, ast, yhs, andrii, daniel, laoar.shao, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf



On 5/18/23 9:22 AM, Jiri Olsa wrote:
> On Thu, May 18, 2023 at 02:23:34PM +0100, Alan Maguire wrote:
>> On 18/05/2023 09:39, Jiri Olsa wrote:
>>> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>>>> By making sorting function for our ELF function list match on
>>>> both name and function, we ensure that the set of ELF functions
>>>> includes multiple copies for functions which have multiple instances
>>>> across CUs.  For example, cpumask_weight has 22 instances in
>>>> System.map/kallsyms:
>>>>
>>>> ffffffff8103b530 t cpumask_weight
>>>> ffffffff8103e300 t cpumask_weight
>>>> ffffffff81040d30 t cpumask_weight
>>>> ffffffff8104fa00 t cpumask_weight
>>>> ffffffff81064300 t cpumask_weight
>>>> ffffffff81082ba0 t cpumask_weight
>>>> ffffffff81084f50 t cpumask_weight
>>>> ffffffff810a4ad0 t cpumask_weight
>>>> ffffffff810bb740 t cpumask_weight
>>>> ffffffff8110a6c0 t cpumask_weight
>>>> ffffffff81118ab0 t cpumask_weight
>>>> ffffffff81129b50 t cpumask_weight
>>>> ffffffff81137dc0 t cpumask_weight
>>>> ffffffff811aead0 t cpumask_weight
>>>> ffffffff811d6800 t cpumask_weight
>>>> ffffffff811e1370 t cpumask_weight
>>>> ffffffff812fae80 t cpumask_weight
>>>> ffffffff81375c50 t cpumask_weight
>>>> ffffffff81634b60 t cpumask_weight
>>>> ffffffff817ba540 t cpumask_weight
>>>> ffffffff819abf30 t cpumask_weight
>>>> ffffffff81a7cb60 t cpumask_weight
>>>>
>>>> With ELF representations for each address, and DWARF info about
>>>> addresses (low_pc) we can match DWARF with ELF accurately.
>>>> The result for the BTF representation is that we end up with
>>>> a single de-duped function:
>>>>
>>>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>>>
>>>> ...and 22 DECL_TAGs for each address that point at it:
>>>>
>>>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
>>>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
>>>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
>>>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
>>>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
>>>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
>>>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
>>>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
>>>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
>>>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
>>>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
>>>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
>>>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
>>>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
>>>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
>>>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
>>>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
>>>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
>>>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
>>>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
>>>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
>>>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1
>>>
>>> right, Yonghong pointed this out in:
>>>    https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
>>>
>>> it's problem, because we pass btf id as attach id during bpf program load,
>>> and kernel does not have a way to figure out which address from the associated
>>> DECL_TAGs to use
>>>
>>> if we could change dedup algo to take the function address into account and
>>> make it not de-duplicate equal functions with different addresses, then we
>>> could:
>>>
>>>    - find btf id that properly and uniquely identifies the function we
>>>      want to trace
>>>
>>
>> So maybe a more natural approach would be to extend BTF_KIND_FUNC
>> (I think Alexei suggested something this earlier but I could be
>> misremembering) as follows:
>>
>>
>> 2.2.12 BTF_KIND_FUNC
>> ~~~~~~~~~~~~~~~~~~~~
>>
>> ``struct btf_type`` encoding requirement:
>>    * ``name_off``: offset to a valid C identifier
>> -  * ``info.kind_flag``: 0
>> +  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` follows
>>    * ``info.kind``: BTF_KIND_FUNC
>>    * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
>>                     or BTF_FUNC_EXTERN)
>>    * ``type``: a BTF_KIND_FUNC_PROTO type
>>
>> - No additional type data follow ``btf_type``.
>> + If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
>> +
>> +	struct btf_func {
>> +		__u64 addr;
>> +	};
>> + Otherwise no additional type data follows ``btf_type``.
>>
>>
>> With the above, dedup could be made to fail when functions have non-
>> identical associated addresses. Judging by the number of DECL_TAGs in
>> the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
>> space for struct btf_funcs would require roughly 400k. We'd still get
>> dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB
> nice, I think it's better solution
> 
>>
>>
>>
>>>    - store the vmlinux base address and treat stored function addresses as
>>>      offsets, so the verifier can get proper address even if the kernel
>>>      is relocated
>>>
>>
>> yep; when we read kernel/module BTF in we could hopefully carry out
>> this recalculation and update the vmlinux/module BTF addresses
>> accordingly.
> 
> I wonder now when the address will be stored as number (not string) we
> could somehow generate relocation records and have the module loader
> do the relocation automatically
> 
> not sure how that works for vmlinux when it's loaded/relocated on boot

Right, actual module address will mostly not match the one in dwarf.
Some during module btf load, we should modify btf address as well
for later use? Yes, may need to reuse some routines used in initial
module relocation.

> 
> jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-18 18:25         ` Yonghong Song
@ 2023-05-19  0:23           ` Yonghong Song
  2023-05-19  0:26           ` Alexei Starovoitov
  1 sibling, 0 replies; 21+ messages in thread
From: Yonghong Song @ 2023-05-19  0:23 UTC (permalink / raw)
  To: Jiri Olsa, Alan Maguire
  Cc: acme, ast, yhs, andrii, daniel, laoar.shao, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf



On 5/18/23 11:25 AM, Yonghong Song wrote:
> 
> 
> On 5/18/23 9:22 AM, Jiri Olsa wrote:
>> On Thu, May 18, 2023 at 02:23:34PM +0100, Alan Maguire wrote:
>>> On 18/05/2023 09:39, Jiri Olsa wrote:
>>>> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>>>>> By making sorting function for our ELF function list match on
>>>>> both name and function, we ensure that the set of ELF functions
>>>>> includes multiple copies for functions which have multiple instances
>>>>> across CUs.  For example, cpumask_weight has 22 instances in
>>>>> System.map/kallsyms:
>>>>>
>>>>> ffffffff8103b530 t cpumask_weight
>>>>> ffffffff8103e300 t cpumask_weight
>>>>> ffffffff81040d30 t cpumask_weight
>>>>> ffffffff8104fa00 t cpumask_weight
>>>>> ffffffff81064300 t cpumask_weight
>>>>> ffffffff81082ba0 t cpumask_weight
>>>>> ffffffff81084f50 t cpumask_weight
>>>>> ffffffff810a4ad0 t cpumask_weight
>>>>> ffffffff810bb740 t cpumask_weight
>>>>> ffffffff8110a6c0 t cpumask_weight
>>>>> ffffffff81118ab0 t cpumask_weight
>>>>> ffffffff81129b50 t cpumask_weight
>>>>> ffffffff81137dc0 t cpumask_weight
>>>>> ffffffff811aead0 t cpumask_weight
>>>>> ffffffff811d6800 t cpumask_weight
>>>>> ffffffff811e1370 t cpumask_weight
>>>>> ffffffff812fae80 t cpumask_weight
>>>>> ffffffff81375c50 t cpumask_weight
>>>>> ffffffff81634b60 t cpumask_weight
>>>>> ffffffff817ba540 t cpumask_weight
>>>>> ffffffff819abf30 t cpumask_weight
>>>>> ffffffff81a7cb60 t cpumask_weight
>>>>>
>>>>> With ELF representations for each address, and DWARF info about
>>>>> addresses (low_pc) we can match DWARF with ELF accurately.
>>>>> The result for the BTF representation is that we end up with
>>>>> a single de-duped function:
>>>>>
>>>>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>>>>
>>>>> ...and 22 DECL_TAGs for each address that point at it:
>>>>>
>>>>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 
>>>>> component_idx=-1
>>>>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 
>>>>> component_idx=-1
>>>>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 
>>>>> component_idx=-1
>>>>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 
>>>>> component_idx=-1
>>>>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 
>>>>> component_idx=-1
>>>>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 
>>>>> component_idx=-1
>>>>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 
>>>>> component_idx=-1
>>>>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 
>>>>> component_idx=-1
>>>>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 
>>>>> component_idx=-1
>>>>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 
>>>>> component_idx=-1
>>>>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 
>>>>> component_idx=-1
>>>>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 
>>>>> component_idx=-1
>>>>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 
>>>>> component_idx=-1
>>>>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 
>>>>> component_idx=-1
>>>>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 
>>>>> component_idx=-1
>>>>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 
>>>>> component_idx=-1
>>>>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 
>>>>> component_idx=-1
>>>>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 
>>>>> component_idx=-1
>>>>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 
>>>>> component_idx=-1
>>>>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 
>>>>> component_idx=-1
>>>>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 
>>>>> component_idx=-1
>>>>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 
>>>>> component_idx=-1
>>>>
>>>> right, Yonghong pointed this out in:
>>>>    
>>>> https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@meta.com/
>>>>
>>>> it's problem, because we pass btf id as attach id during bpf program 
>>>> load,
>>>> and kernel does not have a way to figure out which address from the 
>>>> associated
>>>> DECL_TAGs to use
>>>>
>>>> if we could change dedup algo to take the function address into 
>>>> account and
>>>> make it not de-duplicate equal functions with different addresses, 
>>>> then we
>>>> could:
>>>>
>>>>    - find btf id that properly and uniquely identifies the function we
>>>>      want to trace
>>>>
>>>
>>> So maybe a more natural approach would be to extend BTF_KIND_FUNC
>>> (I think Alexei suggested something this earlier but I could be
>>> misremembering) as follows:
>>>
>>>
>>> 2.2.12 BTF_KIND_FUNC
>>> ~~~~~~~~~~~~~~~~~~~~
>>>
>>> ``struct btf_type`` encoding requirement:
>>>    * ``name_off``: offset to a valid C identifier
>>> -  * ``info.kind_flag``: 0
>>> +  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` 
>>> follows
>>>    * ``info.kind``: BTF_KIND_FUNC
>>>    * ``info.vlen``: linkage information (BTF_FUNC_STATIC, 
>>> BTF_FUNC_GLOBAL
>>>                     or BTF_FUNC_EXTERN)
>>>    * ``type``: a BTF_KIND_FUNC_PROTO type
>>>
>>> - No additional type data follow ``btf_type``.
>>> + If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
>>> +
>>> +    struct btf_func {
>>> +        __u64 addr;
>>> +    };
>>> + Otherwise no additional type data follows ``btf_type``.
>>>
>>>
>>> With the above, dedup could be made to fail when functions have non-
>>> identical associated addresses. Judging by the number of DECL_TAGs in
>>> the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
>>> space for struct btf_funcs would require roughly 400k. We'd still get
>>> dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB
>> nice, I think it's better solution
>>
>>>
>>>
>>>
>>>>    - store the vmlinux base address and treat stored function 
>>>> addresses as
>>>>      offsets, so the verifier can get proper address even if the kernel
>>>>      is relocated
>>>>
>>>
>>> yep; when we read kernel/module BTF in we could hopefully carry out
>>> this recalculation and update the vmlinux/module BTF addresses
>>> accordingly.
>>
>> I wonder now when the address will be stored as number (not string) we
>> could somehow generate relocation records and have the module loader
>> do the relocation automatically

This is an interesting idea if bpf subsystem does not want to do it...

>>
>> not sure how that works for vmlinux when it's loaded/relocated on boot

If no KASLR, address in the vmlinux dwarf should match the one in BTF
based on my experience. With KASLR, yes, relocation needs to be
done for those addresses in BTF as they won't match the actual func
address in the dwarf.

Do most distributions enable KASLR (CONFIG_RANDOMIZE_BASE)?

> 
> Right, actual module address will mostly not match the one in dwarf.
> Some during module btf load, we should modify btf address as well
> for later use? Yes, may need to reuse some routines used in initial
> module relocation.
> 
>>
>> jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-18 18:25         ` Yonghong Song
  2023-05-19  0:23           ` Yonghong Song
@ 2023-05-19  0:26           ` Alexei Starovoitov
  2023-05-22 21:31             ` Andrii Nakryiko
  1 sibling, 1 reply; 21+ messages in thread
From: Alexei Starovoitov @ 2023-05-19  0:26 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alan Maguire, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko,
	Daniel Borkmann, Yafang Shao, Martin KaFai Lau, Song Liu,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, bpf

On Thu, May 18, 2023 at 11:26 AM Yonghong Song <yhs@meta.com> wrote:
> > I wonder now when the address will be stored as number (not string) we
> > could somehow generate relocation records and have the module loader
> > do the relocation automatically
> >
> > not sure how that works for vmlinux when it's loaded/relocated on boot
>
> Right, actual module address will mostly not match the one in dwarf.
> Some during module btf load, we should modify btf address as well
> for later use? Yes, may need to reuse some routines used in initial
> module relocation.


Few thoughts:

Initially I felt that single FUNC with multiple DECL_TAG(addr)
is better, since BTF for all funcs is the same and it's likely
one static inline function that the compiler decided not to inline
(like cpumask_weight), so when libbpf wants to attach prog to it
the kernel should automatically attach in all places.
But then noticed that actually different functions with
the same name and proto will be deduplicated into one.
Their bodies at different locations will be different.
Example: seq_show.
In this case it's better to let libbpf pick the exact one to attach.
Then realized that even the same function like cpumask_weight()
might have different body at different locations due to optimizations.
I don't think dwarf contains enough info to distinguish all the combinations.

Considering all that it's better to keep one BTF kind_func -> one addr.
If it's extended the way Alan is proposing with kind_flag
the dedup logic will not combine them due to different addresses.

Also turned out that the kernel doesn't validate decl_tag string.
The following code loads without error:
__attribute__((btf_decl_tag("\x10\xf0")));

I'm not sure whether we want to tighten decl_tag validation and how.
If we keep it as-is we can use func+decl_tag approach
to add 4 bytes of addr in the binary format (if 1st byte is not zero).
But it feels like a hack, since the kernel needs to be changed
anyway to adjust the addresses after module loading and kernel relocation.
So func with kind_flag seems like the best approach.

Regarding relocation of address in the kernel and modules...
We just need to add base_addr to all addrs-es recorded in BTF.
Both for kernel and for module BTFs.
Shouldn't be too complicated.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs
  2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
                   ` (5 preceding siblings ...)
  2023-05-17 16:16 ` [RFC dwarves 6/6] pahole: document --btf_gen_func_addr Alan Maguire
@ 2023-05-19  9:44 ` Yafang Shao
  2023-05-19 14:46   ` Yonghong Song
  2023-05-19 15:08   ` Alan Maguire
  6 siblings, 2 replies; 21+ messages in thread
From: Yafang Shao @ 2023-05-19  9:44 UTC (permalink / raw)
  To: Alan Maguire
  Cc: acme, ast, jolsa, yhs, andrii, daniel, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf

On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> As a means to continue the discussion in [1], which is
> concerned with finding the best long-term solution to
> having a BPF Type Format (BTF) representation of
> functions that is usable for tracing of edge cases, this
> proof-of-concept series is intended to explore one approach
> to adding information to help make tracing more accurate.
>
> A key problem today is that there is no matching from function
> description to the actual instances of a function.
>
> When that function only has one description, that is
> not an issue, but if we have multiple inconsistent
> static functions in different CUs such as
>
> From kernel/irq/irqdesc.c
>
>     static ssize_t wakeup_show(struct kobject *kobj,
>                                struct kobj_attribute *attr, char *buf)
>
> ...and from drivers/base/power/sysfs.c
>
>     static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>                                char *buf);
>
> ...this becomes a problem.  If I am attaching,
> which do I want?  And even if I know which one
> I want, which instance in kallsyms is which?
>

As you described in the above example,  it is natural to attach a
*function* defined in a specific *file_path*.  So why not encoding the
file path instead ? What's the problem in it?

If we expose the addr and let the user select which address to attach,
it will be a trouble to deploy a bpf program across multiple kernels.
While the file path will have a lower chance to be conflict between
different kernel versions. So I think it would be better if we use the
file path instead and let the kernel find the address automatically.
In the old days, when we wanted to deploy a kprobe kernel module or a
systemtap script across multiple kernels, we had to use if-else in the
code, which was really troublesome as it is not scalable. I don't
think we want to do it the same way in the bpf program.

> This series is a proof-of-concept that supports encoding
> function addresses and associating them with BTF FUNC
> descriptions using BTF declaration tags.
>
> More work would need to be done on the kernel side
> to _use_ this representation, but hopefully having a
> rough approach outlined will help make that more feasible.
>
> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/
>
> Alan Maguire (6):
>   btf_encoder: record function address and if it is local
>   dwarf_loader: store address in function low_pc if available
>   dwarf_loader: transfer low_pc info from subtroutine to its abstract
>     origin
>   btf_encoder: add "addr=0x<addr>" function declaration tag if
>     --btf_gen_func_addr specified
>   btf_encoder: store ELF function representations sorted by name _and_
>     address
>   pahole: document --btf_gen_func_addr
>
>  btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
>  btf_encoder.h      |  4 +--
>  dwarf_loader.c     | 16 +++++++++---
>  dwarves.h          |  3 +++
>  man-pages/pahole.1 |  8 ++++++
>  pahole.c           | 12 +++++++--
>  6 files changed, 85 insertions(+), 22 deletions(-)
>
> --
> 2.31.1
>


-- 
Regards
Yafang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs
  2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
@ 2023-05-19 14:46   ` Yonghong Song
  2023-05-19 15:08   ` Alan Maguire
  1 sibling, 0 replies; 21+ messages in thread
From: Yonghong Song @ 2023-05-19 14:46 UTC (permalink / raw)
  To: Yafang Shao, Alan Maguire
  Cc: acme, ast, jolsa, yhs, andrii, daniel, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf



On 5/19/23 2:44 AM, Yafang Shao wrote:
> On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> As a means to continue the discussion in [1], which is
>> concerned with finding the best long-term solution to
>> having a BPF Type Format (BTF) representation of
>> functions that is usable for tracing of edge cases, this
>> proof-of-concept series is intended to explore one approach
>> to adding information to help make tracing more accurate.
>>
>> A key problem today is that there is no matching from function
>> description to the actual instances of a function.
>>
>> When that function only has one description, that is
>> not an issue, but if we have multiple inconsistent
>> static functions in different CUs such as
>>
>>  From kernel/irq/irqdesc.c
>>
>>      static ssize_t wakeup_show(struct kobject *kobj,
>>                                 struct kobj_attribute *attr, char *buf)
>>
>> ...and from drivers/base/power/sysfs.c
>>
>>      static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>>                                 char *buf);
>>
>> ...this becomes a problem.  If I am attaching,
>> which do I want?  And even if I know which one
>> I want, which instance in kallsyms is which?
>>
> 
> As you described in the above example,  it is natural to attach a
> *function* defined in a specific *file_path*.  So why not encoding the
> file path instead ? What's the problem in it?

Say we have the following kernel code:

common.h:
    static inline int foo(...) { ...}

bar1.c:
    #include "common.h."
    ... foo() ...
bar2.c:
    #include "common.h"
    ... foo() ...

Even if the function 'foo' is marked as inline, but the compiler
may not actually inline it. So now we got two static functions
'foo' in bar1.o and bar2.o with identical code path (common.h),
and we do want to differentiate it as user might want to
only trace one of them.

> 
> If we expose the addr and let the user select which address to attach,
> it will be a trouble to deploy a bpf program across multiple kernels.

user space may need a little bit work to decide which address to take
by looking at the vmlinux BTF intending to run, e.g., checking
BTF signatures in most time or in the worst case checking dwarf.

The kernel can then handle addr by doing some relocation if needed.

> While the file path will have a lower chance to be conflict between
> different kernel versions. So I think it would be better if we use the
> file path instead and let the kernel find the address automatically.
> In the old days, when we wanted to deploy a kprobe kernel module or a
> systemtap script across multiple kernels, we had to use if-else in the
> code, which was really troublesome as it is not scalable. I don't
> think we want to do it the same way in the bpf program.
> 
>> This series is a proof-of-concept that supports encoding
>> function addresses and associating them with BTF FUNC
>> descriptions using BTF declaration tags.
>>
>> More work would need to be done on the kernel side
>> to _use_ this representation, but hopefully having a
>> rough approach outlined will help make that more feasible.
>>
>> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/
>>
>> Alan Maguire (6):
>>    btf_encoder: record function address and if it is local
>>    dwarf_loader: store address in function low_pc if available
>>    dwarf_loader: transfer low_pc info from subtroutine to its abstract
>>      origin
>>    btf_encoder: add "addr=0x<addr>" function declaration tag if
>>      --btf_gen_func_addr specified
>>    btf_encoder: store ELF function representations sorted by name _and_
>>      address
>>    pahole: document --btf_gen_func_addr
>>
>>   btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
>>   btf_encoder.h      |  4 +--
>>   dwarf_loader.c     | 16 +++++++++---
>>   dwarves.h          |  3 +++
>>   man-pages/pahole.1 |  8 ++++++
>>   pahole.c           | 12 +++++++--
>>   6 files changed, 85 insertions(+), 22 deletions(-)
>>
>> --
>> 2.31.1
>>
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs
  2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
  2023-05-19 14:46   ` Yonghong Song
@ 2023-05-19 15:08   ` Alan Maguire
  1 sibling, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2023-05-19 15:08 UTC (permalink / raw)
  To: Yafang Shao
  Cc: acme, ast, jolsa, yhs, andrii, daniel, martin.lau, song,
	john.fastabend, kpsingh, sdf, haoluo, bpf

On 19/05/2023 10:44, Yafang Shao wrote:
> On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> As a means to continue the discussion in [1], which is
>> concerned with finding the best long-term solution to
>> having a BPF Type Format (BTF) representation of
>> functions that is usable for tracing of edge cases, this
>> proof-of-concept series is intended to explore one approach
>> to adding information to help make tracing more accurate.
>>
>> A key problem today is that there is no matching from function
>> description to the actual instances of a function.
>>
>> When that function only has one description, that is
>> not an issue, but if we have multiple inconsistent
>> static functions in different CUs such as
>>
>> From kernel/irq/irqdesc.c
>>
>>     static ssize_t wakeup_show(struct kobject *kobj,
>>                                struct kobj_attribute *attr, char *buf)
>>
>> ...and from drivers/base/power/sysfs.c
>>
>>     static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>>                                char *buf);
>>
>> ...this becomes a problem.  If I am attaching,
>> which do I want?  And even if I know which one
>> I want, which instance in kallsyms is which?
>>
> 
> As you described in the above example,  it is natural to attach a
> *function* defined in a specific *file_path*.  So why not encoding the
> file path instead ? What's the problem in it?
> 
> If we expose the addr and let the user select which address to attach,
> it will be a trouble to deploy a bpf program across multiple kernels.
> While the file path will have a lower chance to be conflict between
> different kernel versions. So I think it would be better if we use the
> file path instead and let the kernel find the address automatically.
> In the old days, when we wanted to deploy a kprobe kernel module or a
> systemtap script across multiple kernels, we had to use if-else in the
> code, which was really troublesome as it is not scalable. I don't
> think we want to do it the same way in the bpf program.
> 

I think it's important to distinguish between what we do
in libbpf/kernel to ensure safe attach and what sorts
of tracing capabilities a tracer supports. For example,
a tracer (or indeed libbpf) can take the set of different
instances of cpumask_weight() and because they all have the
same BTF description, can attach to all instances. What we
lose though by not having address-level accuracy is the
ability to do more precise tracing.

I also don't think matching addresses -> BTF ids stops us
having file information in the mix; a "struct btf_func"
could potentially contain a name offset to a string
specifying the file too. My concern though is that
relying on _just_ the file/line will not be enough
in some cases. There are some weird edge cases in the
kernel. I'll give an example where file/line isn't
enough to figure out the mapping.

Consider 'struct elf_note_info'; there are two copies
of this in the vmlinux BTF:

[16819] STRUCT 'elf_note_info' size=248 vlen=8
        'thread' type_id=16817 bits_offset=0
        'psinfo' type_id=16815 bits_offset=64
        'signote' type_id=16815 bits_offset=256
        'auxv' type_id=16815 bits_offset=448
        'files' type_id=16815 bits_offset=640
        'csigdata' type_id=6083 bits_offset=832
        'size' type_id=74 bits_offset=1856
        'thread_notes' type_id=21 bits_offset=1920


[16851] STRUCT 'elf_note_info' size=248 vlen=8
        'thread' type_id=16850 bits_offset=0
        'psinfo' type_id=16815 bits_offset=64
        'signote' type_id=16815 bits_offset=256
        'auxv' type_id=16815 bits_offset=448
        'files' type_id=16815 bits_offset=640
        'csigdata' type_id=6507 bits_offset=832
        'size' type_id=74 bits_offset=1856
        'thread_notes' type_id=21 bits_offset=1920


...and here's the structure itself:

struct elf_note_info {
 	struct elf_thread_core_info *thread;
 	struct memelfnote psinfo;
 	struct memelfnote signote;
 	struct memelfnote auxv;
 	struct memelfnote files;
 	user_siginfo_t csigdata;
 	size_t size;
 	int thread_notes;
};

The reason there are two copies is not a dedup failure;
in the first case user_siginfo_t is defined as

[6083] TYPEDEF 'siginfo_t' type_id=6082

while in the second case it gets defined as

[6507] TYPEDEF 'compat_siginfo_t' type_id=6506

The reason is include/linux/compat.h was included
in one place when fs/binfmt_elf.c was built and not
in another when fs/binfmt_elf.c was built the
second time.

Because the object was built into the kernel twice,
as well as having two different instances of BTF descriptions
for the same structure, we also have multiple different BTF
function descriptions for functions that use a
"struct elf_note_info *", and we need to figure out which
maps to which kallsyms instance of the functions like
this:

ffffffff81489380 t fill_thread_core_info
ffffffff8148ca90 t fill_thread_core_info

Now, which BTF description goes with which instance?
The file/line number is identical for each BTF description,
but the BTF is (correctly) different because the object has
been built into the kernel twice and is slightly different
in each case. The difference isn't significant in practice here,
but that doesn't mean it couldn't be in other cases. Imagine a
case where one instance of the structure contained a field and
the other didn't; we'd be interpreting the tracing data
incorrectly.

I realize it's a weird edge case, but that's just one
I found from digging a bit. My worry is if we go with
the file/line as a way of identifying the function,
these sorts of edge cases will pile up and we'll need
to go more fine-grained and use addresses in the end
anyway. Again, that doesn't stop us applying higher-level
semantics at a tracer level, but we need to build those on
solid foundations.

Alan

>> This series is a proof-of-concept that supports encoding
>> function addresses and associating them with BTF FUNC
>> descriptions using BTF declaration tags.
>>
>> More work would need to be done on the kernel side
>> to _use_ this representation, but hopefully having a
>> rough approach outlined will help make that more feasible.
>>
>> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/
>>
>> Alan Maguire (6):
>>   btf_encoder: record function address and if it is local
>>   dwarf_loader: store address in function low_pc if available
>>   dwarf_loader: transfer low_pc info from subtroutine to its abstract
>>     origin
>>   btf_encoder: add "addr=0x<addr>" function declaration tag if
>>     --btf_gen_func_addr specified
>>   btf_encoder: store ELF function representations sorted by name _and_
>>     address
>>   pahole: document --btf_gen_func_addr
>>
>>  btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
>>  btf_encoder.h      |  4 +--
>>  dwarf_loader.c     | 16 +++++++++---
>>  dwarves.h          |  3 +++
>>  man-pages/pahole.1 |  8 ++++++
>>  pahole.c           | 12 +++++++--
>>  6 files changed, 85 insertions(+), 22 deletions(-)
>>
>> --
>> 2.31.1
>>
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-19  0:26           ` Alexei Starovoitov
@ 2023-05-22 21:31             ` Andrii Nakryiko
  2023-05-25  8:52               ` Jiri Olsa
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2023-05-22 21:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Yonghong Song, Jiri Olsa, Alan Maguire, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko,
	Daniel Borkmann, Yafang Shao, Martin KaFai Lau, Song Liu,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, bpf

On Thu, May 18, 2023 at 5:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, May 18, 2023 at 11:26 AM Yonghong Song <yhs@meta.com> wrote:
> > > I wonder now when the address will be stored as number (not string) we
> > > could somehow generate relocation records and have the module loader
> > > do the relocation automatically
> > >
> > > not sure how that works for vmlinux when it's loaded/relocated on boot
> >
> > Right, actual module address will mostly not match the one in dwarf.
> > Some during module btf load, we should modify btf address as well
> > for later use? Yes, may need to reuse some routines used in initial
> > module relocation.
>
>
> Few thoughts:
>
> Initially I felt that single FUNC with multiple DECL_TAG(addr)
> is better, since BTF for all funcs is the same and it's likely
> one static inline function that the compiler decided not to inline
> (like cpumask_weight), so when libbpf wants to attach prog to it
> the kernel should automatically attach in all places.
> But then noticed that actually different functions with
> the same name and proto will be deduplicated into one.
> Their bodies at different locations will be different.
> Example: seq_show.
> In this case it's better to let libbpf pick the exact one to attach.
> Then realized that even the same function like cpumask_weight()
> might have different body at different locations due to optimizations.
> I don't think dwarf contains enough info to distinguish all the combinations.
>
> Considering all that it's better to keep one BTF kind_func -> one addr.
> If it's extended the way Alan is proposing with kind_flag
> the dedup logic will not combine them due to different addresses.

I've discussed this w/ Alexei and Yonghong offline, so will summarize
what I said here. I don't think that we should go the route of adding
kflag to BTF_KIND_FUNC. As Yonghong pointed out, previously only vlen
and kind determined byte size of the type, and so adding a third
variable (kflag), which would apply only to BTF_KIND_FUNC, seems like
an unnecessary new complication.

I propose to go with an entirely new kind instead, we have plenty of
them left. This new kind will be pretty kernel-specific, so could be
targeted for kernel use cases better without adding unnecessary
complications to Clang. BTF_KIND_FUNCs generated by Clang for .bpf.o
files don't need addr, they are meaningless and Clang doesn't know
anything about addresses anyways. So we can keep Clang unchanged and
more backwards compatible.

But now that this new kind (BTF_KIND_KERNEL_FUNC? KFUNC would be
misleading, unfortunately) is kernel-specific and generated by pahole
only, besides addr we can add some flags field and use them to mark
function as defined as kfunc or not, or (as a hypothetical example)
traceable or not, or maybe we even have inline flag some day, etc.
Something that makes sense mostly for kernel functions.

Having said all that, given we are going to break all existing
BTF-aware tools again with a new kind, we should really couple all
this work with making BTF self-describing as discussed in [0], so that
future changes like this won't break older bpftool and other similar
tools, unnecessarily.

Which, btw, is another reason to not use kflag to determine the size
of btf_type. Proposed solution in [0] assumes that kind + vlen defines
the size. We should probably have dedicated discussion for
self-describing BTF, but I think both changes have to be done in the
same release window.

  [0] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/#t

>
> Also turned out that the kernel doesn't validate decl_tag string.
> The following code loads without error:
> __attribute__((btf_decl_tag("\x10\xf0")));
>
> I'm not sure whether we want to tighten decl_tag validation and how.
> If we keep it as-is we can use func+decl_tag approach
> to add 4 bytes of addr in the binary format (if 1st byte is not zero).
> But it feels like a hack, since the kernel needs to be changed
> anyway to adjust the addresses after module loading and kernel relocation.
> So func with kind_flag seems like the best approach.
>
> Regarding relocation of address in the kernel and modules...
> We just need to add base_addr to all addrs-es recorded in BTF.
> Both for kernel and for module BTFs.
> Shouldn't be too complicated.

yep, KASLR seems simple enough to handle by the kernel itself at boot time.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-22 21:31             ` Andrii Nakryiko
@ 2023-05-25  8:52               ` Jiri Olsa
  2023-05-25 10:14                 ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2023-05-25  8:52 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Yonghong Song, Jiri Olsa, Alan Maguire,
	Arnaldo Carvalho de Melo, Alexei Starovoitov, Yonghong Song,
	Andrii Nakryiko, Daniel Borkmann, Yafang Shao, Martin KaFai Lau,
	Song Liu, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
	bpf

On Mon, May 22, 2023 at 02:31:01PM -0700, Andrii Nakryiko wrote:
> On Thu, May 18, 2023 at 5:26 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, May 18, 2023 at 11:26 AM Yonghong Song <yhs@meta.com> wrote:
> > > > I wonder now when the address will be stored as number (not string) we
> > > > could somehow generate relocation records and have the module loader
> > > > do the relocation automatically
> > > >
> > > > not sure how that works for vmlinux when it's loaded/relocated on boot
> > >
> > > Right, actual module address will mostly not match the one in dwarf.
> > > Some during module btf load, we should modify btf address as well
> > > for later use? Yes, may need to reuse some routines used in initial
> > > module relocation.
> >
> >
> > Few thoughts:
> >
> > Initially I felt that single FUNC with multiple DECL_TAG(addr)
> > is better, since BTF for all funcs is the same and it's likely
> > one static inline function that the compiler decided not to inline
> > (like cpumask_weight), so when libbpf wants to attach prog to it
> > the kernel should automatically attach in all places.
> > But then noticed that actually different functions with
> > the same name and proto will be deduplicated into one.
> > Their bodies at different locations will be different.
> > Example: seq_show.
> > In this case it's better to let libbpf pick the exact one to attach.
> > Then realized that even the same function like cpumask_weight()
> > might have different body at different locations due to optimizations.
> > I don't think dwarf contains enough info to distinguish all the combinations.
> >
> > Considering all that it's better to keep one BTF kind_func -> one addr.
> > If it's extended the way Alan is proposing with kind_flag
> > the dedup logic will not combine them due to different addresses.
> 
> I've discussed this w/ Alexei and Yonghong offline, so will summarize
> what I said here. I don't think that we should go the route of adding
> kflag to BTF_KIND_FUNC. As Yonghong pointed out, previously only vlen
> and kind determined byte size of the type, and so adding a third
> variable (kflag), which would apply only to BTF_KIND_FUNC, seems like
> an unnecessary new complication.
> 
> I propose to go with an entirely new kind instead, we have plenty of
> them left. This new kind will be pretty kernel-specific, so could be
> targeted for kernel use cases better without adding unnecessary
> complications to Clang. BTF_KIND_FUNCs generated by Clang for .bpf.o
> files don't need addr, they are meaningless and Clang doesn't know
> anything about addresses anyways. So we can keep Clang unchanged and
> more backwards compatible.
> 
> But now that this new kind (BTF_KIND_KERNEL_FUNC? KFUNC would be
> misleading, unfortunately) is kernel-specific and generated by pahole
> only, besides addr we can add some flags field and use them to mark
> function as defined as kfunc or not, or (as a hypothetical example)
> traceable or not, or maybe we even have inline flag some day, etc.
> Something that makes sense mostly for kernel functions.
> 
> Having said all that, given we are going to break all existing
> BTF-aware tools again with a new kind, we should really couple all
> this work with making BTF self-describing as discussed in [0], so that
> future changes like this won't break older bpftool and other similar
> tools, unnecessarily.

nice, would be great to have this and eventually got rid of new pahole
enable/disable options, makes sense to do this before adding new type

jirka

> 
> Which, btw, is another reason to not use kflag to determine the size
> of btf_type. Proposed solution in [0] assumes that kind + vlen defines
> the size. We should probably have dedicated discussion for
> self-describing BTF, but I think both changes have to be done in the
> same release window.
> 
>   [0] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/#t
> 
> >
> > Also turned out that the kernel doesn't validate decl_tag string.
> > The following code loads without error:
> > __attribute__((btf_decl_tag("\x10\xf0")));
> >
> > I'm not sure whether we want to tighten decl_tag validation and how.
> > If we keep it as-is we can use func+decl_tag approach
> > to add 4 bytes of addr in the binary format (if 1st byte is not zero).
> > But it feels like a hack, since the kernel needs to be changed
> > anyway to adjust the addresses after module loading and kernel relocation.
> > So func with kind_flag seems like the best approach.
> >
> > Regarding relocation of address in the kernel and modules...
> > We just need to add base_addr to all addrs-es recorded in BTF.
> > Both for kernel and for module BTFs.
> > Shouldn't be too complicated.
> 
> yep, KASLR seems simple enough to handle by the kernel itself at boot time.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-25  8:52               ` Jiri Olsa
@ 2023-05-25 10:14                 ` Alan Maguire
  2023-05-25 17:29                   ` Andrii Nakryiko
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2023-05-25 10:14 UTC (permalink / raw)
  To: Jiri Olsa, Andrii Nakryiko
  Cc: Alexei Starovoitov, Yonghong Song, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko,
	Daniel Borkmann, Yafang Shao, Martin KaFai Lau, Song Liu,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, bpf

On 25/05/2023 09:52, Jiri Olsa wrote:
> On Mon, May 22, 2023 at 02:31:01PM -0700, Andrii Nakryiko wrote:
>> On Thu, May 18, 2023 at 5:26 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>>
>>> On Thu, May 18, 2023 at 11:26 AM Yonghong Song <yhs@meta.com> wrote:
>>>>> I wonder now when the address will be stored as number (not string) we
>>>>> could somehow generate relocation records and have the module loader
>>>>> do the relocation automatically
>>>>>
>>>>> not sure how that works for vmlinux when it's loaded/relocated on boot
>>>>
>>>> Right, actual module address will mostly not match the one in dwarf.
>>>> Some during module btf load, we should modify btf address as well
>>>> for later use? Yes, may need to reuse some routines used in initial
>>>> module relocation.
>>>
>>>
>>> Few thoughts:
>>>
>>> Initially I felt that single FUNC with multiple DECL_TAG(addr)
>>> is better, since BTF for all funcs is the same and it's likely
>>> one static inline function that the compiler decided not to inline
>>> (like cpumask_weight), so when libbpf wants to attach prog to it
>>> the kernel should automatically attach in all places.
>>> But then noticed that actually different functions with
>>> the same name and proto will be deduplicated into one.
>>> Their bodies at different locations will be different.
>>> Example: seq_show.
>>> In this case it's better to let libbpf pick the exact one to attach.
>>> Then realized that even the same function like cpumask_weight()
>>> might have different body at different locations due to optimizations.
>>> I don't think dwarf contains enough info to distinguish all the combinations.
>>>
>>> Considering all that it's better to keep one BTF kind_func -> one addr.
>>> If it's extended the way Alan is proposing with kind_flag
>>> the dedup logic will not combine them due to different addresses.
>>
>> I've discussed this w/ Alexei and Yonghong offline, so will summarize
>> what I said here. I don't think that we should go the route of adding
>> kflag to BTF_KIND_FUNC. As Yonghong pointed out, previously only vlen
>> and kind determined byte size of the type, and so adding a third
>> variable (kflag), which would apply only to BTF_KIND_FUNC, seems like
>> an unnecessary new complication.
>>
>> I propose to go with an entirely new kind instead, we have plenty of
>> them left. This new kind will be pretty kernel-specific, so could be
>> targeted for kernel use cases better without adding unnecessary
>> complications to Clang. BTF_KIND_FUNCs generated by Clang for .bpf.o
>> files don't need addr, they are meaningless and Clang doesn't know
>> anything about addresses anyways. So we can keep Clang unchanged and
>> more backwards compatible.
>>
>> But now that this new kind (BTF_KIND_KERNEL_FUNC? KFUNC would be
>> misleading, unfortunately) is kernel-specific and generated by pahole
>> only, besides addr we can add some flags field and use them to mark
>> function as defined as kfunc or not, or (as a hypothetical example)
>> traceable or not, or maybe we even have inline flag some day, etc.
>> Something that makes sense mostly for kernel functions.
>>
>> Having said all that, given we are going to break all existing
>> BTF-aware tools again with a new kind, we should really couple all
>> this work with making BTF self-describing as discussed in [0], so that
>> future changes like this won't break older bpftool and other similar
>> tools, unnecessarily.
> 
> nice, would be great to have this and eventually got rid of new pahole
> enable/disable options, makes sense to do this before adding new type
> 
> jirka
>

agreed; I'd been thinking the same and I've been working on a proof-of-
concept of this based on our previous discussions, I'll send it out as
soon as I've got it roughly working.

With respect to the question of having a new kind, I'm not sure I agree
with the above. We've already broken the "vlen == number of objects
following" for BTF_KIND_FUNC, where vlen is used to represent linkage
information instead.

To me, it feels more natural to have continuity across different object
types (kernel versus BPF program) with BTF_KIND_FUNC: the fact that
it's hard to come up with an alternate name is perhaps a reflection of
this. Most characteristics (aside from "is a kfunc") seem to be shared
across kernel and BPF program functions, but the best way to judge
is probably to come up with as complete a list as is possible I suppose.

In order to accommodate a metadata description using existing
BTF_KIND_FUNC, we can have a metadata flag that can say
"KFLAG set means singular object following of object_size" that is
set for  BTF_KIND_FUNC. We can mark it as discouraged for future
use.

One argument I definitely see for a new kind representing kernel
functions is if it were the case that we might need N elements
_and_ a singular object following the btf_type to represent it.
I don't currently see any use for such a model for function
representation, but if that is anticipated somehow, it might be
worth having a new kind to support that sort of representation.

Alan

>>
>> Which, btw, is another reason to not use kflag to determine the size
>> of btf_type. Proposed solution in [0] assumes that kind + vlen defines
>> the size. We should probably have dedicated discussion for
>> self-describing BTF, but I think both changes have to be done in the
>> same release window.
>>
>>   [0] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/#t
>>
>>>
>>> Also turned out that the kernel doesn't validate decl_tag string.
>>> The following code loads without error:
>>> __attribute__((btf_decl_tag("\x10\xf0")));
>>>
>>> I'm not sure whether we want to tighten decl_tag validation and how.
>>> If we keep it as-is we can use func+decl_tag approach
>>> to add 4 bytes of addr in the binary format (if 1st byte is not zero).
>>> But it feels like a hack, since the kernel needs to be changed
>>> anyway to adjust the addresses after module loading and kernel relocation.
>>> So func with kind_flag seems like the best approach.
>>>
>>> Regarding relocation of address in the kernel and modules...
>>> We just need to add base_addr to all addrs-es recorded in BTF.
>>> Both for kernel and for module BTFs.
>>> Shouldn't be too complicated.
>>
>> yep, KASLR seems simple enough to handle by the kernel itself at boot time.
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address
  2023-05-25 10:14                 ` Alan Maguire
@ 2023-05-25 17:29                   ` Andrii Nakryiko
  0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2023-05-25 17:29 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Jiri Olsa, Alexei Starovoitov, Yonghong Song,
	Arnaldo Carvalho de Melo, Alexei Starovoitov, Yonghong Song,
	Andrii Nakryiko, Daniel Borkmann, Yafang Shao, Martin KaFai Lau,
	Song Liu, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
	bpf

On Thu, May 25, 2023 at 3:20 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 25/05/2023 09:52, Jiri Olsa wrote:
> > On Mon, May 22, 2023 at 02:31:01PM -0700, Andrii Nakryiko wrote:
> >> On Thu, May 18, 2023 at 5:26 PM Alexei Starovoitov
> >> <alexei.starovoitov@gmail.com> wrote:
> >>>
> >>> On Thu, May 18, 2023 at 11:26 AM Yonghong Song <yhs@meta.com> wrote:
> >>>>> I wonder now when the address will be stored as number (not string) we
> >>>>> could somehow generate relocation records and have the module loader
> >>>>> do the relocation automatically
> >>>>>
> >>>>> not sure how that works for vmlinux when it's loaded/relocated on boot
> >>>>
> >>>> Right, actual module address will mostly not match the one in dwarf.
> >>>> Some during module btf load, we should modify btf address as well
> >>>> for later use? Yes, may need to reuse some routines used in initial
> >>>> module relocation.
> >>>
> >>>
> >>> Few thoughts:
> >>>
> >>> Initially I felt that single FUNC with multiple DECL_TAG(addr)
> >>> is better, since BTF for all funcs is the same and it's likely
> >>> one static inline function that the compiler decided not to inline
> >>> (like cpumask_weight), so when libbpf wants to attach prog to it
> >>> the kernel should automatically attach in all places.
> >>> But then noticed that actually different functions with
> >>> the same name and proto will be deduplicated into one.
> >>> Their bodies at different locations will be different.
> >>> Example: seq_show.
> >>> In this case it's better to let libbpf pick the exact one to attach.
> >>> Then realized that even the same function like cpumask_weight()
> >>> might have different body at different locations due to optimizations.
> >>> I don't think dwarf contains enough info to distinguish all the combinations.
> >>>
> >>> Considering all that it's better to keep one BTF kind_func -> one addr.
> >>> If it's extended the way Alan is proposing with kind_flag
> >>> the dedup logic will not combine them due to different addresses.
> >>
> >> I've discussed this w/ Alexei and Yonghong offline, so will summarize
> >> what I said here. I don't think that we should go the route of adding
> >> kflag to BTF_KIND_FUNC. As Yonghong pointed out, previously only vlen
> >> and kind determined byte size of the type, and so adding a third
> >> variable (kflag), which would apply only to BTF_KIND_FUNC, seems like
> >> an unnecessary new complication.
> >>
> >> I propose to go with an entirely new kind instead, we have plenty of
> >> them left. This new kind will be pretty kernel-specific, so could be
> >> targeted for kernel use cases better without adding unnecessary
> >> complications to Clang. BTF_KIND_FUNCs generated by Clang for .bpf.o
> >> files don't need addr, they are meaningless and Clang doesn't know
> >> anything about addresses anyways. So we can keep Clang unchanged and
> >> more backwards compatible.
> >>
> >> But now that this new kind (BTF_KIND_KERNEL_FUNC? KFUNC would be
> >> misleading, unfortunately) is kernel-specific and generated by pahole
> >> only, besides addr we can add some flags field and use them to mark
> >> function as defined as kfunc or not, or (as a hypothetical example)
> >> traceable or not, or maybe we even have inline flag some day, etc.
> >> Something that makes sense mostly for kernel functions.
> >>
> >> Having said all that, given we are going to break all existing
> >> BTF-aware tools again with a new kind, we should really couple all
> >> this work with making BTF self-describing as discussed in [0], so that
> >> future changes like this won't break older bpftool and other similar
> >> tools, unnecessarily.
> >
> > nice, would be great to have this and eventually got rid of new pahole
> > enable/disable options, makes sense to do this before adding new type
> >
> > jirka
> >
>
> agreed; I'd been thinking the same and I've been working on a proof-of-
> concept of this based on our previous discussions, I'll send it out as
> soon as I've got it roughly working.

nice! it would be great to not have every older tool break whenever we
add a tiny extension to BTF

>
> With respect to the question of having a new kind, I'm not sure I agree
> with the above. We've already broken the "vlen == number of objects
> following" for BTF_KIND_FUNC, where vlen is used to represent linkage
> information instead.

I'd say BTF_KIND_FUNC and its abuse of vlen is just an example of what
we shouldn't do going forward. But even that still allows us to
compactly describe each btf_type's size as a pair of (fixed_sz,
vlen_sz), where the total size will be fixed_sz + vlen * vlen_sz. For
BTF_KIND_FUNC vlen_sz will be zero, even if vlen is non-zero.

If we allow klag to change bytes size of a type, we'd need another
sizing dimension, which is what I try to avoid here.

Look at ENUM and ENUM64. We chose to add a new kind for ENUM64 because
of different byte sizing needs, instead of abusing kflag for this, and
I think this was the right decision. Let's do that here as well.

>
> To me, it feels more natural to have continuity across different object
> types (kernel versus BPF program) with BTF_KIND_FUNC: the fact that
> it's hard to come up with an alternate name is perhaps a reflection of
> this. Most characteristics (aside from "is a kfunc") seem to be shared
> across kernel and BPF program functions, but the best way to judge
> is probably to come up with as complete a list as is possible I suppose.

Even the address that you'll add to BTF_KIND_FUNC doesn't apply to BTF
emitted for BPF object files. So while I can sympathize (conceptually)
with the desire to have one kind for "all thing function", it will
cause more problems. Again, look at ENUM and ENUM64 case. Sure, we had
to teach BTF dedup and BPF CO-RE about both kinds to represent the
same enum concept, but we'd have to do the same even if just added a
kflag for 64-bit enum. So my point is that tools will still need to be
extended to take advantage of new information, but reusing the same
kind is causing more problems and is not solving any.

>
> In order to accommodate a metadata description using existing
> BTF_KIND_FUNC, we can have a metadata flag that can say
> "KFLAG set means singular object following of object_size" that is
> set for  BTF_KIND_FUNC. We can mark it as discouraged for future
> use.
>
> One argument I definitely see for a new kind representing kernel
> functions is if it were the case that we might need N elements
> _and_ a singular object following the btf_type to represent it.
> I don't currently see any use for such a model for function
> representation, but if that is anticipated somehow, it might be
> worth having a new kind to support that sort of representation.

We can never foresee stuff like this, but it's best to not design
ourselves into the corner. We have enough space for BTF_KIND_xxx,
which gives you all the same benefits. Let's keep it simple and
straightforward.

>
> Alan
>
> >>
> >> Which, btw, is another reason to not use kflag to determine the size
> >> of btf_type. Proposed solution in [0] assumes that kind + vlen defines
> >> the size. We should probably have dedicated discussion for
> >> self-describing BTF, but I think both changes have to be done in the
> >> same release window.
> >>
> >>   [0] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/#t
> >>
> >>>
> >>> Also turned out that the kernel doesn't validate decl_tag string.
> >>> The following code loads without error:
> >>> __attribute__((btf_decl_tag("\x10\xf0")));
> >>>
> >>> I'm not sure whether we want to tighten decl_tag validation and how.
> >>> If we keep it as-is we can use func+decl_tag approach
> >>> to add 4 bytes of addr in the binary format (if 1st byte is not zero).
> >>> But it feels like a hack, since the kernel needs to be changed
> >>> anyway to adjust the addresses after module loading and kernel relocation.
> >>> So func with kind_flag seems like the best approach.
> >>>
> >>> Regarding relocation of address in the kernel and modules...
> >>> We just need to add base_addr to all addrs-es recorded in BTF.
> >>> Both for kernel and for module BTFs.
> >>> Shouldn't be too complicated.
> >>
> >> yep, KASLR seems simple enough to handle by the kernel itself at boot time.
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-05-25 17:30 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-17 16:16 [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 1/6] btf_encoder: record function address and if it is local Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 2/6] dwarf_loader: store address in function low_pc if available Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 3/6] dwarf_loader: transfer low_pc info from subtroutine to its abstract origin Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 4/6] btf_encoder: add "addr=0x<addr>" function declaration tag if --btf_gen_func_addr specified Alan Maguire
2023-05-17 16:16 ` [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address Alan Maguire
2023-05-18  8:39   ` Jiri Olsa
2023-05-18 13:23     ` Alan Maguire
2023-05-18 15:20       ` Yonghong Song
2023-05-18 16:22       ` Jiri Olsa
2023-05-18 18:25         ` Yonghong Song
2023-05-19  0:23           ` Yonghong Song
2023-05-19  0:26           ` Alexei Starovoitov
2023-05-22 21:31             ` Andrii Nakryiko
2023-05-25  8:52               ` Jiri Olsa
2023-05-25 10:14                 ` Alan Maguire
2023-05-25 17:29                   ` Andrii Nakryiko
2023-05-17 16:16 ` [RFC dwarves 6/6] pahole: document --btf_gen_func_addr Alan Maguire
2023-05-19  9:44 ` [RFC dwarves 0/6] Encoding function addresses using DECL_TAGs Yafang Shao
2023-05-19 14:46   ` Yonghong Song
2023-05-19 15:08   ` Alan Maguire

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).