bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/3] pahole: Workaround dwarf bug for function encoding
@ 2020-10-26 22:36 Jiri Olsa
  2020-10-26 22:36 ` [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function Jiri Olsa
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-10-26 22:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo, Frank Ch. Eigler, Mark Wielaard

hi,
because of gcc bug [1] we can no longer rely on DW_AT_declaration
attribute to filter out declarations and end up with just
one copy of the function in the BTF data.

It seems this bug is not easy to fix, but regardless if the
it's coming soon, it's probably good idea not to depend so
much only on dwarf data and make some extra checks.

Thus for function encoding we are now doing following checks:
  - argument names are defined for the function
  - there's symbol and address defined for the function
  - function is generated only once

These checks ensure that we encode function with defined
symbol/address and argument names.

I marked this post as RFC, because with this workaround in
place we are also encoding assembly functions, which were
not present when using the previous gcc version.

Full functions diff to previous gcc working version:

  http://people.redhat.com/~jolsa/functions.diff.txt

I'm not sure this does not break some rule for functions in
BTF data, becuse those assembly functions are not attachable
by bpf trampolines, so I don't think there's any use for them.

thoughts?
jirka


[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
---
Jiri Olsa (3):
      btf_encoder: Move find_all_percpu_vars in generic config function
      btf_encoder: Change functions check due to broken dwarf
      btf_encoder: Include static functions to BTF data

 btf_encoder.c | 221 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------
 elf_symtab.h  |   8 +++++
 2 files changed, 170 insertions(+), 59 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function
  2020-10-26 22:36 [RFC 0/3] pahole: Workaround dwarf bug for function encoding Jiri Olsa
@ 2020-10-26 22:36 ` Jiri Olsa
  2020-10-27 23:12   ` Andrii Nakryiko
  2020-10-26 22:36 ` [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf Jiri Olsa
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-10-26 22:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo, Frank Ch. Eigler, Mark Wielaard

Moving find_all_percpu_vars under generic onfig function
that walks over symbols and calls config_percpu_var.

We will add another config function that needs to go
through all the symbols, so it's better they go through
them just once.

There's no functional change intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 btf_encoder.c | 126 ++++++++++++++++++++++++++------------------------
 1 file changed, 66 insertions(+), 60 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 2a6455be4c52..2dd26c904039 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -250,7 +250,64 @@ static bool percpu_var_exists(uint64_t addr, uint32_t *sz, const char **name)
 	return true;
 }
 
-static int find_all_percpu_vars(struct btf_elf *btfe)
+static int config_percpu_var(struct btf_elf *btfe, GElf_Sym *sym)
+{
+	const char *sym_name;
+	uint64_t addr;
+	uint32_t size;
+
+	/* compare a symbol's shndx to determine if it's a percpu variable */
+	if (elf_sym__section(sym) != btfe->percpu_shndx)
+		return 0;
+	if (elf_sym__type(sym) != STT_OBJECT)
+		return 0;
+
+	addr = elf_sym__value(sym);
+	/*
+	 * Store only those symbols that have allocated space in the percpu section.
+	 * This excludes the following three types of symbols:
+	 *
+	 *  1. __ADDRESSABLE(sym), which are forcely emitted as symbols.
+	 *  2. __UNIQUE_ID(prefix), which are introduced to generate unique ids.
+	 *  3. __exitcall(fn), functions which are labeled as exit calls.
+	 *
+	 * In addition, the variables defined using DEFINE_PERCPU_FIRST are
+	 * also not included, which currently includes:
+	 *
+	 *  1. fixed_percpu_data
+	 */
+	if (!addr)
+		return 0;
+
+	sym_name = elf_sym__name(sym, btfe->symtab);
+	if (!btf_name_valid(sym_name)) {
+		dump_invalid_symbol("Found symbol of invalid name when encoding btf",
+				    sym_name, btf_elf__verbose, btf_elf__force);
+		return btf_elf__force ? 0 : -1;
+	}
+	size = elf_sym__size(sym);
+	if (!size) {
+		dump_invalid_symbol("Found symbol of zero size when encoding btf",
+				    sym_name, btf_elf__verbose, btf_elf__force);
+		return btf_elf__force ? 0 : -1;
+	}
+
+	if (btf_elf__verbose)
+		printf("Found per-CPU symbol '%s' at address 0x%lx\n", sym_name, addr);
+
+	if (percpu_var_cnt == MAX_PERCPU_VAR_CNT) {
+		fprintf(stderr, "Reached the limit of per-CPU variables: %d\n",
+			MAX_PERCPU_VAR_CNT);
+		return -1;
+	}
+	percpu_vars[percpu_var_cnt].addr = addr;
+	percpu_vars[percpu_var_cnt].sz = size;
+	percpu_vars[percpu_var_cnt].name = sym_name;
+	percpu_var_cnt++;
+	return 0;
+}
+
+static int config(struct btf_elf *btfe, bool do_percpu_vars)
 {
 	uint32_t core_id;
 	GElf_Sym sym;
@@ -260,69 +317,18 @@ static int find_all_percpu_vars(struct btf_elf *btfe)
 
 	/* search within symtab for percpu variables */
 	elf_symtab__for_each_symbol(btfe->symtab, core_id, sym) {
-		const char *sym_name;
-		uint64_t addr;
-		uint32_t size;
-
-		/* compare a symbol's shndx to determine if it's a percpu variable */
-		if (elf_sym__section(&sym) != btfe->percpu_shndx)
-			continue;
-		if (elf_sym__type(&sym) != STT_OBJECT)
-			continue;
-
-		addr = elf_sym__value(&sym);
-		/*
-		 * Store only those symbols that have allocated space in the percpu section.
-		 * This excludes the following three types of symbols:
-		 *
-		 *  1. __ADDRESSABLE(sym), which are forcely emitted as symbols.
-		 *  2. __UNIQUE_ID(prefix), which are introduced to generate unique ids.
-		 *  3. __exitcall(fn), functions which are labeled as exit calls.
-		 *
-		 * In addition, the variables defined using DEFINE_PERCPU_FIRST are
-		 * also not included, which currently includes:
-		 *
-		 *  1. fixed_percpu_data
-		 */
-		if (!addr)
-			continue;
-
-		sym_name = elf_sym__name(&sym, btfe->symtab);
-		if (!btf_name_valid(sym_name)) {
-			dump_invalid_symbol("Found symbol of invalid name when encoding btf",
-					    sym_name, btf_elf__verbose, btf_elf__force);
-			if (btf_elf__force)
-				continue;
+		if (do_percpu_vars && config_percpu_var(btfe, &sym))
 			return -1;
-		}
-		size = elf_sym__size(&sym);
-		if (!size) {
-			dump_invalid_symbol("Found symbol of zero size when encoding btf",
-					    sym_name, btf_elf__verbose, btf_elf__force);
-			if (btf_elf__force)
-				continue;
-			return -1;
-		}
+	}
 
-		if (btf_elf__verbose)
-			printf("Found per-CPU symbol '%s' at address 0x%lx\n", sym_name, addr);
+	if (do_percpu_vars) {
+		if (percpu_var_cnt)
+			qsort(percpu_vars, percpu_var_cnt, sizeof(percpu_vars[0]), percpu_var_cmp);
 
-		if (percpu_var_cnt == MAX_PERCPU_VAR_CNT) {
-			fprintf(stderr, "Reached the limit of per-CPU variables: %d\n",
-				MAX_PERCPU_VAR_CNT);
-			return -1;
-		}
-		percpu_vars[percpu_var_cnt].addr = addr;
-		percpu_vars[percpu_var_cnt].sz = size;
-		percpu_vars[percpu_var_cnt].name = sym_name;
-		percpu_var_cnt++;
+		if (btf_elf__verbose)
+			printf("Found %d per-CPU variables!\n", percpu_var_cnt);
 	}
 
-	if (percpu_var_cnt)
-		qsort(percpu_vars, percpu_var_cnt, sizeof(percpu_vars[0]), percpu_var_cmp);
-
-	if (btf_elf__verbose)
-		printf("Found %d per-CPU variables!\n", percpu_var_cnt);
 	return 0;
 }
 
@@ -351,7 +357,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 		if (!btfe)
 			return -1;
 
-		if (!skip_encoding_vars && find_all_percpu_vars(btfe))
+		if (config(btfe, !skip_encoding_vars))
 			goto out;
 
 		has_index_type = false;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf
  2020-10-26 22:36 [RFC 0/3] pahole: Workaround dwarf bug for function encoding Jiri Olsa
  2020-10-26 22:36 ` [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function Jiri Olsa
@ 2020-10-26 22:36 ` Jiri Olsa
  2020-10-27 23:20   ` Andrii Nakryiko
  2020-10-26 22:36 ` [PATCH 3/3] btf_encoder: Include static functions to BTF data Jiri Olsa
  2020-10-27 23:13 ` [RFC 0/3] pahole: Workaround dwarf bug for function encoding Andrii Nakryiko
  3 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-10-26 22:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo, Frank Ch. Eigler, Mark Wielaard

We need to generate just single BTF instance for the
function, while DWARF data contains multiple instances
of DW_TAG_subprogram tag.

Unfortunately we can no longer rely on DW_AT_declaration
tag (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060)

Instead we apply following checks:
  - argument names are defined for the function
  - there's symbol and address defined for the function
  - function is generated only once

They might be slightly superfluous together, but it's
better to be ready for another DWARF mishap.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 btf_encoder.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++-
 elf_symtab.h  |   8 ++++
 2 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 2dd26c904039..99b9abe36993 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -26,6 +26,62 @@
  */
 #define KSYM_NAME_LEN 128
 
+struct elf_function {
+	const char *name;
+	bool generated;
+};
+
+static struct elf_function *functions;
+static int functions_alloc;
+static int functions_cnt;
+
+static int functions_cmp(const void *_a, const void *_b)
+{
+	const struct elf_function *a = _a;
+	const struct elf_function *b = _b;
+
+	return strcmp(a->name, b->name);
+}
+
+static void delete_functions(void)
+{
+	free(functions);
+	functions_alloc = functions_cnt = 0;
+}
+
+static int config_function(struct btf_elf *btfe, GElf_Sym *sym)
+{
+	if (!elf_sym__is_function(sym))
+		return 0;
+	if (!elf_sym__value(sym))
+		return 0;
+
+	if (functions_cnt == functions_alloc) {
+		functions_alloc += 5000;
+		functions = realloc(functions, functions_alloc * sizeof(*functions));
+		if (!functions)
+			return -1;
+	}
+
+	functions[functions_cnt].name = elf_sym__name(sym, btfe->symtab);
+	functions_cnt++;
+	return 0;
+}
+
+static bool should_generate_func(const struct btf_elf *btfe, const char *name)
+{
+	struct elf_function *p;
+	struct elf_function key = { .name = name };
+
+	p = bsearch(&key, functions, functions_cnt,
+		    sizeof(functions[0]), functions_cmp);
+	if (!p || p->generated)
+		return false;
+
+	p->generated = true;
+	return true;
+}
+
 static bool btf_name_char_ok(char c, bool first)
 {
 	if (c == '_' || c == '.')
@@ -207,6 +263,7 @@ int btf_encoder__encode()
 		btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
 
 	err = btf_elf__encode(btfe, 0);
+	delete_functions();
 	btf_elf__delete(btfe);
 	btfe = NULL;
 
@@ -314,11 +371,16 @@ static int config(struct btf_elf *btfe, bool do_percpu_vars)
 
 	/* cache variables' addresses, preparing for searching in symtab. */
 	percpu_var_cnt = 0;
+	/* functions addresses */
+	functions_cnt = 0;
+	functions_alloc = 0;
 
 	/* search within symtab for percpu variables */
 	elf_symtab__for_each_symbol(btfe->symtab, core_id, sym) {
 		if (do_percpu_vars && config_percpu_var(btfe, &sym))
 			return -1;
+		if (config_function(btfe, &sym))
+			return -1;
 	}
 
 	if (do_percpu_vars) {
@@ -329,9 +391,25 @@ static int config(struct btf_elf *btfe, bool do_percpu_vars)
 			printf("Found %d per-CPU variables!\n", percpu_var_cnt);
 	}
 
+	if (functions_cnt)
+		qsort(functions, functions_cnt, sizeof(functions[0]), functions_cmp);
+
 	return 0;
 }
 
+static bool has_arg_names(struct cu *cu, struct ftype *ftype)
+{
+	struct parameter *param;
+	const char *name;
+
+	ftype__for_each_parameter(ftype, param) {
+		name = dwarves__active_loader->strings__ptr(cu, param->name);
+		if (name == NULL)
+			return false;
+	}
+	return true;
+}
+
 int cu__encode_btf(struct cu *cu, int verbose, bool force,
 		   bool skip_encoding_vars)
 {
@@ -407,7 +485,28 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 		int btf_fnproto_id, btf_fn_id;
 		const char *name;
 
-		if (fn->declaration || !fn->external)
+		if (!fn->external)
+			continue;
+
+		/*
+		 * We need to generate just single BTF instance for the
+		 * function, while DWARF data contains multiple instances
+		 * of DW_TAG_subprogram tag.
+		 *
+		 * We can no longer rely on DW_AT_declaration tag.
+		 *  (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060)
+		 *
+		 * We check following conditions in following calls:
+		 *   - argument names are defined
+		 *   - there's symbol and address defined for the function
+		 *   - function is generated only once
+		 *
+		 * They might be slightly superfluous together, but it's
+		 * better to be ready for another DWARF mishap.
+		 */
+		if (!has_arg_names(cu, &fn->proto))
+			continue;
+		if (!should_generate_func(btfe, function__name(fn, cu)))
 			continue;
 
 		btf_fnproto_id = btf_elf__add_func_proto(btfe, cu, &fn->proto, type_id_off);
@@ -491,6 +590,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 
 out:
 	if (err) {
+		delete_functions();
 		btf_elf__delete(btfe);
 		btfe = NULL;
 	}
diff --git a/elf_symtab.h b/elf_symtab.h
index 359add69c8ab..094ec4683d01 100644
--- a/elf_symtab.h
+++ b/elf_symtab.h
@@ -63,6 +63,14 @@ static inline uint64_t elf_sym__value(const GElf_Sym *sym)
 	return sym->st_value;
 }
 
+static inline int elf_sym__is_function(const GElf_Sym *sym)
+{
+	return (elf_sym__type(sym) == STT_FUNC ||
+		elf_sym__type(sym) == STT_GNU_IFUNC) &&
+		sym->st_name != 0 &&
+		sym->st_shndx != SHN_UNDEF;
+}
+
 static inline bool elf_sym__is_local_function(const GElf_Sym *sym)
 {
 	return elf_sym__type(sym) == STT_FUNC &&
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] btf_encoder: Include static functions to BTF data
  2020-10-26 22:36 [RFC 0/3] pahole: Workaround dwarf bug for function encoding Jiri Olsa
  2020-10-26 22:36 ` [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function Jiri Olsa
  2020-10-26 22:36 ` [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf Jiri Olsa
@ 2020-10-26 22:36 ` Jiri Olsa
  2020-10-27 23:21   ` Andrii Nakryiko
  2020-10-27 23:13 ` [RFC 0/3] pahole: Workaround dwarf bug for function encoding Andrii Nakryiko
  3 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2020-10-26 22:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo, Frank Ch. Eigler, Mark Wielaard

Removing the condition to skip static functions.

Getting extra 23k functions on my kernel .config:

           nr     .BTF size (bytes)
  before:  23291  3342279
   after:  46606  4361045

The BTF section size increased of about 1MB.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 btf_encoder.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 99b9abe36993..03a4bef11947 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -485,9 +485,6 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 		int btf_fnproto_id, btf_fn_id;
 		const char *name;
 
-		if (!fn->external)
-			continue;
-
 		/*
 		 * We need to generate just single BTF instance for the
 		 * function, while DWARF data contains multiple instances
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function
  2020-10-26 22:36 ` [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function Jiri Olsa
@ 2020-10-27 23:12   ` Andrii Nakryiko
  2020-10-27 23:54     ` Hao Luo
  2020-10-28 15:51     ` Jiri Olsa
  0 siblings, 2 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2020-10-27 23:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo, Frank Ch. Eigler,
	Mark Wielaard

On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Moving find_all_percpu_vars under generic onfig function
> that walks over symbols and calls config_percpu_var.
>
> We will add another config function that needs to go
> through all the symbols, so it's better they go through
> them just once.
>
> There's no functional change intended.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  btf_encoder.c | 126 ++++++++++++++++++++++++++------------------------
>  1 file changed, 66 insertions(+), 60 deletions(-)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 2a6455be4c52..2dd26c904039 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -250,7 +250,64 @@ static bool percpu_var_exists(uint64_t addr, uint32_t *sz, const char **name)
>         return true;
>  }
>
> -static int find_all_percpu_vars(struct btf_elf *btfe)
> +static int config_percpu_var(struct btf_elf *btfe, GElf_Sym *sym)

I find the "config" name completely misleading. How about
"collect_percpu_var" or something along those lines?

> +{
> +       const char *sym_name;
> +       uint64_t addr;
> +       uint32_t size;
> +

[...]

> +}
> +
> +static int config(struct btf_elf *btfe, bool do_percpu_vars)

same here, config is generic and misrepresenting what we are doing
here. E.g., collect_symbols would probably be more clear.

>  {
>         uint32_t core_id;
>         GElf_Sym sym;

[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 0/3] pahole: Workaround dwarf bug for function encoding
  2020-10-26 22:36 [RFC 0/3] pahole: Workaround dwarf bug for function encoding Jiri Olsa
                   ` (2 preceding siblings ...)
  2020-10-26 22:36 ` [PATCH 3/3] btf_encoder: Include static functions to BTF data Jiri Olsa
@ 2020-10-27 23:13 ` Andrii Nakryiko
  2020-10-28 15:49   ` Jiri Olsa
  3 siblings, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2020-10-27 23:13 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo, Frank Ch. Eigler,
	Mark Wielaard

On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> hi,
> because of gcc bug [1] we can no longer rely on DW_AT_declaration
> attribute to filter out declarations and end up with just
> one copy of the function in the BTF data.
>
> It seems this bug is not easy to fix, but regardless if the
> it's coming soon, it's probably good idea not to depend so
> much only on dwarf data and make some extra checks.
>
> Thus for function encoding we are now doing following checks:
>   - argument names are defined for the function
>   - there's symbol and address defined for the function
>   - function is generated only once
>
> These checks ensure that we encode function with defined
> symbol/address and argument names.
>
> I marked this post as RFC, because with this workaround in
> place we are also encoding assembly functions, which were
> not present when using the previous gcc version.
>
> Full functions diff to previous gcc working version:
>
>   http://people.redhat.com/~jolsa/functions.diff.txt
>
> I'm not sure this does not break some rule for functions in
> BTF data, becuse those assembly functions are not attachable
> by bpf trampolines, so I don't think there's any use for them.

What will happen if we do try to attach to those assembly functions?
Will there be some corruption or crash, or will it just fail and
return error cleanly? What we actually want in BTF is all the
functions that are attachable through BPF trampoline, which is all the
functions that ftrace subsystem can attach to, right? So how does
ftrace system know what can or cannot be attached to?

>
> thoughts?
> jirka
>
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> ---
> Jiri Olsa (3):
>       btf_encoder: Move find_all_percpu_vars in generic config function
>       btf_encoder: Change functions check due to broken dwarf
>       btf_encoder: Include static functions to BTF data
>
>  btf_encoder.c | 221 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------
>  elf_symtab.h  |   8 +++++
>  2 files changed, 170 insertions(+), 59 deletions(-)
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf
  2020-10-26 22:36 ` [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf Jiri Olsa
@ 2020-10-27 23:20   ` Andrii Nakryiko
  2020-10-28 15:50     ` Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2020-10-27 23:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo, Frank Ch. Eigler,
	Mark Wielaard

On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> We need to generate just single BTF instance for the
> function, while DWARF data contains multiple instances
> of DW_TAG_subprogram tag.
>
> Unfortunately we can no longer rely on DW_AT_declaration
> tag (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060)
>
> Instead we apply following checks:
>   - argument names are defined for the function
>   - there's symbol and address defined for the function
>   - function is generated only once
>
> They might be slightly superfluous together, but it's
> better to be ready for another DWARF mishap.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  btf_encoder.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  elf_symtab.h  |   8 ++++
>  2 files changed, 109 insertions(+), 1 deletion(-)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 2dd26c904039..99b9abe36993 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -26,6 +26,62 @@
>   */
>  #define KSYM_NAME_LEN 128
>
> +struct elf_function {
> +       const char *name;
> +       bool generated;
> +};
> +
> +static struct elf_function *functions;
> +static int functions_alloc;
> +static int functions_cnt;
> +
> +static int functions_cmp(const void *_a, const void *_b)
> +{
> +       const struct elf_function *a = _a;
> +       const struct elf_function *b = _b;
> +
> +       return strcmp(a->name, b->name);
> +}
> +
> +static void delete_functions(void)
> +{
> +       free(functions);
> +       functions_alloc = functions_cnt = 0;
> +}
> +
> +static int config_function(struct btf_elf *btfe, GElf_Sym *sym)
> +{
> +       if (!elf_sym__is_function(sym))
> +               return 0;
> +       if (!elf_sym__value(sym))
> +               return 0;
> +
> +       if (functions_cnt == functions_alloc) {
> +               functions_alloc += 5000;

maybe just do a conventional exponential size increase? Not
necessarily * 2, could be (* 3 / 2) or (* 4 / 3), libbpf uses such
approach.

> +               functions = realloc(functions, functions_alloc * sizeof(*functions));
> +               if (!functions)
> +                       return -1;
> +       }
> +
> +       functions[functions_cnt].name = elf_sym__name(sym, btfe->symtab);
> +       functions_cnt++;
> +       return 0;
> +}
> +

[...]

> diff --git a/elf_symtab.h b/elf_symtab.h
> index 359add69c8ab..094ec4683d01 100644
> --- a/elf_symtab.h
> +++ b/elf_symtab.h
> @@ -63,6 +63,14 @@ static inline uint64_t elf_sym__value(const GElf_Sym *sym)
>         return sym->st_value;
>  }
>
> +static inline int elf_sym__is_function(const GElf_Sym *sym)
> +{
> +       return (elf_sym__type(sym) == STT_FUNC ||
> +               elf_sym__type(sym) == STT_GNU_IFUNC) &&

Why do we need to collect STT_GNU_IFUNC? That is some PLT special
magic, does the kernel use that? Even if it does, are we even able to
attach to that? Could that remove some of the assembly functions?

> +               sym->st_name != 0 &&
> +               sym->st_shndx != SHN_UNDEF;
> +}
> +
>  static inline bool elf_sym__is_local_function(const GElf_Sym *sym)
>  {
>         return elf_sym__type(sym) == STT_FUNC &&
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] btf_encoder: Include static functions to BTF data
  2020-10-26 22:36 ` [PATCH 3/3] btf_encoder: Include static functions to BTF data Jiri Olsa
@ 2020-10-27 23:21   ` Andrii Nakryiko
  2020-10-28 15:53     ` Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2020-10-27 23:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo, Frank Ch. Eigler,
	Mark Wielaard

On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Removing the condition to skip static functions.
>
> Getting extra 23k functions on my kernel .config:
>
>            nr     .BTF size (bytes)
>   before:  23291  3342279
>    after:  46606  4361045

almost exactly 2x... such coincidences make me nervous ;)

>
> The BTF section size increased of about 1MB.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  btf_encoder.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 99b9abe36993..03a4bef11947 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -485,9 +485,6 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
>                 int btf_fnproto_id, btf_fn_id;
>                 const char *name;
>
> -               if (!fn->external)
> -                       continue;
> -
>                 /*
>                  * We need to generate just single BTF instance for the
>                  * function, while DWARF data contains multiple instances
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function
  2020-10-27 23:12   ` Andrii Nakryiko
@ 2020-10-27 23:54     ` Hao Luo
  2020-10-28 15:51     ` Jiri Olsa
  1 sibling, 0 replies; 13+ messages in thread
From: Hao Luo @ 2020-10-27 23:54 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Frank Ch. Eigler, Mark Wielaard

Agree with Andrii. The function name 'config' is too generic and
'config_per_var' is confusing to me. But the rest looked good. Thanks
for taking a look at this.

Hao


On Tue, Oct 27, 2020 at 4:12 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Moving find_all_percpu_vars under generic onfig function
> > that walks over symbols and calls config_percpu_var.
> >
> > We will add another config function that needs to go
> > through all the symbols, so it's better they go through
> > them just once.
> >
> > There's no functional change intended.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  btf_encoder.c | 126 ++++++++++++++++++++++++++------------------------
> >  1 file changed, 66 insertions(+), 60 deletions(-)
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 2a6455be4c52..2dd26c904039 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -250,7 +250,64 @@ static bool percpu_var_exists(uint64_t addr, uint32_t *sz, const char **name)
> >         return true;
> >  }
> >
> > -static int find_all_percpu_vars(struct btf_elf *btfe)
> > +static int config_percpu_var(struct btf_elf *btfe, GElf_Sym *sym)
>
> I find the "config" name completely misleading. How about
> "collect_percpu_var" or something along those lines?
>
> > +{
> > +       const char *sym_name;
> > +       uint64_t addr;
> > +       uint32_t size;
> > +
>
> [...]
>
> > +}
> > +
> > +static int config(struct btf_elf *btfe, bool do_percpu_vars)
>
> same here, config is generic and misrepresenting what we are doing
> here. E.g., collect_symbols would probably be more clear.
>
> >  {
> >         uint32_t core_id;
> >         GElf_Sym sym;
>
> [...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 0/3] pahole: Workaround dwarf bug for function encoding
  2020-10-27 23:13 ` [RFC 0/3] pahole: Workaround dwarf bug for function encoding Andrii Nakryiko
@ 2020-10-28 15:49   ` Jiri Olsa
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-10-28 15:49 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo,
	Frank Ch. Eigler, Mark Wielaard

On Tue, Oct 27, 2020 at 04:13:46PM -0700, Andrii Nakryiko wrote:
> On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > hi,
> > because of gcc bug [1] we can no longer rely on DW_AT_declaration
> > attribute to filter out declarations and end up with just
> > one copy of the function in the BTF data.
> >
> > It seems this bug is not easy to fix, but regardless if the
> > it's coming soon, it's probably good idea not to depend so
> > much only on dwarf data and make some extra checks.
> >
> > Thus for function encoding we are now doing following checks:
> >   - argument names are defined for the function
> >   - there's symbol and address defined for the function
> >   - function is generated only once
> >
> > These checks ensure that we encode function with defined
> > symbol/address and argument names.
> >
> > I marked this post as RFC, because with this workaround in
> > place we are also encoding assembly functions, which were
> > not present when using the previous gcc version.
> >
> > Full functions diff to previous gcc working version:
> >
> >   http://people.redhat.com/~jolsa/functions.diff.txt
> >
> > I'm not sure this does not break some rule for functions in
> > BTF data, becuse those assembly functions are not attachable
> > by bpf trampolines, so I don't think there's any use for them.
> 
> What will happen if we do try to attach to those assembly functions?
> Will there be some corruption or crash, or will it just fail and

the attach code checks for the __fentry__ nop,
so it will fail probably with EBUSY

> return error cleanly? What we actually want in BTF is all the
> functions that are attachable through BPF trampoline, which is all the
> functions that ftrace subsystem can attach to, right? So how does
> ftrace system know what can or cannot be attached to?

not sure, I think it records all the functions with
__fentry__ calls, perhaps we could take these records
as base for FUNCs, I'll check

jirka

> 
> >
> > thoughts?
> > jirka
> >
> >
> > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > ---
> > Jiri Olsa (3):
> >       btf_encoder: Move find_all_percpu_vars in generic config function
> >       btf_encoder: Change functions check due to broken dwarf
> >       btf_encoder: Include static functions to BTF data
> >
> >  btf_encoder.c | 221 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------
> >  elf_symtab.h  |   8 +++++
> >  2 files changed, 170 insertions(+), 59 deletions(-)
> >
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf
  2020-10-27 23:20   ` Andrii Nakryiko
@ 2020-10-28 15:50     ` Jiri Olsa
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-10-28 15:50 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo,
	Frank Ch. Eigler, Mark Wielaard

On Tue, Oct 27, 2020 at 04:20:10PM -0700, Andrii Nakryiko wrote:
> On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > We need to generate just single BTF instance for the
> > function, while DWARF data contains multiple instances
> > of DW_TAG_subprogram tag.
> >
> > Unfortunately we can no longer rely on DW_AT_declaration
> > tag (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060)
> >
> > Instead we apply following checks:
> >   - argument names are defined for the function
> >   - there's symbol and address defined for the function
> >   - function is generated only once
> >
> > They might be slightly superfluous together, but it's
> > better to be ready for another DWARF mishap.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  btf_encoder.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  elf_symtab.h  |   8 ++++
> >  2 files changed, 109 insertions(+), 1 deletion(-)
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 2dd26c904039..99b9abe36993 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -26,6 +26,62 @@
> >   */
> >  #define KSYM_NAME_LEN 128
> >
> > +struct elf_function {
> > +       const char *name;
> > +       bool generated;
> > +};
> > +
> > +static struct elf_function *functions;
> > +static int functions_alloc;
> > +static int functions_cnt;
> > +
> > +static int functions_cmp(const void *_a, const void *_b)
> > +{
> > +       const struct elf_function *a = _a;
> > +       const struct elf_function *b = _b;
> > +
> > +       return strcmp(a->name, b->name);
> > +}
> > +
> > +static void delete_functions(void)
> > +{
> > +       free(functions);
> > +       functions_alloc = functions_cnt = 0;
> > +}
> > +
> > +static int config_function(struct btf_elf *btfe, GElf_Sym *sym)
> > +{
> > +       if (!elf_sym__is_function(sym))
> > +               return 0;
> > +       if (!elf_sym__value(sym))
> > +               return 0;
> > +
> > +       if (functions_cnt == functions_alloc) {
> > +               functions_alloc += 5000;
> 
> maybe just do a conventional exponential size increase? Not
> necessarily * 2, could be (* 3 / 2) or (* 4 / 3), libbpf uses such
> approach.

ok, will change

> 
> > +               functions = realloc(functions, functions_alloc * sizeof(*functions));
> > +               if (!functions)
> > +                       return -1;
> > +       }
> > +
> > +       functions[functions_cnt].name = elf_sym__name(sym, btfe->symtab);
> > +       functions_cnt++;
> > +       return 0;
> > +}
> > +
> 
> [...]
> 
> > diff --git a/elf_symtab.h b/elf_symtab.h
> > index 359add69c8ab..094ec4683d01 100644
> > --- a/elf_symtab.h
> > +++ b/elf_symtab.h
> > @@ -63,6 +63,14 @@ static inline uint64_t elf_sym__value(const GElf_Sym *sym)
> >         return sym->st_value;
> >  }
> >
> > +static inline int elf_sym__is_function(const GElf_Sym *sym)
> > +{
> > +       return (elf_sym__type(sym) == STT_FUNC ||
> > +               elf_sym__type(sym) == STT_GNU_IFUNC) &&
> 
> Why do we need to collect STT_GNU_IFUNC? That is some PLT special
> magic, does the kernel use that? Even if it does, are we even able to
> attach to that? Could that remove some of the assembly functions?

I missed that when I copied that function from perf ;-) I'll check

jirka

> 
> > +               sym->st_name != 0 &&
> > +               sym->st_shndx != SHN_UNDEF;
> > +}
> > +
> >  static inline bool elf_sym__is_local_function(const GElf_Sym *sym)
> >  {
> >         return elf_sym__type(sym) == STT_FUNC &&
> > --
> > 2.26.2
> >
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function
  2020-10-27 23:12   ` Andrii Nakryiko
  2020-10-27 23:54     ` Hao Luo
@ 2020-10-28 15:51     ` Jiri Olsa
  1 sibling, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-10-28 15:51 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo,
	Frank Ch. Eigler, Mark Wielaard

On Tue, Oct 27, 2020 at 04:12:04PM -0700, Andrii Nakryiko wrote:
> On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Moving find_all_percpu_vars under generic onfig function
> > that walks over symbols and calls config_percpu_var.
> >
> > We will add another config function that needs to go
> > through all the symbols, so it's better they go through
> > them just once.
> >
> > There's no functional change intended.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  btf_encoder.c | 126 ++++++++++++++++++++++++++------------------------
> >  1 file changed, 66 insertions(+), 60 deletions(-)
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 2a6455be4c52..2dd26c904039 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -250,7 +250,64 @@ static bool percpu_var_exists(uint64_t addr, uint32_t *sz, const char **name)
> >         return true;
> >  }
> >
> > -static int find_all_percpu_vars(struct btf_elf *btfe)
> > +static int config_percpu_var(struct btf_elf *btfe, GElf_Sym *sym)
> 
> I find the "config" name completely misleading. How about
> "collect_percpu_var" or something along those lines?

ok

> 
> > +{
> > +       const char *sym_name;
> > +       uint64_t addr;
> > +       uint32_t size;
> > +
> 
> [...]
> 
> > +}
> > +
> > +static int config(struct btf_elf *btfe, bool do_percpu_vars)
> 
> same here, config is generic and misrepresenting what we are doing
> here. E.g., collect_symbols would probably be more clear.

ook

jirka

> 
> >  {
> >         uint32_t core_id;
> >         GElf_Sym sym;
> 
> [...]
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] btf_encoder: Include static functions to BTF data
  2020-10-27 23:21   ` Andrii Nakryiko
@ 2020-10-28 15:53     ` Jiri Olsa
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2020-10-28 15:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo,
	Frank Ch. Eigler, Mark Wielaard

On Tue, Oct 27, 2020 at 04:21:14PM -0700, Andrii Nakryiko wrote:
> On Mon, Oct 26, 2020 at 5:07 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Removing the condition to skip static functions.
> >
> > Getting extra 23k functions on my kernel .config:
> >
> >            nr     .BTF size (bytes)
> >   before:  23291  3342279
> >    after:  46606  4361045
> 
> almost exactly 2x... such coincidences make me nervous ;)

hum, nice.. 'new' functions looked good, I'll make the
full list to be sure

jirka

> 
> >
> > The BTF section size increased of about 1MB.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  btf_encoder.c | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 99b9abe36993..03a4bef11947 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -485,9 +485,6 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> >                 int btf_fnproto_id, btf_fn_id;
> >                 const char *name;
> >
> > -               if (!fn->external)
> > -                       continue;
> > -
> >                 /*
> >                  * We need to generate just single BTF instance for the
> >                  * function, while DWARF data contains multiple instances
> > --
> > 2.26.2
> >
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-10-28 21:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-26 22:36 [RFC 0/3] pahole: Workaround dwarf bug for function encoding Jiri Olsa
2020-10-26 22:36 ` [PATCH 1/3] btf_encoder: Move find_all_percpu_vars in generic config function Jiri Olsa
2020-10-27 23:12   ` Andrii Nakryiko
2020-10-27 23:54     ` Hao Luo
2020-10-28 15:51     ` Jiri Olsa
2020-10-26 22:36 ` [PATCH 2/3] btf_encoder: Change functions check due to broken dwarf Jiri Olsa
2020-10-27 23:20   ` Andrii Nakryiko
2020-10-28 15:50     ` Jiri Olsa
2020-10-26 22:36 ` [PATCH 3/3] btf_encoder: Include static functions to BTF data Jiri Olsa
2020-10-27 23:21   ` Andrii Nakryiko
2020-10-28 15:53     ` Jiri Olsa
2020-10-27 23:13 ` [RFC 0/3] pahole: Workaround dwarf bug for function encoding Andrii Nakryiko
2020-10-28 15:49   ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).