bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/3] btf_encoder: Fix functions BTF data generation
@ 2020-11-12 15:05 Jiri Olsa
  2020-11-12 15:05 ` [RFC/PATCH 1/3] btf_encoder: Generate also .init functions Jiri Olsa
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jiri Olsa @ 2020-11-12 15:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo

hi,
recent btf encoder's changes brakes BTF data for some gcc
versions. The problem is that some functions can appear
in dwarf data in some instances without arguments, while
they are defined with some.

I tried the approach I described in my former email and
basically process all dwarf data first and collect args
before we generate any BTF function.

I had to change LSK__DELETE to LSK__KEEPIT for every
CU we process, so that might have some implications
that I still need to check.

Andrii,
could you please check this with your gcc?

thanks,
jirka


---
Jiri Olsa (3):
      btf_encoder: Generate also .init functions
      btf_encoder: Put function generation code to generate_func
      btf_encoder: Func generation fix

 btf_encoder.c | 177 +++++++++++++++++++++++++++++++++++----------------------------------
 pahole.c      |   2 +-
 2 files changed, 91 insertions(+), 88 deletions(-)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC/PATCH 1/3] btf_encoder: Generate also .init functions
  2020-11-12 15:05 [RFC 0/3] btf_encoder: Fix functions BTF data generation Jiri Olsa
@ 2020-11-12 15:05 ` Jiri Olsa
  2020-11-12 19:37   ` Andrii Nakryiko
  2020-11-12 15:05 ` [RFC/PATCH 2/3] btf_encoder: Put function generation code to generate_func Jiri Olsa
  2020-11-12 15:05 ` [RFC/PATCH 3/3] btf_encoder: Func generation fix Jiri Olsa
  2 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2020-11-12 15:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo

Currently we skip functions under .init* sections, Removing the .init*
section check, BTF now contains also functions from .init* sections.

Andrii's explanation from email:

> ...                  I think we should just drop the __init check and
> include all the __init functions into BTF. There could be cases where
> we'd need to attach BPF programs to __init functions (e.g., bpf_lsm
> security cases), so having BTFs for those FUNCs are necessary as well.
> Ftrace currently disallows that, but it's only because no user-space
> application has a way to attach probes early enough. This might change
> in the future, so there is no need to invent special mechanisms now
> for bpf_iter function preservation. Let's just include all __init
> functions in BTF.

It's over ~2000 functions on my .config:

   $ bpftool btf dump file ./vmlinux | grep 'FUNC ' | wc -l
   41505
   $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep 'FUNC ' | wc -l
   39256

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 btf_encoder.c | 43 ++-----------------------------------------
 1 file changed, 2 insertions(+), 41 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 9b93e9963727..d531651b1e9e 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -29,10 +29,6 @@
 struct funcs_layout {
 	unsigned long mcount_start;
 	unsigned long mcount_stop;
-	unsigned long init_begin;
-	unsigned long init_end;
-	unsigned long init_bpf_begin;
-	unsigned long init_bpf_end;
 	unsigned long mcount_sec_idx;
 };
 
@@ -104,16 +100,6 @@ static int addrs_cmp(const void *_a, const void *_b)
 	return *a < *b ? -1 : 1;
 }
 
-static bool is_init(struct funcs_layout *fl, unsigned long addr)
-{
-	return addr >= fl->init_begin && addr < fl->init_end;
-}
-
-static bool is_bpf_init(struct funcs_layout *fl, unsigned long addr)
-{
-	return addr >= fl->init_bpf_begin && addr < fl->init_bpf_end;
-}
-
 static int filter_functions(struct btf_elf *btfe, struct funcs_layout *fl)
 {
 	unsigned long *addrs, count, offset, i;
@@ -155,18 +141,11 @@ static int filter_functions(struct btf_elf *btfe, struct funcs_layout *fl)
 
 	/*
 	 * Let's got through all collected functions and filter
-	 * out those that are not in ftrace and init code.
+	 * out those that are not in ftrace.
 	 */
 	for (i = 0; i < functions_cnt; i++) {
 		struct elf_function *func = &functions[i];
 
-		/*
-		 * Do not enable .init section functions,
-		 * but keep .init.bpf.preserve_type functions.
-		 */
-		if (is_init(fl, func->addr) && !is_bpf_init(fl, func->addr))
-			continue;
-
 		/* Make sure function is within ftrace addresses. */
 		if (bsearch(&func->addr, addrs, count, sizeof(addrs[0]), addrs_cmp)) {
 			/*
@@ -493,29 +472,11 @@ static void collect_symbol(GElf_Sym *sym, struct funcs_layout *fl)
 	if (!fl->mcount_stop &&
 	    !strcmp("__stop_mcount_loc", elf_sym__name(sym, btfe->symtab)))
 		fl->mcount_stop = sym->st_value;
-
-	if (!fl->init_begin &&
-	    !strcmp("__init_begin", elf_sym__name(sym, btfe->symtab)))
-		fl->init_begin = sym->st_value;
-
-	if (!fl->init_end &&
-	    !strcmp("__init_end", elf_sym__name(sym, btfe->symtab)))
-		fl->init_end = sym->st_value;
-
-	if (!fl->init_bpf_begin &&
-	    !strcmp("__init_bpf_preserve_type_begin", elf_sym__name(sym, btfe->symtab)))
-		fl->init_bpf_begin = sym->st_value;
-
-	if (!fl->init_bpf_end &&
-	    !strcmp("__init_bpf_preserve_type_end", elf_sym__name(sym, btfe->symtab)))
-		fl->init_bpf_end = sym->st_value;
 }
 
 static int has_all_symbols(struct funcs_layout *fl)
 {
-	return fl->mcount_start && fl->mcount_stop &&
-	       fl->init_begin && fl->init_end &&
-	       fl->init_bpf_begin && fl->init_bpf_end;
+	return fl->mcount_start && fl->mcount_stop;
 }
 
 static int collect_symbols(struct btf_elf *btfe, bool collect_percpu_vars)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC/PATCH 2/3] btf_encoder: Put function generation code to generate_func
  2020-11-12 15:05 [RFC 0/3] btf_encoder: Fix functions BTF data generation Jiri Olsa
  2020-11-12 15:05 ` [RFC/PATCH 1/3] btf_encoder: Generate also .init functions Jiri Olsa
@ 2020-11-12 15:05 ` Jiri Olsa
  2020-11-12 19:39   ` Andrii Nakryiko
  2020-11-12 15:05 ` [RFC/PATCH 3/3] btf_encoder: Func generation fix Jiri Olsa
  2 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2020-11-12 15:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo

We will use generate_func from another place in following change.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 btf_encoder.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index d531651b1e9e..efc4f48dbc5a 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -351,6 +351,23 @@ static struct btf_elf *btfe;
 static uint32_t array_index_id;
 static bool has_index_type;
 
+static int generate_func(struct btf_elf *btfe, struct cu *cu,
+			 struct function *fn, uint32_t type_id_off)
+{
+	int btf_fnproto_id, btf_fn_id, err = 0;
+	const char *name;
+
+	btf_fnproto_id = btf_elf__add_func_proto(btfe, cu, &fn->proto, type_id_off);
+	name = dwarves__active_loader->strings__ptr(cu, fn->name);
+	btf_fn_id = btf_elf__add_ref_type(btfe, BTF_KIND_FUNC, btf_fnproto_id, name, false);
+	if (btf_fnproto_id < 0 || btf_fn_id < 0) {
+		err = -1;
+		printf("error: failed to encode function '%s'\n", function__name(fn, cu));
+	}
+
+	return err;
+}
+
 int btf_encoder__encode()
 {
 	int err;
@@ -608,9 +625,6 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 	}
 
 	cu__for_each_function(cu, core_id, fn) {
-		int btf_fnproto_id, btf_fn_id;
-		const char *name;
-
 		/*
 		 * The functions_cnt != 0 means we parsed all necessary
 		 * kernel symbols and we are using ftrace location filter
@@ -634,14 +648,8 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 				continue;
 		}
 
-		btf_fnproto_id = btf_elf__add_func_proto(btfe, cu, &fn->proto, type_id_off);
-		name = dwarves__active_loader->strings__ptr(cu, fn->name);
-		btf_fn_id = btf_elf__add_ref_type(btfe, BTF_KIND_FUNC, btf_fnproto_id, name, false);
-		if (btf_fnproto_id < 0 || btf_fn_id < 0) {
-			err = -1;
-			printf("error: failed to encode function '%s'\n", function__name(fn, cu));
+		if (generate_func(btfe, cu, fn, type_id_off))
 			goto out;
-		}
 	}
 
 	if (skip_encoding_vars)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-12 15:05 [RFC 0/3] btf_encoder: Fix functions BTF data generation Jiri Olsa
  2020-11-12 15:05 ` [RFC/PATCH 1/3] btf_encoder: Generate also .init functions Jiri Olsa
  2020-11-12 15:05 ` [RFC/PATCH 2/3] btf_encoder: Put function generation code to generate_func Jiri Olsa
@ 2020-11-12 15:05 ` Jiri Olsa
  2020-11-12 19:54   ` Andrii Nakryiko
  2 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2020-11-12 15:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Hao Luo

Recent btf encoder's changes brakes BTF data for some gcc versions.

The problem is that some functions can appear in dwarf data in some
instances without arguments, while they are defined with some.

Current code will record 'no arguments' for such functions and they
disregard the rest of the DWARF data claiming otherwise.

This patch changes the BTF function generation, so that in the main
cu__encode_btf processing we do not generate any BTF function code,
but only collect functions 'to generate' and update their arguments.

When we process the whole data, we go through the functions and
generate its BTD data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 btf_encoder.c | 110 +++++++++++++++++++++++++++++++++-----------------
 pahole.c      |   2 +-
 2 files changed, 73 insertions(+), 39 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index efc4f48dbc5a..46cb7e6f5abe 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -35,7 +35,10 @@ struct funcs_layout {
 struct elf_function {
 	const char	*name;
 	unsigned long	 addr;
-	bool		 generated;
+	struct cu	*cu;
+	struct function *fn;
+	int		 args_cnt;
+	uint32_t	 type_id_off;
 };
 
 static struct elf_function *functions;
@@ -64,6 +67,7 @@ static void delete_functions(void)
 static int collect_function(struct btf_elf *btfe, GElf_Sym *sym)
 {
 	struct elf_function *new;
+	char *name;
 
 	if (elf_sym__type(sym) != STT_FUNC)
 		return 0;
@@ -83,9 +87,20 @@ static int collect_function(struct btf_elf *btfe, GElf_Sym *sym)
 		functions = new;
 	}
 
-	functions[functions_cnt].name = elf_sym__name(sym, btfe->symtab);
+	/*
+	 * At the time we process functions,
+	 * elf object might be already released.
+	 */
+	name = strdup(elf_sym__name(sym, btfe->symtab));
+	if (!name)
+		return -1;
+
+	functions[functions_cnt].name = name;
 	functions[functions_cnt].addr = elf_sym__value(sym);
-	functions[functions_cnt].generated = false;
+	functions[functions_cnt].fn = NULL;
+	functions[functions_cnt].cu = NULL;
+	functions[functions_cnt].args_cnt = 0;
+	functions[functions_cnt].type_id_off = 0;
 	functions_cnt++;
 	return 0;
 }
@@ -164,20 +179,6 @@ static int filter_functions(struct btf_elf *btfe, struct funcs_layout *fl)
 	return 0;
 }
 
-static bool should_generate_function(const struct btf_elf *btfe, const char *name)
-{
-	struct elf_function *p;
-	struct elf_function key = { .name = name };
-
-	p = bsearch(&key, functions, functions_cnt,
-		    sizeof(functions[0]), functions_cmp);
-	if (!p || p->generated)
-		return false;
-
-	p->generated = true;
-	return true;
-}
-
 static bool btf_name_char_ok(char c, bool first)
 {
 	if (c == '_' || c == '.')
@@ -368,6 +369,21 @@ static int generate_func(struct btf_elf *btfe, struct cu *cu,
 	return err;
 }
 
+static int process_functions(struct btf_elf *btfe)
+{
+	unsigned long i;
+
+	for (i = 0; i < functions_cnt; i++) {
+		struct elf_function *func = &functions[i];
+
+		if (!func->fn)
+			continue;
+		if (generate_func(btfe, func->cu, func->fn, func->type_id_off))
+			return -1;
+	}
+	return 0;
+}
+
 int btf_encoder__encode()
 {
 	int err;
@@ -375,7 +391,9 @@ int btf_encoder__encode()
 	if (gobuffer__size(&btfe->percpu_secinfo) != 0)
 		btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
 
-	err = btf_elf__encode(btfe, 0);
+	err = process_functions(btfe);
+	if (!err)
+		err = btf_elf__encode(btfe, 0);
 	delete_functions();
 	btf_elf__delete(btfe);
 	btfe = NULL;
@@ -539,15 +557,17 @@ static int collect_symbols(struct btf_elf *btfe, bool collect_percpu_vars)
 	return 0;
 }
 
-static bool has_arg_names(struct cu *cu, struct ftype *ftype)
+static bool has_arg_names(struct cu *cu, struct ftype *ftype, int *args_cnt)
 {
 	struct parameter *param;
 	const char *name;
 
+	*args_cnt = 0;
 	ftype__for_each_parameter(ftype, param) {
 		name = dwarves__active_loader->strings__ptr(cu, param->name);
 		if (name == NULL)
 			return false;
+		++*args_cnt;
 	}
 	return true;
 }
@@ -624,32 +644,46 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 		has_index_type = true;
 	}
 
-	cu__for_each_function(cu, core_id, fn) {
-		/*
-		 * The functions_cnt != 0 means we parsed all necessary
-		 * kernel symbols and we are using ftrace location filter
-		 * for functions. If it's not available keep the current
-		 * dwarf declaration check.
-		 */
-		if (functions_cnt) {
+	/*
+	 * The functions_cnt != 0 means we parsed all necessary
+	 * kernel symbols and we are using ftrace location filter
+	 * for functions. If it's not available keep the current
+	 * dwarf declaration check.
+	 */
+	if (functions_cnt) {
+		cu__for_each_function(cu, core_id, fn) {
+			struct elf_function *p;
+			struct elf_function key = { .name = function__name(fn, cu) };
+			int args_cnt = 0;
+
 			/*
-			 * We check following conditions:
-			 *   - argument names are defined
-			 *   - there's symbol and address defined for the function
-			 *   - function address belongs to ftrace locations
-			 *   - function is generated only once
+			 * Collect functions that match ftrace filter
+			 * and pick the one with proper argument names.
+			 * The BTF generation happens at the end in
+			 * btf_encoder__encode function.
 			 */
-			if (!has_arg_names(cu, &fn->proto))
+			p = bsearch(&key, functions, functions_cnt,
+				    sizeof(functions[0]), functions_cmp);
+			if (!p)
 				continue;
-			if (!should_generate_function(btfe, function__name(fn, cu)))
+
+			if (!has_arg_names(cu, &fn->proto, &args_cnt))
 				continue;
-		} else {
+
+			if (!p->fn || args_cnt > p->args_cnt) {
+				p->fn = fn;
+				p->cu = cu;
+				p->args_cnt = args_cnt;
+				p->type_id_off = type_id_off;
+			}
+		}
+	} else {
+		cu__for_each_function(cu, core_id, fn) {
 			if (fn->declaration || !fn->external)
 				continue;
+			if (generate_func(btfe, cu, fn, type_id_off))
+				goto out;
 		}
-
-		if (generate_func(btfe, cu, fn, type_id_off))
-			goto out;
 	}
 
 	if (skip_encoding_vars)
diff --git a/pahole.c b/pahole.c
index fca27148e0bb..d6165d4164dd 100644
--- a/pahole.c
+++ b/pahole.c
@@ -2392,7 +2392,7 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
 			fprintf(stderr, "Encountered error while encoding BTF.\n");
 			exit(1);
 		}
-		return LSK__DELETE;
+		return LSK__KEEPIT;
 	}
 
 	if (ctf_encode) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 1/3] btf_encoder: Generate also .init functions
  2020-11-12 15:05 ` [RFC/PATCH 1/3] btf_encoder: Generate also .init functions Jiri Olsa
@ 2020-11-12 19:37   ` Andrii Nakryiko
  0 siblings, 0 replies; 15+ messages in thread
From: Andrii Nakryiko @ 2020-11-12 19:37 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 7:05 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Currently we skip functions under .init* sections, Removing the .init*
> section check, BTF now contains also functions from .init* sections.
>
> Andrii's explanation from email:
>
> > ...                  I think we should just drop the __init check and
> > include all the __init functions into BTF. There could be cases where
> > we'd need to attach BPF programs to __init functions (e.g., bpf_lsm
> > security cases), so having BTFs for those FUNCs are necessary as well.
> > Ftrace currently disallows that, but it's only because no user-space
> > application has a way to attach probes early enough. This might change
> > in the future, so there is no need to invent special mechanisms now
> > for bpf_iter function preservation. Let's just include all __init
> > functions in BTF.
>
> It's over ~2000 functions on my .config:
>
>    $ bpftool btf dump file ./vmlinux | grep 'FUNC ' | wc -l
>    41505
>    $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep 'FUNC ' | wc -l
>    39256
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---

Looks good.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  btf_encoder.c | 43 ++-----------------------------------------
>  1 file changed, 2 insertions(+), 41 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 2/3] btf_encoder: Put function generation code to generate_func
  2020-11-12 15:05 ` [RFC/PATCH 2/3] btf_encoder: Put function generation code to generate_func Jiri Olsa
@ 2020-11-12 19:39   ` Andrii Nakryiko
  0 siblings, 0 replies; 15+ messages in thread
From: Andrii Nakryiko @ 2020-11-12 19:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 7:05 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> We will use generate_func from another place in following change.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  btf_encoder.c | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index d531651b1e9e..efc4f48dbc5a 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -351,6 +351,23 @@ static struct btf_elf *btfe;
>  static uint32_t array_index_id;
>  static bool has_index_type;
>
> +static int generate_func(struct btf_elf *btfe, struct cu *cu,
> +                        struct function *fn, uint32_t type_id_off)
> +{
> +       int btf_fnproto_id, btf_fn_id, err = 0;

btf_ prefix for these variables don't contribute anything, I'd just
drop them here

> +       const char *name;
> +
> +       btf_fnproto_id = btf_elf__add_func_proto(btfe, cu, &fn->proto, type_id_off);
> +       name = dwarves__active_loader->strings__ptr(cu, fn->name);
> +       btf_fn_id = btf_elf__add_ref_type(btfe, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> +       if (btf_fnproto_id < 0 || btf_fn_id < 0) {
> +               err = -1;
> +               printf("error: failed to encode function '%s'\n", function__name(fn, cu));

return -1;

> +       }
> +
> +       return err;

return 0; drop err variable.

> +}
> +
>  int btf_encoder__encode()
>  {
>         int err;
> @@ -608,9 +625,6 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
>         }
>
>         cu__for_each_function(cu, core_id, fn) {
> -               int btf_fnproto_id, btf_fn_id;
> -               const char *name;
> -
>                 /*
>                  * The functions_cnt != 0 means we parsed all necessary
>                  * kernel symbols and we are using ftrace location filter
> @@ -634,14 +648,8 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
>                                 continue;
>                 }
>
> -               btf_fnproto_id = btf_elf__add_func_proto(btfe, cu, &fn->proto, type_id_off);
> -               name = dwarves__active_loader->strings__ptr(cu, fn->name);
> -               btf_fn_id = btf_elf__add_ref_type(btfe, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> -               if (btf_fnproto_id < 0 || btf_fn_id < 0) {
> -                       err = -1;
> -                       printf("error: failed to encode function '%s'\n", function__name(fn, cu));
> +               if (generate_func(btfe, cu, fn, type_id_off))
>                         goto out;
> -               }
>         }
>
>         if (skip_encoding_vars)
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-12 15:05 ` [RFC/PATCH 3/3] btf_encoder: Func generation fix Jiri Olsa
@ 2020-11-12 19:54   ` Andrii Nakryiko
  2020-11-12 21:14     ` Jiri Olsa
  0 siblings, 1 reply; 15+ messages in thread
From: Andrii Nakryiko @ 2020-11-12 19:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 7:05 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Recent btf encoder's changes brakes BTF data for some gcc versions.
>
> The problem is that some functions can appear in dwarf data in some
> instances without arguments, while they are defined with some.
>
> Current code will record 'no arguments' for such functions and they
> disregard the rest of the DWARF data claiming otherwise.
>
> This patch changes the BTF function generation, so that in the main
> cu__encode_btf processing we do not generate any BTF function code,
> but only collect functions 'to generate' and update their arguments.
>
> When we process the whole data, we go through the functions and
> generate its BTD data.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  btf_encoder.c | 110 +++++++++++++++++++++++++++++++++-----------------
>  pahole.c      |   2 +-
>  2 files changed, 73 insertions(+), 39 deletions(-)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index efc4f48dbc5a..46cb7e6f5abe 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -35,7 +35,10 @@ struct funcs_layout {
>  struct elf_function {
>         const char      *name;
>         unsigned long    addr;
> -       bool             generated;
> +       struct cu       *cu;
> +       struct function *fn;
> +       int              args_cnt;
> +       uint32_t         type_id_off;
>  };
>
>  static struct elf_function *functions;
> @@ -64,6 +67,7 @@ static void delete_functions(void)
>  static int collect_function(struct btf_elf *btfe, GElf_Sym *sym)
>  {
>         struct elf_function *new;
> +       char *name;
>
>         if (elf_sym__type(sym) != STT_FUNC)
>                 return 0;
> @@ -83,9 +87,20 @@ static int collect_function(struct btf_elf *btfe, GElf_Sym *sym)
>                 functions = new;
>         }
>
> -       functions[functions_cnt].name = elf_sym__name(sym, btfe->symtab);
> +       /*
> +        * At the time we process functions,
> +        * elf object might be already released.
> +        */
> +       name = strdup(elf_sym__name(sym, btfe->symtab));
> +       if (!name)
> +               return -1;
> +
> +       functions[functions_cnt].name = name;
>         functions[functions_cnt].addr = elf_sym__value(sym);
> -       functions[functions_cnt].generated = false;
> +       functions[functions_cnt].fn = NULL;
> +       functions[functions_cnt].cu = NULL;
> +       functions[functions_cnt].args_cnt = 0;
> +       functions[functions_cnt].type_id_off = 0;
>         functions_cnt++;
>         return 0;
>  }
> @@ -164,20 +179,6 @@ static int filter_functions(struct btf_elf *btfe, struct funcs_layout *fl)
>         return 0;
>  }
>
> -static bool should_generate_function(const struct btf_elf *btfe, const char *name)
> -{
> -       struct elf_function *p;
> -       struct elf_function key = { .name = name };
> -
> -       p = bsearch(&key, functions, functions_cnt,
> -                   sizeof(functions[0]), functions_cmp);
> -       if (!p || p->generated)
> -               return false;
> -
> -       p->generated = true;
> -       return true;
> -}
> -
>  static bool btf_name_char_ok(char c, bool first)
>  {
>         if (c == '_' || c == '.')
> @@ -368,6 +369,21 @@ static int generate_func(struct btf_elf *btfe, struct cu *cu,
>         return err;
>  }
>
> +static int process_functions(struct btf_elf *btfe)
> +{
> +       unsigned long i;
> +
> +       for (i = 0; i < functions_cnt; i++) {
> +               struct elf_function *func = &functions[i];
> +
> +               if (!func->fn)
> +                       continue;
> +               if (generate_func(btfe, func->cu, func->fn, func->type_id_off))
> +                       return -1;
> +       }
> +       return 0;
> +}
> +
>  int btf_encoder__encode()
>  {
>         int err;
> @@ -375,7 +391,9 @@ int btf_encoder__encode()
>         if (gobuffer__size(&btfe->percpu_secinfo) != 0)
>                 btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
>
> -       err = btf_elf__encode(btfe, 0);
> +       err = process_functions(btfe);
> +       if (!err)
> +               err = btf_elf__encode(btfe, 0);
>         delete_functions();
>         btf_elf__delete(btfe);
>         btfe = NULL;
> @@ -539,15 +557,17 @@ static int collect_symbols(struct btf_elf *btfe, bool collect_percpu_vars)
>         return 0;
>  }
>
> -static bool has_arg_names(struct cu *cu, struct ftype *ftype)
> +static bool has_arg_names(struct cu *cu, struct ftype *ftype, int *args_cnt)
>  {
>         struct parameter *param;
>         const char *name;
>
> +       *args_cnt = 0;
>         ftype__for_each_parameter(ftype, param) {
>                 name = dwarves__active_loader->strings__ptr(cu, param->name);
>                 if (name == NULL)
>                         return false;
> +               ++*args_cnt;
>         }
>         return true;
>  }
> @@ -624,32 +644,46 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
>                 has_index_type = true;
>         }
>
> -       cu__for_each_function(cu, core_id, fn) {
> -               /*
> -                * The functions_cnt != 0 means we parsed all necessary
> -                * kernel symbols and we are using ftrace location filter
> -                * for functions. If it's not available keep the current
> -                * dwarf declaration check.
> -                */
> -               if (functions_cnt) {
> +       /*
> +        * The functions_cnt != 0 means we parsed all necessary
> +        * kernel symbols and we are using ftrace location filter
> +        * for functions. If it's not available keep the current
> +        * dwarf declaration check.
> +        */
> +       if (functions_cnt) {
> +               cu__for_each_function(cu, core_id, fn) {
> +                       struct elf_function *p;
> +                       struct elf_function key = { .name = function__name(fn, cu) };
> +                       int args_cnt = 0;
> +
>                         /*
> -                        * We check following conditions:
> -                        *   - argument names are defined
> -                        *   - there's symbol and address defined for the function
> -                        *   - function address belongs to ftrace locations
> -                        *   - function is generated only once
> +                        * Collect functions that match ftrace filter
> +                        * and pick the one with proper argument names.
> +                        * The BTF generation happens at the end in
> +                        * btf_encoder__encode function.
>                          */
> -                       if (!has_arg_names(cu, &fn->proto))
> +                       p = bsearch(&key, functions, functions_cnt,
> +                                   sizeof(functions[0]), functions_cmp);
> +                       if (!p)
>                                 continue;
> -                       if (!should_generate_function(btfe, function__name(fn, cu)))
> +
> +                       if (!has_arg_names(cu, &fn->proto, &args_cnt))

So I can't unfortunately reproduce that GCC bug with DWARF info. What
was exactly the symptom? Maybe you can also share readelf -wi dump for
your problematic vmlinux?

The reason I'm asking is because I wonder if we should still ignore
functions if fn->declaration is set. E.g., for the issue we
investigated yesterday, the function with no arguments has declaration
set to 1, so just ignoring it would solve the problem. I'm wondering
if it's enough to do just that instead of doing this whole delayed
function collection/processing.

Also, I'd imagine the only expected cases where we can override  the
function (args_cnt > p->args_cnt) would be if p->args_cnt == 0, no?
All other cases are either newly discovered "bogusness" of DWARF (and
would be good to know about this) or it's a name collision for
functions. Basically, before we go all the way to rework this again,
let's see if just skipping declarations would be enough?

>                                 continue;
> -               } else {
> +
> +                       if (!p->fn || args_cnt > p->args_cnt) {
> +                               p->fn = fn;
> +                               p->cu = cu;
> +                               p->args_cnt = args_cnt;
> +                               p->type_id_off = type_id_off;
> +                       }
> +               }
> +       } else {
> +               cu__for_each_function(cu, core_id, fn) {
>                         if (fn->declaration || !fn->external)
>                                 continue;
> +                       if (generate_func(btfe, cu, fn, type_id_off))
> +                               goto out;
>                 }

I'm trending towards disliking this completely different fallback
mechanism. It saved bpf-next accidentally, but otherwise obscured the
issue and generally makes testing pahole with artificial binary BTFs
(from test programs) harder. How about we unify approaches, but just
use mcount symbols opportunistically, as an additional filter, if it's
available?

With that, testing that we still handle functions with duplicate names
properly would be trivial (which I suspect we don't and we'll just
keep the one with more args now, right?) And it makes static functions
available for non-vmlinux binaries automatically (might be good or
bad, but still...).

> -
> -               if (generate_func(btfe, cu, fn, type_id_off))
> -                       goto out;
>         }
>
>         if (skip_encoding_vars)
> diff --git a/pahole.c b/pahole.c
> index fca27148e0bb..d6165d4164dd 100644
> --- a/pahole.c
> +++ b/pahole.c
> @@ -2392,7 +2392,7 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
>                         fprintf(stderr, "Encountered error while encoding BTF.\n");
>                         exit(1);
>                 }
> -               return LSK__DELETE;
> +               return LSK__KEEPIT;
>         }
>
>         if (ctf_encode) {
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-12 19:54   ` Andrii Nakryiko
@ 2020-11-12 21:14     ` Jiri Olsa
  2020-11-13  0:08       ` Andrii Nakryiko
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2020-11-12 21:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 11:54:41AM -0800, Andrii Nakryiko wrote:

SNIP

> > @@ -624,32 +644,46 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> >                 has_index_type = true;
> >         }
> >
> > -       cu__for_each_function(cu, core_id, fn) {
> > -               /*
> > -                * The functions_cnt != 0 means we parsed all necessary
> > -                * kernel symbols and we are using ftrace location filter
> > -                * for functions. If it's not available keep the current
> > -                * dwarf declaration check.
> > -                */
> > -               if (functions_cnt) {
> > +       /*
> > +        * The functions_cnt != 0 means we parsed all necessary
> > +        * kernel symbols and we are using ftrace location filter
> > +        * for functions. If it's not available keep the current
> > +        * dwarf declaration check.
> > +        */
> > +       if (functions_cnt) {
> > +               cu__for_each_function(cu, core_id, fn) {
> > +                       struct elf_function *p;
> > +                       struct elf_function key = { .name = function__name(fn, cu) };
> > +                       int args_cnt = 0;
> > +
> >                         /*
> > -                        * We check following conditions:
> > -                        *   - argument names are defined
> > -                        *   - there's symbol and address defined for the function
> > -                        *   - function address belongs to ftrace locations
> > -                        *   - function is generated only once
> > +                        * Collect functions that match ftrace filter
> > +                        * and pick the one with proper argument names.
> > +                        * The BTF generation happens at the end in
> > +                        * btf_encoder__encode function.
> >                          */
> > -                       if (!has_arg_names(cu, &fn->proto))
> > +                       p = bsearch(&key, functions, functions_cnt,
> > +                                   sizeof(functions[0]), functions_cmp);
> > +                       if (!p)
> >                                 continue;
> > -                       if (!should_generate_function(btfe, function__name(fn, cu)))
> > +
> > +                       if (!has_arg_names(cu, &fn->proto, &args_cnt))
> 
> So I can't unfortunately reproduce that GCC bug with DWARF info. What
> was exactly the symptom? Maybe you can also share readelf -wi dump for
> your problematic vmlinux?

hum, can't see -wi working for readelf, however I placed my vmlinux
in here:
  http://people.redhat.com/~jolsa/vmlinux.gz

the symptom is that resolve_btfids will fail kernel build:

  BTFIDS  vmlinux
FAILED unresolved symbol vfs_getattr

because BTF data contains multiple FUNC records for same function

and the problem is that declaration tag itself is missing:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
so pahole won't skip them

the first workaround was to workaround that and go for function
records that have code address attached, but that's buggy as well:
  https://bugzilla.redhat.com/show_bug.cgi?id=1890107

then after some discussions we ended up using ftrace addresses
as filter for what we want to display.. plus we got static functions
as well

however for this way we failed to get proper arguments ;-)

> 
> The reason I'm asking is because I wonder if we should still ignore
> functions if fn->declaration is set. E.g., for the issue we
> investigated yesterday, the function with no arguments has declaration
> set to 1, so just ignoring it would solve the problem. I'm wondering
> if it's enough to do just that instead of doing this whole delayed
> function collection/processing.
> 
> Also, I'd imagine the only expected cases where we can override  the
> function (args_cnt > p->args_cnt) would be if p->args_cnt == 0, no?

I don't know, because originally I'd expect that we would not see
function record with zero args when it actualy has some

> All other cases are either newly discovered "bogusness" of DWARF (and
> would be good to know about this) or it's a name collision for
> functions. Basically, before we go all the way to rework this again,
> let's see if just skipping declarations would be enough?

so there's actualy new developement today in:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060#c14

gcc might actualy get fixed, so I think we could go back to
using declaration tag and use ftrace as additional filter..
at least this exercise gave us static functions ;-)

however we still have fedora with disabled disabled CONFIG_DEBUG_INFO_BTF
and will need to wait for that fix to enable that back

> 
> >                                 continue;
> > -               } else {
> > +
> > +                       if (!p->fn || args_cnt > p->args_cnt) {
> > +                               p->fn = fn;
> > +                               p->cu = cu;
> > +                               p->args_cnt = args_cnt;
> > +                               p->type_id_off = type_id_off;
> > +                       }
> > +               }
> > +       } else {
> > +               cu__for_each_function(cu, core_id, fn) {
> >                         if (fn->declaration || !fn->external)
> >                                 continue;
> > +                       if (generate_func(btfe, cu, fn, type_id_off))
> > +                               goto out;
> >                 }
> 
> I'm trending towards disliking this completely different fallback
> mechanism. It saved bpf-next accidentally, but otherwise obscured the
> issue and generally makes testing pahole with artificial binary BTFs
> (from test programs) harder. How about we unify approaches, but just
> use mcount symbols opportunistically, as an additional filter, if it's
> available?

ok

> 
> With that, testing that we still handle functions with duplicate names
> properly would be trivial (which I suspect we don't and we'll just
> keep the one with more args now, right?) And it makes static functions
> available for non-vmlinux binaries automatically (might be good or
> bad, but still...).

if we keep just the ftrace filter and rely on declaration tag,
the args checking will go away right?

jirka


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-12 21:14     ` Jiri Olsa
@ 2020-11-13  0:08       ` Andrii Nakryiko
  2020-11-13  0:18         ` Alexei Starovoitov
  2020-11-13 10:59         ` Jiri Olsa
  0 siblings, 2 replies; 15+ messages in thread
From: Andrii Nakryiko @ 2020-11-13  0:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 1:14 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Thu, Nov 12, 2020 at 11:54:41AM -0800, Andrii Nakryiko wrote:
>
> SNIP
>
> > > @@ -624,32 +644,46 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > >                 has_index_type = true;
> > >         }
> > >
> > > -       cu__for_each_function(cu, core_id, fn) {
> > > -               /*
> > > -                * The functions_cnt != 0 means we parsed all necessary
> > > -                * kernel symbols and we are using ftrace location filter
> > > -                * for functions. If it's not available keep the current
> > > -                * dwarf declaration check.
> > > -                */
> > > -               if (functions_cnt) {
> > > +       /*
> > > +        * The functions_cnt != 0 means we parsed all necessary
> > > +        * kernel symbols and we are using ftrace location filter
> > > +        * for functions. If it's not available keep the current
> > > +        * dwarf declaration check.
> > > +        */
> > > +       if (functions_cnt) {
> > > +               cu__for_each_function(cu, core_id, fn) {
> > > +                       struct elf_function *p;
> > > +                       struct elf_function key = { .name = function__name(fn, cu) };
> > > +                       int args_cnt = 0;
> > > +
> > >                         /*
> > > -                        * We check following conditions:
> > > -                        *   - argument names are defined
> > > -                        *   - there's symbol and address defined for the function
> > > -                        *   - function address belongs to ftrace locations
> > > -                        *   - function is generated only once
> > > +                        * Collect functions that match ftrace filter
> > > +                        * and pick the one with proper argument names.
> > > +                        * The BTF generation happens at the end in
> > > +                        * btf_encoder__encode function.
> > >                          */
> > > -                       if (!has_arg_names(cu, &fn->proto))
> > > +                       p = bsearch(&key, functions, functions_cnt,
> > > +                                   sizeof(functions[0]), functions_cmp);
> > > +                       if (!p)
> > >                                 continue;
> > > -                       if (!should_generate_function(btfe, function__name(fn, cu)))
> > > +
> > > +                       if (!has_arg_names(cu, &fn->proto, &args_cnt))
> >
> > So I can't unfortunately reproduce that GCC bug with DWARF info. What
> > was exactly the symptom? Maybe you can also share readelf -wi dump for
> > your problematic vmlinux?
>
> hum, can't see -wi working for readelf, however I placed my vmlinux
> in here:
>   http://people.redhat.com/~jolsa/vmlinux.gz
>
> the symptom is that resolve_btfids will fail kernel build:
>
>   BTFIDS  vmlinux
> FAILED unresolved symbol vfs_getattr
>
> because BTF data contains multiple FUNC records for same function
>
> and the problem is that declaration tag itself is missing:
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> so pahole won't skip them
>
> the first workaround was to workaround that and go for function
> records that have code address attached, but that's buggy as well:
>   https://bugzilla.redhat.com/show_bug.cgi?id=1890107
>
> then after some discussions we ended up using ftrace addresses
> as filter for what we want to display.. plus we got static functions
> as well
>
> however for this way we failed to get proper arguments ;-)

Right, I followed along overall, but forgot the details of the initial
problem. Thanks for the refresher. See below for my current thoughts
on dealing with all this.

>
> >
> > The reason I'm asking is because I wonder if we should still ignore
> > functions if fn->declaration is set. E.g., for the issue we
> > investigated yesterday, the function with no arguments has declaration
> > set to 1, so just ignoring it would solve the problem. I'm wondering
> > if it's enough to do just that instead of doing this whole delayed
> > function collection/processing.
> >
> > Also, I'd imagine the only expected cases where we can override  the
> > function (args_cnt > p->args_cnt) would be if p->args_cnt == 0, no?
>
> I don't know, because originally I'd expect that we would not see
> function record with zero args when it actualy has some
>
> > All other cases are either newly discovered "bogusness" of DWARF (and
> > would be good to know about this) or it's a name collision for
> > functions. Basically, before we go all the way to rework this again,
> > let's see if just skipping declarations would be enough?
>
> so there's actualy new developement today in:
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060#c14
>
> gcc might actualy get fixed, so I think we could go back to
> using declaration tag and use ftrace as additional filter..
> at least this exercise gave us static functions ;-)
>
> however we still have fedora with disabled disabled CONFIG_DEBUG_INFO_BTF
> and will need to wait for that fix to enable that back

Right, we better have a more robust approach not relying on
not-yet-released GCC.

>
> >
> > >                                 continue;
> > > -               } else {
> > > +
> > > +                       if (!p->fn || args_cnt > p->args_cnt) {
> > > +                               p->fn = fn;
> > > +                               p->cu = cu;
> > > +                               p->args_cnt = args_cnt;
> > > +                               p->type_id_off = type_id_off;
> > > +                       }
> > > +               }
> > > +       } else {
> > > +               cu__for_each_function(cu, core_id, fn) {
> > >                         if (fn->declaration || !fn->external)
> > >                                 continue;
> > > +                       if (generate_func(btfe, cu, fn, type_id_off))
> > > +                               goto out;
> > >                 }
> >
> > I'm trending towards disliking this completely different fallback
> > mechanism. It saved bpf-next accidentally, but otherwise obscured the
> > issue and generally makes testing pahole with artificial binary BTFs
> > (from test programs) harder. How about we unify approaches, but just
> > use mcount symbols opportunistically, as an additional filter, if it's
> > available?
>
> ok
>
> >
> > With that, testing that we still handle functions with duplicate names
> > properly would be trivial (which I suspect we don't and we'll just
> > keep the one with more args now, right?) And it makes static functions
> > available for non-vmlinux binaries automatically (might be good or
> > bad, but still...).
>
> if we keep just the ftrace filter and rely on declaration tag,
> the args checking will go away right?

So I looked at your vmlinux image. I think we should just keep
everything mostly as it it right now (without changes in this patch),
but add just two simple checks:

1. Skip if fn->declaration (ignore correctly marked func declarations)
2. Skip if DW_AT_inline: 1 (ignore inlined functions).

I'd keep the named arguments check as is, I think it's helpful. 1)
will skip stuff that's explicitly marked as declaration. 2) inline
check will partially mitigate dropping of fn->external check (and we
can't really attach to inlined functions). So will the named args
check as well. Then we should do mcount checks, **if** mcount symbols
are present (which should always be the case for any reasonable
vmlinux image that's supposed to be used with BPF, I think).

So together, it should cover all known issues, I think. And then we'll
just have to watch out for any new ones. GCC bugfix is good, but it's
too late: the harm is done and there are compilers out there that we
should deal with.

>
> jirka
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-13  0:08       ` Andrii Nakryiko
@ 2020-11-13  0:18         ` Alexei Starovoitov
  2020-11-13  0:30           ` Andrii Nakryiko
  2020-11-13 10:59         ` Jiri Olsa
  1 sibling, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2020-11-13  0:18 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 4:08 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> So I looked at your vmlinux image. I think we should just keep
> everything mostly as it it right now (without changes in this patch),
> but add just two simple checks:
>
> 1. Skip if fn->declaration (ignore correctly marked func declarations)
> 2. Skip if DW_AT_inline: 1 (ignore inlined functions).
>
> I'd keep the named arguments check as is, I think it's helpful. 1)
> will skip stuff that's explicitly marked as declaration. 2) inline
> check will partially mitigate dropping of fn->external check (and we
> can't really attach to inlined functions).

I thought DW_AT_inline is an indication that the function was marked "inline"
in C code. That doesn't mean that the function was actually inlined.
So I don't think pahole should check that bit.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-13  0:18         ` Alexei Starovoitov
@ 2020-11-13  0:30           ` Andrii Nakryiko
  2020-11-13  1:00             ` Yonghong Song
  0 siblings, 1 reply; 15+ messages in thread
From: Andrii Nakryiko @ 2020-11-13  0:30 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 4:19 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Nov 12, 2020 at 4:08 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > So I looked at your vmlinux image. I think we should just keep
> > everything mostly as it it right now (without changes in this patch),
> > but add just two simple checks:
> >
> > 1. Skip if fn->declaration (ignore correctly marked func declarations)
> > 2. Skip if DW_AT_inline: 1 (ignore inlined functions).
> >
> > I'd keep the named arguments check as is, I think it's helpful. 1)
> > will skip stuff that's explicitly marked as declaration. 2) inline
> > check will partially mitigate dropping of fn->external check (and we
> > can't really attach to inlined functions).
>
> I thought DW_AT_inline is an indication that the function was marked "inline"
> in C code. That doesn't mean that the function was actually inlined.
> So I don't think pahole should check that bit.

According to DWARF spec, there are 4 possible values:

DW_INL_not_inlined = 0            Not declared inline nor inlined by
the compiler
DW_INL_inlined = 1                Not declared inline but inlined by
the compiler
DW_INL_declared_not_inlined = 2   Declared inline but not inlined by
the compiler
DW_INL_declared_inlined = 3       Declared inline and inlined by the compiler

So DW_INL_inlined is supposed to be added to functions that are not
marked inline, but were nevertheless inlined. I saw this for one of
vfs_getattr entries in DWARF, which clearly is not marked inline.

But also that DWARF entry had proper args with names, so it would work
fine as well. I don't know, with DWARF it's always some guessing game.
Let's leave DW_AT_inline alone for now.

Important part is skipping declarations (when they are marked as
such), though I'm not claiming it will solve the problem completely...
:)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-13  0:30           ` Andrii Nakryiko
@ 2020-11-13  1:00             ` Yonghong Song
  2020-11-13  1:12               ` Andrii Nakryiko
  0 siblings, 1 reply; 15+ messages in thread
From: Yonghong Song @ 2020-11-13  1:00 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: Jiri Olsa, Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Hao Luo



On 11/12/20 4:30 PM, Andrii Nakryiko wrote:
> On Thu, Nov 12, 2020 at 4:19 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Thu, Nov 12, 2020 at 4:08 PM Andrii Nakryiko
>> <andrii.nakryiko@gmail.com> wrote:
>>>
>>> So I looked at your vmlinux image. I think we should just keep
>>> everything mostly as it it right now (without changes in this patch),
>>> but add just two simple checks:
>>>
>>> 1. Skip if fn->declaration (ignore correctly marked func declarations)
>>> 2. Skip if DW_AT_inline: 1 (ignore inlined functions).
>>>
>>> I'd keep the named arguments check as is, I think it's helpful. 1)
>>> will skip stuff that's explicitly marked as declaration. 2) inline
>>> check will partially mitigate dropping of fn->external check (and we
>>> can't really attach to inlined functions).
>>
>> I thought DW_AT_inline is an indication that the function was marked "inline"
>> in C code. That doesn't mean that the function was actually inlined.
>> So I don't think pahole should check that bit.
> 
> According to DWARF spec, there are 4 possible values:
> 
> DW_INL_not_inlined = 0            Not declared inline nor inlined by
> the compiler
> DW_INL_inlined = 1                Not declared inline but inlined by
> the compiler
> DW_INL_declared_not_inlined = 2   Declared inline but not inlined by
> the compiler
> DW_INL_declared_inlined = 3       Declared inline and inlined by the compiler
> 
> So DW_INL_inlined is supposed to be added to functions that are not
> marked inline, but were nevertheless inlined. I saw this for one of
> vfs_getattr entries in DWARF, which clearly is not marked inline.

I looked at llvm source code, llvm only tries to assign DW_INL_inlined
and also only at certain conditions. Not sure about gcc. Probably 
similar. So this field is not reliable, esp. without it does not mean it 
is not inlined.

> 
> But also that DWARF entry had proper args with names, so it would work
> fine as well. I don't know, with DWARF it's always some guessing game.
> Let's leave DW_AT_inline alone for now.
> 
> Important part is skipping declarations (when they are marked as
> such), though I'm not claiming it will solve the problem completely...
> :)
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-13  1:00             ` Yonghong Song
@ 2020-11-13  1:12               ` Andrii Nakryiko
  0 siblings, 0 replies; 15+ messages in thread
From: Andrii Nakryiko @ 2020-11-13  1:12 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Jiri Olsa, Jiri Olsa,
	Arnaldo Carvalho de Melo, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Hao Luo

On Thu, Nov 12, 2020 at 5:01 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 11/12/20 4:30 PM, Andrii Nakryiko wrote:
> > On Thu, Nov 12, 2020 at 4:19 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >>
> >> On Thu, Nov 12, 2020 at 4:08 PM Andrii Nakryiko
> >> <andrii.nakryiko@gmail.com> wrote:
> >>>
> >>> So I looked at your vmlinux image. I think we should just keep
> >>> everything mostly as it it right now (without changes in this patch),
> >>> but add just two simple checks:
> >>>
> >>> 1. Skip if fn->declaration (ignore correctly marked func declarations)
> >>> 2. Skip if DW_AT_inline: 1 (ignore inlined functions).
> >>>
> >>> I'd keep the named arguments check as is, I think it's helpful. 1)
> >>> will skip stuff that's explicitly marked as declaration. 2) inline
> >>> check will partially mitigate dropping of fn->external check (and we
> >>> can't really attach to inlined functions).
> >>
> >> I thought DW_AT_inline is an indication that the function was marked "inline"
> >> in C code. That doesn't mean that the function was actually inlined.
> >> So I don't think pahole should check that bit.
> >
> > According to DWARF spec, there are 4 possible values:
> >
> > DW_INL_not_inlined = 0            Not declared inline nor inlined by
> > the compiler
> > DW_INL_inlined = 1                Not declared inline but inlined by
> > the compiler
> > DW_INL_declared_not_inlined = 2   Declared inline but not inlined by
> > the compiler
> > DW_INL_declared_inlined = 3       Declared inline and inlined by the compiler
> >
> > So DW_INL_inlined is supposed to be added to functions that are not
> > marked inline, but were nevertheless inlined. I saw this for one of
> > vfs_getattr entries in DWARF, which clearly is not marked inline.
>
> I looked at llvm source code, llvm only tries to assign DW_INL_inlined
> and also only at certain conditions. Not sure about gcc. Probably
> similar. So this field is not reliable, esp. without it does not mean it
> is not inlined.

Can't say I'm surprised...

>
> >
> > But also that DWARF entry had proper args with names, so it would work
> > fine as well. I don't know, with DWARF it's always some guessing game.
> > Let's leave DW_AT_inline alone for now.
> >
> > Important part is skipping declarations (when they are marked as
> > such), though I'm not claiming it will solve the problem completely...
> > :)
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-13  0:08       ` Andrii Nakryiko
  2020-11-13  0:18         ` Alexei Starovoitov
@ 2020-11-13 10:59         ` Jiri Olsa
  2020-11-13 11:52           ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2020-11-13 10:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, dwarves, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Hao Luo

On Thu, Nov 12, 2020 at 04:08:02PM -0800, Andrii Nakryiko wrote:
> On Thu, Nov 12, 2020 at 1:14 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Thu, Nov 12, 2020 at 11:54:41AM -0800, Andrii Nakryiko wrote:
> >
> > SNIP
> >
> > > > @@ -624,32 +644,46 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > > >                 has_index_type = true;
> > > >         }
> > > >
> > > > -       cu__for_each_function(cu, core_id, fn) {
> > > > -               /*
> > > > -                * The functions_cnt != 0 means we parsed all necessary
> > > > -                * kernel symbols and we are using ftrace location filter
> > > > -                * for functions. If it's not available keep the current
> > > > -                * dwarf declaration check.
> > > > -                */
> > > > -               if (functions_cnt) {
> > > > +       /*
> > > > +        * The functions_cnt != 0 means we parsed all necessary
> > > > +        * kernel symbols and we are using ftrace location filter
> > > > +        * for functions. If it's not available keep the current
> > > > +        * dwarf declaration check.
> > > > +        */
> > > > +       if (functions_cnt) {
> > > > +               cu__for_each_function(cu, core_id, fn) {
> > > > +                       struct elf_function *p;
> > > > +                       struct elf_function key = { .name = function__name(fn, cu) };
> > > > +                       int args_cnt = 0;
> > > > +
> > > >                         /*
> > > > -                        * We check following conditions:
> > > > -                        *   - argument names are defined
> > > > -                        *   - there's symbol and address defined for the function
> > > > -                        *   - function address belongs to ftrace locations
> > > > -                        *   - function is generated only once
> > > > +                        * Collect functions that match ftrace filter
> > > > +                        * and pick the one with proper argument names.
> > > > +                        * The BTF generation happens at the end in
> > > > +                        * btf_encoder__encode function.
> > > >                          */
> > > > -                       if (!has_arg_names(cu, &fn->proto))
> > > > +                       p = bsearch(&key, functions, functions_cnt,
> > > > +                                   sizeof(functions[0]), functions_cmp);
> > > > +                       if (!p)
> > > >                                 continue;
> > > > -                       if (!should_generate_function(btfe, function__name(fn, cu)))
> > > > +
> > > > +                       if (!has_arg_names(cu, &fn->proto, &args_cnt))
> > >
> > > So I can't unfortunately reproduce that GCC bug with DWARF info. What
> > > was exactly the symptom? Maybe you can also share readelf -wi dump for
> > > your problematic vmlinux?
> >
> > hum, can't see -wi working for readelf, however I placed my vmlinux
> > in here:
> >   http://people.redhat.com/~jolsa/vmlinux.gz
> >
> > the symptom is that resolve_btfids will fail kernel build:
> >
> >   BTFIDS  vmlinux
> > FAILED unresolved symbol vfs_getattr
> >
> > because BTF data contains multiple FUNC records for same function
> >
> > and the problem is that declaration tag itself is missing:
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > so pahole won't skip them
> >
> > the first workaround was to workaround that and go for function
> > records that have code address attached, but that's buggy as well:
> >   https://bugzilla.redhat.com/show_bug.cgi?id=1890107
> >
> > then after some discussions we ended up using ftrace addresses
> > as filter for what we want to display.. plus we got static functions
> > as well
> >
> > however for this way we failed to get proper arguments ;-)
> 
> Right, I followed along overall, but forgot the details of the initial
> problem. Thanks for the refresher. See below for my current thoughts
> on dealing with all this.
> 
> >
> > >
> > > The reason I'm asking is because I wonder if we should still ignore
> > > functions if fn->declaration is set. E.g., for the issue we
> > > investigated yesterday, the function with no arguments has declaration
> > > set to 1, so just ignoring it would solve the problem. I'm wondering
> > > if it's enough to do just that instead of doing this whole delayed
> > > function collection/processing.
> > >
> > > Also, I'd imagine the only expected cases where we can override  the
> > > function (args_cnt > p->args_cnt) would be if p->args_cnt == 0, no?
> >
> > I don't know, because originally I'd expect that we would not see
> > function record with zero args when it actualy has some
> >
> > > All other cases are either newly discovered "bogusness" of DWARF (and
> > > would be good to know about this) or it's a name collision for
> > > functions. Basically, before we go all the way to rework this again,
> > > let's see if just skipping declarations would be enough?
> >
> > so there's actualy new developement today in:
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060#c14
> >
> > gcc might actualy get fixed, so I think we could go back to
> > using declaration tag and use ftrace as additional filter..
> > at least this exercise gave us static functions ;-)
> >
> > however we still have fedora with disabled disabled CONFIG_DEBUG_INFO_BTF
> > and will need to wait for that fix to enable that back
> 
> Right, we better have a more robust approach not relying on
> not-yet-released GCC.
> 
> >
> > >
> > > >                                 continue;
> > > > -               } else {
> > > > +
> > > > +                       if (!p->fn || args_cnt > p->args_cnt) {
> > > > +                               p->fn = fn;
> > > > +                               p->cu = cu;
> > > > +                               p->args_cnt = args_cnt;
> > > > +                               p->type_id_off = type_id_off;
> > > > +                       }
> > > > +               }
> > > > +       } else {
> > > > +               cu__for_each_function(cu, core_id, fn) {
> > > >                         if (fn->declaration || !fn->external)
> > > >                                 continue;
> > > > +                       if (generate_func(btfe, cu, fn, type_id_off))
> > > > +                               goto out;
> > > >                 }
> > >
> > > I'm trending towards disliking this completely different fallback
> > > mechanism. It saved bpf-next accidentally, but otherwise obscured the
> > > issue and generally makes testing pahole with artificial binary BTFs
> > > (from test programs) harder. How about we unify approaches, but just
> > > use mcount symbols opportunistically, as an additional filter, if it's
> > > available?
> >
> > ok
> >
> > >
> > > With that, testing that we still handle functions with duplicate names
> > > properly would be trivial (which I suspect we don't and we'll just
> > > keep the one with more args now, right?) And it makes static functions
> > > available for non-vmlinux binaries automatically (might be good or
> > > bad, but still...).
> >
> > if we keep just the ftrace filter and rely on declaration tag,
> > the args checking will go away right?
> 
> So I looked at your vmlinux image. I think we should just keep
> everything mostly as it it right now (without changes in this patch),
> but add just two simple checks:
> 
> 1. Skip if fn->declaration (ignore correctly marked func declarations)
> 2. Skip if DW_AT_inline: 1 (ignore inlined functions).
> 
> I'd keep the named arguments check as is, I think it's helpful. 1)
> will skip stuff that's explicitly marked as declaration. 2) inline

ok, that should fix the fail on gccs that still generate
declaration tag, so you should be fine and our version of
gcc should continue work as well 

> check will partially mitigate dropping of fn->external check (and we
> can't really attach to inlined functions). So will the named args
> check as well. Then we should do mcount checks, **if** mcount symbols
> are present (which should always be the case for any reasonable
> vmlinux image that's supposed to be used with BPF, I think).
> 
> So together, it should cover all known issues, I think. And then we'll
> just have to watch out for any new ones. GCC bugfix is good, but it's
> too late: the harm is done and there are compilers out there that we
> should deal with.

hopefuly with this change we can enable BTF back before the gcc fix

thanks,
jirka


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC/PATCH 3/3] btf_encoder: Func generation fix
  2020-11-13 10:59         ` Jiri Olsa
@ 2020-11-13 11:52           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-11-13 11:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Yonghong Song, Hao Luo

Em Fri, Nov 13, 2020 at 11:59:23AM +0100, Jiri Olsa escreveu:
> On Thu, Nov 12, 2020 at 04:08:02PM -0800, Andrii Nakryiko wrote:
> > On Thu, Nov 12, 2020 at 1:14 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > > On Thu, Nov 12, 2020 at 11:54:41AM -0800, Andrii Nakryiko wrote:
> > > > So I can't unfortunately reproduce that GCC bug with DWARF info. What
> > > > was exactly the symptom? Maybe you can also share readelf -wi dump for
> > > > your problematic vmlinux?

> > > hum, can't see -wi working for readelf, however I placed my vmlinux
> > > in here:
> > >   http://people.redhat.com/~jolsa/vmlinux.gz

> > > the symptom is that resolve_btfids will fail kernel build:

> > >   BTFIDS  vmlinux
> > > FAILED unresolved symbol vfs_getattr

> > > because BTF data contains multiple FUNC records for same function

> > > and the problem is that declaration tag itself is missing:
> > >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > > so pahole won't skip them

> > > the first workaround was to workaround that and go for function
> > > records that have code address attached, but that's buggy as well:
> > >   https://bugzilla.redhat.com/show_bug.cgi?id=1890107

> > > then after some discussions we ended up using ftrace addresses
> > > as filter for what we want to display.. plus we got static functions
> > > as well

> > > however for this way we failed to get proper arguments ;-)

> > Right, I followed along overall, but forgot the details of the initial
> > problem. Thanks for the refresher. See below for my current thoughts
> > on dealing with all this.

I'll add this to the set of regression tests I use with pahole:

[acme@five vfs_gettattr.multiple.btf.entries.jolsa]$ bpftool btf dump file vmlinux | grep -w FUNC | cut -d\' -f2 | sort | uniq -c | sort -nr | head
      3 __x64_sys_userfaultfd
      3 __x64_sys_timerfd_settime
      3 __x64_sys_timerfd_gettime
      3 __x64_sys_timerfd_create
      3 __x64_sys_syslog
      3 __x64_sys_sysfs
      3 __x64_sys_swapon
      3 __x64_sys_swapoff
      3 __x64_sys_socketpair
      3 __x64_sys_socket
[acme@five vfs_gettattr.multiple.btf.entries.jolsa]$ pfunct -F btf vmlinux | sort | uniq -c | sort -nr | head
      3 __x64_sys_userfaultfd
      3 __x64_sys_timerfd_settime
      3 __x64_sys_timerfd_gettime
      3 __x64_sys_timerfd_create
      3 __x64_sys_syslog
      3 __x64_sys_sysfs
      3 __x64_sys_swapon
      3 __x64_sys_swapoff
      3 __x64_sys_socketpair
      3 __x64_sys_socket
[acme@five vfs_gettattr.multiple.btf.entries.jolsa]$

I.e. the output of those tools need to match and all functions need to
appear only once.

I'll also use pfunct with -F dwarf to get the same results, probably
will add these to the 'btfdiff' tool that is in the pahole git repo.

- Arnaldo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-11-13 12:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-12 15:05 [RFC 0/3] btf_encoder: Fix functions BTF data generation Jiri Olsa
2020-11-12 15:05 ` [RFC/PATCH 1/3] btf_encoder: Generate also .init functions Jiri Olsa
2020-11-12 19:37   ` Andrii Nakryiko
2020-11-12 15:05 ` [RFC/PATCH 2/3] btf_encoder: Put function generation code to generate_func Jiri Olsa
2020-11-12 19:39   ` Andrii Nakryiko
2020-11-12 15:05 ` [RFC/PATCH 3/3] btf_encoder: Func generation fix Jiri Olsa
2020-11-12 19:54   ` Andrii Nakryiko
2020-11-12 21:14     ` Jiri Olsa
2020-11-13  0:08       ` Andrii Nakryiko
2020-11-13  0:18         ` Alexei Starovoitov
2020-11-13  0:30           ` Andrii Nakryiko
2020-11-13  1:00             ` Yonghong Song
2020-11-13  1:12               ` Andrii Nakryiko
2020-11-13 10:59         ` Jiri Olsa
2020-11-13 11:52           ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).