All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types
@ 2023-02-28 20:23 Eduard Zingerman
  2023-03-01 18:58 ` Andrii Nakryiko
  2023-03-01 21:19 ` Jiri Olsa
  0 siblings, 2 replies; 4+ messages in thread
From: Eduard Zingerman @ 2023-02-28 20:23 UTC (permalink / raw)
  To: dwarves, arnaldo.melo
  Cc: bpf, kernel-team, ast, daniel, andrii, yhs, Eduard Zingerman,
	KP Singh, Matt Bobrowski

Recent changes to handle unspecified types (see [1]) cause BTF ID drift.

Specifically, the intent of commits [2], [3] and [4] is to render
references to unspecified types as void type.
However, as a consequence:
- in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
  for unspecified type tags and adds these tags to `cu->types_table`;
- `btf_encoder__encode_tag()` skips generation of BTF entries for
  `DW_TAG_unspecified_type` tags.

Such logic causes ID drift if unspecified type is not the last type
processed for compilation unit. `small_id` of each type following
unspecified type in the `cu->types_table` would have its BTF id off by -1.
Thus, rendering references established on recode phase invalid.

This commit reverts `unspecified_type` id/tag tracking.
Instead, the following is done:
- `small_id` for unspecified type tags is set to 0, thus reference to
  unspecified type tag would render BTF id of a `void` on recode phase;
- unspecified type tags are not added to `cu->types_table`.

This change also happens to fix issue reported in [5], the gist of
that issue is that the field `encoder->unspecified_type` is set but
not reset by function `btf_encoder__encode_cu()`. Thus, the following
sequence of events might occur when BTF encoding is requested:
- CU with unspecified type is processed:
  - unspecified type id is 42
  - encoder->unspecified_type is set to 42
- CU without unspecified type is processed next using the same
  `encoder` object:
  - some `struct foo` has id 42 in this CU
  - the references to `struct foo` are set 0 by function
    `btf_encoder__tag_type()`.

[1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
[2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
[3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
[4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
[5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/

Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
Tested-by: KP Singh <kpsingh@kernel.org>
Reported-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 btf_encoder.c  |  8 --------
 dwarf_loader.c | 25 +++++++++++++++++++------
 dwarves.h      |  8 --------
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index da776f4..07a9dc5 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -69,7 +69,6 @@ struct btf_encoder {
 	const char	  *filename;
 	struct elf_symtab *symtab;
 	uint32_t	  type_id_off;
-	uint32_t	  unspecified_type;
 	int		  saved_func_cnt;
 	bool		  has_index_type,
 			  need_index_type,
@@ -635,11 +634,6 @@ static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_t
 	if (tag_type == 0)
 		return 0;
 
-	if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
-		// No provision for encoding this, turn it into void.
-		return 0;
-	}
-
 	return encoder->type_id_off + tag_type;
 }
 
@@ -1746,8 +1740,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 
 	encoder->cu = cu;
 	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
-	if (encoder->cu->unspecified_type.tag)
-		encoder->unspecified_type = encoder->cu->unspecified_type.type;
 
 	if (!encoder->has_index_type) {
 		/* cu__find_base_type_by_name() takes "type_id_t *id" */
diff --git a/dwarf_loader.c b/dwarf_loader.c
index 014e130..c37bd7b 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -2155,8 +2155,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
 	case DW_TAG_atomic_type:
 		tag = die__create_new_tag(die, cu);		break;
 	case DW_TAG_unspecified_type:
-		cu->unspecified_type.tag =
-			tag = die__create_new_tag(die, cu);     break;
+		tag = die__create_new_tag(die, cu);		break;
 	case DW_TAG_pointer_type:
 		tag = die__create_new_pointer_tag(die, cu, conf);	break;
 	case DW_TAG_ptr_to_member_type:
@@ -2219,13 +2218,27 @@ static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *co
 			continue;
 		}
 
-		uint32_t id;
-		cu__add_tag(cu, tag, &id);
+		uint32_t id = 0;
+		/* There is no BTF representation for unspecified types.
+		 * Currently we want such types to be represented as `void`
+		 * (and thus skip BTF encoding).
+		 *
+		 * As BTF encoding is skipped, such types must not be added to type table,
+		 * otherwise an ID for a type would be allocated and we would be forced
+		 * to put something in BTF at this ID.
+		 * Thus avoid `cu__add_tag()` call for such types.
+		 *
+		 * On the other hand, there might be references to this type from other
+		 * tags, so `dwarf_cu__find_tag_by_ref()` must return something.
+		 * Thus call `cu__hash()` for such types.
+		 *
+		 * Note, that small_id of zero would be assigned to unspecified type entry.
+		 */
+		if (tag->tag != DW_TAG_unspecified_type)
+			cu__add_tag(cu, tag, &id);
 		cu__hash(cu, tag);
 		struct dwarf_tag *dtag = tag->priv;
 		dtag->small_id = id;
-		if (tag->tag == DW_TAG_unspecified_type)
-			cu->unspecified_type.type = id;
 	} while (dwarf_siblingof(die, die) == 0);
 
 	return 0;
diff --git a/dwarves.h b/dwarves.h
index 5074cf8..e92b2fd 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -236,10 +236,6 @@ struct debug_fmt_ops {
 
 #define ARCH_MAX_REGISTER_PARAMS	8
 
-/*
- * unspecified_type: If this CU has a DW_TAG_unspecified_type, as BTF doesn't have a representation for this
- * 		     and thus we need to check functions returning this to convert it to void.
- */
 struct cu {
 	struct list_head node;
 	struct list_head tags;
@@ -248,10 +244,6 @@ struct cu {
 	struct ptr_table functions_table;
 	struct ptr_table tags_table;
 	struct rb_root	 functions;
-	struct {
-		struct tag	 *tag;
-		uint32_t	 type;
-	} unspecified_type;
 	char		 *name;
 	char		 *filename;
 	void 		 *priv;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types
  2023-02-28 20:23 [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types Eduard Zingerman
@ 2023-03-01 18:58 ` Andrii Nakryiko
  2023-03-01 21:19 ` Jiri Olsa
  1 sibling, 0 replies; 4+ messages in thread
From: Andrii Nakryiko @ 2023-03-01 18:58 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: dwarves, arnaldo.melo, bpf, kernel-team, ast, daniel, andrii,
	yhs, KP Singh, Matt Bobrowski

On Tue, Feb 28, 2023 at 12:24 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Recent changes to handle unspecified types (see [1]) cause BTF ID drift.
>
> Specifically, the intent of commits [2], [3] and [4] is to render
> references to unspecified types as void type.
> However, as a consequence:
> - in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
>   for unspecified type tags and adds these tags to `cu->types_table`;
> - `btf_encoder__encode_tag()` skips generation of BTF entries for
>   `DW_TAG_unspecified_type` tags.
>
> Such logic causes ID drift if unspecified type is not the last type
> processed for compilation unit. `small_id` of each type following
> unspecified type in the `cu->types_table` would have its BTF id off by -1.
> Thus, rendering references established on recode phase invalid.
>
> This commit reverts `unspecified_type` id/tag tracking.
> Instead, the following is done:
> - `small_id` for unspecified type tags is set to 0, thus reference to
>   unspecified type tag would render BTF id of a `void` on recode phase;
> - unspecified type tags are not added to `cu->types_table`.
>
> This change also happens to fix issue reported in [5], the gist of
> that issue is that the field `encoder->unspecified_type` is set but
> not reset by function `btf_encoder__encode_cu()`. Thus, the following
> sequence of events might occur when BTF encoding is requested:
> - CU with unspecified type is processed:
>   - unspecified type id is 42
>   - encoder->unspecified_type is set to 42
> - CU without unspecified type is processed next using the same
>   `encoder` object:
>   - some `struct foo` has id 42 in this CU
>   - the references to `struct foo` are set 0 by function
>     `btf_encoder__tag_type()`.
>
> [1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
> [2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> [3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
> [4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
> [5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/
>
> Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
> Tested-by: KP Singh <kpsingh@kernel.org>
> Reported-by: Matt Bobrowski <mattbobrowski@google.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---

Looks like a sensible approach to me.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  btf_encoder.c  |  8 --------
>  dwarf_loader.c | 25 +++++++++++++++++++------
>  dwarves.h      |  8 --------
>  3 files changed, 19 insertions(+), 22 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types
  2023-02-28 20:23 [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types Eduard Zingerman
  2023-03-01 18:58 ` Andrii Nakryiko
@ 2023-03-01 21:19 ` Jiri Olsa
  2023-03-02 18:00   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 4+ messages in thread
From: Jiri Olsa @ 2023-03-01 21:19 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: dwarves, arnaldo.melo, bpf, kernel-team, ast, daniel, andrii,
	yhs, KP Singh, Matt Bobrowski

On Tue, Feb 28, 2023 at 10:23:57PM +0200, Eduard Zingerman wrote:
> Recent changes to handle unspecified types (see [1]) cause BTF ID drift.
> 
> Specifically, the intent of commits [2], [3] and [4] is to render
> references to unspecified types as void type.
> However, as a consequence:
> - in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
>   for unspecified type tags and adds these tags to `cu->types_table`;
> - `btf_encoder__encode_tag()` skips generation of BTF entries for
>   `DW_TAG_unspecified_type` tags.
> 
> Such logic causes ID drift if unspecified type is not the last type
> processed for compilation unit. `small_id` of each type following
> unspecified type in the `cu->types_table` would have its BTF id off by -1.
> Thus, rendering references established on recode phase invalid.
> 
> This commit reverts `unspecified_type` id/tag tracking.
> Instead, the following is done:
> - `small_id` for unspecified type tags is set to 0, thus reference to
>   unspecified type tag would render BTF id of a `void` on recode phase;
> - unspecified type tags are not added to `cu->types_table`.
> 
> This change also happens to fix issue reported in [5], the gist of
> that issue is that the field `encoder->unspecified_type` is set but
> not reset by function `btf_encoder__encode_cu()`. Thus, the following
> sequence of events might occur when BTF encoding is requested:
> - CU with unspecified type is processed:
>   - unspecified type id is 42
>   - encoder->unspecified_type is set to 42
> - CU without unspecified type is processed next using the same
>   `encoder` object:
>   - some `struct foo` has id 42 in this CU
>   - the references to `struct foo` are set 0 by function
>     `btf_encoder__tag_type()`.
> 
> [1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
> [2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> [3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
> [4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
> [5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/
> 
> Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
> Tested-by: KP Singh <kpsingh@kernel.org>
> Reported-by: Matt Bobrowski <mattbobrowski@google.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

lgtm, tested on top of the pahole next branch with bpf selftests

Tested-by: Jiri Olsa <jolsa@kernel.org>

jirka

> ---
>  btf_encoder.c  |  8 --------
>  dwarf_loader.c | 25 +++++++++++++++++++------
>  dwarves.h      |  8 --------
>  3 files changed, 19 insertions(+), 22 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index da776f4..07a9dc5 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -69,7 +69,6 @@ struct btf_encoder {
>  	const char	  *filename;
>  	struct elf_symtab *symtab;
>  	uint32_t	  type_id_off;
> -	uint32_t	  unspecified_type;
>  	int		  saved_func_cnt;
>  	bool		  has_index_type,
>  			  need_index_type,
> @@ -635,11 +634,6 @@ static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_t
>  	if (tag_type == 0)
>  		return 0;
>  
> -	if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
> -		// No provision for encoding this, turn it into void.
> -		return 0;
> -	}
> -
>  	return encoder->type_id_off + tag_type;
>  }
>  
> @@ -1746,8 +1740,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  
>  	encoder->cu = cu;
>  	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
> -	if (encoder->cu->unspecified_type.tag)
> -		encoder->unspecified_type = encoder->cu->unspecified_type.type;
>  
>  	if (!encoder->has_index_type) {
>  		/* cu__find_base_type_by_name() takes "type_id_t *id" */
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 014e130..c37bd7b 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -2155,8 +2155,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
>  	case DW_TAG_atomic_type:
>  		tag = die__create_new_tag(die, cu);		break;
>  	case DW_TAG_unspecified_type:
> -		cu->unspecified_type.tag =
> -			tag = die__create_new_tag(die, cu);     break;
> +		tag = die__create_new_tag(die, cu);		break;
>  	case DW_TAG_pointer_type:
>  		tag = die__create_new_pointer_tag(die, cu, conf);	break;
>  	case DW_TAG_ptr_to_member_type:
> @@ -2219,13 +2218,27 @@ static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *co
>  			continue;
>  		}
>  
> -		uint32_t id;
> -		cu__add_tag(cu, tag, &id);
> +		uint32_t id = 0;
> +		/* There is no BTF representation for unspecified types.
> +		 * Currently we want such types to be represented as `void`
> +		 * (and thus skip BTF encoding).
> +		 *
> +		 * As BTF encoding is skipped, such types must not be added to type table,
> +		 * otherwise an ID for a type would be allocated and we would be forced
> +		 * to put something in BTF at this ID.
> +		 * Thus avoid `cu__add_tag()` call for such types.
> +		 *
> +		 * On the other hand, there might be references to this type from other
> +		 * tags, so `dwarf_cu__find_tag_by_ref()` must return something.
> +		 * Thus call `cu__hash()` for such types.
> +		 *
> +		 * Note, that small_id of zero would be assigned to unspecified type entry.
> +		 */
> +		if (tag->tag != DW_TAG_unspecified_type)
> +			cu__add_tag(cu, tag, &id);
>  		cu__hash(cu, tag);
>  		struct dwarf_tag *dtag = tag->priv;
>  		dtag->small_id = id;
> -		if (tag->tag == DW_TAG_unspecified_type)
> -			cu->unspecified_type.type = id;
>  	} while (dwarf_siblingof(die, die) == 0);
>  
>  	return 0;
> diff --git a/dwarves.h b/dwarves.h
> index 5074cf8..e92b2fd 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -236,10 +236,6 @@ struct debug_fmt_ops {
>  
>  #define ARCH_MAX_REGISTER_PARAMS	8
>  
> -/*
> - * unspecified_type: If this CU has a DW_TAG_unspecified_type, as BTF doesn't have a representation for this
> - * 		     and thus we need to check functions returning this to convert it to void.
> - */
>  struct cu {
>  	struct list_head node;
>  	struct list_head tags;
> @@ -248,10 +244,6 @@ struct cu {
>  	struct ptr_table functions_table;
>  	struct ptr_table tags_table;
>  	struct rb_root	 functions;
> -	struct {
> -		struct tag	 *tag;
> -		uint32_t	 type;
> -	} unspecified_type;
>  	char		 *name;
>  	char		 *filename;
>  	void 		 *priv;
> -- 
> 2.39.1
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types
  2023-03-01 21:19 ` Jiri Olsa
@ 2023-03-02 18:00   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-03-02 18:00 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Eduard Zingerman, dwarves, arnaldo.melo, bpf, kernel-team, ast,
	daniel, andrii, yhs, KP Singh, Matt Bobrowski

Em Wed, Mar 01, 2023 at 10:19:56PM +0100, Jiri Olsa escreveu:
> On Tue, Feb 28, 2023 at 10:23:57PM +0200, Eduard Zingerman wrote:
> > Recent changes to handle unspecified types (see [1]) cause BTF ID drift.
> > 
> > Specifically, the intent of commits [2], [3] and [4] is to render
> > references to unspecified types as void type.
> > However, as a consequence:
> > - in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
> >   for unspecified type tags and adds these tags to `cu->types_table`;
> > - `btf_encoder__encode_tag()` skips generation of BTF entries for
> >   `DW_TAG_unspecified_type` tags.
> > 
> > Such logic causes ID drift if unspecified type is not the last type
> > processed for compilation unit. `small_id` of each type following
> > unspecified type in the `cu->types_table` would have its BTF id off by -1.
> > Thus, rendering references established on recode phase invalid.
> > 
> > This commit reverts `unspecified_type` id/tag tracking.
> > Instead, the following is done:
> > - `small_id` for unspecified type tags is set to 0, thus reference to
> >   unspecified type tag would render BTF id of a `void` on recode phase;
> > - unspecified type tags are not added to `cu->types_table`.
> > 
> > This change also happens to fix issue reported in [5], the gist of
> > that issue is that the field `encoder->unspecified_type` is set but
> > not reset by function `btf_encoder__encode_cu()`. Thus, the following
> > sequence of events might occur when BTF encoding is requested:
> > - CU with unspecified type is processed:
> >   - unspecified type id is 42
> >   - encoder->unspecified_type is set to 42
> > - CU without unspecified type is processed next using the same
> >   `encoder` object:
> >   - some `struct foo` has id 42 in this CU
> >   - the references to `struct foo` are set 0 by function
> >     `btf_encoder__tag_type()`.
> > 
> > [1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
> > [2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> > [3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
> > [4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
> > [5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/
> > 
> > Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> > Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
> > Tested-by: KP Singh <kpsingh@kernel.org>
> > Reported-by: Matt Bobrowski <mattbobrowski@google.com>
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> lgtm, tested on top of the pahole next branch with bpf selftests
> 
> Tested-by: Jiri Olsa <jolsa@kernel.org>

Looks good to me as well, and way more elegant, thanks!

Applied.

- Arnaldo

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-03-02 18:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-28 20:23 [PATCH dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types Eduard Zingerman
2023-03-01 18:58 ` Andrii Nakryiko
2023-03-01 21:19 ` Jiri Olsa
2023-03-02 18:00   ` Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.