bpf.vger.kernel.org archive mirror
* [PATCH dwarves] btf_encoder: sanitize non-regular int base type
From: Yonghong Song @ 2021-02-06 19:13 UTC
  To: acme, dwarves; +Cc: bpf, andriin, mark, ndesaulniers, sedat.dilek

clang with dwarf5 may generate non-regular int base types,
i.e., types other than signed/unsigned char/short/int/long long/__int128.
Such base types are often used to describe
how an actual parameter or variable is generated. For example,

0x000015cf:   DW_TAG_base_type
                DW_AT_name      ("DW_ATE_unsigned_1")
                DW_AT_encoding  (DW_ATE_unsigned)
                DW_AT_byte_size (0x00)

0x00010ed9:         DW_TAG_formal_parameter
                      DW_AT_location    (DW_OP_lit0,
                                         DW_OP_not,
                                         DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1",
                                         DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8",
                                         DW_OP_stack_value)
                      DW_AT_abstract_origin     (0x00013984 "branch")

What it does is take a literal "0", apply a "not" operation, and then convert
the result to a one-bit unsigned int and then to an 8-bit unsigned int.
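
As a rough illustration of that sequence, the sketch below models it in C.
It assumes that DW_OP_convert to a narrower unsigned base type behaves like
truncation to the low N bits and that the stack value is 64 bits wide;
convert_unsigned() and eval_branch_location() are hypothetical names used
only here, not part of pahole or libbpf.

```c
#include <stdint.h>

/* Hypothetical helper: model DW_OP_convert to an unsigned base type of
 * `bits` width as truncation to the low `bits` bits (an assumption). */
static uint64_t convert_unsigned(uint64_t v, unsigned int bits)
{
	if (bits >= 64)
		return v;
	return v & ((UINT64_C(1) << bits) - 1);
}

/* Walk the DW_AT_location expression of the "branch" parameter. */
static uint64_t eval_branch_location(void)
{
	uint64_t v = 0;              /* DW_OP_lit0 */

	v = ~v;                      /* DW_OP_not */
	v = convert_unsigned(v, 1);  /* DW_OP_convert "DW_ATE_unsigned_1" */
	v = convert_unsigned(v, 8);  /* DW_OP_convert "DW_ATE_unsigned_8" */
	return v;                    /* DW_OP_stack_value */
}
```

Under these assumptions eval_branch_location() yields 1: ~0 is all ones,
truncating to one bit leaves 1, and widening to 8 bits keeps it.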

Another example,

0x000e97e4:   DW_TAG_base_type
                DW_AT_name      ("DW_ATE_unsigned_24")
                DW_AT_encoding  (DW_ATE_unsigned)
                DW_AT_byte_size (0x03)

0x000f88f8:     DW_TAG_variable
                  DW_AT_location        (indexed (0x3c) loclist = 0x00008fb0:
                     [0xffffffff82808812, 0xffffffff82808817):
                         DW_OP_breg0 RAX+0,
                         DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
                         DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
                         DW_OP_stack_value,
                         DW_OP_piece 0x1,
                         DW_OP_breg0 RAX+0,
                         DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
                         DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
                         DW_OP_lit8,
                         DW_OP_shr,
                         DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
                         DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
                         DW_OP_stack_value,
                         DW_OP_piece 0x3
                     ......

At one point, a right shift by 8 happens and the result is converted to
32-bit unsigned int and then to 24-bit unsigned int.
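
A similar sketch for the two pieces shown above, again only as an
illustration. It assumes DW_OP_convert truncates to the low N bits and that
the pieces are laid out little-endian, with the 1-byte piece first;
convert_unsigned() and eval_pieces() are hypothetical names, and `rax`
stands for the register value read by DW_OP_breg0 RAX+0.

```c
#include <stdint.h>

/* Hypothetical helper: model DW_OP_convert to an unsigned base type of
 * `bits` width as truncation to the low `bits` bits (an assumption). */
static uint64_t convert_unsigned(uint64_t v, unsigned int bits)
{
	if (bits >= 64)
		return v;
	return v & ((UINT64_C(1) << bits) - 1);
}

/* Reassemble the 4-byte value described by the two DW_OP_piece parts. */
static uint32_t eval_pieces(uint64_t rax)
{
	uint64_t lo, hi;

	/* First piece: 1 byte. */
	lo = convert_unsigned(rax, 64);  /* "DW_ATE_unsigned_64" */
	lo = convert_unsigned(lo, 8);    /* "DW_ATE_unsigned_8" */

	/* Second piece: 3 bytes. */
	hi = convert_unsigned(rax, 64);  /* "DW_ATE_unsigned_64" */
	hi = convert_unsigned(hi, 32);   /* "DW_ATE_unsigned_32" */
	hi >>= 8;                        /* DW_OP_lit8, DW_OP_shr */
	hi = convert_unsigned(hi, 32);   /* "DW_ATE_unsigned_32" */
	hi = convert_unsigned(hi, 24);   /* "DW_ATE_unsigned_24" */

	/* Little-endian layout assumed: piece 0x1 first, then piece 0x3. */
	return (uint32_t)(lo | (hi << 8));
}
```

With these assumptions the two pieces together simply reproduce the low
32 bits of the register; the 24-bit type only exists to describe the
intermediate step.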

BTF does not need any of this DW_OP_* information, and such non-regular int
types will cause libbpf to emit errors.
Let us sanitize them to generate BTF acceptable to libbpf and the kernel.
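
The size check used for sanitizing can be sketched in isolation as follows.
This is a standalone approximation, not the patch itself:
bits_roundup_bytes() is an assumed stand-in for the BITS_ROUNDUP_BYTES()
macro, and is_non_regular_int() and sanitized_byte_size() are hypothetical
helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for BITS_ROUNDUP_BYTES(): round a bit count up to
 * whole bytes. */
static uint16_t bits_roundup_bytes(uint32_t bit_size)
{
	return (uint16_t)((bit_size + 7) / 8);
}

/* True when the byte size is zero or not a power of two. */
static bool is_non_regular_int(uint16_t byte_sz)
{
	return !byte_sz || (byte_sz & (byte_sz - 1));
}

/* Byte size that would be emitted into BTF after sanitizing. */
static uint16_t sanitized_byte_size(uint32_t bit_size)
{
	uint16_t byte_sz = bits_roundup_bytes(bit_size);

	return is_non_regular_int(byte_sz) ? 4 : byte_sz;
}
```

In this sketch a 0-byte or 3-byte type maps to 4 bytes, while 1, 2, 4, 8
and 16 byte types pass through unchanged. A power-of-two size above
16 bytes, such as 32, would also pass through, since the check only tests
for zero and for powers of two.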

Cc: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/libbtf.c b/libbtf.c
index 9f76283..93fe185 100644
--- a/libbtf.c
+++ b/libbtf.c
@@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
 	struct btf *btf = btfe->btf;
 	const struct btf_type *t;
 	uint8_t encoding = 0;
+	uint16_t byte_sz;
 	int32_t id;
 
 	if (bt->is_signed) {
@@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
 		return -1;
 	}
 
-	id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding);
+	/* dwarf5 may emit DW_ATE_[un]signed_{num} base types where
+	 * {num} is not power of 2 and may exceed 128. Such attributes
+	 * are mostly used to record operation for an actual parameter
+	 * or variable.
+	 * For example,
+	 *     DW_AT_location        (indexed (0x3c) loclist = 0x00008fb0:
+	 *         [0xffffffff82808812, 0xffffffff82808817):
+	 *             DW_OP_breg0 RAX+0,
+	 *             DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
+	 *             DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
+	 *             DW_OP_stack_value,
+	 *             DW_OP_piece 0x1,
+	 *             DW_OP_breg0 RAX+0,
+	 *             DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
+	 *             DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
+	 *             DW_OP_lit8,
+	 *             DW_OP_shr,
+	 *             DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
+	 *             DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
+	 *             DW_OP_stack_value, DW_OP_piece 0x3
+	 *     DW_AT_name    ("ebx")
+	 *     DW_AT_decl_file       ("/linux/arch/x86/events/intel/core.c")
+	 *
+	 * In the above example, at some point, one unsigned_32 value
+	 * is right shifted by 8 and the result is converted to unsigned_32
+	 * and then unsigned_24.
+	 *
+	 * BTF does not need such DW_OP_* information so let us sanitize
+	 * these non-regular int types to avoid libbpf/kernel complaints.
+	 */
+	byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size);
+	if (!byte_sz || (byte_sz & (byte_sz - 1))) {
+		name = "sanitized_int";
+		byte_sz = 4;
+	}
+
+	id = btf__add_int(btf, name, byte_sz, encoding);
 	if (id < 0) {
 		btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type");
 	} else {
-- 
2.24.1



* Re: [PATCH dwarves] btf_encoder: sanitize non-regular int base type
From: Andrii Nakryiko @ 2021-02-07  6:36 UTC
  To: Yonghong Song
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Andrii Nakryiko,
	Mark Wielaard, Nick Desaulniers, Sedat Dilek

On Sat, Feb 6, 2021 at 11:21 AM Yonghong Song <yhs@fb.com> wrote:
>
> clang with dwarf5 may generate non-regular int base types,
> i.e., types other than signed/unsigned char/short/int/long long/__int128.
> Such base types are often used to describe
> how an actual parameter or variable is generated. For example,
>
> 0x000015cf:   DW_TAG_base_type
>                 DW_AT_name      ("DW_ATE_unsigned_1")
>                 DW_AT_encoding  (DW_ATE_unsigned)
>                 DW_AT_byte_size (0x00)
>
> 0x00010ed9:         DW_TAG_formal_parameter
>                       DW_AT_location    (DW_OP_lit0,
>                                          DW_OP_not,
>                                          DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1",
>                                          DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8",
>                                          DW_OP_stack_value)
>                       DW_AT_abstract_origin     (0x00013984 "branch")
>
> What it does is take a literal "0", apply a "not" operation, and then convert
> the result to a one-bit unsigned int and then to an 8-bit unsigned int.
>
> Another example,
>
> 0x000e97e4:   DW_TAG_base_type
>                 DW_AT_name      ("DW_ATE_unsigned_24")
>                 DW_AT_encoding  (DW_ATE_unsigned)
>                 DW_AT_byte_size (0x03)
>
> 0x000f88f8:     DW_TAG_variable
>                   DW_AT_location        (indexed (0x3c) loclist = 0x00008fb0:
>                      [0xffffffff82808812, 0xffffffff82808817):
>                          DW_OP_breg0 RAX+0,
>                          DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>                          DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
>                          DW_OP_stack_value,
>                          DW_OP_piece 0x1,
>                          DW_OP_breg0 RAX+0,
>                          DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>                          DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>                          DW_OP_lit8,
>                          DW_OP_shr,
>                          DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>                          DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
>                          DW_OP_stack_value,
>                          DW_OP_piece 0x3
>                      ......
>
> At one point, a right shift by 8 happens and the result is converted to
> 32-bit unsigned int and then to 24-bit unsigned int.
>
> BTF does not need any of this DW_OP_* information, and such non-regular int
> types will cause libbpf to emit errors.
> Let us sanitize them to generate BTF acceptable to libbpf and the kernel.
>
> Cc: Sedat Dilek <sedat.dilek@gmail.com>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/libbtf.c b/libbtf.c
> index 9f76283..93fe185 100644
> --- a/libbtf.c
> +++ b/libbtf.c
> @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
>         struct btf *btf = btfe->btf;
>         const struct btf_type *t;
>         uint8_t encoding = 0;
> +       uint16_t byte_sz;
>         int32_t id;
>
>         if (bt->is_signed) {
> @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
>                 return -1;
>         }
>
> -       id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding);
> +       /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where
> +        * {num} is not power of 2 and may exceed 128. Such attributes
> +        * are mostly used to record operation for an actual parameter
> +        * or variable.
> +        * For example,
> +        *     DW_AT_location        (indexed (0x3c) loclist = 0x00008fb0:
> +        *         [0xffffffff82808812, 0xffffffff82808817):
> +        *             DW_OP_breg0 RAX+0,
> +        *             DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
> +        *             DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
> +        *             DW_OP_stack_value,
> +        *             DW_OP_piece 0x1,
> +        *             DW_OP_breg0 RAX+0,
> +        *             DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
> +        *             DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
> +        *             DW_OP_lit8,
> +        *             DW_OP_shr,
> +        *             DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
> +        *             DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
> +        *             DW_OP_stack_value, DW_OP_piece 0x3
> +        *     DW_AT_name    ("ebx")
> +        *     DW_AT_decl_file       ("/linux/arch/x86/events/intel/core.c")
> +        *
> +        * In the above example, at some point, one unsigned_32 value
> +        * is right shifted by 8 and the result is converted to unsigned_32
> +        * and then unsigned_24.
> +        *
> +        * BTF does not need such DW_OP_* information so let us sanitize
> +        * these non-regular int types to avoid libbpf/kernel complaints.
> +        */
> +       byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size);
> +       if (!byte_sz || (byte_sz & (byte_sz - 1))) {
> +               name = "sanitized_int";

DWARF never stops causing issues :( How about making this name stand
out a bit more: __SANITIZED_FAKE_INT__ ? Similar in style to
__ARRAY_INDEX_TYPE__?

Otherwise looks good to me, even though it's a bit sketchy to just
"fix up" any integer that doesn't conform to our idea of "normal
integer". But as I said, DWARF is DWARF...

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> +               byte_sz = 4;
> +       }
> +
> +       id = btf__add_int(btf, name, byte_sz, encoding);
>         if (id < 0) {
>                 btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type");
>         } else {
> --
> 2.24.1
>


* Re: [PATCH dwarves] btf_encoder: sanitize non-regular int base type
From: Yonghong Song @ 2021-02-07  7:10 UTC
  To: Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, dwarves, bpf, Andrii Nakryiko,
	Mark Wielaard, Nick Desaulniers, Sedat Dilek



On 2/6/21 10:36 PM, Andrii Nakryiko wrote:
> On Sat, Feb 6, 2021 at 11:21 AM Yonghong Song <yhs@fb.com> wrote:
>>
>> clang with dwarf5 may generate non-regular int base types,
>> i.e., types other than signed/unsigned char/short/int/long long/__int128.
>> Such base types are often used to describe
>> how an actual parameter or variable is generated. For example,
>>
>> 0x000015cf:   DW_TAG_base_type
>>                  DW_AT_name      ("DW_ATE_unsigned_1")
>>                  DW_AT_encoding  (DW_ATE_unsigned)
>>                  DW_AT_byte_size (0x00)
>>
>> 0x00010ed9:         DW_TAG_formal_parameter
>>                        DW_AT_location    (DW_OP_lit0,
>>                                           DW_OP_not,
>>                                           DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1",
>>                                           DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8",
>>                                           DW_OP_stack_value)
>>                        DW_AT_abstract_origin     (0x00013984 "branch")
>>
>> What it does is take a literal "0", apply a "not" operation, and then convert
>> the result to a one-bit unsigned int and then to an 8-bit unsigned int.
>>
>> Another example,
>>
>> 0x000e97e4:   DW_TAG_base_type
>>                  DW_AT_name      ("DW_ATE_unsigned_24")
>>                  DW_AT_encoding  (DW_ATE_unsigned)
>>                  DW_AT_byte_size (0x03)
>>
>> 0x000f88f8:     DW_TAG_variable
>>                    DW_AT_location        (indexed (0x3c) loclist = 0x00008fb0:
>>                       [0xffffffff82808812, 0xffffffff82808817):
>>                           DW_OP_breg0 RAX+0,
>>                           DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>>                           DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
>>                           DW_OP_stack_value,
>>                           DW_OP_piece 0x1,
>>                           DW_OP_breg0 RAX+0,
>>                           DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>>                           DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>>                           DW_OP_lit8,
>>                           DW_OP_shr,
>>                           DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>>                           DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
>>                           DW_OP_stack_value,
>>                           DW_OP_piece 0x3
>>                       ......
>>
>> At one point, a right shift by 8 happens and the result is converted to
>> 32-bit unsigned int and then to 24-bit unsigned int.
>>
>> BTF does not need any of this DW_OP_* information, and such non-regular int
>> types will cause libbpf to emit errors.
>> Let us sanitize them to generate BTF acceptable to libbpf and the kernel.
>>
>> Cc: Sedat Dilek <sedat.dilek@gmail.com>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/libbtf.c b/libbtf.c
>> index 9f76283..93fe185 100644
>> --- a/libbtf.c
>> +++ b/libbtf.c
>> @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
>>          struct btf *btf = btfe->btf;
>>          const struct btf_type *t;
>>          uint8_t encoding = 0;
>> +       uint16_t byte_sz;
>>          int32_t id;
>>
>>          if (bt->is_signed) {
>> @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
>>                  return -1;
>>          }
>>
>> -       id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding);
>> +       /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where
>> +        * {num} is not power of 2 and may exceed 128. Such attributes
>> +        * are mostly used to record operation for an actual parameter
>> +        * or variable.
>> +        * For example,
>> +        *     DW_AT_location        (indexed (0x3c) loclist = 0x00008fb0:
>> +        *         [0xffffffff82808812, 0xffffffff82808817):
>> +        *             DW_OP_breg0 RAX+0,
>> +        *             DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>> +        *             DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
>> +        *             DW_OP_stack_value,
>> +        *             DW_OP_piece 0x1,
>> +        *             DW_OP_breg0 RAX+0,
>> +        *             DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
>> +        *             DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>> +        *             DW_OP_lit8,
>> +        *             DW_OP_shr,
>> +        *             DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
>> +        *             DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
>> +        *             DW_OP_stack_value, DW_OP_piece 0x3
>> +        *     DW_AT_name    ("ebx")
>> +        *     DW_AT_decl_file       ("/linux/arch/x86/events/intel/core.c")
>> +        *
>> +        * In the above example, at some point, one unsigned_32 value
>> +        * is right shifted by 8 and the result is converted to unsigned_32
>> +        * and then unsigned_24.
>> +        *
>> +        * BTF does not need such DW_OP_* information so let us sanitize
>> +        * these non-regular int types to avoid libbpf/kernel complaints.
>> +        */
>> +       byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size);
>> +       if (!byte_sz || (byte_sz & (byte_sz - 1))) {
>> +               name = "sanitized_int";
> 
> DWARF never stops causing issues :( How about making this name stand
> out a bit more: __SANITIZED_FAKE_INT__ ? Similar in style to
> __ARRAY_INDEX_TYPE__?

Good idea. __SANITIZED_FAKE_INT__ makes it easy to see that
this is some kind of workaround.

Will send v2 soon.

> 
> Otherwise looks good to me, even though it's a bit sketchy to just
> "fix up" any integer that doesn't conform to our idea of "normal
> integer". But as I said, DWARF is DWARF...
> 
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
>> +               byte_sz = 4;
>> +       }
>> +
>> +       id = btf__add_int(btf, name, byte_sz, encoding);
>>          if (id < 0) {
>>                  btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type");
>>          } else {
>> --
>> 2.24.1
>>

