All of lore.kernel.org
* [PATCH v2 0/3] Infer BTF alignment
@ 2021-10-18 13:16 Douglas RAILLARD
  2021-10-18 13:16 ` [PATCH v2 1/3] fprintf: Fix nested struct printing Douglas RAILLARD
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Douglas RAILLARD @ 2021-10-18 13:16 UTC (permalink / raw)
  To: acme; +Cc: dwarves, douglas.raillard

From: Douglas Raillard <douglas.raillard@arm.com>

Infer alignment of struct members when loading BTF, so that using the
header dumped by pahole results in the same offsets. This is useful for
debugging tools that need to access pointers to structs of possibly
private types. One such example is kernel modules bolting trace events
onto existing upstream tracepoints, which typically expose "top-level"
struct pointers (e.g. struct cfs_rq) to allow maximum flexibility.

Changes since v1:
* Split alignment patch into refactor and new code.
* Test with fullcircle on vmlinux and .o of kernel tree:
    
  export PATH=pahole/build:$PATH
  for f in {kernel,drivers,arch,block,crypto,certs,fs,init,ipc,lib,mm,net,sound,virt}/**.o; do
     echo $f
     pahole/fullcircle $f
  done

  This did not produce any errors, but neither did running it on the
  pahole master branch, presumably because fullcircle reads DWARF by
  default.

  Trying with "pfunct -F btf" on object files seemed to yield empty
  sources:

    pfunct: file 'kernel/sched/fair.o' has no btf type information.

  Running pfunct on vmlinux worked, but fullcircle seems to have bailed
  out before getting to pfunct. When commenting out the bailouts, I
  managed to get pfunct to work, but the header generated with --compile
  contains one duplicate type and lots of broken inlined functions
  (e.g. "return 0;" in a function returning a struct).
  

Douglas Raillard (3):
  fprintf: Fix nested struct printing
  btf_loader.c: Refactor class__fixup_btf_bitfields
  btf_loader.c: Infer alignment info

 btf_loader.c      | 65 ++++++++++++++++++++++++++++++++++++-----------
 dwarves.c         |  2 +-
 dwarves.h         |  2 ++
 dwarves_fprintf.c | 22 +++++++++++++---
 4 files changed, 71 insertions(+), 20 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/3] fprintf: Fix nested struct printing
  2021-10-18 13:16 [PATCH v2 0/3] Infer BTF alignment Douglas RAILLARD
@ 2021-10-18 13:16 ` Douglas RAILLARD
  2021-10-18 13:16 ` [PATCH v2 2/3] btf_loader.c: Refactor class__fixup_btf_bitfields Douglas RAILLARD
  2021-10-18 13:16 ` [PATCH v2 3/3] btf_loader.c: Infer alignment info Douglas RAILLARD
  2 siblings, 0 replies; 11+ messages in thread
From: Douglas RAILLARD @ 2021-10-18 13:16 UTC (permalink / raw)
  To: acme; +Cc: dwarves, douglas.raillard

From: Douglas Raillard <douglas.raillard@arm.com>

This code:

    struct X {
       struct {
       } __attribute__((foo)) x __attribute__((bar));
    }

Was wrongly printed as:

    struct X {
       struct {
       } x __attribute__((foo)) __attribute__((bar));
    }

This unfortunately matters a lot, since "bar" is supposed to apply to
"x", but "foo" to typeof(x). In the wrong form, both apply to "x",
leading to e.g. an incorrect layout for the __aligned__ attribute.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
---
 dwarves_fprintf.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/dwarves_fprintf.c b/dwarves_fprintf.c
index c35868a..1c1d949 100644
--- a/dwarves_fprintf.c
+++ b/dwarves_fprintf.c
@@ -787,7 +787,7 @@ print_default:
 					   (type->tag == DW_TAG_class_type &&
 					    !tconf.classes_as_structs) ? "class" : "struct",
 					   tconf.type_spacing - 7,
-					   type__name(ctype), name);
+					   type__name(ctype), name ?: "");
 		} else {
 			struct class *cclass = tag__class(type);
 
@@ -802,7 +802,7 @@ print_default:
 		ctype = tag__type(type);
 
 		if (type__name(ctype) != NULL && !expand_types) {
-			printed += fprintf(fp, "union %-*s %s", tconf.type_spacing - 6, type__name(ctype), name);
+			printed += fprintf(fp, "union %-*s %s", tconf.type_spacing - 6, type__name(ctype), name ?: "");
 		} else {
 			tconf.type_spacing -= 8;
 			printed += union__fprintf(ctype, cu, &tconf, fp);
@@ -812,7 +812,7 @@ print_default:
 		ctype = tag__type(type);
 
 		if (type__name(ctype) != NULL)
-			printed += fprintf(fp, "enum %-*s %s", tconf.type_spacing - 5, type__name(ctype), name);
+			printed += fprintf(fp, "enum %-*s %s", tconf.type_spacing - 5, type__name(ctype), name ?: "");
 		else
 			printed += enumeration__fprintf(type, &tconf, fp);
 		break;
@@ -863,7 +863,21 @@ static size_t class_member__fprintf(struct class_member *member, bool union_memb
 	if (member->is_static)
 		printed += fprintf(fp, "static ");
 
-	printed += type__fprintf(type, cu, name, &sconf, fp);
+	/* For struct-like constructs, the name of the member cannot be
+	 * conflated with the name of its type, otherwise __attribute__ are
+	 * printed in the wrong order.
+	 */
+	if (tag__is_union(type) || tag__is_struct(type) ||
+	    tag__is_enumeration(type)) {
+		printed += type__fprintf(type, cu, NULL, &sconf, fp);
+		if (name) {
+			if (!type__name(tag__type(type)))
+				printed += fprintf(fp, " ");
+			printed += fprintf(fp, "%s", name);
+		}
+	} else {
+		printed += type__fprintf(type, cu, name, &sconf, fp);
+	}
 
 	if (member->is_static) {
 		if (member->const_value != 0)
-- 
2.25.1



* [PATCH v2 2/3] btf_loader.c: Refactor class__fixup_btf_bitfields
  2021-10-18 13:16 [PATCH v2 0/3] Infer BTF alignment Douglas RAILLARD
  2021-10-18 13:16 ` [PATCH v2 1/3] fprintf: Fix nested struct printing Douglas RAILLARD
@ 2021-10-18 13:16 ` Douglas RAILLARD
  2021-10-21 20:49   ` Arnaldo Carvalho de Melo
  2021-10-18 13:16 ` [PATCH v2 3/3] btf_loader.c: Infer alignment info Douglas RAILLARD
  2 siblings, 1 reply; 11+ messages in thread
From: Douglas RAILLARD @ 2021-10-18 13:16 UTC (permalink / raw)
  To: acme; +Cc: dwarves, douglas.raillard

From: Douglas Raillard <douglas.raillard@arm.com>

Refactor class__fixup_btf_bitfields to remove a "continue" statement, to
prepare the ground for an alignment fixup that is relevant for types
matching:

    type->tag != DW_TAG_base_type && type->tag != DW_TAG_enumeration_type

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
---
 btf_loader.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/btf_loader.c b/btf_loader.c
index 3e5a945..9c2daee 100644
--- a/btf_loader.c
+++ b/btf_loader.c
@@ -486,28 +486,27 @@ static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu)
 		pos->byte_size = tag__size(type, cu);
 		pos->bit_size = pos->byte_size * 8;
 
-		/* bitfield fixup is needed for enums and base types only */
-		if (type->tag != DW_TAG_base_type && type->tag != DW_TAG_enumeration_type)
-			continue;
-
 		/* if BTF data is incorrect and has size == 0, skip field,
 		 * instead of crashing */
 		if (pos->byte_size == 0) {
 			continue;
 		}
 
-		if (pos->bitfield_size) {
-			/* bitfields seem to be always aligned, no matter the packing */
-			pos->byte_offset = pos->bit_offset / pos->bit_size * pos->bit_size / 8;
-			pos->bitfield_offset = pos->bit_offset - pos->byte_offset * 8;
-			/* re-adjust bitfield offset if it is negative */
-			if (pos->bitfield_offset < 0) {
-				pos->bitfield_offset += pos->bit_size;
-				pos->byte_offset -= pos->byte_size;
-				pos->bit_offset = pos->byte_offset * 8 + pos->bitfield_offset;
+		/* bitfield fixup is needed for enums and base types only */
+		if (type->tag == DW_TAG_base_type || type->tag == DW_TAG_enumeration_type) {
+			if (pos->bitfield_size) {
+				/* bitfields seem to be always aligned, no matter the packing */
+				pos->byte_offset = pos->bit_offset / pos->bit_size * pos->bit_size / 8;
+				pos->bitfield_offset = pos->bit_offset - pos->byte_offset * 8;
+				/* re-adjust bitfield offset if it is negative */
+				if (pos->bitfield_offset < 0) {
+					pos->bitfield_offset += pos->bit_size;
+					pos->byte_offset -= pos->byte_size;
+					pos->bit_offset = pos->byte_offset * 8 + pos->bitfield_offset;
+				}
+			} else {
+				pos->byte_offset = pos->bit_offset / 8;
 			}
-		} else {
-			pos->byte_offset = pos->bit_offset / 8;
 		}
 	}
 
-- 
2.25.1



* [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-18 13:16 [PATCH v2 0/3] Infer BTF alignment Douglas RAILLARD
  2021-10-18 13:16 ` [PATCH v2 1/3] fprintf: Fix nested struct printing Douglas RAILLARD
  2021-10-18 13:16 ` [PATCH v2 2/3] btf_loader.c: Refactor class__fixup_btf_bitfields Douglas RAILLARD
@ 2021-10-18 13:16 ` Douglas RAILLARD
  2021-10-22  0:31   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 11+ messages in thread
From: Douglas RAILLARD @ 2021-10-18 13:16 UTC (permalink / raw)
  To: acme; +Cc: dwarves, douglas.raillard

From: Douglas Raillard <douglas.raillard@arm.com>

BTF does not carry alignment information, but it carries the offset in
structs. This allows inferring the original alignment, yielding a C
header dump that is not identical to the original C code, but is
guaranteed to lead to the same memory layout.

This allows using the output of pahole in another program to poke at
memory, with the assurance that we will not read garbage.

Note: Since the alignment is inferred from the offset, it sometimes
happens that the offset was already correctly aligned, which means the
inferred alignment will be smaller than in the original source. This
does not impact the ability to read existing structs, but it could
impact creating such structs if other client code expects a higher
alignment than the one exposed in the generated header.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
---
 btf_loader.c | 36 ++++++++++++++++++++++++++++++++++++
 dwarves.c    |  2 +-
 dwarves.h    |  2 ++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/btf_loader.c b/btf_loader.c
index 9c2daee..2885252 100644
--- a/btf_loader.c
+++ b/btf_loader.c
@@ -471,10 +471,37 @@ static int btf__load_sections(struct btf *btf, struct cu *cu)
 	return btf__load_types(btf, cu);
 }
 
+static uint32_t class__infer_alignment(uint32_t byte_offset,
+				       uint32_t natural_alignment,
+				       uint32_t smallest_offset)
+{
+	uint32_t alignment = 0;
+	uint32_t offset_delta = byte_offset - smallest_offset;
+
+	if (offset_delta) {
+		if (byte_offset % 2 == 0) {
+			/* Find the power of 2 immediately higher than
+			 * offset_delta
+			 */
+			alignment = 1 << (8 * sizeof(offset_delta) -
+					      __builtin_clz(offset_delta));
+		} else {
+			alignment = 0;
+		}
+	}
+
+	/* Natural alignment, nothing to do */
+	if (alignment <= natural_alignment || alignment == 1)
+		alignment = 0;
+
+	return alignment;
+}
+
 static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu)
 {
 	struct class_member *pos;
 	struct type *tag_type = tag__type(tag);
+	uint32_t smallest_offset = 0;
 
 	type__for_each_data_member(tag_type, pos) {
 		struct tag *type = tag__strip_typedefs_and_modifiers(&pos->tag, cu);
@@ -508,8 +535,17 @@ static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu)
 				pos->byte_offset = pos->bit_offset / 8;
 			}
 		}
+
+		pos->alignment = class__infer_alignment(pos->byte_offset,
+							tag__natural_alignment(type, cu),
+							smallest_offset);
+		smallest_offset = pos->byte_offset + pos->byte_size;
 	}
 
+	tag_type->alignment = class__infer_alignment(tag_type->size,
+						     tag__natural_alignment(tag, cu),
+						     smallest_offset);
+
 	return 0;
 }
 
diff --git a/dwarves.c b/dwarves.c
index b6f2489..bb8af5b 100644
--- a/dwarves.c
+++ b/dwarves.c
@@ -1515,7 +1515,7 @@ void class__find_holes(struct class *class)
 
 static size_t type__natural_alignment(struct type *type, const struct cu *cu);
 
-static size_t tag__natural_alignment(struct tag *tag, const struct cu *cu)
+size_t tag__natural_alignment(struct tag *tag, const struct cu *cu)
 {
 	size_t natural_alignment = 1;
 
diff --git a/dwarves.h b/dwarves.h
index 30d33fa..c2fea0a 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -1002,6 +1002,8 @@ struct type {
 
 void __type__init(struct type *type);
 
+size_t tag__natural_alignment(struct tag *tag, const struct cu *cu);
+
 static inline struct class *type__class(const struct type *type)
 {
 	return (struct class *)type;
-- 
2.25.1



* Re: [PATCH v2 2/3] btf_loader.c: Refactor class__fixup_btf_bitfields
  2021-10-18 13:16 ` [PATCH v2 2/3] btf_loader.c: Refactor class__fixup_btf_bitfields Douglas RAILLARD
@ 2021-10-21 20:49   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-10-21 20:49 UTC (permalink / raw)
  To: Douglas RAILLARD; +Cc: acme, dwarves

Em Mon, Oct 18, 2021 at 02:16:20PM +0100, Douglas RAILLARD escreveu:
> From: Douglas Raillard <douglas.raillard@arm.com>
> 
> Refactor class__fixup_btf_bitfields to remove a "continue" statement, to
> prepare the ground for alignment fixup that is relevant for some types
> matching:
> 
>     type->tag != DW_TAG_base_type && type->tag != DW_TAG_enumeration_type

Thanks, applied.

    Committer testing:

    btfdiff passes for a x86_64 kernel built with gcc and for a clang
    thin-LTO vmlinux build.


- Arnaldo

 
> Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
> ---
>  btf_loader.c | 29 ++++++++++++++---------------
>  1 file changed, 14 insertions(+), 15 deletions(-)
> 
> diff --git a/btf_loader.c b/btf_loader.c
> index 3e5a945..9c2daee 100644
> --- a/btf_loader.c
> +++ b/btf_loader.c
> @@ -486,28 +486,27 @@ static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu)
>  		pos->byte_size = tag__size(type, cu);
>  		pos->bit_size = pos->byte_size * 8;
>  
> -		/* bitfield fixup is needed for enums and base types only */
> -		if (type->tag != DW_TAG_base_type && type->tag != DW_TAG_enumeration_type)
> -			continue;
> -
>  		/* if BTF data is incorrect and has size == 0, skip field,
>  		 * instead of crashing */
>  		if (pos->byte_size == 0) {
>  			continue;
>  		}
>  
> -		if (pos->bitfield_size) {
> -			/* bitfields seem to be always aligned, no matter the packing */
> -			pos->byte_offset = pos->bit_offset / pos->bit_size * pos->bit_size / 8;
> -			pos->bitfield_offset = pos->bit_offset - pos->byte_offset * 8;
> -			/* re-adjust bitfield offset if it is negative */
> -			if (pos->bitfield_offset < 0) {
> -				pos->bitfield_offset += pos->bit_size;
> -				pos->byte_offset -= pos->byte_size;
> -				pos->bit_offset = pos->byte_offset * 8 + pos->bitfield_offset;
> +		/* bitfield fixup is needed for enums and base types only */
> +		if (type->tag == DW_TAG_base_type || type->tag == DW_TAG_enumeration_type) {
> +			if (pos->bitfield_size) {
> +				/* bitfields seem to be always aligned, no matter the packing */
> +				pos->byte_offset = pos->bit_offset / pos->bit_size * pos->bit_size / 8;
> +				pos->bitfield_offset = pos->bit_offset - pos->byte_offset * 8;
> +				/* re-adjust bitfield offset if it is negative */
> +				if (pos->bitfield_offset < 0) {
> +					pos->bitfield_offset += pos->bit_size;
> +					pos->byte_offset -= pos->byte_size;
> +					pos->bit_offset = pos->byte_offset * 8 + pos->bitfield_offset;
> +				}
> +			} else {
> +				pos->byte_offset = pos->bit_offset / 8;
>  			}
> -		} else {
> -			pos->byte_offset = pos->bit_offset / 8;
>  		}
>  	}
>  
> -- 
> 2.25.1

-- 

- Arnaldo


* Re: [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-18 13:16 ` [PATCH v2 3/3] btf_loader.c: Infer alignment info Douglas RAILLARD
@ 2021-10-22  0:31   ` Arnaldo Carvalho de Melo
  2021-10-25 17:06     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-10-22  0:31 UTC (permalink / raw)
  To: Douglas RAILLARD; +Cc: acme, dwarves

Em Mon, Oct 18, 2021 at 02:16:21PM +0100, Douglas RAILLARD escreveu:
> From: Douglas Raillard <douglas.raillard@arm.com>
> 
> BTF does not carry alignment information, but it carries the offset in
> structs. This allows inferring the original alignment, yielding a C
> header dump that is not identical to the original C code, but is
> guaranteed to lead to the same memory layout.
> 
> This allows using the output of pahole in another program to poke at
> memory, with the assurance that we will not read garbage.
> 
> Note: Since the alignment is inferred from the offset, it sometimes
> happens that the offset was already correctly aligned, which means the
> inferred alignment will be smaller than in the original source. This
> does not impact the ability to read existing structs, but it could
> impact creating such struct if other client code expects higher
> alignment than the one exposed in the generated header.

this one makes btfdiff fail, example:


@@ -125578,7 +125640,7 @@ struct xt_entry_match {
 			struct xt_match * match;         /*     8     8 */
 		} kernel;                                /*     0    16 */
 		__u16              match_size;           /*     0     2 */
-	} u;                                             /*     0    32 */
+	} u;            /*     0    32 */
 	unsigned char              data[];               /*    32     0 */

 	/* size: 32, cachelines: 1, members: 2 */

Why the change in the generated source code comment alignment?

There are some other differences, I'll check tomorrow.

- Arnaldo
 
> Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
> ---
>  btf_loader.c | 36 ++++++++++++++++++++++++++++++++++++
>  dwarves.c    |  2 +-
>  dwarves.h    |  2 ++
>  3 files changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/btf_loader.c b/btf_loader.c
> index 9c2daee..2885252 100644
> --- a/btf_loader.c
> +++ b/btf_loader.c
> @@ -471,10 +471,37 @@ static int btf__load_sections(struct btf *btf, struct cu *cu)
>  	return btf__load_types(btf, cu);
>  }
>  
> +static uint32_t class__infer_alignment(uint32_t byte_offset,
> +				       uint32_t natural_alignment,
> +				       uint32_t smallest_offset)
> +{
> +	uint32_t alignment = 0;
> +	uint32_t offset_delta = byte_offset - smallest_offset;
> +
> +	if (offset_delta) {
> +		if (byte_offset % 2 == 0) {
> +			/* Find the power of 2 immediately higher than
> +			 * offset_delta
> +			 */
> +			alignment = 1 << (8 * sizeof(offset_delta) -
> +					      __builtin_clz(offset_delta));
> +		} else {
> +			alignment = 0;
> +		}
> +	}
> +
> +	/* Natural alignment, nothing to do */
> +	if (alignment <= natural_alignment || alignment == 1)
> +		alignment = 0;
> +
> +	return alignment;
> +}
> +
>  static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu)
>  {
>  	struct class_member *pos;
>  	struct type *tag_type = tag__type(tag);
> +	uint32_t smallest_offset = 0;
>  
>  	type__for_each_data_member(tag_type, pos) {
>  		struct tag *type = tag__strip_typedefs_and_modifiers(&pos->tag, cu);
> @@ -508,8 +535,17 @@ static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu)
>  				pos->byte_offset = pos->bit_offset / 8;
>  			}
>  		}
> +
> +		pos->alignment = class__infer_alignment(pos->byte_offset,
> +							tag__natural_alignment(type, cu),
> +							smallest_offset);
> +		smallest_offset = pos->byte_offset + pos->byte_size;
>  	}
>  
> +	tag_type->alignment = class__infer_alignment(tag_type->size,
> +						     tag__natural_alignment(tag, cu),
> +						     smallest_offset);
> +
>  	return 0;
>  }
>  
> diff --git a/dwarves.c b/dwarves.c
> index b6f2489..bb8af5b 100644
> --- a/dwarves.c
> +++ b/dwarves.c
> @@ -1515,7 +1515,7 @@ void class__find_holes(struct class *class)
>  
>  static size_t type__natural_alignment(struct type *type, const struct cu *cu);
>  
> -static size_t tag__natural_alignment(struct tag *tag, const struct cu *cu)
> +size_t tag__natural_alignment(struct tag *tag, const struct cu *cu)
>  {
>  	size_t natural_alignment = 1;
>  
> diff --git a/dwarves.h b/dwarves.h
> index 30d33fa..c2fea0a 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -1002,6 +1002,8 @@ struct type {
>  
>  void __type__init(struct type *type);
>  
> +size_t tag__natural_alignment(struct tag *tag, const struct cu *cu);
> +
>  static inline struct class *type__class(const struct type *type)
>  {
>  	return (struct class *)type;
> -- 
> 2.25.1

-- 

- Arnaldo


* Re: [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-22  0:31   ` Arnaldo Carvalho de Melo
@ 2021-10-25 17:06     ` Arnaldo Carvalho de Melo
  2021-10-26 15:03       ` Douglas Raillard
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-10-25 17:06 UTC (permalink / raw)
  To: Douglas RAILLARD; +Cc: acme, dwarves

Em Thu, Oct 21, 2021 at 09:31:36PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Oct 18, 2021 at 02:16:21PM +0100, Douglas RAILLARD escreveu:
> > From: Douglas Raillard <douglas.raillard@arm.com>
> > 
> > BTF does not carry alignment information, but it carries the offset in
> > structs. This allows inferring the original alignment, yielding a C
> > header dump that is not identical to the original C code, but is
> > guaranteed to lead to the same memory layout.
> > 
> > This allows using the output of pahole in another program to poke at
> > memory, with the assurance that we will not read garbage.
> > 
> > Note: Since the alignment is inferred from the offset, it sometimes
> > happens that the offset was already correctly aligned, which means the
> > inferred alignment will be smaller than in the original source. This
> > does not impact the ability to read existing structs, but it could
> > impact creating such struct if other client code expects higher
> > alignment than the one exposed in the generated header.
> 
> this one makes btfdiff fail, example:
> 
> 
> @@ -125578,7 +125640,7 @@ struct xt_entry_match {
>  			struct xt_match * match;         /*     8     8 */
>  		} kernel;                                /*     0    16 */
>  		__u16              match_size;           /*     0     2 */
> -	} u;                                             /*     0    32 */
> +	} u;            /*     0    32 */
>  	unsigned char              data[];               /*    32     0 */
> 
>  	/* size: 32, cachelines: 1, members: 2 */
> 
> Why the change in the generated source code comment alignment?

Since this is inferred and DWARF has the alignment info explicitly, we
can't really compare, so I'll add this to btfdiff.

diff --git a/btfdiff b/btfdiff
index 77543630d1965b5e..cbdf65285cd90f62 100755
--- a/btfdiff
+++ b/btfdiff
@@ -33,6 +33,7 @@ ${pahole_bin} -F dwarf \
 	      --show_private_classes $dwarf_input > $dwarf_output
 ${pahole_bin} -F btf \
 	      --sort \
+	      --suppress_aligned_attribute \
 	      --suppress_packed \
 	      $btf_input > $btf_output
 
 
> There are some other differences, I'll check tomorrow.

I got sidetracked, will try to reduce the differences somehow, but
probably as patches on top of yours. To test this, the best is to use
fullcircle from BTF info, which I'll also look into, as I think you
reported issues with pfunct's --compile option.

- Arnaldo


* Re: [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-25 17:06     ` Arnaldo Carvalho de Melo
@ 2021-10-26 15:03       ` Douglas Raillard
  2021-10-27 20:47         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Douglas Raillard @ 2021-10-26 15:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: acme, dwarves

On 10/25/21 6:06 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Oct 21, 2021 at 09:31:36PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Oct 18, 2021 at 02:16:21PM +0100, Douglas RAILLARD escreveu:
>>> From: Douglas Raillard <douglas.raillard@arm.com>
>>>
>>> BTF does not carry alignment information, but it carries the offset in
>>> structs. This allows inferring the original alignment, yielding a C
>>> header dump that is not identical to the original C code, but is
>>> guaranteed to lead to the same memory layout.
>>>
>>> This allows using the output of pahole in another program to poke at
>>> memory, with the assurance that we will not read garbage.
>>>
>>> Note: Since the alignment is inferred from the offset, it sometimes
>>> happens that the offset was already correctly aligned, which means the
>>> inferred alignment will be smaller than in the original source. This
>>> does not impact the ability to read existing structs, but it could
>>> impact creating such struct if other client code expects higher
>>> alignment than the one exposed in the generated header.
>>
>> this one makes btfdiff fail, example:
>>
>>
>> @@ -125578,7 +125640,7 @@ struct xt_entry_match {
>>   			struct xt_match * match;         /*     8     8 */
>>   		} kernel;                                /*     0    16 */
>>   		__u16              match_size;           /*     0     2 */
>> -	} u;                                             /*     0    32 */
>> +	} u;            /*     0    32 */
>>   	unsigned char              data[];               /*    32     0 */
>>
>>   	/* size: 32, cachelines: 1, members: 2 */
>>
>> Why the change in the generated source code comment alignment?
> 
> Since this is inferred and DWARF has the alignment info explicitely, we
> can't really compare, so I'll stick this to btfdiff.
> 
> diff --git a/btfdiff b/btfdiff
> index 77543630d1965b5e..cbdf65285cd90f62 100755
> --- a/btfdiff
> +++ b/btfdiff
> @@ -33,6 +33,7 @@ ${pahole_bin} -F dwarf \
>   	      --show_private_classes $dwarf_input > $dwarf_output
>   ${pahole_bin} -F btf \
>   	      --sort \
> +	      --suppress_aligned_attribute \
>   	      --suppress_packed \
>   	      $btf_input > $btf_output
>   
>   
>> There are some other differences, I'll check tomorrow.
> 
> I got sidetracked, will try to reduce the differences somehow, but
> probably as patches on top of yours, to test this the best is to use
> fullcircle from BTF info, which I'll also look into as I think you
> reported issues with pfunct's --compile option.

Thanks, let me know when you get something working on this front.
I also made the following Python script to validate the algorithm, assuming
the compiler will emit a layout according to natural alignment:

#! /usr/bin/env python3

import itertools
import math

class Member:
     def __init__(self, size, alignment=None):
         self.size = size
         self.alignment = size if alignment is None else alignment

         assert self.alignment >= self.size

     def __repr__(self):
         return f'Member(size={self.size}, alignment={self.alignment})'


def align(x, n):
     r = x % n
     padding = n - r if r else 0
     return x + padding


def next_power_of_2(x):
     return 2 ** math.ceil(math.log2(x)) if x else 1


def powers_of_2(from_, up_to):
     for i in itertools.count():
         x = 1 << i
         if x < from_:
             continue
         elif x <= up_to:
             yield x
         else:
             return

class Struct:
     def __init__(self, members, alignment=None):
         members = list(members)

         self.members = members
         self.alignment = alignment
         self.offsets = self._compute_offsets(members)
         self.size = self._compute_size(members, self.offsets, alignment)


     def __str__(self):
         members = ',\n    '.join(
             f'offset={offset}: {member}'
             for member, offset in self.members_offset
         )
         return f'Struct(\n    {members}\n)'

     @classmethod
     def from_offsets(cls, members, offsets):
         def compute(acc, member_spec):
             _, smallest_offset = acc
             member, offset = member_spec

             delta = offset - smallest_offset
             size = member.size

             # Get power of 2 that is strictly higher than delta
             alignment = next_power_of_2(delta + 1)

             # Minimum is natural alignment
             if alignment < size:
                 alignment = size

             member = Member(
                 size=size,
                 # Rather than using member.alignment to get the original
                 # alignment, infer it only from the offset
                 alignment=alignment,
             )
             return (member, offset + size)

         accs = itertools.accumulate(
             zip(members, offsets),
             compute,
             initial=(Member(0), 0),
         )
         members, _ = zip(*itertools.islice(accs, 1, None))

         return cls(
             members=members,
         )

     @staticmethod
     def _compute_offsets(members):
         def compute(acc, member):
             _, smallest_offset = acc

             offset = align(smallest_offset, member.alignment)
             next_offset = offset + member.size
             return (offset, next_offset)

         offsets = itertools.accumulate(
             members,
             compute,
             initial=(0, 0),
         )
         offsets, _ = zip(*itertools.islice(offsets, 1, None))
         return offsets

     @property
     def members_offset(self):
         return list(zip(self.members, self.offsets))

     @staticmethod
     def _compute_size(members, offsets, alignment):
         alignment = alignment or 1
         last = offsets[-1] + members[-1].size
         return align(last, alignment)


def test_align_infer(struct):

     # Reconstruct a struct by inferring alignment from offsets
     s2 = struct.from_offsets(struct.members, struct.offsets)

     errors = []
     for (m1, o1), (m2, o2) in zip(
         struct.members_offset,
         s2.members_offset,
     ):
         if o1 != o2:
             errors.append(
                 f'Wrong offset for {m1} (offset={o1}): {m2} (offset={o2})'
             )

     if errors:
         print()
         print(struct)
         print(s2)
         for error in errors:
             print(error)

# s1 = Struct(
#     [
#         Member(4),
#         Member(2),
#         Member(8),
#     ]
# )
# print(s1.members_offset)

members = [
     Member(size, alignment)
     for size in powers_of_2(1, 128)
     for alignment in powers_of_2(size, 256)
]

for nr_members in range(1, 4):
     for _members in itertools.permutations(members, nr_members):
         struct = Struct(list(_members))
         print(struct)
         res = test_align_infer(struct)


> - Arnaldo
> 



* Re: [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-26 15:03       ` Douglas Raillard
@ 2021-10-27 20:47         ` Arnaldo Carvalho de Melo
  2021-10-28  9:31           ` Douglas Raillard
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-10-27 20:47 UTC (permalink / raw)
  To: Douglas Raillard; +Cc: acme, dwarves

Em Tue, Oct 26, 2021 at 04:03:57PM +0100, Douglas Raillard escreveu:
> On 10/25/21 6:06 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Oct 21, 2021 at 09:31:36PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Mon, Oct 18, 2021 at 02:16:21PM +0100, Douglas RAILLARD escreveu:
> > > > From: Douglas Raillard <douglas.raillard@arm.com>

> > > > BTF does not carry alignment information, but it carries the offset in
> > > > structs. This allows inferring the original alignment, yielding a C
> > > > header dump that is not identical to the original C code, but is
> > > > guaranteed to lead to the same memory layout.

> > > > This allows using the output of pahole in another program to poke at
> > > > memory, with the assurance that we will not read garbage.

> > > > Note: Since the alignment is inferred from the offset, it sometimes
> > > > happens that the offset was already correctly aligned, which means the
> > > > inferred alignment will be smaller than in the original source. This
> > > > does not impact the ability to read existing structs, but it could
> > > > impact creating such struct if other client code expects higher
> > > > alignment than the one exposed in the generated header.

> > > this one makes btfdiff fail, example:

> > > @@ -125578,7 +125640,7 @@ struct xt_entry_match {
> > >   			struct xt_match * match;         /*     8     8 */
> > >   		} kernel;                                /*     0    16 */
> > >   		__u16              match_size;           /*     0     2 */
> > > -	} u;                                             /*     0    32 */
> > > +	} u;            /*     0    32 */
> > >   	unsigned char              data[];               /*    32     0 */
> > > 
> > >   	/* size: 32, cachelines: 1, members: 2 */
> > > 
> > > Why the change in the generated source code comment alignment?

The one above I fixed today, an old bug that gets exercised when
alignment is inferred for types.

> > Since this is inferred and DWARF has the alignment info explicitly, we
> > can't really compare, so I'll stick this to btfdiff.
<SNIP>
> > I got sidetracked, will try to reduce the differences somehow, but
> > probably as patches on top of yours, to test this the best is to use
> > fullcircle from BTF info, which I'll also look into as I think you
> > reported issues with pfunct's --compile option.
 
> Thanks, let me know when you get something working on this front.

So, what is now in the 'next' branch (an alias for the tmp.master branch
that is tested in the libbpf CI) produces:

  --- /tmp/btfdiff.dwarf.8098nf   2021-10-27 17:29:02.788601053 -0300
  +++ /tmp/btfdiff.btf.eTagYI     2021-10-27 17:29:02.994606515 -0300
  @@ -107,7 +107,7 @@ struct Qdisc {
          /* XXX 24 bytes hole, try to pack */
   
          /* --- cacheline 2 boundary (128 bytes) --- */
  -       struct sk_buff_head        gso_skb __attribute__((__aligned__(64))); /*   128    24 */
  +       struct sk_buff_head        gso_skb __attribute__((__aligned__(32))); /*   128    24 */
          struct qdisc_skb_head      q;                    /*   152    24 */
          struct gnet_stats_basic_packed bstats;           /*   176    16 */
          /* --- cacheline 3 boundary (192 bytes) --- */
  
The one above is a bit difficult to solve; perhaps we can use a
heuristic for kernel files and assume this is dependent on the
typical cacheline sizes? In the kernel sources it is:

          struct sk_buff_head     gso_skb ____cacheline_aligned_in_smp;

I.e. if the offset is also a multiple of 64, then we use 64 instead of 32?
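A minimal sketch of that heuristic (the helper name is hypothetical, and it assumes the kernel's typical 64-byte cacheline):

```python
CACHELINE = 64  # assumed typical kernel cacheline size

def promote_to_cacheline(offset, inferred):
    # If the member's offset is cacheline-aligned and the inferred
    # alignment is a smaller power of two, guess that the source used
    # ____cacheline_aligned_in_smp and report the cacheline size instead.
    if offset % CACHELINE == 0 and inferred < CACHELINE:
        return CACHELINE
    return inferred

# gso_skb case from the diff above: offset 128 is a multiple of 64,
# so the inferred 32 is promoted to 64.
print(promote_to_cacheline(128, 32))
```

A member at a non-cacheline offset (say 152) keeps its inferred alignment unchanged.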
  
  @@ -117,18 +117,18 @@ struct Qdisc {
          struct Qdisc *             next_sched;           /*   224     8 */
          struct sk_buff_head        skb_bad_txq;          /*   232    24 */
          /* --- cacheline 4 boundary (256 bytes) --- */
  -       spinlock_t                 busylock __attribute__((__aligned__(64))); /*   256     4 */
  +       spinlock_t                 busylock;             /*   256     4 */

Originally:

        spinlock_t              busylock ____cacheline_aligned_in_smp;

But since it is already naturally aligned (232 + 24 = 256), we can't
infer it from what BTF carries.
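That limit falls straight out of the offset-based rule used by the script later in this thread (sketched here with simplified names): when the previous member ends exactly at the next member's offset, the gap is zero and the inferred alignment collapses to the natural one.

```python
import math

def next_power_of_2(x):
    # Smallest power of two >= x (1 for x == 0), as in the thread's script.
    return 2 ** math.ceil(math.log2(x)) if x else 1

def infer_alignment(prev_end, offset, size):
    # Power of two strictly greater than the padding gap,
    # floored at the member's natural alignment.
    delta = offset - prev_end
    return max(next_power_of_2(delta + 1), size)

# busylock: skb_bad_txq ends at 232 + 24 = 256 and busylock sits at 256.
# The gap is 0, so the forced 64-byte alignment leaves no trace in BTF.
print(infer_alignment(232 + 24, 256, 4))
```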

          spinlock_t                 seqlock;              /*   260     4 */
  -       struct callback_head       rcu __attribute__((__aligned__(8))); /*   264    16 */
  +       struct callback_head       rcu;                  /*   264    16 */

Ditto
   
          /* XXX 40 bytes hole, try to pack */
   
          /* --- cacheline 5 boundary (320 bytes) --- */
  -       long int                   privdata[] __attribute__((__aligned__(64))); /*   320     0 */
  +       long int                   privdata[];           /*   320     0 */

But this one should've been inferred, right?
   
          /* size: 320, cachelines: 5, members: 27 */
          /* sum members: 256, holes: 2, sum holes: 64 */
  -       /* forced alignments: 4, forced holes: 2, sum forced holes: 64 */
  +       /* forced alignments: 1, forced holes: 1, sum forced holes: 24 */
   } __attribute__((__aligned__(64)));


Here both match, what was inferred for BTF is what DWARF produces, even
with the source code having no explicit annotation for the type.

> I also made the following Python script to validate the algorithm, assuming
> the compiler will emit a layout according to natural alignment:

Cool, I'll try and read it carefully.

- Arnaldo
 
> #! /usr/bin/env python3
> 
> import itertools
> import math
> 
> class Member:
>     def __init__(self, size, alignment=None):
>         self.size = size
>         self.alignment = size if alignment is None else alignment
> 
>         assert self.alignment >= self.size
> 
>     def __repr__(self):
>         return f'Member(size={self.size}, alignment={self.alignment})'
> 
> 
> def align(x, n):
>     r = x % n
>     padding = n - r if r else 0
>     return x + padding
> 
> 
> def next_power_of_2(x):
>     return 2 ** math.ceil(math.log2(x)) if x else 1
> 
> 
> def powers_of_2(from_, up_to):
>     for i in itertools.count():
>         x = 1 << i
>         if x < from_:
>             continue
>         elif x <= up_to:
>             yield x
>         else:
>             return
> 
> class Struct:
>     def __init__(self, members, alignment=None):
>         members = list(members)
> 
>         self.members = members
>         self.alignment = alignment
>         self.offsets = self._compute_offsets(members)
>         self.size = self._compute_size(members, self.offsets, alignment)
> 
> 
>     def __str__(self):
>         members = ',\n    '.join(
>             f'offset={offset}: {member}'
>             for member, offset in self.members_offset
>         )
>         return f'Struct(\n    {members}\n)'
> 
>     @classmethod
>     def from_offsets(cls, members, offsets):
>         def compute(acc, member_spec):
>             _, smallest_offset = acc
>             member, offset = member_spec
> 
>             delta = offset - smallest_offset
>             size = member.size
> 
>             # Get power of 2 that is strictly higher than delta
>             alignment = next_power_of_2(delta + 1)
> 
>             # Minimum is natural alignment
>             if alignment < size:
>                 alignment = size
> 
>             member = Member(
>                 size=size,
>                 # Rather than using member.alignment to get the original
>                 # alignment, infer it only from the offset
>                 alignment=alignment,
>             )
>             return (member, offset + size)
> 
>         accs = itertools.accumulate(
>             zip(members, offsets),
>             compute,
>             initial=(Member(0), 0),
>         )
>         members, _ = zip(*itertools.islice(accs, 1, None))
> 
>         return cls(
>             members=members,
>         )
> 
>     @staticmethod
>     def _compute_offsets(members):
>         def compute(acc, member):
>             _, smallest_offset = acc
> 
>             offset = align(smallest_offset, member.alignment)
>             next_offset = offset + member.size
>             return (offset, next_offset)
> 
>         offsets = itertools.accumulate(
>             members,
>             compute,
>             initial=(0, 0),
>         )
>         offsets, _ = zip(*itertools.islice(offsets, 1, None))
>         return offsets
> 
>     @property
>     def members_offset(self):
>         return list(zip(self.members, self.offsets))
> 
>     @staticmethod
>     def _compute_size(members, offsets, alignment):
>         alignment = alignment or 1
>         last = offsets[-1] + members[-1].size
>         return align(last, alignment)
> 
> 
> def test_align_infer(struct):
> 
>     # Reconstruct a struct by inferring alignment from offsets
>     s2 = struct.from_offsets(struct.members, struct.offsets)
> 
>     errors = []
>     for (m1, o1), (m2, o2) in zip(
>         struct.members_offset,
>         s2.members_offset,
>     ):
>         if o1 != o2:
>             errors.append(
>                 f'Wrong offset for {m1} (offset={o1}): {m2} (offset={o2})'
>             )
> 
>     if errors:
>         print()
>         print(struct)
>         print(s2)
>         for error in errors:
>             print(error)
> 
> # s1 = Struct(
> #     [
> #         Member(4),
> #         Member(2),
> #         Member(8),
> #     ]
> # )
> # print(s1.members_offset)
> 
> members = [
>     Member(size, alignment)
>     for size in powers_of_2(1, 128)
>     for alignment in powers_of_2(size, 256)
> ]
> 
> for nr_members in range(1, 4):
>     for _members in itertools.permutations(members, nr_members):
>         struct = Struct(list(_members))
>         print(struct)
>         res = test_align_infer(struct)
> 
> 
> > - Arnaldo
> > 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-27 20:47         ` Arnaldo Carvalho de Melo
@ 2021-10-28  9:31           ` Douglas Raillard
  2021-10-28 11:38             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Douglas Raillard @ 2021-10-28  9:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: acme, dwarves

On 10/27/21 9:47 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Oct 26, 2021 at 04:03:57PM +0100, Douglas Raillard escreveu:
>> On 10/25/21 6:06 PM, Arnaldo Carvalho de Melo wrote:
>>> Em Thu, Oct 21, 2021 at 09:31:36PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>> Em Mon, Oct 18, 2021 at 02:16:21PM +0100, Douglas RAILLARD escreveu:
>>>>> From: Douglas Raillard <douglas.raillard@arm.com>
> 
>>>>> BTF does not carry alignment information, but it carries the offset in
>>>>> structs. This allows inferring the original alignment, yielding a C
>>>>> header dump that is not identical to the original C code, but is
>>>>> guaranteed to lead to the same memory layout.
> 
>>>>> This allows using the output of pahole in another program to poke at
>>>>> memory, with the assurance that we will not read garbage.
> 
>>>>> Note: Since the alignment is inferred from the offset, it sometimes
>>>>> happens that the offset was already correctly aligned, which means the
>>>>> inferred alignment will be smaller than in the original source. This
>>>>> does not impact the ability to read existing structs, but it could
>>>>> impact creating such struct if other client code expects higher
>>>>> alignment than the one exposed in the generated header.
> 
>>>> this one makes btfdiff fail, example:
> 
>>>> @@ -125578,7 +125640,7 @@ struct xt_entry_match {
>>>>    			struct xt_match * match;         /*     8     8 */
>>>>    		} kernel;                                /*     0    16 */
>>>>    		__u16              match_size;           /*     0     2 */
>>>> -	} u;                                             /*     0    32 */
>>>> +	} u;            /*     0    32 */
>>>>    	unsigned char              data[];               /*    32     0 */
>>>>
>>>>    	/* size: 32, cachelines: 1, members: 2 */
>>>>
>>>> Why the change in the generated source code comment alignment?
> 
> The one above I fixed today, an old bug that gets exercised when
> alignment is inferred for types.
> 
>>> Since this is inferred and DWARF has the alignment info explicitly, we
>>> can't really compare, so I'll stick this to btfdiff.
> <SNIP>
>>> I got sidetracked, will try to reduce the differences somehow, but
>>> probably as patches on top of yours, to test this the best is to use
>>> fullcircle from BTF info, which I'll also look into as I think you
>>> reported issues with pfunct's --compile option.
>   
>> Thanks, let me know when you get something working on this front.
> 
> So, what is now in the 'next' branch (an alias for the tmp.master branch
> that is tested in the libbpf CI) produces:
> 
>    --- /tmp/btfdiff.dwarf.8098nf   2021-10-27 17:29:02.788601053 -0300
>    +++ /tmp/btfdiff.btf.eTagYI     2021-10-27 17:29:02.994606515 -0300
>    @@ -107,7 +107,7 @@ struct Qdisc {
>            /* XXX 24 bytes hole, try to pack */
>     
>            /* --- cacheline 2 boundary (128 bytes) --- */
>    -       struct sk_buff_head        gso_skb __attribute__((__aligned__(64))); /*   128    24 */
>    +       struct sk_buff_head        gso_skb __attribute__((__aligned__(32))); /*   128    24 */
>            struct qdisc_skb_head      q;                    /*   152    24 */
>            struct gnet_stats_basic_packed bstats;           /*   176    16 */
>            /* --- cacheline 3 boundary (192 bytes) --- */
>    
> The one above is a bit difficult to solve; perhaps we can use a
> heuristic for kernel files and assume this is dependent on the
> typical cacheline sizes? In the kernel sources it is:
> 
>            struct sk_buff_head     gso_skb ____cacheline_aligned_in_smp;
> 
> I.e. if the offset is also a multiple of 64, then we use 64 instead of 32?
>    
>    @@ -117,18 +117,18 @@ struct Qdisc {
>            struct Qdisc *             next_sched;           /*   224     8 */
>            struct sk_buff_head        skb_bad_txq;          /*   232    24 */
>            /* --- cacheline 4 boundary (256 bytes) --- */
>    -       spinlock_t                 busylock __attribute__((__aligned__(64))); /*   256     4 */
>    +       spinlock_t                 busylock;             /*   256     4 */
> 
> Originally:
> 
>          spinlock_t              busylock ____cacheline_aligned_in_smp;

Sounds like a good idea; I can't think of another reason for large alignments.
Larger power-of-2 alignments are harmless anyway, so we might as well try to guess.
I'll prepare a v3 with that update.

> 
> But since it is already naturally aligned (232 + 24 = 256), we can't
> infer it from what BTF carries.
> 
>            spinlock_t                 seqlock;              /*   260     4 */
>    -       struct callback_head       rcu __attribute__((__aligned__(8))); /*   264    16 */
>    +       struct callback_head       rcu;                  /*   264    16 */
> 
> Ditto
>     
>            /* XXX 40 bytes hole, try to pack */
>     
>            /* --- cacheline 5 boundary (320 bytes) --- */
>    -       long int                   privdata[] __attribute__((__aligned__(64))); /*   320     0 */
>    +       long int                   privdata[];           /*   320     0 */
> 
> But this one should've been inferred, right?

Assuming you are talking about offset 320 being aligned on 64, we cannot infer a
specific alignment, since 5 * 64 = 320.

If you mean the number of forced alignments, I'm not surprised that BTF finds
fewer forced alignments, as some of the technically forced alignments (in the
original source) also happen to be already satisfied by the normal layout, so
we don't infer anything special.

- Douglas

>            /* size: 320, cachelines: 5, members: 27 */
>            /* sum members: 256, holes: 2, sum holes: 64 */
>    -       /* forced alignments: 4, forced holes: 2, sum forced holes: 64 */
>    +       /* forced alignments: 1, forced holes: 1, sum forced holes: 24 */
>     } __attribute__((__aligned__(64)));
> 
> 
> Here both match, what was inferred for BTF is what DWARF produces, even
> with the source code having no explicit annotation for the type.
> 
>> I also made the following Python script to validate the algorithm, assuming
>> the compiler will emit a layout according to natural alignment:
> 
> Cool, I'll try and read it carefully.
> 
> - Arnaldo
>   
>> #! /usr/bin/env python3
>>
>> import itertools
>> import math
>>
>> class Member:
>>      def __init__(self, size, alignment=None):
>>          self.size = size
>>          self.alignment = size if alignment is None else alignment
>>
>>          assert self.alignment >= self.size
>>
>>      def __repr__(self):
>>          return f'Member(size={self.size}, alignment={self.alignment})'
>>
>>
>> def align(x, n):
>>      r = x % n
>>      padding = n - r if r else 0
>>      return x + padding
>>
>>
>> def next_power_of_2(x):
>>      return 2 ** math.ceil(math.log2(x)) if x else 1
>>
>>
>> def powers_of_2(from_, up_to):
>>      for i in itertools.count():
>>          x = 1 << i
>>          if x < from_:
>>              continue
>>          elif x <= up_to:
>>              yield x
>>          else:
>>              return
>>
>> class Struct:
>>      def __init__(self, members, alignment=None):
>>          members = list(members)
>>
>>          self.members = members
>>          self.alignment = alignment
>>          self.offsets = self._compute_offsets(members)
>>          self.size = self._compute_size(members, self.offsets, alignment)
>>
>>
>>      def __str__(self):
>>          members = ',\n    '.join(
>>              f'offset={offset}: {member}'
>>              for member, offset in self.members_offset
>>          )
>>          return f'Struct(\n    {members}\n)'
>>
>>      @classmethod
>>      def from_offsets(cls, members, offsets):
>>          def compute(acc, member_spec):
>>              _, smallest_offset = acc
>>              member, offset = member_spec
>>
>>              delta = offset - smallest_offset
>>              size = member.size
>>
>>              # Get power of 2 that is strictly higher than delta
>>              alignment = next_power_of_2(delta + 1)
>>
>>              # Minimum is natural alignment
>>              if alignment < size:
>>                  alignment = size
>>
>>              member = Member(
>>                  size=size,
>>                  # Rather than using member.alignment to get the original
>>                  # alignment, infer it only from the offset
>>                  alignment=alignment,
>>              )
>>              return (member, offset + size)
>>
>>          accs = itertools.accumulate(
>>              zip(members, offsets),
>>              compute,
>>              initial=(Member(0), 0),
>>          )
>>          members, _ = zip(*itertools.islice(accs, 1, None))
>>
>>          return cls(
>>              members=members,
>>          )
>>
>>      @staticmethod
>>      def _compute_offsets(members):
>>          def compute(acc, member):
>>              _, smallest_offset = acc
>>
>>              offset = align(smallest_offset, member.alignment)
>>              next_offset = offset + member.size
>>              return (offset, next_offset)
>>
>>          offsets = itertools.accumulate(
>>              members,
>>              compute,
>>              initial=(0, 0),
>>          )
>>          offsets, _ = zip(*itertools.islice(offsets, 1, None))
>>          return offsets
>>
>>      @property
>>      def members_offset(self):
>>          return list(zip(self.members, self.offsets))
>>
>>      @staticmethod
>>      def _compute_size(members, offsets, alignment):
>>          alignment = alignment or 1
>>          last = offsets[-1] + members[-1].size
>>          return align(last, alignment)
>>
>>
>> def test_align_infer(struct):
>>
>>      # Reconstruct a struct by inferring alignment from offsets
>>      s2 = struct.from_offsets(struct.members, struct.offsets)
>>
>>      errors = []
>>      for (m1, o1), (m2, o2) in zip(
>>          struct.members_offset,
>>          s2.members_offset,
>>      ):
>>          if o1 != o2:
>>              errors.append(
>>                  f'Wrong offset for {m1} (offset={o1}): {m2} (offset={o2})'
>>              )
>>
>>      if errors:
>>          print()
>>          print(struct)
>>          print(s2)
>>          for error in errors:
>>              print(error)
>>
>> # s1 = Struct(
>> #     [
>> #         Member(4),
>> #         Member(2),
>> #         Member(8),
>> #     ]
>> # )
>> # print(s1.members_offset)
>>
>> members = [
>>      Member(size, alignment)
>>      for size in powers_of_2(1, 128)
>>      for alignment in powers_of_2(size, 256)
>> ]
>>
>> for nr_members in range(1, 4):
>>      for _members in itertools.permutations(members, nr_members):
>>          struct = Struct(list(_members))
>>          print(struct)
>>          res = test_align_infer(struct)
>>
>>
>>> - Arnaldo
>>>
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/3] btf_loader.c: Infer alignment info
  2021-10-28  9:31           ` Douglas Raillard
@ 2021-10-28 11:38             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-10-28 11:38 UTC (permalink / raw)
  To: Douglas Raillard; +Cc: acme, dwarves

Em Thu, Oct 28, 2021 at 10:31:33AM +0100, Douglas Raillard escreveu:
> On 10/27/21 9:47 PM, Arnaldo Carvalho de Melo wrote:
> > So, what is now in the 'next' branch (an alias for the tmp.master branch
> > that is tested in the libbpf CI) produces:

> >    --- /tmp/btfdiff.dwarf.8098nf   2021-10-27 17:29:02.788601053 -0300
> >    +++ /tmp/btfdiff.btf.eTagYI     2021-10-27 17:29:02.994606515 -0300
> >    @@ -107,7 +107,7 @@ struct Qdisc {
> >            /* XXX 24 bytes hole, try to pack */
> >            /* --- cacheline 2 boundary (128 bytes) --- */
> >    -       struct sk_buff_head        gso_skb __attribute__((__aligned__(64))); /*   128    24 */
> >    +       struct sk_buff_head        gso_skb __attribute__((__aligned__(32))); /*   128    24 */
> >            struct qdisc_skb_head      q;                    /*   152    24 */
> >            struct gnet_stats_basic_packed bstats;           /*   176    16 */
> >            /* --- cacheline 3 boundary (192 bytes) --- */
> > The one above is a bit difficult to solve; perhaps we can use a
> > heuristic for kernel files and assume this is dependent on the
> > typical cacheline sizes? In the kernel sources it is:

> >            struct sk_buff_head     gso_skb ____cacheline_aligned_in_smp;

> > I.e. if the offset is also a multiple of 64, then we use 64 instead of 32?
> >    @@ -117,18 +117,18 @@ struct Qdisc {
> >            struct Qdisc *             next_sched;           /*   224     8 */
> >            struct sk_buff_head        skb_bad_txq;          /*   232    24 */
> >            /* --- cacheline 4 boundary (256 bytes) --- */
> >    -       spinlock_t                 busylock __attribute__((__aligned__(64))); /*   256     4 */
> >    +       spinlock_t                 busylock;             /*   256     4 */

> > Originally:

> >          spinlock_t              busylock ____cacheline_aligned_in_smp;

> Sounds like a good idea; I can't think of another reason for large alignments.
> Larger power-of-2 alignments are harmless anyway, so we might as well try to guess.
> I'll prepare a v3 with that update.

Cool
 
> > But since it is already naturally aligned (232 + 24 = 256), we can't
> > infer it from what BTF carries.

> >            spinlock_t                 seqlock;              /*   260     4 */
> >    -       struct callback_head       rcu __attribute__((__aligned__(8))); /*   264    16 */
> >    +       struct callback_head       rcu;                  /*   264    16 */

> > Ditto
> >            /* XXX 40 bytes hole, try to pack */
> >            /* --- cacheline 5 boundary (320 bytes) --- */
> >    -       long int                   privdata[] __attribute__((__aligned__(64))); /*   320     0 */
> >    +       long int                   privdata[];           /*   320     0 */

> > But this one should've been inferred, right?

> Assuming you are talking about offset 320 being aligned on 64, we cannot infer a
> specific alignment, since 5 * 64 = 320.

I mean this line:

            /* XXX 40 bytes hole, try to pack */

I.e. without an explicit __attribute__((__aligned__(64))) it wouldn't be
at 320, but at 280:

$ cat no_forced_alignment.c
struct foo {
	char	 rcu[264 + 16];
	long int privdata[];
} bar;
$ gcc -g -c no_forced_alignment.c
$ pahole no_forced_alignment.o
struct foo {
	char                       rcu[280];             /*     0   280 */
	/* --- cacheline 4 boundary (256 bytes) was 24 bytes ago --- */
	long int                   privdata[];           /*   280     0 */

	/* size: 280, cachelines: 5, members: 2 */
	/* last cacheline: 24 bytes */
};
$

It's only when we explicitly add the __attribute__((__aligned__(64)))
that we get that hole:

$ cat forced_alignment.c
struct foo {
	char	 rcu[264 + 16];
	long int privdata[] __attribute__((__aligned__(64)));
} bar;
$ gcc -g -c forced_alignment.c
$ pahole forced_alignment.o
struct foo {
	char                       rcu[280];             /*     0   280 */

	/* XXX 40 bytes hole, try to pack */

	/* --- cacheline 5 boundary (320 bytes) --- */
	long int                   privdata[] __attribute__((__aligned__(64))); /*   320     0 */

	/* size: 320, cachelines: 5, members: 2 */
	/* sum members: 280, holes: 1, sum holes: 40 */
	/* forced alignments: 1, forced holes: 1, sum forced holes: 40 */
} __attribute__((__aligned__(64)));
$

I.e. you have to look at the hole right after the previous class member
to notice that yeah, there is a forced alignment.
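That hole is exactly what the offset-based rule in this thread's script picks up; a quick check of the arithmetic (a sketch, reusing the script's next_power_of_2):

```python
import math

def next_power_of_2(x):
    # Smallest power of two >= x (1 for x == 0), as in the thread's script.
    return 2 ** math.ceil(math.log2(x)) if x else 1

# privdata: rcu ends at offset 280, privdata sits at 320, and a long
# needs only 8-byte natural alignment.  The 40-byte forced hole pushes
# the inferred alignment up to the power of two strictly above the gap.
delta = 320 - 280
alignment = max(next_power_of_2(delta + 1), 8)
print(alignment)
```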

> If you mean the number of forced alignments, I'm not surprised that BTF finds
> fewer forced alignments, as some of the technically forced alignments (in the
> original source) also happen to be already satisfied by the normal layout, so
> we don't infer anything special.

I meant the hole before privdata :-)

Yeah, there are some alignments that won't be detectable purely from
looking at BTF.

What I want is to reduce the btfdiff output and, in this case, fix a
bug: the BTF generated from this struct would be incorrectly aligned,
and someone using that privdata would get it at offset 280, not at 320
as intended.

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-10-28 11:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-18 13:16 [PATCH v2 0/3] Infer BTF alignment Douglas RAILLARD
2021-10-18 13:16 ` [PATCH v2 1/3] fprintf: Fix nested struct printing Douglas RAILLARD
2021-10-18 13:16 ` [PATCH v2 2/3] btf_loader.c: Refactor class__fixup_btf_bitfields Douglas RAILLARD
2021-10-21 20:49   ` Arnaldo Carvalho de Melo
2021-10-18 13:16 ` [PATCH v2 3/3] btf_loader.c: Infer alignment info Douglas RAILLARD
2021-10-22  0:31   ` Arnaldo Carvalho de Melo
2021-10-25 17:06     ` Arnaldo Carvalho de Melo
2021-10-26 15:03       ` Douglas Raillard
2021-10-27 20:47         ` Arnaldo Carvalho de Melo
2021-10-28  9:31           ` Douglas Raillard
2021-10-28 11:38             ` Arnaldo Carvalho de Melo
