All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: linux-kernel@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>,
	Keith Packard <keithp@keithp.com>,
	"Gustavo A . R . Silva" <gustavoars@kernel.org>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Dan Williams <dan.j.williams@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Daniel Micay <danielmicay@gmail.com>,
	Francis Laniel <laniel_francis@privacyrequired.com>,
	Bart Van Assche <bvanassche@acm.org>,
	David Gow <davidgow@google.com>,
	linux-mm@kvack.org, clang-built-linux@googlegroups.com,
	linux-hardening@vger.kernel.org
Subject: [PATCH for-next 04/25] stddef: Introduce struct_group() helper macro
Date: Sun, 22 Aug 2021 00:51:01 -0700	[thread overview]
Message-ID: <20210822075122.864511-5-keescook@chromium.org> (raw)
In-Reply-To: <20210822075122.864511-1-keescook@chromium.org>

Kernel code has a regular need to describe groups of members within a
structure usually when they need to be copied or initialized separately
from the rest of the surrounding structure. The generally accepted design
pattern in C is to use a named sub-struct:

	struct foo {
		int one;
		struct {
			int two;
			int three, four;
		} thing;
		int five;
	};

This would allow for traditional references and sizing:

	memcpy(&dst.thing, &src.thing, sizeof(dst.thing));

However, doing this would mean that referencing struct members enclosed
by such named structs would always require including the sub-struct name
in identifiers:

	do_something(dst.thing.three);

This has tended to be quite inflexible, especially when such groupings
need to be added to established code which causes huge naming churn.
Three workarounds exist in the kernel for this problem, and each have
other negative properties.

To avoid the naming churn, there is a design pattern of adding macro
aliases for the named struct:

	#define f_three thing.three

This ends up polluting the global namespace, and makes it difficult to
search for identifiers.

Another common work-around in kernel code avoids the pollution by avoiding
the named struct entirely, instead identifying the group's boundaries using
either a pair of empty anonymous structs of a pair of zero-element arrays:

	struct foo {
		int one;
		struct { } start;
		int two;
		int three, four;
		struct { } finish;
		int five;
	};

	struct foo {
		int one;
		int start[0];
		int two;
		int three, four;
		int finish[0];
		int five;
	};

This allows code to avoid needing to use a sub-struct named for member
references within the surrounding structure, but loses the benefits of
being able to actually use such a struct, making it rather fragile. Using
these requires open-coded calculation of sizes and offsets. The efforts
made to avoid common mistakes include lots of comments, or adding various
BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
about the boundaries (e.g. the "start" object looks like it's 0 bytes
in length), making bounds checking depend on open-coded calculations:

	if (length > offsetof(struct foo, finish) -
		     offsetof(struct foo, start))
		return -EINVAL;
	memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
				       offsetof(struct foo, start));

However, the vast majority of places in the kernel that operate on
groups of members do so without any identification of the grouping,
relying either on comments or implicit knowledge of the struct contents,
which is even harder for the compiler to reason about, and results in
even more fragile manual sizing, usually depending on member locations
outside of the region (e.g. to copy "two" and "three", use the start of
"four" to find the size):

	BUILD_BUG_ON((offsetof(struct foo, four) <
		      offsetof(struct foo, two)) ||
		     (offsetof(struct foo, four) <
		      offsetof(struct foo, three));
	if (length > offsetof(struct foo, four) -
		     offsetof(struct foo, two))
		return -EINVAL;
	memcpy(&dst.two, &src.two, length);

In order to have a regular programmatic way to describe a struct
region that can be used for references and sizing, can be examined for
bounds checking, avoids forcing the use of intermediate identifiers,
and avoids polluting the global namespace, introduce the struct_group()
macro. This macro wraps the member declarations to create an anonymous
union of an anonymous struct (no intermediate name) and a named struct
(for references and sizing):

	struct foo {
		int one;
		struct_group(thing,
			int two;
			int three, four;
		);
		int five;
	};

	if (length > sizeof(src.thing))
		return -EINVAL;
	memcpy(&dst.thing, &src.thing, length);
	do_something(dst.three);

There are some rare cases where the resulting struct_group() needs
attributes added, so struct_group_attr() is also introduced to allow
for specifying struct attributes (e.g. __align(x) or __packed).
Additionally, there are places where such declarations would like to
have the struct be tagged, so struct_group_tagged() is added.

Given there is a need for a handful of UAPI uses too, the underlying
__struct_group() macro has been defined in UAPI so it can be used there
too.

To avoid confusing scripts/kernel-doc, hide the macro from its struct
parsing.

Co-developed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
Enhanced-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk
Enhanced-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
Enhanced-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/stddef.h      | 48 +++++++++++++++++++++++++++++++++++++
 include/uapi/linux/stddef.h | 21 ++++++++++++++++
 scripts/kernel-doc          |  7 ++++++
 3 files changed, 76 insertions(+)

diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 8553b33143d1..8b103a53b000 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -36,4 +36,52 @@ enum {
 #define offsetofend(TYPE, MEMBER) \
 	(offsetof(TYPE, MEMBER)	+ sizeof_field(TYPE, MEMBER))
 
+/**
+ * struct_group() - Wrap a set of declarations in a mirrored struct
+ *
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members.
+ */
+#define struct_group(NAME, MEMBERS...)	\
+	__struct_group(/* no tag */, NAME, /* no attrs */, MEMBERS)
+
+/**
+ * struct_group_attr() - Create a struct_group() with trailing attributes
+ *
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes to apply
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes structure attributes argument.
+ */
+#define struct_group_attr(NAME, ATTRS, MEMBERS...) \
+	__struct_group(/* no tag */, NAME, ATTRS, MEMBERS)
+
+/**
+ * struct_group_tagged() - Create a struct_group with a reusable tag
+ *
+ * @TAG: The tag name for the named sub-struct
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes struct tag argument for the named copy,
+ * so the specified layout can be reused later.
+ */
+#define struct_group_tagged(TAG, NAME, MEMBERS...) \
+	__struct_group(TAG, NAME, /* no attrs */, MEMBERS)
+
 #endif
diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h
index ee8220f8dcf5..610204f7c275 100644
--- a/include/uapi/linux/stddef.h
+++ b/include/uapi/linux/stddef.h
@@ -4,3 +4,24 @@
 #ifndef __always_inline
 #define __always_inline inline
 #endif
+
+/**
+ * __struct_group() - Create a mirrored named and anonyomous struct
+ *
+ * @TAG: The tag name for the named sub-struct (usually empty)
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes (usually empty)
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical layout
+ * and size: one anonymous and one named. The former's members can be used
+ * normally without sub-struct naming, and the latter can be used to
+ * reason about the start, end, and size of the group of struct members.
+ * The named struct can also be explicitly tagged for layer reuse, as well
+ * as both having struct attributes appended.
+ */
+#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \
+	union { \
+		struct { MEMBERS } ATTRS; \
+		struct TAG { MEMBERS } ATTRS NAME; \
+	}
diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 7c4a6a507ac4..d9715efbe0b7 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1245,6 +1245,13 @@ sub dump_struct($$) {
 	$members =~ s/\s*CRYPTO_MINALIGN_ATTR/ /gos;
 	$members =~ s/\s*____cacheline_aligned_in_smp/ /gos;
 	$members =~ s/\s*____cacheline_aligned/ /gos;
+	# unwrap struct_group():
+	# - first eat non-declaration parameters and rewrite for final match
+	# - then remove macro, outer parens, and trailing semicolon
+	$members =~ s/\bstruct_group\s*\(([^,]*,)/STRUCT_GROUP(/gos;
+	$members =~ s/\bstruct_group_(attr|tagged)\s*\(([^,]*,){2}/STRUCT_GROUP(/gos;
+	$members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos;
+	$members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
 
 	my $args = qr{([^,)]+)};
 	# replace DECLARE_BITMAP
-- 
2.30.2


  parent reply	other threads:[~2021-08-22  7:51 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-22  7:50 [PATCH for-next 00/25] Prepare for better FORTIFY_SOURCE Kees Cook
2021-08-22  7:50 ` [PATCH for-next 01/25] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp Kees Cook
2021-08-22  7:50   ` Kees Cook
2021-08-22  7:50 ` [PATCH for-next 02/25] powerpc: Split memset() to avoid multi-field overflow Kees Cook
2021-08-22  7:50   ` Kees Cook
2021-08-22  7:51 ` [PATCH for-next 03/25] stddef: Fix kerndoc for sizeof_field() and offsetofend() Kees Cook
2021-08-22  7:51 ` Kees Cook [this message]
2021-08-22  7:51 ` [PATCH for-next 05/25] cxl/core: Replace unions with struct_group() Kees Cook
2021-08-22  7:51 ` [PATCH for-next 06/25] bnxt_en: Use struct_group_attr() for memcpy() region Kees Cook
2021-08-22  7:51 ` [PATCH for-next 07/25] iommu/amd: Use struct_group() " Kees Cook
2021-08-22  7:51 ` [PATCH for-next 08/25] drm/mga/mga_ioc32: " Kees Cook
2021-08-22  7:51 ` [PATCH for-next 09/25] HID: cp2112: " Kees Cook
2021-08-22  7:51 ` [PATCH for-next 10/25] HID: roccat: Use struct_group() to zero kone_mouse_event Kees Cook
2021-08-22  7:51 ` [PATCH for-next 11/25] can: flexcan: Use struct_group() to zero struct flexcan_regs regions Kees Cook
2021-08-22  7:51 ` [PATCH for-next 12/25] cm4000_cs: Use struct_group() to zero struct cm4000_dev region Kees Cook
2021-08-22  7:51 ` [PATCH for-next 13/25] compiler_types.h: Remove __compiletime_object_size() Kees Cook
2021-08-23  6:43   ` Rasmus Villemoes
2021-08-25 19:43     ` Nick Desaulniers
2021-08-25 19:43       ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 14/25] lib/string: Move helper functions out of string.c Kees Cook
2021-08-25 21:48   ` Nick Desaulniers
2021-08-25 21:48     ` Nick Desaulniers
2021-08-26  2:47     ` Kees Cook
2021-08-26 18:08       ` Nick Desaulniers
2021-08-26 18:08         ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 15/25] fortify: Move remaining fortify helpers into fortify-string.h Kees Cook
2021-08-25 21:59   ` Nick Desaulniers
2021-08-25 21:59     ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 16/25] fortify: Explicitly disable Clang support Kees Cook
2021-08-25 19:41   ` Nick Desaulniers
2021-08-25 19:41     ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 17/25] fortify: Fix dropped strcpy() compile-time write overflow check Kees Cook
2021-08-25 21:55   ` Nick Desaulniers
2021-08-25 21:55     ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 18/25] fortify: Prepare to improve strnlen() and strlen() warnings Kees Cook
2021-08-25 22:01   ` Nick Desaulniers
2021-08-25 22:01     ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 19/25] fortify: Allow strlen() and strnlen() to pass compile-time known lengths Kees Cook
2021-08-25 22:05   ` Nick Desaulniers
2021-08-25 22:05     ` Nick Desaulniers
2021-08-26  2:56     ` Kees Cook
2021-08-26 18:02       ` Nick Desaulniers
2021-08-26 18:02         ` Nick Desaulniers
2021-08-22  7:51 ` [PATCH for-next 20/25] fortify: Add compile-time FORTIFY_SOURCE tests Kees Cook
2021-08-22  7:51 ` [PATCH for-next 21/25] lib: Introduce CONFIG_TEST_MEMCPY Kees Cook
2021-08-24  7:00   ` David Gow
2021-08-24  7:00     ` David Gow
2021-08-25  2:32     ` Kees Cook
2021-10-18 15:46   ` Arnd Bergmann
2021-10-18 19:28     ` Kees Cook
2021-08-22  7:51 ` [PATCH for-next 22/25] string.h: Introduce memset_after() for wiping trailing members/padding Kees Cook
2021-08-22  7:51 ` [PATCH for-next 23/25] xfrm: Use memset_after() to clear padding Kees Cook
2021-08-22  7:51 ` [PATCH for-next 24/25] string.h: Introduce memset_startat() for wiping trailing members and padding Kees Cook
2021-08-22  7:51 ` [PATCH for-next 25/25] btrfs: Use memset_startat() to clear end of struct Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210822075122.864511-5-keescook@chromium.org \
    --to=keescook@chromium.org \
    --cc=bvanassche@acm.org \
    --cc=clang-built-linux@googlegroups.com \
    --cc=dan.j.williams@intel.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=danielmicay@gmail.com \
    --cc=davidgow@google.com \
    --cc=gustavoars@kernel.org \
    --cc=keithp@keithp.com \
    --cc=laniel_francis@privacyrequired.com \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@rasmusvillemoes.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.