From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D234DC19F37 for ; Sun, 22 Aug 2021 07:51:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BA32161242 for ; Sun, 22 Aug 2021 07:51:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232512AbhHVHwU (ORCPT ); Sun, 22 Aug 2021 03:52:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231790AbhHVHwK (ORCPT ); Sun, 22 Aug 2021 03:52:10 -0400 Received: from mail-pj1-x102e.google.com (mail-pj1-x102e.google.com [IPv6:2607:f8b0:4864:20::102e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21ED2C061760 for ; Sun, 22 Aug 2021 00:51:30 -0700 (PDT) Received: by mail-pj1-x102e.google.com with SMTP id mq3so9965579pjb.5 for ; Sun, 22 Aug 2021 00:51:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=eA1/RCZvA4eh8kSUwIkQsy83t52sxem7VcmVr/L+Sis=; b=OybusePZZZlI1BC9BNqbzYpfQ7d/uK2tBU5R2GteBcuKpkUjhmYHqKRLMD4r9xUQ0H UNDwlfNxmKhsyB/Pa3TWBIx4sEF1ESvBwBlrZ5Wwsr5wAtEggxmZLTKfyz9XtN21Fm0c 3DUMdF1ODQj9qhRHyqJc1flLkHB3zoZiYFPHA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=eA1/RCZvA4eh8kSUwIkQsy83t52sxem7VcmVr/L+Sis=; b=OUM1MfYlDepFXqjNFtBk9BhApwyR137Uh26dBRra1tXWRtenzmWJjnRvwgyMHRcF/r hbFWn/WgV4OKTV1FRyPy3P26/j8JLg8WI+BIG9MWkuyRhSxqYLmFYcrwim60m0nb7uRW kweNeNmcKNMloG0fRAmkyKWuV6YDHDxyRJtXPl9Jk/SVjpzJnkn+1KnVfXjiw2TisbMS sex7L0veOLA7lIReafJSgkDmg9s9U/dt58nuODrYqPxuJxw0uAVlNSA3BkhnmubaPGP2 KlytBgtPcAh0bJeoIy13MG3mn+3ZgRbmM/UK1vGsIZkAkwbU1NKukhVvF3Jm3LXnknh6 32Fw== X-Gm-Message-State: AOAM5331VfG6FrkjF7BS4bh4R3f93mJww8K8Yq3oK6rxw/o+/jk+EeI9 IcoTrkMMxhql0nKMgfBixHICkg== X-Google-Smtp-Source: ABdhPJzE9jMd+PS6f52ZpztVGK5B+iBkJ+6r6yiipWSzjTBftlqmy2oeRM8qy359aegah4DvQecJaQ== X-Received: by 2002:a17:902:8bc4:b029:12b:8470:e29e with SMTP id r4-20020a1709028bc4b029012b8470e29emr23726292plo.2.1629618689604; Sun, 22 Aug 2021 00:51:29 -0700 (PDT) Received: from www.outflux.net (smtp.outflux.net. [198.145.64.163]) by smtp.gmail.com with ESMTPSA id a190sm10053648pfa.49.2021.08.22.00.51.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 22 Aug 2021 00:51:27 -0700 (PDT) From: Kees Cook To: linux-kernel@vger.kernel.org Cc: Kees Cook , Keith Packard , "Gustavo A . R . Silva" , Rasmus Villemoes , Dan Williams , Daniel Vetter , Daniel Micay , Francis Laniel , Bart Van Assche , David Gow , linux-mm@kvack.org, clang-built-linux@googlegroups.com, linux-hardening@vger.kernel.org Subject: [PATCH for-next 04/25] stddef: Introduce struct_group() helper macro Date: Sun, 22 Aug 2021 00:51:01 -0700 Message-Id: <20210822075122.864511-5-keescook@chromium.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210822075122.864511-1-keescook@chromium.org> References: <20210822075122.864511-1-keescook@chromium.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=9843; h=from:subject; bh=u5fX+UqxhdcioJkWviuiAc++ub9cxwZ1QofYrPUghAU=; b=owEBbQKS/ZANAwAKAYly9N/cbcAmAcsmYgBhIgH1sowy/yBp3PMXTkKsKq1zkOkKRGQ55eXH3/HT KSxoXF+JAjMEAAEKAB0WIQSlw/aPIp3WD3I+bhOJcvTf3G3AJgUCYSIB9QAKCRCJcvTf3G3AJshxEA CuVHRbQoEYAR/xN1/Thv8kPAvZyr5+zcAeXlzgJU52GJWkOw9CZd5y2Mna9TAZuYWbMHmOYwdty5gW Ne6tJU4xflqK4syBvoAWw7A7e4elXNXKI4xVYSOha5hUV22qUO8oo/jyFtWjOSv4ESvsMOjPX54z8E MxiJ6jlBr8xV+T4sRp0m+kIGjT/1My/B5TkOXxgunLwN0Ygwz3+pfsuUY3y2b+YmXw8FtDRzxi7pke dS/3wcN4C1iBJssBhqExyrNbURFRqJAG8bewokSkRiIRwalR/4VAapEH1lvqpenLk0iivEfw99oDE6 Kp+x24HKjPGzhpKeaG5eeHTogQVxMzzrjOqkdtK/Bc4AaHYuWci4KJSzpyZ/IC7h/kAxS6F3jpX6Ot Ul7ZSEM+xaRbisMzCO63UXZVvWJ2dKaCOB2qo23D2pyMAa4/QrlXWw9mSVTxje05EmuT/7avuMvT0d Ydi10515xpc4x9+pW6PHDN4JHN0V5Cms7vPXJjdvr+GbxZ6PEOR10vH8o65HqtZ7ffCCPIH6kMI5a8 4ECSsf3mR7bXhbWlCYiYrsJlvLS/0RORQ90wOF+AyxNxRX/bZV4kjlYRIOHwmV2Nkwu9GVk8JJRvDZ UtTbi4V00YbxSf6v3G8e49vym8169FoHnBELFjnIXCITJM1EOO1FwJlZ8p3g== X-Developer-Key: i=keescook@chromium.org; a=openpgp; fpr=A5C3F68F229DD60F723E6E138972F4DFDC6DC026 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Kernel code has a regular need to describe groups of members within a structure usually when they need to be copied or initialized separately from the rest of the surrounding structure. The generally accepted design pattern in C is to use a named sub-struct: struct foo { int one; struct { int two; int three, four; } thing; int five; }; This would allow for traditional references and sizing: memcpy(&dst.thing, &src.thing, sizeof(dst.thing)); However, doing this would mean that referencing struct members enclosed by such named structs would always require including the sub-struct name in identifiers: do_something(dst.thing.three); This has tended to be quite inflexible, especially when such groupings need to be added to established code which causes huge naming churn. Three workarounds exist in the kernel for this problem, and each have other negative properties. To avoid the naming churn, there is a design pattern of adding macro aliases for the named struct: #define f_three thing.three This ends up polluting the global namespace, and makes it difficult to search for identifiers. Another common work-around in kernel code avoids the pollution by avoiding the named struct entirely, instead identifying the group's boundaries using either a pair of empty anonymous structs of a pair of zero-element arrays: struct foo { int one; struct { } start; int two; int three, four; struct { } finish; int five; }; struct foo { int one; int start[0]; int two; int three, four; int finish[0]; int five; }; This allows code to avoid needing to use a sub-struct named for member references within the surrounding structure, but loses the benefits of being able to actually use such a struct, making it rather fragile. Using these requires open-coded calculation of sizes and offsets. The efforts made to avoid common mistakes include lots of comments, or adding various BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason about the boundaries (e.g. the "start" object looks like it's 0 bytes in length), making bounds checking depend on open-coded calculations: if (length > offsetof(struct foo, finish) - offsetof(struct foo, start)) return -EINVAL; memcpy(&dst.start, &src.start, offsetof(struct foo, finish) - offsetof(struct foo, start)); However, the vast majority of places in the kernel that operate on groups of members do so without any identification of the grouping, relying either on comments or implicit knowledge of the struct contents, which is even harder for the compiler to reason about, and results in even more fragile manual sizing, usually depending on member locations outside of the region (e.g. to copy "two" and "three", use the start of "four" to find the size): BUILD_BUG_ON((offsetof(struct foo, four) < offsetof(struct foo, two)) || (offsetof(struct foo, four) < offsetof(struct foo, three)); if (length > offsetof(struct foo, four) - offsetof(struct foo, two)) return -EINVAL; memcpy(&dst.two, &src.two, length); In order to have a regular programmatic way to describe a struct region that can be used for references and sizing, can be examined for bounds checking, avoids forcing the use of intermediate identifiers, and avoids polluting the global namespace, introduce the struct_group() macro. This macro wraps the member declarations to create an anonymous union of an anonymous struct (no intermediate name) and a named struct (for references and sizing): struct foo { int one; struct_group(thing, int two; int three, four; ); int five; }; if (length > sizeof(src.thing)) return -EINVAL; memcpy(&dst.thing, &src.thing, length); do_something(dst.three); There are some rare cases where the resulting struct_group() needs attributes added, so struct_group_attr() is also introduced to allow for specifying struct attributes (e.g. __align(x) or __packed). Additionally, there are places where such declarations would like to have the struct be tagged, so struct_group_tagged() is added. Given there is a need for a handful of UAPI uses too, the underlying __struct_group() macro has been defined in UAPI so it can be used there too. To avoid confusing scripts/kernel-doc, hide the macro from its struct parsing. Co-developed-by: Keith Packard Signed-off-by: Keith Packard Acked-by: Gustavo A. R. Silva Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor Enhanced-by: Rasmus Villemoes Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk Enhanced-by: Dan Williams Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com Enhanced-by: Daniel Vetter Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local Acked-by: Dan Williams Signed-off-by: Kees Cook --- include/linux/stddef.h | 48 +++++++++++++++++++++++++++++++++++++ include/uapi/linux/stddef.h | 21 ++++++++++++++++ scripts/kernel-doc | 7 ++++++ 3 files changed, 76 insertions(+) diff --git a/include/linux/stddef.h b/include/linux/stddef.h index 8553b33143d1..8b103a53b000 100644 --- a/include/linux/stddef.h +++ b/include/linux/stddef.h @@ -36,4 +36,52 @@ enum { #define offsetofend(TYPE, MEMBER) \ (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER)) +/** + * struct_group() - Wrap a set of declarations in a mirrored struct + * + * @NAME: The identifier name of the mirrored sub-struct + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. + */ +#define struct_group(NAME, MEMBERS...) \ + __struct_group(/* no tag */, NAME, /* no attrs */, MEMBERS) + +/** + * struct_group_attr() - Create a struct_group() with trailing attributes + * + * @NAME: The identifier name of the mirrored sub-struct + * @ATTRS: Any struct attributes to apply + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. Includes structure attributes argument. + */ +#define struct_group_attr(NAME, ATTRS, MEMBERS...) \ + __struct_group(/* no tag */, NAME, ATTRS, MEMBERS) + +/** + * struct_group_tagged() - Create a struct_group with a reusable tag + * + * @TAG: The tag name for the named sub-struct + * @NAME: The identifier name of the mirrored sub-struct + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. Includes struct tag argument for the named copy, + * so the specified layout can be reused later. + */ +#define struct_group_tagged(TAG, NAME, MEMBERS...) \ + __struct_group(TAG, NAME, /* no attrs */, MEMBERS) + #endif diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h index ee8220f8dcf5..610204f7c275 100644 --- a/include/uapi/linux/stddef.h +++ b/include/uapi/linux/stddef.h @@ -4,3 +4,24 @@ #ifndef __always_inline #define __always_inline inline #endif + +/** + * __struct_group() - Create a mirrored named and anonyomous struct + * + * @TAG: The tag name for the named sub-struct (usually empty) + * @NAME: The identifier name of the mirrored sub-struct + * @ATTRS: Any struct attributes (usually empty) + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical layout + * and size: one anonymous and one named. The former's members can be used + * normally without sub-struct naming, and the latter can be used to + * reason about the start, end, and size of the group of struct members. + * The named struct can also be explicitly tagged for layer reuse, as well + * as both having struct attributes appended. + */ +#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \ + union { \ + struct { MEMBERS } ATTRS; \ + struct TAG { MEMBERS } ATTRS NAME; \ + } diff --git a/scripts/kernel-doc b/scripts/kernel-doc index 7c4a6a507ac4..d9715efbe0b7 100755 --- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1245,6 +1245,13 @@ sub dump_struct($$) { $members =~ s/\s*CRYPTO_MINALIGN_ATTR/ /gos; $members =~ s/\s*____cacheline_aligned_in_smp/ /gos; $members =~ s/\s*____cacheline_aligned/ /gos; + # unwrap struct_group(): + # - first eat non-declaration parameters and rewrite for final match + # - then remove macro, outer parens, and trailing semicolon + $members =~ s/\bstruct_group\s*\(([^,]*,)/STRUCT_GROUP(/gos; + $members =~ s/\bstruct_group_(attr|tagged)\s*\(([^,]*,){2}/STRUCT_GROUP(/gos; + $members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos; + $members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos; my $args = qr{([^,)]+)}; # replace DECLARE_BITMAP -- 2.30.2