From: Peter Collingbourne
Date: Fri, 22 Apr 2022 11:03:07 -0700
Subject: Re: [PATCH] mm: make minimum slab alignment a runtime property
To: Vlastimil Babka
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Andrey Konovalov, Andrew Morton, Linux ARM, Linux Memory Management List, Linux Kernel Mailing List, Pekka Enberg, cl@linux.org, roman.gushchin@linux.dev, Joonsoo Kim, David Rientjes, Catalin Marinas, Herbert Xu, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, Eric Biederman, Kees Cook, Linus Torvalds
References: <20220421031738.3168157-1-pcc@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 22, 2022 at 9:19 AM Vlastimil Babka wrote:
>
> On 4/22/22 14:39, Hyeonggon Yoo wrote:
> > On Thu, Apr 21, 2022 at 10:16:25AM -0700, Peter Collingbourne wrote:
> >> On Thu, Apr 21, 2022 at 5:30 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >> >
> >> > On Wed, Apr 20, 2022 at 08:17:38PM -0700, Peter Collingbourne wrote:
> >> > > When CONFIG_KASAN_HW_TAGS is enabled we currently increase the minimum
> >> > > slab alignment to 16. This happens even if MTE is not supported in
> >> > > hardware or disabled via kasan=off, which creates an unnecessary
> >> > > memory overhead in those cases. Eliminate this overhead by making
> >> > > the minimum slab alignment a runtime property and only aligning to
> >> > > 16 if KASAN is enabled at runtime.
> >> > >
> >> > > On a DragonBoard 845c (non-MTE hardware) with a kernel built with
> >> > > CONFIG_KASAN_HW_TAGS, waiting for quiescence after a full Android
> >> > > boot I see the following Slab measurements in /proc/meminfo (median
> >> > > of 3 reboots):
> >> > >
> >> > > Before: 169020 kB
> >> > > After: 167304 kB
> >> > >
> >> > > Link: https://linux-review.googlesource.com/id/I752e725179b43b144153f4b6f584ceb646473ead
> >> > > Signed-off-by: Peter Collingbourne
> >> > > ---
> >> > >  arch/arc/include/asm/cache.h        |  4 ++--
> >> > >  arch/arm/include/asm/cache.h        |  2 +-
> >> > >  arch/arm64/include/asm/cache.h      | 19 +++++++++++++------
> >> > >  arch/microblaze/include/asm/page.h  |  2 +-
> >> > >  arch/riscv/include/asm/cache.h      |  2 +-
> >> > >  arch/sparc/include/asm/cache.h      |  2 +-
> >> > >  arch/xtensa/include/asm/processor.h |  2 +-
> >> > >  fs/binfmt_flat.c                    |  9 ++++++---
> >> > >  include/crypto/hash.h               |  2 +-
> >> > >  include/linux/slab.h                | 22 +++++++++++++++++-----
> >> > >  mm/slab.c                           |  7 +++----
> >> > >  mm/slab_common.c                    |  3 +--
> >> > >  mm/slob.c                           |  6 +++---
> >> > >  13 files changed, 51 insertions(+), 31 deletions(-)
> >> >
> >> > [+Cc slab people, Catalin and affected subsystems' folks]
> >> >
> >> > Just FYI, there is a similar discussion about kmalloc caches' alignment.
> >> > https://lore.kernel.org/linux-mm/20220405135758.774016-1-catalin.marinas@arm.com/
> >> >
> >> > It seems this is another demand for runtime resolution of slab
> >> > alignment, but slightly different from kmalloc as there is no requirement
> >> > for DMA alignment.
> >> >
> >> > >
> >> > > diff --git a/arch/arc/include/asm/cache.h b/arch/arc/include/asm/cache.h
> >> > > index f0f1fc5d62b6..b6a7763fd5d6 100644
> >> > > --- a/arch/arc/include/asm/cache.h
> >> > > +++ b/arch/arc/include/asm/cache.h
> >> > > @@ -55,11 +55,11 @@
> >> > >   * Make sure slab-allocated buffers are 64-bit aligned when atomic64_t uses
> >> > >   * ARCv2 64-bit atomics (LLOCKD/SCONDD). This guarantess runtime 64-bit
> >> > >   * alignment for any atomic64_t embedded in buffer.
> >> > > - * Default ARCH_SLAB_MINALIGN is __alignof__(long long) which has a relaxed
> >> > > + * Default ARCH_SLAB_MIN_MINALIGN is __alignof__(long long) which has a relaxed
> >> > >   * value of 4 (and not 8) in ARC ABI.
> >> > >   */
> >> > >  #if defined(CONFIG_ARC_HAS_LL64) && defined(CONFIG_ARC_HAS_LLSC)
> >> > > -#define ARCH_SLAB_MINALIGN 8
> >> > > +#define ARCH_SLAB_MIN_MINALIGN 8
> >> > >  #endif
> >> > >
> >> >
> >> > Why isn't it just ARCH_SLAB_MINALIGN?
> >>
> >> Because this is the minimum possible value of the minimum alignment
> >> decided at runtime. I chose to give it a different name to
> >> arch_slab_minalign() because the two have different meanings.
> >>
> >> Granted this isn't a great name because of the stuttering but
> >> hopefully it will prompt folks to investigate the meaning of this
> >> constant if necessary.
> >
> > To be honest I don't care much about the name but just thought it's just better
> > to be consistent with Catalin's series: ARCH_KMALLOC_MINALIGN for static
> > alignment and arch_kmalloc_minalign() for (possibly bigger) alignment decided
> > at runtime.
>
> Agree it should be consistent, one way or another. I would (not overly
> strongly) prefer Catalin's approach as it's less churn. The name
> ARCH_SLAB_MINALIGN is not wrong as the actual alignment can be only bigger
> than that (or equal).
> Realistically it seems only slab internals are going to use
> arch_kmalloc_minalign(), so there shouldn't be too much need of "prompt
> folks to investigate".

No strong opinion, so I'll change it back to ARCH_SLAB_MINALIGN then.

> >> > >  extern int ioc_enable;
> >> > > diff --git a/arch/arm/include/asm/cache.h b/arch/arm/include/asm/cache.h
> >> > > index e3ea34558ada..3e1018bb9805 100644
> >> > > --- a/arch/arm/include/asm/cache.h
> >> > > +++ b/arch/arm/include/asm/cache.h
> >> > > @@ -21,7 +21,7 @@
> >> > >   * With EABI on ARMv5 and above we must have 64-bit aligned slab pointers.
> >> > >   */
> >> > >  #if defined(CONFIG_AEABI) && (__LINUX_ARM_ARCH__ >= 5)
> >> > > -#define ARCH_SLAB_MINALIGN 8
> >> > > +#define ARCH_SLAB_MIN_MINALIGN 8
> >> > >  #endif
> >> > >
> >> > >  #define __read_mostly __section(".data..read_mostly")
> >> > > diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
> >> > > index a074459f8f2f..38f171591c3f 100644
> >> > > --- a/arch/arm64/include/asm/cache.h
> >> > > +++ b/arch/arm64/include/asm/cache.h
> >> > > @@ -6,6 +6,7 @@
> >> > >  #define __ASM_CACHE_H
> >> > >
> >> > >  #include
> >> > > +#include
> >> > >
> >> > >  #define CTR_L1IP_SHIFT 14
> >> > >  #define CTR_L1IP_MASK 3
> >> > > @@ -49,15 +50,21 @@
> >> > >   */
> >> > >  #define ARCH_DMA_MINALIGN (128)
> >> > >
> >> > > -#ifdef CONFIG_KASAN_SW_TAGS
> >> > > -#define ARCH_SLAB_MINALIGN (1ULL << KASAN_SHADOW_SCALE_SHIFT)
> >> > > -#elif defined(CONFIG_KASAN_HW_TAGS)
> >> > > -#define ARCH_SLAB_MINALIGN MTE_GRANULE_SIZE
> >> > > -#endif
> >> > > -
> >> > >  #ifndef __ASSEMBLY__
> >> > >
> >> > >  #include
> >> > > +#include
> >> > > +
> >> > > +#ifdef CONFIG_KASAN_SW_TAGS
> >> > > +#define ARCH_SLAB_MIN_MINALIGN (1ULL << KASAN_SHADOW_SCALE_SHIFT)
> >> > > +#elif defined(CONFIG_KASAN_HW_TAGS)
> >> > > +static inline size_t arch_slab_minalign(void)
> >> > > +{
> >> > > +	return kasan_hw_tags_enabled() ? MTE_GRANULE_SIZE :
> >> > > +		__alignof__(unsigned long long);
> >> > > +}
> >> > > +#define arch_slab_minalign() arch_slab_minalign()
> >> > > +#endif
> >> > >
> >> >
> >> > kasan_hw_tags_enabled() is also false when kasan is just not initialized yet.
> >> > What about writing a new helper something like kasan_is_disabled()
> >> > instead?
> >>
> >> The decision of whether to enable KASAN is made early, before the slab
> >> allocator is initialized (start_kernel -> smp_prepare_boot_cpu ->
> >> kasan_init_hw_tags vs start_kernel -> mm_init -> kmem_cache_init). If
> >> you think about it, this needs to be the case for KASAN to operate
> >> correctly because it influences the behavior of the slab allocator via
> >> the kasan_*poison* hooks. So I don't think we can end up calling this
> >> function before then.
> >
> > Sounds not bad. I wanted to make sure the value of arch_slab_minalign()
> > is not changed during its execution.
> >
> > Just some part of me thought something like this would be more
> > intuitive/robust.
> >
> > if (system_supports_mte() && kasan_arg != KASAN_ARG_OFF)
> > 	return MTE_GRANULE_SIZE;
> > else
> > 	return __alignof__(unsigned long long);
>
> Let's see if kasan or arm folks have an opinion here.
>
> >
> >> > >  #define ICACHEF_ALIASING 0
> >> > >  #define ICACHEF_VPIPT 1
> >> > > diff --git a/arch/microblaze/include/asm/page.h b/arch/microblaze/include/asm/page.h
> >> > > index 4b8b2fa78fc5..ccdbc1da3c3e 100644
> >> > > --- a/arch/microblaze/include/asm/page.h
> >> > > +++ b/arch/microblaze/include/asm/page.h
> >> > > @@ -33,7 +33,7 @@
> >> > >  /* MS be sure that SLAB allocates aligned objects */
> >> > >  #define ARCH_DMA_MINALIGN L1_CACHE_BYTES
> >> > >
> >> > > -#define ARCH_SLAB_MINALIGN L1_CACHE_BYTES
> >> > > +#define ARCH_SLAB_MIN_MINALIGN L1_CACHE_BYTES
> >> > >
> >> > >  /*
> >> > >   * PAGE_OFFSET -- the first address of the first page of memory. With MMU
> >> > > diff --git a/arch/riscv/include/asm/cache.h b/arch/riscv/include/asm/cache.h
> >> > > index 9b58b104559e..7beb3b5d27c7 100644
> >> > > --- a/arch/riscv/include/asm/cache.h
> >> > > +++ b/arch/riscv/include/asm/cache.h
> >> > > @@ -16,7 +16,7 @@
> >> > >   * the flat loader aligns it accordingly.
> >> > >   */
> >> > >  #ifndef CONFIG_MMU
> >> > > -#define ARCH_SLAB_MINALIGN 16
> >> > > +#define ARCH_SLAB_MIN_MINALIGN 16
> >> > >  #endif
> >> > >
> >> > >  #endif /* _ASM_RISCV_CACHE_H */
> >> > > diff --git a/arch/sparc/include/asm/cache.h b/arch/sparc/include/asm/cache.h
> >> > > index e62fd0e72606..9d8cb4687b7e 100644
> >> > > --- a/arch/sparc/include/asm/cache.h
> >> > > +++ b/arch/sparc/include/asm/cache.h
> >> > > @@ -8,7 +8,7 @@
> >> > >  #ifndef _SPARC_CACHE_H
> >> > >  #define _SPARC_CACHE_H
> >> > >
> >> > > -#define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
> >> > > +#define ARCH_SLAB_MIN_MINALIGN __alignof__(unsigned long long)
> >> > >
> >> > >  #define L1_CACHE_SHIFT 5
> >> > >  #define L1_CACHE_BYTES 32
> >> > > diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
> >> > > index 4489a27d527a..e3ea278e3fcf 100644
> >> > > --- a/arch/xtensa/include/asm/processor.h
> >> > > +++ b/arch/xtensa/include/asm/processor.h
> >> > > @@ -18,7 +18,7 @@
> >> > >  #include
> >> > >  #include
> >> > >
> >> > > -#define ARCH_SLAB_MINALIGN XTENSA_STACK_ALIGNMENT
> >> > > +#define ARCH_SLAB_MIN_MINALIGN XTENSA_STACK_ALIGNMENT
> >> > >
> >> > >  /*
> >> > >   * User space process size: 1 GB.
> >> > > diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
> >> > > index 626898150011..8ff1bf7d1e87 100644
> >> > > --- a/fs/binfmt_flat.c
> >> > > +++ b/fs/binfmt_flat.c
> >> > > @@ -64,7 +64,10 @@
> >> > >   * Here we can be a bit looser than the data sections since this
> >> > >   * needs to only meet arch ABI requirements.
> >> > >   */
> >> > > -#define FLAT_STACK_ALIGN max_t(unsigned long, sizeof(void *), ARCH_SLAB_MINALIGN)
> >> > > +static size_t flat_stack_align(void)
> >> > > +{
> >> > > +	return max_t(unsigned long, sizeof(void *), arch_slab_minalign());
> >> > > +}
>
> I think this might not be necessary at all. There doesn't seem to be actual
> connection to the slab+kasan constraints here. My brief digging into git
> blame suggests they just used the ARCH_SLAB_MINALIGN constant because it
> existed, e.g. commit 2952095c6b2ee includes in changelog "Arguably, this is
> kind of hokey that the FLAT is semi-abusing defines it shouldn't."
> So, there shouldn't be a reason to increase this due to KASAN/MTE granule
> size, it was done unnecessarily as a side-effect before (AFAIU it shouldn't
> have caused existing userspace binaries to break, but maybe in some corner
> case it could?), and if this patch leaves out the binfmt_flat changes, the
> alignment will be (IMHO correctly) decreased again.

Okay, I'll revert this part.
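
For readers following the binfmt_flat hunks quoted in this message: they rely on two power-of-two alignment idioms, rounding down with `x & -align` and rounding up in the style of the kernel's ALIGN() macro. The following is a minimal userspace sketch of both, not kernel code; demo_align_down() and demo_align_up() are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: round x down to a multiple of align (power of two). */
static unsigned long demo_align_down(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	/* In unsigned arithmetic, -align equals ~(align - 1). */
	return x & -align;
}

/* Hypothetical helper: round x up to a multiple of align (power of two). */
static unsigned long demo_align_up(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & -align;
}
```

Both helpers only give meaningful results for power-of-two alignments, which is why a stack alignment derived from a larger slab minimum (16 instead of 8) moves the computed stack pointer and stack length even though the flat loader itself has no MTE requirement.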
> >> > >
> >> > >  #define RELOC_FAILED 0xff00ff01 /* Relocation incorrect somewhere */
> >> > >  #define UNLOADED_LIB 0x7ff000ff /* Placeholder for unused library */
> >> > > @@ -148,7 +151,7 @@ static int create_flat_tables(struct linux_binprm *bprm, unsigned long arg_start
> >> > >  	sp -= 2; /* argvp + envp */
> >> > >  	sp -= 1; /* &argc */
> >> > >
> >> > > -	current->mm->start_stack = (unsigned long)sp & -FLAT_STACK_ALIGN;
> >> > > +	current->mm->start_stack = (unsigned long)sp & -flat_stack_align();
> >> > >  	sp = (unsigned long __user *)current->mm->start_stack;
> >> > >
> >> > >  	if (put_user(bprm->argc, sp++))
> >> > > @@ -966,7 +969,7 @@ static int load_flat_binary(struct linux_binprm *bprm)
> >> > >  #endif
> >> > >  	stack_len += (bprm->argc + 1) * sizeof(char *); /* the argv array */
> >> > >  	stack_len += (bprm->envc + 1) * sizeof(char *); /* the envp array */
> >> > > -	stack_len = ALIGN(stack_len, FLAT_STACK_ALIGN);
> >> > > +	stack_len = ALIGN(stack_len, flat_stack_align());
> >> > >
> >> > >  	res = load_flat_file(bprm, &libinfo, 0, &stack_len);
> >> > >  	if (res < 0)
> >> > > diff --git a/include/crypto/hash.h b/include/crypto/hash.h
> >> > > index f140e4643949..442c290f458c 100644
> >> > > --- a/include/crypto/hash.h
> >> > > +++ b/include/crypto/hash.h
> >> > > @@ -149,7 +149,7 @@ struct ahash_alg {
> >> > >
> >> > >  struct shash_desc {
> >> > >  	struct crypto_shash *tfm;
> >> > > -	void *__ctx[] __aligned(ARCH_SLAB_MINALIGN);
> >> > > +	void *__ctx[] __aligned(ARCH_SLAB_MIN_MINALIGN);
> >> > >  };
> >> > >
> >> > >  #define HASH_MAX_DIGESTSIZE 64
> >> > > diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> > > index 373b3ef99f4e..80e517593372 100644
> >> > > --- a/include/linux/slab.h
> >> > > +++ b/include/linux/slab.h
> >> > > @@ -201,21 +201,33 @@ void kmem_dump_obj(void *object);
> >> > >  #endif
> >> > >
> >> > >  /*
> >> > > - * Setting ARCH_SLAB_MINALIGN in arch headers allows a different alignment.
> >> > > + * Setting ARCH_SLAB_MIN_MINALIGN in arch headers allows a different alignment.
> >> > >   * Intended for arches that get misalignment faults even for 64 bit integer
> >> > >   * aligned buffers.
> >> > >   */
> >> > > -#ifndef ARCH_SLAB_MINALIGN
> >> > > -#define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
> >> > > +#ifndef ARCH_SLAB_MIN_MINALIGN
> >> > > +#define ARCH_SLAB_MIN_MINALIGN __alignof__(unsigned long long)
> >> > > +#endif
> >> > > +
> >> > > +/*
> >> > > + * Arches can define this function if they want to decide the minimum slab
> >> > > + * alignment at runtime. The value returned by the function must be
> >> > > + * >= ARCH_SLAB_MIN_MINALIGN.
> >> > > + */
> >> >
> >> > Not only the value should be bigger than or equal to ARCH_SLAB_MIN_MINALIGN,
> >> > it should be compatible with ARCH_SLAB_MIN_MINALIGN.
> >>
> >> What's the difference?
> >>
> >
> > /*
> >  * kmalloc and friends return ARCH_KMALLOC_MINALIGN aligned
> >  * pointers. kmem_cache_alloc and friends return ARCH_SLAB_MIN_MINALIGN
> >  * aligned pointers.
> >  */
> > #define __assume_kmalloc_alignment __assume_aligned(ARCH_KMALLOC_MINALIGN)
> > #define __assume_slab_alignment __assume_aligned(ARCH_SLAB_MIN_MINALIGN)
> > #define __assume_page_alignment __assume_aligned(PAGE_SIZE)
> >
> > I mean actual slab object size should be both ARCH_SLAB_MIN_MINALIGN-aligned and
> > arch_slab_minalign()-aligned. Otherwise we are lying to the compiler.
> >
> > It's okay if we use just power-of-two alignment.
> > But adding a comment wouldn't harm :)
>
> Agreed, technically it's not ">=ARCH_SLAB_MIN_MINALIGN", but "a least common
> multiple of ARCH_SLAB_MIN_MINALIGN and whatever the other alignment
> requirements arch_slab_minalign() wants to guarantee". But AFAIK in practice
> these constraints are always power-of-two.
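
The "compatible" requirement discussed above can be made concrete with a small standalone sketch. It assumes that "compatible" means every address aligned to the runtime value is also aligned to the static minimum, i.e. the runtime value is a multiple of the static one; for powers of two that reduces to a plain >= comparison. All demo_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a non-zero power of two. */
static bool demo_is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical helper: true if addr is a multiple of a power-of-two align. */
static bool demo_is_aligned(uintptr_t addr, size_t align)
{
	return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical check: a runtime alignment is acceptable if both values are
 * powers of two and the runtime one is not smaller. Under that condition
 * the runtime value is automatically a multiple of the static minimum.
 */
static bool demo_minalign_compatible(size_t runtime_align, size_t static_min)
{
	return demo_is_pow2(runtime_align) && demo_is_pow2(static_min) &&
	       runtime_align >= static_min;
}
```

A value such as 12 is larger than 8 but is not a multiple of it, so an object placed at address 12 would break an 8-byte alignment promise made to the compiler; restricting the hook to powers of two rules that case out.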

I think it's pretty much assumed that alignments are a power of two, so
from that viewpoint it's enough to say that it must be
>=ARCH_SLAB_MIN_MINALIGN. I guess I'll change the comment to say that
it must return a power of two since there's no reason not to.

Peter
> > 235 */ > > 236 #define __assume_kmalloc_alignment __assume_aligned(ARCH_KMALLOC_MINALIGN) > > 237 #define __assume_slab_alignment __assume_aligned(ARCH_SLAB_MIN_MINALIGN) > > 238 #define __assume_page_alignment __assume_aligned(PAGE_SIZE) > > > > I mean actual slab object size should be both ARCH_SLAB_MIN_MINALIGN-aligned and > > arch_slab_minalign()-aligned. Otherwise we are lying to the compiler. > > > > It's okay If we use just power-of-two alignment. > > But adding a comment wouldn't harm :) > > Agreed, technically it's not ">=ARCH_SLAB_MIN_MINALIGN", but "a least common > multiple of ARCH_SLAB_MIN_MINALIGN and whatever the other alignment > requirements arch_slab_minalign() wants to guarantee". But AFAIK in practice > these constraints are always power-of-two. I think it's pretty much assumed that alignments are a power of two, so from that viewpoint it's enough to say that it must be >=ARCH_SLAB_MIN_MINALIGN. I guess I'll change the comment to say that it must return a power of two since there's no reason not to. Peter _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
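
For reference, a minimal user-space sketch of the power-of-two point
discussed above. This is not kernel code, and the helper names (is_pow2,
gcd_sz, lcm_sz, combined_align) are hypothetical, chosen only for
illustration. It shows that the "least common multiple" of a static minimum
and a runtime requirement is just the larger value when both are powers of
two, so "at least ARCH_SLAB_MIN_MINALIGN" and "compatible with
ARCH_SLAB_MIN_MINALIGN" mean the same thing in that case:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if x is a nonzero power of two. */
static bool is_pow2(size_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: greatest common divisor (Euclid's algorithm). */
static size_t gcd_sz(size_t a, size_t b)
{
	while (b != 0) {
		size_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Hypothetical helper: least common multiple of two nonzero values. */
static size_t lcm_sz(size_t a, size_t b)
{
	return a / gcd_sz(a, b) * b;
}

/*
 * Smallest alignment that satisfies both a static minimum and a runtime
 * requirement. When both inputs are powers of two the result is simply
 * the larger of the two.
 */
static size_t combined_align(size_t static_min, size_t runtime_req)
{
	assert(static_min != 0 && runtime_req != 0);
	return lcm_sz(static_min, runtime_req);
}
```

With these helpers, combined_align(8, 16) is 16, the larger input. A
non-power-of-two requirement such as 12 against a minimum of 8 would give
24, which is larger than either input; that is the case the "must return a
power of two" wording rules out.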