Date: Wed, 15 Apr 2020 18:47:26 +0200
From: Marco Elver
To: Andrew Morton
Cc: cl@linux.com, iamjoonsoo.kim@lge.com, keescook@chromium.org,
    linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org,
    rientjes@google.com, silvio.cesare@gmail.com,
    torvalds@linux-foundation.org, vnik@duasynt.com
Subject: Re: [patch 025/155] slub: relocate freelist pointer to middle of object
Message-ID: <20200415164726.GA234932@google.com>
References: <20200401210155.09e3b9742e1c6e732f5a7250@linux-foundation.org>
    <20200402040427.WyxceElzI%akpm@linux-foundation.org>
In-Reply-To: <20200402040427.WyxceElzI%akpm@linux-foundation.org>

Hello,

On Wed, 01 Apr 2020, Andrew Morton wrote:

> From: Kees Cook
> Subject: slub: relocate freelist pointer to middle of object
>
> In a recent discussion[1] with Vitaly Nikolenko and Silvio Cesare, it
> became clear that moving the freelist pointer away from the edge of
> allocations would likely improve the overall defensive posture of the
> inline freelist pointer.  My benchmarks show no meaningful change to
> performance (they seem to show it being faster), so this looks like a
> reasonable change to make.
>
> Instead of having the freelist pointer at the very beginning of an
> allocation (offset 0) or at the very end of an allocation (effectively
> offset -sizeof(void *) from the next allocation), move it away from the
> edges of the allocation and into the middle.  This provides some
> protection against small-sized neighboring overflows (or underflows), for
> which the freelist pointer is commonly the target.
> (Large or well controlled overwrites are much more likely to attack live
> object contents, instead of attempting freelist corruption.)
>
> The vaunted kernel build benchmark, across 5 runs.  Before:
>
> 	Mean: 250.05
> 	Std Dev: 1.85
>
> and after, which appears mysteriously faster:
>
> 	Mean: 247.13
> 	Std Dev: 0.76
>
> Attempts at running "sysbench --test=memory" show the change to be well in
> the noise (sysbench seems to be pretty unstable here -- it's not really
> measuring allocation).
>
> Hackbench is more allocation-heavy, and while the std dev is above the
> difference, it looks like it may manifest as an improvement as well:
>
> 20 runs of "hackbench -g 20 -l 1000", before:
>
> 	Mean: 36.322
> 	Std Dev: 0.577
>
> and after:
>
> 	Mean: 36.056
> 	Std Dev: 0.598
>
> [1] https://twitter.com/vnik5287/status/1235113523098685440
>
> Link: http://lkml.kernel.org/r/202003051624.AAAC9AECC@keescook
> Signed-off-by: Kees Cook
> Acked-by: Christoph Lameter
> Cc: Vitaly Nikolenko
> Cc: Silvio Cesare
> Cc: Christoph Lameter
> Cc: Pekka Enberg
> Cc: David Rientjes
> Cc: Joonsoo Kim
> Signed-off-by: Andrew Morton
> ---
>
>  mm/slub.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>

With kernel v5.7-rc1 I am unable to boot when using the SLUB allocator
with red zoning enabled (slub_debug=Z) and an otherwise default config.
Bisecting points to this patch; with it reverted, the kernel boots again.

Splat:

[...]
[ 0.328713] rcu: Hierarchical RCU implementation.
[ 0.329169] rcu: RCU event tracing is enabled.
[ 0.329611] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=8.
[ 0.330251] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[ 0.330984] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[ 0.332130] NR_IRQS: 4352, nr_irqs: 488, preallocated irqs: 16
[ 0.332713] general protection fault, probably for non-canonical address 0xccccccccccccccd4: 0000 [#1] SMP PTI
[ 0.333680] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc1+ #3
[ 0.334280] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
[ 0.335079] RIP: 0010:deactivate_slab.isra.0+0x5b/0x460
[ 0.335582] Code: 48 8b b4 c7 e0 00 00 00 49 8b 44 24 20 31 ff 48 85 c0 40 0f 95 c7 83 c7 0f 89 7c 24 18 48 85 d2 0f 84 a0 00 00 00 41 8b 4e 20 <48> 8b 3c 0b 48 85 ff 0f 84 8c 00 00 00 49 8b 54 24 28 48 89 04 0b
[ 0.337385] RSP: 0000:ffffffffb7e03c80 EFLAGS: 00010086
[ 0.337907] RAX: 0000000000000000 RBX: cccccccccccccccc RCX: 0000000000000008
[ 0.338688] RDX: cccccccccccccccc RSI: ffff91241c800f40 RDI: 000000000000000f
[ 0.339473] RBP: ffffffffb7e03d20 R08: ffff91241fc2d230 R09: 0000000000000000
[ 0.340256] R10: ffff91241c89c010 R11: 0000000000000000 R12: ffffcf2f20722700
[ 0.341041] R13: cccccccccccccccc R14: ffff91241c802180 R15: ffffcf2f20722700
[ 0.341833] FS: 0000000000000000(0000) GS:ffff91241fc00000(0000) knlGS:0000000000000000
[ 0.342727] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.343359] CR2: ffff911e74e01000 CR3: 000000027460a001 CR4: 00000000000606b0
[ 0.344146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.344929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.345727] Call Trace:
[ 0.345999] ? setup_object_debug.isra.0+0x1d/0x40
[ 0.346525] ? new_slab+0x195/0x340
[ 0.346909] ? init_object+0x2f/0x80
[ 0.347305] ___slab_alloc+0x526/0x570
[ 0.347717] ? kasprintf+0x4e/0x70
[ 0.348092] ? init_object+0x2f/0x80
[ 0.348488] ? string+0x42/0x50
[ 0.348834] ? kasprintf+0x4e/0x70
[ 0.349217] __kmalloc_track_caller+0x1d2/0x200
[ 0.349720] kvasprintf+0x64/0xc0
[ 0.350085] kasprintf+0x4e/0x70
[ 0.350442] ? kmem_cache_alloc_trace+0x188/0x1b0
[ 0.350962] __irq_domain_alloc_fwnode+0x8f/0xd0
[ 0.351474] arch_early_irq_init+0x16/0x90
[ 0.351923] start_kernel+0x2aa/0x4c2
[ 0.352325] secondary_startup_64+0xb6/0xc0
[ 0.352784] Modules linked in:
[ 0.353124] random: get_random_bytes called from print_oops_end_marker+0x21/0x40 with crng_init=0
[ 0.353126] ---[ end trace 186486c23e10986d ]---
[ 0.354613] RIP: 0010:deactivate_slab.isra.0+0x5b/0x460
[ 0.355186] Code: 48 8b b4 c7 e0 00 00 00 49 8b 44 24 20 31 ff 48 85 c0 40 0f 95 c7 83 c7 0f 89 7c 24 18 48 85 d2 0f 84 a0 00 00 00 41 8b 4e 20 <48> 8b 3c 0b 48 85 ff 0f 84 8c 00 00 00 49 8b 54 24 28 48 89 04 0b
[ 0.357255] RSP: 0000:ffffffffb7e03c80 EFLAGS: 00010086
[ 0.357829] RAX: 0000000000000000 RBX: cccccccccccccccc RCX: 0000000000000008
[ 0.358613] RDX: cccccccccccccccc RSI: ffff91241c800f40 RDI: 000000000000000f
[ 0.359398] RBP: ffffffffb7e03d20 R08: ffff91241fc2d230 R09: 0000000000000000
[ 0.360181] R10: ffff91241c89c010 R11: 0000000000000000 R12: ffffcf2f20722700
[ 0.360965] R13: cccccccccccccccc R14: ffff91241c802180 R15: ffffcf2f20722700
[ 0.361755] FS: 0000000000000000(0000) GS:ffff91241fc00000(0000) knlGS:0000000000000000
[ 0.362645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.363275] CR2: ffff911e74e01000 CR3: 000000027460a001 CR4: 00000000000606b0
[ 0.364060] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.364844] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.365636] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.366393] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

Can you reproduce this? Let me know if you need more information.
Thanks,
-- Marco

> --- a/mm/slub.c~slub-relocate-freelist-pointer-to-middle-of-object
> +++ a/mm/slub.c
> @@ -3581,6 +3581,13 @@ static int calculate_sizes(struct kmem_c
>  		 */
>  		s->offset = size;
>  		size += sizeof(void *);
> +	} else if (size > sizeof(void *)) {
> +		/*
> +		 * Store freelist pointer near middle of object to keep
> +		 * it away from the edges of the object to avoid small
> +		 * sized over/underflows from neighboring allocations.
> +		 */
> +		s->offset = ALIGN(size / 2, sizeof(void *));
>  	}
>
>  #ifdef CONFIG_SLUB_DEBUG
> _