From: Dmitry Vyukov
Date: Fri, 11 Sep 2020 09:14:10 +0200
Subject: Re: [PATCH RFC 09/10] kfence, Documentation: add KFENCE documentation
To: Marco Elver
Cc: Mark Rutland, "open list:DOCUMENTATION", Peter Zijlstra, Catalin Marinas,
 Dave Hansen, Linux-MM, Eric Dumazet, Alexander Potapenko, "H. Peter Anvin",
 Christoph Lameter, Will Deacon, Jonathan Corbet, the arch/x86 maintainers,
 kasan-dev, Ingo Molnar, David Rientjes, Andrey Ryabinin, Kees Cook,
 "Paul E. McKenney", Jann Horn, Andrey Konovalov, Borislav Petkov,
 Andy Lutomirski, Thomas Gleixner, Andrew Morton, Linux ARM,
 Greg Kroah-Hartman, LKML, Pekka Enberg, Qian Cai, Joonsoo Kim
References: <20200907134055.2878499-1-elver@google.com>
 <20200907134055.2878499-10-elver@google.com>
In-Reply-To: <20200907134055.2878499-10-elver@google.com>

On Mon, Sep 7, 2020 at 3:41 PM Marco Elver wrote:
>
> Add KFENCE documentation in dev-tools/kfence.rst, and add to index.
>
> Co-developed-by: Alexander Potapenko
> Signed-off-by: Alexander Potapenko
> Signed-off-by: Marco Elver
> ---
>  Documentation/dev-tools/index.rst  |   1 +
>  Documentation/dev-tools/kfence.rst | 285 +++++++++++++++++++++++++++++
>  2 files changed, 286 insertions(+)
>  create mode 100644 Documentation/dev-tools/kfence.rst
>
> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> index f7809c7b1ba9..1b1cf4f5c9d9 100644
> --- a/Documentation/dev-tools/index.rst
> +++ b/Documentation/dev-tools/index.rst
> @@ -22,6 +22,7 @@ whole; patches welcome!
>     ubsan
>     kmemleak
>     kcsan
> +   kfence
>     gdb-kernel-debugging
>     kgdb
>     kselftest
> diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst
> new file mode 100644
> index 000000000000..254f4f089104
> --- /dev/null
> +++ b/Documentation/dev-tools/kfence.rst
> @@ -0,0 +1,285 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Kernel Electric-Fence (KFENCE)
> +==============================
> +
> +Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety
> +error detector. KFENCE detects heap out-of-bounds access, use-after-free, and
> +invalid-free errors.
> +
> +KFENCE is designed to be enabled in production kernels, and has near zero
> +performance overhead. Compared to KASAN, KFENCE trades precision for
> +performance. The main motivation behind KFENCE's design is that with enough
> +total uptime KFENCE will detect bugs in code paths not typically exercised by
> +non-production test workloads. One way to quickly achieve a large enough total
> +uptime is to deploy the tool across a large fleet of machines.
> +
> +Usage
> +-----
> +
> +To enable KFENCE, configure the kernel with::
> +
> +    CONFIG_KFENCE=y
> +
> +KFENCE provides several other configuration options to customize behaviour (see
> +the respective help text in ``lib/Kconfig.kfence`` for more info).
> +
> +Tuning performance
> +~~~~~~~~~~~~~~~~~~
> +
> +The most important parameter is KFENCE's sample interval, which can be set via
> +the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The
> +sample interval determines the frequency with which heap allocations will be
> +guarded by KFENCE. The default is configurable via the Kconfig option
> +``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0``
> +disables KFENCE.
> +
> +With the Kconfig option ``CONFIG_KFENCE_NUM_OBJECTS`` (default 255), the number
> +of available guarded objects can be controlled. Each object requires 2 pages,
> +one for the object itself and the other one used as a guard page; object pages
> +are interleaved with guard pages, and every object page is therefore surrounded
> +by two guard pages.
> +
> +The total memory dedicated to the KFENCE memory pool can be computed as::
> +
> +    ( #objects + 1 ) * 2 * PAGE_SIZE
> +
> +Using the default config, and assuming a page size of 4 KiB, results in
> +dedicating 2 MiB to the KFENCE memory pool.
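
A quick sanity check of the formula above, as a small Python sketch. The
function name `kfence_pool_bytes` and its parameters are made up for
illustration and are not part of the patch; the defaults mirror the 255
objects and 4 KiB page size mentioned in the text.

```python
def kfence_pool_bytes(num_objects: int = 255, page_size: int = 4096) -> int:
    """Pool size per the documented formula: (#objects + 1) * 2 * PAGE_SIZE."""
    if num_objects < 1 or page_size < 1:
        raise ValueError("num_objects and page_size must be positive")
    return (num_objects + 1) * 2 * page_size


if __name__ == "__main__":
    size = kfence_pool_bytes()
    # With the defaults this works out to the 2 MiB quoted in the text.
    print(f"{size} bytes = {size // (1024 * 1024)} MiB")
```

With the defaults this gives 256 * 2 * 4096 bytes, which agrees with the 2 MiB
stated in the documentation.
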
> +
> +Error reports
> +~~~~~~~~~~~~~
> +
> +A typical out-of-bounds access looks like this::
> +
> +    ==================================================================
> +    BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b
> +
> +    Out-of-bounds access at 0xffffffffb672efff (left of kfence-#17):
> +     test_out_of_bounds_read+0xa3/0x22b
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated in:
> +     __kfence_alloc+0x42d/0x4c0
> +     __kmalloc+0x133/0x200
> +     test_alloc+0xf3/0x25b
> +     test_out_of_bounds_read+0x98/0x22b
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7
> +    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> +    ==================================================================
> +
> +The header of the report provides a short summary of the function involved in
> +the access. It is followed by more detailed information about the access and
> +its origin.
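
For anyone collecting these reports across a fleet, the header line is regular
enough to parse mechanically. A rough Python sketch follows; the regular
expression is only inferred from the sample reports in this patch, and the
names `KfenceHeader` and `parse_kfence_header` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class KfenceHeader(NamedTuple):
    error_type: str   # e.g. "out-of-bounds", "use-after-free", "invalid free"
    function: str     # symbol in which the access happened
    offset: int       # offset into the function
    size: int         # size of the function


# Pattern inferred from the sample reports only; real reports may differ.
_HEADER_RE = re.compile(
    r"BUG: KFENCE: (?P<type>.+?) in (?P<func>[\w.]+)"
    r"\+(?P<off>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def parse_kfence_header(line: str) -> Optional[KfenceHeader]:
    """Return the parsed header, or None if the line is not a KFENCE header."""
    m = _HEADER_RE.search(line)
    if m is None:
        return None
    return KfenceHeader(m["type"], m["func"], int(m["off"], 16), int(m["size"], 16))


if __name__ == "__main__":
    print(parse_kfence_header(
        "BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b"))
```
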
> +
> +Use-after-free accesses are reported as::
> +
> +    ==================================================================
> +    BUG: KFENCE: use-after-free in test_use_after_free_read+0xb3/0x143
> +
> +    Use-after-free access at 0xffffffffb673dfe0:
> +     test_use_after_free_read+0xb3/0x143
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    kfence-#24 [0xffffffffb673dfe0-0xffffffffb673dfff, size=32, cache=kmalloc-32] allocated in:
> +     __kfence_alloc+0x277/0x4c0
> +     __kmalloc+0x133/0x200
> +     test_alloc+0xf3/0x25b
> +     test_use_after_free_read+0x76/0x143
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30

Please add an empty line between the stacks here, for consistency and
readability.

> +    freed in:
> +     kfence_guarded_free+0x158/0x380
> +     __kfence_free+0x38/0xc0
> +     test_use_after_free_read+0xa8/0x143
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    CPU: 4 PID: 109 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
> +    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> +    ==================================================================
> +
> +KFENCE also reports on invalid frees, such as double-frees::
> +
> +    ==================================================================
> +    BUG: KFENCE: invalid free in test_double_free+0xdc/0x171
> +
> +    Invalid free of 0xffffffffb6741000:
> +     test_double_free+0xdc/0x171
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    kfence-#26 [0xffffffffb6741000-0xffffffffb674101f, size=32, cache=kmalloc-32] allocated in:
> +     __kfence_alloc+0x42d/0x4c0
> +     __kmalloc+0x133/0x200
> +     test_alloc+0xf3/0x25b
> +     test_double_free+0x76/0x171
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +    freed in:
> +     kfence_guarded_free+0x158/0x380
> +     __kfence_free+0x38/0xc0
> +     test_double_free+0xa8/0x171
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    CPU: 4 PID: 111 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
> +    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> +    ==================================================================
> +
> +KFENCE also uses pattern-based redzones on the other side of an object's guard
> +page, to detect out-of-bounds writes on the unprotected side of the object.
> +These are reported on frees::
> +
> +    ==================================================================
> +    BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184
> +
> +    Detected corrupted memory at 0xffffffffb6797ff9 [ 0xac . . . . . . ]:
> +     test_kmalloc_aligned_oob_write+0xef/0x184
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    kfence-#69 [0xffffffffb6797fb0-0xffffffffb6797ff8, size=73, cache=kmalloc-96] allocated in:
> +     __kfence_alloc+0x277/0x4c0
> +     __kmalloc+0x133/0x200
> +     test_alloc+0xf3/0x25b
> +     test_kmalloc_aligned_oob_write+0x57/0x184
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    CPU: 4 PID: 120 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
> +    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> +    ==================================================================
> +
> +For such errors, the address where the corruption occurred as well as the
> +corrupt bytes are shown.
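
To make the redzone idea concrete, here is a toy Python model of a pattern
check at free time. It is only a sketch under stated assumptions: the pattern
byte, the choice to fill the whole page outside the object, and the names
`canary`, `fill_redzone` and `check_redzone` are all illustrative and not taken
from the patch. The offsets reuse the sample report above (a 73-byte object at
page offset 0xfb0, with a stray write one byte past its end).

```python
from typing import Optional

PAGE_SIZE = 4096
CANARY_BASE = 0xAA  # illustrative pattern byte (assumption, not from the patch)


def canary(offset: int) -> int:
    """Offset-dependent pattern byte for the redzone."""
    return CANARY_BASE ^ (offset & 0x7)


def _in_object(offset: int, obj_start: int, obj_size: int) -> bool:
    return obj_start <= offset < obj_start + obj_size


def fill_redzone(page: bytearray, obj_start: int, obj_size: int) -> None:
    """At allocation: write the pattern to every page byte outside the object."""
    for off in range(PAGE_SIZE):
        if not _in_object(off, obj_start, obj_size):
            page[off] = canary(off)


def check_redzone(page: bytearray, obj_start: int, obj_size: int) -> Optional[int]:
    """At free: return the offset of the first corrupted redzone byte, if any."""
    for off in range(PAGE_SIZE):
        if not _in_object(off, obj_start, obj_size) and page[off] != canary(off):
            return off
    return None


if __name__ == "__main__":
    page = bytearray(PAGE_SIZE)
    fill_redzone(page, 0xFB0, 73)
    page[0xFB0 + 73] = 0xAC          # out-of-bounds write just past the object
    print(hex(check_redzone(page, 0xFB0, 73)))
```

The point of the sketch is that the bad write itself does not fault; it is only
noticed when the pattern is verified, which matches "reported on frees".
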
> +
> +And finally, KFENCE may also report on invalid accesses to any protected page
> +where it was not possible to determine an associated object, e.g. if adjacent
> +object pages had not yet been allocated::
> +
> +    ==================================================================
> +    BUG: KFENCE: invalid access in test_invalid_access+0x26/0xe0
> +
> +    Invalid access at 0xffffffffb670b00a:
> +     test_invalid_access+0x26/0xe0
> +     kunit_try_run_case+0x51/0x85
> +     kunit_generic_run_threadfn_adapter+0x16/0x30
> +     kthread+0x137/0x160
> +     ret_from_fork+0x22/0x30
> +
> +    CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
> +    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> +    ==================================================================
> +
> +DebugFS interface
> +~~~~~~~~~~~~~~~~~
> +
> +Some debugging information is exposed via debugfs:
> +
> +* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics.
> +
> +* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects
> +  allocated via KFENCE, including those already freed but protected.
> +
> +Implementation Details
> +----------------------
> +
> +Guarded allocations are set up based on the sample interval. After expiration
> +of the sample interval, a guarded allocation from the KFENCE object pool is
> +returned to the main allocator (SLAB or SLUB). At this point, the timer is
> +reset, and the next allocation is set up after the expiration of the interval.
> +To "gate" a KFENCE allocation through the main allocator's fast-path without
> +overhead, KFENCE relies on static branches via the static keys infrastructure.
> +The static branch is toggled to redirect the allocation to KFENCE.
> +
> +KFENCE objects each reside on a dedicated page, at either the left or right

Do we mention anywhere explicitly that KFENCE currently only supports
allocations <= PAGE_SIZE? It may be worth mentioning.
It kind of follows from the implementation, but only implicitly. Someone may
assume KFENCE handles larger allocations, get confused, and then not be able
to figure out what is going on.

> +page boundaries selected at random. The pages to the left and right of the
> +object page are "guard pages", whose attributes are changed to a protected
> +state, and cause page faults on any attempted access. Such page faults are then
> +intercepted by KFENCE, which handles the fault gracefully by reporting an
> +out-of-bounds access. The side opposite of an object's guard page is used as a
> +pattern-based redzone, to detect out-of-bounds writes on the unprotected side of
> +the object on frees (for special alignment and size combinations, both sides of
> +the object are redzoned).
> +
> +KFENCE also uses pattern-based redzones on the other side of an object's guard
> +page, to detect out-of-bounds writes on the unprotected side of the object;
> +these are reported on frees.
> +
> +The following figure illustrates the page layout::
> +
> +    ---+-----------+-----------+-----------+-----------+-----------+---
> +       | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
> +       | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
> +       | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
> +       | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
> +       | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
> +       | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
> +    ---+-----------+-----------+-----------+-----------+-----------+---
> +
> +Upon deallocation of a KFENCE object, the object's page is again protected and
> +the object is marked as freed. Any further access to the object causes a fault
> +and KFENCE reports a use-after-free access. Freed objects are inserted at the
> +tail of KFENCE's freelist, so that the least recently freed objects are reused
> +first, and the chances of detecting use-after-frees of recently freed objects
> +are increased.
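
The freelist ordering described above can be illustrated with a toy Python
model. This is a sketch only: the class `GuardedPool`, its methods, and the
"ok"/"fault" return values are hypothetical and simply model "free appends to
the tail, allocation takes from the head".

```python
from collections import deque
from typing import Optional


class GuardedPool:
    """Toy model of a FIFO freelist of guarded object slots."""

    def __init__(self, num_objects: int) -> None:
        self.freelist = deque(range(num_objects))  # head = next slot to reuse
        self.live = set()                          # slots currently allocated

    def alloc(self) -> Optional[int]:
        """Take the least recently freed slot, or None if the pool is empty."""
        if not self.freelist:
            return None
        slot = self.freelist.popleft()
        self.live.add(slot)
        return slot

    def free(self, slot: int) -> None:
        """Mark the slot as freed (page protected) and queue it at the tail."""
        self.live.remove(slot)
        self.freelist.append(slot)

    def access(self, slot: int) -> str:
        """An access to a slot that is not live would fault and be reported."""
        return "ok" if slot in self.live else "fault"


if __name__ == "__main__":
    pool = GuardedPool(3)
    a = pool.alloc()
    pool.free(a)
    b = pool.alloc()      # a different slot: the freed one waits at the tail
    print(a, b, pool.access(a))
```

Because the freed slot goes to the tail, it stays protected until every other
free slot has been handed out, which is what widens the use-after-free
detection window.
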
> +
> +Interface
> +---------
> +
> +The following describes the functions which are used by allocators as well as
> +page handling code to set up and deal with KFENCE allocations.
> +
> +.. kernel-doc:: include/linux/kfence.h
> +   :functions: is_kfence_address
> +               kfence_shutdown_cache
> +               kfence_alloc kfence_free
> +               kfence_ksize kfence_object_start
> +               kfence_handle_page_fault
> +
> +Related Tools
> +-------------
> +
> +In userspace, a similar approach is taken by `GWP-ASan
> +`_. GWP-ASan also relies on guard pages and
> +a sampling strategy to detect memory unsafety bugs at scale. KFENCE's design is
> +directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another
> +similar but non-sampling approach, that also inspired the name "KFENCE", can be
> +found in the userspace `Electric Fence Malloc Debugger
> +`_.
> +
> +In the kernel, several tools exist to debug memory access errors, and in
> +particular KASAN can detect all bug classes that KFENCE can detect. While KASAN
> +is more precise, relying on compiler instrumentation, this comes at a
> +performance cost. We want to highlight that KASAN and KFENCE are complementary,
> +with different target environments. For instance, KASAN is the better
> +debugging-aid, where a simple reproducer exists: due to the lower chance to
> +detect the error, it would require more effort using KFENCE to debug.
> +Deployments at scale, however, would benefit from using KFENCE to discover bugs
> +due to code paths not exercised by test cases or fuzzers.
> --
> 2.28.0.526.ge36021eeef-goog