From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 7 Sep 2020 15:40:54 +0200
In-Reply-To: <20200907134055.2878499-1-elver@google.com>
Message-Id: <20200907134055.2878499-10-elver@google.com>
Mime-Version: 1.0
References: <20200907134055.2878499-1-elver@google.com>
X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog
Subject: [PATCH RFC 09/10] kfence, Documentation: add KFENCE documentation
From: Marco Elver
To: elver@google.com, glider@google.com, akpm@linux-foundation.org,
    catalin.marinas@arm.com, cl@linux.com, rientjes@google.com,
    iamjoonsoo.kim@lge.com, mark.rutland@arm.com, penberg@kernel.org
Cc: hpa@zytor.com, paulmck@kernel.org, andreyknvl@google.com,
    aryabinin@virtuozzo.com, luto@kernel.org, bp@alien8.de,
    dave.hansen@linux.intel.com, dvyukov@google.com, edumazet@google.com,
    gregkh@linuxfoundation.org, mingo@redhat.com, jannh@google.com,
    corbet@lwn.net, keescook@chromium.org, peterz@infradead.org, cai@lca.pw,
    tglx@linutronix.de, will@kernel.org, x86@kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    kasan-dev@googlegroups.com, linux-arm-kernel@lists.infradead.org,
    linux-mm@kvack.org
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Add KFENCE documentation in dev-tools/kfence.rst, and add to index.

Co-developed-by: Alexander Potapenko
Signed-off-by: Alexander Potapenko
Signed-off-by: Marco Elver
---
 Documentation/dev-tools/index.rst  |   1 +
 Documentation/dev-tools/kfence.rst | 285 +++++++++++++++++++++++++++++
 2 files changed, 286 insertions(+)
 create mode 100644 Documentation/dev-tools/kfence.rst

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..1b1cf4f5c9d9 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -22,6 +22,7 @@ whole; patches welcome!
    ubsan
    kmemleak
    kcsan
+   kfence
    gdb-kernel-debugging
    kgdb
    kselftest
diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst
new file mode 100644
index 000000000000..254f4f089104
--- /dev/null
+++ b/Documentation/dev-tools/kfence.rst
@@ -0,0 +1,285 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel Electric-Fence (KFENCE)
+==============================
+
+Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety
+error detector. KFENCE detects heap out-of-bounds access, use-after-free, and
+invalid-free errors.
+
+KFENCE is designed to be enabled in production kernels, and has near zero
+performance overhead. Compared to KASAN, KFENCE trades precision for
+performance. The main motivation behind KFENCE's design is that with enough
+total uptime KFENCE will detect bugs in code paths not typically exercised by
+non-production test workloads. One way to quickly achieve a large enough total
+uptime is to deploy the tool across a large fleet of machines.
+
+Usage
+-----
+
+To enable KFENCE, configure the kernel with::
+
+    CONFIG_KFENCE=y
+
+KFENCE provides several other configuration options to customize behaviour (see
+the respective help text in ``lib/Kconfig.kfence`` for more info).
+
+Tuning performance
+~~~~~~~~~~~~~~~~~~
+
+The most important parameter is KFENCE's sample interval, which can be set via
+the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The
+sample interval determines the frequency with which heap allocations will be
+guarded by KFENCE. The default is configurable via the Kconfig option
+``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0``
+disables KFENCE.
+
+With the Kconfig option ``CONFIG_KFENCE_NUM_OBJECTS`` (default 255), the number
+of available guarded objects can be controlled. Each object requires 2 pages,
+one for the object itself and the other one used as a guard page; object pages
+are interleaved with guard pages, and every object page is therefore surrounded
+by two guard pages.
+
+The total memory dedicated to the KFENCE memory pool can be computed as::
+
+    ( #objects + 1 ) * 2 * PAGE_SIZE
+
+Using the default config, and assuming a page size of 4 KiB, results in
+dedicating 2 MiB to the KFENCE memory pool.
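The pool-size arithmetic above can be checked with a short Python sketch. The
function name `kfence_pool_bytes` is hypothetical and only mirrors the formula
quoted in the text; it is not a kernel interface, and the 4 KiB page size is
the same assumption the text makes.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as in the text


def kfence_pool_bytes(num_objects: int, page_size: int = PAGE_SIZE) -> int:
    """Evaluate ( #objects + 1 ) * 2 * PAGE_SIZE from the documentation."""
    return (num_objects + 1) * 2 * page_size


if __name__ == "__main__":
    # 255 is the documented default of CONFIG_KFENCE_NUM_OBJECTS.
    size = kfence_pool_bytes(255)
    print(f"{size} bytes = {size / (1024 * 1024)} MiB")
```

With the default of 255 objects this gives 256 * 2 * 4096 bytes, which matches
the 2 MiB figure stated in the text.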
+
+Error reports
+~~~~~~~~~~~~~
+
+A typical out-of-bounds access looks like this::
+
+    ==================================================================
+    BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b
+
+    Out-of-bounds access at 0xffffffffb672efff (left of kfence-#17):
+     test_out_of_bounds_read+0xa3/0x22b
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated in:
+     __kfence_alloc+0x42d/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_out_of_bounds_read+0x98/0x22b
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+The header of the report provides a short summary of the function involved in
+the access. It is followed by more detailed information about the access and
+its origin.
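To illustrate how the access line and the object line of such a report relate,
here is a minimal Python sketch that works out how far outside the object the
access landed. The regular expressions and the helper name `oob_distance` are
assumptions modelled only on the example report in this RFC; other kernel
versions may format reports differently.

```python
import re

# The two relevant lines, copied from the example report in the text.
REPORT = """\
Out-of-bounds access at 0xffffffffb672efff (left of kfence-#17):
kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated in:
"""

# Assumed patterns, derived from the example above (not a stable format).
ACCESS_RE = re.compile(
    r"Out-of-bounds access at (0x[0-9a-f]+) \((left|right) of kfence-#(\d+)\)"
)
OBJECT_RE = re.compile(
    r"kfence-#(\d+) \[(0x[0-9a-f]+)-(0x[0-9a-f]+), size=(\d+), cache=([\w-]+)\]"
)


def oob_distance(report: str) -> dict:
    """Return the object index, side, and byte distance of an OOB access."""
    access = ACCESS_RE.search(report)
    obj = OBJECT_RE.search(report)
    if access is None or obj is None:
        raise ValueError("not a KFENCE out-of-bounds report")
    addr = int(access.group(1), 16)
    start, end = int(obj.group(2), 16), int(obj.group(3), 16)
    side = access.group(2)
    # Left of the object: distance to its first byte; right: to its last byte.
    distance = start - addr if side == "left" else addr - end
    return {
        "object": int(obj.group(1)),
        "side": side,
        "distance": distance,
        "size": int(obj.group(4)),
        "cache": obj.group(5),
    }


if __name__ == "__main__":
    print(oob_distance(REPORT))
```

For the example report, the access at 0xffffffffb672efff lies one byte to the
left of the object starting at 0xffffffffb672f000, that is, on the last byte of
the preceding guard page.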
+
+Use-after-free accesses are reported as::
+
+    ==================================================================
+    BUG: KFENCE: use-after-free in test_use_after_free_read+0xb3/0x143
+
+    Use-after-free access at 0xffffffffb673dfe0:
+     test_use_after_free_read+0xb3/0x143
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#24 [0xffffffffb673dfe0-0xffffffffb673dfff, size=32, cache=kmalloc-32] allocated in:
+     __kfence_alloc+0x277/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_use_after_free_read+0x76/0x143
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+    freed in:
+     kfence_guarded_free+0x158/0x380
+     __kfence_free+0x38/0xc0
+     test_use_after_free_read+0xa8/0x143
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 109 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+KFENCE also reports on invalid frees, such as double-frees::
+
+    ==================================================================
+    BUG: KFENCE: invalid free in test_double_free+0xdc/0x171
+
+    Invalid free of 0xffffffffb6741000:
+     test_double_free+0xdc/0x171
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#26 [0xffffffffb6741000-0xffffffffb674101f, size=32, cache=kmalloc-32] allocated in:
+     __kfence_alloc+0x42d/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_double_free+0x76/0x171
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+    freed in:
+     kfence_guarded_free+0x158/0x380
+     __kfence_free+0x38/0xc0
+     test_double_free+0xa8/0x171
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 111 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+KFENCE also uses pattern-based redzones on the other side of an object's guard
+page, to detect out-of-bounds writes on the unprotected side of the object.
+These are reported on frees::
+
+    ==================================================================
+    BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184
+
+    Detected corrupted memory at 0xffffffffb6797ff9 [ 0xac . . . . . . ]:
+     test_kmalloc_aligned_oob_write+0xef/0x184
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#69 [0xffffffffb6797fb0-0xffffffffb6797ff8, size=73, cache=kmalloc-96] allocated in:
+     __kfence_alloc+0x277/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_kmalloc_aligned_oob_write+0x57/0x184
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 120 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+For such errors, the address where the corruption occurred as well as the
+corrupted bytes are shown.
+
+Finally, KFENCE may also report on invalid accesses to any protected page
+where it was not possible to determine an associated object, e.g. if adjacent
+object pages had not yet been allocated::
+
+    ==================================================================
+    BUG: KFENCE: invalid access in test_invalid_access+0x26/0xe0
+
+    Invalid access at 0xffffffffb670b00a:
+     test_invalid_access+0x26/0xe0
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+DebugFS interface
+~~~~~~~~~~~~~~~~~
+
+Some debugging information is exposed via debugfs:
+
+* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics.
+
+* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects
+  allocated via KFENCE, including those already freed but protected.
+
+Implementation Details
+----------------------
+
+Guarded allocations are set up based on the sample interval. After expiration
+of the sample interval, a guarded allocation from the KFENCE object pool is
+returned to the main allocator (SLAB or SLUB). At this point, the timer is
+reset, and the next allocation is set up after the expiration of the interval.
+To "gate" a KFENCE allocation through the main allocator's fast-path without
+overhead, KFENCE relies on static branches via the static keys infrastructure.
+The static branch is toggled to redirect the allocation to KFENCE.
+
+KFENCE objects each reside on a dedicated page, at either the left or right
+page boundary, selected at random. The pages to the left and right of the
+object page are "guard pages", whose attributes are changed to a protected
+state, and cause page faults on any attempted access. Such page faults are then
+intercepted by KFENCE, which handles the fault gracefully by reporting an
+out-of-bounds access. The side opposite of an object's guard page is used as a
+pattern-based redzone, to detect out-of-bounds writes on the unprotected side of
+the object on frees (for special alignment and size combinations, both sides of
+the object are redzoned).
+
+KFENCE also uses pattern-based redzones on the other side of an object's guard
+page, to detect out-of-bounds writes on the unprotected side of the object;
+these are reported on frees.
+
+The following figure illustrates the page layout::
+
+    ---+-----------+-----------+-----------+-----------+-----------+---
+       | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
+       | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
+       | x GUARD x | J : RED-  | x GUARD x |  RED- : J | x GUARD x |
+       | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
+       | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
+       | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
+    ---+-----------+-----------+-----------+-----------+-----------+---
+
+Upon deallocation of a KFENCE object, the object's page is again protected and
+the object is marked as freed. Any further access to the object causes a fault
+and KFENCE reports a use-after-free access. Freed objects are inserted at the
+tail of KFENCE's freelist, so that the least recently freed objects are reused
+first, and the chances of detecting use-after-frees of recently freed objects
+are increased.
+
+Interface
+---------
+
+The following describes the functions which are used by allocators as well as
+page handling code to set up and deal with KFENCE allocations.
+
+.. kernel-doc:: include/linux/kfence.h
+   :functions: is_kfence_address
+               kfence_shutdown_cache
+               kfence_alloc kfence_free
+               kfence_ksize kfence_object_start
+               kfence_handle_page_fault
+
+Related Tools
+-------------
+
+In userspace, a similar approach is taken by `GWP-ASan
+`_. GWP-ASan also relies on guard pages and
+a sampling strategy to detect memory unsafety bugs at scale. KFENCE's design is
+directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another
+similar but non-sampling approach, that also inspired the name "KFENCE", can be
+found in the userspace `Electric Fence Malloc Debugger
+`_.
+
+In the kernel, several tools exist to debug memory access errors, and in
+particular KASAN can detect all bug classes that KFENCE can detect. While KASAN
+is more precise, relying on compiler instrumentation, this comes at a
+performance cost. We want to highlight that KASAN and KFENCE are complementary,
+with different target environments. For instance, KASAN is the better
+debugging aid when a simple reproducer exists: because KFENCE has a lower chance
+of detecting the error, debugging with it would require more effort.
+Deployments at scale, however, would benefit from using KFENCE to discover bugs
+due to code paths not exercised by test cases or fuzzers.
-- 
2.28.0.526.ge36021eeef-goog
dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 61EAB8E000A; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5D1228E0001; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 498408E000A; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0041.hostedemail.com [216.40.44.41]) by kanga.kvack.org (Postfix) with ESMTP id 2D87D8E0001 for ; Mon, 7 Sep 2020 09:41:34 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E4B3B181AC9C6 for ; Mon, 7 Sep 2020 13:41:33 +0000 (UTC) X-FDA: 77236377666.29.boats67_43159e3270cc Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin29.hostedemail.com (Postfix) with ESMTP id ACDC0180868DC for ; Mon, 7 Sep 2020 13:41:33 +0000 (UTC) X-HE-Tag: boats67_43159e3270cc X-Filterd-Recvd-Size: 17083 Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Mon, 7 Sep 2020 13:41:33 +0000 (UTC) Received: by mail-wr1-f73.google.com with SMTP id b7so5717996wrn.6 for ; Mon, 07 Sep 2020 06:41:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=VRGvWOinGw0btNVyOH340+TVwWpL0hbTJw9JR94PpB4=; b=ALL9RpQBmEPbBRqb5ELx8D38l7uZ2fXlF779RdUFbsUuGJL6gcBUM7tJu/64ISq4My YiAVlSrw9ghLoaIar3j3DmPnyxXsWrtPv2hmyhS+Nh0lOTqZmq/TEHAAEnK92wnT7PmA xIX/7/VCEr5weGUe/HeJ0n13Bu0Cd4VgsRBoKRmnNlDstFMCKYwn/2mMCjQG2cDBLTZ8 xy2KY7WJHh1+7zCZES2tvtQxcSlt1v2EojD7M3Avm7GLnisEB5LbabRNiH2MZhyOlqsb 
wj1LdJ2GatvNgSj4H1oAR5GVoqiODfuXW7nxdrulY3Q9I8oF7WMayLQ4fMnmW0nGjlKc hlnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=VRGvWOinGw0btNVyOH340+TVwWpL0hbTJw9JR94PpB4=; b=AwWOHXQZWJzjrDdr/WoCoLu5jayyUj15TFoUo4mgpRoXMp606YXIRj8YDNAuM0TIGE NLMaCLa+DvuTs8raJmMwevx26bxmx/Vicf8QgQavNZGE3QXb/pTI5irweFW7NoBMJvuo Xqab8Fas4hLmmDHD519rh1E4H3jy0zVupee1Xa13ymOQ3og0g0kwmCPVBZmxVYrjwJev PqAhXxW+rv7Qicj30+HQ3SGFIe110F1Yr0uADYDnMPwhhZ76dzTa31DDRcT1QRUGCjlp oO9klUmt5QiIoErCdazllabVeeipCIU+FV/jIDriaK2B84jayWgptEOhbXZ8eUJiqmJe p5Sg== X-Gm-Message-State: AOAM530rvJYjpiCSSn6fPmrsHrA8X5z3jK4AlVGDcJ8Z1bJnFta9OqhK HiWE3f/UvYfTl3f3+qxKKGfdBik+4A== X-Google-Smtp-Source: ABdhPJyYq0tbe8OJm8WhlgT1J8VR0ZbAfLGaI4xhC6D0qARX6o+xgg9QlBKjgP5zThx95lmrqWGuCVhVLg== X-Received: from elver.muc.corp.google.com ([2a00:79e0:15:13:f693:9fff:fef4:2449]) (user=elver job=sendgmr) by 2002:a1c:3886:: with SMTP id f128mr20829871wma.121.1599486092008; Mon, 07 Sep 2020 06:41:32 -0700 (PDT) Date: Mon, 7 Sep 2020 15:40:54 +0200 In-Reply-To: <20200907134055.2878499-1-elver@google.com> Message-Id: <20200907134055.2878499-10-elver@google.com> Mime-Version: 1.0 References: <20200907134055.2878499-1-elver@google.com> X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH RFC 09/10] kfence, Documentation: add KFENCE documentation From: Marco Elver To: elver@google.com, glider@google.com, akpm@linux-foundation.org, catalin.marinas@arm.com, cl@linux.com, rientjes@google.com, iamjoonsoo.kim@lge.com, mark.rutland@arm.com, penberg@kernel.org Cc: hpa@zytor.com, paulmck@kernel.org, andreyknvl@google.com, aryabinin@virtuozzo.com, luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com, dvyukov@google.com, edumazet@google.com, gregkh@linuxfoundation.org, mingo@redhat.com, jannh@google.com, corbet@lwn.net, keescook@chromium.org, peterz@infradead.org, cai@lca.pw, 
tglx@linutronix.de, will@kernel.org, x86@kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: ACDC0180868DC X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam03 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add KFENCE documentation in dev-tools/kfence.rst, and add to index. Co-developed-by: Alexander Potapenko Signed-off-by: Alexander Potapenko Signed-off-by: Marco Elver --- Documentation/dev-tools/index.rst | 1 + Documentation/dev-tools/kfence.rst | 285 +++++++++++++++++++++++++++++ 2 files changed, 286 insertions(+) create mode 100644 Documentation/dev-tools/kfence.rst diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst index f7809c7b1ba9..1b1cf4f5c9d9 100644 --- a/Documentation/dev-tools/index.rst +++ b/Documentation/dev-tools/index.rst @@ -22,6 +22,7 @@ whole; patches welcome! ubsan kmemleak kcsan + kfence gdb-kernel-debugging kgdb kselftest diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst new file mode 100644 index 000000000000..254f4f089104 --- /dev/null +++ b/Documentation/dev-tools/kfence.rst @@ -0,0 +1,285 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Kernel Electric-Fence (KFENCE) +============================== + +Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety +error detector. KFENCE detects heap out-of-bounds access, use-after-free, and +invalid-free errors. + +KFENCE is designed to be enabled in production kernels, and has near zero +performance overhead. Compared to KASAN, KFENCE trades performance for +precision. 
The main motivation behind KFENCE's design, is that with enough +total uptime KFENCE will detect bugs in code paths not typically exercised by +non-production test workloads. One way to quickly achieve a large enough total +uptime is when the tool is deployed across a large fleet of machines. + +Usage +----- + +To enable KFENCE, configure the kernel with:: + + CONFIG_KFENCE=y + +KFENCE provides several other configuration options to customize behaviour (see +the respective help text in ``lib/Kconfig.kfence`` for more info). + +Tuning performance +~~~~~~~~~~~~~~~~~~ + +The most important parameter is KFENCE's sample interval, which can be set via +the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The +sample interval determines the frequency with which heap allocations will be +guarded by KFENCE. The default is configurable via the Kconfig option +``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0`` +disables KFENCE. + +With the Kconfig option ``CONFIG_KFENCE_NUM_OBJECTS`` (default 255), the number +of available guarded objects can be controlled. Each object requires 2 pages, +one for the object itself and the other one used as a guard page; object pages +are interleaved with guard pages, and every object page is therefore surrounded +by two guard pages. + +The total memory dedicated to the KFENCE memory pool can be computed as:: + + ( #objects + 1 ) * 2 * PAGE_SIZE + +Using the default config, and assuming a page size of 4 KiB, results in +dedicating 2 MiB to the KFENCE memory pool. 
+ +Error reports +~~~~~~~~~~~~~ + +A typical out-of-bounds access looks like this:: + + ================================================================== + BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b + + Out-of-bounds access at 0xffffffffb672efff (left of kfence-#17): + test_out_of_bounds_read+0xa3/0x22b + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated in: + __kfence_alloc+0x42d/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_out_of_bounds_read+0x98/0x22b + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +The header of the report provides a short summary of the function involved in +the access. It is followed by more detailed information about the access and +its origin. 
+ +Use-after-free accesses are reported as:: + + ================================================================== + BUG: KFENCE: use-after-free in test_use_after_free_read+0xb3/0x143 + + Use-after-free access at 0xffffffffb673dfe0: + test_use_after_free_read+0xb3/0x143 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#24 [0xffffffffb673dfe0-0xffffffffb673dfff, size=32, cache=kmalloc-32] allocated in: + __kfence_alloc+0x277/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_use_after_free_read+0x76/0x143 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + freed in: + kfence_guarded_free+0x158/0x380 + __kfence_free+0x38/0xc0 + test_use_after_free_read+0xa8/0x143 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 109 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +KFENCE also reports on invalid frees, such as double-frees:: + + ================================================================== + BUG: KFENCE: invalid free in test_double_free+0xdc/0x171 + + Invalid free of 0xffffffffb6741000: + test_double_free+0xdc/0x171 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#26 [0xffffffffb6741000-0xffffffffb674101f, size=32, cache=kmalloc-32] allocated in: + __kfence_alloc+0x42d/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_double_free+0x76/0x171 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + freed in: + kfence_guarded_free+0x158/0x380 + __kfence_free+0x38/0xc0 + 
test_double_free+0xa8/0x171 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 111 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +KFENCE also uses pattern-based redzones on the other side of an object's guard +page, to detect out-of-bounds writes on the unprotected side of the object. +These are reported on frees:: + + ================================================================== + BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184 + + Detected corrupted memory at 0xffffffffb6797ff9 [ 0xac . . . . . . ]: + test_kmalloc_aligned_oob_write+0xef/0x184 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + kfence-#69 [0xffffffffb6797fb0-0xffffffffb6797ff8, size=73, cache=kmalloc-96] allocated in: + __kfence_alloc+0x277/0x4c0 + __kmalloc+0x133/0x200 + test_alloc+0xf3/0x25b + test_kmalloc_aligned_oob_write+0x57/0x184 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 120 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +For such errors, the address where the corruption as well as the corrupt bytes +are shown. + +And finally, KFENCE may also report on invalid accesses to any protected page +where it was not possible to determine an associated object, e.g. 
if adjacent +object pages had not yet been allocated:: + + ================================================================== + BUG: KFENCE: invalid access in test_invalid_access+0x26/0xe0 + + Invalid access at 0xffffffffb670b00a: + test_invalid_access+0x26/0xe0 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +DebugFS interface +~~~~~~~~~~~~~~~~~ + +Some debugging information is exposed via debugfs: + +* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics. + +* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects + allocated via KFENCE, including those already freed but protected. + +Implementation Details +---------------------- + +Guarded allocations are set up based on the sample interval. After expiration +of the sample interval, a guarded allocation from the KFENCE object pool is +returned to the main allocator (SLAB or SLUB). At this point, the timer is +reset, and the next allocation is set up after the expiration of the interval. +To "gate" a KFENCE allocation through the main allocator's fast-path without +overhead, KFENCE relies on static branches via the static keys infrastructure. +The static branch is toggled to redirect the allocation to KFENCE. + +KFENCE objects each reside on a dedicated page, at either the left or right +page boundaries selected at random. The pages to the left and right of the +object page are "guard pages", whose attributes are changed to a protected +state, and cause page faults on any attempted access. Such page faults are then +intercepted by KFENCE, which handles the fault gracefully by reporting an +out-of-bounds access. 
The side opposite of an object's guard page is used as a
+pattern-based redzone, to detect out-of-bounds writes on the unprotected side of
+the object on frees (for special alignment and size combinations, both sides of
+the object are redzoned).
+
+KFENCE also uses pattern-based redzones on the other side of an object's guard
+page, to detect out-of-bounds writes on the unprotected side of the object;
+these are reported on frees.
+
+The following figure illustrates the page layout::
+
+    ---+-----------+-----------+-----------+-----------+-----------+---
+       | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
+       | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
+       | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
+       | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
+       | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
+       | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
+    ---+-----------+-----------+-----------+-----------+-----------+---
+
+Upon deallocation of a KFENCE object, the object's page is again protected and
+the object is marked as freed. Any further access to the object causes a fault
+and KFENCE reports a use-after-free access. Freed objects are inserted at the
+tail of KFENCE's freelist, so that the least recently freed objects are reused
+first, and the chances of detecting use-after-frees of recently freed objects
+are increased.
+
+Interface
+---------
+
+The following describes the functions which are used by allocators as well as
+page handling code to set up and deal with KFENCE allocations.
+
+.. kernel-doc:: include/linux/kfence.h
+   :functions: is_kfence_address
+               kfence_shutdown_cache
+               kfence_alloc kfence_free
+               kfence_ksize kfence_object_start
+               kfence_handle_page_fault
+
+Related Tools
+-------------
+
+In userspace, a similar approach is taken by `GWP-ASan
+`_. GWP-ASan also relies on guard pages and
+a sampling strategy to detect memory unsafety bugs at scale.
KFENCE's design is
+directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another
+similar but non-sampling approach, that also inspired the name "KFENCE", can be
+found in the userspace `Electric Fence Malloc Debugger
+`_.
+
+In the kernel, several tools exist to debug memory access errors, and in
+particular KASAN can detect all bug classes that KFENCE can detect. While KASAN
+is more precise, relying on compiler instrumentation, this comes at a
+performance cost. We want to highlight that KASAN and KFENCE are complementary,
+with different target environments. For instance, KASAN is the better
+debugging-aid, where a simple reproducer exists: due to the lower chance to
+detect the error, it would require more effort using KFENCE to debug.
+Deployments at scale, however, would benefit from using KFENCE to discover bugs
+due to code paths not exercised by test cases or fuzzers.
-- 
2.28.0.526.ge36021eeef-goog