From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20210421105132.3965998-1-elver@google.com>
    <20210421105132.3965998-3-elver@google.com>
    <6c0d5f40-5067-3a59-65fa-6977b6f70219@huawei.com>
From: Marco Elver
Date: Sat, 18 Sep 2021 11:45:15 +0200
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration
To: Liu Shixin
Cc: Kefeng Wang ,
    akpm@linux-foundation.org, glider@google.com, dvyukov@google.com,
    jannh@google.com, mark.rutland@arm.com, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, kasan-dev@googlegroups.com, hdanton@sina.com
Content-Type: text/plain; charset="UTF-8"

On Sat, 18 Sept 2021 at 11:37, Marco Elver wrote:
>
> On Sat, 18 Sept 2021 at 10:07, Liu Shixin wrote:
> >
> > On 2021/9/16 16:49, Marco Elver wrote:
> > > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang wrote:
> > >> Hi Marco,
> > >>
> > >> We found that kfence_test fails on ARM64 with this patch, with or
> > >> without CONFIG_DETECT_HUNG_TASK.
> > >>
> > >> Any thoughts?
> > > Please share log and instructions to reproduce if possible. Also, if
> > > possible, please share bisection log that led you to this patch.
> > >
> > > I currently do not see how this patch would cause that, it only
> > > increases the timeout duration.
> > >
> > > I know that under QEMU TCG mode, there are occasionally timeouts in
> > > the test simply due to QEMU being extremely slow or other weirdness.
> > >
> > >
> > Hi Marco,
> >
> > Here are some results from the current tests:
> > 1. Using qemu-kvm on an arm64 machine, all test cases pass.
> > 2. Using qemu-system-aarch64 on an x86_64 machine, some test cases
> >    fail at random.
> > 3. Using qemu-system-aarch64 on x86_64, but removing the
> >    kfence_allocation_key check in kfence_alloc(), all test cases pass.
> >
> > I added some printing to the kernel and got very strange results.
> > I added a new variable, kfence_allocation_key_gate, to track the
> > state of kfence_allocation_key. As shown in the following code,
> > theoretically, if kfence_allocation_key_gate is zero, then
> > kfence_allocation_key must be enabled, so the value of the variable
> > error in kfence_alloc() should always be zero. In fact, all the
> > passing test cases fit this expectation. But as shown in the
> > following failure log, although kfence_allocation_key has been
> > enabled, the check still fails here.
> >
> > So I think static_key might be problematic in my qemu environment.
> > The timeout change is not the problem itself, but it caused us to
> > observe this problem. I tried changing the wait_event to a loop: I
> > set the timeout to HZ and re-enabled/disabled in each iteration, and
> > the failing test cases disappeared.
>
> Nice analysis, thanks! What I gather is that static_keys/jump_labels
> are somehow broken in QEMU.
>
> This does remind me that I found a bug in QEMU that might be relevant:
> https://bugs.launchpad.net/qemu/+bug/1920934
> Looks like it was never fixed. :-/
>
> The failures I encountered caused the kernel to crash, but I never saw
> the kfence test fail due to that (never managed to get that far).
> Though the bug I saw was on x86 TCG mode, and I never tried arm64. If

[ ... that is, I didn't try running QEMU-ASan in arm64 TCG mode ... of
course I use QEMU arm64 to test. ;-) ]

> you can, try to build a QEMU with ASan and see if you also get the
> same use-after-free bug.
>
> Unless we observe the problem on a real machine, I think for now we
> can conclude with fairly high confidence that QEMU TCG still has
> issues and cannot be fully trusted here (see bug above).
>
> Thanks,
> -- Marco
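
For readers following the thread, the invariant described in the quoted
analysis (gate is zero implies the static key is enabled, so the error
flag must stay zero) can be sketched in plain user-space C. This is only
a rough model under stated assumptions, not the kernel code and not the
instrumentation that was actually used: key_enabled, key_gate,
model_key_enable(), model_key_disable(), model_alloc_check() and
model_toggle_loop() are all hypothetical names, and a plain bool stands
in for the static key.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
static bool key_enabled;   /* models the kfence_allocation_key static key */
static int key_gate = 1;   /* models the debug variable: 0 means "key should be enabled" */

static void model_key_enable(void)
{
	key_enabled = true;
	key_gate = 0;
}

static void model_key_disable(void)
{
	key_gate = 1;
	key_enabled = false;
}

/*
 * Models the instrumented check in kfence_alloc(): returns 1 ("error")
 * when the gate says the key is enabled but the branch is not taken.
 */
static int model_alloc_check(void)
{
	if (key_enabled)
		return 0;          /* branch taken, nothing inconsistent */
	return key_gate == 0;      /* branch not taken: error only if gate says enabled */
}

/* Models the reported workaround: enable and disable once per bounded wait. */
static int model_toggle_loop(int rounds)
{
	int errors = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		model_key_enable();
		errors += model_alloc_check();
		model_key_disable();
		errors += model_alloc_check();
	}
	return errors;
}
```

In this model the error count can only become non-zero if key_enabled is
forced to false while key_gate is still 0, which corresponds to the
situation in the failing runs: the key was enabled, yet the branch
behaved as if it were not.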