From: Marco Elver <elver@google.com>
To: Liu Shixin <liushixin2@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>,
	akpm@linux-foundation.org, glider@google.com, dvyukov@google.com,
	jannh@google.com, mark.rutland@arm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kasan-dev@googlegroups.com, hdanton@sina.com
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration
Date: Sat, 18 Sep 2021 11:45:15 +0200
Message-ID: <CANpmjNOUt5is7iHCAz9aOdD2nBb_7tqAKXmuWtitY_VNOkmv5w@mail.gmail.com>
In-Reply-To: <CANpmjNPj5aMPu_7D=cwrDyAwz9i-rVcXYgGapYdB+vdHcR3RZg@mail.gmail.com>

On Sat, 18 Sept 2021 at 11:37, Marco Elver <elver@google.com> wrote:
>
> On Sat, 18 Sept 2021 at 10:07, Liu Shixin <liushixin2@huawei.com> wrote:
> >
> > On 2021/9/16 16:49, Marco Elver wrote:
> > > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> > >> Hi Marco,
> > >>
> > >> We found that kfence_test fails on ARM64 with this patch, with or
> > >> without CONFIG_DETECT_HUNG_TASK.
> > >>
> > >> Any thoughts?
> > > Please share log and instructions to reproduce if possible. Also, if
> > > possible, please share bisection log that led you to this patch.
> > >
> > > I currently do not see how this patch would cause that, it only
> > > increases the timeout duration.
> > >
> > > I know that under QEMU TCG mode, there are occasionally timeouts in
> > > the test simply due to QEMU being extremely slow or other weirdness.
> > >
> > >
> > Hi Marco,
> >
> > Here are some results from the current test:
> > 1. Using qemu-kvm on an arm64 machine, all test cases pass.
> > 2. Using qemu-system-aarch64 on an x86_64 machine, some test cases fail randomly.
> > 3. Using qemu-system-aarch64 on x86_64, but with the check of kfence_allocation_key removed from kfence_alloc(), all test cases pass.
> >
> > I added some printing to the kernel and got very strange results.
> > I added a new variable, kfence_allocation_key_gate, to track the
> > state of kfence_allocation_key. As shown in the following code, theoretically,
> > if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
> > enabled, so the value of the variable error in kfence_alloc() should always be
> > zero. All of the passing test cases are consistent with this. But as shown in
> > the following failure log, although kfence_allocation_key has been enabled,
> > the check still fails here.
> >
> > So I think static_key might be problematic in my QEMU environment.
> > The timeout change is not itself the problem; it only made this problem observable.
> > I tried changing the wait_event to a loop: I set the timeout to HZ and
> > re-enabled/disabled the key in each iteration, and the failing test cases disappeared.
>
> Nice analysis, thanks! What I gather is that static_keys/jump_labels
> are somehow broken in QEMU.
>
> This does remind me that I found a bug in QEMU that might be relevant:
> https://bugs.launchpad.net/qemu/+bug/1920934
> Looks like it was never fixed. :-/
>
> The failures I encountered caused the kernel to crash, but I never saw
> the kfence test fail because of that (I never managed to get that far).
> Though the bug I saw was in x86 TCG mode, and I never tried arm64. If

[ ... that is, I didn't try running QEMU-ASan in arm64 TCG mode ... of
course I use QEMU arm64 to test. ;-) ]

> you can, try to build a QEMU with ASan and see if you also get the
> same use-after-free bug.
>
> Unless we observe the problem on a real machine, I think for now we
> can conclude with fairly high confidence that QEMU TCG still has
> issues and cannot be fully trusted here (see bug above).
>
> Thanks,
> -- Marco
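
For readers following the analysis quoted above, here is a minimal userspace sketch of the two ideas Liu describes: a gate variable that shadows the static key so that a mismatch can be flagged as an error, and a retry loop that re-enables the key on every short timeout instead of waiting once. This is only a toy model, not the kfence code. Every name in it (model_key, key_gate, model_enable, model_alloc, model_wait_loop) is hypothetical, and the "patched" flag merely assumes that the emulated branch may not observe the key update, which is the suspicion raised in this thread rather than a confirmed cause.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_key {
	bool logical;	/* state the kernel believes the key is in */
	bool patched;	/* state the (emulated) branch actually observes */
};

/* key_gate == 0 means "the key should be enabled", as in the description. */
static int key_gate = 1;

/* Enable the key; patch_ok models whether the update became visible. */
static void model_enable(struct model_key *k, bool patch_ok)
{
	k->logical = true;
	key_gate = 0;
	if (patch_ok)
		k->patched = true;
}

static void model_disable(struct model_key *k)
{
	k->logical = false;
	k->patched = false;
	key_gate = 1;
}

/*
 * Allocation fast path. Returns true if the branch was taken.
 * *error is set to 1 when the gate says "enabled" but the branch
 * still sees the key as disabled, which should never happen.
 */
static bool model_alloc(const struct model_key *k, int *error)
{
	*error = (key_gate == 0 && !k->patched);
	return k->patched;
}

/*
 * Workaround sketch: rather than one long wait, use short rounds and
 * disable/re-enable the key in each one. first_good_round is the first
 * round in which the update is assumed to become visible.
 * Returns the number of rounds used, or -1 if all rounds timed out.
 */
static int model_wait_loop(struct model_key *k, int max_rounds,
			   int first_good_round)
{
	assert(max_rounds > 0);

	for (int round = 0; round < max_rounds; round++) {
		int error;

		model_enable(k, round >= first_good_round);
		if (model_alloc(k, &error)) {
			model_disable(k);
			return round + 1;
		}
		/* Round timed out: disable now, re-enable in the next round. */
		model_disable(k);
	}
	return -1;
}
```

In this model, a single enable whose update never becomes visible leaves model_alloc() reporting error == 1 indefinitely, while the loop recovers as soon as one re-enable takes effect. That mirrors why the loop would hide the symptom without explaining it.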
