All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Zhaoyang Huang <huangzhaoyang@gmail.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	"zhaoyang.huang" <zhaoyang.huang@unisoc.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>, Ke Wang <ke.wang@unisoc.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH] mm: fix racing of vb->va when kasan enabled
Date: Tue, 21 Jun 2022 16:29:06 +0200	[thread overview]
Message-ID: <YrHVsryZlnpO/Vha@pc638.lan> (raw)
In-Reply-To: <CAGWkznFdZ1_jrSWSOPkSDyLY1OSodZBy6MTfdwPKo3VoW67GBg@mail.gmail.com>

> On Tue, Jun 21, 2022 at 5:27 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > > On Mon, Jun 20, 2022 at 6:44 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > > > >
> > > > > > Is it easy to reproduce? If so could you please describe the steps? As i see
> > > > > > the freeing of the "vb" is RCU safe whereas vb->va is not. But from the first
> > > > > > glance i do not see how it can accessed twice. Hm..
> > > > > It was raised from a monkey test on A13_k515 system and got 1/20 pcs
> > > > > failed. IMO, vb->va which out of vmap_purge_lock protection could race
> > > > > with a concurrent ra freeing within __purge_vmap_area_lazy.
> > > > >
> > > > Do you have exact steps how you run "monkey" test?
> > > There are about 30+ kos inserted during startup which could be a
> > > specific criteria for reproduction. Do you have doubts about the test
> > > result or the solution?
> > > >
> > I do not have any doubt about your test results, so if you can trigger it
> > then there is an issue at least on the 5.4.161-android12 kernel.
> >
> > 1. With your fix we get expanded mutex range, thus the worst case of vmalloc
> > allocation can be increased when it fails and repeat. Because it also invokes
> > the purge_vmap_area_lazy() that access the same mutex.
> I am not sure I get your point. _vm_unmap_aliases calls
> _purge_vmap_area_lazy instead of purge_vmap_area_lazy. Do you have any
> other solutions? I really don't think my patch is the best way as I
> don't have a full view of vmalloc mechanism.
>
Yep, but it holds the mutex:

<snip>
mutex_lock(&vmap_purge_lock);
purge_fragmented_blocks_allcpus();
if (!__purge_vmap_area_lazy(start, end) && flush)
	flush_tlb_kernel_range(start, end);
mutex_unlock(&vmap_purge_lock);
<snip>

I do not have a solution yet. I am trying still to figure out how you can
trigger it. 

<snip>
	rcu_read_lock();
	list_for_each_entry_rcu(vb, &vbq->free, free_list) {
		spin_lock(&vb->lock);
		if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) {
			unsigned long va_start = vb->va->va_start;
<snip>

so you say that "vb->va->va_start" can be accessed twice. I do not see
how it can happen. The purge_fragmented_blocks() removes "vb" from the 
free_list and set vb->dirty to the VMAP_BBMAP_BITS to prevent purging
it again. It is protected by the spin_lock(&vb->lock):

<snip>
spin_lock(&vb->lock);
if (vb->free + vb->dirty == VMAP_BBMAP_BITS && vb->dirty != VMAP_BBMAP_BITS) {
	vb->free = 0; /* prevent further allocs after releasing lock */
	vb->dirty = VMAP_BBMAP_BITS; /* prevent purging it again */
	vb->dirty_min = 0;
	vb->dirty_max = VMAP_BBMAP_BITS;
<snip>

so the VMAP_BBMAP_BITS is set under spinlock. The _vm_unmap_aliases() checks it:

<snip>
list_for_each_entry_rcu(vb, &vbq->free, free_list) {
	spin_lock(&vb->lock);
	if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) {
		unsigned long va_start = vb->va->va_start;
		unsigned long s, e;
<snip>

if the "vb->dirty != VMAP_BBMAP_BITS". I am missing your point here?

> >
> > 2. You run 5.4.161-android12 kernel what is quite old. Could you please
> > retest with latest kernel? I am asking because on the latest kernel with
> > CONFIG_KASAN i am not able to reproduce it.
> >
> > I do a lot of: vm_map_ram()/vm_unmap_ram()/vmalloc()/vfree() in parallel
> > by 64 kthreads on my 64 CPUs test system.
> The failure generates at 20s from starting up, I think it is a rare timing.
> >
> > Could you please confirm that you can trigger an issue on the latest kernel?
> Sorry, I don't have an available latest kernel for now.
>
Can you do: "gdb ./vmlinux", execute "l *_vm_unmap_aliases+0x164" and provide
output?

--
Uladzislau Rezki

  reply	other threads:[~2022-06-21 14:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-25  2:52 [PATCH] mm: fix racing of vb->va when kasan enabled zhaoyang.huang
2022-05-26  6:31 ` Xuewen Yan
2022-06-19 21:03   ` Uladzislau Rezki
2022-06-20  6:58     ` Zhaoyang Huang
2022-06-20 10:43       ` Uladzislau Rezki
2022-06-20 11:23         ` Zhaoyang Huang
2022-06-21  9:26           ` Uladzislau Rezki
2022-06-21 11:00             ` Zhaoyang Huang
2022-06-21 14:29               ` Uladzislau Rezki [this message]
2022-06-22  3:15                 ` Zhaoyang Huang
2022-06-22  6:04                   ` Zhaoyang Huang
2022-06-24 10:27                     ` Uladzislau Rezki
2022-06-24 10:51                       ` Uladzislau Rezki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YrHVsryZlnpO/Vha@pc638.lan \
    --to=urezki@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@lst.de \
    --cc=huangzhaoyang@gmail.com \
    --cc=ke.wang@unisoc.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=zhaoyang.huang@unisoc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.