linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Minchan Kim <minchan@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm <linux-mm@kvack.org>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	Harish Sriram <harish@linux.ibm.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] Revert "mm/vunmap: add cond_resched() in vunmap_pmd_range"
Date: Mon, 9 Nov 2020 12:33:59 +0100	[thread overview]
Message-ID: <20201109113359.GA32670@pc636> (raw)
In-Reply-To: <20201107083939.GA1633068@google.com>

On Sat, Nov 07, 2020 at 12:39:39AM -0800, Minchan Kim wrote:
> Hi Andrew,
> 
> On Fri, Nov 06, 2020 at 05:59:33PM -0800, Andrew Morton wrote:
> > On Thu,  5 Nov 2020 09:02:49 -0800 Minchan Kim <minchan@kernel.org> wrote:
> > 
> > > This reverts commit e47110e90584a22e9980510b00d0dfad3a83354e.
> > > 
> > > While I was doing zram testing, I found sometimes decompression failed
> > > since the compression buffer was corrupted. With investigation,
> > > I found below commit calls cond_resched unconditionally so it could
> > > make a problem in atomic context if the task is reschedule.
> > > 
> > > Revert the original commit for now.
> > > 
> > > [   55.109012] BUG: sleeping function called from invalid context at mm/vmalloc.c:108
> > > [   55.110774] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog
> > > [   55.111973] 3 locks held by memhog/946:
> > > [   55.112807]  #0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160
> > > [   55.114151]  #1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160
> > > [   55.115848]  #2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0
> > > [   55.118947] CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316
> > > [   55.121265] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> > > [   55.122540] Call Trace:
> > > [   55.122974]  dump_stack+0x8b/0xb8
> > > [   55.123588]  ___might_sleep.cold+0xb6/0xc6
> > > [   55.124328]  unmap_kernel_range_noflush+0x2eb/0x350
> > > [   55.125198]  unmap_kernel_range+0x14/0x30
> > > [   55.125920]  zs_unmap_object+0xd5/0xe0
> > > [   55.126604]  zram_bvec_rw.isra.0+0x38c/0x8e0
> > > [   55.127462]  zram_rw_page+0x90/0x101
> > > [   55.128199]  bdev_write_page+0x92/0xe0
> > > [   55.128957]  ? swap_slot_free_notify+0xb0/0xb0
> > > [   55.129841]  __swap_writepage+0x94/0x4a0
> > > [   55.130636]  ? do_raw_spin_unlock+0x4b/0xa0
> > > [   55.131462]  ? _raw_spin_unlock+0x1f/0x30
> > > [   55.132261]  ? page_swapcount+0x6c/0x90
> > > [   55.133038]  pageout+0xe3/0x3a0
> > > [   55.133702]  shrink_page_list+0xb94/0xd60
> > > [   55.134626]  shrink_inactive_list+0x158/0x460
> > >
> > > ...
> > >
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -102,8 +102,6 @@ static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
> > >  		if (pmd_none_or_clear_bad(pmd))
> > >  			continue;
> > >  		vunmap_pte_range(pmd, addr, next, mask);
> > > -
> > > -		cond_resched();
> > >  	} while (pmd++, addr = next, addr != end);
> > >  }
> > 
> > If this is triggering a warning then why isn't the might_sleep() in
> > remove_vm_area() also triggering?
> 
> I don't understand what specific callpath you are talking but if
> it's clearly called in atomic context, the reason would be config
> combination I met.
>     
>     CONFIG_PREEMPT_VOLUNTARY + no CONFIG_DEBUG_ATOMIC_SLEEP
> 
> It makes preempt_count logic void so might_sleep warning will not work.
> 
> > 
> > Sigh.  I also cannot remember why these vfree() functions have to be so
> > awkward.  The mutex_trylock(&vmap_purge_lock) isn't permitted in
> > interrupt context because mutex_trylock() is stupid, but what was the
> > issue with non-interrupt atomic code?
> > 
> 
> Seems like a latency issue.
> 
> commit f9e09977671b
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Mon Dec 12 16:44:23 2016 -0800
> 
>     mm: turn vmap_purge_lock into a mutex
> 
> The purge_lock spinlock causes high latencies with non RT kernel,
> 
__purge_vmap_area_lazy() is our bottleneck. The list of areas to be
drained can be quite big, so the process itself is time consuming.
That is why it is protected by the mutex instead of spinlock. Latency
issues.

I proposed to optimize it here: https://lkml.org/lkml/2020/9/7/815
I will send out the patch soon, but it is another topic not related
to this issue.

--
Vlad Rezki

  reply	other threads:[~2020-11-09 11:34 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-05 17:02 [PATCH] Revert "mm/vunmap: add cond_resched() in vunmap_pmd_range" Minchan Kim
2020-11-05 17:16 ` Matthew Wilcox
2020-11-05 17:33   ` Minchan Kim
2020-11-07  1:59 ` Andrew Morton
2020-11-07  8:39   ` Minchan Kim
2020-11-09 11:33     ` Uladzislau Rezki [this message]
2020-11-12 20:01     ` Minchan Kim
2020-11-12 22:49       ` Andrew Morton
2020-11-13 16:25         ` Minchan Kim
2020-11-16 17:53           ` Minchan Kim
2020-11-16 23:20             ` Andrew Morton
2020-11-17 13:57               ` Uladzislau Rezki
2020-11-17 13:56             ` Christoph Hellwig
2020-11-17 20:29               ` Minchan Kim
2020-11-18  2:06                 ` Sergey Senozhatsky
2020-11-19  9:29                   ` Tony Lindgren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201109113359.GA32670@pc636 \
    --to=urezki@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=harish@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).