Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Liu Bo <bo.liu@linux.alibaba.com>, Jan Kara <jack@suse.cz>,
	Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Stable tree <stable@vger.kernel.org>
Subject: Re: [PATCH v3] mm, memcg: fix reclaim deadlock with writeback
Date: Fri, 14 Dec 2018 09:49:48 +0100
Message-ID: <20181214084948.GA5624@dhcp22.suse.cz> (raw)
In-Reply-To: <20181213220400.GA9829@cmpxchg.org>

On Thu 13-12-18 17:04:00, Johannes Weiner wrote:
[...]
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks!

> Just one nit:
> 
> > @@ -2993,6 +2993,17 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
> >  	struct vm_area_struct *vma = vmf->vma;
> >  	vm_fault_t ret;
> >  
> > +	/*
> > +	 * Preallocate pte before we take page_lock because this might lead to
> > +	 * deadlocks for memcg reclaim which waits for pages under writeback.
> > +	 */
> > +	if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
> > +		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm, vmf->address);
> > +		if (!vmf->prealloc_pte)
> > +			return VM_FAULT_OOM;
> > +		smp_wmb(); /* See comment in __pte_alloc() */
> > +	}
> 
> Could you be more specific in the deadlock comment? git blame will
> work fine for a while, but it becomes a pain to find corresponding
> patches after stuff gets moved around for years.
> 
> In particular the race diagram between reclaim with a page lock held
> and the fs doing SetPageWriteback batches before kicking off IO would
> be useful directly in the code, IMO.

This?

diff --git a/mm/memory.c b/mm/memory.c
index bb78e90a9b70..ece221e4da6d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2995,7 +2995,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 
 	/*
 	 * Preallocate pte before we take page_lock because this might lead to
-	 * deadlocks for memcg reclaim which waits for pages under writeback.
+	 * deadlocks for memcg reclaim which waits for pages under writeback:
+	 * 				lock_page(A)
+	 * 				SetPageWriteback(A)
+	 * 				unlock_page(A)
+	 * lock_page(B)
+	 * 				lock_page(B)
+	 * pte_alloc_pne
+	 *   shrink_page_list
+	 *     wait_on_page_writeback(A)
+	 *     				SetPageWriteback(B)
+	 *     				unlock_page(B)
+	 *     				# flush A, B to clear the writeback
 	 */
 	if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
 		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm, vmf->address);
-- 
Michal Hocko
SUSE Labs

  reply index

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-11 13:26 [PATCH] " Michal Hocko
2018-12-11 15:15 ` Kirill A. Shutemov
2018-12-11 16:21   ` Michal Hocko
2018-12-11 16:39     ` Jan Kara
2018-12-12  9:42 ` Kirill A. Shutemov
2018-12-12  9:48   ` Michal Hocko
2018-12-12 10:05   ` Jan Kara
2018-12-12 11:58 ` Kirill A. Shutemov
2018-12-12 12:13   ` Michal Hocko
2018-12-12 15:33 ` kbuild test robot
2018-12-12 15:57   ` Michal Hocko
2018-12-12 15:50 ` [PATCH v2] " Michal Hocko
2018-12-12 17:24   ` Shakeel Butt
2018-12-12 17:49   ` Liu Bo
2018-12-13  9:22   ` [PATCH v3] " Michal Hocko
2018-12-13 10:41     ` Kirill A. Shutemov
2018-12-13 12:39       ` Michal Hocko
2018-12-13 22:04     ` Johannes Weiner
2018-12-14  8:49       ` Michal Hocko [this message]
2018-12-13 22:43     ` Liu Bo
2018-12-14 17:13 ` [PATCH] " kbuild test robot
2018-12-14 17:31   ` Michal Hocko

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181214084948.GA5624@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bo.liu@linux.alibaba.com \
    --cc=david@fromorbit.com \
    --cc=hannes@cmpxchg.org \
    --cc=jack@suse.cz \
    --cc=kirill@shutemov.name \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shakeelb@google.com \
    --cc=stable@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org linux-fsdevel@archiver.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox