All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: snazy@snazy.de
Cc: Jan Kara <jack@suse.cz>, Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Randy Dunlap <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Potyra, Stefan" <Stefan.Potyra@elektrobit.com>
Subject: Re: mlockall(MCL_CURRENT) blocking infinitely
Date: Wed, 6 Nov 2019 09:56:09 -0500	[thread overview]
Message-ID: <20191106145608.ucvuwsuyijvkxz22@macbook-pro-91.dhcp.thefacebook.com> (raw)
In-Reply-To: <4edf4dea97f6c1e3c7d4fed0e12c3dc6dff7575f.camel@gmx.de>

On Wed, Nov 06, 2019 at 02:45:43PM +0100, Robert Stupp wrote:
> On Wed, 2019-11-06 at 13:03 +0100, Jan Kara wrote:
> > On Tue 05-11-19 13:22:11, Johannes Weiner wrote:
> > > What I don't quite understand yet is why the fault path doesn't
> > > make
> > > progress eventually. We must drop the mmap_sem without changing the
> > > state in any way. How can we keep looping on the same page?
> >
> > That may be a slight suboptimality with Josef's patches. If the page
> > is marked as PageReadahead, we always drop mmap_sem if we can and
> > start
> > readahead without checking whether that makes sense or not in
> > do_async_mmap_readahead(). OTOH page_cache_async_readahead() then
> > clears
> > PageReadahead so the only way how I can see we could loop like this
> > is when
> > file->ra->ra_pages is 0. Not sure if that's what's happening through.
> > We'd
> > need to find which of the paths in filemap_fault() calls
> > maybe_unlock_mmap_for_io() to tell more.
> 
> Yes, ra_pages==0
> Attached the dmesg + smaps outputs
> 
> 

Ah ok I see what's happening, __get_user_pages() returns 0 if we get an EBUSY
from faultin_page, and then __mm_populate does nend = nstart + ret * PAGE_SIZE,
which just leaves us where we are.

We need to handle the non-blocking and the locking separately in __mm_populate
so we know what's going on.  Jan's fix for the readahead thing is definitely
valid as well, but this will keep us from looping forever in other retry cases.

diff --git a/mm/gup.c b/mm/gup.c
index 8f236a335ae9..ac625805d569 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1237,6 +1237,7 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
 	unsigned long end, nstart, nend;
 	struct vm_area_struct *vma = NULL;
 	int locked = 0;
+	int nonblocking = 1;
 	long ret = 0;
 
 	end = start + len;
@@ -1268,7 +1269,7 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
 		 * double checks the vma flags, so that it won't mlock pages
 		 * if the vma was already munlocked.
 		 */
-		ret = populate_vma_page_range(vma, nstart, nend, &locked);
+		ret = populate_vma_page_range(vma, nstart, nend, &nonblocking);
 		if (ret < 0) {
 			if (ignore_errors) {
 				ret = 0;
@@ -1276,6 +1277,14 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
 			}
 			break;
 		}
+
+		/*
+		 * We dropped the mmap_sem, so we need to re-lock, and the next
+		 * loop around we won't drop because nonblocking is now 0.
+		 */
+		if (!nonblocking)
+			locked = 0;
+
 		nend = nstart + ret * PAGE_SIZE;
 		ret = 0;
 	}

  parent reply	other threads:[~2019-11-06 14:56 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-24  7:36 mlockall(MCL_CURRENT) blocking infinitely Robert Stupp
2019-10-24 23:34 ` Randy Dunlap
2019-10-25  9:21   ` Michal Hocko
2019-10-25 11:02     ` Robert Stupp
2019-10-25 11:02       ` Robert Stupp
2019-10-25 11:46       ` Michal Hocko
2019-10-25 11:50         ` Michal Hocko
2019-10-25 11:59           ` Robert Stupp
2019-10-25 11:59             ` Robert Stupp
2019-10-25 13:19             ` Robert Stupp
2019-10-25 13:19               ` Robert Stupp
2019-10-25 11:55         ` Robert Stupp
2019-10-25 11:55           ` Robert Stupp
2019-10-25 12:05           ` Michal Hocko
2019-10-25 12:11             ` Michal Hocko
2019-10-25 13:10               ` Robert Stupp
2019-10-25 13:10                 ` Robert Stupp
2019-10-25 13:27                 ` Michal Hocko
2019-10-25 13:45                   ` Robert Stupp
2019-10-25 13:45                     ` Robert Stupp
2019-10-25 13:57                     ` Michal Hocko
2019-10-25 14:00                       ` Michal Hocko
2019-10-25 15:58                         ` Robert Stupp
2019-10-25 15:58                           ` Robert Stupp
2019-11-05 13:23                           ` Robert Stupp
2019-11-05 13:23                             ` Robert Stupp
2019-11-05 15:28                             ` Vlastimil Babka
2019-11-05 18:22                               ` Johannes Weiner
2019-11-05 20:05                                 ` Robert Stupp
2019-11-05 20:05                                   ` Robert Stupp
2019-11-06 10:25                                   ` Robert Stupp
2019-11-06 10:25                                     ` Robert Stupp
2019-11-06 11:26                                     ` Robert Stupp
2019-11-06 11:26                                       ` Robert Stupp
2019-11-06 12:04                                       ` Jan Kara
2019-11-06 12:24                                       ` Robert Stupp
2019-11-06 12:24                                         ` Robert Stupp
2019-11-06 12:03                                 ` Jan Kara
2019-11-06 13:45                                   ` Robert Stupp
2019-11-06 13:45                                     ` Robert Stupp
2019-11-06 14:35                                     ` Jan Kara
2019-11-06 15:32                                       ` Robert Stupp
2019-11-06 15:32                                         ` Robert Stupp
2019-11-06 14:38                                     ` Jan Kara
2019-11-06 14:56                                     ` Josef Bacik [this message]
2019-11-06 15:05                                       ` Jan Kara
2019-11-06 15:14                                         ` Josef Bacik
2019-11-06 15:25                                           ` Jan Kara
2019-11-06 16:39                                           ` Robert Stupp
2019-11-06 16:39                                             ` Robert Stupp
2019-11-06 17:03                                             ` Robert Stupp
2019-11-06 17:03                                               ` Robert Stupp
2019-11-06 17:25                                               ` Jan Kara
2019-11-07  8:08                                                 ` Robert Stupp
2019-11-07  8:08                                                   ` Robert Stupp
2019-11-20 12:42                                                   ` Robert Stupp
2019-10-25 13:55                   ` Robert Stupp
2019-10-25 13:55                     ` Robert Stupp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191106145608.ucvuwsuyijvkxz22@macbook-pro-91.dhcp.thefacebook.com \
    --to=josef@toxicpanda.com \
    --cc=Stefan.Potyra@elektrobit.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=jack@suse.cz \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rdunlap@infradead.org \
    --cc=snazy@snazy.de \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.