All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Tatashin <pasha.tatashin@soleen.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	Tyler Hicks <tyhicks@linux.microsoft.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	mike.kravetz@oracle.com, Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	David Rientjes <rientjes@google.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	Ira Weiny <ira.weiny@intel.com>,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 08/10] mm/gup: limit number of gup migration failures, honor failures
Date: Fri, 15 Jan 2021 13:10:27 -0500	[thread overview]
Message-ID: <CA+CK2bC=o1-qW5+d-Lud9qN1937PC4Jxf_oyxwVrKby=mH5WyQ@mail.gmail.com> (raw)
In-Reply-To: <CA+CK2bDDUMOeCH8rQBL7fBdHCAUZBOykyXNL2N=hmxq7xi0giQ@mail.gmail.com>

On Wed, Jan 13, 2021 at 3:05 PM Pavel Tatashin
<pasha.tatashin@soleen.com> wrote:
>
> > > > Oh, that existing logic is wrong too :( Another bug.
> > >
> > > I do not think there is a bug.
> > >
> > > > You can't skip pages in the pages[] array under the assumption they
> > > > are contiguous. ie the i+=step is wrong.
> > >
> > > If pages[i] is part of a compound page, the other parts of this page
> > > must be sequential in this array for this compound page
> >
> > That is true only if the PMD points to the page. If the PTE points to
> > a tail page then there is no requirement that other PTEs are
> > contiguous with the compount page.
> >
> > At this point we have no idea if the GUP logic got this compound page
> > as a head page in a PMD or as a tail page from a PTE, so we can't
> > assume a contiguous run of addresses.
>
> I see, I will fix this bug in an upstream as a separate patch in my
> series, and keep the fix when my fixes are applied.
>
> >
> > Look at split_huge_pmd() - it doesn't break up the compound page it
> > just converts the PMD to a PTE array and scatters the tail pages to
> > the PTE.

Hi Jason,

I've been thinking about this some more. Again, I am not sure this is
a bug. I understand split_huge_pmd() may split the PMD size page into
PTEs and leave the compound page intact. However, in order for pages[]
to have non sequential addresses in compound page, those PTEs must
also be migrated after split_huge_pmd(), however when we migrate them
we will either migrate the whole compound page or do
split_huge_page_to_list() which will in turn do ClearPageCompound().
Please let me know if I am missing something.

Thank you,
Pasha

>
> Got it, unfortunately the fix will deoptimize the code by having to
> check every page if it is part of a previous compound page or not.
>
> >
> > I understand Matt is pushing on this idea more by having compound
> > pages in the page cache, but still mapping tail pages when required.
> >
> > > This is actually standard migration procedure, elsewhere in the kernel
> > > we migrate pages in exactly the same fashion: isolate and later
> > > migrate. The isolation works for LRU only pages.
> >
> > But do other places cause a userspace visible random failure when LRU
> > isolation fails?
>
> Makes sense, I will remove maximum retries for isolation, and retry
> indefinitely, the same as it is done during memory hot-remove. So, we
> will fail only when migration fails.
>
> >
> > I don't like it at all, what is the user supposed to do?
> >
> > Jason

  parent reply	other threads:[~2021-01-15 18:12 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-17 18:52 [PATCH v4 00/10] prohibit pinning pages in ZONE_MOVABLE Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 01/10] mm/gup: don't pin migrated cma pages in movable zone Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 02/10] mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 03/10] mm: apply per-task gfp constraints in fast path Pavel Tatashin
2020-12-18  9:36   ` Michal Hocko
2020-12-18 12:23     ` Pavel Tatashin
2020-12-18 12:23       ` Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 04/10] mm: honor PF_MEMALLOC_PIN for all movable pages Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 05/10] mm/gup: migrate pinned pages out of movable zone Pavel Tatashin
2020-12-18  9:43   ` Michal Hocko
2020-12-18 12:24     ` Pavel Tatashin
2020-12-18 12:24       ` Pavel Tatashin
2020-12-18 13:08       ` Michal Hocko
2021-01-13 19:14         ` Pavel Tatashin
2021-01-13 19:14           ` Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 06/10] memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning Pavel Tatashin
2020-12-18  9:44   ` Michal Hocko
2020-12-17 18:52 ` [PATCH v4 07/10] mm/gup: change index type to long as it counts pages Pavel Tatashin
2020-12-18  9:50   ` Michal Hocko
2020-12-18 12:32     ` Pavel Tatashin
2020-12-18 12:32       ` Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 08/10] mm/gup: limit number of gup migration failures, honor failures Pavel Tatashin
2020-12-17 20:50   ` Jason Gunthorpe
2020-12-17 22:02     ` Pavel Tatashin
2020-12-17 22:02       ` Pavel Tatashin
2020-12-18 14:19       ` Jason Gunthorpe
2021-01-13 19:43         ` Pavel Tatashin
2021-01-13 19:43           ` Pavel Tatashin
2021-01-13 19:55           ` Jason Gunthorpe
2021-01-13 20:05             ` Pavel Tatashin
2021-01-13 20:05               ` Pavel Tatashin
2021-01-13 23:40               ` Jason Gunthorpe
2021-01-15 18:10               ` Pavel Tatashin [this message]
2021-01-15 18:10                 ` Pavel Tatashin
2021-01-15 18:40                 ` Jason Gunthorpe
2020-12-18 10:46   ` Michal Hocko
2020-12-18 12:43     ` Pavel Tatashin
2020-12-18 12:43       ` Pavel Tatashin
2020-12-18 13:04       ` David Hildenbrand
2020-12-18 13:14       ` Michal Hocko
2021-01-13 19:49         ` Pavel Tatashin
2021-01-13 19:49           ` Pavel Tatashin
2020-12-17 18:52 ` [PATCH v4 09/10] selftests/vm: test flag is broken Pavel Tatashin
2020-12-18  9:06   ` John Hubbard
2020-12-18  9:11     ` John Hubbard
2020-12-17 18:52 ` [PATCH v4 10/10] selftests/vm: test faulting in kernel, and verify pinnable pages Pavel Tatashin
2020-12-19  5:57   ` John Hubbard
2020-12-19 15:22     ` Pavel Tatashin
2020-12-19 15:22       ` Pavel Tatashin
2020-12-19 23:51       ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+CK2bC=o1-qW5+d-Lud9qN1937PC4Jxf_oyxwVrKby=mH5WyQ@mail.gmail.com' \
    --to=pasha.tatashin@soleen.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=ira.weiny@intel.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=osalvador@suse.de \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sashal@kernel.org \
    --cc=tyhicks@linux.microsoft.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.