linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
To: David Hildenbrand <david@redhat.com>, Michal Hocko <mhocko@kernel.org>
Cc: akpm@linux-foundation.org, aarcange@redhat.com,
	dan.j.williams@intel.com,  dave.hansen@intel.com,
	konrad.wilk@oracle.com, lcapitulino@redhat.com,
	 mgorman@techsingularity.net, mm-commits@vger.kernel.org,
	mst@redhat.com,  osalvador@suse.de, pagupta@redhat.com,
	pbonzini@redhat.com, riel@surriel.com,  vbabka@suse.cz,
	wei.w.wang@intel.com, willy@infradead.org,
	yang.zhang.wz@gmail.com,  linux-mm@kvack.org
Subject: Re: + mm-introduce-reported-pages.patch added to -mm tree
Date: Tue, 12 Nov 2019 10:34:33 -0800	[thread overview]
Message-ID: <4be6114f57934eb1478f84fd1358a7fcc547b248.camel@linux.intel.com> (raw)
In-Reply-To: <8a407188-5dd2-648b-fc26-f03a826bfee3@redhat.com>

On Tue, 2019-11-12 at 14:04 +0100, David Hildenbrand wrote:
> > > > fact is it is still invasive, just to different parts of the mm subsystem.
> > > 
> > > I'd love to see how it uses the page isolation framework, and only has a 
> > > single hook to queue pages. I don't like the way pages are pulled out of 
> > > the buddy in Niteshs approach currently. What you have is cleaner.
> > 
> > I don't see how you could use the page isolation framework to pull out
> > free pages. Is there a thread somewhere on the topic that I missed?
> 
> It's basically only isolating pages while reporting them, and not
> pulling them out of the buddy (IOW, you move the pages to the isolate
> queues where nobody is allowed to touch them, and setting the
> migratetype properly). This e.g., makes other user of page isolation
> (e.g., memory offlining, alloc_contig_range()) play nicely with these
> isolated pages. "somebody else just isolated them, please try again."

How so? If I understand correctly there isn't anything that prevents you
from isolating an already isolated page, is there? Last I knew isolated
pages are still considered "movable" since they are still buddy pages
aren't they?

Also this seems like it would have other implications since isolating a
page kicks of the memory notifier so as a result a balloon driver would
then free the pages back out so that they could be isolated with the
assumption the region is going offline.

> start_isolate_page_range()/undo_isolate_page_range()/test_pages_isolated()
> along with a lockless check if the page is free.

Okay, that part I think I get. However doesn't all that logic more or less
ignore the watermarks? It seems like you could cause an OOM if you don't
have the necessary checks in place for that.

> I think it should be something like this (ignoring different
> migratetypes and such for now)
> 
> 1. Test lockless if page is free: Not free? Done.

So this should help to reduce the liklihood of races in the steps below.
However it might also be useful if the code had some other check to see if
it was done other than just making a pass through the bitmap.

One thing I had brought up with Nitesh was the idea of maybe doing some
sort of RCU bitmap type approach. Basically while we hold the zone lock we
could swap out the old bitmap for a new one. We could probably even keep a
counter at the start of the structure so that we could track how many bits
are actually set there. Then it becomes less likely of having a race where
you free a page and set the bit and the hinting thread tests and clears
the bit but doesn't see the freed page since it is not synchronized.
Otherwise your notification setup and reporting thread may need a few smp
barriers added where necessary.

> 2. start_isolate_page_range(): Busy? Rare race (with other isolate users

Doesn't this have the side effect of draining all the percpu caches in
order to make certain to flush the pages we isolated from there?

> or with an allocation). Done.
> 3. test_pages_isolated()

So I have reviewed the code and I don't see how this could conflict with
other callers isolating the pages. If anything it seems like if another
thread has already isolated the pages you would end up getting a false
positive, reporting the pages, and pulling them back out of isolation.

> 3a. no? Rare race, page not free anymore. undo_isolate_page_range()

I would hope it is rare. However for something like a max order page I
could easily see a piece of it having been pulled out. I would think this
case would be exceedingly expensive since you would have to put back any
pages you had previous moved into isolation.

> 3b. yes? Report, then undo_isolate_page_range()
> 
> If we would run into performance issues with the current page isolation
> implementation (esp. locking), I think there are some nice
> cleanups/reworks possible of which all current users could benefit
> (especially accross pageblocks).

To me this feels a lot like what you had for this solution near the start.
Only now instead of placing the pages into an array you are tracking a
bitmap and then using that bitmap to populate the MIGRATE_ISOLATE lists.

This sounds far more complex to me then it probably needs to be since just
holding the pages with the buddy type cleared should be enough to make
them temporarily unusable for other threads, and even in your case you are
still having to use the scatterlist in order to hold the pages and track
what you will need to undo the isolation later.





  reply	other threads:[~2019-11-12 18:34 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20191106000547.juQRi83gi%akpm@linux-foundation.org>
2019-11-06 12:16 ` + mm-introduce-reported-pages.patch added to -mm tree Michal Hocko
2019-11-06 14:09   ` David Hildenbrand
2019-11-06 16:35     ` Alexander Duyck
2019-11-06 16:54       ` Michal Hocko
2019-11-06 17:48         ` Alexander Duyck
2019-11-06 22:11           ` Mel Gorman
2019-11-06 23:38             ` David Hildenbrand
2019-11-07  0:20             ` Alexander Duyck
2019-11-07 10:20               ` Mel Gorman
2019-11-07 16:07                 ` Alexander Duyck
2019-11-08  9:43                   ` Mel Gorman
2019-11-08 16:17                     ` Alexander Duyck
2019-11-08 18:41                       ` Mel Gorman
2019-11-08 20:29                         ` Alexander Duyck
2019-11-09 14:57                           ` Mel Gorman
2019-11-10 18:03                             ` Alexander Duyck
2019-11-06 23:33           ` David Hildenbrand
2019-11-07  0:20             ` Dave Hansen
2019-11-07  0:52               ` David Hildenbrand
2019-11-07 17:12                 ` Dave Hansen
2019-11-07 17:46                   ` Michal Hocko
2019-11-07 18:08                     ` Dave Hansen
2019-11-07 18:12                     ` Alexander Duyck
2019-11-08  9:57                       ` Michal Hocko
2019-11-08 16:43                         ` Alexander Duyck
2019-11-07 18:46                   ` Qian Cai
2019-11-07 18:02             ` Alexander Duyck
2019-11-07 19:37               ` Nitesh Narayan Lal
2019-11-07 22:46                 ` Alexander Duyck
2019-11-07 22:43               ` David Hildenbrand
2019-11-08  0:42                 ` Alexander Duyck
2019-11-08  7:06                   ` David Hildenbrand
2019-11-08 17:18                     ` Alexander Duyck
2019-11-12 13:04                       ` David Hildenbrand
2019-11-12 18:34                         ` Alexander Duyck [this message]
2019-11-12 21:05                           ` David Hildenbrand
2019-11-12 22:17                             ` David Hildenbrand
2019-11-12 22:19                             ` Alexander Duyck
2019-11-12 23:10                               ` David Hildenbrand
2019-11-13  0:31                                 ` Alexander Duyck
2019-11-13 18:51                           ` Nitesh Narayan Lal
2019-11-06 16:49   ` Nitesh Narayan Lal
2019-11-11 18:52   ` Nitesh Narayan Lal
2019-11-11 22:00     ` Alexander Duyck
2019-11-12 15:19       ` Nitesh Narayan Lal
2019-11-12 16:18         ` Alexander Duyck
2019-11-13 18:39           ` Nitesh Narayan Lal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4be6114f57934eb1478f84fd1358a7fcc547b248.camel@linux.intel.com \
    --to=alexander.h.duyck@linux.intel.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=lcapitulino@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mm-commits@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=osalvador@suse.de \
    --cc=pagupta@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=riel@surriel.com \
    --cc=vbabka@suse.cz \
    --cc=wei.w.wang@intel.com \
    --cc=willy@infradead.org \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).