LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Michael Ellerman <mpe@ellerman.id.au>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: drop migrate type checks from has_unmovable_pages
Date: Tue, 24 Oct 2017 13:44:23 +0900
Message-ID: <20171024044423.GA31424@js1304-P5Q-DELUXE> (raw)
In-Reply-To: <20171023081009.7fyz3gfrmurvj635@dhcp22.suse.cz>

On Mon, Oct 23, 2017 at 10:10:09AM +0200, Michal Hocko wrote:
> On Mon 23-10-17 14:23:09, Joonsoo Kim wrote:
> > On Fri, Oct 20, 2017 at 09:02:20AM +0200, Michal Hocko wrote:
> > > On Fri 20-10-17 15:50:14, Joonsoo Kim wrote:
> > > > On Fri, Oct 20, 2017 at 07:59:22AM +0200, Michal Hocko wrote:
> > > > > On Fri 20-10-17 11:13:29, Joonsoo Kim wrote:
> > > > > > On Thu, Oct 19, 2017 at 02:21:18PM +0200, Michal Hocko wrote:
> > > > > > > On Thu 19-10-17 10:20:41, Michal Hocko wrote:
> > > > > > > > On Thu 19-10-17 16:33:56, Joonsoo Kim wrote:
> > > > > > > > > On Thu, Oct 19, 2017 at 09:15:03AM +0200, Michal Hocko wrote:
> > > > > > > > > > On Thu 19-10-17 11:51:11, Joonsoo Kim wrote:
> > > > > > > > [...]
> > > > > > > > > > > Hello,
> > > > > > > > > > > 
> > > > > > > > > > > This patch will break the CMA user. As you mentioned, CMA allocation
> > > > > > > > > > > itself isn't migrateable. So, after a single page is allocated through
> > > > > > > > > > > CMA allocation, has_unmovable_pages() will return true for this
> > > > > > > > > > > pageblock. Then, futher CMA allocation request to this pageblock will
> > > > > > > > > > > fail because it requires isolating the pageblock.
> > > > > > > > > > 
> > > > > > > > > > Hmm, does this mean that the CMA allocation path depends on
> > > > > > > > > > has_unmovable_pages to return false here even though the memory is not
> > > > > > > > > > movable? This sounds really strange to me and kind of abuse of this
> > > > > > > > > 
> > > > > > > > > Your understanding is correct. Perhaps, abuse or wrong function name.
> > > > > > > > >
> > > > > > > > > > function. Which path is that? Can we do the migrate type test theres?
> > > > > > > > > 
> > > > > > > > > alloc_contig_range() -> start_isolate_page_range() ->
> > > > > > > > > set_migratetype_isolate() -> has_unmovable_pages()
> > > > > > > > 
> > > > > > > > I see. It seems that the CMA and memory hotplug have a very different
> > > > > > > > view on what should happen during isolation.
> > > > > > > >  
> > > > > > > > > We can add one argument, 'XXX' to set_migratetype_isolate() and change
> > > > > > > > > it to check migrate type rather than has_unmovable_pages() if 'XXX' is
> > > > > > > > > specified.
> > > > > > > > 
> > > > > > > > Can we use the migratetype argument and do the special thing for
> > > > > > > > MIGRATE_CMA? Like the following diff?
> > > > > > > 
> > > > > > > And with the full changelog.
> > > > > > > ---
> > > > > > > >From 8cbd811d741f5dd93d1b21bb3ef94482a4d0bd32 Mon Sep 17 00:00:00 2001
> > > > > > > From: Michal Hocko <mhocko@suse.com>
> > > > > > > Date: Thu, 19 Oct 2017 14:14:02 +0200
> > > > > > > Subject: [PATCH] mm: distinguish CMA and MOVABLE isolation in
> > > > > > >  has_unmovable_pages
> > > > > > > 
> > > > > > > Joonsoo has noticed that "mm: drop migrate type checks from
> > > > > > > has_unmovable_pages" would break CMA allocator because it relies on
> > > > > > > has_unmovable_pages returning false even for CMA pageblocks which in
> > > > > > > fact don't have to be movable:
> > > > > > > alloc_contig_range
> > > > > > >   start_isolate_page_range
> > > > > > >     set_migratetype_isolate
> > > > > > >       has_unmovable_pages
> > > > > > > 
> > > > > > > This is a result of the code sharing between CMA and memory hotplug
> > > > > > > while each one has a different idea of what has_unmovable_pages should
> > > > > > > return. This is unfortunate but fixing it properly would require a lot
> > > > > > > of code duplication.
> > > > > > > 
> > > > > > > Fix the issue by introducing the requested migrate type argument
> > > > > > > and special case MIGRATE_CMA case where CMA page blocks are handled
> > > > > > > properly. This will work for memory hotplug because it requires
> > > > > > > MIGRATE_MOVABLE.
> > > > > > 
> > > > > > Unfortunately, alloc_contig_range() can be called with
> > > > > > MIGRATE_MOVABLE so this patch cannot perfectly fix the problem.
> > > > > 
> > > > > Yes, alloc_contig_range can be called with MIGRATE_MOVABLE but my
> > > > > understanding is that only CMA allocator really depends on this weird
> > > > > semantic and that does MIGRATE_CMA unconditionally.
> > > > 
> > > > alloc_contig_range() could be called for partial pages in the
> > > > pageblock. With your patch, this case also fails unnecessarilly if the
> > > > other pages in the pageblock is pinned.
> > > 
> > > Is this really the case for GB pages? Do we really want to mess those
> > 
> > No, but, as I mentioned already, this API can be called with less
> > pages. I know that there is no user with less pages at this moment but
> > I cannot see any point to reduce this API's capability.
> 
> I am still confused. So when exactly would you want to use this api for
> MIGRATE_MOVABLE and use a partial MIGRATE_CMA pageblock?
> 
> > > with CMA blocks and make those blocks basically unusable because GB
> > > pages are rarely (if at all migrateable)?
> > > 
> > > > Until now, there is no user calling alloc_contig_range() with partial
> > > > pages except CMA allocator but API could support it.
> > > 
> > > I disagree. If this is a CMA thing it should stay that way. The semantic
> > > is quite confusing already, please let's not make it even worse.
> > 
> > It is already used by other component.
> > 
> > I'm not sure what is the confusing semantic you mentioned. I think
> > that set_migratetype_isolate() has confusing semantic and should be
> > fixed since making the pageblock isolated doesn't need to check if
> > there is unmovable page or not. Do you think that
> > set_migratetype_isolate() need to check it? If so, why?
> 
> My intuitive understanding of set_migratetype_isolate is that it either
> suceeds and that means that the given pfn range can be isolated for the
> given type of allocation (be it movable or cma). No new pages will be
> allocated from this range to allow converging into a free range in a
> finit amount of time. At least this is how the hotplug code would like
> to use it and I suppose that the alloc_contig_range would like to
> guarantee the same to not rely on a fixed amount of migration attempts.

Yes, alloc_contig_range() also want to guarantee the similar thing.
Major difference between them is 'given pfn range'. memory hotplug
works by pageblock unit but alloc_contig_range() doesn't.
alloc_contig_range() works by the page unit. However, there is no easy
way to isolate individual page so it uses pageblock isolation
regardless of 'given pfn range'. In this case, checking movability of
all pages on the pageblock would cause the problem as I mentioned
before.

> 
> > > > > > I did a more thinking and found that it's strange to check if there is
> > > > > > unmovable page in the pageblock during the set_migratetype_isolate().
> > > > > > set_migratetype_isolate() should be just for setting the migratetype
> > > > > > of the pageblock. Checking other things should be done by another
> > > > > > place, for example, before calling the start_isolate_page_range() in
> > > > > > __offline_pages().
> > > > > 
> > > > > How do we guarantee the atomicity?
> > > > 
> > > > What atomicity do you mean?
> > > 
> > > Currently we are checking and isolating pages under zone lock. If we
> > > split that we are losing atomicity, aren't we.
> > 
> > I think that it can be done easily.
> > 
> > set_migratetype_isolate() {
> >         lock
> >         __set_migratetype_isolate();
> >         unlock
> > }
> > 
> > set_migratetype_isolate_if_no_unmovable_pages() {
> >         lock
> >         if (has_unmovable_pages())
> >                 fail
> >         else
> >                 __set_migratetype_isolate()
> >         unlock
> > }
> 
> So you are essentially suggesting to split the API for
> alloc_contig_range and hotplug users? Care to send a patch? It is not
> like I would really love this but I would really like to have this issue
> addressed because I really do want all other patches which depend on
> this to be merged in the next release cycle.
> 
> That being said, I would much rather see MIGRATE_CMA case special cased
> than duplicate the already confusing API but I will not insist of
> course.

Okay. I atteach the patch. Andrew, could you revert Michal's series
and apply this patch first? Perhaps, Michal will resend his series on
top of this one.

Thanks.


--------------->8-------------------
>From e8e6215c4cdadf7f4df7c420349750412d89e99b Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date: Tue, 24 Oct 2017 11:38:22 +0900
Subject: [PATCH] mm/page_isolation: separate the pageblock isolation function

There are two users who use pageblock isolation function,
alloc_contig_range() and memory hotplug. Each one has different purpose
on isolation so they should be treated separately. For example,
alloc_contig_range() doesn't require that all pages on the pageblock are
movable because it could just needs part of pages on the pageblock.
But, memory hotplug does since memory offline works for pageblock unit
or more. Currently, they are distiniguished by migratetype of
the target pageblock but it causes a problem on memory hotplug
so it's better to separate the function completely at this moment.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/page-isolation.h |  3 +-
 mm/page_alloc.c                |  9 +++--
 mm/page_isolation.c            | 76 +++++++++++++++++++++++++++++-------------
 3 files changed, 57 insertions(+), 31 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index d4cd201..614dc00 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -29,8 +29,7 @@ static inline bool is_migrate_isolate(int migratetype)
 }
 #endif
 
-bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-                        bool skip_hwpoisoned_pages);
+bool has_unremovable_pages(struct zone *zone, struct page *page, int count);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
                                int migratetype, int *num_movable);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1008c58..04f3b36 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7373,7 +7373,7 @@ void *__init alloc_large_system_hash(const char *tablename,
 }
 
 /*
- * This function checks whether pageblock includes unmovable pages or not.
+ * This function checks whether pageblock includes unremovable pages or not.
  * If @count is not zero, it is okay to include less @count unmovable pages
  *
  * PageLRU check without isolation or lru_lock could race so that
@@ -7381,8 +7381,7 @@ void *__init alloc_large_system_hash(const char *tablename,
  * check without lock_page also may miss some movable non-lru pages at
  * race condition. So you can't expect this function should be exact.
  */
-bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-                        bool skip_hwpoisoned_pages)
+bool has_unremovable_pages(struct zone *zone, struct page *page, int count)
 {
        unsigned long pfn, iter, found;
        int mt;
@@ -7432,7 +7431,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
                 * The HWPoisoned page may be not in buddy system, and
                 * page_count() is not 0.
                 */
-               if (skip_hwpoisoned_pages && PageHWPoison(page))
+               if (PageHWPoison(page))
                        continue;
 
                if (__PageMovable(page))
@@ -7479,7 +7478,7 @@ bool is_pageblock_removable_nolock(struct page *page)
        if (!zone_spans_pfn(zone, pfn))
                return false;
 
-       return !has_unmovable_pages(zone, page, 0, true);
+       return !has_unremovable_pages(zone, page, 0);
 }
 
 #if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || defined(CONFIG_CMA)
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 757410d..1650e01 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -14,8 +14,38 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/page_isolation.h>
 
-static int set_migratetype_isolate(struct page *page,
-                               bool skip_hwpoisoned_pages)
+/* Should hold the zone lock */
+static void __set_migratetype_isolate(struct zone *zone,
+                               struct page *page, int mt)
+{
+       unsigned long nr_pages;
+
+       set_pageblock_migratetype(page, MIGRATE_ISOLATE);
+       zone->nr_isolate_pageblock++;
+       nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE, NULL);
+       __mod_zone_freepage_state(zone, -nr_pages, mt);
+}
+
+static int isolate_pageblock(struct page *page, int mt)
+{
+       struct zone *zone = page_zone(page);
+       unsigned long flags;
+
+       spin_lock_irqsave(&zone->lock, flags);
+       if (get_pageblock_migratetype(page) != mt) {
+               spin_unlock_irqrestore(&zone->lock, flags);
+               return -EBUSY;
+       }
+
+       __set_migratetype_isolate(zone, page, mt);
+       spin_unlock_irqrestore(&zone->lock, flags);
+
+       drain_all_pages(zone);
+
+       return 0;
+}
+
+static int isolate_pageblock_for_offline(struct page *page)
 {
        struct zone *zone;
        unsigned long flags, pfn;
@@ -46,33 +76,22 @@ static int set_migratetype_isolate(struct page *page,
        notifier_ret = memory_isolate_notify(MEM_ISOLATE_COUNT, &arg);
        notifier_ret = notifier_to_errno(notifier_ret);
        if (notifier_ret)
-               goto out;
+               goto err;
        /*
         * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
         * We just check MOVABLE pages.
         */
-       if (!has_unmovable_pages(zone, page, arg.pages_found,
-                                skip_hwpoisoned_pages))
-               ret = 0;
-
        /*
         * immobile means "not-on-lru" pages. If immobile is larger than
         * removable-by-driver pages reported by notifier, we'll fail.
         */
+       if (has_unremovable_pages(zone, page, arg.pages_found))
+               goto err;
 
-out:
-       if (!ret) {
-               unsigned long nr_pages;
-               int migratetype = get_pageblock_migratetype(page);
-
-               set_pageblock_migratetype(page, MIGRATE_ISOLATE);
-               zone->nr_isolate_pageblock++;
-               nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE,
-                                                                       NULL);
-
-               __mod_zone_freepage_state(zone, -nr_pages, migratetype);
-       }
+       __set_migratetype_isolate(zone, page, get_pageblock_migratetype(page));
+       ret = 0;
 
+err:
        spin_unlock_irqrestore(&zone->lock, flags);
        if (!ret)
                drain_all_pages(zone);
@@ -159,6 +178,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * @start_pfn: The lower PFN of the range to be isolated.
  * @end_pfn: The upper PFN of the range to be isolated.
  * @migratetype: migrate type to set in error recovery.
+ * @for_memory_offline: The purpose of the isolation
  *
  * Making page-allocation-type to be MIGRATE_ISOLATE means free pages in
  * the range will never be allocated. Any free pages and pages freed in the
@@ -168,7 +188,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * Returns 0 on success and -EBUSY if any part of range cannot be isolated.
  */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-                            unsigned migratetype, bool skip_hwpoisoned_pages)
+                            unsigned int migratetype, bool for_memory_offline)
 {
        unsigned long pfn;
        unsigned long undo_pfn;
@@ -181,11 +201,19 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
             pfn < end_pfn;
             pfn += pageblock_nr_pages) {
                page = __first_valid_page(pfn, pageblock_nr_pages);
-               if (page &&
-                   set_migratetype_isolate(page, skip_hwpoisoned_pages)) {
-                       undo_pfn = pfn;
-                       goto undo;
+               if (!page)
+                       continue;
+
+               if (for_memory_offline) {
+                       if (!isolate_pageblock_for_offline(page))
+                               continue;
+               } else {
+                       if (!isolate_pageblock(page, migratetype))
+                               continue;
                }
+
+               undo_pfn = pfn;
+               goto undo;
        }
        return 0;
 undo:
-- 
2.7.4

  reply index

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-18  7:08 [PATCH v2 0/2] mm, memory_hotplug: redefine memory offline retry logic Michal Hocko
2017-09-18  7:08 ` [PATCH 1/2] mm, memory_hotplug: do not fail offlining too early Michal Hocko
2017-10-10 12:05   ` Michael Ellerman
2017-10-10 12:27     ` Michal Hocko
2017-10-11  2:37       ` Michael Ellerman
2017-10-11  5:19         ` Michael Ellerman
2017-10-11 14:05           ` Anshuman Khandual
2017-10-11 14:16             ` Michal Hocko
2017-10-11  6:51         ` Michal Hocko
2017-10-11  8:04           ` Vlastimil Babka
2017-10-11  8:13             ` Michal Hocko
2017-10-11 11:17               ` Vlastimil Babka
2017-10-11 11:24                 ` Michal Hocko
2017-10-13 11:42             ` Michael Ellerman
2017-10-13 11:58               ` Michal Hocko
2017-10-13 12:00                 ` [PATCH 1/2] mm: drop migrate type checks from has_unmovable_pages Michal Hocko
2017-10-13 12:00                   ` [PATCH 2/2] mm, page_alloc: fail has_unmovable_pages when seeing reserved pages Michal Hocko
2017-10-13 12:04                     ` Vlastimil Babka
2017-10-13 12:07                       ` Michal Hocko
2017-10-17 13:03                         ` Vlastimil Babka
2017-10-17 11:41                   ` [PATCH 1/2] mm: drop migrate type checks from has_unmovable_pages Michael Ellerman
2017-10-17 12:03                     ` Michal Hocko
2017-10-17 13:02                   ` Vlastimil Babka
2017-10-19  2:51                   ` Joonsoo Kim
2017-10-19  7:15                     ` Michal Hocko
2017-10-19  7:33                       ` Joonsoo Kim
2017-10-19  8:20                         ` Michal Hocko
2017-10-19 12:21                           ` Michal Hocko
2017-10-20  2:13                             ` Joonsoo Kim
2017-10-20  5:59                               ` Michal Hocko
2017-10-20  6:50                                 ` Joonsoo Kim
2017-10-20  7:02                                   ` Michal Hocko
2017-10-23  5:23                                     ` Joonsoo Kim
2017-10-23  8:10                                       ` Michal Hocko
2017-10-24  4:44                                         ` Joonsoo Kim [this message]
2017-10-24  7:44                                           ` Michal Hocko
2017-10-24  8:12                                           ` Vlastimil Babka
2017-10-24 12:25                                             ` Michal Hocko
2017-10-26  2:47                                             ` Joonsoo Kim
2017-10-26  7:41                                               ` Michal Hocko
2017-10-20  7:22                               ` Xishi Qiu
2017-10-20  8:17                                 ` Michal Hocko
2017-10-23  5:26                                   ` Joonsoo Kim
2017-10-26 13:04                             ` Vlastimil Babka
2017-10-26 13:59                             ` Michal Hocko
2017-09-18  7:08 ` [PATCH 2/2] mm, memory_hotplug: remove timeout from __offline_memory Michal Hocko
     [not found] <AM3PR04MB14892A9D6D2FBCE21B8C1F0FF12B0@AM3PR04MB1489.eurprd04.prod.outlook.com>
2017-11-13  7:33 ` [PATCH 1/2] mm: drop migrate type checks from has_unmovable_pages Ran Wang
2017-11-13 11:02   ` Michal Hocko
2017-11-14  6:10     ` Ran Wang
2017-11-14  7:06       ` Michal Hocko
2017-11-14  7:45         ` Ran Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171024044423.GA31424@js1304-P5Q-DELUXE \
    --to=iamjoonsoo.kim@lge.com \
    --cc=akpm@linux-foundation.org \
    --cc=arbab@linux.vnet.ibm.com \
    --cc=imammedo@redhat.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=qiuxishi@huawei.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=yasu.isimatu@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git