From: Mel Gorman <mgorman@techsingularity.net>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>, Qian Cai <cai@lca.pw>,
	linux-mm@kvack.org, vbabka@suse.cz
Subject: Re: kernel BUG at include/linux/mm.h:1020!
Date: Mon, 25 Mar 2019 10:58:56 +0000
Message-ID: <20190325105856.GI3189@techsingularity.net>
In-Reply-To: <CABXGCsMG+oCTxiEv1vmiK0P+fvr7ZiuOsbX-GCE13gapcRi5-Q@mail.gmail.com>

On Sat, Mar 23, 2019 at 09:40:04AM +0500, Mikhail Gavrilov wrote:
> >         /*
> >          * Only clear the hint if a sample indicates there is either a
> >          * free page or an LRU page in the block. One or other condition
> >          * is necessary for the block to be a migration source/target.
> >          */
> > -       block_pfn = pageblock_start_pfn(pfn);
> > -       pfn = max(block_pfn, zone->zone_start_pfn);
> > -       page = pfn_to_page(pfn);
> > -       if (zone != page_zone(page))
> > -               return false;
> > -       pfn = block_pfn + pageblock_nr_pages;
> > -       pfn = min(pfn, zone_end_pfn(zone));
> > -       end_page = pfn_to_page(pfn);
> > -
> >         do {
> >                 if (pfn_valid_within(pfn)) {
> >                         if (check_source && PageLRU(page)) {
> 
> Unfortunately this patch didn't help either.
> 
> kernel log: https://pastebin.com/RHhmXPM2
> 

Ok, it's a pity that we don't know which PFN that page corresponds to.
Specifically, it would be interesting to know whether the PFN falls in a
memory hole, as DMA32 on your machine has a number of gaps. What I'm
wondering is whether, when the reinit fails to find good starting points,
it picks a PFN that corresponds to an uninitialised page and trips up
later.
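
To make the concern concrete, here is a minimal userspace sketch of the
kind of check in question. It is not kernel code and not the patch: every
name in it (model_zone, model_pfn_online, model_block_ends_online, the
512-page pageblock size) is hypothetical and chosen for illustration. It
clamps the pageblock containing a PFN to the zone span and reports whether
both ends of the clamped range are backed by an initialised page. As a
modelling choice, it tests the last PFN inside the block rather than the
PFN one past it.

```c
/* Userspace model only; all names and sizes here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGEBLOCK_NR_PAGES 512UL	/* assumed pageblock size in pages */

struct model_zone {
	unsigned long start_pfn;	/* first PFN spanned by the zone */
	unsigned long end_pfn;		/* one past the last PFN spanned */
	const bool *online;		/* online[i] describes PFN start_pfn + i */
};

/* Round a PFN down to the first PFN of its pageblock. */
static unsigned long model_pageblock_start(unsigned long pfn)
{
	return pfn & ~(MODEL_PAGEBLOCK_NR_PAGES - 1);
}

/* True if the PFN is inside the zone span and marked online (not a hole). */
static bool model_pfn_online(const struct model_zone *z, unsigned long pfn)
{
	if (pfn < z->start_pfn || pfn >= z->end_pfn)
		return false;
	return z->online[pfn - z->start_pfn];
}

/*
 * Clamp the pageblock containing pfn to the zone span, store the first
 * and last PFN of the clamped range, and return true only if both of
 * those PFNs are online. A false return means a scan over the range
 * would start or finish in a hole.
 */
bool model_block_ends_online(const struct model_zone *z, unsigned long pfn,
			     unsigned long *first, unsigned long *last)
{
	unsigned long start, end;

	assert(z->start_pfn < z->end_pfn);
	assert(pfn >= z->start_pfn && pfn < z->end_pfn);

	/* Start of the pageblock, but never below the start of the zone. */
	start = model_pageblock_start(pfn);
	if (start < z->start_pfn)
		start = z->start_pfn;

	/* End of the pageblock (exclusive), but never past the zone end. */
	end = model_pageblock_start(pfn) + MODEL_PAGEBLOCK_NR_PAGES;
	if (end > z->end_pfn)
		end = z->end_pfn;

	*first = start;
	*last = end - 1;
	return model_pfn_online(z, *first) && model_pfn_online(z, *last);
}
```

In this model, a zone whose online map has a gap at the tail of a
pageblock makes model_block_ends_online() return false for any PFN in
that block, which is the situation where a cached scanner position could
otherwise be left pointing at a page that was never initialised.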

Can you try again with this patch please? It replaces the failed patch
entirely.

Thanks.

diff --git a/mm/compaction.c b/mm/compaction.c
index f171a83707ce..caac4b07eb33 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -242,6 +242,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
 							bool check_target)
 {
 	struct page *page = pfn_to_online_page(pfn);
+	struct page *block_page;
 	struct page *end_page;
 	unsigned long block_pfn;
 
@@ -267,20 +268,26 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
 	    get_pageblock_migratetype(page) != MIGRATE_MOVABLE)
 		return false;
 
+	/* Ensure the start of the pageblock or zone is online and valid */
+	block_pfn = pageblock_start_pfn(pfn);
+	block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
+	if (block_page) {
+		page = block_page;
+		pfn = block_pfn;
+	}
+
+	/* Ensure the end of the pageblock or zone is online and valid */
+	block_pfn += pageblock_nr_pages;
+	block_pfn = min(block_pfn, zone_end_pfn(zone));
+	end_page = pfn_to_online_page(block_pfn);
+	if (!end_page)
+		return false;
+
 	/*
 	 * Only clear the hint if a sample indicates there is either a
 	 * free page or an LRU page in the block. One or other condition
 	 * is necessary for the block to be a migration source/target.
 	 */
-	block_pfn = pageblock_start_pfn(pfn);
-	pfn = max(block_pfn, zone->zone_start_pfn);
-	page = pfn_to_page(pfn);
-	if (zone != page_zone(page))
-		return false;
-	pfn = block_pfn + pageblock_nr_pages;
-	pfn = min(pfn, zone_end_pfn(zone));
-	end_page = pfn_to_page(pfn);
-
 	do {
 		if (pfn_valid_within(pfn)) {
 			if (check_source && PageLRU(page)) {
@@ -320,6 +327,16 @@ static void __reset_isolation_suitable(struct zone *zone)
 
 	zone->compact_blockskip_flush = false;
 
+
+	/*
+	 * Re-init the scanners and attempt to find a better starting
+	 * position below. This may result in redundant scanning if
+	 * a better position is not found but it avoids the corner
+	 * case whereby the cached PFNs are left in a memory hole with
+	 * no proper struct page backing it.
+	 */
+	reset_cached_positions(zone);
+
 	/*
 	 * Walk the zone and update pageblock skip information. Source looks
 	 * for PageLRU while target looks for PageBuddy. When the scanner
@@ -349,13 +366,6 @@ static void __reset_isolation_suitable(struct zone *zone)
 			zone->compact_cached_free_pfn = reset_free;
 		}
 	}
-
-	/* Leave no distance if no suitable block was reset */
-	if (reset_migrate >= reset_free) {
-		zone->compact_cached_migrate_pfn[0] = migrate_pfn;
-		zone->compact_cached_migrate_pfn[1] = migrate_pfn;
-		zone->compact_cached_free_pfn = free_pfn;
-	}
 }
 
 void reset_isolation_suitable(pg_data_t *pgdat)

-- 
Mel Gorman
SUSE Labs

