All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: kernel test robot <rong.a.chen@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	linux-kernel@vger.kernel.org, LKP <lkp@01.org>,
	Pavel Tatashin <pasha.tatashin@soleen.com>
Subject: Re: [LKP] efad4e475c [ 40.308255] Oops: 0000 [#1] PREEMPT SMP PTI
Date: Mon, 18 Feb 2019 09:55:10 +0100	[thread overview]
Message-ID: <20190218085510.GC7251@dhcp22.suse.cz> (raw)
In-Reply-To: <20190218070844.GC4525@dhcp22.suse.cz>

[Sorry for an excessive quoting in the previous email]
[Cc Pavel - the full report is http://lkml.kernel.org/r/20190218052823.GH29177@shao2-debian[]

On Mon 18-02-19 08:08:44, Michal Hocko wrote:
> On Mon 18-02-19 13:28:23, kernel test robot wrote:
[...]
> > [   40.305212] PGD 0 P4D 0 
> > [   40.308255] Oops: 0000 [#1] PREEMPT SMP PTI
> > [   40.313055] CPU: 1 PID: 239 Comm: udevd Not tainted 5.0.0-rc4-00149-gefad4e4 #1
> > [   40.321348] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > [   40.330813] RIP: 0010:page_mapping+0x12/0x80
> > [   40.335709] Code: 5d c3 48 89 df e8 0e ad 02 00 85 c0 75 da 89 e8 5b 5d c3 0f 1f 44 00 00 53 48 89 fb 48 8b 43 08 48 8d 50 ff a8 01 48 0f 45 da <48> 8b 53 08 48 8d 42 ff 83 e2 01 48 0f 44 c3 48 83 38 ff 74 2f 48
> > [   40.356704] RSP: 0018:ffff88801fa87cd8 EFLAGS: 00010202
> > [   40.362714] RAX: ffffffffffffffff RBX: fffffffffffffffe RCX: 000000000000000a
> > [   40.370798] RDX: fffffffffffffffe RSI: ffffffff820b9a20 RDI: ffff88801e5c0000
> > [   40.378830] RBP: 6db6db6db6db6db7 R08: ffff88801e8bb000 R09: 0000000001b64d13
> > [   40.386902] R10: ffff88801fa87cf8 R11: 0000000000000001 R12: ffff88801e640000
> > [   40.395033] R13: ffffffff820b9a20 R14: ffff88801f145258 R15: 0000000000000001
> > [   40.403138] FS:  00007fb2079817c0(0000) GS:ffff88801dd00000(0000) knlGS:0000000000000000
> > [   40.412243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   40.418846] CR2: 0000000000000006 CR3: 000000001fa82000 CR4: 00000000000006a0
> > [   40.426951] Call Trace:
> > [   40.429843]  __dump_page+0x14/0x2c0
> > [   40.433947]  is_mem_section_removable+0x24c/0x2c0
> 
> This looks like we are stumbling over an unitialized struct page again.
> Something this patch should prevent from. Could you try to apply [1]
> which will make __dump_page more robust so that we do not blow up there
> and give some more details in return.
> 
> Btw. is this reproducible all the time?

And forgot to ask whether this is reproducible with pending mmotm
patches in linux-next.

> I will have a look at the memory layout later today.

[    0.059335] No NUMA configuration found
[    0.059345] Faking a node at [mem 0x0000000000000000-0x000000001ffdffff]
[    0.059399] NODE_DATA(0) allocated [mem 0x1e8c3000-0x1e8c5fff]
[    0.073143] Zone ranges:
[    0.073175]   DMA32    [mem 0x0000000000001000-0x000000001ffdffff]
[    0.073204]   Normal   empty
[    0.073212] Movable zone start for each node
[    0.073240] Early memory node ranges
[    0.073247]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.073275]   node   0: [mem 0x0000000000100000-0x000000001ffdffff]
[    0.073309] Zeroed struct page in unavailable ranges: 98 pages
[    0.073312] Initmem setup node 0 [mem 0x0000000000001000-0x000000001ffdffff]
[    0.073343] On node 0 totalpages: 130942
[    0.073373]   DMA32 zone: 1792 pages used for memmap
[    0.073400]   DMA32 zone: 21 pages reserved
[    0.073408]   DMA32 zone: 130942 pages, LIFO batch:31

We have only a single NUMA node with a single ZONE_DMA32. But there is a
hole in the zone and the first range before the hole is not section
aligned. We do zero some unavailable ranges but from the number it is no
clear which range it is and 98. [0x60fff, 0xfffff) is 96 pages. The
patch below should tell us whether we are covering all we need. If yes
then the hole shouldn't make any difference and the problem must be
somewhere else.

---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 35fdde041f5c..c60642505e04 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6706,10 +6706,13 @@ void __init zero_resv_unavail(void)
 	pgcnt = 0;
 	for_each_mem_range(i, &memblock.memory, NULL,
 			NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, NULL) {
-		if (next < start)
+		if (next < start) {
+			pr_info("zeroying %llx-%llx\n", PFN_DOWN(next), PFN_UP(start));
 			pgcnt += zero_pfn_range(PFN_DOWN(next), PFN_UP(start));
+		}
 		next = end;
 	}
+	pr_info("zeroying %llx-%lx\n", PFN_DOWN(next), max_pfn);
 	pgcnt += zero_pfn_range(PFN_DOWN(next), max_pfn);
 
 	/*
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: lkp@lists.01.org
Subject: Re: efad4e475c [ 40.308255] Oops: 0000 [#1] PREEMPT SMP PTI
Date: Mon, 18 Feb 2019 09:55:10 +0100	[thread overview]
Message-ID: <20190218085510.GC7251@dhcp22.suse.cz> (raw)
In-Reply-To: <20190218070844.GC4525@dhcp22.suse.cz>

[-- Attachment #1: Type: text/plain, Size: 4191 bytes --]

[Sorry for an excessive quoting in the previous email]
[Cc Pavel - the full report is http://lkml.kernel.org/r/20190218052823.GH29177(a)shao2-debian[]

On Mon 18-02-19 08:08:44, Michal Hocko wrote:
> On Mon 18-02-19 13:28:23, kernel test robot wrote:
[...]
> > [   40.305212] PGD 0 P4D 0 
> > [   40.308255] Oops: 0000 [#1] PREEMPT SMP PTI
> > [   40.313055] CPU: 1 PID: 239 Comm: udevd Not tainted 5.0.0-rc4-00149-gefad4e4 #1
> > [   40.321348] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > [   40.330813] RIP: 0010:page_mapping+0x12/0x80
> > [   40.335709] Code: 5d c3 48 89 df e8 0e ad 02 00 85 c0 75 da 89 e8 5b 5d c3 0f 1f 44 00 00 53 48 89 fb 48 8b 43 08 48 8d 50 ff a8 01 48 0f 45 da <48> 8b 53 08 48 8d 42 ff 83 e2 01 48 0f 44 c3 48 83 38 ff 74 2f 48
> > [   40.356704] RSP: 0018:ffff88801fa87cd8 EFLAGS: 00010202
> > [   40.362714] RAX: ffffffffffffffff RBX: fffffffffffffffe RCX: 000000000000000a
> > [   40.370798] RDX: fffffffffffffffe RSI: ffffffff820b9a20 RDI: ffff88801e5c0000
> > [   40.378830] RBP: 6db6db6db6db6db7 R08: ffff88801e8bb000 R09: 0000000001b64d13
> > [   40.386902] R10: ffff88801fa87cf8 R11: 0000000000000001 R12: ffff88801e640000
> > [   40.395033] R13: ffffffff820b9a20 R14: ffff88801f145258 R15: 0000000000000001
> > [   40.403138] FS:  00007fb2079817c0(0000) GS:ffff88801dd00000(0000) knlGS:0000000000000000
> > [   40.412243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   40.418846] CR2: 0000000000000006 CR3: 000000001fa82000 CR4: 00000000000006a0
> > [   40.426951] Call Trace:
> > [   40.429843]  __dump_page+0x14/0x2c0
> > [   40.433947]  is_mem_section_removable+0x24c/0x2c0
> 
> This looks like we are stumbling over an unitialized struct page again.
> Something this patch should prevent from. Could you try to apply [1]
> which will make __dump_page more robust so that we do not blow up there
> and give some more details in return.
> 
> Btw. is this reproducible all the time?

And forgot to ask whether this is reproducible with pending mmotm
patches in linux-next.

> I will have a look at the memory layout later today.

[    0.059335] No NUMA configuration found
[    0.059345] Faking a node at [mem 0x0000000000000000-0x000000001ffdffff]
[    0.059399] NODE_DATA(0) allocated [mem 0x1e8c3000-0x1e8c5fff]
[    0.073143] Zone ranges:
[    0.073175]   DMA32    [mem 0x0000000000001000-0x000000001ffdffff]
[    0.073204]   Normal   empty
[    0.073212] Movable zone start for each node
[    0.073240] Early memory node ranges
[    0.073247]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.073275]   node   0: [mem 0x0000000000100000-0x000000001ffdffff]
[    0.073309] Zeroed struct page in unavailable ranges: 98 pages
[    0.073312] Initmem setup node 0 [mem 0x0000000000001000-0x000000001ffdffff]
[    0.073343] On node 0 totalpages: 130942
[    0.073373]   DMA32 zone: 1792 pages used for memmap
[    0.073400]   DMA32 zone: 21 pages reserved
[    0.073408]   DMA32 zone: 130942 pages, LIFO batch:31

We have only a single NUMA node with a single ZONE_DMA32. But there is a
hole in the zone and the first range before the hole is not section
aligned. We do zero some unavailable ranges but from the number it is no
clear which range it is and 98. [0x60fff, 0xfffff) is 96 pages. The
patch below should tell us whether we are covering all we need. If yes
then the hole shouldn't make any difference and the problem must be
somewhere else.

---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 35fdde041f5c..c60642505e04 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6706,10 +6706,13 @@ void __init zero_resv_unavail(void)
 	pgcnt = 0;
 	for_each_mem_range(i, &memblock.memory, NULL,
 			NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, NULL) {
-		if (next < start)
+		if (next < start) {
+			pr_info("zeroying %llx-%llx\n", PFN_DOWN(next), PFN_UP(start));
 			pgcnt += zero_pfn_range(PFN_DOWN(next), PFN_UP(start));
+		}
 		next = end;
 	}
+	pr_info("zeroying %llx-%lx\n", PFN_DOWN(next), max_pfn);
 	pgcnt += zero_pfn_range(PFN_DOWN(next), max_pfn);
 
 	/*
-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2019-02-18  8:55 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-18  5:28 [LKP] efad4e475c [ 40.308255] Oops: 0000 [#1] PREEMPT SMP PTI kernel test robot
2019-02-18  5:28 ` kernel test robot
2019-02-18  7:08 ` [LKP] " Michal Hocko
2019-02-18  7:08   ` Michal Hocko
2019-02-18  8:47   ` [LKP] " Rong Chen
2019-02-18  8:47     ` Rong Chen
2019-02-18  9:03     ` [LKP] " Michal Hocko
2019-02-18  9:03       ` Michal Hocko
2019-02-18  9:11       ` [LKP] " Rong Chen
2019-02-18  9:11         ` Rong Chen
2019-02-18  9:29         ` [LKP] " Michal Hocko
2019-02-18  9:29           ` Michal Hocko
2019-02-18  8:55   ` Michal Hocko [this message]
2019-02-18  8:55     ` Michal Hocko
2019-02-18 10:01     ` [LKP] " Rong Chen
2019-02-18 10:01       ` Rong Chen
2019-02-18 10:30       ` [LKP] " Michal Hocko
2019-02-18 10:30         ` Michal Hocko
2019-02-18 14:05         ` [LKP] " Mike Rapoport
2019-02-18 15:20           ` Michal Hocko
2019-02-18 15:20             ` Michal Hocko
2019-02-18 15:22             ` [LKP] " Michal Hocko
2019-02-18 15:22               ` Michal Hocko
2019-02-18 16:48               ` [LKP] " Mike Rapoport
2019-02-18 17:05                 ` Michal Hocko
2019-02-18 17:05                   ` Michal Hocko
2019-02-18 17:48                   ` [LKP] " Mike Rapoport
2019-02-18 17:57                   ` Matthew Wilcox
2019-02-18 17:57                     ` Matthew Wilcox
2019-02-18 18:11                     ` [LKP] " Michal Hocko
2019-02-18 18:11                       ` Michal Hocko
2019-02-18 19:05                       ` [LKP] " Matthew Wilcox
2019-02-18 19:05                         ` Matthew Wilcox
2019-02-18 18:15 ` [RFC PATCH] mm, memory_hotplug: fix off-by-one in is_pageblock_removable Michal Hocko
2019-02-18 18:15   ` Michal Hocko
2019-02-18 18:31   ` Mike Rapoport
2019-02-20  8:33   ` Oscar Salvador
2019-02-20  8:33     ` Oscar Salvador
2019-02-20 12:57   ` Michal Hocko
2019-02-20 12:57     ` Michal Hocko
2019-02-21  3:18     ` [LKP] " Rong Chen
2019-02-21  3:18       ` Rong Chen
2019-02-21  7:25       ` [LKP] " Michal Hocko
2019-02-21  7:25         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190218085510.GC7251@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@01.org \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=rong.a.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.