From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org, Michal Hocko <mhocko@suse.com>
Subject: Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9
Date: Tue, 18 Feb 2020 14:01:46 +0300
Message-ID: <0ba2a3c6-6593-2cee-1cef-983cd75f920f@virtuozzo.com>
In-Reply-To: <cf6be5f5-4bbc-0d34-fb64-33fd37bc48d9@virtuozzo.com>

On 18.02.2020 13:50, Kirill Tkhai wrote:
> Hi, Sachin,
> 
> On 18.02.2020 13:45, Sachin Sant wrote:
>> Today's next fails to boot on a POWER9 PowerVM logical partition
>> with the following trace:
>>
>> [    8.767660] random: systemd: uninitialized urandom read (16 bytes read)
>> [    8.768629] BUG: Kernel NULL pointer dereference on read at 0x000073b0
>> [    8.768635] Faulting instruction address: 0xc0000000003d55f4
>> [    8.768641] Oops: Kernel access of bad area, sig: 11 [#1]
>> [    8.768645] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> [    8.768650] Modules linked in:
>> [    8.768655] CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1
>> [    8.768660] NIP:  c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000
>> [    8.768666] REGS: c0000008b37836d0 TRAP: 0300   Not tainted  (5.6.0-rc2-next-20200218-autotest)
>> [    8.768671] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004844  XER: 00000000
>> [    8.768679] CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1
>> [    8.768679] GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500
>> [    8.768679] GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620
>> [    8.768679] GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000
>> [    8.768679] GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000
>> [    8.768679] GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002
>> [    8.768679] GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122
>> [    8.768679] GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8
>> [    8.768679] GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180
>> [    8.768727] NIP [c0000000003d55f4] ___slab_alloc+0x1f4/0x760
>> [    8.768732] LR [c0000000003d5b94] __slab_alloc+0x34/0x60
>> [    8.768735] Call Trace:
>> [    8.768739] [c0000008b3783960] [c0000000003d5734] ___slab_alloc+0x334/0x760 (unreliable)
>> [    8.768745] [c0000008b3783a40] [c0000000003d5b94] __slab_alloc+0x34/0x60
>> [    8.768751] [c0000008b3783a70] [c0000000003d6fa0] __kmalloc_node+0x110/0x490
>> [    8.768757] [c0000008b3783af0] [c0000000003443d8] kvmalloc_node+0x58/0x110
>> [    8.768763] [c0000008b3783b30] [c0000000003fee38] mem_cgroup_css_online+0x108/0x270
>> [    8.768769] [c0000008b3783b90] [c000000000235aa8] online_css+0x48/0xd0
>> [    8.768775] [c0000008b3783bc0] [c00000000023eaec] cgroup_apply_control_enable+0x2ec/0x4d0
>> [    8.768781] [c0000008b3783ca0] [c000000000242318] cgroup_mkdir+0x228/0x5f0
>> [    8.768786] [c0000008b3783d10] [c00000000051e170] kernfs_iop_mkdir+0x90/0xf0
>> [    8.768792] [c0000008b3783d50] [c00000000043dc00] vfs_mkdir+0x110/0x230
>> [    8.768797] [c0000008b3783da0] [c000000000441c90] do_mkdirat+0xb0/0x1a0
>> [    8.768804] [c0000008b3783e20] [c00000000000b278] system_call+0x5c/0x68
>> [    8.768808] Instruction dump:
>> [    8.768811] 7c421378 e95f0000 714a0001 4082fff0 4bffff64 60000000 60000000 faa10088
>> [    8.768818] 3ea2000c 3ab57070 7b4a1f24 7d55502a <e94a73b0> 2faa0000 409e0394 3d02002a
>> [    8.768826] ---[ end trace 631af2cb73507891 ]---
>> [    8.770876]
>> [    9.770887] Kernel panic - not syncing: Fatal exception
>>
>> Bisect reveals the problem was introduced in next-20200217 by the following commit:
>>
>> commit a75056fc1e7c 
>> mm/memcontrol.c: allocate shrinker_map on appropriate NUMA node
>>
>> I can boot the kernel successfully if the patch is reverted. 
> 
> 
> Could you please test your boot with the original patch from here:
> 
> https://patchwork.kernel.org/patch/11360007/

After you have tried the above patch in place of the problem patch,
please run one more test: apply the patch below on top of current linux-next.
Then let me know which of the two patches makes your kernel bootable again.

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 63bb6a2aab81..7b9b48dcbc60 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -334,7 +334,7 @@ static int memcg_expand_one_shrinker_map(struct mem_cgroup *memcg,
 		if (!old)
 			return 0;
 
-		new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid);
+		new = kmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid);
 		if (!new)
 			return -ENOMEM;
 
@@ -378,7 +378,7 @@ static int memcg_alloc_shrinker_maps(struct mem_cgroup *memcg)
 	mutex_lock(&memcg_shrinker_map_mutex);
 	size = memcg_shrinker_map_size;
 	for_each_node(nid) {
-		map = kvzalloc_node(sizeof(*map) + size, GFP_KERNEL, nid);
+		map = kzalloc_node(sizeof(*map) + size, GFP_KERNEL, nid);
 		if (!map) {
 			memcg_free_shrinker_maps(memcg);
 			ret = -ENOMEM;


