From: Xie XiuQi <xiexiuqi@huawei.com>
To: <catalin.marinas@arm.com>, <will.deacon@arm.com>, <bhelgaas@google.com>, <gregkh@linuxfoundation.org>, <rafael.j.wysocki@intel.com>, <jarkko.sakkinen@linux.intel.com>
Cc: <linux-arm-kernel@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <guohanjun@huawei.com>, <wanghuiqiang@huawei.com>, <tnowicki@caviumnetworks.com>
Subject: [PATCH 2/2] drivers: check numa node's online status in dev_to_node
Date: Thu, 31 May 2018 20:14:39 +0800
Message-ID: <1527768879-88161-3-git-send-email-xiexiuqi@huawei.com>
In-Reply-To: <1527768879-88161-1-git-send-email-xiexiuqi@huawei.com>

If dev->numa_node is not available (or offline), we should return
NUMA_NO_NODE to prevent allocating memory on offline nodes, which
could cause an oops.

For example, a numa node:
  1) without memory
  2) NR_CPUS is very small, and the cpus on the node are not brought up

[ 27.851041] Unable to handle kernel NULL pointer dereference at virtual address 00001988
[ 27.859128] Mem abort info:
[ 27.861908]   ESR = 0x96000005
[ 27.864949]   Exception class = DABT (current EL), IL = 32 bits
[ 27.870860]   SET = 0, FnV = 0
[ 27.873900]   EA = 0, S1PTW = 0
[ 27.877029] Data abort info:
[ 27.879895]   ISV = 0, ISS = 0x00000005
[ 27.883716]   CM = 0, WnR = 0
[ 27.886673] [0000000000001988] user address but active_mm is swapper
[ 27.893012] Internal error: Oops: 96000005 [#1] SMP
[ 27.897876] Modules linked in:
[ 27.900919] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.17.0-rc6-mpam+ #116
[ 27.907865] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 EC UEFI Nemo 2.0 RC0 - B306 05/28/2018
[ 27.916983] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[ 27.921763] pc : __alloc_pages_nodemask+0xf0/0xe70
[ 27.926540] lr : __alloc_pages_nodemask+0x184/0xe70
[ 27.931403] sp : ffff00000996f7e0
[ 27.934704] x29: ffff00000996f7e0 x28: ffff000008cb10a0
[ 27.940003] x27: 00000000014012c0 x26: 0000000000000000
[ 27.945301] x25: 0000000000000003 x24: ffff0000085bbc14
[ 27.950600] x23: 0000000000400000 x22: 0000000000000000
[ 27.955898] x21: 0000000000000001 x20: 0000000000000000
[ 27.961196] x19: 0000000000400000 x18: 0000000000000f00
[ 27.966494] x17: 00000000003bff88 x16: 0000000000000020
[ 27.971792] x15: 000000000000003b x14: ffffffffffffffff
[ 27.977090] x13: ffffffffffff0000 x12: 0000000000000030
[ 27.982388] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 27.987686] x9 : 2e64716e622e7364 x8 : 7f7f7f7f7f7f7f7f
[ 27.992984] x7 : 0000000000000000 x6 : ffff000008d73c08
[ 27.998282] x5 : 0000000000000000 x4 : 0000000000000081
[ 28.003580] x3 : 0000000000000000 x2 : 0000000000000000
[ 28.008878] x1 : 0000000000000001 x0 : 0000000000001980
[ 28.014177] Process swapper/0 (pid: 1, stack limit = 0x (ptrval))
[ 28.020863] Call trace:
[ 28.023296]  __alloc_pages_nodemask+0xf0/0xe70
[ 28.027727]  allocate_slab+0x94/0x590
[ 28.031374]  new_slab+0x68/0xc8
[ 28.034502]  ___slab_alloc+0x444/0x4f8
[ 28.038237]  __slab_alloc+0x50/0x68
[ 28.041713]  __kmalloc_node_track_caller+0x100/0x320
[ 28.046664]  devm_kmalloc+0x3c/0x90
[ 28.050139]  pinctrl_bind_pins+0x4c/0x298
[ 28.054135]  driver_probe_device+0xb4/0x4a0
[ 28.058305]  __driver_attach+0x124/0x128
[ 28.062213]  bus_for_each_dev+0x78/0xe0
[ 28.066035]  driver_attach+0x30/0x40
[ 28.069597]  bus_add_driver+0x248/0x2b8
[ 28.073419]  driver_register+0x68/0x100
[ 28.077242]  __pci_register_driver+0x64/0x78
[ 28.081500]  pcie_portdrv_init+0x44/0x4c
[ 28.085410]  do_one_initcall+0x54/0x208
[ 28.089232]  kernel_init_freeable+0x244/0x340
[ 28.093577]  kernel_init+0x18/0x118
[ 28.097052]  ret_from_fork+0x10/0x1c
[ 28.100614] Code: 7100047f 321902a4 1a950095 b5000602 (b9400803)
[ 28.106740] ---[ end trace e32df44e6e1c3a4b ]---

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Tested-by: Huiqiang Wang <wanghuiqiang@huawei.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Tomasz Nowicki <Tomasz.Nowicki@caviumnetworks.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
---
 include/linux/device.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 4779569..2a4fb08 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1017,7 +1017,12 @@ extern __printf(2, 3)
 #ifdef CONFIG_NUMA
 static inline int dev_to_node(struct device *dev)
 {
-	return dev->numa_node;
+	int node = dev->numa_node;
+
+	if (unlikely(node != NUMA_NO_NODE && !node_online(node)))
+		return NUMA_NO_NODE;
+
+	return node;
 }
 static inline void set_dev_node(struct device *dev, int node)
 {
-- 
1.8.3.1
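
The check added to dev_to_node() can be tried outside the kernel with a small userspace sketch. Everything prefixed with mock_ or MOCK_ below is a hypothetical stand-in invented for illustration (the node table, the online test, and the reduced device struct); none of it is kernel code. Only NUMA_NO_NODE, which the kernel defines as -1, and the shape of the condition are taken from the patch. The unlikely() hint is omitted because it affects only branch layout, not the result.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE   (-1)  /* same value the kernel uses */
#define MOCK_MAX_NODES 4     /* hypothetical size of the mock node table */

/* Hypothetical stand-in for the kernel's online-node mask:
 * nodes 0 and 2 are online, nodes 1 and 3 are offline. */
static bool mock_online[MOCK_MAX_NODES] = { true, false, true, false };

/* Hypothetical stand-in for node_online(); not the kernel implementation. */
static bool mock_node_online(int node)
{
	assert(node >= 0 && node < MOCK_MAX_NODES);
	return mock_online[node];
}

/* Reduced stand-in for struct device: only the field the patch reads. */
struct mock_device {
	int numa_node;
};

/* Same logic as the patched dev_to_node(): report NUMA_NO_NODE when the
 * recorded node is offline, so callers do not target that node. */
static int mock_dev_to_node(const struct mock_device *dev)
{
	int node = dev->numa_node;

	/* The NUMA_NO_NODE test comes first, so the online lookup is
	 * never made with a negative index. */
	if (node != NUMA_NO_NODE && !mock_node_online(node))
		return NUMA_NO_NODE;

	return node;
}
```

With this mock table, a device recorded on node 0 keeps node 0, a device recorded on offline node 1 is reported as NUMA_NO_NODE, and a device that already has NUMA_NO_NODE passes through unchanged. In the call trace above, the node value reaches the allocator through devm_kmalloc(), so reporting NUMA_NO_NODE leaves the choice of node to the allocator instead of pointing it at an offline one.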