* [PATCH 0/2] fix issue when acpi smmuv3 device alloc offline node memory
@ 2019-03-15 2:19 Kefeng Wang
  2019-03-15 2:19 ` [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device Kefeng Wang
  2019-03-15 2:19 ` [PATCH 2/2] ACPI: NUMA: show match info about PXM ID and offline/online node Kefeng Wang
  0 siblings, 2 replies; 18+ messages in thread
From: Kefeng Wang @ 2019-03-15 2:19 UTC
  To: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel
  Cc: Kefeng Wang

If an ACPI smmuv3 device is set to an offline node, parsed from the
proximity domain in the SMMUv3 IORT table, it will lead to a crash when
allocating memory, so fix it by using acpi_map_pxm_to_online_node() to
find an online node and set it for the smmuv3 device.

Meanwhile, show the match info about the PXM ID, offline node and online node.

Kefeng Wang (2):
  ACPI/IORT: set online numa node for smmuv3 device
  ACPI: NUMA: show match info about PXM ID and offline/online node

 drivers/acpi/arm64/iort.c | 8 ++++----
 drivers/acpi/numa.c       | 3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

-- 
2.20.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device
  2019-03-15 2:19 [PATCH 0/2] fix issue when acpi smmuv3 device alloc offline node memory Kefeng Wang
@ 2019-03-15 2:19 ` Kefeng Wang
  2019-03-20 11:41 ` Robin Murphy
  2019-03-15 2:19 ` [PATCH 2/2] ACPI: NUMA: show match info about PXM ID and offline/online node Kefeng Wang
  1 sibling, 1 reply; 18+ messages in thread
From: Kefeng Wang @ 2019-03-15 2:19 UTC
  To: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel
  Cc: Kefeng Wang

If there is only node 0 in system, but smmuv3 device is set to offline
node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
to following crash,

[   47.492451] Unable to handle kernel paging request at virtual address 0000000000001388
[   47.500361] Mem abort info:
[   47.503143]   ESR = 0x96000004
[   47.506189]   Exception class = DABT (current EL), IL = 32 bits
[   47.512099]   SET = 0, FnV = 0
[   47.515140]   EA = 0, S1PTW = 0
[   47.518272] Data abort info:
[   47.521144]   ISV = 0, ISS = 0x00000004
[   47.524970]   CM = 0, WnR = 0
[   47.527929] [0000000000001388] user address but active_mm is swapper
[   47.534285] Internal error: Oops: 96000004 [#1] SMP
[   47.539151] Modules linked in:
[   47.542194] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15
[   47.549490] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[   47.554272] pc : __alloc_pages_nodemask+0x13c/0x1068
[   47.559224] lr : __alloc_pages_nodemask+0xdc/0x1068
...
[   47.646873] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
[   47.653560] Call trace:
[   47.655994]  __alloc_pages_nodemask+0x13c/0x1068
[   47.660600]  new_slab+0xec/0x570
[   47.663816]  ___slab_alloc+0x3e0/0x4f8
[   47.667553]  __slab_alloc+0x60/0x80
[   47.671029]  __kmalloc_node_track_caller+0x10c/0x478
[   47.675984]  devm_kmalloc+0x44/0xb0
[   47.679460]  pinctrl_bind_pins+0x4c/0x188
[   47.683457]  really_probe+0x78/0x2b8
[   47.687019]  driver_probe_device+0x64/0x110
[   47.691189]  device_driver_attach+0x74/0x98
[   47.695360]  __driver_attach+0x9c/0xe8
[   47.699095]  bus_for_each_dev+0x84/0xd8
[   47.702919]  driver_attach+0x30/0x40
[   47.706481]  bus_add_driver+0x170/0x218
[   47.710304]  driver_register+0x64/0x118
[   47.714128]  __platform_driver_register+0x54/0x60
[   47.718820]  arm_smmu_driver_init+0x24/0x2c
[   47.722991]  do_one_initcall+0xbc/0x328
[   47.726816]  kernel_init_freeable+0x304/0x3ac
[   47.731162]  kernel_init+0x18/0x110
[   47.734638]  ret_from_fork+0x10/0x1c
[   47.738202] Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804)
[   47.744307] ---[ end trace dfeaed4c373a32da ]--

Use acpi_map_pxm_to_online_node() to get an online node to fix it.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 drivers/acpi/arm64/iort.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e48894e002ba..a2ce836ec103 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1239,10 +1239,10 @@ static void __init arm_smmu_v3_set_proximity(struct device *dev,
 
 	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
 	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
-		set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
-		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
-			smmu->base_address,
-			smmu->pxm);
+		int node = acpi_map_pxm_to_online_node(smmu->pxm);
+		set_dev_node(dev, node);
+		pr_info("SMMU-v3[%llx] -> PXM %d -> Node %d\n",
+			smmu->base_address, smmu->pxm, node);
 	}
 }
 #else
-- 
2.20.1
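The fallback that patch 1 relies on can be sketched in user-space C. This is
only a minimal sketch, assuming the helper returns the mapped node when it is
online and otherwise picks the online node with the smallest distance. The
distance table, the online map and every name below (nearest_online_node,
node_is_online, node_dist) are hypothetical stand-ins, not the kernel
implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4
#define NO_NODE   (-1)

/* Hypothetical online map: nodes 0 and 2 are online, 1 and 3 are not. */
static const bool node_is_online[MAX_NODES] = { true, false, true, false };

/* Hypothetical distance table (smaller means closer). */
static const int node_dist[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 40, 40 },
	{ 20, 10, 15, 40 },
	{ 40, 15, 10, 20 },
	{ 40, 40, 20, 10 },
};

/* Return @node if it is online, otherwise the closest online node. */
int nearest_online_node(int node)
{
	int best = NO_NODE;
	int best_dist = INT_MAX;

	assert(node >= 0 && node < MAX_NODES);

	if (node_is_online[node])
		return node;

	for (int n = 0; n < MAX_NODES; n++) {
		if (!node_is_online[n])
			continue;
		if (node_dist[node][n] < best_dist) {
			best_dist = node_dist[node][n];
			best = n;
		}
	}
	return best;
}
```

With this table an offline node silently resolves to a neighbouring online
node, which is the behaviour the reviewers go on to question in the replies.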
* Re: [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device
  2019-03-15 2:19 ` [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device Kefeng Wang
@ 2019-03-20 11:41 ` Robin Murphy
  2019-03-20 14:00 ` Lorenzo Pieralisi
  0 siblings, 1 reply; 18+ messages in thread
From: Robin Murphy @ 2019-03-20 11:41 UTC
  To: Kefeng Wang, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel

On 15/03/2019 02:19, Kefeng Wang wrote:
> If there is only node 0 in system, but smmuv3 device is set to offline
> node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
> to following crash,

Surely that's just a firmware bug? If node 1 doesn't exist in the system
then AFAICS if we're presented with a device claiming to be on that node
we can only assume the whole thing is bogus. Thus if we're going to work
around it at all, it seems to me like we should reject the entire device
rather than just bodging it to some other node.

Robin.
* Re: [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device
  2019-03-20 11:41 ` Robin Murphy
@ 2019-03-20 14:00 ` Lorenzo Pieralisi
  2019-03-21 6:08 ` Kefeng Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Pieralisi @ 2019-03-20 14:00 UTC
  To: Robin Murphy
  Cc: Kefeng Wang, rjw, linux-acpi, Hanjun Guo, Sudeep Holla, linux-arm-kernel

On Wed, Mar 20, 2019 at 11:41:18AM +0000, Robin Murphy wrote:
> On 15/03/2019 02:19, Kefeng Wang wrote:
> >If there is only node 0 in system, but smmuv3 device is set to offline
> >node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
> >to following crash,
>
> Surely that's just a firmware bug? If node 1 doesn't exist in the system
> then AFAICS if we're presented with a device claiming to be on that node we
> can only assume the whole thing is bogus. Thus if we're going to work around
> it at all, it seems to me like we should reject the entire device rather
> than just bodging it to some other node.

I suspect that's the same issue this thread addressed:

https://lore.kernel.org/linux-pci/CAErSpo6S0qtR42tjGZrFu4aMFFyThx1hkHTSowTt6t3XerpHnA@mail.gmail.com/

Lorenzo
* Re: [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device
  2019-03-20 14:00 ` Lorenzo Pieralisi
@ 2019-03-21 6:08 ` Kefeng Wang
  2019-03-27 14:24 ` Kefeng Wang
  2019-03-28 11:32 ` Lorenzo Pieralisi
  0 siblings, 2 replies; 18+ messages in thread
From: Kefeng Wang @ 2019-03-21 6:08 UTC
  To: Lorenzo Pieralisi, Robin Murphy
  Cc: linux-acpi, rjw, Hanjun Guo, linux-arm-kernel, Sudeep Holla

On 2019/3/20 22:00, Lorenzo Pieralisi wrote:
> On Wed, Mar 20, 2019 at 11:41:18AM +0000, Robin Murphy wrote:
>> On 15/03/2019 02:19, Kefeng Wang wrote:
>>> If there is only node 0 in system, but smmuv3 device is set to offline
>>> node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
>>> to following crash,
>> Surely that's just a firmware bug? If node 1 doesn't exist in the system
>> then AFAICS if we're presented with a device claiming to be on that node we
>> can only assume the whole thing is bogus. Thus if we're going to work around
>> it at all, it seems to me like we should reject the entire device rather
>> than just bodging it to some other node.

Yes, I met this oops with a wrong IORT configuration,

> I suspect that's the same issue this thread addressed:
>
> https://lore.kernel.org/linux-pci/CAErSpo6S0qtR42tjGZrFu4aMFFyThx1hkHTSowTt6t3XerpHnA@mail.gmail.com/

and the situation mentioned above should trigger this issue too.

If the node is offline, we can just return from
arm_smmu_v3_set_proximity(). Is there any better way to fix this?
* Re: [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device
  2019-03-21 6:08 ` Kefeng Wang
@ 2019-03-27 14:24 ` Kefeng Wang
  2019-03-28 11:32 ` Lorenzo Pieralisi
  1 sibling, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2019-03-27 14:24 UTC
  To: Lorenzo Pieralisi, Robin Murphy
  Cc: linux-acpi, rjw, Hanjun Guo, linux-arm-kernel, Sudeep Holla

Kindly ping, thanks.
* Re: [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device
  2019-03-21 6:08 ` Kefeng Wang
  2019-03-27 14:24 ` Kefeng Wang
@ 2019-03-28 11:32 ` Lorenzo Pieralisi
  2019-03-28 14:00 ` [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node Kefeng Wang
  1 sibling, 1 reply; 18+ messages in thread
From: Lorenzo Pieralisi @ 2019-03-28 11:32 UTC
  To: Kefeng Wang
  Cc: rjw, linux-acpi, Hanjun Guo, Sudeep Holla, Robin Murphy, linux-arm-kernel

On Thu, Mar 21, 2019 at 02:08:47PM +0800, Kefeng Wang wrote:
>
> On 2019/3/20 22:00, Lorenzo Pieralisi wrote:
> > On Wed, Mar 20, 2019 at 11:41:18AM +0000, Robin Murphy wrote:
> >> On 15/03/2019 02:19, Kefeng Wang wrote:
> >>> If there is only node 0 in system, but smmuv3 device is set to offline
> >>> node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
> >>> to following crash,
> >> Surely that's just a firmware bug? If node 1 doesn't exist in the system
> >> then AFAICS if we're presented with a device claiming to be on that node we
> >> can only assume the whole thing is bogus. Thus if we're going to work around
> >> it at all, it seems to me like we should reject the entire device rather
> >> than just bodging it to some other node.
>
> Yes, I met this oops with a wrong IORT configuration,
>
> > I suspect that's the same issue this thread addressed:
> >
> > https://lore.kernel.org/linux-pci/CAErSpo6S0qtR42tjGZrFu4aMFFyThx1hkHTSowTt6t3XerpHnA@mail.gmail.com/
>
> and the situation mentioned above should trigger this issue too.
>
> If the node is offline, we can just return from
> arm_smmu_v3_set_proximity(), any better way to fix this?

Add a return value to the set_proximity() callback and return failure on
hitting the issue above, therefore terminating device creation.

Thanks,
Lorenzo
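The suggestion above (give the set_proximity() callback a return value and
stop device creation when it fails) can be sketched in user-space C. This is
a minimal sketch under stated assumptions: struct dev, struct dev_config,
fake_node_online() and add_platform_device() are simplified, hypothetical
stand-ins for the kernel structures, and only node 0 is treated as online.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_NODE (-1)

/* Hypothetical, simplified device: just a NUMA node and a "created" flag. */
struct dev {
	int numa_node;
	bool registered;
};

struct dev_config {
	/* Optional hook; returns 0 on success or a negative errno. */
	int (*set_proximity)(struct dev *dev, int fw_node);
};

/* Hypothetical online map: only node 0 exists in this sketch. */
bool fake_node_online(int node)
{
	return node == 0;
}

/* Refuse a firmware-provided node that is valid but not online. */
int smmu_set_proximity(struct dev *dev, int fw_node)
{
	if (fw_node != NO_NODE && !fake_node_online(fw_node))
		return -EINVAL;

	dev->numa_node = fw_node;
	return 0;
}

/* Device creation stops as soon as the proximity hook reports failure. */
int add_platform_device(struct dev *dev, const struct dev_config *ops,
			int fw_node)
{
	int ret;

	assert(dev != NULL && ops != NULL);
	dev->numa_node = NO_NODE;
	dev->registered = false;

	if (ops->set_proximity) {
		ret = ops->set_proximity(dev, fw_node);
		if (ret)
			return ret;
	}

	dev->registered = true;
	return 0;
}
```

In this sketch a device described as sitting on node 1 is never registered,
so nothing later tries to allocate memory on a node that is not online.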
* [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node
  2019-03-28 11:32 ` Lorenzo Pieralisi
@ 2019-03-28 14:00 ` Kefeng Wang
  2019-03-28 13:59 ` Robin Murphy
  2019-03-29 3:17 ` [PATCH RESEND " Kefeng Wang
  0 siblings, 2 replies; 18+ messages in thread
From: Kefeng Wang @ 2019-03-28 14:00 UTC
  To: Lorenzo Pieralisi, Robin Murphy, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel, hanjun.guo
  Cc: Kefeng Wang

If there is only node 0 in system, but smmuv3 device is set to offline
node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
to following crash,

[   47.492451] Unable to handle kernel paging request at virtual address 0000000000001388
[   47.500361] Mem abort info:
[   47.503143]   ESR = 0x96000004
[   47.506189]   Exception class = DABT (current EL), IL = 32 bits
[   47.512099]   SET = 0, FnV = 0
[   47.515140]   EA = 0, S1PTW = 0
[   47.518272] Data abort info:
[   47.521144]   ISV = 0, ISS = 0x00000004
[   47.524970]   CM = 0, WnR = 0
[   47.527929] [0000000000001388] user address but active_mm is swapper
[   47.534285] Internal error: Oops: 96000004 [#1] SMP
[   47.539151] Modules linked in:
[   47.542194] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15
[   47.549490] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[   47.554272] pc : __alloc_pages_nodemask+0x13c/0x1068
[   47.559224] lr : __alloc_pages_nodemask+0xdc/0x1068
...
[   47.646873] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
[   47.653560] Call trace:
[   47.655994]  __alloc_pages_nodemask+0x13c/0x1068
[   47.660600]  new_slab+0xec/0x570
[   47.663816]  ___slab_alloc+0x3e0/0x4f8
[   47.667553]  __slab_alloc+0x60/0x80
[   47.671029]  __kmalloc_node_track_caller+0x10c/0x478
[   47.675984]  devm_kmalloc+0x44/0xb0
[   47.679460]  pinctrl_bind_pins+0x4c/0x188
[   47.683457]  really_probe+0x78/0x2b8
[   47.687019]  driver_probe_device+0x64/0x110
[   47.691189]  device_driver_attach+0x74/0x98
[   47.695360]  __driver_attach+0x9c/0xe8
[   47.699095]  bus_for_each_dev+0x84/0xd8
[   47.702919]  driver_attach+0x30/0x40
[   47.706481]  bus_add_driver+0x170/0x218
[   47.710304]  driver_register+0x64/0x118
[   47.714128]  __platform_driver_register+0x54/0x60
[   47.718820]  arm_smmu_driver_init+0x24/0x2c
[   47.722991]  do_one_initcall+0xbc/0x328
[   47.726816]  kernel_init_freeable+0x304/0x3ac
[   47.731162]  kernel_init+0x18/0x110
[   47.734638]  ret_from_fork+0x10/0x1c
[   47.738202] Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804)
[   47.744307] ---[ end trace dfeaed4c373a32da ]--

This could be triggered by a firmware bug with a bad IORT configuration,
or by a NUMA node that has no memory attached to it, and also with NR_CPUS
less than the number of CPUs presented in the MADT.

Make dev_set_proximity() return a value, and terminate device creation
if it returns failure.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 drivers/acpi/arm64/iort.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e48894e002ba..c294c3490e66 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1232,21 +1232,30 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node)
 /*
  * set numa proximity domain for smmuv3 device
  */
-static void __init arm_smmu_v3_set_proximity(struct device *dev,
+static int __init arm_smmu_v3_set_proximity(struct device *dev,
 					      struct acpi_iort_node *node)
 {
 	struct acpi_iort_smmu_v3 *smmu;
 
 	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
 	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
-		set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
+		int node = acpi_map_pxm_to_node(smmu->pxm);
+		if (node != NUMA_NO_NODE && !node_online(node))
+			return -EINVAL;
+
+		set_dev_node(dev, node);
 		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
 			smmu->base_address,
 			smmu->pxm);
 	}
+	return 0;
 }
 #else
-#define arm_smmu_v3_set_proximity NULL
+static int __init arm_smmu_v3_set_proximity(struct device *dev,
+					     struct acpi_iort_node *node)
+{
+	return 0;
+}
 #endif
 
 static int __init arm_smmu_count_resources(struct acpi_iort_node *node)
@@ -1318,7 +1327,7 @@ struct iort_dev_config {
 	int (*dev_count_resources)(struct acpi_iort_node *node);
 	void (*dev_init_resources)(struct resource *res,
 				     struct acpi_iort_node *node);
-	void (*dev_set_proximity)(struct device *dev,
+	int (*dev_set_proximity)(struct device *dev,
 				    struct acpi_iort_node *node);
 };
 
@@ -1369,8 +1378,11 @@ static int __init iort_add_platform_device(struct acpi_iort_node *node,
 	if (!pdev)
 		return -ENOMEM;
 
-	if (ops->dev_set_proximity)
-		ops->dev_set_proximity(&pdev->dev, node);
+	if (ops->dev_set_proximity) {
+		ret = ops->dev_set_proximity(&pdev->dev, node);
+		if (ret)
+			goto dev_put;
+	}
 
 	count = ops->dev_count_resources(node);
 
-- 
2.20.1
* Re: [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node
  2019-03-28 14:00 ` [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node Kefeng Wang
@ 2019-03-28 13:59 ` Robin Murphy
  2019-03-28 14:29 ` Kefeng Wang
  2019-03-29 3:17 ` [PATCH RESEND " Kefeng Wang
  1 sibling, 1 reply; 18+ messages in thread
From: Robin Murphy @ 2019-03-28 13:59 UTC
  To: Kefeng Wang, Lorenzo Pieralisi, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel, hanjun.guo

On 28/03/2019 14:00, Kefeng Wang wrote:
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index e48894e002ba..c294c3490e66 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1232,21 +1232,30 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node)
>  /*
>   * set numa proximity domain for smmuv3 device
>   */
> -static void __init arm_smmu_v3_set_proximity(struct device *dev,
> +static int __init arm_smmu_v3_set_proximity(struct device *dev,
>  					      struct acpi_iort_node *node)
>  {
>  	struct acpi_iort_smmu_v3 *smmu;
>
>  	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
>  	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
> -		set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
> +		int node = acpi_map_pxm_to_node(smmu->pxm);
> +		if (node != NUMA_NO_NODE && !node_online(node))
> +			return -EINVAL;
> +
> +		set_dev_node(dev, node);
>  		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
>  			smmu->base_address,
>  			smmu->pxm);
>  	}
> +	return 0;
>  }
>  #else
> -#define arm_smmu_v3_set_proximity NULL
> +static int __init arm_smmu_v3_set_proximity(struct device *dev,
> +					     struct acpi_iort_node *node)
> +{
> +	return 0;
> +}

Doesn't this end up having the same effect as just leaving the callback
assigned with NULL? Not sure why that would need to change :/

Robin.
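Robin's question can be checked with a small user-space sketch: when the
caller only invokes the hook if it is non-NULL, a NULL hook and a stub that
always returns 0 lead to the same result. The names below (struct dev,
set_proximity_fn, set_proximity_stub, create_device) are hypothetical and
only model the guarded call from the patch, not the kernel code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device carrying only its NUMA node. */
struct dev {
	int numa_node;
};

typedef int (*set_proximity_fn)(struct dev *dev);

/* Stub in the style of the #else branch: does nothing, reports success. */
int set_proximity_stub(struct dev *dev)
{
	(void)dev;
	return 0;
}

/* Guarded call: a NULL hook is skipped, a failing hook aborts creation. */
int create_device(struct dev *dev, set_proximity_fn hook)
{
	assert(dev != NULL);
	dev->numa_node = -1;

	if (hook) {
		int ret = hook(dev);

		if (ret)
			return ret;
	}
	return 0;
}
```

Both variants leave the device on its default node and report success, so
under this model keeping the NULL definition would behave the same way.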
* Re: [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node 2019-03-28 13:59 ` Robin Murphy @ 2019-03-28 14:29 ` Kefeng Wang 0 siblings, 0 replies; 18+ messages in thread From: Kefeng Wang @ 2019-03-28 14:29 UTC (permalink / raw) To: Robin Murphy, Lorenzo Pieralisi, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel, hanjun.guo On 2019/3/28 21:59, Robin Murphy wrote: > On 28/03/2019 14:00, Kefeng Wang wrote: >> If there is only node 0 in system, but smmuv3 device is set to offline >> node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead >> to following crash, >> >> [ 47.492451] Unable to handle kernel paging request at virtual address 0000000000001388 >> [ 47.500361] Mem abort info: >> [ 47.503143] ESR = 0x96000004 >> [ 47.506189] Exception class = DABT (current EL), IL = 32 bits >> [ 47.512099] SET = 0, FnV = 0 >> [ 47.515140] EA = 0, S1PTW = 0 >> [ 47.518272] Data abort info: >> [ 47.521144] ISV = 0, ISS = 0x00000004 >> [ 47.524970] CM = 0, WnR = 0 >> [ 47.527929] [0000000000001388] user address but active_mm is swapper >> [ 47.534285] Internal error: Oops: 96000004 [#1] SMP >> [ 47.539151] Modules linked in: >> [ 47.542194] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15 >> [ 47.549490] pstate: 80c00009 (Nzcv daif +PAN +UAO) >> [ 47.554272] pc : __alloc_pages_nodemask+0x13c/0x1068 >> [ 47.559224] lr : __alloc_pages_nodemask+0xdc/0x1068 >> ... 
>> [ 47.646873] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) >> [ 47.653560] Call trace: >> [ 47.655994] __alloc_pages_nodemask+0x13c/0x1068 >> [ 47.660600] new_slab+0xec/0x570 >> [ 47.663816] ___slab_alloc+0x3e0/0x4f8 >> [ 47.667553] __slab_alloc+0x60/0x80 >> [ 47.671029] __kmalloc_node_track_caller+0x10c/0x478 >> [ 47.675984] devm_kmalloc+0x44/0xb0 >> [ 47.679460] pinctrl_bind_pins+0x4c/0x188 >> [ 47.683457] really_probe+0x78/0x2b8 >> [ 47.687019] driver_probe_device+0x64/0x110 >> [ 47.691189] device_driver_attach+0x74/0x98 >> [ 47.695360] __driver_attach+0x9c/0xe8 >> [ 47.699095] bus_for_each_dev+0x84/0xd8 >> [ 47.702919] driver_attach+0x30/0x40 >> [ 47.706481] bus_add_driver+0x170/0x218 >> [ 47.710304] driver_register+0x64/0x118 >> [ 47.714128] __platform_driver_register+0x54/0x60 >> [ 47.718820] arm_smmu_driver_init+0x24/0x2c >> [ 47.722991] do_one_initcall+0xbc/0x328 >> [ 47.726816] kernel_init_freeable+0x304/0x3ac >> [ 47.731162] kernel_init+0x18/0x110 >> [ 47.734638] ret_from_fork+0x10/0x1c >> [ 47.738202] Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804) >> [ 47.744307] ---[ end trace dfeaed4c373a32da ]-- >> >> This could be triggered by firmware bug with bad IORT configuration, >> or a NUMA node has no memory attaching to it, also with NR_CPUS less >> than CPUs presented in MADT. >> >> Make dev_set_proximity() with a return value, terminating device creation >> if it return failure. 
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>  drivers/acpi/arm64/iort.c | 24 ++++++++++++++++++------
>>  1 file changed, 18 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
>> index e48894e002ba..c294c3490e66 100644
>> --- a/drivers/acpi/arm64/iort.c
>> +++ b/drivers/acpi/arm64/iort.c
>> @@ -1232,21 +1232,30 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node)
>>  /*
>>   * set numa proximity domain for smmuv3 device
>>   */
>> -static void __init arm_smmu_v3_set_proximity(struct device *dev,
>> +static int __init arm_smmu_v3_set_proximity(struct device *dev,
>>  					      struct acpi_iort_node *node)
>>  {
>>  	struct acpi_iort_smmu_v3 *smmu;
>>  	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
>>  	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
>> -		set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
>> +		int node = acpi_map_pxm_to_node(smmu->pxm);
>> +		if (node != NUMA_NO_NODE && !node_online(node))
>> +			return -EINVAL;
>> +
>> +		set_dev_node(dev, node);
>>  		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
>>  			smmu->base_address,
>>  			smmu->pxm);
>>  	}
>> +	return 0;
>>  }
>>  #else
>> -#define arm_smmu_v3_set_proximity NULL
>> +static int __init arm_smmu_v3_set_proximity(struct device *dev,
>> +					      struct acpi_iort_node *node)
>> +{
>> +	return 0;
>> +}
>
> Doesn't this end up having the same effect as just leaving the callback
> assigned with NULL? Not sure why that would need to change :/

Oops, should not change this part ;(
if no other issue, will resend

Thanks.

>
> Robin.
>
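Robin's point about the NULL callback can be checked with a small userspace sketch. It is only a model, not the kernel code: `struct dev_ops`, `stub_set_proximity()` and `add_device()` are hypothetical names standing in for `struct iort_dev_config`, the v2 `#else` stub and `iort_add_platform_device()`. Because the caller only invokes the hook when it is non-NULL, a NULL hook and a stub that returns 0 should lead to the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iort_dev_config: one optional hook. */
struct dev_ops {
	int (*set_proximity)(int pxm);
};

/* Models the v2 #else stub: does nothing and reports success. */
int stub_set_proximity(int pxm)
{
	(void)pxm;
	return 0;
}

/*
 * Models the caller: the hook is only invoked when it is non-NULL, and
 * a non-zero return value aborts device creation.
 */
int add_device(const struct dev_ops *ops, int pxm)
{
	int ret;

	assert(ops != NULL);

	if (ops->set_proximity) {
		ret = ops->set_proximity(pxm);
		if (ret)
			return ret;
	}
	return 0;
}
```

Under these assumptions, calling `add_device()` with a NULL hook or with the stub gives the same return value, which is why keeping `#define arm_smmu_v3_set_proximity NULL` is sufficient.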
* [PATCH RESEND v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node 2019-03-28 14:00 ` [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node Kefeng Wang 2019-03-28 13:59 ` Robin Murphy @ 2019-03-29 3:17 ` Kefeng Wang 2019-04-08 10:42 ` Lorenzo Pieralisi 2019-04-08 10:46 ` Lorenzo Pieralisi 1 sibling, 2 replies; 18+ messages in thread From: Kefeng Wang @ 2019-03-29 3:17 UTC (permalink / raw) To: Lorenzo Pieralisi, Robin Murphy, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel, hanjun.guo Cc: Kefeng Wang If there is only node 0 in system, but smmuv3 device is set to offline node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead to following crash, [ 47.492451] Unable to handle kernel paging request at virtual address 0000000000001388 [ 47.500361] Mem abort info: [ 47.503143] ESR = 0x96000004 [ 47.506189] Exception class = DABT (current EL), IL = 32 bits [ 47.512099] SET = 0, FnV = 0 [ 47.515140] EA = 0, S1PTW = 0 [ 47.518272] Data abort info: [ 47.521144] ISV = 0, ISS = 0x00000004 [ 47.524970] CM = 0, WnR = 0 [ 47.527929] [0000000000001388] user address but active_mm is swapper [ 47.534285] Internal error: Oops: 96000004 [#1] SMP [ 47.539151] Modules linked in: [ 47.542194] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15 [ 47.549490] pstate: 80c00009 (Nzcv daif +PAN +UAO) [ 47.554272] pc : __alloc_pages_nodemask+0x13c/0x1068 [ 47.559224] lr : __alloc_pages_nodemask+0xdc/0x1068 ... 
[ 47.646873] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) [ 47.653560] Call trace: [ 47.655994] __alloc_pages_nodemask+0x13c/0x1068 [ 47.660600] new_slab+0xec/0x570 [ 47.663816] ___slab_alloc+0x3e0/0x4f8 [ 47.667553] __slab_alloc+0x60/0x80 [ 47.671029] __kmalloc_node_track_caller+0x10c/0x478 [ 47.675984] devm_kmalloc+0x44/0xb0 [ 47.679460] pinctrl_bind_pins+0x4c/0x188 [ 47.683457] really_probe+0x78/0x2b8 [ 47.687019] driver_probe_device+0x64/0x110 [ 47.691189] device_driver_attach+0x74/0x98 [ 47.695360] __driver_attach+0x9c/0xe8 [ 47.699095] bus_for_each_dev+0x84/0xd8 [ 47.702919] driver_attach+0x30/0x40 [ 47.706481] bus_add_driver+0x170/0x218 [ 47.710304] driver_register+0x64/0x118 [ 47.714128] __platform_driver_register+0x54/0x60 [ 47.718820] arm_smmu_driver_init+0x24/0x2c [ 47.722991] do_one_initcall+0xbc/0x328 [ 47.726816] kernel_init_freeable+0x304/0x3ac [ 47.731162] kernel_init+0x18/0x110 [ 47.734638] ret_from_fork+0x10/0x1c [ 47.738202] Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804) [ 47.744307] ---[ end trace dfeaed4c373a32da ]-- This could be triggered by firmware bug with bad IORT configuration, or a NUMA node has no memory attaching to it, also with NR_CPUS less than CPUs presented in MADT. Make dev_set_proximity() with a return value, terminating device creation if it return failure. 
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> --- drivers/acpi/arm64/iort.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index e48894e002ba..1fc1851b078e 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1232,18 +1232,23 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node) /* * set numa proximity domain for smmuv3 device */ -static void __init arm_smmu_v3_set_proximity(struct device *dev, +static int __init arm_smmu_v3_set_proximity(struct device *dev, struct acpi_iort_node *node) { struct acpi_iort_smmu_v3 *smmu; smmu = (struct acpi_iort_smmu_v3 *)node->node_data; if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) { - set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm)); + int node = acpi_map_pxm_to_node(smmu->pxm); + if (node != NUMA_NO_NODE && !node_online(node)) + return -EINVAL; + + set_dev_node(dev, node); pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n", smmu->base_address, smmu->pxm); } + return 0; } #else #define arm_smmu_v3_set_proximity NULL @@ -1318,7 +1323,7 @@ struct iort_dev_config { int (*dev_count_resources)(struct acpi_iort_node *node); void (*dev_init_resources)(struct resource *res, struct acpi_iort_node *node); - void (*dev_set_proximity)(struct device *dev, + int (*dev_set_proximity)(struct device *dev, struct acpi_iort_node *node); }; @@ -1369,8 +1374,11 @@ static int __init iort_add_platform_device(struct acpi_iort_node *node, if (!pdev) return -ENOMEM; - if (ops->dev_set_proximity) - ops->dev_set_proximity(&pdev->dev, node); + if (ops->dev_set_proximity) { + ret = ops->dev_set_proximity(&pdev->dev, node); + if (ret) + goto dev_put; + } count = ops->dev_count_resources(node); -- 2.20.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related 
* Re: [PATCH RESEND v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node
  2019-03-29 3:17 ` [PATCH RESEND " Kefeng Wang
@ 2019-04-08 10:42 ` Lorenzo Pieralisi
  2019-04-08 10:46 ` Lorenzo Pieralisi
  1 sibling, 0 replies; 18+ messages in thread
From: Lorenzo Pieralisi @ 2019-04-08 10:42 UTC
To: Kefeng Wang
Cc: rjw, linux-acpi, hanjun.guo, Sudeep Holla, Robin Murphy, linux-arm-kernel

On Fri, Mar 29, 2019 at 11:17:51AM +0800, Kefeng Wang wrote:
> If there is only node 0 in system, but smmuv3 device is set to offline
> node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
> to following crash,

"In a system where, through IORT firmware mappings, the SMMU device is
mapped to a NUMA node that is not online, the kernel bootstrap results
in the following crash:"

> [ 47.492451] Unable to handle kernel paging request at virtual address 0000000000001388
> [ 47.500361] Mem abort info:
> [ 47.503143]   ESR = 0x96000004
> [ 47.506189]   Exception class = DABT (current EL), IL = 32 bits
> [ 47.512099]   SET = 0, FnV = 0
> [ 47.515140]   EA = 0, S1PTW = 0
> [ 47.518272] Data abort info:
> [ 47.521144]   ISV = 0, ISS = 0x00000004
> [ 47.524970]   CM = 0, WnR = 0
> [ 47.527929] [0000000000001388] user address but active_mm is swapper
> [ 47.534285] Internal error: Oops: 96000004 [#1] SMP
> [ 47.539151] Modules linked in:
> [ 47.542194] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15
> [ 47.549490] pstate: 80c00009 (Nzcv daif +PAN +UAO)
> [ 47.554272] pc : __alloc_pages_nodemask+0x13c/0x1068
> [ 47.559224] lr : __alloc_pages_nodemask+0xdc/0x1068
> ...
> [ 47.646873] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
> [ 47.653560] Call trace:
> [ 47.655994]  __alloc_pages_nodemask+0x13c/0x1068
> [ 47.660600]  new_slab+0xec/0x570
> [ 47.663816]  ___slab_alloc+0x3e0/0x4f8
> [ 47.667553]  __slab_alloc+0x60/0x80
> [ 47.671029]  __kmalloc_node_track_caller+0x10c/0x478
> [ 47.675984]  devm_kmalloc+0x44/0xb0
> [ 47.679460]  pinctrl_bind_pins+0x4c/0x188
> [ 47.683457]  really_probe+0x78/0x2b8
> [ 47.687019]  driver_probe_device+0x64/0x110
> [ 47.691189]  device_driver_attach+0x74/0x98
> [ 47.695360]  __driver_attach+0x9c/0xe8
> [ 47.699095]  bus_for_each_dev+0x84/0xd8
> [ 47.702919]  driver_attach+0x30/0x40
> [ 47.706481]  bus_add_driver+0x170/0x218
> [ 47.710304]  driver_register+0x64/0x118
> [ 47.714128]  __platform_driver_register+0x54/0x60
> [ 47.718820]  arm_smmu_driver_init+0x24/0x2c
> [ 47.722991]  do_one_initcall+0xbc/0x328
> [ 47.726816]  kernel_init_freeable+0x304/0x3ac
> [ 47.731162]  kernel_init+0x18/0x110
> [ 47.734638]  ret_from_fork+0x10/0x1c
> [ 47.738202] Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804)
> [ 47.744307] ---[ end trace dfeaed4c373a32da ]--

Nit: timestamps are not useful information, remove them and indent the
log with two spaces, to quote it.

> This could be triggered by firmware bug with bad IORT configuration,
> or a NUMA node has no memory attaching to it, also with NR_CPUS less
> than CPUs presented in MADT.

Either you explain this properly or you remove this paragraph, I would
remove it. Actually I would add a Link: tag to point at the lore
archives where the related discussions took place.

> Make dev_set_proximity() with a return value, terminating device creation
> if it return failure.

"Change the dev_set_proximity() hook prototype so that it returns a
value and make it return failure if the PXM->NUMA-node mapping
corresponds to an offline node, fixing the crash".
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  drivers/acpi/arm64/iort.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)

With the commit log changes above:

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index e48894e002ba..1fc1851b078e 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1232,18 +1232,23 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node)
>  /*
>   * set numa proximity domain for smmuv3 device
>   */
> -static void __init arm_smmu_v3_set_proximity(struct device *dev,
> +static int __init arm_smmu_v3_set_proximity(struct device *dev,
>  					      struct acpi_iort_node *node)
>  {
>  	struct acpi_iort_smmu_v3 *smmu;
>
>  	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
>  	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
> -		set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
> +		int node = acpi_map_pxm_to_node(smmu->pxm);
> +		if (node != NUMA_NO_NODE && !node_online(node))
> +			return -EINVAL;
> +
> +		set_dev_node(dev, node);
>  		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
>  			smmu->base_address,
>  			smmu->pxm);
>  	}
> +	return 0;
>  }
>  #else
>  #define arm_smmu_v3_set_proximity NULL
> @@ -1318,7 +1323,7 @@ struct iort_dev_config {
>  	int (*dev_count_resources)(struct acpi_iort_node *node);
>  	void (*dev_init_resources)(struct resource *res,
>  				     struct acpi_iort_node *node);
> -	void (*dev_set_proximity)(struct device *dev,
> +	int (*dev_set_proximity)(struct device *dev,
>  				    struct acpi_iort_node *node);
>  };
>
> @@ -1369,8 +1374,11 @@ static int __init iort_add_platform_device(struct acpi_iort_node *node,
>  	if (!pdev)
>  		return -ENOMEM;
>
> -	if (ops->dev_set_proximity)
> -		ops->dev_set_proximity(&pdev->dev, node);
> +	if (ops->dev_set_proximity) {
> +		ret = ops->dev_set_proximity(&pdev->dev, node);
> +		if (ret)
> +			goto dev_put;
> +	}
>
>  	count = ops->dev_count_resources(node);
>
> --
> 2.20.1
* Re: [PATCH RESEND v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node
  2019-03-29 3:17 ` [PATCH RESEND " Kefeng Wang
  2019-04-08 10:42 ` Lorenzo Pieralisi
@ 2019-04-08 10:46 ` Lorenzo Pieralisi
  2019-04-08 15:21 ` [PATCH v3] ACPI/IORT: Reject platform device creation on NUMA node mapping failure Kefeng Wang
  1 sibling, 1 reply; 18+ messages in thread
From: Lorenzo Pieralisi @ 2019-04-08 10:46 UTC
To: Kefeng Wang
Cc: rjw, linux-acpi, hanjun.guo, Sudeep Holla, Robin Murphy, linux-arm-kernel

Also, in the $SUBJECT, s/numa/NUMA because that's an acronym not an
English word.

Here:

"ACPI/IORT: Reject platform device creation on NUMA node mapping failure"

Thanks,
Lorenzo

On Fri, Mar 29, 2019 at 11:17:51AM +0800, Kefeng Wang wrote:
> If there is only node 0 in system, but smmuv3 device is set to offline
> node 1, parsed from proximity domain in SMMUv3 IORT table, it will lead
> to following crash,
>
> [ 47.492451] Unable to handle kernel paging request at virtual address 0000000000001388
> [ 47.500361] Mem abort info:
> [ 47.503143]   ESR = 0x96000004
> [ 47.506189]   Exception class = DABT (current EL), IL = 32 bits
> [ 47.512099]   SET = 0, FnV = 0
> [ 47.515140]   EA = 0, S1PTW = 0
> [ 47.518272] Data abort info:
> [ 47.521144]   ISV = 0, ISS = 0x00000004
> [ 47.524970]   CM = 0, WnR = 0
> [ 47.527929] [0000000000001388] user address but active_mm is swapper
> [ 47.534285] Internal error: Oops: 96000004 [#1] SMP
> [ 47.539151] Modules linked in:
> [ 47.542194] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15
> [ 47.549490] pstate: 80c00009 (Nzcv daif +PAN +UAO)
> [ 47.554272] pc : __alloc_pages_nodemask+0x13c/0x1068
> [ 47.559224] lr : __alloc_pages_nodemask+0xdc/0x1068
> ...
> [ 47.646873] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) > [ 47.653560] Call trace: > [ 47.655994] __alloc_pages_nodemask+0x13c/0x1068 > [ 47.660600] new_slab+0xec/0x570 > [ 47.663816] ___slab_alloc+0x3e0/0x4f8 > [ 47.667553] __slab_alloc+0x60/0x80 > [ 47.671029] __kmalloc_node_track_caller+0x10c/0x478 > [ 47.675984] devm_kmalloc+0x44/0xb0 > [ 47.679460] pinctrl_bind_pins+0x4c/0x188 > [ 47.683457] really_probe+0x78/0x2b8 > [ 47.687019] driver_probe_device+0x64/0x110 > [ 47.691189] device_driver_attach+0x74/0x98 > [ 47.695360] __driver_attach+0x9c/0xe8 > [ 47.699095] bus_for_each_dev+0x84/0xd8 > [ 47.702919] driver_attach+0x30/0x40 > [ 47.706481] bus_add_driver+0x170/0x218 > [ 47.710304] driver_register+0x64/0x118 > [ 47.714128] __platform_driver_register+0x54/0x60 > [ 47.718820] arm_smmu_driver_init+0x24/0x2c > [ 47.722991] do_one_initcall+0xbc/0x328 > [ 47.726816] kernel_init_freeable+0x304/0x3ac > [ 47.731162] kernel_init+0x18/0x110 > [ 47.734638] ret_from_fork+0x10/0x1c > [ 47.738202] Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804) > [ 47.744307] ---[ end trace dfeaed4c373a32da ]-- > > This could be triggered by firmware bug with bad IORT configuration, > or a NUMA node has no memory attaching to it, also with NR_CPUS less > than CPUs presented in MADT. > > Make dev_set_proximity() with a return value, terminating device creation > if it return failure. 
> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> > --- > drivers/acpi/arm64/iort.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c > index e48894e002ba..1fc1851b078e 100644 > --- a/drivers/acpi/arm64/iort.c > +++ b/drivers/acpi/arm64/iort.c > @@ -1232,18 +1232,23 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node) > /* > * set numa proximity domain for smmuv3 device > */ > -static void __init arm_smmu_v3_set_proximity(struct device *dev, > +static int __init arm_smmu_v3_set_proximity(struct device *dev, > struct acpi_iort_node *node) > { > struct acpi_iort_smmu_v3 *smmu; > > smmu = (struct acpi_iort_smmu_v3 *)node->node_data; > if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) { > - set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm)); > + int node = acpi_map_pxm_to_node(smmu->pxm); > + if (node != NUMA_NO_NODE && !node_online(node)) > + return -EINVAL; > + > + set_dev_node(dev, node); > pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n", > smmu->base_address, > smmu->pxm); > } > + return 0; > } > #else > #define arm_smmu_v3_set_proximity NULL > @@ -1318,7 +1323,7 @@ struct iort_dev_config { > int (*dev_count_resources)(struct acpi_iort_node *node); > void (*dev_init_resources)(struct resource *res, > struct acpi_iort_node *node); > - void (*dev_set_proximity)(struct device *dev, > + int (*dev_set_proximity)(struct device *dev, > struct acpi_iort_node *node); > }; > > @@ -1369,8 +1374,11 @@ static int __init iort_add_platform_device(struct acpi_iort_node *node, > if (!pdev) > return -ENOMEM; > > - if (ops->dev_set_proximity) > - ops->dev_set_proximity(&pdev->dev, node); > + if (ops->dev_set_proximity) { > + ret = ops->dev_set_proximity(&pdev->dev, node); > + if (ret) > + goto dev_put; > + } > > count = ops->dev_count_resources(node); > > -- > 2.20.1 > _______________________________________________ linux-arm-kernel mailing list 
* [PATCH v3] ACPI/IORT: Reject platform device creation on NUMA node mapping failure
  2019-04-08 10:46 ` Lorenzo Pieralisi
@ 2019-04-08 15:21 ` Kefeng Wang
  2019-04-16 17:02 ` Lorenzo Pieralisi
  0 siblings, 1 reply; 18+ messages in thread
From: Kefeng Wang @ 2019-04-08 15:21 UTC
To: Lorenzo Pieralisi, Robin Murphy, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel
Cc: Kefeng Wang

In a system where, through IORT firmware mappings, the SMMU device is
mapped to a NUMA node that is not online, the kernel bootstrap results
in the following crash:

  Unable to handle kernel paging request at virtual address 0000000000001388
  Mem abort info:
    ESR = 0x96000004
    Exception class = DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
  [0000000000001388] user address but active_mm is swapper
  Internal error: Oops: 96000004 [#1] SMP
  Modules linked in:
  CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15
  pstate: 80c00009 (Nzcv daif +PAN +UAO)
  pc : __alloc_pages_nodemask+0x13c/0x1068
  lr : __alloc_pages_nodemask+0xdc/0x1068
  ...
  Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
  Call trace:
   __alloc_pages_nodemask+0x13c/0x1068
   new_slab+0xec/0x570
   ___slab_alloc+0x3e0/0x4f8
   __slab_alloc+0x60/0x80
   __kmalloc_node_track_caller+0x10c/0x478
   devm_kmalloc+0x44/0xb0
   pinctrl_bind_pins+0x4c/0x188
   really_probe+0x78/0x2b8
   driver_probe_device+0x64/0x110
   device_driver_attach+0x74/0x98
   __driver_attach+0x9c/0xe8
   bus_for_each_dev+0x84/0xd8
   driver_attach+0x30/0x40
   bus_add_driver+0x170/0x218
   driver_register+0x64/0x118
   __platform_driver_register+0x54/0x60
   arm_smmu_driver_init+0x24/0x2c
   do_one_initcall+0xbc/0x328
   kernel_init_freeable+0x304/0x3ac
   kernel_init+0x18/0x110
   ret_from_fork+0x10/0x1c
  Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804)
  ---[ end trace dfeaed4c373a32da ]--

Change the dev_set_proximity() hook prototype so that it returns a
value and make it return failure if the PXM->NUMA-node mapping
corresponds to an offline node, fixing the crash.

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/linux-arm-kernel/20190315021940.86905-1-wangkefeng.wang@huawei.com/
---
v2->v3:
- Update changelog according to Lorenzo Pieralisi's comment and add acked-by.
 drivers/acpi/arm64/iort.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e48894e002ba..a46c2c162c03 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1232,18 +1232,24 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node)
 /*
  * set numa proximity domain for smmuv3 device
  */
-static void __init arm_smmu_v3_set_proximity(struct device *dev,
+static int __init arm_smmu_v3_set_proximity(struct device *dev,
 					      struct acpi_iort_node *node)
 {
 	struct acpi_iort_smmu_v3 *smmu;
 
 	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
 	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
-		set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
+		int node = acpi_map_pxm_to_node(smmu->pxm);
+
+		if (node != NUMA_NO_NODE && !node_online(node))
+			return -EINVAL;
+
+		set_dev_node(dev, node);
 		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
 			smmu->base_address,
 			smmu->pxm);
 	}
+	return 0;
 }
 #else
 #define arm_smmu_v3_set_proximity NULL
@@ -1318,7 +1324,7 @@ struct iort_dev_config {
 	int (*dev_count_resources)(struct acpi_iort_node *node);
 	void (*dev_init_resources)(struct resource *res,
 				     struct acpi_iort_node *node);
-	void (*dev_set_proximity)(struct device *dev,
+	int (*dev_set_proximity)(struct device *dev,
 				    struct acpi_iort_node *node);
 };
 
@@ -1369,8 +1375,11 @@ static int __init iort_add_platform_device(struct acpi_iort_node *node,
 	if (!pdev)
 		return -ENOMEM;
 
-	if (ops->dev_set_proximity)
-		ops->dev_set_proximity(&pdev->dev, node);
+	if (ops->dev_set_proximity) {
+		ret = ops->dev_set_proximity(&pdev->dev, node);
+		if (ret)
+			goto dev_put;
+	}
 
 	count = ops->dev_count_resources(node);
 
--
2.20.1
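The control flow introduced by v3 can be modelled in userspace as a rough sketch. Nothing below is kernel API: `struct model_numa`, `struct model_device`, `model_set_proximity()` and `model_add_platform_device()` are hypothetical names, `MODEL_NO_NODE` plays the role of `NUMA_NO_NODE`, and the range checks on `pxm` and `node` are extra guards that the kernel patch does not contain. The sketch only shows the idea that a PXM mapped to an offline node makes the hook fail, and that the failure stops device creation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NO_NODE	(-1)	/* plays the role of NUMA_NO_NODE */
#define MODEL_MAX_NODES	4

/* Hypothetical NUMA state: online mask and PXM-to-node mapping. */
struct model_numa {
	bool online[MODEL_MAX_NODES];
	int pxm_to_node[MODEL_MAX_NODES];	/* MODEL_NO_NODE if unmapped */
};

struct model_device {
	int numa_node;
};

/* Hook: fails with -EINVAL when the PXM maps to an offline node. */
static int model_set_proximity(const struct model_numa *numa,
			       struct model_device *dev, int pxm)
{
	int node;

	assert(numa != NULL && dev != NULL);

	/* Guard added for the sketch only, not part of the patch. */
	if (pxm < 0 || pxm >= MODEL_MAX_NODES)
		return -EINVAL;

	node = numa->pxm_to_node[pxm];
	if (node != MODEL_NO_NODE &&
	    (node < 0 || node >= MODEL_MAX_NODES || !numa->online[node]))
		return -EINVAL;

	dev->numa_node = node;
	return 0;
}

/* Caller: propagates the hook failure instead of ignoring it. */
int model_add_platform_device(const struct model_numa *numa,
			      struct model_device *dev, int pxm)
{
	int ret;

	dev->numa_node = MODEL_NO_NODE;

	ret = model_set_proximity(numa, dev, pxm);
	if (ret)
		return ret;	/* the patch does "goto dev_put" here */

	return 0;
}
```

With only node 0 online and PXM 1 mapped to node 1, `model_add_platform_device()` returns `-EINVAL` and leaves the device without a node, which mirrors the case that previously crashed in the page allocator.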
* Re: [PATCH v3] ACPI/IORT: Reject platform device creation on NUMA node mapping failure
  2019-04-08 15:21 ` [PATCH v3] ACPI/IORT: Reject platform device creation on NUMA node mapping failure Kefeng Wang
@ 2019-04-16 17:02 ` Lorenzo Pieralisi
  2019-04-16 17:05 ` Will Deacon
  0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Pieralisi @ 2019-04-16 17:02 UTC
To: Kefeng Wang, will.deacon
Cc: rjw, Robin Murphy, linux-acpi, linux-arm-kernel, Sudeep Holla

[+Will]

Hi Will,

there is not enough material for an IORT pull request this cycle but
this patch should be merged and IORT code goes usually via arm64, can
you pick it up or if you prefer I can resend it on LAKML and CC you in
if it makes it any simpler ?

Thanks,
Lorenzo

On Mon, Apr 08, 2019 at 11:21:12PM +0800, Kefeng Wang wrote:
> In a system where, through IORT firmware mappings, the SMMU device is
> mapped to a NUMA node that is not online, the kernel bootstrap results
> in the following crash:
>
>   Unable to handle kernel paging request at virtual address 0000000000001388
>   Mem abort info:
>     ESR = 0x96000004
>     Exception class = DABT (current EL), IL = 32 bits
>     SET = 0, FnV = 0
>     EA = 0, S1PTW = 0
>   Data abort info:
>     ISV = 0, ISS = 0x00000004
>     CM = 0, WnR = 0
>   [0000000000001388] user address but active_mm is swapper
>   Internal error: Oops: 96000004 [#1] SMP
>   Modules linked in:
>   CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #15
>   pstate: 80c00009 (Nzcv daif +PAN +UAO)
>   pc : __alloc_pages_nodemask+0x13c/0x1068
>   lr : __alloc_pages_nodemask+0xdc/0x1068
>   ...
> Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) > Call trace: > __alloc_pages_nodemask+0x13c/0x1068 > new_slab+0xec/0x570 > ___slab_alloc+0x3e0/0x4f8 > __slab_alloc+0x60/0x80 > __kmalloc_node_track_caller+0x10c/0x478 > devm_kmalloc+0x44/0xb0 > pinctrl_bind_pins+0x4c/0x188 > really_probe+0x78/0x2b8 > driver_probe_device+0x64/0x110 > device_driver_attach+0x74/0x98 > __driver_attach+0x9c/0xe8 > bus_for_each_dev+0x84/0xd8 > driver_attach+0x30/0x40 > bus_add_driver+0x170/0x218 > driver_register+0x64/0x118 > __platform_driver_register+0x54/0x60 > arm_smmu_driver_init+0x24/0x2c > do_one_initcall+0xbc/0x328 > kernel_init_freeable+0x304/0x3ac > kernel_init+0x18/0x110 > ret_from_fork+0x10/0x1c > Code: f90013b5 b9410fa1 1a9f0694 b50014c2 (b9400804) > ---[ end trace dfeaed4c373a32da ]-- > > Change the dev_set_proximity() hook prototype so that it returns a > value and make it return failure if the PXM->NUMA-node mapping > corresponds to an offline node, fixing the crash. > > Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> > Link: https://lore.kernel.org/linux-arm-kernel/20190315021940.86905-1-wangkefeng.wang@huawei.com/ > --- > v2->v3: > -Update changelog according to Lorenzo Pieralisi's comment and add acked-by. 
> > drivers/acpi/arm64/iort.c | 19 ++++++++++++++----- > 1 file changed, 14 insertions(+), 5 deletions(-) > > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c > index e48894e002ba..a46c2c162c03 100644 > --- a/drivers/acpi/arm64/iort.c > +++ b/drivers/acpi/arm64/iort.c > @@ -1232,18 +1232,24 @@ static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node) > /* > * set numa proximity domain for smmuv3 device > */ > -static void __init arm_smmu_v3_set_proximity(struct device *dev, > +static int __init arm_smmu_v3_set_proximity(struct device *dev, > struct acpi_iort_node *node) > { > struct acpi_iort_smmu_v3 *smmu; > > smmu = (struct acpi_iort_smmu_v3 *)node->node_data; > if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) { > - set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm)); > + int node = acpi_map_pxm_to_node(smmu->pxm); > + > + if (node != NUMA_NO_NODE && !node_online(node)) > + return -EINVAL; > + > + set_dev_node(dev, node); > pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n", > smmu->base_address, > smmu->pxm); > } > + return 0; > } > #else > #define arm_smmu_v3_set_proximity NULL > @@ -1318,7 +1324,7 @@ struct iort_dev_config { > int (*dev_count_resources)(struct acpi_iort_node *node); > void (*dev_init_resources)(struct resource *res, > struct acpi_iort_node *node); > - void (*dev_set_proximity)(struct device *dev, > + int (*dev_set_proximity)(struct device *dev, > struct acpi_iort_node *node); > }; > > @@ -1369,8 +1375,11 @@ static int __init iort_add_platform_device(struct acpi_iort_node *node, > if (!pdev) > return -ENOMEM; > > - if (ops->dev_set_proximity) > - ops->dev_set_proximity(&pdev->dev, node); > + if (ops->dev_set_proximity) { > + ret = ops->dev_set_proximity(&pdev->dev, node); > + if (ret) > + goto dev_put; > + } > > count = ops->dev_count_resources(node); > > -- > 2.20.1 > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org 
* Re: [PATCH v3] ACPI/IORT: Reject platform device creation on NUMA node mapping failure
  2019-04-16 17:02 ` Lorenzo Pieralisi
@ 2019-04-16 17:05 ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2019-04-16 17:05 UTC
To: Lorenzo Pieralisi
Cc: Kefeng Wang, rjw, linux-acpi, Sudeep Holla, Robin Murphy, linux-arm-kernel

On Tue, Apr 16, 2019 at 06:02:20PM +0100, Lorenzo Pieralisi wrote:
> [+Will]
>
> there is not enough material for an IORT pull request this cycle but
> this patch should be merged and IORT code goes usually via arm64, can
> you pick it up or if you prefer I can resend it on LAKML and CC you in
> if it makes it any simpler ?
>
> On Mon, Apr 08, 2019 at 11:21:12PM +0800, Kefeng Wang wrote:
> > In a system where, through IORT firmware mappings, the SMMU device is
> > mapped to a NUMA node that is not online, the kernel bootstrap results
> > in the following crash:

Queued for 5.2. Thanks for the heads-up.

Will
* [PATCH 2/2] ACPI: NUMA: show match info about PXM ID and offline/online node
  2019-03-15 2:19 [PATCH 0/2] fix issue when acpi smmuv3 device alloc offline node memory Kefeng Wang
  2019-03-15 2:19 ` [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device Kefeng Wang
@ 2019-03-15 2:19 ` Kefeng Wang
  2019-03-15 8:34 ` Kefeng Wang
  1 sibling, 1 reply; 18+ messages in thread
From: Kefeng Wang @ 2019-03-15 2:19 UTC
To: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel
Cc: Kefeng Wang

It may trigger issues when an ACPI device driver allocates memory from
an offline node, so use the alternative acpi_map_pxm_to_online_node()
to find an online node. Show information about the proximity ID, the
mapped offline node and the nearest online node returned.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 drivers/acpi/numa.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 7bbbf8256a41..064c771a7338 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -121,6 +121,9 @@ int acpi_map_pxm_to_online_node(int pxm)
 			}
 		}
 	}
+	if (min_node != node)
+		pr_warn("IORT: PXM %d Mapped to offline node %d, choose nearest online node %d",
+			pxm, node, min_node);
 
 	return min_node;
 }
--
2.20.1
* Re: [PATCH 2/2] ACPI: NUMA: show match info about PXM ID and offline/online node
  2019-03-15 2:19 ` [PATCH 2/2] ACPI: NUMA: show match info about PXM ID and offline/online node Kefeng Wang
@ 2019-03-15 8:34 ` Kefeng Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Kefeng Wang @ 2019-03-15 8:34 UTC
To: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, rjw, linux-acpi, linux-arm-kernel

On 2019/3/15 10:19, Kefeng Wang wrote:
> An ACPI device driver that allocates memory from an offline node may
> trigger problems, so acpi_map_pxm_to_online_node() can be used as an
> alternative to find an online node. Show information about the proximity
> ID, the mapped offline node, and the nearest online node returned.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  drivers/acpi/numa.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> index 7bbbf8256a41..064c771a7338 100644
> --- a/drivers/acpi/numa.c
> +++ b/drivers/acpi/numa.c
> @@ -121,6 +121,9 @@ int acpi_map_pxm_to_online_node(int pxm)
>  			}
>  		}
>  	}
> +	if (min_node != node)
> +		pr_warn("IORT: PXM %d Mapped to offline node %d, choose nearest online node %d",
> +			pxm, node, min_node);

The IORT prefix should be removed, since other tables such as NFIT and
DMAR also have a PXM field. Waiting for review.

>
>  	return min_node;
>  }
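The nearest-online-node fallback discussed in patch 2/2 can be illustrated with a small userspace sketch. This is not the kernel code: the node count, the online map, the distance table, and the names map_pxm_to_node(), map_pxm_to_online_node(), MAX_NODES and NO_NODE are all hypothetical stand-ins chosen for illustration, and fprintf() takes the place of pr_warn(). The message omits the table-specific prefix, in line with the review comment above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_NODES 4	/* hypothetical node count for this sketch */
#define NO_NODE   (-1)	/* stand-in for the kernel's NUMA_NO_NODE */

/* Hypothetical platform data: online map and a SLIT-like distance table. */
static const bool node_is_online[MAX_NODES] = { true, false, true, false };
static const int node_dist[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 15, 30 },
	{ 30, 15, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Hypothetical PXM-to-node mapping: identity in range, NO_NODE otherwise. */
static int map_pxm_to_node(int pxm)
{
	return (pxm >= 0 && pxm < MAX_NODES) ? pxm : NO_NODE;
}

/* Return the mapped node if it is online, else the closest online node. */
static int map_pxm_to_online_node(int pxm)
{
	int node = map_pxm_to_node(pxm);
	int min_node;

	if (node == NO_NODE)
		node = 0;	/* fall back to node 0 when there is no mapping */
	assert(node >= 0 && node < MAX_NODES);

	min_node = node;
	if (!node_is_online[node]) {
		int min_dist = INT_MAX;

		/* Scan online nodes only and keep the one at minimum distance. */
		for (int n = 0; n < MAX_NODES; n++) {
			if (!node_is_online[n])
				continue;
			if (node_dist[node][n] < min_dist) {
				min_dist = node_dist[node][n];
				min_node = n;
			}
		}
	}

	/* Report the substitution, as the patch proposes for the kernel. */
	if (min_node != node)
		fprintf(stderr,
			"PXM %d mapped to offline node %d, using nearest online node %d\n",
			pxm, node, min_node);

	return min_node;
}
```

With the table assumed here, a PXM that maps to an online node is returned unchanged and no message is printed, while a PXM that maps to an offline node yields the online node at the smallest distance and one warning line on stderr.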
end of thread, other threads:[~2019-04-16 17:05 UTC | newest]

Thread overview: 18+ messages
2019-03-15 2:19 [PATCH 0/2] fix issue when acpi smmuv3 device alloc offline node memory Kefeng Wang
2019-03-15 2:19 ` [PATCH 1/2] ACPI/IORT: set online numa node for smmuv3 device Kefeng Wang
2019-03-20 11:41 ` Robin Murphy
2019-03-20 14:00 ` Lorenzo Pieralisi
2019-03-21 6:08 ` Kefeng Wang
2019-03-27 14:24 ` Kefeng Wang
2019-03-28 11:32 ` Lorenzo Pieralisi
2019-03-28 14:00 ` [PATCH v2] ACPI/IORT: Reject platform dev creation when dev set to wrong numa node Kefeng Wang
2019-03-28 13:59 ` Robin Murphy
2019-03-28 14:29 ` Kefeng Wang
2019-03-29 3:17 ` [PATCH RESEND " Kefeng Wang
2019-04-08 10:42 ` Lorenzo Pieralisi
2019-04-08 10:46 ` Lorenzo Pieralisi
2019-04-08 15:21 ` [PATCH v3] ACPI/IORT: Reject platform device creation on NUMA node mapping failure Kefeng Wang
2019-04-16 17:02 ` Lorenzo Pieralisi
2019-04-16 17:05 ` Will Deacon
2019-03-15 2:19 ` [PATCH 2/2] ACPI: NUMA: show match info about PXM ID and offline/online node Kefeng Wang
2019-03-15 8:34 ` Kefeng Wang