linux-nvdimm.lists.01.org archive mirror
* [PATCH] dax/kmem: refrain from adding memory into an impossible node
@ 2020-04-11  0:09 Vishal Verma
  2020-04-11  2:04 ` Dan Williams
  2020-04-14 11:51 ` David Hildenbrand
  0 siblings, 2 replies; 5+ messages in thread
From: Vishal Verma @ 2020-04-11  0:09 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, Dave Hansen

A misbehaving qemu created a situation where the ACPI SRAT table
advertised one fewer proximity domain than intended, while the NFIT
table did describe all the expected proximity domains. This caused the
device dax driver to assign an impossible target_node to the device.
When the device was hotplugged as system memory, the kernel crashed
with the following signature:

  [  +0.001627] BUG: kernel NULL pointer dereference, address: 0000000000000088
  [  +0.001331] #PF: supervisor read access in kernel mode
  [  +0.000975] #PF: error_code(0x0000) - not-present page
  [  +0.000976] PGD 80000001767d4067 P4D 80000001767d4067 PUD 10e0c4067 PMD 0
  [  +0.001338] Oops: 0000 [#1] SMP PTI
  [  +0.000676] CPU: 4 PID: 22737 Comm: kswapd3 Tainted: G           O      5.6.0-rc5 #9
  [  +0.001457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  [  +0.001990] RIP: 0010:prepare_kswapd_sleep+0x7c/0xc0
  [  +0.000780] Code: 89 df e8 87 fd ff ff 89 c2 31 c0 84 d2 74 e6 0f 1f 44
                      00 00 48 8b 05 fb af 7a 01 48 63 93 88 1d 01 00 48 8b
                      84 d0 20 0f 00 00 <48> 3b 98 88 00 00 00 75 28 f0 80 a0
                      80 00 00 00 fe f0 80 a3 38 20
  [  +0.002877] RSP: 0018:ffffc900017a3e78 EFLAGS: 00010202
  [  +0.000805] RAX: 0000000000000000 RBX: ffff8881209e0000 RCX: 0000000000000000
  [  +0.001115] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffff8881209e0e80
  [  +0.001098] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000008000
  [  +0.001092] R10: 0000000000000000 R11: 0000000000000003 R12: 0000000000000003
  [  +0.001092] R13: 0000000000000003 R14: 0000000000000000 R15: ffffc900017a3ec8
  [  +0.001091] FS:  0000000000000000(0000) GS:ffff888318c00000(0000) knlGS:0000000000000000
  [  +0.001275] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  +0.000882] CR2: 0000000000000088 CR3: 0000000120b50002 CR4: 00000000001606e0
  [  +0.001095] Call Trace:
  [  +0.000388]  kswapd+0x103/0x520
  [  +0.000494]  ? finish_wait+0x80/0x80
  [  +0.000547]  ? balance_pgdat+0x5a0/0x5a0
  [  +0.000607]  kthread+0x120/0x140
  [  +0.000508]  ? kthread_create_on_node+0x60/0x60
  [  +0.000706]  ret_from_fork+0x3a/0x50

Add a check in the kmem driver to ensure that the target_node for the
device in question is set in the possible-nodes mask (node_possible()).

Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 drivers/dax/kmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 3d0a7e702c94..760c5b4e88c8 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -32,7 +32,7 @@ int dev_dax_kmem_probe(struct device *dev)
 	 * unavoidable performance issues.
 	 */
 	numa_node = dev_dax->target_node;
-	if (numa_node < 0) {
+	if (numa_node < 0 || !node_possible(numa_node)) {
 		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
 			 res, numa_node);
 		return -EINVAL;
-- 
2.21.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
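
The effect of the one-line change can be modelled in userspace. The
sketch below is illustrative only, not kernel code: possible_nodes,
model_node_possible() and model_kmem_probe() are hypothetical stand-ins
for the kernel's possible-node mask, node_possible() and the probe
check. It assumes nodes 0-2 are possible and node 3 is not, which is
consistent with the "kswapd3" task in the oops above but is still an
assumption.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 64	/* hypothetical stand-in for MAX_NUMNODES */

/* Hypothetical possible-node mask: bit n set means node n is possible. */
unsigned long long possible_nodes = 0x7;	/* nodes 0, 1 and 2 */

/* Stand-in for node_possible(): true only if the node's bit is set. */
bool model_node_possible(int node)
{
	return node >= 0 && node < MODEL_MAX_NODES &&
	       ((possible_nodes >> node) & 1ULL);
}

/* Mirrors the patched condition: reject negative and impossible nodes. */
int model_kmem_probe(int target_node)
{
	if (target_node < 0 || !model_node_possible(target_node))
		return -EINVAL;
	return 0;
}
```

With the pre-patch condition (numa_node < 0 only), node 3 would pass the
check in this model; with the patched condition it is rejected with
-EINVAL before any memory is added.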

* Re: [PATCH] dax/kmem: refrain from adding memory into an impossible node
  2020-04-11  0:09 [PATCH] dax/kmem: refrain from adding memory into an impossible node Vishal Verma
@ 2020-04-11  2:04 ` Dan Williams
  2020-04-14 11:51 ` David Hildenbrand
  1 sibling, 0 replies; 5+ messages in thread
From: Dan Williams @ 2020-04-11  2:04 UTC (permalink / raw)
  To: Vishal Verma; +Cc: linux-nvdimm, Linux MM, Dave Hansen

On Fri, Apr 10, 2020 at 5:09 PM Vishal Verma <vishal.l.verma@intel.com> wrote:
>
> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> index 3d0a7e702c94..760c5b4e88c8 100644
> --- a/drivers/dax/kmem.c
> +++ b/drivers/dax/kmem.c
> @@ -32,7 +32,7 @@ int dev_dax_kmem_probe(struct device *dev)
>          * unavoidable performance issues.
>          */
>         numa_node = dev_dax->target_node;
> -       if (numa_node < 0) {
> +       if (numa_node < 0 || !node_possible(numa_node)) {

Looks good.

Additionally, I think we should also fix this at the other end: have
the nfit driver validate that the proximity domain values it translates
to numa nodes are in the possible set and, if not, fall back to
acpi_map_pxm_to_online_node(), i.e. "if impossible, fall back to
closest". That way this failing config will start working, albeit with
the wrong numa node, but that's the firmware's problem. See the calls
to acpi_map_pxm_to_node() in drivers/acpi/nfit/core.c.
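
The "if impossible, fall back to closest" idea can be sketched in
userspace as follows. This is only an illustration of the policy, not
the nfit driver or the ACPI helpers named above: the node_is_possible[]
table, the distance[][] matrix and closest_possible_node() are all
hypothetical, and the distance values are made up.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_NODES 4
#define NO_NODE (-1)

/* Hypothetical configuration: node 3 is missing from the possible set. */
const bool node_is_possible[NR_NODES] = { true, true, true, false };

/* Hypothetical NUMA distance matrix (smaller means closer). */
const int distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 21 },
	{ 30, 20, 10, 15 },
	{ 40, 21, 15, 10 },
};

/*
 * Return @node if it is possible, otherwise the possible node with the
 * smallest distance to it. Returns NO_NODE for out-of-range input or
 * when no possible node exists.
 */
int closest_possible_node(int node)
{
	int best = NO_NODE;
	int best_dist = INT_MAX;

	if (node < 0 || node >= NR_NODES)
		return NO_NODE;
	if (node_is_possible[node])
		return node;

	for (int n = 0; n < NR_NODES; n++) {
		if (!node_is_possible[n])
			continue;
		if (distance[node][n] < best_dist) {
			best_dist = distance[node][n];
			best = n;
		}
	}
	return best;
}
```

In this model the impossible node 3 resolves to node 2, its nearest
possible neighbour, so the memory would come online on a valid (if not
ideal) node instead of failing.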

* Re: [PATCH] dax/kmem: refrain from adding memory into an impossible node
  2020-04-11  0:09 [PATCH] dax/kmem: refrain from adding memory into an impossible node Vishal Verma
  2020-04-11  2:04 ` Dan Williams
@ 2020-04-14 11:51 ` David Hildenbrand
  2020-04-14 17:58   ` Dan Williams
  1 sibling, 1 reply; 5+ messages in thread
From: David Hildenbrand @ 2020-04-14 11:51 UTC (permalink / raw)
  To: Vishal Verma, linux-nvdimm; +Cc: linux-mm, Dave Hansen

On 11.04.20 02:09, Vishal Verma wrote:
> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> index 3d0a7e702c94..760c5b4e88c8 100644
> --- a/drivers/dax/kmem.c
> +++ b/drivers/dax/kmem.c
> @@ -32,7 +32,7 @@ int dev_dax_kmem_probe(struct device *dev)
>  	 * unavoidable performance issues.
>  	 */
>  	numa_node = dev_dax->target_node;
> -	if (numa_node < 0) {
> +	if (numa_node < 0 || !node_possible(numa_node)) {
>  		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
>  			 res, numa_node);
>  		return -EINVAL;
> 

I do wonder if we should reject that from
add_memory()..->add_memory_resource() instead, where we do the
__try_online_node().

-- 
Thanks,

David / dhildenb

* Re: [PATCH] dax/kmem: refrain from adding memory into an impossible node
  2020-04-14 11:51 ` David Hildenbrand
@ 2020-04-14 17:58   ` Dan Williams
  2020-04-14 18:03     ` David Hildenbrand
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2020-04-14 17:58 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: linux-nvdimm, Linux MM, Dave Hansen

On Tue, Apr 14, 2020 at 4:51 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 11.04.20 02:09, Vishal Verma wrote:
> > diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> > index 3d0a7e702c94..760c5b4e88c8 100644
> > --- a/drivers/dax/kmem.c
> > +++ b/drivers/dax/kmem.c
> > @@ -32,7 +32,7 @@ int dev_dax_kmem_probe(struct device *dev)
> >        * unavoidable performance issues.
> >        */
> >       numa_node = dev_dax->target_node;
> > -     if (numa_node < 0) {
> > +     if (numa_node < 0 || !node_possible(numa_node)) {
> >               dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
> >                        res, numa_node);
> >               return -EINVAL;
> >
>
> I do wonder if we should reject that from
> add_memory()..->add_memory_resource() instead, where we do the
> __try_online_node().

Yes, makes sense to centralize that check internal to
add_memory_resource(). However, instead of a failure let's just pick
the next "closest" possible node with a firmware-workaround
taint-warning to let the admin know when their added memory has an
awkward numa node, but otherwise let the memory come online.
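
A rough userspace model of that proposal, with the check centralized in
one add path and a warning instead of a failure, might look like the
sketch below. Everything in it is hypothetical: model_add_memory_resource()
only stands in for the real add_memory_resource(), "closest" is
approximated by the difference in node numbers rather than real NUMA
distances, and the firmware_workaround_taint flag merely models a
firmware-workaround taint.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_NODES 4

/* Hypothetical configuration: node 3 is not in the possible set. */
const bool possible[NR_NODES] = { true, true, true, false };

/* Models a firmware-workaround taint; set when a fallback is used. */
bool firmware_workaround_taint;

/*
 * Central check shared by every caller that adds memory.
 * Returns the node actually used, or -EINVAL if none can be found.
 */
int model_add_memory_resource(int nid)
{
	int best = -1;

	if (nid < 0 || nid >= NR_NODES)
		return -EINVAL;
	if (possible[nid])
		return nid;

	/* Assumption: the numerically nearest possible node is "closest". */
	for (int n = 0; n < NR_NODES; n++) {
		if (!possible[n])
			continue;
		if (best < 0 || abs(n - nid) < abs(best - nid))
			best = n;
	}
	if (best < 0)
		return -EINVAL;

	/* Let the admin know the memory landed on an awkward node. */
	firmware_workaround_taint = true;
	fprintf(stderr, "node %d is not possible, using node %d instead\n",
		nid, best);
	return best;
}
```

The design point is that callers such as the kmem driver would not need
their own check: a bad node from firmware is repaired, and reported, in
one place.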

* Re: [PATCH] dax/kmem: refrain from adding memory into an impossible node
  2020-04-14 17:58   ` Dan Williams
@ 2020-04-14 18:03     ` David Hildenbrand
  0 siblings, 0 replies; 5+ messages in thread
From: David Hildenbrand @ 2020-04-14 18:03 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, Dave Hansen

On 14.04.20 19:58, Dan Williams wrote:
> On Tue, Apr 14, 2020 at 4:51 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 11.04.20 02:09, Vishal Verma wrote:
>>> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
>>> index 3d0a7e702c94..760c5b4e88c8 100644
>>> --- a/drivers/dax/kmem.c
>>> +++ b/drivers/dax/kmem.c
>>> @@ -32,7 +32,7 @@ int dev_dax_kmem_probe(struct device *dev)
>>>        * unavoidable performance issues.
>>>        */
>>>       numa_node = dev_dax->target_node;
>>> -     if (numa_node < 0) {
>>> +     if (numa_node < 0 || !node_possible(numa_node)) {
>>>               dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
>>>                        res, numa_node);
>>>               return -EINVAL;
>>>
>>
>> I do wonder if we should reject that from
>> add_memory()..->add_memory_resource() instead, where we do the
>> __try_online_node().
> 
> Yes, makes sense to centralize that check internal to
> add_memory_resource(). However, instead of a failure let's just pick
> the next "closest" possible node with a firmware-workaround
> taint-warning to let the admin know when their added memory has an
> awkward numa node, but otherwise let the memory come online.
> 

With a warning, this makes sense.

-- 
Thanks,

David / dhildenb
