linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] pseries/drmem: update LMBs after LPM
@ 2021-05-04  9:20 Laurent Dufour
  2021-05-04 22:30 ` Nathan Lynch
  0 siblings, 1 reply; 3+ messages in thread
From: Laurent Dufour @ 2021-05-04  9:20 UTC (permalink / raw)
  To: mpe, benh, paulus
  Cc: nathanl, linux-kernel, linuxppc-dev, Aneesh Kumar K . V, Tyrel Datwyler

After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is caught by the kernel, but the memory's node is updated because
there is no way to move a memory block between nodes.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node to match the added or removed LMB. But
the LMB's associativity node has not been updated after the DT node update
and thus the node is overwritten by the Linux's topology instead of the
hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity. However, ignore the
call to that hook when the update has been triggered by drmem_update_dt().
Because, in that case, the LMB tree has been used to set the DT property
and thus it doesn't need to be updated back. Since drmem_update_dt() is
called under the protection of the device_hotplug_lock and the hook is
called in the same context, use a simple boolean variable to detect that
call.

Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---

V4:
 - Prevent the LMB to be updated back in the case the request came from the
 LMB tree's update.
V3:
 - Check rd->dn->name instead of rd->dn->full_name
V2:
 - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
 introducing a new hook mechanism.
---
 arch/powerpc/include/asm/drmem.h              |  1 +
 arch/powerpc/mm/drmem.c                       | 46 +++++++++++++++++++
 .../platforms/pseries/hotplug-memory.c        |  4 ++
 3 files changed, 51 insertions(+)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index bf2402fed3e0..4265d5e95c2c 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -111,6 +111,7 @@ int drmem_update_dt(void);
 int __init
 walk_drmem_lmbs_early(unsigned long node, void *data,
 		      int (*func)(struct drmem_lmb *, const __be32 **, void *));
+void drmem_update_lmbs(struct property *prop);
 #endif
 
 static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 9af3832c9d8d..22197b18d85e 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -18,6 +18,7 @@ static int n_root_addr_cells, n_root_size_cells;
 
 static struct drmem_lmb_info __drmem_info;
 struct drmem_lmb_info *drmem_info = &__drmem_info;
+static bool in_drmem_update;
 
 u64 drmem_lmb_memory_max(void)
 {
@@ -178,6 +179,11 @@ int drmem_update_dt(void)
 	if (!memory)
 		return -1;
 
+	/*
+	 * Set in_drmem_update to prevent the notifier callback to process the
+	 * DT property back since the change is coming from the LMB tree.
+	 */
+	in_drmem_update = true;
 	prop = of_find_property(memory, "ibm,dynamic-memory", NULL);
 	if (prop) {
 		rc = drmem_update_dt_v1(memory, prop);
@@ -186,6 +192,7 @@ int drmem_update_dt(void)
 		if (prop)
 			rc = drmem_update_dt_v2(memory, prop);
 	}
+	in_drmem_update = false;
 
 	of_node_put(memory);
 	return rc;
@@ -307,6 +314,45 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
 	return ret;
 }
 
+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+		      __maybe_unused const __be32 **usm,
+		      __maybe_unused void *data)
+{
+	struct drmem_lmb *lmb;
+
+	for_each_drmem_lmb(lmb) {
+		if (lmb->drc_index != updated_lmb->drc_index)
+			continue;
+
+		lmb->aa_index = updated_lmb->aa_index;
+		break;
+	}
+	return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(struct property *prop)
+{
+	/*
+	 * Don't update the LMBs if triggered by the update done in
+	 * drmem_update_dt(), the LMB values have been used to the update the DT
+	 * property in that case.
+	 */
+	if (in_drmem_update)
+		return;
+	if (!strcmp(prop->name, "ibm,dynamic-memory"))
+		__walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
+	else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
+		__walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
+}
 #endif
 
 static int init_drmem_lmb_size(struct device_node *dn)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..672ffbee2e78 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
 	case OF_RECONFIG_DETACH_NODE:
 		err = pseries_remove_mem_node(rd->dn);
 		break;
+	case OF_RECONFIG_UPDATE_PROPERTY:
+		if (!strcmp(rd->dn->name,
+			    "ibm,dynamic-reconfiguration-memory"))
+			drmem_update_lmbs(rd->prop);
 	}
 	return notifier_from_errno(err);
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v4] pseries/drmem: update LMBs after LPM
  2021-05-04  9:20 [PATCH v4] pseries/drmem: update LMBs after LPM Laurent Dufour
@ 2021-05-04 22:30 ` Nathan Lynch
  2021-05-05 14:39   ` Laurent Dufour
  0 siblings, 1 reply; 3+ messages in thread
From: Nathan Lynch @ 2021-05-04 22:30 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: linux-kernel, linuxppc-dev, Aneesh Kumar K . V, Tyrel Datwyler,
	mpe, benh, paulus

Hi Laurent,

Bear with me while I work through the commit message:

Laurent Dufour <ldufour@linux.ibm.com> writes:
> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
> updated by the hypervisor in the case the NUMA topology of the LPAR's
> memory is updated.

Yes, the RTAS functions ibm,update-nodes and ibm,update-properties,
which the OS invokes after resuming, may bring in updated properties
under the ibm,dynamic-reconfiguration-memory node, including the
ibm,associativity-lookup-arrays property.

> This is caught by the kernel,

"Caught" makes me think this is an error condition, as in catching an
exception. I guess "handled" better conveys your meaning?

> but the memory's node is updated because
> there is no way to move a memory block between nodes.

"The memory's node" refers the ibm,dynamic-reconfiguration-memory DT
node, yes? Or is it referring to Linux's NUMA nodes? ("move a memory
block between nodes" in your statement here refers to Linux's NUMA
nodes, that much is clear to me.)

I am failing to follow the cause->effect relationship stated. True,
changing a block's node assignment while it's in use isn't safe. I don't
see why that implies that "the memory's node is updated"? In fact this
seems contradictory.

This statement makes more sense to me if I change it to "the memory's
node is _not_ updated" -- is this what you intended?

> If later a memory block is added or removed, drmem_update_dt() is called
> and it is overwriting the DT node to match the added or removed LMB.

I understand this, but I will expand on it.

dlpar_memory()
  -> dlpar_memory_add_by_count()
    -> dlpar_add_lmb()
      -> update_lmb_associativity_index()
        ... lmb->aa_index = <value>
  -> drmem_update_dt()

update_lmb_associativity_index() retrieves the firmware description of
the new block, and sets the aa_index of the matching entry in the
drmem_info array to the value matching the firmware description.

Then, drmem_update_dt() walks the drmem_info array and synthesizes a new
/ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2 property based
on the recently updated information in that array.

> But the LMB's associativity node has not been updated after the DT
> node update and thus the node is overwritten by the Linux's topology
> instead of the hypervisor one.

So, an example of the problem is:

1. VM migrates. On resume, ibm,associativity-lookup-arrays is changed
   via ibm,update-properties. Entries in the drmem_info array remain
   unchanged, with aa_index values that correspond to the source
   system's ibm,associativity-lookup-arrays property, now inaccessible.

2. A memory block is added. We look up the new block's entry in the
   drmem_info array, and set the aa_index to the value matching the
   current ibm,associativity-lookup-arrays.

3. Then, the ibm,associativity-lookup-arrays property is completely
   regenerated from the drmem_info array, which reflects a mixture of
   information from the source and destination systems.

Do I understand correctly?


> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
> updated to force an update of the LMB's associativity. However, ignore the
> call to that hook when the update has been triggered by drmem_update_dt().
> Because, in that case, the LMB tree has been used to set the DT property
> and thus it doesn't need to be updated back. Since drmem_update_dt() is
> called under the protection of the device_hotplug_lock and the hook is
> called in the same context, use a simple boolean variable to detect that
> call.

This strikes me as almost a revert of e978a3ccaa71 ("powerpc/pseries:
remove obsolete memory hotplug DT notifier code").

I'd rather avoid smuggling through global state information that ought
to be passed in function parameters, if it should be passed around at
all. Despite having (IMO) relatively simple responsibilities, this code
is difficult to change and review; adding this property makes it
worse. If the structure of the code is pushing us toward this kind of
compromise, then the code probably needs more fundamental changes.

I'm probably forgetting something -- can anyone remind me why we need an
array of these:

struct drmem_lmb {
	u64     base_addr;
	u32     drc_index;
	u32     aa_index;
	u32     flags;
};

which is just a less efficient representation of what's already in the
device tree? If we got rid of it, would this problem disappear?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v4] pseries/drmem: update LMBs after LPM
  2021-05-04 22:30 ` Nathan Lynch
@ 2021-05-05 14:39   ` Laurent Dufour
  0 siblings, 0 replies; 3+ messages in thread
From: Laurent Dufour @ 2021-05-05 14:39 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Tyrel Datwyler, linux-kernel, paulus, Aneesh Kumar K . V, linuxppc-dev

Le 05/05/2021 à 00:30, Nathan Lynch a écrit :
> Hi Laurent,

Hi Nathan,

Thanks for your review.

> Bear with me while I work through the commit message:
> 
> Laurent Dufour <ldufour@linux.ibm.com> writes:
>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>> memory is updated.
> 
> Yes, the RTAS functions ibm,update-nodes and ibm,update-properties,
> which the OS invokes after resuming, may bring in updated properties
> under the ibm,dynamic-reconfiguration-memory node, including the
> ibm,associativity-lookup-arrays property.
> 
>> This is caught by the kernel,
> 
> "Caught" makes me think this is an error condition, as in catching an
> exception. I guess "handled" better conveys your meaning?

ok

> 
>> but the memory's node is updated because
>> there is no way to move a memory block between nodes.
> 
> "The memory's node" refers the ibm,dynamic-reconfiguration-memory DT
> node, yes? Or is it referring to Linux's NUMA nodes? ("move a memory
> block between nodes" in your statement here refers to Linux's NUMA
> nodes, that much is clear to me.)
> 
> I am failing to follow the cause->effect relationship stated. True,
> changing a block's node assignment while it's in use isn't safe. I don't
> see why that implies that "the memory's node is updated"? In fact this
> seems contradictory.
> 
> This statement makes more sense to me if I change it to "the memory's
> node is _not_ updated" -- is this what you intended?

Correct, I dropped the 'not' word here ;)

> 
>> If later a memory block is added or removed, drmem_update_dt() is called
>> and it is overwriting the DT node to match the added or removed LMB.
> 
> I understand this, but I will expand on it.
> 
> dlpar_memory()
>    -> dlpar_memory_add_by_count()
>      -> dlpar_add_lmb()
>        -> update_lmb_associativity_index()
>          ... lmb->aa_index = <value>
>    -> drmem_update_dt()
> 
> update_lmb_associativity_index() retrieves the firmware description of
> the new block, and sets the aa_index of the matching entry in the
> drmem_info array to the value matching the firmware description.
> 
> Then, drmem_update_dt() walks the drmem_info array and synthesizes a new
> /ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2 property based
> on the recently updated information in that array.

Yes

> 
>> But the LMB's associativity node has not been updated after the DT
>> node update and thus the node is overwritten by the Linux's topology
>> instead of the hypervisor one.
> 
> So, an example of the problem is:
> 
> 1. VM migrates. On resume, ibm,associativity-lookup-arrays is changed
>     via ibm,update-properties. Entries in the drmem_info array remain
>     unchanged, with aa_index values that correspond to the source
>     system's ibm,associativity-lookup-arrays property, now inaccessible.
> 
> 2. A memory block is added. We look up the new block's entry in the
>     drmem_info array, and set the aa_index to the value matching the
>     current ibm,associativity-lookup-arrays.
> 
> 3. Then, the ibm,associativity-lookup-arrays property is completely
>     regenerated from the drmem_info array, which reflects a mixture of
>     information from the source and destination systems.
> 
> Do I understand correctly?

Yes

> 
> 
>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>> updated to force an update of the LMB's associativity. However, ignore the
>> call to that hook when the update has been triggered by drmem_update_dt().
>> Because, in that case, the LMB tree has been used to set the DT property
>> and thus it doesn't need to be updated back. Since drmem_update_dt() is
>> called under the protection of the device_hotplug_lock and the hook is
>> called in the same context, use a simple boolean variable to detect that
>> call.
> 
> This strikes me as almost a revert of e978a3ccaa71 ("powerpc/pseries:
> remove obsolete memory hotplug DT notifier code").

Not really identical to reverting e978a3ccaa71, here only the aa_index of the 
LMB is updated, everything else is kept in place. I don't try to apply the 
memory layout's changes, just updating the in use LMB's aa_index field.

The only matching point with the code reverted by the commit you mentioned would 
be the use of a global variable in_drmem_update instead of the previous 
rtas_hp_event to prevent the LMB tree to be updated again during memory hot plug 
event.

> I'd rather avoid smuggling through global state information that ought
> to be passed in function parameters, if it should be passed around at
> all. Despite having (IMO) relatively simple responsibilities, this code
> is difficult to change and review; adding this property makes it
> worse. If the structure of the code is pushing us toward this kind of
> compromise, then the code probably needs more fundamental changes.
> 
> I'm probably forgetting something -- can anyone remind me why we need an
> array of these:
> 
> struct drmem_lmb {
> 	u64     base_addr;
> 	u32     drc_index;
> 	u32     aa_index;
> 	u32     flags;
> };
> 
> which is just a less efficient representation of what's already in the
> device tree? If we got rid of it, would this problem disappear?

I don't think this is right for the moment, at first, we should robustify the 
DLPAR and LPM operations.



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-05-05 14:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-04  9:20 [PATCH v4] pseries/drmem: update LMBs after LPM Laurent Dufour
2021-05-04 22:30 ` Nathan Lynch
2021-05-05 14:39   ` Laurent Dufour

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).