* [PATCH 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
@ 2020-02-28  6:04 Leonardo Bras
  2020-03-04  4:43   ` Bharata B Rao
  0 siblings, 1 reply; 7+ messages in thread
From: Leonardo Bras @ 2020-02-28  6:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Hari Bathini, Leonardo Bras,
	Christophe Leroy, Thomas Gleixner, Claudio Carvalho, mdroth
  Cc: linuxppc-dev, linux-kernel

When provisioning guests, it's desirable to resize their memory on demand.

Currently, this can be done by creating a guest with a small base
memory, hot-plugging all the rest, and using the 'movable_node' kernel
command-line parameter, which puts all hot-plugged memory in
ZONE_MOVABLE, allowing it to be removed whenever needed.

But there is an issue regarding guest reboot:
If memory is hot-plugged and the guest is then rebooted, all hot-plugged
memory goes to ZONE_NORMAL, which offers no guaranteed hot-removal.
This usually prevents the memory from being hot-removed from the guest.

It's possible to use device-tree information to fix that behavior, as
it stores flags for LMB ranges in ibm,dynamic-memory-vN.
It involves marking each memblock with the correct flags as hotpluggable
memory, which mm/memblock.c puts in ZONE_MOVABLE during boot if
'movable_node' is passed.

For base memory, QEMU assigns these flags to its LMBs:
(DRCONF_MEM_AI_INVALID | DRCONF_MEM_RESERVED)
For hot-plugged memory, it assigns (DRCONF_MEM_ASSIGNED).

While the guest kernel reads the device-tree, early_init_drmem_lmb() is
called for every LMB, doing nothing for base memory and adding
memblocks for hot-plugged memory. Base memory is skipped here:

if ((lmb->flags & DRCONF_MEM_RESERVED) ||
    !(lmb->flags & DRCONF_MEM_ASSIGNED))
	return;

Marking memblocks added by this function as hotpluggable memory
is enough to get the desired behavior, and should cause no change
if the 'movable_node' parameter is not passed to the kernel.

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
 arch/powerpc/kernel/prom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 6620f37abe73..f4d14c67bf53 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -518,6 +518,8 @@ static void __init early_init_drmem_lmb(struct drmem_lmb *lmb,
 		DBG("Adding: %llx -> %llx\n", base, size);
 		if (validate_mem_limit(base, &size))
 			memblock_add(base, size);
+
+		early_init_dt_mark_hotplug_memory_arch(base, size);
 	} while (--rngs);
 }
 #endif /* CONFIG_PPC_PSERIES */
-- 
2.24.1

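For readers following the flag logic, here is a minimal standalone sketch
of the skip rule quoted in the commit message above. It is not kernel code:
the helper names lmb_would_be_added() and lmb_filter_selfcheck() are
hypothetical, and the numeric flag values are assumptions made for this
sketch rather than something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values are assumptions for this sketch, not taken from the thread. */
#define DRCONF_MEM_ASSIGNED	0x00000008u
#define DRCONF_MEM_AI_INVALID	0x00000040u
#define DRCONF_MEM_RESERVED	0x00000080u

/*
 * Hypothetical helper: returns true when the skip rule quoted in the
 * commit message would NOT return early, i.e. the LMB gets added.
 */
static bool lmb_would_be_added(uint32_t flags)
{
	if ((flags & DRCONF_MEM_RESERVED) || !(flags & DRCONF_MEM_ASSIGNED))
		return false;
	return true;
}

/* Walks through the two flag combinations the commit message attributes to QEMU. */
static void lmb_filter_selfcheck(void)
{
	/* Base memory: AI_INVALID | RESERVED, so it is skipped. */
	assert(!lmb_would_be_added(DRCONF_MEM_AI_INVALID | DRCONF_MEM_RESERVED));
	/* Hot-plugged memory: ASSIGNED only, so it is added. */
	assert(lmb_would_be_added(DRCONF_MEM_ASSIGNED));
}
```

Calling lmb_filter_selfcheck() from a small main() exercises both cases;
only LMBs that pass this filter would reach the new
early_init_dt_mark_hotplug_memory_arch() call in the patch.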

* Re: [PATCH 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
  2020-02-28  6:04 [PATCH 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests Leonardo Bras
@ 2020-03-04  4:43   ` Bharata B Rao
  0 siblings, 0 replies; 7+ messages in thread
From: Bharata B Rao @ 2020-03-04  4:43 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Hari Bathini, Christophe Leroy,
	Thomas Gleixner, Claudio Carvalho, Michael Roth, linuxppc-dev,
	linux-kernel, arbab, ndfont

On Fri, Feb 28, 2020 at 11:36 AM Leonardo Bras <leonardo@linux.ibm.com> wrote:
>
> When provisioning guests, it's desirable to resize their memory on demand.
>
> Currently, this can be done by creating a guest with a small base
> memory, hot-plugging all the rest, and using the 'movable_node' kernel
> command-line parameter, which puts all hot-plugged memory in
> ZONE_MOVABLE, allowing it to be removed whenever needed.
>
> But there is an issue regarding guest reboot:
> If memory is hot-plugged and the guest is then rebooted, all hot-plugged
> memory goes to ZONE_NORMAL, which offers no guaranteed hot-removal.
> This usually prevents the memory from being hot-removed from the guest.
>
> It's possible to use device-tree information to fix that behavior, as
> it stores flags for LMB ranges in ibm,dynamic-memory-vN.
> It involves marking each memblock with the correct flags as hotpluggable
> memory, which mm/memblock.c puts in ZONE_MOVABLE during boot if
> 'movable_node' is passed.
>
> For base memory, QEMU assigns these flags to its LMBs:
> (DRCONF_MEM_AI_INVALID | DRCONF_MEM_RESERVED)
> For hot-plugged memory, it assigns (DRCONF_MEM_ASSIGNED).
>
> While the guest kernel reads the device-tree, early_init_drmem_lmb() is
> called for every LMB, doing nothing for base memory and adding
> memblocks for hot-plugged memory. Base memory is skipped here:
>
> if ((lmb->flags & DRCONF_MEM_RESERVED) ||
>     !(lmb->flags & DRCONF_MEM_ASSIGNED))
>         return;
>
> Marking memblocks added by this function as hotpluggable memory
> is enough to get the desired behavior, and should cause no change
> if the 'movable_node' parameter is not passed to the kernel.
>
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
>  arch/powerpc/kernel/prom.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 6620f37abe73..f4d14c67bf53 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -518,6 +518,8 @@ static void __init early_init_drmem_lmb(struct drmem_lmb *lmb,
>                 DBG("Adding: %llx -> %llx\n", base, size);
>                 if (validate_mem_limit(base, &size))
>                         memblock_add(base, size);
> +
> +               early_init_dt_mark_hotplug_memory_arch(base, size);

Hi,

I tried this a few years back
(https://patchwork.ozlabs.org/patch/800142/) and didn't pursue it
further because at that time, it was felt that the approach might not
work for PowerVM guests, because all the present memory except RMA
gets marked as hot-pluggable by PowerVM. This discussion is not
present in the above thread, but during my private discussions with
Reza and Nathan, it was noted that marking all that memory as MOVABLE
is not preferable for PowerVM guests as we might run out of memory for
kernel allocations.

Regards,
Bharata.
-- 
http://raobharata.wordpress.com/

* Re: [PATCH 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
  2020-03-04  4:43   ` Bharata B Rao
@ 2020-03-04  7:18     ` Leonardo Bras
  -1 siblings, 0 replies; 7+ messages in thread
From: Leonardo Bras @ 2020-03-04  7:18 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: ndfont, linux-kernel, arbab, Claudio Carvalho, Michael Roth,
	Paul Mackerras, Greg Kroah-Hartman, Thomas Gleixner,
	linuxppc-dev, Hari Bathini

Hello Bharata, thanks for this feedback!

On Wed, 2020-03-04 at 10:13 +0530, Bharata B Rao wrote:
> Hi,
> 
> I tried this a few years back
> (https://patchwork.ozlabs.org/patch/800142/) and didn't pursue it
> further because at that time, it was felt that the approach might not
> work for PowerVM guests, because all the present memory except RMA
> gets marked as hot-pluggable by PowerVM. This discussion is not
> present in the above thread, but during my private discussions with
> Reza and Nathan, it was noted that marking all that memory as MOVABLE
> is not preferable for PowerVM guests as we might run out of memory for
> kernel allocations.

Hmm, this makes sense.
But with my change, these pieces of memory only get into ZONE_MOVABLE
if the boot parameter 'movable_node' gets passed to the guest kernel.

So, even if we are unable to sort out some flag combination that works
fine for both use cases, if PowerVM doesn't pass 'movable_node' as a boot
parameter to the kernel, it will behave just as it does today.

What are your thoughts on that?

Best regards,

Leonardo Bras
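
The gating argument in this reply can be pictured with a small toy model.
This is only a sketch of the reasoning, not the kernel's zone-sizing code:
the enum toy_zone and the functions toy_zone_for() and
toy_zone_selfcheck() are hypothetical names, and the model assumes a
region's zone depends on just two booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two zones discussed in the thread. */
enum toy_zone { TOY_ZONE_NORMAL, TOY_ZONE_MOVABLE };

/*
 * Toy model: a region lands in the movable zone only when it is marked
 * hotpluggable AND the 'movable_node' boot parameter was given.
 */
static enum toy_zone toy_zone_for(bool hotpluggable, bool movable_node)
{
	return (hotpluggable && movable_node) ? TOY_ZONE_MOVABLE
					      : TOY_ZONE_NORMAL;
}

/* Checks the two cases the reply relies on. */
static void toy_zone_selfcheck(void)
{
	/* Without 'movable_node', marking a region changes nothing. */
	assert(toy_zone_for(true, false) == TOY_ZONE_NORMAL);
	/* With 'movable_node', a marked region becomes movable. */
	assert(toy_zone_for(true, true) == TOY_ZONE_MOVABLE);
}
```

In this model, a guest that never passes 'movable_node' sees the same zone
for every region whether or not the region is marked, which is the "behave
just as it does today" case described above.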

* Re: [PATCH 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
  2020-03-04  7:18     ` Leonardo Bras
@ 2020-03-04 22:05       ` Leonardo Bras
  -1 siblings, 0 replies; 7+ messages in thread
From: Leonardo Bras @ 2020-03-04 22:05 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: ndfont, linux-kernel, arbab, Claudio Carvalho, Michael Roth,
	Paul Mackerras, Greg Kroah-Hartman, Thomas Gleixner,
	linuxppc-dev, Hari Bathini

On Wed, 2020-03-04 at 04:18 -0300, Leonardo Bras wrote:
> Hmm, this makes sense.
> But with my change, these pieces of memory only get into ZONE_MOVABLE
> if the boot parameter 'movable_node' gets passed to the guest kernel.

Hmm, I think your patch also does that.

> So, even if we are unable to sort out some flag combination that works
> fine for both use cases, if PowerVM doesn't pass 'movable_node' as a boot
> parameter to the kernel, it will behave just as it does today.

Another option would be adding a new 'removable' flag, given that the
flags field has a lot of free bits. It would only be passed by QEMU, so
we would be safe with PowerVM.

Then we would have
+	if (lmb->flags & DRCONF_MEM_REMOVABLE)
+		early_init_dt_mark_hotplug_memory_arch(base, size);

Do you know if it's possible?
Would we need to update the LOPAPR?

Leonardo
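
The proposed flag check can be sketched as a standalone decision function.
This is an illustration under stated assumptions, not kernel code:
DRCONF_MEM_REMOVABLE is only a proposal in this thread, so its bit value
here is invented, the other flag values are likewise assumed, and the
helper names lmb_should_mark_hotplug() and lmb_mark_selfcheck() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All values below are assumptions made for this sketch. */
#define DRCONF_MEM_ASSIGNED	0x00000008u
#define DRCONF_MEM_RESERVED	0x00000080u
/* Proposed in this thread only; the bit position is invented here. */
#define DRCONF_MEM_REMOVABLE	0x00000100u

/*
 * Hypothetical helper: an LMB is marked hotpluggable only if it passes
 * the existing skip rule and also carries the proposed removable flag.
 */
static bool lmb_should_mark_hotplug(uint32_t flags)
{
	if ((flags & DRCONF_MEM_RESERVED) || !(flags & DRCONF_MEM_ASSIGNED))
		return false;	/* skipped before any memblock is added */
	return (flags & DRCONF_MEM_REMOVABLE) != 0;
}

/* Contrasts a hypervisor that sets the new flag with one that does not. */
static void lmb_mark_selfcheck(void)
{
	/* Hypervisor sets the new flag on hot-plugged memory: marked. */
	assert(lmb_should_mark_hotplug(DRCONF_MEM_ASSIGNED | DRCONF_MEM_REMOVABLE));
	/* Hypervisor never sets the new flag: behavior is unchanged. */
	assert(!lmb_should_mark_hotplug(DRCONF_MEM_ASSIGNED));
}
```

Under these assumptions, a hypervisor that never sets the new bit keeps the
current behavior even when 'movable_node' is passed, which is the property
the proposal is after.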
