* [PATCH 1/2] powerpc/pseries: Use a helper to fixup nr_cores @ 2016-08-05 6:10 Sukadev Bhattiprolu 2016-08-05 6:14 ` [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size Sukadev Bhattiprolu 0 siblings, 1 reply; 9+ messages in thread From: Sukadev Bhattiprolu @ 2016-08-05 6:10 UTC (permalink / raw) To: Michael Ellerman; +Cc: linux-kernel, linuxppc-dev >From d49b597623ac58fa1ab61ce0157470b6390e9a67 Mon Sep 17 00:00:00 2001 From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Date: Fri, 5 Aug 2016 00:01:54 -0400 Subject: [PATCH 1/2] powerpc/pseries: Use a helper to fixup nr_cores. We have to fixup RMA size also, so using helpers will make it cleaner and consistent. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> --- arch/powerpc/kernel/prom_init.c | 70 ++++++++++++++++++++++------------------- 1 file changed, 38 insertions(+), 32 deletions(-) diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index 6ee4b72..f612a99 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -863,47 +863,53 @@ static int __init prom_count_smt_threads(void) } +static void fixup_nr_cores(void) +{ + u32 cores; + unsigned char *ptcores; + + /* We need to tell the FW about the number of cores we support. + * + * To do that, we count the number of threads on the first core + * (we assume this is the same for all cores) and use it to + * divide NR_CPUS. + */ + + /* The core value may start at an odd address. If such a word + * access is made at a cache line boundary, this leads to an + * exception which may not be handled at this time. + * Forcing a per byte access to avoid exception. + */ + ptcores = &ibm_architecture_vec[IBM_ARCH_VEC_NRCORES_OFFSET]; + cores = 0; + cores |= ptcores[0] << 24; + cores |= ptcores[1] << 16; + cores |= ptcores[2] << 8; + cores |= ptcores[3]; + if (cores != NR_CPUS) { + prom_printf("WARNING ! 
" + "ibm_architecture_vec structure inconsistent: %lu!\n", + cores); + } else { + cores = DIV_ROUND_UP(NR_CPUS, prom_count_smt_threads()); + prom_printf("Max number of cores passed to firmware: %lu (NR_CPUS = %lu)\n", + cores, NR_CPUS); + ptcores[0] = (cores >> 24) & 0xff; + ptcores[1] = (cores >> 16) & 0xff; + ptcores[2] = (cores >> 8) & 0xff; + ptcores[3] = cores & 0xff; + } +} static void __init prom_send_capabilities(void) { ihandle root; prom_arg_t ret; - u32 cores; - unsigned char *ptcores; root = call_prom("open", 1, 1, ADDR("/")); if (root != 0) { - /* We need to tell the FW about the number of cores we support. - * - * To do that, we count the number of threads on the first core - * (we assume this is the same for all cores) and use it to - * divide NR_CPUS. - */ - /* The core value may start at an odd address. If such a word - * access is made at a cache line boundary, this leads to an - * exception which may not be handled at this time. - * Forcing a per byte access to avoid exception. - */ - ptcores = &ibm_architecture_vec[IBM_ARCH_VEC_NRCORES_OFFSET]; - cores = 0; - cores |= ptcores[0] << 24; - cores |= ptcores[1] << 16; - cores |= ptcores[2] << 8; - cores |= ptcores[3]; - if (cores != NR_CPUS) { - prom_printf("WARNING ! " - "ibm_architecture_vec structure inconsistent: %lu!\n", - cores); - } else { - cores = DIV_ROUND_UP(NR_CPUS, prom_count_smt_threads()); - prom_printf("Max number of cores passed to firmware: %lu (NR_CPUS = %lu)\n", - cores, NR_CPUS); - ptcores[0] = (cores >> 24) & 0xff; - ptcores[1] = (cores >> 16) & 0xff; - ptcores[2] = (cores >> 8) & 0xff; - ptcores[3] = cores & 0xff; - } + fixup_nr_cores(); /* try calling the ibm,client-architecture-support method */ prom_printf("Calling ibm,client-architecture-support..."); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size 2016-08-05 6:10 [PATCH 1/2] powerpc/pseries: Use a helper to fixup nr_cores Sukadev Bhattiprolu @ 2016-08-05 6:14 ` Sukadev Bhattiprolu 2016-08-05 13:28 ` kbuild test robot 2016-08-05 18:30 ` Sukadev Bhattiprolu 0 siblings, 2 replies; 9+ messages in thread From: Sukadev Bhattiprolu @ 2016-08-05 6:14 UTC (permalink / raw) To: Michael Ellerman; +Cc: linux-kernel, linuxppc-dev >From ddce2a5f439111f08969d66ccc0c7b4d9196b69d Mon Sep 17 00:00:00 2001 From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Date: Thu, 4 Aug 2016 23:13:37 -0400 Subject: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size When booting a very large system with a large initrd we run out of space for the flattened device tree (FDT). To fix this we must increase the space allocated for the RMA region. The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing the size there will apply to all systems, large and small, so we want to increase the RMA region only when necessary. When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size' to the new RMA size (512MB) and issue a client-architecture-support (CAS) call to the firmware. This will initiate a system reboot. Upon reboot we notice the new property and update the RMA size accordingly. The CAS call we issue would end up being a second CAS call in the boot sequence. Use a static variable, 'fixup_nr_cores_done', to detect this second CAS and avoid fixing up nr_cores or hitting the WARNING again. Fix suggested by Michael Ellerman. 
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> --- arch/powerpc/kernel/prom_init.c | 86 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 85 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index f612a99..407cbb9 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -679,6 +679,7 @@ unsigned char ibm_architecture_vec[] = { W(0xffffffff), /* virt_base */ W(0xffffffff), /* virt_size */ W(0xffffffff), /* load_base */ +#define IBM_ARCH_VEC_MIN_RMA_OFFSET 108 W(256), /* 256MB min RMA */ W(0xffffffff), /* full client load */ 0, /* min RMA percentage of total RAM */ @@ -867,6 +868,10 @@ static void fixup_nr_cores(void) { u32 cores; unsigned char *ptcores; + static bool fixup_nr_cores_done = false; + + if (fixup_nr_cores_done) + return; /* We need to tell the FW about the number of cores we support. * @@ -898,6 +903,41 @@ static void fixup_nr_cores(void) ptcores[1] = (cores >> 16) & 0xff; ptcores[2] = (cores >> 8) & 0xff; ptcores[3] = cores & 0xff; + fixup_nr_cores_done = true; + } +} + +static void __init fixup_rma_size(void) +{ + int rc; + u64 size; + unsigned char *min_rmap; + phandle optnode; + char str[64]; + + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); + if (!PHANDLE_VALID(optnode)) + prom_panic("Cannot find /options"); + + /* + * If a prior boot specified a new RMA size, use that size in + * ibm_architecture_vec[]. See also increase_rma_size(). 
+ */ + size = 0ULL; + memset(str, 0, sizeof(str)); + rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str)); + if (rc <= 0) + return; + + size = prom_strtoul(str, NULL); + min_rmap = &ibm_architecture_vec[IBM_ARCH_VEC_MIN_RMA_OFFSET]; + + if (size) { + prom_printf("Using RMA size %lu from ibm,new-rma-size.\n", size); + min_rmap[0] = (size >> 24) & 0xff; + min_rmap[1] = (size >> 16) & 0xff; + min_rmap[2] = (size >> 8) & 0xff; + min_rmap[3] = size & 0xff; } } @@ -911,6 +951,8 @@ static void __init prom_send_capabilities(void) fixup_nr_cores(); + fixup_rma_size(); + /* try calling the ibm,client-architecture-support method */ prom_printf("Calling ibm,client-architecture-support..."); if (call_prom_ret("call-method", 3, 2, &ret, @@ -946,6 +988,46 @@ static void __init prom_send_capabilities(void) } #endif /* __BIG_ENDIAN__ */ } + +static void __init increase_rma_size(void) +{ + int rc; + u64 size; + char str[64]; + phandle optnode; + + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); + if (!PHANDLE_VALID(optnode)) + prom_panic("Cannot find /options"); + + /* + * If we already increased the RMA size, return. + */ + size = 0ULL; + memset(str, 0, sizeof(str)); + rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str)); + + size = prom_strtoul(str, NULL); + if (size == 512ULL) { + prom_printf("RMA size already at %lu.\n", size); + return; + } + /* + * Otherwise, set the ibm,new-rma-size property and initiate a CAS + * reboot so the RMA size can take effect. See also init_rma_size(). + */ + memset(str, 0, 4); + memcpy(str, "512", 3); + prom_printf("Setting ibm,new-rma-size property to %s\n", str); + rc = prom_setprop(optnode, "/options", "ibm,new-rma-size", &str, + strlen(str)+1); + + /* Force a reboot. Will work only if ibm,fw-override-cas==false */ + prom_send_capabilities(); + + prom_printf("No CAS initiated reboot? 
Try setting ibm,fw-override-cas to 'false' in Open Firmware\n"); +} + #endif /* #if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV) */ /* @@ -2027,9 +2109,11 @@ static void __init *make_room(unsigned long *mem_start, unsigned long *mem_end, room = alloc_top - alloc_bottom; if (room > DEVTREE_CHUNK_SIZE) room = DEVTREE_CHUNK_SIZE; - if (room < PAGE_SIZE) + if (room < PAGE_SIZE) { + increase_rma_size(); prom_panic("No memory for flatten_device_tree " "(no room)\n"); + } chunk = alloc_up(room, 0); if (chunk == 0) prom_panic("No memory for flatten_device_tree " -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
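For reference, the byte-wise store that fixup_rma_size() performs on ibm_architecture_vec[] (the same pattern fixup_nr_cores() uses) can be sketched in isolation. This is only an illustrative userspace sketch, not code from the patch: the helper names read_be32_bytewise() and write_be32_bytewise() are assumptions made up for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read a big-endian 32-bit field one byte at a
 * time, so no word-sized load is issued even if p is misaligned.
 */
static uint32_t read_be32_bytewise(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |
	        (uint32_t)p[3];
}

/*
 * Hypothetical helper: store v as a big-endian 32-bit field one byte
 * at a time, mirroring the min_rmap[0..3] assignments in the patch.
 */
static void write_be32_bytewise(unsigned char *p, uint32_t v)
{
	p[0] = (v >> 24) & 0xff;
	p[1] = (v >> 16) & 0xff;
	p[2] = (v >> 8) & 0xff;
	p[3] = v & 0xff;
}
```

A caller could, for instance, pass a pointer at an odd offset into a byte array; both helpers only ever touch single bytes, which is the property the comment in fixup_nr_cores() relies on.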
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2016-08-05  6:14 ` [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size Sukadev Bhattiprolu
@ 2016-08-05 13:28   ` kbuild test robot
  2016-08-05 18:30   ` Sukadev Bhattiprolu
  1 sibling, 0 replies; 9+ messages in thread
From: kbuild test robot @ 2016-08-05 13:28 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: kbuild-all, Michael Ellerman, linux-kernel, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1708 bytes --]

Hi Sukadev,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.7 next-20160805]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sukadev-Bhattiprolu/powerpc-pseries-Use-a-helper-to-fixup-nr_cores/20160805-141813
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-kmeter1_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/prom_init.c: In function 'make_room':
>> arch/powerpc/kernel/prom_init.c:2113:4: error: implicit declaration of function 'increase_rma_size' [-Werror=implicit-function-declaration]
       increase_rma_size();
       ^
   cc1: all warnings being treated as errors

vim +/increase_rma_size +2113 arch/powerpc/kernel/prom_init.c

  2107			prom_debug("Chunk exhausted, claiming more at %x...\n",
  2108				   alloc_bottom);
  2109			room = alloc_top - alloc_bottom;
  2110			if (room > DEVTREE_CHUNK_SIZE)
  2111				room = DEVTREE_CHUNK_SIZE;
  2112			if (room < PAGE_SIZE) {
> 2113				increase_rma_size();
  2114				prom_panic("No memory for flatten_device_tree "
  2115					   "(no room)\n");
  2116			}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 11074 bytes --]
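The error above arises because increase_rma_size() is defined only inside the CONFIG_PPC_PSERIES/CONFIG_PPC_POWERNV block, while make_room() is built for every configuration. One common way to handle this is an empty stub in the #else branch. A minimal sketch of that pattern follows; SKETCH_HAVE_CAS and the sketch_* function names are made-up stand-ins for the real config symbols and functions, and the functions return int (the real one returns void) purely so the effect is observable.

```c
/*
 * SKETCH_HAVE_CAS is a hypothetical stand-in for
 * CONFIG_PPC_PSERIES || CONFIG_PPC_POWERNV; it is left undefined here,
 * which models the failing powerpc-kmeter1_defconfig build.
 */
#if defined(SKETCH_HAVE_CAS)
static int sketch_increase_rma_size(void)
{
	/* The real work (set property, issue CAS) would go here. */
	return 1;
}
#else
/* Stub: keeps unconditional callers compiling when the feature is off. */
static int sketch_increase_rma_size(void)
{
	return 0;
}
#endif

/* Built regardless of configuration, like make_room(). */
static int sketch_make_room(void)
{
	return sketch_increase_rma_size();
}
```

With the macro undefined, the caller links against the stub and the call becomes a no-op instead of an implicit declaration.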
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2016-08-05  6:14 ` [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size Sukadev Bhattiprolu
  2016-08-05 13:28   ` kbuild test robot
@ 2016-08-05 18:30   ` Sukadev Bhattiprolu
  2016-08-05 19:04     ` Paul Clarke
  1 sibling, 1 reply; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2016-08-05 18:30 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, linux-kernel

Here is an updated patch to fix the build when CONFIG_PPC_PSERIES=n.
---
>From d4f77a6ca7b6ea83f6588e7d541cc70bf001ae85 Mon Sep 17 00:00:00 2001
From: root <sukadev@linux.vnet.ibm.com>
Date: Thu, 4 Aug 2016 23:13:37 -0400
Subject: [PATCH 2/2] powerpc/pseries: Dynamically grow RMA size

When booting a very large system with a large initrd we run out of space
for the flattened device tree (FDT). To fix this we must increase the
space allocated for the RMA region.

The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing
the size there will apply to all systems, small and large, so we want to
increase the RMA region only when necessary.

When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size'
to the new RMA size (512MB) and issue a client-architecture-support (CAS)
call to the firmware. This will initiate a system reboot. Upon reboot we
notice the new property and update the RMA size accordingly.

Fix suggested by Michael Ellerman.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> --- [v2]: - Add a comment in code regarding 'fixup_nr_cores_done' - Fix build break when CONFIG_PPC_PSERIES=n --- arch/powerpc/kernel/prom_init.c | 96 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 95 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index f612a99..cbd5387 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -679,6 +679,7 @@ unsigned char ibm_architecture_vec[] = { W(0xffffffff), /* virt_base */ W(0xffffffff), /* virt_size */ W(0xffffffff), /* load_base */ +#define IBM_ARCH_VEC_MIN_RMA_OFFSET 108 W(256), /* 256MB min RMA */ W(0xffffffff), /* full client load */ 0, /* min RMA percentage of total RAM */ @@ -867,6 +868,14 @@ static void fixup_nr_cores(void) { u32 cores; unsigned char *ptcores; + static bool fixup_nr_cores_done = false; + + /* + * If this is a second CAS call in the same boot sequence, (see + * increase_rma_size()), we don't need to do the fixup again. + */ + if (fixup_nr_cores_done) + return; /* We need to tell the FW about the number of cores we support. * @@ -898,6 +907,41 @@ static void fixup_nr_cores(void) ptcores[1] = (cores >> 16) & 0xff; ptcores[2] = (cores >> 8) & 0xff; ptcores[3] = cores & 0xff; + fixup_nr_cores_done = true; + } +} + +static void __init fixup_rma_size(void) +{ + int rc; + u64 size; + unsigned char *min_rmap; + phandle optnode; + char str[64]; + + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); + if (!PHANDLE_VALID(optnode)) + prom_panic("Cannot find /options"); + + /* + * If a prior boot specified a new RMA size, use that size in + * ibm_architecture_vec[]. See also increase_rma_size(). 
+ */ + size = 0ULL; + memset(str, 0, sizeof(str)); + rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str)); + if (rc <= 0) + return; + + size = prom_strtoul(str, NULL); + min_rmap = &ibm_architecture_vec[IBM_ARCH_VEC_MIN_RMA_OFFSET]; + + if (size) { + prom_printf("Using RMA size %lu from ibm,new-rma-size.\n", size); + min_rmap[0] = (size >> 24) & 0xff; + min_rmap[1] = (size >> 16) & 0xff; + min_rmap[2] = (size >> 8) & 0xff; + min_rmap[3] = size & 0xff; } } @@ -911,6 +955,8 @@ static void __init prom_send_capabilities(void) fixup_nr_cores(); + fixup_rma_size(); + /* try calling the ibm,client-architecture-support method */ prom_printf("Calling ibm,client-architecture-support..."); if (call_prom_ret("call-method", 3, 2, &ret, @@ -946,6 +992,52 @@ static void __init prom_send_capabilities(void) } #endif /* __BIG_ENDIAN__ */ } + +static void __init increase_rma_size(void) +{ + int rc; + u64 size; + char str[64]; + phandle optnode; + + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); + if (!PHANDLE_VALID(optnode)) + prom_panic("Cannot find /options"); + + /* + * If we already increased the RMA size, return. + */ + size = 0ULL; + memset(str, 0, sizeof(str)); + rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str)); + + size = prom_strtoul(str, NULL); + if (size == 512ULL) { + prom_printf("RMA size already at %lu.\n", size); + return; + } + /* + * Otherwise, set the ibm,new-rma-size property and initiate a CAS + * reboot so the RMA size can take effect. See also init_rma_size(). + */ + memset(str, 0, 4); + memcpy(str, "512", 3); + prom_printf("Setting ibm,new-rma-size property to %s\n", str); + rc = prom_setprop(optnode, "/options", "ibm,new-rma-size", &str, + strlen(str)+1); + + /* Force a reboot. Will work only if ibm,fw-override-cas==false */ + prom_send_capabilities(); + + prom_printf("No CAS initiated reboot? 
Try setting ibm,fw-override-cas to 'false' in Open Firmware\n"); +} + +#else + +static void __init increase_rma_size(void) +{ +} + #endif /* #if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV) */ /* @@ -2027,9 +2119,11 @@ static void __init *make_room(unsigned long *mem_start, unsigned long *mem_end, room = alloc_top - alloc_bottom; if (room > DEVTREE_CHUNK_SIZE) room = DEVTREE_CHUNK_SIZE; - if (room < PAGE_SIZE) + if (room < PAGE_SIZE) { + increase_rma_size(); prom_panic("No memory for flatten_device_tree " "(no room)\n"); + } chunk = alloc_up(room, 0); if (chunk == 0) prom_panic("No memory for flatten_device_tree " -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2016-08-05 18:30   ` Sukadev Bhattiprolu
@ 2016-08-05 19:04     ` Paul Clarke
  2016-08-09 17:11       ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Clarke @ 2016-08-05 19:04 UTC (permalink / raw)
  To: Sukadev Bhattiprolu, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel

Only nits from me...(see below)

On 08/05/2016 01:30 PM, Sukadev Bhattiprolu wrote:
> Here is an updated patch to fix the build when CONFIG_PPC_PSERIES=n.
> ---
> From d4f77a6ca7b6ea83f6588e7d541cc70bf001ae85 Mon Sep 17 00:00:00 2001
> From: root <sukadev@linux.vnet.ibm.com>
> Date: Thu, 4 Aug 2016 23:13:37 -0400
> Subject: [PATCH 2/2] powerpc/pseries: Dynamically grow RMA size
>
> When booting a very large system with a large initrd we run out of space
> for the flattened device tree (FDT). To fix this we must increase the
> space allocated for the RMA region.
>
> The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing
> the size there will apply to all systems, small and large, so we want to
> increase the RMA region only when necessary.
>
> When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size'
> to the new RMA size (512MB) and issue a client-architecture-support (CAS)
> call to the firmware. This will initiate a system reboot. Upon reboot we
> notice the new property and update the RMA size accordingly.
>
> Fix suggested by Michael Ellerman.
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>
> [v2]:	- Add a comment in code regarding 'fixup_nr_cores_done'
> 	- Fix build break when CONFIG_PPC_PSERIES=n
> ---
>  arch/powerpc/kernel/prom_init.c | 96 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 95 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index f612a99..cbd5387 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -679,6 +679,7 @@ unsigned char ibm_architecture_vec[] = {
>  	W(0xffffffff),			/* virt_base */
>  	W(0xffffffff),			/* virt_size */
>  	W(0xffffffff),			/* load_base */
> +#define IBM_ARCH_VEC_MIN_RMA_OFFSET	108
>  	W(256),				/* 256MB min RMA */
>  	W(0xffffffff),			/* full client load */
>  	0,				/* min RMA percentage of total RAM */
> @@ -867,6 +868,14 @@ static void fixup_nr_cores(void)
>  {
>  	u32 cores;
>  	unsigned char *ptcores;
> +	static bool fixup_nr_cores_done = false;
> +
> +	/*
> +	 * If this is a second CAS call in the same boot sequence, (see
> +	 * increase_rma_size()), we don't need to do the fixup again.
> +	 */
> +	if (fixup_nr_cores_done)
> +		return;
>
>  	/* We need to tell the FW about the number of cores we support.
>  	 *
> @@ -898,6 +907,41 @@ static void fixup_nr_cores(void)
>  		ptcores[1] = (cores >> 16) & 0xff;
>  		ptcores[2] = (cores >> 8) & 0xff;
>  		ptcores[3] = cores & 0xff;
> +		fixup_nr_cores_done = true;
> +	}
> +}
> +
> +static void __init fixup_rma_size(void)
> +{
> +	int rc;
> +	u64 size;
> +	unsigned char *min_rmap;
> +	phandle optnode;
> +	char str[64];
> +
> +	optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
> +	if (!PHANDLE_VALID(optnode))
> +		prom_panic("Cannot find /options");
> +
> +	/*
> +	 * If a prior boot specified a new RMA size, use that size in
> +	 * ibm_architecture_vec[]. See also increase_rma_size().
> +	 */
> +	size = 0ULL;
> +	memset(str, 0, sizeof(str));
> +	rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str));
> +	if (rc <= 0)
> +		return;
> +
> +	size = prom_strtoul(str, NULL);
> +	min_rmap = &ibm_architecture_vec[IBM_ARCH_VEC_MIN_RMA_OFFSET];
> +
> +	if (size) {
> +		prom_printf("Using RMA size %lu from ibm,new-rma-size.\n", size);
> +		min_rmap[0] = (size >> 24) & 0xff;
> +		min_rmap[1] = (size >> 16) & 0xff;
> +		min_rmap[2] = (size >> 8) & 0xff;
> +		min_rmap[3] = size & 0xff;
>  	}
>  }
>
> @@ -911,6 +955,8 @@ static void __init prom_send_capabilities(void)
>
>  		fixup_nr_cores();
>
> +		fixup_rma_size();
> +
>  		/* try calling the ibm,client-architecture-support method */
>  		prom_printf("Calling ibm,client-architecture-support...");
>  		if (call_prom_ret("call-method", 3, 2, &ret,
> @@ -946,6 +992,52 @@ static void __init prom_send_capabilities(void)
>  	}
>  #endif /* __BIG_ENDIAN__ */
>  }
> +
> +static void __init increase_rma_size(void)
> +{
> +	int rc;
> +	u64 size;
> +	char str[64];
> +	phandle optnode;
> +
> +	optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
> +	if (!PHANDLE_VALID(optnode))
> +		prom_panic("Cannot find /options");
> +
> +	/*
> +	 * If we already increased the RMA size, return.
> +	 */
> +	size = 0ULL;
> +	memset(str, 0, sizeof(str));
> +	rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str));
> +
> +	size = prom_strtoul(str, NULL);
> +	if (size == 512ULL) {

Is this preferred over strncmp? Using a string also helps with my
suggestion below...

> +		prom_printf("RMA size already at %lu.\n", size);
> +		return;
> +	}
> +	/*
> +	 * Otherwise, set the ibm,new-rma-size property and initiate a CAS
> +	 * reboot so the RMA size can take effect. See also init_rma_size().
> +	 */
> +	memset(str, 0, 4);
> +	memcpy(str, "512", 3);

There's a "512" here and a few lines above. Would it be better to define
the magic value once somewhere, then use that common name as needed?

The string "ibm,new-rma-size" is used in a number of places, too.
(I'm just saying that if it changes, you'd need to go back and find them all.)

Also, instead of memset/memcpy, why not:
memcpy(str, "512", 4);

> +	prom_printf("Setting ibm,new-rma-size property to %s\n", str);
> +	rc = prom_setprop(optnode, "/options", "ibm,new-rma-size", &str,
> +					strlen(str)+1);
> +
> +	/* Force a reboot. Will work only if ibm,fw-override-cas==false */
> +	prom_send_capabilities();
> +
> +	prom_printf("No CAS initiated reboot? Try setting ibm,fw-override-cas to 'false' in Open Firmware\n");
> +}
> +
> +#else
> +
> +static void __init increase_rma_size(void)
> +{
> +}
> +
>  #endif /* #if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV) */
>
>  /*
> @@ -2027,9 +2119,11 @@ static void __init *make_room(unsigned long *mem_start, unsigned long *mem_end,
>  		room = alloc_top - alloc_bottom;
>  		if (room > DEVTREE_CHUNK_SIZE)
>  			room = DEVTREE_CHUNK_SIZE;
> -		if (room < PAGE_SIZE)
> +		if (room < PAGE_SIZE) {
> +			increase_rma_size();
>  			prom_panic("No memory for flatten_device_tree "
>  				   "(no room)\n");
> +		}
>  		chunk = alloc_up(room, 0);
>  		if (chunk == 0)
>  			prom_panic("No memory for flatten_device_tree "
> --

PC
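The two suggestions in the review above (name the magic value once, and copy the string together with its terminating NUL in a single memcpy) can be sketched as follows. This is a standalone illustration rather than the patch itself: SKETCH_NEW_RMA_SIZE_STR and both function names are hypothetical, and the standard C library stands in for the prom_init.c string helpers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical single definition of the magic value. */
#define SKETCH_NEW_RMA_SIZE_STR	"512"

/*
 * Copy the size string into buf including its terminating NUL.
 * sizeof() on a string literal counts the NUL, so no separate
 * memset() is needed. Returns the number of bytes written, or 0 if
 * buf is too small.
 */
static size_t sketch_fill_rma_size(char *buf, size_t buflen)
{
	size_t len = sizeof(SKETCH_NEW_RMA_SIZE_STR);

	if (len > buflen)
		return 0;
	memcpy(buf, SKETCH_NEW_RMA_SIZE_STR, len);
	return len;
}

/* Compare as strings instead of converting to a number first. */
static int sketch_rma_size_already_set(const char *prop)
{
	return strcmp(prop, SKETCH_NEW_RMA_SIZE_STR) == 0;
}
```

Keeping the value in one macro means the comparison and the copy cannot drift apart if the size is changed later.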
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2016-08-05 19:04     ` Paul Clarke
@ 2016-08-09 17:11       ` Sukadev Bhattiprolu
  2017-02-01  5:37         ` Michael Ellerman
  0 siblings, 1 reply; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2016-08-09 17:11 UTC (permalink / raw)
  To: Paul Clarke; +Cc: Michael Ellerman, linuxppc-dev, linux-kernel

Paul Clarke [pc@us.ibm.com] wrote:
> Only nits from me...(see below)

Paul,

I agree with your comments and have addressed them. Here is the updated patch.

---
>From f9e9e8460206bc3fa7eaa741b9a2bde22870b9e0 Mon Sep 17 00:00:00 2001
From: root <sukadev@linux.vnet.ibm.com>
Date: Thu, 4 Aug 2016 23:13:37 -0400
Subject: [PATCH 2/2] powerpc/pseries: Dynamically grow RMA size

When booting a very large system with a large initrd we run out of space
for the flattened device tree (FDT). To fix this we must increase the
space allocated for the RMA region.

The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing
the size there will apply to all systems, large and small, so we want to
increase the RMA region only when necessary.

When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size'
to the new RMA size (512MB) and issue a client-architecture-support (CAS)
call to the firmware. This will initiate a system reboot. Upon reboot we
notice the new property and update the RMA size accordingly.

Fix suggested by Michael Ellerman.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
[v3]:	- [Paul Clarke] Fix a few nits.
[v2]: - Add a comment in code regarding 'fixup_nr_cores_done' - Fix build break when CONFIG_PPC_PSERIES=n --- arch/powerpc/kernel/prom_init.c | 97 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 96 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index f612a99..d1aaeda 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -87,6 +87,9 @@ int of_workarounds; #endif +#define IBM_NEW_RMA_SIZE_PROP "ibm,new-rma-size" +#define IBM_NEW_RMA_SIZE_STR "512" + #define OF_WA_CLAIM 1 /* do phys/virt claim separately, then map */ #define OF_WA_LONGTRAIL 2 /* work around longtrail bugs */ @@ -679,6 +682,7 @@ unsigned char ibm_architecture_vec[] = { W(0xffffffff), /* virt_base */ W(0xffffffff), /* virt_size */ W(0xffffffff), /* load_base */ +#define IBM_ARCH_VEC_MIN_RMA_OFFSET 108 W(256), /* 256MB min RMA */ W(0xffffffff), /* full client load */ 0, /* min RMA percentage of total RAM */ @@ -867,6 +871,14 @@ static void fixup_nr_cores(void) { u32 cores; unsigned char *ptcores; + static bool fixup_nr_cores_done = false; + + /* + * If this is a second CAS call in the same boot sequence, (see + * increase_rma_size()), we don't need to do the fixup again. + */ + if (fixup_nr_cores_done) + return; /* We need to tell the FW about the number of cores we support. * @@ -898,6 +910,42 @@ static void fixup_nr_cores(void) ptcores[1] = (cores >> 16) & 0xff; ptcores[2] = (cores >> 8) & 0xff; ptcores[3] = cores & 0xff; + fixup_nr_cores_done = true; + } +} + +static void __init fixup_rma_size(void) +{ + int rc; + u64 size; + unsigned char *min_rmap; + phandle optnode; + char str[64]; + + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); + if (!PHANDLE_VALID(optnode)) + prom_panic("Cannot find /options"); + + /* + * If a prior boot specified a new RMA size, use that size in + * ibm_architecture_vec[]. See also increase_rma_size(). 
+ */ + size = 0ULL; + memset(str, 0, sizeof(str)); + rc = prom_getprop(optnode, IBM_NEW_RMA_SIZE_PROP, &str, sizeof(str)); + if (rc <= 0) + return; + + size = prom_strtoul(str, NULL); + min_rmap = &ibm_architecture_vec[IBM_ARCH_VEC_MIN_RMA_OFFSET]; + + if (size) { + prom_printf("Using RMA size %lu from %s.\n", size, + IBM_NEW_RMA_SIZE_PROP); + min_rmap[0] = (size >> 24) & 0xff; + min_rmap[1] = (size >> 16) & 0xff; + min_rmap[2] = (size >> 8) & 0xff; + min_rmap[3] = size & 0xff; } } @@ -911,6 +959,8 @@ static void __init prom_send_capabilities(void) fixup_nr_cores(); + fixup_rma_size(); + /* try calling the ibm,client-architecture-support method */ prom_printf("Calling ibm,client-architecture-support..."); if (call_prom_ret("call-method", 3, 2, &ret, @@ -946,6 +996,49 @@ static void __init prom_send_capabilities(void) } #endif /* __BIG_ENDIAN__ */ } + +static void __init increase_rma_size(void) +{ + int rc, len; + char str[64]; + phandle optnode; + + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); + if (!PHANDLE_VALID(optnode)) + prom_panic("Cannot find /options"); + + /* + * If we already increased the RMA size, return. + */ + memset(str, 0, sizeof(str)); + rc = prom_getprop(optnode, IBM_NEW_RMA_SIZE_PROP, &str, sizeof(str)); + + if (!strcmp(str, IBM_NEW_RMA_SIZE_STR)) { + prom_printf("RMA size already at %.3s.\n", str); + return; + } + /* + * Otherwise, set the ibm,new-rma-size property and initiate a CAS + * reboot so the RMA size can take effect. See also init_rma_size(). + */ + len = strlen(IBM_NEW_RMA_SIZE_STR) + 1; + memcpy(str, IBM_NEW_RMA_SIZE_STR, len); + + prom_printf("Setting %s property to %s\n", IBM_NEW_RMA_SIZE_PROP, str); + rc = prom_setprop(optnode, "/options", IBM_NEW_RMA_SIZE_PROP, str, len); + + /* Force a reboot. Will work only if ibm,fw-override-cas==false */ + prom_send_capabilities(); + + prom_printf("No CAS initiated reboot? 
Try setting ibm,fw-override-cas to 'false' in Open Firmware\n"); +} + +#else + +static void __init increase_rma_size(void) +{ +} + #endif /* #if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV) */ /* @@ -2027,9 +2120,11 @@ static void __init *make_room(unsigned long *mem_start, unsigned long *mem_end, room = alloc_top - alloc_bottom; if (room > DEVTREE_CHUNK_SIZE) room = DEVTREE_CHUNK_SIZE; - if (room < PAGE_SIZE) + if (room < PAGE_SIZE) { + increase_rma_size(); prom_panic("No memory for flatten_device_tree " "(no room)\n"); + } chunk = alloc_up(room, 0); if (chunk == 0) prom_panic("No memory for flatten_device_tree " -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2016-08-09 17:11       ` Sukadev Bhattiprolu
@ 2017-02-01  5:37         ` Michael Ellerman
  2017-02-01 17:49           ` Thiago Jung Bauermann
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2017-02-01 5:37 UTC (permalink / raw)
  To: Sukadev Bhattiprolu, Paul Clarke; +Cc: linuxppc-dev, linux-kernel

Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> Paul Clarke [pc@us.ibm.com] wrote:
> ---
>
> From f9e9e8460206bc3fa7eaa741b9a2bde22870b9e0 Mon Sep 17 00:00:00 2001

I know it's been a while but I think it would still be good to get this in
a shape that we can merge it.

Comments inline ...

> From: root <sukadev@linux.vnet.ibm.com>
> Date: Thu, 4 Aug 2016 23:13:37 -0400
> Subject: [PATCH 2/2] powerpc/pseries: Dynamically grow RMA size
>
> When booting a very large system with a large initrd we run out of space
> for the flattened device tree (FDT). To fix this we must increase the
> space allocated for the RMA region.
>
> The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing
> the size there will apply to all systems, large and small, so we want to
> increase the RMA region only when necessary.
>
> When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size'
> to the new RMA size (512MB) and issue a client-architecture-support (CAS)
> call to the firmware. This will initiate a system reboot. Upon reboot we
> notice the new property and update the RMA size accordingly.
>
> Fix suggested by Michael Ellerman.
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
>
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index f612a99..d1aaeda 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -87,6 +87,9 @@
>  int of_workarounds;
>  #endif
>
> +#define IBM_NEW_RMA_SIZE_PROP	"ibm,new-rma-size"
> +#define IBM_NEW_RMA_SIZE_STR	"512"

The property name should really start with "linux,", as it's a Linux
property, not used by firmware at all.

And does it need to contain a value? Just its existence is a flag that
we want to increase the RMA size.

So it could just be called "linux,increase-rma-size".

And we don't need a #define for the name, it's not going to change once
the code is in, and a #define just obscures the actual name.

> @@ -898,6 +910,42 @@ static void fixup_nr_cores(void)
>  		ptcores[1] = (cores >> 16) & 0xff;
>  		ptcores[2] = (cores >> 8) & 0xff;
>  		ptcores[3] = cores & 0xff;
> +		fixup_nr_cores_done = true;

That code has changed upstream, so that won't apply. But that's OK, I
don't think we need to do it anyway.

> +static void __init fixup_rma_size(void)
> +{
> +	int rc;
> +	u64 size;
> +	unsigned char *min_rmap;
> +	phandle optnode;
> +	char str[64];
> +
> +	optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
> +	if (!PHANDLE_VALID(optnode))
> +		prom_panic("Cannot find /options");
> +
> +	/*
> +	 * If a prior boot specified a new RMA size, use that size in
> +	 * ibm_architecture_vec[]. See also increase_rma_size().
> + */ > + size = 0ULL; > + memset(str, 0, sizeof(str)); > + rc = prom_getprop(optnode, IBM_NEW_RMA_SIZE_PROP, &str, sizeof(str)); > + if (rc <= 0) > + return; So this can just become something like: rc = prom_getprop(optnode, "linux,increase-rma-size", NULL, 0) if (rc == PROM_ERROR) return; val = be32_to_cpu(ibm_architecture_vec.vec2.min_rma); ibm_architecture_vec.vec2.min_rma = cpu_to_be32(val * 2); > @@ -946,6 +996,49 @@ static void __init prom_send_capabilities(void) > } > #endif /* __BIG_ENDIAN__ */ > } > + > +static void __init increase_rma_size(void) > +{ > + int rc, len; > + char str[64]; > + phandle optnode; > + > + optnode = call_prom("finddevice", 1, 1, ADDR("/options")); > + if (!PHANDLE_VALID(optnode)) > + prom_panic("Cannot find /options"); > + > + /* > + * If we already increased the RMA size, return. > + */ > + memset(str, 0, sizeof(str)); > + rc = prom_getprop(optnode, IBM_NEW_RMA_SIZE_PROP, &str, sizeof(str)); > + > + if (!strcmp(str, IBM_NEW_RMA_SIZE_STR)) { > + prom_printf("RMA size already at %.3s.\n", str); > + return; > + } > + /* > + * Otherwise, set the ibm,new-rma-size property and initiate a CAS > + * reboot so the RMA size can take effect. See also init_rma_size(). > + */ > + len = strlen(IBM_NEW_RMA_SIZE_STR) + 1; > + memcpy(str, IBM_NEW_RMA_SIZE_STR, len); > + > + prom_printf("Setting %s property to %s\n", IBM_NEW_RMA_SIZE_PROP, str); > + rc = prom_setprop(optnode, "/options", IBM_NEW_RMA_SIZE_PROP, str, len); We should check rc there shouldn't we? Again that code can be simpler if the property is just a flag. > + /* Force a reboot. Will work only if ibm,fw-override-cas==false */ > + prom_send_capabilities(); > + > + prom_printf("No CAS initiated reboot? Try setting ibm,fw-override-cas to 'false' in Open Firmware\n"); I'm not sure if we want to be referring to ibm,fw-override-case. I don't thing it's a documented property (not in PAPR anyway), and it's certainly IBM PFW specific even if it is. 
I know for a fact that on KVM you won't get rebooted here, so I think if the CAS returns we should just reboot directly. cheers ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2017-02-01  5:37 ` Michael Ellerman
@ 2017-02-01 17:49 ` Thiago Jung Bauermann
  2017-02-01 18:11 ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-02-01 17:49 UTC
  To: linuxppc-dev
  Cc: Michael Ellerman, Sukadev Bhattiprolu, Paul Clarke, linux-kernel

Hello,

On Wednesday, 1 February 2017, 16:37:58 BRST, Michael Ellerman wrote:
> Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> > Paul Clarke [pc@us.ibm.com] wrote:
> > ---
> >
> > From f9e9e8460206bc3fa7eaa741b9a2bde22870b9e0 Mon Sep 17 00:00:00 2001
>
> I know it's been a while but I think it would still be good to get this
> in a shape that we can merge it.

Sorry if this has been tried and didn't work or if I'm missing something
obvious:

Instead of this method of trying a small RMA size and rebooting to try a
bigger size, could the "min RMA percentage of total RAM" field of the
ibm_architecture_vec be used?

LoPAPR says that "The Initial size of the RMA is set to the greater of the
values indicated by bytes 24-27 [min RMA] or 32 [min RMA percentage of total
RAM] of option vector number 2 “Open Firmware” or minimum RMA size supported
by the platform and capped by the maximum memory defined for the partition
and the maximum size of the RMA supported by the platform. The respective
selected values are reported in the length of the first memory property."

My understanding is that these patches are intended for big guests with many
processors, but the RMA size isn't changed to 512MB outright because of
worries that it could affect smaller guests.

Since guests with many processors tend to have more RAM as well, specifying
a min RMA size of 256MB and a min RMA percentage of, say, 10% or 20% could
make the host automatically allocate an adequate RMA size on the first boot.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
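
One reading of the LoPAPR rule quoted above is "take the largest of the three minimums, then apply the two caps". The sketch below works through that reading. It is an illustration only: the function name, the parameter names, the use of MB as the unit and the treatment of the percentage as an integer from 0 to 100 are all assumptions, and real firmware may round or align differently.

```c
#include <assert.h>
#include <stdint.h>

static inline uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

static inline uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical model of the initial RMA size selection, all sizes in MB.
 *
 *   min_rma_mb       - "min RMA" field of option vector 2
 *   min_rma_pct      - "min RMA percentage of total RAM" field (0..100)
 *   total_ram_mb     - total RAM of the partition
 *   platform_min_mb  - minimum RMA size supported by the platform
 *   partition_max_mb - maximum memory defined for the partition
 *   platform_max_mb  - maximum RMA size supported by the platform
 */
uint64_t initial_rma_mb(uint64_t min_rma_mb, unsigned int min_rma_pct,
			uint64_t total_ram_mb, uint64_t platform_min_mb,
			uint64_t partition_max_mb, uint64_t platform_max_mb)
{
	uint64_t rma;

	assert(min_rma_pct <= 100);

	/* Greater of the absolute minimum and the percentage of total RAM. */
	rma = max_u64(min_rma_mb, total_ram_mb * min_rma_pct / 100);
	/* Never below what the platform supports as a minimum. */
	rma = max_u64(rma, platform_min_mb);
	/* Capped by the partition's maximum memory and the platform maximum. */
	rma = min_u64(rma, partition_max_mb);
	rma = min_u64(rma, platform_max_mb);

	return rma;
}
```

With a 256MB minimum and 10%, a 1GB guest under this model stays at 256MB, while a 4GB guest gets 409MB (integer division of 4096 * 10 / 100). Larger guests grow proportionally until they hit whichever cap applies.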
* Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size
  2017-02-01 17:49 ` Thiago Jung Bauermann
@ 2017-02-01 18:11 ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2017-02-01 18:11 UTC
  To: Thiago Jung Bauermann
  Cc: linuxppc-dev, Michael Ellerman, Paul Clarke, linux-kernel

Thiago Jung Bauermann [bauerman@linux.vnet.ibm.com] wrote:
> Instead of this method of trying a small RMA size and rebooting to try a
> bigger size, could the "min RMA percentage of total RAM" field of the
> ibm_architecture_vec be used?

We tried that and concluded that even 1% could end up reserving a lot of
memory on systems with a lot of memory.

Sukadev
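
The scale of the concern in the reply above is easy to check with plain arithmetic. The sketch below is only that: `rma_pct_floor_mb()` and `MB_PER_TB` are hypothetical names, the 16TB figure is an arbitrary example rather than a system mentioned in the thread, and any platform cap on the RMA size is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MB_PER_TB	(1024ULL * 1024ULL)

/*
 * RMA size in MB implied by a "percentage of total RAM" minimum,
 * ignoring any platform or partition cap (an assumption of this sketch).
 */
uint64_t rma_pct_floor_mb(uint64_t total_ram_mb, unsigned int pct)
{
	assert(pct <= 100);
	return total_ram_mb * pct / 100;
}
```

Under these assumptions, `rma_pct_floor_mb(16 * MB_PER_TB, 1)` is 167772MB, roughly 164GB, compared with the fixed 512MB that the patch requests only after the FDT has actually run out of room.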