linux-kernel.vger.kernel.org archive mirror
* [PATCH] arm64: configurable sparsemem section size
@ 2019-04-23 20:38 Pavel Tatashin
  2019-04-24  9:07 ` Anshuman Khandual
  2019-04-25 15:25 ` Michal Hocko
  0 siblings, 2 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-04-23 20:38 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, linux-kernel, linux-mm,
	linux-nvdimm, akpm, mhocko, dave.hansen, dan.j.williams,
	keith.busch, vishal.l.verma, dave.jiang, zwisler,
	thomas.lendacky, ying.huang, fengguang.wu, bp, bhelgaas,
	baiyaowei, tiwai, jglisse, catalin.marinas, will.deacon, rppt,
	ard.biesheuvel, andrew.murray, james.morse, marc.zyngier, sboyd,
	linux-arm-kernel

The sparsemem section size determines the size and alignment with
which memory blocks can be offlined/onlined. The bigger the size, the
less clutter there is in /sys/devices/system/memory/*. On the other
hand, there is less flexibility in what granules of memory can be added
and removed.

Linux recently gained the ability to hot-add persistent memory, which
can be either a real NV device or a region reserved from regular System
RAM that has the identity of devdax.

The problem is that ARM64's section size is 1G and devdax must have a
2M label section, so the device is not 1G aligned and the first 1G is
always lost when the device is attached.

Allow better flexibility by making the section size configurable.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 arch/arm64/Kconfig                 | 10 ++++++++++
 arch/arm64/include/asm/sparsemem.h |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b5d8cf57e220..a0c5b9d13a7f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -801,6 +801,16 @@ config ARM64_PA_BITS
 	default 48 if ARM64_PA_BITS_48
 	default 52 if ARM64_PA_BITS_52
 
+config ARM64_SECTION_SIZE_BITS
+	int "sparsemem section size shift"
+	range 27 30
+	default "30"
+	depends on SPARSEMEM
+	help
+	  Specify section size in bits. Section size determines the hotplug
+	  hotremove granularity. The current size can be determined from
+	  /sys/devices/system/memory/block_size_bytes
+
 config CPU_BIG_ENDIAN
        bool "Build big-endian kernel"
        help
diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
index b299929fe56c..810db34d7038 100644
--- a/arch/arm64/include/asm/sparsemem.h
+++ b/arch/arm64/include/asm/sparsemem.h
@@ -18,7 +18,7 @@
 
 #ifdef CONFIG_SPARSEMEM
 #define MAX_PHYSMEM_BITS	CONFIG_ARM64_PA_BITS
-#define SECTION_SIZE_BITS	30
+#define SECTION_SIZE_BITS	CONFIG_ARM64_SECTION_SIZE_BITS
 #endif
 
 #endif
-- 
2.21.0



* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-23 20:38 [PATCH] arm64: configurable sparsemem section size Pavel Tatashin
@ 2019-04-24  9:07 ` Anshuman Khandual
  2019-04-24 19:48   ` Pavel Tatashin
  2019-04-25 15:25 ` Michal Hocko
  1 sibling, 1 reply; 12+ messages in thread
From: Anshuman Khandual @ 2019-04-24  9:07 UTC (permalink / raw)
  To: Pavel Tatashin, jmorris, sashal, linux-kernel, linux-mm,
	linux-nvdimm, akpm, mhocko, dave.hansen, dan.j.williams,
	keith.busch, vishal.l.verma, dave.jiang, zwisler,
	thomas.lendacky, ying.huang, fengguang.wu, bp, bhelgaas,
	baiyaowei, tiwai, jglisse, catalin.marinas, will.deacon, rppt,
	ard.biesheuvel, andrew.murray, james.morse, marc.zyngier, sboyd,
	linux-arm-kernel



On 04/24/2019 02:08 AM, Pavel Tatashin wrote:
> sparsemem section size determines the maximum size and alignment that
> is allowed to offline/online memory block. The bigger the size the less
> the clutter in /sys/devices/system/memory/*. On the other hand, however,
> there is less flexability in what granules of memory can be added and
> removed.

Is there any scenario where less than a 1GB needs to be added on arm64 ?

> 
> Recently, it was enabled in Linux to hotadd persistent memory that
> can be either real NV device, or reserved from regular System RAM
> and has identity of devdax.

devdax (even ZONE_DEVICE) support has not been enabled on arm64 yet.

> 
> The problem is that because ARM64's section size is 1G, and devdax must
> have 2M label section, the first 1G is always missed when device is
> attached, because it is not 1G aligned.

devdax has to be 2M aligned ? Does Linux enforce that right now ?

> 
> Allow, better flexibility by making section size configurable.

Unless 2M is being enforced from Linux not sure why this is necessary at
the moment.

> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  arch/arm64/Kconfig                 | 10 ++++++++++
>  arch/arm64/include/asm/sparsemem.h |  2 +-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b5d8cf57e220..a0c5b9d13a7f 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -801,6 +801,16 @@ config ARM64_PA_BITS
>  	default 48 if ARM64_PA_BITS_48
>  	default 52 if ARM64_PA_BITS_52
>  
> +config ARM64_SECTION_SIZE_BITS
> +	int "sparsemem section size shift"
> +	range 27 30

27 and 28 do not even compile for ARM64_64K_PAGES because of a MAX_ORDER
and SECTION_SIZE mismatch.


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-24  9:07 ` Anshuman Khandual
@ 2019-04-24 19:48   ` Pavel Tatashin
  2019-04-24 19:54     ` Pavel Tatashin
  2019-04-25  3:06     ` Anshuman Khandual
  0 siblings, 2 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-04-24 19:48 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: James Morris, Sasha Levin, LKML, linux-mm, linux-nvdimm,
	Andrew Morton, Michal Hocko, Dave Hansen, Dan Williams,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	catalin.marinas, Will Deacon, rppt, Ard Biesheuvel,
	andrew.murray, james.morse, Marc Zyngier, sboyd,
	linux-arm-kernel

On Wed, Apr 24, 2019 at 5:07 AM Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
> On 04/24/2019 02:08 AM, Pavel Tatashin wrote:
> > sparsemem section size determines the maximum size and alignment that
> > is allowed to offline/online memory block. The bigger the size the less
> > the clutter in /sys/devices/system/memory/*. On the other hand, however,
> > there is less flexability in what granules of memory can be added and
> > removed.
>
> Is there any scenario where less than a 1GB needs to be added on arm64 ?

Yes, without smaller sections DAX hotplug loses 1G of memory.
The machines on which we are going to use this functionality have 8G
of System RAM, so losing 1G is a big problem.

For details about the usage scenario, see this cover letter:
https://lore.kernel.org/lkml/20190421014429.31206-1-pasha.tatashin@soleen.com/

>
> >
> > Recently, it was enabled in Linux to hotadd persistent memory that
> > can be either real NV device, or reserved from regular System RAM
> > and has identity of devdax.
>
> devdax (even ZONE_DEVICE) support has not been enabled on arm64 yet.

Correct, I use your patches to enable ZONE_DEVICE, and thus devdax, on ARM64:
https://lore.kernel.org/lkml/1554265806-11501-1-git-send-email-anshuman.khandual@arm.com/

>
> >
> > The problem is that because ARM64's section size is 1G, and devdax must
> > have 2M label section, the first 1G is always missed when device is
> > attached, because it is not 1G aligned.
>
> devdax has to be 2M aligned ? Does Linux enforce that right now ?

Unfortunately, there is no way around this. Part of the memory can be
reserved as persistent memory via device tree.
        memory@40000000 {
                device_type = "memory";
                reg = < 0x00000000 0x40000000
                        0x00000002 0x00000000 >;
        };

        pmem@1c0000000 {
                compatible = "pmem-region";
                reg = <0x00000001 0xc0000000
                       0x00000000 0x80000000>;
                volatile;
                numa-node-id = <0>;
        };

So, while pmem is section aligned, as it should be, the dax device is
going to be pmem start address + label size, which is 2M. The actual
DAX device starts at:
0x1c0000000 + 2M.

Because the section size is 1G, hotplug will be able to add only memory
starting from
0x1c0000000 + 1G
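
The arithmetic above can be sketched in C. This is only an illustration of the rounding, not kernel code: align_up() and bytes_lost() are hypothetical helpers, and the sketch assumes that hotplug can only start at the first section boundary at or above the DAX device start.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: bytes between the start of the DAX device and the
 * first section boundary at or above it. Under the assumption stated in
 * the lead-in, section-granular hotplug cannot add these bytes.
 */
static uint64_t bytes_lost(uint64_t dax_start, unsigned int section_size_bits)
{
	uint64_t section_size = 1ULL << section_size_bits;

	return align_up(dax_start, section_size) - dax_start;
}
```

With the device tree above, dax_start is 0x1c0000000 + 2M = 0x1c0200000. For SECTION_SIZE_BITS = 30 the first usable address is 0x200000000, so 1G - 2M is skipped; for 27 the skipped range shrinks to 128M - 2M.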

> 27 and 28 do not even compile for ARM64_64_PAGES because of MAX_ORDER and
> SECTION_SIZE mismatch.

Can you please elaborate on which configs you are using? I have no
problems compiling with 27 and 28 bits.

Thank you,
Pasha


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-24 19:48   ` Pavel Tatashin
@ 2019-04-24 19:54     ` Pavel Tatashin
  2019-04-24 20:24       ` Dan Williams
  2019-04-25  3:06     ` Anshuman Khandual
  1 sibling, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2019-04-24 19:54 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: James Morris, Sasha Levin, LKML, linux-mm, linux-nvdimm,
	Andrew Morton, Michal Hocko, Dave Hansen, Dan Williams,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	catalin.marinas, Will Deacon, rppt, Ard Biesheuvel,
	andrew.murray, james.morse, Marc Zyngier, sboyd,
	linux-arm-kernel

<resending> from original email

On Wed, Apr 24, 2019 at 3:48 PM Pavel Tatashin
<patatash@linux.microsoft.com> wrote:
>
> On Wed, Apr 24, 2019 at 5:07 AM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
> >
> > On 04/24/2019 02:08 AM, Pavel Tatashin wrote:
> > > sparsemem section size determines the maximum size and alignment that
> > > is allowed to offline/online memory block. The bigger the size the less
> > > the clutter in /sys/devices/system/memory/*. On the other hand, however,
> > > there is less flexability in what granules of memory can be added and
> > > removed.
> >
> > Is there any scenario where less than a 1GB needs to be added on arm64 ?
>
> Yes, DAX hotplug loses 1G of memory without allowing smaller sections.
> Machines on which we are going to be using this functionality have 8G
> of System RAM, therefore losing 1G is a big problem.
>
> For details about using scenario see this cover letter:
> https://lore.kernel.org/lkml/20190421014429.31206-1-pasha.tatashin@soleen.com/
>
> >
> > >
> > > Recently, it was enabled in Linux to hotadd persistent memory that
> > > can be either real NV device, or reserved from regular System RAM
> > > and has identity of devdax.
> >
> > devdax (even ZONE_DEVICE) support has not been enabled on arm64 yet.
>
> Correct, I use your patches to enable ZONE_DEVICE, and  thus devdax on ARM64:
> https://lore.kernel.org/lkml/1554265806-11501-1-git-send-email-anshuman.khandual@arm.com/
>
> >
> > >
> > > The problem is that because ARM64's section size is 1G, and devdax must
> > > have 2M label section, the first 1G is always missed when device is
> > > attached, because it is not 1G aligned.
> >
> > devdax has to be 2M aligned ? Does Linux enforce that right now ?
>
> Unfortunately, there is no way around this. Part of the memory can be
> reserved as persistent memory via device tree.
>         memory@40000000 {
>                 device_type = "memory";
>                 reg = < 0x00000000 0x40000000
>                         0x00000002 0x00000000 >;
>         };
>
>         pmem@1c0000000 {
>                 compatible = "pmem-region";
>                 reg = <0x00000001 0xc0000000
>                        0x00000000 0x80000000>;
>                 volatile;
>                 numa-node-id = <0>;
>         };
>
> So, while pmem is section aligned, as it should be, the dax device is
> going to be pmem start address + label size, which is 2M. The actual
> DAX device starts at:
> 0x1c0000000 + 2M.
>
> Because section size is 1G, the hotplug will able to add only memory
> starting from
> 0x1c0000000 + 1G
>
> > 27 and 28 do not even compile for ARM64_64_PAGES because of MAX_ORDER and
> > SECTION_SIZE mismatch.
>
> Can you please elaborate what configs are you using? I have no
> problems compiling with 27 and 28 bit.
>
> Thank you,
> Pasha


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-24 19:54     ` Pavel Tatashin
@ 2019-04-24 20:24       ` Dan Williams
  2019-04-24 20:33         ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Dan Williams @ 2019-04-24 20:24 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Anshuman Khandual, James Morris, Sasha Levin, LKML, linux-mm,
	linux-nvdimm, Andrew Morton, Michal Hocko, Dave Hansen,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	Catalin Marinas, Will Deacon, rppt, Ard Biesheuvel,
	andrew.murray, james.morse, Marc Zyngier, sboyd, Linux ARM

On Wed, Apr 24, 2019 at 12:54 PM Pavel Tatashin
<pasha.tatashin@soleen.com> wrote:
>
> <resending> from original email
>
> On Wed, Apr 24, 2019 at 3:48 PM Pavel Tatashin
> <patatash@linux.microsoft.com> wrote:
> >
> > On Wed, Apr 24, 2019 at 5:07 AM Anshuman Khandual
> > <anshuman.khandual@arm.com> wrote:
> > >
> > > On 04/24/2019 02:08 AM, Pavel Tatashin wrote:
> > > > sparsemem section size determines the maximum size and alignment that
> > > > is allowed to offline/online memory block. The bigger the size the less
> > > > the clutter in /sys/devices/system/memory/*. On the other hand, however,
> > > > there is less flexability in what granules of memory can be added and
> > > > removed.
> > >
> > > Is there any scenario where less than a 1GB needs to be added on arm64 ?
> >
> > Yes, DAX hotplug loses 1G of memory without allowing smaller sections.
> > Machines on which we are going to be using this functionality have 8G
> > of System RAM, therefore losing 1G is a big problem.
> >
> > For details about using scenario see this cover letter:
> > https://lore.kernel.org/lkml/20190421014429.31206-1-pasha.tatashin@soleen.com/
> >
> > >
> > > >
> > > > Recently, it was enabled in Linux to hotadd persistent memory that
> > > > can be either real NV device, or reserved from regular System RAM
> > > > and has identity of devdax.
> > >
> > > devdax (even ZONE_DEVICE) support has not been enabled on arm64 yet.
> >
> > Correct, I use your patches to enable ZONE_DEVICE, and  thus devdax on ARM64:
> > https://lore.kernel.org/lkml/1554265806-11501-1-git-send-email-anshuman.khandual@arm.com/
> >
> > >
> > > >
> > > > The problem is that because ARM64's section size is 1G, and devdax must
> > > > have 2M label section, the first 1G is always missed when device is
> > > > attached, because it is not 1G aligned.
> > >
> > > devdax has to be 2M aligned ? Does Linux enforce that right now ?
> >
> > Unfortunately, there is no way around this. Part of the memory can be
> > reserved as persistent memory via device tree.
> >         memory@40000000 {
> >                 device_type = "memory";
> >                 reg = < 0x00000000 0x40000000
> >                         0x00000002 0x00000000 >;
> >         };
> >
> >         pmem@1c0000000 {
> >                 compatible = "pmem-region";
> >                 reg = <0x00000001 0xc0000000
> >                        0x00000000 0x80000000>;
> >                 volatile;
> >                 numa-node-id = <0>;
> >         };
> >
> > So, while pmem is section aligned, as it should be, the dax device is
> > going to be pmem start address + label size, which is 2M. The actual
> > DAX device starts at:
> > 0x1c0000000 + 2M.
> >
> > Because section size is 1G, the hotplug will able to add only memory
> > starting from
> > 0x1c0000000 + 1G

This is yet another example of where we need to break down the section
alignment requirement for arch_add_memory().

https://lore.kernel.org/lkml/155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com/


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-24 20:24       ` Dan Williams
@ 2019-04-24 20:33         ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-04-24 20:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: Anshuman Khandual, James Morris, Sasha Levin, LKML, linux-mm,
	linux-nvdimm, Andrew Morton, Michal Hocko, Dave Hansen,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	Catalin Marinas, Will Deacon, rppt, Ard Biesheuvel,
	andrew.murray, james.morse, Marc Zyngier, sboyd, Linux ARM

> This is yet another example of where we need to break down the section
> alignment requirement for arch_add_memory().
>
> https://lore.kernel.org/lkml/155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com/

Hi Dan,

Yes, that is exactly what I am trying to solve with this patch. I will
test whether your series works with ARM64, and if it does not I will let
you know what is broken. But I think this patch is not needed if your
patches are accepted into mainline.

Thank you,
Pasha


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-24 19:48   ` Pavel Tatashin
  2019-04-24 19:54     ` Pavel Tatashin
@ 2019-04-25  3:06     ` Anshuman Khandual
  1 sibling, 0 replies; 12+ messages in thread
From: Anshuman Khandual @ 2019-04-25  3:06 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: James Morris, Sasha Levin, LKML, linux-mm, linux-nvdimm,
	Andrew Morton, Michal Hocko, Dave Hansen, Dan Williams,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	catalin.marinas, Will Deacon, rppt, Ard Biesheuvel,
	andrew.murray, james.morse, Marc Zyngier, sboyd,
	linux-arm-kernel



On 04/25/2019 01:18 AM, Pavel Tatashin wrote:
> On Wed, Apr 24, 2019 at 5:07 AM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>>
>> On 04/24/2019 02:08 AM, Pavel Tatashin wrote:
>>> sparsemem section size determines the maximum size and alignment that
>>> is allowed to offline/online memory block. The bigger the size the less
>>> the clutter in /sys/devices/system/memory/*. On the other hand, however,
>>> there is less flexability in what granules of memory can be added and
>>> removed.
>>
>> Is there any scenario where less than a 1GB needs to be added on arm64 ?
> 
> Yes, DAX hotplug loses 1G of memory without allowing smaller sections.
> Machines on which we are going to be using this functionality have 8G
> of System RAM, therefore losing 1G is a big problem.
> 
> For details about using scenario see this cover letter:
> https://lore.kernel.org/lkml/20190421014429.31206-1-pasha.tatashin@soleen.com/

It's losing 1GB because devdax has a 2M alignment? IIRC from Dan's subsection memory
hot add series, the 2M comes from limitations of the persistent memory HW controller.
Is that limitation applicable across all platforms, including arm64, for all possible
persistent memory vendors? I mean, is it universal? IIUC the subsection memory hotplug
series is still being reviewed. Hence, shouldn't we wait for it to get merged before
enabling applicable platforms to accommodate these 2M limitations?

> 
>>
>>>
>>> Recently, it was enabled in Linux to hotadd persistent memory that
>>> can be either real NV device, or reserved from regular System RAM
>>> and has identity of devdax.
>>
>> devdax (even ZONE_DEVICE) support has not been enabled on arm64 yet.
> 
> Correct, I use your patches to enable ZONE_DEVICE, and  thus devdax on ARM64:
> https://lore.kernel.org/lkml/1554265806-11501-1-git-send-email-anshuman.khandual@arm.com/
> 
>>
>>>
>>> The problem is that because ARM64's section size is 1G, and devdax must
>>> have 2M label section, the first 1G is always missed when device is
>>> attached, because it is not 1G aligned.
>>
>> devdax has to be 2M aligned ? Does Linux enforce that right now ?
> 
> Unfortunately, there is no way around this. Part of the memory can be
> reserved as persistent memory via device tree.
>         memory@40000000 {
>                 device_type = "memory";
>                 reg = < 0x00000000 0x40000000
>                         0x00000002 0x00000000 >;
>         };
> 
>         pmem@1c0000000 {
>                 compatible = "pmem-region";
>                 reg = <0x00000001 0xc0000000
>                        0x00000000 0x80000000>;
>                 volatile;
>                 numa-node-id = <0>;
>         };
> 
> So, while pmem is section aligned, as it should be, the dax device is
> going to be pmem start address + label size, which is 2M. The actual

Forgive my ignorance here, but why is the dax device label size 2M aligned? Again, is
that because of some persistent memory HW controller limitation?

> DAX device starts at:
> 0x1c0000000 + 2M.
> 
> Because section size is 1G, the hotplug will able to add only memory
> starting from
> 0x1c0000000 + 1G

Got it, but as mentioned before, we will have to make sure that the 2M alignment
requirement is universal, or else we will be adjusting this multiple times.

> 
>> 27 and 28 do not even compile for ARM64_64_PAGES because of MAX_ORDER and
>> SECTION_SIZE mismatch.

Even with 27 bits it's a 128 MB section size. How does that solve the problem with 2M?
Does the patch just aim to reduce the memory wastage?

> 
> Can you please elaborate what configs are you using? I have no
> problems compiling with 27 and 28 bit.

After applying your patch [1] on current mainline kernel [2].

$ make defconfig

CONFIG_ARM64_64K_PAGES=y
CONFIG_ARM64_VA_BITS_48=y
CONFIG_ARM64_VA_BITS=48
CONFIG_ARM64_PA_BITS_48=y
CONFIG_ARM64_PA_BITS=48
CONFIG_ARM64_SECTION_SIZE_BITS=27

[1] https://patchwork.kernel.org/patch/10913737/
[2] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

It fails with

  CC      arch/arm64/kernel/asm-offsets.s
In file included from ./include/linux/gfp.h:6,
                 from ./include/linux/slab.h:15,
                 from ./include/linux/resource_ext.h:19,
                 from ./include/linux/acpi.h:26,
                 from ./include/acpi/apei.h:9,
                 from ./include/acpi/ghes.h:5,
                 from ./include/linux/arm_sdei.h:14,
                 from arch/arm64/kernel/asm-offsets.c:21:
./include/linux/mmzone.h:1095:2: error: #error Allocator MAX_ORDER exceeds SECTION_SIZE
 #error Allocator MAX_ORDER exceeds SECTION_SIZE
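
The failure can be reproduced with plain arithmetic. The sketch below is not the kernel's code: section_size_ok() is a hypothetical helper, and it assumes the build-time check in include/linux/mmzone.h compares MAX_ORDER - 1 + PAGE_SHIFT against SECTION_SIZE_BITS. It also assumes, from memory rather than from this thread, that the 64K-page defconfig ends up with MAX_ORDER = 14 and PAGE_SHIFT = 16, while a 4K-page build has MAX_ORDER = 11 and PAGE_SHIFT = 12.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the assumed shape of the build-time check:
 * the largest buddy allocation (order max_order - 1) must fit in a section.
 */
static bool section_size_ok(unsigned int max_order, unsigned int page_shift,
			    unsigned int section_size_bits)
{
	assert(max_order >= 1);
	return (max_order - 1 + page_shift) <= section_size_bits;
}
```

Under those assumptions 13 + 16 = 29, so 27 and 28 are rejected with 64K pages while 29 and 30 pass. With 4K pages 10 + 12 = 22, so every value in the proposed range 27..30 passes, which may be why the two builds disagree.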


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-23 20:38 [PATCH] arm64: configurable sparsemem section size Pavel Tatashin
  2019-04-24  9:07 ` Anshuman Khandual
@ 2019-04-25 15:25 ` Michal Hocko
  2019-04-25 15:31   ` Will Deacon
  1 sibling, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2019-04-25 15:25 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: jmorris, sashal, linux-kernel, linux-mm, linux-nvdimm, akpm,
	dave.hansen, dan.j.williams, keith.busch, vishal.l.verma,
	dave.jiang, zwisler, thomas.lendacky, ying.huang, fengguang.wu,
	bp, bhelgaas, baiyaowei, tiwai, jglisse, catalin.marinas,
	will.deacon, rppt, ard.biesheuvel, andrew.murray, james.morse,
	marc.zyngier, sboyd, linux-arm-kernel

On Tue 23-04-19 16:38:43, Pavel Tatashin wrote:
> sparsemem section size determines the maximum size and alignment that
> is allowed to offline/online memory block. The bigger the size the less
> the clutter in /sys/devices/system/memory/*. On the other hand, however,
> there is less flexability in what granules of memory can be added and
> removed.
> 
> Recently, it was enabled in Linux to hotadd persistent memory that
> can be either real NV device, or reserved from regular System RAM
> and has identity of devdax.
> 
> The problem is that because ARM64's section size is 1G, and devdax must
> have 2M label section, the first 1G is always missed when device is
> attached, because it is not 1G aligned.
> 
> Allow, better flexibility by making section size configurable.

Is there any inherent reason (64k page size?) that enforces such a large
memsection?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-25 15:25 ` Michal Hocko
@ 2019-04-25 15:31   ` Will Deacon
  2019-04-25 15:41     ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2019-04-25 15:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Pavel Tatashin, jmorris, sashal, linux-kernel, linux-mm,
	linux-nvdimm, akpm, dave.hansen, dan.j.williams, keith.busch,
	vishal.l.verma, dave.jiang, zwisler, thomas.lendacky, ying.huang,
	fengguang.wu, bp, bhelgaas, baiyaowei, tiwai, jglisse,
	catalin.marinas, rppt, ard.biesheuvel, andrew.murray,
	james.morse, marc.zyngier, sboyd, linux-arm-kernel

On Thu, Apr 25, 2019 at 05:25:50PM +0200, Michal Hocko wrote:
> On Tue 23-04-19 16:38:43, Pavel Tatashin wrote:
> > sparsemem section size determines the maximum size and alignment that
> > is allowed to offline/online memory block. The bigger the size the less
> > the clutter in /sys/devices/system/memory/*. On the other hand, however,
> > there is less flexability in what granules of memory can be added and
> > removed.
> > 
> > Recently, it was enabled in Linux to hotadd persistent memory that
> > can be either real NV device, or reserved from regular System RAM
> > and has identity of devdax.
> > 
> > The problem is that because ARM64's section size is 1G, and devdax must
> > have 2M label section, the first 1G is always missed when device is
> > attached, because it is not 1G aligned.
> > 
> > Allow, better flexibility by making section size configurable.
> 
> Is there any inherent reason (64k page size?) that enforces such a large
> memsection?

I have *vague* memories of running out of bits in the page flags if we
changed this, but that was a while back. If that's no longer the case,
then I'm open to changing the value, but I really don't want to expose
it as a Kconfig option as proposed in this patch. People won't have a
clue what to set and it doesn't help at all with the single-Image effort.

Will
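
As a rough illustration of the page-flags concern: when the section number is stored in page->flags, as classic (non-vmemmap) sparsemem does, the number of bits it needs is MAX_PHYSMEM_BITS - SECTION_SIZE_BITS. The helper below is hypothetical and only does that subtraction. Whether arm64 configurations of the time actually stored the section number in the flags is not established in this thread.

```c
#include <assert.h>

/*
 * Hypothetical helper: bits needed to encode a section number, assuming
 * it is computed as MAX_PHYSMEM_BITS - SECTION_SIZE_BITS.
 */
static unsigned int sections_shift(unsigned int max_physmem_bits,
				   unsigned int section_size_bits)
{
	assert(section_size_bits <= max_physmem_bits);
	return max_physmem_bits - section_size_bits;
}
```

With 48 PA bits, a 1G section needs 18 bits and a 128M section needs 21. With 52 PA bits and 128M sections it would be 25, so smaller sections do cost more bits wherever the section number has to live in the flags.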


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-25 15:31   ` Will Deacon
@ 2019-04-25 15:41     ` Michal Hocko
  2019-04-25 17:57       ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2019-04-25 15:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: Pavel Tatashin, jmorris, sashal, linux-kernel, linux-mm,
	linux-nvdimm, akpm, dave.hansen, dan.j.williams, keith.busch,
	vishal.l.verma, dave.jiang, zwisler, thomas.lendacky, ying.huang,
	fengguang.wu, bp, bhelgaas, baiyaowei, tiwai, jglisse,
	catalin.marinas, rppt, ard.biesheuvel, andrew.murray,
	james.morse, marc.zyngier, sboyd, linux-arm-kernel

On Thu 25-04-19 16:31:38, Will Deacon wrote:
> On Thu, Apr 25, 2019 at 05:25:50PM +0200, Michal Hocko wrote:
> > On Tue 23-04-19 16:38:43, Pavel Tatashin wrote:
> > > sparsemem section size determines the maximum size and alignment that
> > > is allowed to offline/online memory block. The bigger the size the less
> > > the clutter in /sys/devices/system/memory/*. On the other hand, however,
> > > there is less flexability in what granules of memory can be added and
> > > removed.
> > > 
> > > Recently, it was enabled in Linux to hotadd persistent memory that
> > > can be either real NV device, or reserved from regular System RAM
> > > and has identity of devdax.
> > > 
> > > The problem is that because ARM64's section size is 1G, and devdax must
> > > have 2M label section, the first 1G is always missed when device is
> > > attached, because it is not 1G aligned.
> > > 
> > > Allow, better flexibility by making section size configurable.
> > 
> > Is there any inherent reason (64k page size?) that enforces such a large
> > memsection?
> 
> I gave *vague* memories of running out of bits in the page flags if we
> changed this, but that was a while back. If that's no longer the case,
> then I'm open to changing the value, but I really don't want to expose
> it as a Kconfig option as proposed in this patch. People won't have a
> clue what to set and it doesn't help at all with the single-Image effort.

Ohh, I absolutely agree about the config option part JFTR. A 1GB section
looks quite excessive. I am not really sure how a standard arm64 memory
layout looks, though.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-25 15:41     ` Michal Hocko
@ 2019-04-25 17:57       ` Pavel Tatashin
  2019-04-26  5:33         ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2019-04-25 17:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Will Deacon, James Morris, Sasha Levin, LKML, linux-mm,
	linux-nvdimm, Andrew Morton, Dave Hansen, Dan Williams,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	Catalin Marinas, rppt, Ard Biesheuvel, andrew.murray,
	james.morse, Marc Zyngier, sboyd, Linux ARM

> > I gave *vague* memories of running out of bits in the page flags if we
> > changed this, but that was a while back. If that's no longer the case,
> > then I'm open to changing the value, but I really don't want to expose
> > it as a Kconfig option as proposed in this patch. People won't have a
> > clue what to set and it doesn't help at all with the single-Image effort.
>
> Ohh, I absolutely agree about the config option part JFTR. 1GB section
> loos quite excessive. I am not really sure a standard arm64 memory
> layout looks though.

I am now looking to use Dan's patches "mm: Sub-section memory hotplug
support" to solve this problem. I think this patch can be ignored.

Pasha


* Re: [PATCH] arm64: configurable sparsemem section size
  2019-04-25 17:57       ` Pavel Tatashin
@ 2019-04-26  5:33         ` Michal Hocko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2019-04-26  5:33 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Will Deacon, James Morris, Sasha Levin, LKML, linux-mm,
	linux-nvdimm, Andrew Morton, Dave Hansen, Dan Williams,
	Keith Busch, Vishal L Verma, Dave Jiang, Ross Zwisler,
	Tom Lendacky, Huang, Ying, Fengguang Wu, Borislav Petkov,
	Bjorn Helgaas, Yaowei Bai, Takashi Iwai, Jérôme Glisse,
	Catalin Marinas, rppt, Ard Biesheuvel, andrew.murray,
	james.morse, Marc Zyngier, sboyd, Linux ARM

On Thu 25-04-19 13:57:25, Pavel Tatashin wrote:
> > > I gave *vague* memories of running out of bits in the page flags if we
> > > changed this, but that was a while back. If that's no longer the case,
> > > then I'm open to changing the value, but I really don't want to expose
> > > it as a Kconfig option as proposed in this patch. People won't have a
> > > clue what to set and it doesn't help at all with the single-Image effort.
> >
> > Ohh, I absolutely agree about the config option part JFTR. 1GB section
> > loos quite excessive. I am not really sure a standard arm64 memory
> > layout looks though.
> 
> I am now looking to use Dan's patches "mm: Sub-section memory hotplug
> support" to solve this problem. I think this patch can be ignored.

Even if the subsection memory hotplug is going to be used then the
underlying question remains. If there is no real reason to use large
memsections then it would be better to use smaller ones.
-- 
Michal Hocko
SUSE Labs


Thread overview: 12+ messages
2019-04-23 20:38 [PATCH] arm64: configurable sparsemem section size Pavel Tatashin
2019-04-24  9:07 ` Anshuman Khandual
2019-04-24 19:48   ` Pavel Tatashin
2019-04-24 19:54     ` Pavel Tatashin
2019-04-24 20:24       ` Dan Williams
2019-04-24 20:33         ` Pavel Tatashin
2019-04-25  3:06     ` Anshuman Khandual
2019-04-25 15:25 ` Michal Hocko
2019-04-25 15:31   ` Will Deacon
2019-04-25 15:41     ` Michal Hocko
2019-04-25 17:57       ` Pavel Tatashin
2019-04-26  5:33         ` Michal Hocko
