* [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
@ 2016-06-23 16:39 Reza Arbab
2016-06-23 17:17 ` Aneesh Kumar K.V
0 siblings, 1 reply; 8+ messages in thread
From: Reza Arbab @ 2016-06-23 16:39 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Aneesh Kumar K.V, Dan Williams, Balbir Singh, Gavin Shan,
David Gibson, Vasant Hegde, Scott Wood, Oliver O'Halloran,
linuxppc-dev, linux-kernel
These functions are making direct calls to the hash table APIs,
leading to a BUG() on systems using radix.
Switch them to the vmemmap_{create,remove}_mapping() wrappers, and
move to the __meminit section.
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
arch/powerpc/mm/mem.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 2fd57fa..80c6ee7 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -116,7 +116,7 @@ int memory_add_physaddr_to_nid(u64 start)
}
#endif
-int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
+int __meminit arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
struct pglist_data *pgdata;
struct zone *zone;
@@ -127,7 +127,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
pgdata = NODE_DATA(nid);
start = (unsigned long)__va(start);
- rc = create_section_mapping(start, start + size);
+ rc = vmemmap_create_mapping(start, size, __pa(start));
if (rc) {
pr_warning(
"Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
@@ -143,7 +143,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
}
#ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(u64 start, u64 size)
+int __meminit arch_remove_memory(u64 start, u64 size)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -157,7 +157,7 @@ int arch_remove_memory(u64 start, u64 size)
/* Remove htab bolted mappings for this section of memory */
start = (unsigned long)__va(start);
- ret = remove_section_mapping(start, start + size);
+ vmemmap_remove_mapping(start, size);
/* Ensure all vmalloc mappings are flushed in case they also
* hit that section of memory
--
1.8.3.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-23 16:39 [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix Reza Arbab
@ 2016-06-23 17:17 ` Aneesh Kumar K.V
2016-06-23 19:37 ` Reza Arbab
2016-06-24 3:22 ` Balbir Singh
0 siblings, 2 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2016-06-23 17:17 UTC (permalink / raw)
To: Reza Arbab, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, Dan Williams, Balbir Singh, Gavin Shan,
David Gibson, Vasant Hegde, Scott Wood, Oliver O'Halloran,
linuxppc-dev, linux-kernel
Reza Arbab <arbab@linux.vnet.ibm.com> writes:
> These functions are making direct calls to the hash table APIs,
> leading to a BUG() on systems using radix.
>
> Switch them to the vmemmap_{create,remove}_mapping() wrappers, and
> move to the __meminit section.
They are really not the same. They can possibly end up using different
base page size. Also vmemmap is available only with SPARSEMEM_VMEMMAP
enabled. Does hotplug depend on sparsemem vmemmap ?
>
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> ---
> arch/powerpc/mm/mem.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 2fd57fa..80c6ee7 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -116,7 +116,7 @@ int memory_add_physaddr_to_nid(u64 start)
> }
> #endif
>
> -int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> +int __meminit arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> {
> struct pglist_data *pgdata;
> struct zone *zone;
> @@ -127,7 +127,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> pgdata = NODE_DATA(nid);
>
> start = (unsigned long)__va(start);
> - rc = create_section_mapping(start, start + size);
> + rc = vmemmap_create_mapping(start, size, __pa(start));
> if (rc) {
> pr_warning(
> "Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
> @@ -143,7 +143,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
> }
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> -int arch_remove_memory(u64 start, u64 size)
> +int __meminit arch_remove_memory(u64 start, u64 size)
> {
> unsigned long start_pfn = start >> PAGE_SHIFT;
> unsigned long nr_pages = size >> PAGE_SHIFT;
> @@ -157,7 +157,7 @@ int arch_remove_memory(u64 start, u64 size)
>
> /* Remove htab bolted mappings for this section of memory */
> start = (unsigned long)__va(start);
> - ret = remove_section_mapping(start, start + size);
> + vmemmap_remove_mapping(start, size);
>
> /* Ensure all vmalloc mappings are flushed in case they also
> * hit that section of memory
> --
> 1.8.3.1
We really want to look at memory hotplug to fully understand the
radix impact. The unplug operation needs to do some additional freeing.
I did put in comments around that with the idea of coming back to that
later
#ifdef CONFIG_MEMORY_HOTPLUG
void radix__vmemmap_remove_mapping(unsigned long start, unsigned long page_size)
{
/* FIXME!! intel does more. We should free page tables mapping vmemmap ? */
}
#endif
-aneesh
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-23 17:17 ` Aneesh Kumar K.V
@ 2016-06-23 19:37 ` Reza Arbab
2016-06-23 19:52 ` Reza Arbab
2016-06-28 11:21 ` Michael Ellerman
2016-06-24 3:22 ` Balbir Singh
1 sibling, 2 replies; 8+ messages in thread
From: Reza Arbab @ 2016-06-23 19:37 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Dan Williams, Balbir Singh, Gavin Shan, David Gibson,
Vasant Hegde, Scott Wood, Oliver O'Halloran, Nathan Fontenot,
linuxppc-dev, linux-kernel
On Thu, Jun 23, 2016 at 10:47:20PM +0530, Aneesh Kumar K.V wrote:
>Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>> These functions are making direct calls to the hash table APIs,
>> leading to a BUG() on systems using radix.
>>
>> Switch them to the vmemmap_{create,remove}_mapping() wrappers, and
>> move to the __meminit section.
>
>They are really not the same. They can possibly end up using different
>base page size. Also vmemmap is available only with SPARSEMEM_VMEMMAP
>enabled. Does hotplug depend on sparsemem vmemmap ?
I'm not sure. Maybe it's best if I back up a step and explain what lead
me to this patch. During hotplug, you get
...
arch_add_memory
create_section_mapping
htab_bolt_mapping
BUG_ON(!ppc_md.hpte_insert);
So it seemed to me that I needed a radix equivalent of
create_section_mapping().
After some digging, I found hash__vmemmap_create_mapping() and
radix__vmemmap_create_mapping() did what I needed. I did not notice the
#ifdef SPARSEMEM_VMEMMAP around them.
Hotplug/remove are now working for me as far as I can tell, so I think
the functional change in itself was correct. Hotplug may not depend on
sparsemem vmemmap, but we seem to need a radix create_section_mapping()
regardless.
Could it be that the functions just need to be renamed
hash__create_mapping()/radix__create_mapping() and moved out of #ifdef
SPARSEMEM_VMEMMAP?
--
Reza Arbab
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-23 19:37 ` Reza Arbab
@ 2016-06-23 19:52 ` Reza Arbab
2016-06-28 11:21 ` Michael Ellerman
1 sibling, 0 replies; 8+ messages in thread
From: Reza Arbab @ 2016-06-23 19:52 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Dan Williams, Balbir Singh, Gavin Shan, David Gibson,
Vasant Hegde, Scott Wood, Oliver O'Halloran, Nathan Fontenot,
linuxppc-dev, linux-kernel
On Thu, Jun 23, 2016 at 02:37:39PM -0500, Reza Arbab wrote:
>Could it be that the functions just need to be renamed
>hash__create_mapping()/radix__create_mapping() and moved out of #ifdef
>SPARSEMEM_VMEMMAP?
Or vice-versa, maybe the callers should have been wrapped in the first
place:
arch_add_memory() {
...
if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
vmemmap_create_mapping()
...
}
arch_remove_memory() {
...
if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
vmemmap_remove_mapping()
...
}
--
Reza Arbab
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-23 17:17 ` Aneesh Kumar K.V
2016-06-23 19:37 ` Reza Arbab
@ 2016-06-24 3:22 ` Balbir Singh
1 sibling, 0 replies; 8+ messages in thread
From: Balbir Singh @ 2016-06-24 3:22 UTC (permalink / raw)
To: Aneesh Kumar K.V, Reza Arbab, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dan Williams, Gavin Shan,
David Gibson, Vasant Hegde, Scott Wood, Oliver O'Halloran,
linuxppc-dev, linux-kernel
On 24/06/16 03:17, Aneesh Kumar K.V wrote:
> Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>
>> These functions are making direct calls to the hash table APIs,
>> leading to a BUG() on systems using radix.
>>
>> Switch them to the vmemmap_{create,remove}_mapping() wrappers, and
>> move to the __meminit section.
>
>
> They are really not the same. They can possibly end up using different
> base page size. Also vmemmap is available only with SPARSEMEM_VMEMMAP
> enabled. Does hotplug depend on sparsemem vmemmap ?
# eventually, we can have this option just 'select SPARSEMEM'
config MEMORY_HOTPLUG
bool "Allow for memory hot-add"
depends on SPARSEMEM || X86_64_ACPI_NUMA
depends on ARCH_ENABLE_MEMORY_HOTPLUG
We depend on sparsemem for sure. vmemmap is just a way of getting the memory
virtually mapped. From the patch perspective, I think we need the equivalent of
just mapping the pages in kernel. The address may differ based on whether vmemmap
is used or not and of-course page_size,
Balbir Singh
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-23 19:37 ` Reza Arbab
2016-06-23 19:52 ` Reza Arbab
@ 2016-06-28 11:21 ` Michael Ellerman
2016-06-28 14:03 ` Aneesh Kumar K.V
2016-06-29 15:37 ` Reza Arbab
1 sibling, 2 replies; 8+ messages in thread
From: Michael Ellerman @ 2016-06-28 11:21 UTC (permalink / raw)
To: Reza Arbab, Aneesh Kumar K.V
Cc: Benjamin Herrenschmidt, Paul Mackerras, Dan Williams,
Balbir Singh, Gavin Shan, David Gibson, Vasant Hegde, Scott Wood,
Oliver O'Halloran, Nathan Fontenot, linuxppc-dev,
linux-kernel
On Thu, 2016-06-23 at 14:37 -0500, Reza Arbab wrote:
> On Thu, Jun 23, 2016 at 10:47:20PM +0530, Aneesh Kumar K.V wrote:
> > Reza Arbab <arbab@linux.vnet.ibm.com> writes:
> > > These functions are making direct calls to the hash table APIs,
> > > leading to a BUG() on systems using radix.
> > >
> > > Switch them to the vmemmap_{create,remove}_mapping() wrappers, and
> > > move to the __meminit section.
> >
> > They are really not the same. They can possibly end up using different
> > base page size. Also vmemmap is available only with SPARSEMEM_VMEMMAP
> > enabled. Does hotplug depend on sparsemem vmemmap ?
>
> I'm not sure. Maybe it's best if I back up a step and explain what lead
> me to this patch. During hotplug, you get
>
> ...
> arch_add_memory
> create_section_mapping
> htab_bolt_mapping
> BUG_ON(!ppc_md.hpte_insert);
>
> So it seemed to me that I needed a radix equivalent of
> create_section_mapping().
>
> After some digging, I found hash__vmemmap_create_mapping() and
> radix__vmemmap_create_mapping() did what I needed. I did not notice the
> #ifdef SPARSEMEM_VMEMMAP around them.
I think that's more by luck than design. The vmemmap routines use
mmu_vmemmap_psize which is probably but not definitely the same as
mmu_linear_psize.
> Could it be that the functions just need to be renamed
> hash__create_mapping()/radix__create_mapping() and moved out of #ifdef
> SPARSEMEM_VMEMMAP?
No, you need to use mmu_linear_psize for the hotplug case.
But you can probably factor out a common routine that both cases use, and hide
the hash vs radix check in that.
And probably send me a patch to make MEMORY_HOTPLUG depend on !RADIX for v4.7?
cheers
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-28 11:21 ` Michael Ellerman
@ 2016-06-28 14:03 ` Aneesh Kumar K.V
2016-06-29 15:37 ` Reza Arbab
1 sibling, 0 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2016-06-28 14:03 UTC (permalink / raw)
To: Michael Ellerman, Reza Arbab
Cc: Benjamin Herrenschmidt, Paul Mackerras, Dan Williams,
Balbir Singh, Gavin Shan, David Gibson, Vasant Hegde, Scott Wood,
Oliver O'Halloran, Nathan Fontenot, linuxppc-dev,
linux-kernel
Michael Ellerman <mpe@ellerman.id.au> writes:
> On Thu, 2016-06-23 at 14:37 -0500, Reza Arbab wrote:
>> On Thu, Jun 23, 2016 at 10:47:20PM +0530, Aneesh Kumar K.V wrote:
>> > Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>> > > These functions are making direct calls to the hash table APIs,
>> > > leading to a BUG() on systems using radix.
>> > >
>> > > Switch them to the vmemmap_{create,remove}_mapping() wrappers, and
>> > > move to the __meminit section.
>> >
>> > They are really not the same. They can possibly end up using different
>> > base page size. Also vmemmap is available only with SPARSEMEM_VMEMMAP
>> > enabled. Does hotplug depend on sparsemem vmemmap ?
>>
>> I'm not sure. Maybe it's best if I back up a step and explain what lead
>> me to this patch. During hotplug, you get
>>
>> ...
>> arch_add_memory
>> create_section_mapping
>> htab_bolt_mapping
>> BUG_ON(!ppc_md.hpte_insert);
>>
>> So it seemed to me that I needed a radix equivalent of
>> create_section_mapping().
>>
>> After some digging, I found hash__vmemmap_create_mapping() and
>> radix__vmemmap_create_mapping() did what I needed. I did not notice the
>> #ifdef SPARSEMEM_VMEMMAP around them.
>
> I think that's more by luck than design. The vmemmap routines use
> mmu_vmemmap_psize which is probably but not definitely the same as
> mmu_linear_psize.
>
>> Could it be that the functions just need to be renamed
>> hash__create_mapping()/radix__create_mapping() and moved out of #ifdef
>> SPARSEMEM_VMEMMAP?
>
> No, you need to use mmu_linear_psize for the hotplug case.
>
> But you can probably factor out a common routine that both cases use, and hide
> the hash vs radix check in that.
>
> And probably send me a patch to make MEMORY_HOTPLUG depend on !RADIX for v4.7?
Few other stuff we need to still look from Radix point
1) machine check handling/memory errors
2) kexec
-aneesh
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix
2016-06-28 11:21 ` Michael Ellerman
2016-06-28 14:03 ` Aneesh Kumar K.V
@ 2016-06-29 15:37 ` Reza Arbab
1 sibling, 0 replies; 8+ messages in thread
From: Reza Arbab @ 2016-06-29 15:37 UTC (permalink / raw)
To: Michael Ellerman
Cc: Aneesh Kumar K.V, Benjamin Herrenschmidt, Paul Mackerras,
Dan Williams, Balbir Singh, Gavin Shan, David Gibson,
Vasant Hegde, Scott Wood, Oliver O'Halloran, Nathan Fontenot,
linuxppc-dev, linux-kernel
On Tue, Jun 28, 2016 at 09:21:05PM +1000, Michael Ellerman wrote:
>No, you need to use mmu_linear_psize for the hotplug case.
>
>But you can probably factor out a common routine that both cases use, and hide
>the hash vs radix check in that.
Okay, I'm trying to refactor {create,remove}_section_mapping() into
hash__ and radix__ variants. This lead to a couple of questions.
Pseudocode for radix__create_section_mapping(start, end):
page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
start = _ALIGN_DOWN(start, page_size);
for (; start < end; start += page_size) {
radix__map_kernel_page(start, __pa(start),
PAGE_KERNEL, page_size);
}
Should the above use PAGE_KERNEL, like the the hash table bolt, or
(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_KERNEL_RW), like in the radix
vmemmap creation?
The other question is what radix__remove_section_mapping() should do.
I don't know offhand what the opposite of map_kernel_page() is. As
Aneesh mentioned, radix vmemmap removal is currently stubbed as a FIXME
so I couldn't use that as a reference.
--
Reza Arbab
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-06-29 15:43 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-23 16:39 [PATCH] powerpc/mm: update arch_{add,remove}_memory() for radix Reza Arbab
2016-06-23 17:17 ` Aneesh Kumar K.V
2016-06-23 19:37 ` Reza Arbab
2016-06-23 19:52 ` Reza Arbab
2016-06-28 11:21 ` Michael Ellerman
2016-06-28 14:03 ` Aneesh Kumar K.V
2016-06-29 15:37 ` Reza Arbab
2016-06-24 3:22 ` Balbir Singh
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).