All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, akpm@linux-foundation.org,
	catalin.marinas@arm.com, will.deacon@arm.com, mhocko@suse.com,
	mgorman@techsingularity.net, james.morse@arm.com,
	robin.murphy@arm.com, cpandya@codeaurora.org,
	arunks@codeaurora.org, dan.j.williams@intel.com,
	osalvador@suse.de, david@redhat.com, cai@lca.pw,
	logang@deltatee.com, ira.weiny@intel.com
Subject: Re: [PATCH V3 4/4] arm64/mm: Enable memory hot remove
Date: Fri, 17 May 2019 08:45:27 +0530	[thread overview]
Message-ID: <f83469e3-c514-cc37-a7d0-c8b57e242ebe@arm.com> (raw)
In-Reply-To: <20190516105741.GC40960@lakrids.cambridge.arm.com>



On 05/16/2019 04:27 PM, Mark Rutland wrote:
> On Thu, May 16, 2019 at 11:04:48AM +0530, Anshuman Khandual wrote:
>> On 05/15/2019 05:19 PM, Mark Rutland wrote:
>>> On Tue, May 14, 2019 at 02:30:07PM +0530, Anshuman Khandual wrote:
>>>> Memory removal from an arch perspective involves tearing down two different
>>>> kernel based mappings i.e vmemmap and linear while releasing related page
>>>> table and any mapped pages allocated for given physical memory range to be
>>>> removed.
>>>>
>>>> Define a common kernel page table tear down helper remove_pagetable() which
>>>> can be used to unmap given kernel virtual address range. In effect it can
>>>> tear down both vmemap or kernel linear mappings. This new helper is called
>>>> from both vmemamp_free() and ___remove_pgd_mapping() during memory removal.
>>>>
>>>> For linear mapping there are no actual allocated pages which are mapped to
>>>> create the translation. Any pfn on a given entry is derived from physical
>>>> address (__va(PA) --> PA) whose linear translation is to be created. They
>>>> need not be freed as they were never allocated in the first place. But for
>>>> vmemmap which is a real virtual mapping (like vmalloc) physical pages are
>>>> allocated either from buddy or memblock which get mapped in the kernel page
>>>> table. These allocated and mapped pages need to be freed during translation
>>>> tear down. But page table pages need to be freed in both these cases.
>>>
>>> As previously discussed, we should only hot-remove memory which was
>>> hot-added, so we shouldn't encounter memory allocated from memblock.
>>
>> Right, not applicable any more. Will drop this word.
>>
>>>> These mappings need to be differentiated while deciding if a mapped page at
>>>> any level i.e [pte|pmd|pud]_page() should be freed or not. Callers for the
>>>> mapping tear down process should pass on 'sparse_vmap' variable identifying
>>>> kernel vmemmap mappings.
>>>
>>> I think that you can simplify the paragraphs above down to:
>>>
>>>   The arch code for hot-remove must tear down portions of the linear map
>>>   and vmemmap corresponding to memory being removed. In both cases the
>>>   page tables mapping these regions must be freed, and when sparse
>>>   vmemmap is in use the memory backing the vmemmap must also be freed.
>>>
>>>   This patch adds a new remove_pagetable() helper which can be used to
>>>   tear down either region, and calls it from vmemmap_free() and
>>>   ___remove_pgd_mapping(). The sparse_vmap argument determines whether
>>>   the backing memory will be freed.
>>
>> The current one is bit more descriptive on detail. Anyways will replace with
>> the above writeup if that is preferred.
> 
> I would prefer the suggested form above, as it's easier to extract the
> necessary details from it.

Fair enough.

> 
> [...]
> 
>>>> +static void
>>>> +remove_pagetable(unsigned long start, unsigned long end, bool sparse_vmap)
>>>> +{
>>>> +	unsigned long addr, next;
>>>> +	pud_t *pudp_base;
>>>> +	pgd_t *pgdp;
>>>> +
>>>> +	spin_lock(&init_mm.page_table_lock);
>>>
>>> It would be good to explain why we need to take the ptl here.
>>
>> Will update both commit message and add an in-code comment here.
>>
>>>
>>> IIUC that shouldn't be necessary for the linear map. Am I mistaken?
>>
>> Its not absolutely necessary for linear map right now because both memory hot
>> plug & ptdump which modifies or walks the page table ranges respectively take
>> memory hotplug lock. That apart, no other callers creates or destroys linear
>> mapping at runtime.
>>
>>>
>>> Is there a specific race when tearing down the vmemmap?
>>
>> This is trickier than linear map. vmemmap additions would be protected with
>> memory hotplug lock but this can potential collide with vmalloc/IO regions.
>> Even if they dont right now that will be because they dont share intermediate
>> page table levels.
> 
> Sure; if we could just state something like:
> 
>   The vmemmap region may share levels of table with the vmalloc region.
>   Take the ptl so that we can safely free potentially-sahred tables.
> 
> ... I think that would be sufficient.

Will do.

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: cai@lca.pw, mhocko@suse.com, ira.weiny@intel.com,
	david@redhat.com, catalin.marinas@arm.com, will.deacon@arm.com,
	linux-kernel@vger.kernel.org, logang@deltatee.com,
	james.morse@arm.com, cpandya@codeaurora.org,
	arunks@codeaurora.org, akpm@linux-foundation.org,
	osalvador@suse.de, mgorman@techsingularity.net,
	dan.j.williams@intel.com, linux-arm-kernel@lists.infradead.org,
	robin.murphy@arm.com
Subject: Re: [PATCH V3 4/4] arm64/mm: Enable memory hot remove
Date: Fri, 17 May 2019 08:45:27 +0530	[thread overview]
Message-ID: <f83469e3-c514-cc37-a7d0-c8b57e242ebe@arm.com> (raw)
In-Reply-To: <20190516105741.GC40960@lakrids.cambridge.arm.com>



On 05/16/2019 04:27 PM, Mark Rutland wrote:
> On Thu, May 16, 2019 at 11:04:48AM +0530, Anshuman Khandual wrote:
>> On 05/15/2019 05:19 PM, Mark Rutland wrote:
>>> On Tue, May 14, 2019 at 02:30:07PM +0530, Anshuman Khandual wrote:
>>>> Memory removal from an arch perspective involves tearing down two different
>>>> kernel based mappings i.e vmemmap and linear while releasing related page
>>>> table and any mapped pages allocated for given physical memory range to be
>>>> removed.
>>>>
>>>> Define a common kernel page table tear down helper remove_pagetable() which
>>>> can be used to unmap given kernel virtual address range. In effect it can
>>>> tear down both vmemap or kernel linear mappings. This new helper is called
>>>> from both vmemamp_free() and ___remove_pgd_mapping() during memory removal.
>>>>
>>>> For linear mapping there are no actual allocated pages which are mapped to
>>>> create the translation. Any pfn on a given entry is derived from physical
>>>> address (__va(PA) --> PA) whose linear translation is to be created. They
>>>> need not be freed as they were never allocated in the first place. But for
>>>> vmemmap which is a real virtual mapping (like vmalloc) physical pages are
>>>> allocated either from buddy or memblock which get mapped in the kernel page
>>>> table. These allocated and mapped pages need to be freed during translation
>>>> tear down. But page table pages need to be freed in both these cases.
>>>
>>> As previously discussed, we should only hot-remove memory which was
>>> hot-added, so we shouldn't encounter memory allocated from memblock.
>>
>> Right, not applicable any more. Will drop this word.
>>
>>>> These mappings need to be differentiated while deciding if a mapped page at
>>>> any level i.e [pte|pmd|pud]_page() should be freed or not. Callers for the
>>>> mapping tear down process should pass on 'sparse_vmap' variable identifying
>>>> kernel vmemmap mappings.
>>>
>>> I think that you can simplify the paragraphs above down to:
>>>
>>>   The arch code for hot-remove must tear down portions of the linear map
>>>   and vmemmap corresponding to memory being removed. In both cases the
>>>   page tables mapping these regions must be freed, and when sparse
>>>   vmemmap is in use the memory backing the vmemmap must also be freed.
>>>
>>>   This patch adds a new remove_pagetable() helper which can be used to
>>>   tear down either region, and calls it from vmemmap_free() and
>>>   ___remove_pgd_mapping(). The sparse_vmap argument determines whether
>>>   the backing memory will be freed.
>>
>> The current one is bit more descriptive on detail. Anyways will replace with
>> the above writeup if that is preferred.
> 
> I would prefer the suggested form above, as it's easier to extract the
> necessary details from it.

Fair enough.

> 
> [...]
> 
>>>> +static void
>>>> +remove_pagetable(unsigned long start, unsigned long end, bool sparse_vmap)
>>>> +{
>>>> +	unsigned long addr, next;
>>>> +	pud_t *pudp_base;
>>>> +	pgd_t *pgdp;
>>>> +
>>>> +	spin_lock(&init_mm.page_table_lock);
>>>
>>> It would be good to explain why we need to take the ptl here.
>>
>> Will update both commit message and add an in-code comment here.
>>
>>>
>>> IIUC that shouldn't be necessary for the linear map. Am I mistaken?
>>
>> Its not absolutely necessary for linear map right now because both memory hot
>> plug & ptdump which modifies or walks the page table ranges respectively take
>> memory hotplug lock. That apart, no other callers creates or destroys linear
>> mapping at runtime.
>>
>>>
>>> Is there a specific race when tearing down the vmemmap?
>>
>> This is trickier than linear map. vmemmap additions would be protected with
>> memory hotplug lock but this can potential collide with vmalloc/IO regions.
>> Even if they dont right now that will be because they dont share intermediate
>> page table levels.
> 
> Sure; if we could just state something like:
> 
>   The vmemmap region may share levels of table with the vmalloc region.
>   Take the ptl so that we can safely free potentially-sahred tables.
> 
> ... I think that would be sufficient.

Will do.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-05-17  3:15 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-14  9:00 [PATCH V3 0/4] arm64/mm: Enable memory hot remove Anshuman Khandual
2019-05-14  9:00 ` Anshuman Khandual
2019-05-14  9:00 ` [PATCH V3 1/4] mm/hotplug: Reorder arch_remove_memory() call in __remove_memory() Anshuman Khandual
2019-05-14  9:00   ` Anshuman Khandual
2019-05-14  9:05   ` David Hildenbrand
2019-05-14  9:05     ` David Hildenbrand
2019-05-14  9:00 ` [PATCH V3 2/4] arm64/mm: Hold memory hotplug lock while walking for kernel page table dump Anshuman Khandual
2019-05-14  9:00   ` Anshuman Khandual
2019-05-14  9:16   ` David Hildenbrand
2019-05-14  9:16     ` David Hildenbrand
2019-05-14 15:40   ` Mark Rutland
2019-05-14 15:40     ` Mark Rutland
2019-05-15  1:56     ` Anshuman Khandual
2019-05-15  1:56       ` Anshuman Khandual
2019-05-15 16:58   ` Michal Hocko
2019-05-15 16:58     ` Michal Hocko
2019-05-16 10:23     ` Mark Rutland
2019-05-16 10:23       ` Mark Rutland
2019-05-16 11:05       ` Michal Hocko
2019-05-16 11:05         ` Michal Hocko
2019-05-22 16:42         ` Mark Rutland
2019-05-22 16:42           ` Mark Rutland
2019-05-24  6:06           ` Anshuman Khandual
2019-05-24  6:06             ` Anshuman Khandual
2019-05-27  7:20           ` Michal Hocko
2019-05-27  7:20             ` Michal Hocko
2019-05-28 14:09             ` Mark Rutland
2019-05-28 14:09               ` Mark Rutland
2019-05-16 11:06       ` Anshuman Khandual
2019-05-16 11:06         ` Anshuman Khandual
2019-05-16 11:16         ` Michal Hocko
2019-05-16 11:16           ` Michal Hocko
2019-05-23  8:40           ` David Hildenbrand
2019-05-23  8:40             ` David Hildenbrand
2019-05-24  5:43             ` Anshuman Khandual
2019-05-24  5:43               ` Anshuman Khandual
2019-05-14  9:00 ` [PATCH V3 3/4] arm64/mm: Inhibit huge-vmap with ptdump Anshuman Khandual
2019-05-14  9:00   ` Anshuman Khandual
2019-05-14  9:20   ` David Hildenbrand
2019-05-14  9:20     ` David Hildenbrand
2019-05-16  8:38   ` Ard Biesheuvel
2019-05-16  8:38     ` Ard Biesheuvel
2019-05-14  9:00 ` [PATCH V3 4/4] arm64/mm: Enable memory hot remove Anshuman Khandual
2019-05-14  9:00   ` Anshuman Khandual
2019-05-14  9:08   ` David Hildenbrand
2019-05-14  9:08     ` David Hildenbrand
2019-05-15 11:49   ` Mark Rutland
2019-05-15 11:49     ` Mark Rutland
2019-05-16  5:34     ` Anshuman Khandual
2019-05-16  5:34       ` Anshuman Khandual
2019-05-16 10:57       ` Mark Rutland
2019-05-16 10:57         ` Mark Rutland
2019-05-17  3:15         ` Anshuman Khandual [this message]
2019-05-17  3:15           ` Anshuman Khandual
2019-05-14  9:10 ` [PATCH V3 0/4] " David Hildenbrand
2019-05-14  9:10   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f83469e3-c514-cc37-a7d0-c8b57e242ebe@arm.com \
    --to=anshuman.khandual@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=arunks@codeaurora.org \
    --cc=cai@lca.pw \
    --cc=catalin.marinas@arm.com \
    --cc=cpandya@codeaurora.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=ira.weiny@intel.com \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=robin.murphy@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.