linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, akpm@linux-foundation.org,
	will@kernel.org, mark.rutland@arm.com, david@redhat.com,
	cai@lca.pw, logang@deltatee.com, cpandya@codeaurora.org,
	arunks@codeaurora.org, dan.j.williams@intel.com,
	mgorman@techsingularity.net, osalvador@suse.de,
	ard.biesheuvel@arm.com, steve.capper@arm.com, broonie@kernel.org,
	valentin.schneider@arm.com, Robin.Murphy@arm.com,
	steven.price@arm.com, suzuki.poulose@arm.com,
	ira.weiny@intel.com
Subject: Re: [PATCH V9 2/2] arm64/mm: Enable memory hot remove
Date: Fri, 11 Oct 2019 08:26:32 +0530	[thread overview]
Message-ID: <f51cdb20-ddc4-4fb7-6c45-791d2e1e690c@arm.com> (raw)
In-Reply-To: <20191010113433.GI28269@mbp>

On 10/10/2019 05:04 PM, Catalin Marinas wrote:
> Hi Anshuman,
> 
> On Wed, Oct 09, 2019 at 01:51:48PM +0530, Anshuman Khandual wrote:
>> +static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
>> +				    unsigned long end, bool free_mapped)
>> +{
>> +	unsigned long next;
>> +	pmd_t *pmdp, pmd;
>> +
>> +	do {
>> +		next = pmd_addr_end(addr, end);
>> +		pmdp = pmd_offset(pudp, addr);
>> +		pmd = READ_ONCE(*pmdp);
>> +		if (pmd_none(pmd))
>> +			continue;
>> +
>> +		WARN_ON(!pmd_present(pmd));
>> +		if (pmd_sect(pmd)) {
>> +			pmd_clear(pmdp);
>> +			flush_tlb_kernel_range(addr, next);
> 
> The range here could be a whole PMD_SIZE. Since we are invalidating a
> single block entry, one TLBI should be sufficient:
> 
> 			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

Sure, will change.

> 
>> +			if (free_mapped)
>> +				free_hotplug_page_range(pmd_page(pmd),
>> +							PMD_SIZE);
>> +			continue;
>> +		}
>> +		WARN_ON(!pmd_table(pmd));
>> +		unmap_hotplug_pte_range(pmdp, addr, next, free_mapped);
>> +	} while (addr = next, addr < end);
>> +}
>> +
>> +static void unmap_hotplug_pud_range(pgd_t *pgdp, unsigned long addr,
>> +				    unsigned long end, bool free_mapped)
>> +{
>> +	unsigned long next;
>> +	pud_t *pudp, pud;
>> +
>> +	do {
>> +		next = pud_addr_end(addr, end);
>> +		pudp = pud_offset(pgdp, addr);
>> +		pud = READ_ONCE(*pudp);
>> +		if (pud_none(pud))
>> +			continue;
>> +
>> +		WARN_ON(!pud_present(pud));
>> +		if (pud_sect(pud)) {
>> +			pud_clear(pudp);
>> +			flush_tlb_kernel_range(addr, next);
> 			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

Will change.

> 
> [...]
>> +static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,
>> +				 unsigned long end, unsigned long floor,
>> +				 unsigned long ceiling)
>> +{
>> +	pte_t *ptep, pte;
>> +	unsigned long i, start = addr;
>> +
>> +	do {
>> +		ptep = pte_offset_kernel(pmdp, addr);
>> +		pte = READ_ONCE(*ptep);
>> +		WARN_ON(!pte_none(pte));
>> +	} while (addr += PAGE_SIZE, addr < end);
> 
> So this loop is just a sanity check (pte clearing having been done by
> the unmap loops). That's fine, maybe a comment for future reference.

Sure, will add.
> 
>> +
>> +	if (!pgtable_range_aligned(start, end, floor, ceiling, PMD_MASK))
>> +		return;
>> +
>> +	ptep = pte_offset_kernel(pmdp, 0UL);
>> +	for (i = 0; i < PTRS_PER_PTE; i++) {
>> +		if (!pte_none(READ_ONCE(ptep[i])))
>> +			return;
>> +	}
> 
> We could do with a comment for this loop along the lines of:
> 
> 	Check whether we can free the pte page if the rest of the
> 	entries are empty. Overlap with other regions have been handled
> 	by the floor/ceiling check.

Sure, will add.

> 
> Apart from the comments above, the rest of the patch looks fine. Once
> fixed:
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> Mark Rutland mentioned at some point that, as a preparatory patch to
> this series, we'd need to make sure we don't hot-remove memory already
> given to the kernel at boot. Any plans here?

Hmm, this series just enables platform memory hot remove as required from
generic memory hotplug framework. The path here is triggered either from
remove_memory() or __remove_memory() which takes physical memory range
arguments like (nid, start, size) and do the needful. arch_remove_memory()
should never be required to test given memory range for anything including
being part of the boot memory.

IIUC boot memory added to system with memblock_add() lose all it's identity
after the system is up and running. In order to reject any attempt to hot
remove boot memory, platform needs to remember all those memory that came
early in the boot and then scan through it during arch_remove_memory().

Ideally, it is the responsibility of [_]remove_memory() callers like ACPI
driver, DAX etc to make sure they never attempt to hot remove a memory
range, which never got hot added by them in the first place. Also, unlike
/sys/devices/system/memory/probe there is no 'unprobe' interface where the
user can just trigger boot memory removal. Hence, unless there is a bug in
ACPI, DAX or other callers, there should never be any attempt to hot remove
boot memory in the first place.

> 
> Thanks.
> 


  reply	other threads:[~2019-10-11  2:56 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-09  8:21 [PATCH V9 0/2] arm64/mm: Enable memory hot remove Anshuman Khandual
2019-10-09  8:21 ` [PATCH V9 1/2] arm64/mm: Hold memory hotplug lock while walking for kernel page table dump Anshuman Khandual
2019-10-09  8:21 ` [PATCH V9 2/2] arm64/mm: Enable memory hot remove Anshuman Khandual
2019-10-10 11:34   ` Catalin Marinas
2019-10-11  2:56     ` Anshuman Khandual [this message]
     [not found]       ` <20191018094825.GD19734@arrakis.emea.arm.com>
2019-10-21  9:53         ` Anshuman Khandual
2019-10-21  9:55           ` Anshuman Khandual
2019-10-25 17:09           ` James Morse
2019-10-28  8:25             ` Anshuman Khandual
2019-11-04  3:57               ` Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f51cdb20-ddc4-4fb7-6c45-791d2e1e690c@arm.com \
    --to=anshuman.khandual@arm.com \
    --cc=Robin.Murphy@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=ard.biesheuvel@arm.com \
    --cc=arunks@codeaurora.org \
    --cc=broonie@kernel.org \
    --cc=cai@lca.pw \
    --cc=catalin.marinas@arm.com \
    --cc=cpandya@codeaurora.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=logang@deltatee.com \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@techsingularity.net \
    --cc=osalvador@suse.de \
    --cc=steve.capper@arm.com \
    --cc=steven.price@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=valentin.schneider@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).