All of lore.kernel.org
 help / color / mirror / Atom feed
From: <sudaraja@codeaurora.org>
To: "'Anshuman Khandual'" <anshuman.khandual@arm.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>
Cc: 'Mark Rutland' <mark.rutland@arm.com>,
	'Will Deacon' <will@kernel.org>,
	'David Hildenbrand' <david@redhat.com>,
	'Catalin Marinas' <catalin.marinas@arm.com>,
	'Steven Price' <steven.price@arm.com>,
	'Andrew Morton' <akpm@linux-foundation.org>,
	'Logan Gunthorpe' <logang@deltatee.com>,
	pratikp@codeaurora.org
Subject: RE: [PATCH] arm64/mm: add fallback option to allocate virtually contiguous memory
Date: Thu, 10 Sep 2020 08:27:09 +0000	[thread overview]
Message-ID: <0101017477202341-51a1c9d3-5ba4-4328-b24c-0b29317af168-000000@us-west-2.amazonses.com> (raw)
In-Reply-To: <0475e813-cb95-0992-39e4-593bfd5cdbf8@arm.com>

Hello Anshuman,

>On 09/10/2020 11:35 AM, Sudarshan Rajagopalan wrote:
>> When section mappings are enabled, we allocate vmemmap pages from 
>> physically continuous memory of size PMD_SZIE using 
>> vmemmap_alloc_block_buf(). Section> mappings are good to reduce TLB 
>> pressure. But when system is highly fragmented and memory blocks are 
>> being hot-added at runtime, its possible that such physically 
>> continuous memory allocations can fail. Rather than failing the
>
>Did you really see this happen on a system ?

Thanks for the response.

Yes, this happened on a system with very low RAM (size ~120MB) where no free order-9 pages were present. Pasting below few kernel logs. On systems with low RAM, its high probability where memory is fragmented and no higher order pages are free. On such scenarios, vmemmap alloc would fail for PMD_SIZE of contiguous memory.

We have a usecase for memory sharing between VMs where one of the VM uses add_memory() to add the memory that was donated by the other VM. This uses something similar to VirtIO-Mem. And this requires memory to be _guaranteed_ to be added in the VM so that the usecase can run without any failure.

vmemmap alloc failure: order:9, mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), nodemask=(null),cpuset=/,mems_allowed=0
CPU: 1 PID: 294 Comm: -------- Tainted: G S                5.4.50 #1
Call trace:
 dump_stack+0xa4/0xdc
 warn_alloc+0x104/0x160
 vmemmap_alloc_block+0xe4/0xf4
 vmemmap_alloc_block_buf+0x34/0x38
 vmemmap_populate+0xc8/0x224
 __populate_section_memmap+0x34/0x54
 sparse_add_section+0x16c/0x254
 __add_pages+0xd0/0x138
 arch_add_memory+0x114/0x1a8

DMA32: 2627*4kB (UMC) 23*8kB (UME) 6*16kB (UM) 8*32kB (UME) 2*64kB (ME) 2*128kB (UE) 1*256kB (M) 2*512kB (ME) 1*1024kB (M) 0*2048kB 0*4096kB = 13732kB
30455 pages RAM

But keeping this usecase aside, won’t this be problematic on any systems with low RAM where order-9 alloc would fail on a fragmented system, and any memory hot-adding would fail? Or other similar users of VirtIO-Mem which uses arch_add_memory.

>
>> memory hot-add procedure, add a fallback option to allocate vmemmap 
>> pages from discontinuous pages using vmemmap_populate_basepages().
>
>Which could lead to a mixed page size mapping in the VMEMMAP area.

Would this be problematic? We would only lose one section mapping per failure and increases slight TLB pressure. Also, we would anyway do discontinuous pages alloc for systems having non-4K pages (ARM64_SWAPPER_USES_SECTION_MAPS will be 0). I only see a small cost to performance due to slight TLB pressure.

>Allocation failure in vmemmap_populate() should just cleanly fail the memory hot add operation, which can then be retried. Why the retry has to be offloaded to kernel ?

While a retry can attempted again, but it won’t help in cases where there are no order-9 pages available and any retry would just not succeed again until a order-9 page gets free'ed. Here we are just falling back to use discontinuous pages allocation to help succeed memory hot-add as best as possible.

Thanks and Regards,
Sudarshan

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

-----Original Message-----
From: Anshuman Khandual <anshuman.khandual@arm.com> 
Sent: Wednesday, September 9, 2020 11:45 PM
To: Sudarshan Rajagopalan <sudaraja@codeaurora.org>; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>; Will Deacon <will@kernel.org>; Mark Rutland <mark.rutland@arm.com>; Logan Gunthorpe <logang@deltatee.com>; David Hildenbrand <david@redhat.com>; Andrew Morton <akpm@linux-foundation.org>; Steven Price <steven.price@arm.com>
Subject: Re: [PATCH] arm64/mm: add fallback option to allocate virtually contiguous memory

Hello Sudarshan,

On 09/10/2020 11:35 AM, Sudarshan Rajagopalan wrote:
> When section mappings are enabled, we allocate vmemmap pages from 
> physically continuous memory of size PMD_SZIE using 
> vmemmap_alloc_block_buf(). Section> mappings are good to reduce TLB 
> pressure. But when system is highly fragmented and memory blocks are 
> being hot-added at runtime, its possible that such physically 
> continuous memory allocations can fail. Rather than failing the

Did you really see this happen on a system ?

> memory hot-add procedure, add a fallback option to allocate vmemmap 
> pages from discontinuous pages using vmemmap_populate_basepages().

Which could lead to a mixed page size mapping in the VMEMMAP area.
Allocation failure in vmemmap_populate() should just cleanly fail the memory hot add operation, which can then be retried. Why the retry has to be offloaded to kernel ?

> 
> Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 
> 75df62f..a46c7d4 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1100,6 +1100,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>  	p4d_t *p4dp;
>  	pud_t *pudp;
>  	pmd_t *pmdp;
> +	int ret = 0;
>  
>  	do {
>  		next = pmd_addr_end(addr, end);
> @@ -1121,15 +1122,23 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>  			void *p = NULL;
>  
>  			p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
> -			if (!p)
> -				return -ENOMEM;
> +			if (!p) {
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +				vmemmap_free(start, end, altmap); #endif

The mapping was never created in the first place, as the allocation failed. vmemmap_free() here will free an unmapped area !

> +				ret = -ENOMEM;
> +				break;
> +			}
>  
>  			pmd_set_huge(pmdp, __pa(p), __pgprot(PROT_SECT_NORMAL));
>  		} else
>  			vmemmap_verify((pte_t *)pmdp, node, addr, next);
>  	} while (addr = next, addr != end);
>  
> -	return 0;
> +	if (ret)
> +		return vmemmap_populate_basepages(start, end, node, altmap);
> +	else
> +		return ret;
>  }
>  #endif	/* !ARM64_SWAPPER_USES_SECTION_MAPS */
>  void vmemmap_free(unsigned long start, unsigned long end,
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2020-09-10  8:28 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <01010174769e2b68-a6f3768e-aef8-43c7-b357-a8cb1e17d3eb-000000@us-west-2.amazonses.com>
2020-09-10  6:45 ` [PATCH] arm64/mm: add fallback option to allocate virtually contiguous memory Anshuman Khandual
2020-09-10  6:45   ` Anshuman Khandual
2020-09-10  8:08   ` David Hildenbrand
2020-09-10  8:08     ` David Hildenbrand
2020-09-10 10:58     ` Anshuman Khandual
2020-09-10 10:58       ` Anshuman Khandual
2020-09-10  8:27   ` sudaraja [this message]
2020-09-10 11:16     ` Anshuman Khandual
2020-09-10 11:16       ` Anshuman Khandual
2020-09-10  8:27   ` sudaraja
2020-09-10  8:27 ` Steven Price
2020-09-10  8:27   ` Steven Price
2020-09-10 10:50   ` Anshuman Khandual
2020-09-10 10:50     ` Anshuman Khandual
2020-09-10 20:48     ` sudaraja
2020-09-21 17:43       ` Will Deacon
2020-09-21 17:43         ` Will Deacon
2020-09-10 20:48     ` sudaraja
2020-09-10  6:05 Sudarshan Rajagopalan
  -- strict thread matches above, loose matches on Subject: below --
2020-09-10  6:05 Sudarshan Rajagopalan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0101017477202341-51a1c9d3-5ba4-4328-b24c-0b29317af168-000000@us-west-2.amazonses.com \
    --to=sudaraja@codeaurora.org \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=mark.rutland@arm.com \
    --cc=pratikp@codeaurora.org \
    --cc=steven.price@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.