linuxppc-dev.lists.ozlabs.org archive mirror
* HPT allocation failures on POWER8 KVM hosts
@ 2019-11-15 15:28 Roman Bolshakov
  2019-11-18  2:02 ` Daniel Axtens
  0 siblings, 1 reply; 4+ messages in thread
From: Roman Bolshakov @ 2019-11-15 15:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: qemu-ppc, linuxppc-dev, linux

Hi Aneesh,

We're running a lot of KVM virtual machines on POWER8 hosts and
sometimes new VMs can't be started because there are no contiguous
regions for HPT because of CMA region fragmentation.

The issue is covered in the LWN article: https://lwn.net/Articles/684611/
The article notes that you raised the problem at LSFMM 2016. However, I
couldn't find a follow-up article on the issue.

Looking at the kernel commit log, I've identified a few commits that
might reduce CMA fragmentation and overcome the HPT allocation failures:
  - bd2e75633c801 ("dma-contiguous: use fallback alloc_pages for single pages")
  - 678e174c4c16a ("powerpc/mm/iommu: allow migration of cma allocated
    pages during mm_iommu_do_alloc")
  - 9a4e9f3b2d739 ("mm: update get_user_pages_longterm to migrate pages allocated from
    CMA region")
  - d7fefcc8de914 ("mm/cma: add PF flag to force non cma alloc")

Are there any other commits that address the issue? What is the first
kernel version that shouldn't have the HPT allocation problem due to CMA
fragmentation?

Thank you,
Roman

* Re: HPT allocation failures on POWER8 KVM hosts
  2019-11-15 15:28 HPT allocation failures on POWER8 KVM hosts Roman Bolshakov
@ 2019-11-18  2:02 ` Daniel Axtens
  2019-11-18 11:42   ` Roman Bolshakov
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Axtens @ 2019-11-18  2:02 UTC (permalink / raw)
  To: Roman Bolshakov, Aneesh Kumar K.V; +Cc: linuxppc-dev, qemu-ppc, linux

Hi Roman,

> We're running a lot of KVM virtual machines on POWER8 hosts and
> sometimes new VMs can't be started because there are no contiguous
> regions for HPT because of CMA region fragmentation.
>
> The issue is covered in the LWN article: https://lwn.net/Articles/684611/
> The article notes that you raised the problem at LSFMM 2016. However, I
> couldn't find a follow-up article on the issue.
>
> Looking at the kernel commit log, I've identified a few commits that
> might reduce CMA fragmentation and overcome the HPT allocation failures:
>   - bd2e75633c801 ("dma-contiguous: use fallback alloc_pages for single pages")
>   - 678e174c4c16a ("powerpc/mm/iommu: allow migration of cma allocated
>     pages during mm_iommu_do_alloc")
>   - 9a4e9f3b2d739 ("mm: update get_user_pages_longterm to migrate pages allocated from
>     CMA region")
>   - d7fefcc8de914 ("mm/cma: add PF flag to force non cma alloc")
>
> Are there any other commits that address the issue? What is the first
> kernel version that shouldn't have the HPT allocation problem due to CMA
> fragmentation?

I've had some success increasing the CMA allocation with the
kvm_cma_resv_ratio boot parameter - see
arch/powerpc/kvm/book3s_hv_builtin.c

The default is 5%. In a support case in a former job we had a customer
who increased this to I think 7 or 8% and saw the symptoms subside
dramatically.
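
To confirm which ratio a host actually booted with, the kernel command
line can be inspected. The following is a minimal Python sketch, not
code from the kernel or QEMU: the helper name cma_resv_ratio is
hypothetical, and it assumes the parameter is passed as
kvm_cma_resv_ratio=<integer percent>, falling back to the 5% default
mentioned above when the parameter is absent.

```python
def cma_resv_ratio(cmdline: str, default: int = 5) -> int:
    """Return the kvm_cma_resv_ratio percentage found in a kernel command line.

    `default` mirrors the 5% default quoted in this thread; it is an
    assumption of this sketch, not something read from the running kernel.
    """
    ratio = default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Only accept "kvm_cma_resv_ratio=<digits>"; ignore malformed tokens.
        if key == "kvm_cma_resv_ratio" and sep and value.isdigit():
            ratio = int(value)  # in this sketch a later occurrence overrides an earlier one
    return ratio
```

On a running host the input would be the contents of /proc/cmdline, for
example cma_resv_ratio(open("/proc/cmdline").read()).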

HTH,
Daniel

>
> Thank you,
> Roman

* Re: HPT allocation failures on POWER8 KVM hosts
  2019-11-18  2:02 ` Daniel Axtens
@ 2019-11-18 11:42   ` Roman Bolshakov
  2019-12-13  0:33     ` Roman Bolshakov
  0 siblings, 1 reply; 4+ messages in thread
From: Roman Bolshakov @ 2019-11-18 11:42 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Aneesh Kumar K.V, qemu-ppc, linuxppc-dev, linux

On Mon, Nov 18, 2019 at 01:02:00PM +1100, Daniel Axtens wrote:
> Hi Roman,
> 
> > We're running a lot of KVM virtual machines on POWER8 hosts and
> > sometimes new VMs can't be started because there are no contiguous
> > regions for HPT because of CMA region fragmentation.
> >
> > The issue is covered in the LWN article: https://lwn.net/Articles/684611/
> > The article notes that you raised the problem at LSFMM 2016. However, I
> > couldn't find a follow-up article on the issue.
> >
> > Looking at the kernel commit log, I've identified a few commits that
> > might reduce CMA fragmentation and overcome the HPT allocation failures:
> >   - bd2e75633c801 ("dma-contiguous: use fallback alloc_pages for single pages")
> >   - 678e174c4c16a ("powerpc/mm/iommu: allow migration of cma allocated
> >     pages during mm_iommu_do_alloc")
> >   - 9a4e9f3b2d739 ("mm: update get_user_pages_longterm to migrate pages allocated from
> >     CMA region")
> >   - d7fefcc8de914 ("mm/cma: add PF flag to force non cma alloc")
> >
> > Are there any other commits that address the issue? What is the first
> > kernel version that shouldn't have the HPT allocation problem due to CMA
> > fragmentation?
> 
> I've had some success increasing the CMA allocation with the
> kvm_cma_resv_ratio boot parameter - see
> arch/powerpc/kvm/book3s_hv_builtin.c
> 
> The default is 5%. In a support case in a former job we had a customer
> who increased this to I think 7 or 8% and saw the symptoms subside
> dramatically.
> 

Hi Daniel,

Thank you, I'll try increasing kvm_cma_resv_ratio for now, but even the
default 5% CMA reserve should be more than enough, given that the HPT is
sized at 1/128th of the VM's maximum memory.

For a 16GB RAM VM without a balloon device, only 128MB is going to be
reserved for the HPT using CMA. So a 5% CMA reserve should allow
provisioning VMs with over 1.5TB of RAM in total on a 256GB RAM host. In
other words, the default CMA reserve allows overprovisioning about 6
times more memory for VMs than is present on the host.
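
The estimate above can be checked with a short Python sketch. It only
restates the arithmetic from this message: HPT size is taken as 1/128th
of VM maximum memory, and the whole CMA reserve is assumed to be usable
for HPTs (that is, no fragmentation and no other CMA users). The
function names are illustrative.

```python
MIB_PER_GIB = 1024

def hpt_size_gib(vm_max_mem_gib: float) -> float:
    """HPT size for one VM, assuming 1/128th of its maximum memory."""
    return vm_max_mem_gib / 128

def max_total_vm_mem_gib(host_mem_gib: float, cma_ratio_percent: float = 5) -> float:
    """Total VM memory whose HPTs would fit in the CMA reserve.

    Idealized upper bound: assumes every byte of the reserve can back an HPT.
    """
    cma_reserve_gib = host_mem_gib * cma_ratio_percent / 100
    return cma_reserve_gib * 128

# A 16 GiB VM needs a 128 MiB HPT.
print(f"{hpt_size_gib(16) * MIB_PER_GIB:.0f} MiB")   # prints "128 MiB"

# A 256 GiB host with the default 5% reserve.
total = max_total_vm_mem_gib(256)
print(f"{total:.1f} GiB")                            # prints "1638.4 GiB"
print(f"{total / 256:.1f}x host memory")             # prints "6.4x host memory"
```

With these assumptions the 5% reserve on a 256GB host is 12.8GB, which
corresponds to roughly 1.6TB of VM memory, or about 6.4 times the host
RAM.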

We rarely add a balloon device and sometimes don't add it at all. Therefore
I'm still looking for commits that would help to avoid the issue with
the default CMA reserve.

Thank you,
Roman

* Re: HPT allocation failures on POWER8 KVM hosts
  2019-11-18 11:42   ` Roman Bolshakov
@ 2019-12-13  0:33     ` Roman Bolshakov
  0 siblings, 0 replies; 4+ messages in thread
From: Roman Bolshakov @ 2019-12-13  0:33 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Aneesh Kumar K.V, qemu-ppc, linuxppc-dev, linux

On Mon, Nov 18, 2019 at 02:42:42PM +0300, Roman Bolshakov wrote:
> On Mon, Nov 18, 2019 at 01:02:00PM +1100, Daniel Axtens wrote:
> > Hi Roman,
> > 
> > > We're running a lot of KVM virtual machines on POWER8 hosts and
> > > sometimes new VMs can't be started because there are no contiguous
> > > regions for HPT because of CMA region fragmentation.
> > >
> > > The issue is covered in the LWN article: https://lwn.net/Articles/684611/
> > > The article notes that you raised the problem at LSFMM 2016. However, I
> > > couldn't find a follow-up article on the issue.
> > >
> > > Looking at the kernel commit log, I've identified a few commits that
> > > might reduce CMA fragmentation and overcome the HPT allocation failures:
> > >   - bd2e75633c801 ("dma-contiguous: use fallback alloc_pages for single pages")
> > >   - 678e174c4c16a ("powerpc/mm/iommu: allow migration of cma allocated
> > >     pages during mm_iommu_do_alloc")
> > >   - 9a4e9f3b2d739 ("mm: update get_user_pages_longterm to migrate pages allocated from
> > >     CMA region")
> > >   - d7fefcc8de914 ("mm/cma: add PF flag to force non cma alloc")
> > >
> > > Are there any other commits that address the issue? What is the first
> > > kernel version that shouldn't have the HPT allocation problem due to CMA
> > > fragmentation?
> > 
> > I've had some success increasing the CMA allocation with the
> > kvm_cma_resv_ratio boot parameter - see
> > arch/powerpc/kvm/book3s_hv_builtin.c
> > 
> > The default is 5%. In a support case in a former job we had a customer
> > who increased this to I think 7 or 8% and saw the symptoms subside
> > dramatically.
> > 
> 
> Hi Daniel,
> 
> Thank you, I'll try increasing kvm_cma_resv_ratio for now, but even the
> default 5% CMA reserve should be more than enough, given that the HPT is
> sized at 1/128th of the VM's maximum memory.
> 
> For a 16GB RAM VM without a balloon device, only 128MB is going to be
> reserved for the HPT using CMA. So a 5% CMA reserve should allow
> provisioning VMs with over 1.5TB of RAM in total on a 256GB RAM host. In
> other words, the default CMA reserve allows overprovisioning about 6
> times more memory for VMs than is present on the host.
> 
> We rarely add a balloon device and sometimes don't add it at all. Therefore
> I'm still looking for commits that would help to avoid the issue with
> the default CMA reserve.
> 

FWIW, I have noticed the following. My host has 4 NUMA nodes with 4 CPUs
per node. According to /proc/zoneinfo, only one of the nodes has CMA
pages and only two of the nodes have memory. The error can be reliably
reproduced if I attempt to place vCPUs on the node with CMA pages.

Roman
