From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Cornelia Huck <cohuck@redhat.com>, Igor Mammedov <imammedo@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>,
	ehabkost@redhat.com, David Hildenbrand <david@redhat.com>,
	groug@kaod.org, qemu-devel@nongnu.org, qemu-s390x@nongnu.org,
	qemu-ppc@nongnu.org, clg@kaod.org, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH for-2.13] Clear mem_path if we fall back to anonymous RAM allocation
Date: Thu, 19 Apr 2018 15:34:10 +0200
Message-ID: <77d0717b-6eba-8b20-6691-c3085937604b@de.ibm.com>
In-Reply-To: <20180419145840.324602ff.cohuck@redhat.com>



On 04/19/2018 02:58 PM, Cornelia Huck wrote:
> On Thu, 19 Apr 2018 14:33:18 +0200
> Igor Mammedov <imammedo@redhat.com> wrote:
> 
>> On Thu, 19 Apr 2018 17:21:23 +1000
>> David Gibson <david@gibson.dropbear.id.au> wrote:
>>
>>> If the -mem-path option is set, we attempt to map the guest's RAM from a
>>> file in the given path; it's usually used to back guest RAM with hugepages.
>>> If we're unable to (e.g. not enough free hugepages) then we fall back to
>>> allocating normal anonymous pages.  This behaviour can be surprising, but a
>>> comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour
>>> we can't change.
>>>
>>> What really isn't ok, though, is that in this case we leave mem_path set.
>>> That means functions which attempt to determine the pagesize of main RAM
>>> can erroneously think it is hugepage based on the requested path, even
>>> though it's not.
>>>
>>> This is particularly bad for the pseries machine type.  KVM HV limitations
>>> mean the guest can't use pagesizes larger than the host page size used to
>>> back RAM.  That means that such a fallback, rather than merely giving
>>> poorer performance than expected, will cause the guest to freeze up early in
>>> boot as it attempts to use large page mappings that can't work.
>>>
>>> This patch addresses the problem by clearing the mem_path variable when we
>>> fall back to anonymous pages, meaning that subsequent attempts to
>>> determine the RAM page size will get an accurate result.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>>  numa.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> Paolo et al, as with my earlier patches adding some extensions to the
>>> helpers for determining backing page sizes, if there are no objections
>>> can I get an ack to merge this via my ppc tree?
>>>
>>> diff --git a/numa.c b/numa.c
>>> index 1116c90af9..78a869e598 100644
>>> --- a/numa.c
>>> +++ b/numa.c
>>> @@ -469,6 +469,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>>>              /* Legacy behavior: if allocation failed, fall back to
>>>               * regular RAM allocation.
>>>               */
>>> +            mem_path = NULL;
>>>              memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
>>>          }
>>>  #else  
>>
>> mem_path is also used by kvm_s390_apply_cpu_model(),
>> and in ccw_init() memory is initialized before the CPUs are,
>> so if QEMU was started with -mem-path, then before this patch
>> the created CPU won't have CMM enabled and a warning is printed:
>>   
>>  "CMM will not be enabled because it is not compatible with hugetlbfs."
>>
>> and after this patch it might enable CMM if we clear mem_path.
>> So the question is: do we care about this?
> 
> I don't quite remember the cmm semantics here -- Christian?

The CMMA interface does not work on large pages. I think the kernel will react
with EFAULT in some cases (CMMA migration and others), so QEMU will probably fail
unexpectedly.

But this patch seems to clear mem_path only if we do not allocate from
hugetlbfs at all. So things should be ok, no?
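
To make the argument concrete, here is a toy model of the two code paths as I
understand them. It is only a sketch, not QEMU code: allocate_guest_ram(),
cmm_may_be_enabled(), the ram_backing enum and the hugepages_available flag are
all made-up names for illustration, and the CMM check is reduced to "mem_path
is set or not", which is my assumption about what kvm_s390_apply_cpu_model()
effectively tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's global mem_path (set by -mem-path). */
static const char *mem_path;

/* Hypothetical model of where guest RAM ended up. */
enum ram_backing { RAM_ANONYMOUS, RAM_HUGETLBFS };

/*
 * Toy version of the allocation path touched by the patch.
 * hugepages_available is an assumption of this sketch; it stands in for
 * "the file-backed allocation from mem_path succeeded".
 */
static enum ram_backing allocate_guest_ram(const char *requested_path,
                                           bool hugepages_available)
{
    enum ram_backing backing = RAM_ANONYMOUS;

    mem_path = requested_path;
    if (mem_path != NULL) {
        if (hugepages_available) {
            backing = RAM_HUGETLBFS;
        } else {
            /* Legacy fallback to anonymous RAM; the patch clears mem_path. */
            mem_path = NULL;
        }
    }

    /* With the patch, mem_path stays set only if RAM is hugetlbfs backed. */
    assert((backing == RAM_HUGETLBFS) == (mem_path != NULL));
    return backing;
}

/* Toy version of the s390 check: CMM stays off whenever mem_path is set. */
static bool cmm_may_be_enabled(void)
{
    return mem_path == NULL;
}
```

In this model CMM can only become enabled when no guest RAM sits on hugetlbfs,
which is the case where CMMA should work.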
