From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: Re: xc: error: xc_machphys_mfn_list: 83 != 129 when suspending 32GB PV DomU
Date: Mon, 14 Mar 2011 10:20:09 +0000
Message-ID: <1300098009.17339.2110.camel@zakaz.uk.xensource.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: Tim Deegan , Keir, Fraser , Xen Devel , Gianni Tedesco
List-Id: xen-devel@lists.xenproject.org

On Fri, 2011-03-11 at 19:21 +0000, Keir Fraser wrote:
> On 11/03/2011 18:52, "Gianni Tedesco" wrote:
>
> > Further debugging reveals the variables are set as such:
> > (XEN) compat_machine_to_phys_mapping = 18446606377058041856
> > (XEN) max_page = 67272704
> > (XEN) MACH2PHYS_COMPAT_NR_ENTRIES(current->domain) = 43515904
> > (XEN) RDWR_COMPAT_MPT_VIRT_START = 18446606377058041856
> > (XEN) RDWR_COMPAT_MPT_VIRT_END = 18446606378131783680
> > (XEN) limit = 18446606377232105472, (1 << L2_PAGETABLE_SHIFT) = 2097152
> >
> > Could it be that the compat mach-to-phys conversion table size of 1GB is
> > too small?
>
> It is insufficient to cover all of the system's memory. The reason for the
> limit is that a 1GB M2P table is all that is reasonable to map into a 32-bit
> domain's address space while still leaving space for the guest's own
> mappings.

The compat M2P actually mapped into the guest isn't 1GB, 1GB would be
the entire kernel mapping with no room for anything else. Also 1GB of
M2P is enough to cover 1TB of host memory so I don't think it's too
small at the moment.

Is the limit here not MACH2PHYS_COMPAT_NR_ENTRIES?
(in the above limit == compat_machine_to_phys_mapping + ~160M)

IIRC the size of the M2P which is mapped into a PAE guest is normally
capped at ~160M (the total size of the hypervisor hole for a PAE guest
running on a PAE hypervisor). 160M is enough M2P for 160G of host
address space, which would explain why this is seen on a 256GB host but
not a 128GB one.

The limit on the size of the M2P is adjustable. In particular, for dom0
I think it would be reasonable to allow it to expand to, e.g., 256M
without too much cause for concern. Obviously this hole eats into the
1GB kernel mapping, so you don't want it to grow too much bigger, and in
the long run something better would be needed. But this would probably
allow you to support 256GB without too much trouble in the short term,
other than slightly reducing the amount of lowmem the system sees (which
might be an issue if you've chosen dom0_mem on that basis...)

The lower limit is set by the kernel in its XEN_ELFNOTE_HV_START_LOW ELF
note (set in arch/x86/kernel/head_32-xen.S), which is picked up in
xen/arch/x86/build_domain.c:construct_dom0(). NB: This might be the
first time this functionality has been used in anger to increase the M2P
space (I think it is actively used to shrink it on hosts with <160G).

Another alternative, which would allow large hosts without needing to
expand the dom0 M2P, would be to provide interfaces that allow the tools
to map specific portions of the host M2P so the tools can build
themselves a mapcache style thing. The M2P space which needs to be
accessed to perform a migration of an individual guest is likely going
to be smaller than the total host RAM, so even using 256M-512M of guest
user-mode address space (allowing for 256GB-512GB of host address space)
would likely allow you to map the bits you need without excessive churn
(aka performance hit) in the mapping.
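
As a rough sanity check of the figures in this thread, the arithmetic
can be sketched in a few lines of Python. This is only a back-of-envelope
calculation, not anything taken from the Xen sources. It assumes 4 KiB
pages, 4-byte compat M2P entries, and that the two numbers in the
"83 != 129" error are counts of 2 MiB M2P chunks (2 MiB being the
1 << L2_PAGETABLE_SHIFT value in the debug output). The helper names are
made up for illustration.

```python
# Back-of-envelope check of the M2P numbers quoted in this thread.
# Assumptions (not stated in the mail itself): 4 KiB pages, 4-byte compat
# M2P entries, and the M2P being counted in 2 MiB chunks.

PAGE_SIZE = 4096          # bytes per page (assumed)
COMPAT_ENTRY_SIZE = 4     # bytes per compat M2P entry (assumed)
CHUNK_SIZE = 1 << 21      # 2097152, i.e. 1 << L2_PAGETABLE_SHIFT in the debug output

MiB = 1 << 20
GiB = 1 << 30


def host_bytes_covered(m2p_bytes):
    """Host address space described by a compat M2P of the given size."""
    return (m2p_bytes // COMPAT_ENTRY_SIZE) * PAGE_SIZE


def chunks_needed(nr_entries):
    """Number of 2 MiB chunks needed to hold nr_entries entries, rounded up."""
    return -(-(nr_entries * COMPAT_ENTRY_SIZE) // CHUNK_SIZE)


# Values copied from the (XEN) debug output quoted in this thread.
compat_m2p_start = 18446606377058041856
limit = 18446606377232105472
max_page = 67272704
compat_nr_entries = 43515904

window = limit - compat_m2p_start   # size of the M2P window the guest sees

print("M2P window (MiB):       ", window // MiB)
print("host covered (GiB):     ", host_bytes_covered(window) // GiB)
print("host RAM (GiB):         ", max_page * PAGE_SIZE / GiB)
print("chunks within the limit:", chunks_needed(compat_nr_entries))
print("chunks for max_page:    ", chunks_needed(max_page))
```

Under those assumptions the window comes out at 166 MiB, which is exactly
MACH2PHYS_COMPAT_NR_ENTRIES * 4 bytes, and the two chunk counts come out
at 83 and 129, which lines up with the numbers in the error message.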
A given userspace process has 3G of address space to play with so it
can take the hit of increasing the M2P mapcache size far easier than the
kernel can.

Hrm, maybe you don't even need a map cache thing -- just a way to allow
a userspace process to map more M2P than the kernel can... (which might
be as simple as removing the limit clamp based on
MACH2PHYS_COMPAT_NR_ENTRIES in the compat layer?)

Ian.