From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: freemem-slack and large memory environments
Date: Mon, 02 Mar 2015 13:04:11 +0000
Message-ID: <54F46DDB020000780006505B@mail.emea.novell.com>
References: <4321015.nah3j6dvJq@mlatimer1.dnsdhcp.provo.novell.com>
 <3226285.BhjaSCDldb@mlatimer1.dnsdhcp.provo.novell.com>
 <8145398.E30EXKdkiW@mlatimer1.dnsdhcp.provo.novell.com>
 <1485105.OlVcWx0gpa@mlatimer1.dnsdhcp.provo.novell.com>
 <54F44D0B0200007800064EB3@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefano Stabellini
Cc: ian.jackson@eu.citrix.com, Mike Latimer, wei.liu2@citrix.com,
 Ian Campbell, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

>>> On 02.03.15 at 13:13, wrote:
> On Mon, 2 Mar 2015, Jan Beulich wrote:
>> >>> On 02.03.15 at 11:12, wrote:
>> > On Fri, 27 Feb 2015, Mike Latimer wrote:
>> >> On Friday, February 27, 2015 11:29:12 AM Mike Latimer wrote:
>> >> > On Friday, February 27, 2015 08:28:49 AM Mike Latimer wrote:
>> >> > After adding 2048aeec, dom0's target is lowered by the required amount (e.g.
>> >> > 64GB), but as dom0 cannot balloon down fast enough,
>> >> > libxl_wait_for_memory_target returns -5, and the domain create fails
>> >> (wrong return code - libxl_wait_for_memory_target actually returns -3)
>> >>
>> >> With the libxl_wait_for_memory_target return code corrected (2048aeec),
>> >> the debug messages look like this:
>> >>
>> >> Parsing config from sles12pv
>> >> DBG: start freemem loop
>> >> DBG: free_memkb = 541976, need_memkb = 67651584 (rc=0)
>> >> DBG: dom0_curr_target = 2118976472, set_memory_target = -67109608 (rc=1)
>> >> DBG: wait_for_free_memory = 67651584 (rc=-5)
>> >> DBG: wait_for_memory_target (rc=-3)
>> >> failed to free memory for the domain
>> >>
>> >> After failing, dom0 continues to balloon down by the requested amount
>> >> (-67109608), so a subsequent startup attempt would work.
>> >>
>> >> My original fix (2563bca1) was intended to continue looping in the freemem
>> >> loop until dom0 ballooned down the requested amount. However, this really
>> >> only worked without 2048aeec, as wait_for_memory_target was always
>> >> returning 0. After Stefano pointed out this problem, commit 2563bca1 can
>> >> still be useful - but it seems less important, as ballooning down dom0 is
>> >> where the major delays are seen.
>> >>
>> >> The following messages show what was happening when wait_for_memory_target
>> >> was always returning 0.
>> >> I've narrowed it down to just the interesting messages:
>> >>
>> >> DBG: free_memkb = 9794852, need_memkb = 67651584 (rc=0)
>> >> DBG: dom0_curr_target = 2118976464, set_memory_target = -67109596 (rc=1)
>> >> DBG: dom0_curr_target = 2051866868, set_memory_target = -57856732 (rc=1)
>> >> DBG: dom0_curr_target = 1994010136, set_memory_target = -50615004 (rc=1)
>> >> DBG: dom0_curr_target = 1943395132, set_memory_target = -43965148 (rc=1)
>> >> DBG: dom0_curr_target = 1899429984, set_memory_target = -37538524 (rc=1)
>> >> DBG: dom0_curr_target = 1861891460, set_memory_target = -31560412 (rc=1)
>> >> DBG: dom0_curr_target = 1830331048, set_memory_target = -25309916 (rc=1)
>> >> DBG: dom0_curr_target = 1805021132, set_memory_target = -19514076 (rc=1)
>> >> DBG: dom0_curr_target = 1785507056, set_memory_target = -13949660 (rc=1)
>> >> DBG: dom0_curr_target = 1771557396, set_memory_target = -8057564 (rc=1)
>> >> DBG: dom0_curr_target = 1763499832, set_memory_target = -1862364 (rc=1)
>> >>
>> >> The above situation is no longer relevant, but the overall dom0 target
>> >> problem is still an issue. It now seems rather obvious (hopefully) that
>> >> the 10 second delay in wait_for_memory_target is not sufficient. Should
>> >> that function be modified to monitor ongoing progress and continue
>> >> waiting as long as progress is being made?
>> >>
>> >> Sorry for the long discussion to get to this point. :(
>> >
>> > I think we need to increase the timeout passed to
>> > libxl_wait_for_free_memory. Would 30 sec be enough?
>>
>> No fixed timeout will ever be enough for arbitrarily large requests.
>
> There is no way for Dom0 to notify Xen and/or libxl that ballooning is
> completed. There is no choice but to wait. We could make the wait time
> unlimited (after all, arbitrarily large requests need arbitrarily large
> wait times), but do we really want that?
That's why almost everyone else seems to agree that waiting as long as
progress is being made is the right approach.

> Of course users could just use dom0_mem and get down with it.

I don't think we should make this a requirement for correct operation.

Jan