From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefano Stabellini Subject: Re: freemem-slack and large memory environments Date: Mon, 2 Mar 2015 16:15:41 +0000 Message-ID: References: <4321015.nah3j6dvJq@mlatimer1.dnsdhcp.provo.novell.com> <3226285.BhjaSCDldb@mlatimer1.dnsdhcp.provo.novell.com> <8145398.E30EXKdkiW@mlatimer1.dnsdhcp.provo.novell.com> <1485105.OlVcWx0gpa@mlatimer1.dnsdhcp.provo.novell.com> <1425296548.1886.40.camel@citrix.com> <1425312260.23379.10.camel@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1425312260.23379.10.camel@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Campbell Cc: ian.jackson@eu.citrix.com, xen-devel@lists.xen.org, wei.liu2@citrix.com, Mike Latimer , Stefano Stabellini List-Id: xen-devel@lists.xenproject.org On Mon, 2 Mar 2015, Ian Campbell wrote: > On Mon, 2015-03-02 at 15:19 +0000, Stefano Stabellini wrote: > > On Mon, 2 Mar 2015, Ian Campbell wrote: > > > On Fri, 2015-02-27 at 17:31 -0700, Mike Latimer wrote: > > > > On Friday, February 27, 2015 11:29:12 AM Mike Latimer wrote: > > > > > On Friday, February 27, 2015 08:28:49 AM Mike Latimer wrote: > > > > > After adding 2048aeec, dom0's target is lowered by the required amount (e.g. > > > > > 64GB), but as dom0 cannot balloon down fast enough, > > > > > libxl_wait_for_memory_target returns -5, and the domain create fails > > > > (wrong return code - libxl_wait_for_memory_target actually returns -3) > > > > > > > > With libxl_wait_for_memory_target return code corrected (2048aeec), debug > > > > messages look like this: > > > > > > > > Parsing config from sles12pv > > > > DBG: start freemem loop > > > > DBG: free_memkb = 541976, need_memkb = 67651584 (rc=0) > > > > DBG: dom0_curr_target = 2118976472, set_memory_target = -67109608 (rc=1) > > > > DBG: wait_for_free_memory = 67651584 (rc=-5) > > > > DBG: wait_for_memory_target (rc=-3) > > > > failed to free memory for the domain > > > > > > > > After failing, dom0 continues to balloon down by the requested amount > > > > (-67109608), so a subsequent startup attempt would work. > > > > > > > > My original fix (2563bca1) was intended to continue looping in freem until dom0 > > > > ballooned down the requested amount. However, this really only worked without > > > > 2048aeec, as wait_for_memory_target was always returning 0. After Stefano > > > > pointed out this problem, commit 2563bca1 can still be useful - but seems less > > > > important as ballooning down dom0 is where the major delays are seen. > > > > > > > > The following messages show what was happening when wait_for_memory_target was > > > > always returning 0. I've narrowed it down to just the interesting messages: > > > > > > > > DBG: free_memkb = 9794852, need_memkb = 67651584 (rc=0) > > > > DBG: dom0_curr_target = 2118976464, set_memory_target = -67109596 (rc=1) > > > > DBG: dom0_curr_target = 2051866868, set_memory_target = -57856732 (rc=1) > > > > DBG: dom0_curr_target = 1994010136, set_memory_target = -50615004 (rc=1) > > > > DBG: dom0_curr_target = 1943395132, set_memory_target = -43965148 (rc=1) > > > > DBG: dom0_curr_target = 1899429984, set_memory_target = -37538524 (rc=1) > > > > DBG: dom0_curr_target = 1861891460, set_memory_target = -31560412 (rc=1) > > > > DBG: dom0_curr_target = 1830331048, set_memory_target = -25309916 (rc=1) > > > > DBG: dom0_curr_target = 1805021132, set_memory_target = -19514076 (rc=1) > > > > DBG: dom0_curr_target = 1785507056, set_memory_target = -13949660 (rc=1) > > > > DBG: dom0_curr_target = 1771557396, set_memory_target = -8057564 (rc=1) > > > > DBG: dom0_curr_target = 1763499832, set_memory_target = -1862364 (rc=1) > > > > > > > > The above situation is no longer relevant, but the overall dom0 target problem > > > > is still an issue. It now seems rather obvious (hopefully) that the 10 second > > > > delay in wait_for_memory_target is not sufficient. Should that function be > > > > modified to monitor ongoing progress and continue waiting as long as progress > > > > is being made? > > > > > > You mean to push the logic currently in freemem to keep going so long as > > > there is progress being made down into libxl_wait_for_memory_target? > > > That seems like a good idea. > > > > "Continue as long as progress is being made" is not exactly the idea > > behind the current implementation of freemem even if we do have a check > > like > > > > if (free_memkb > free_memkb_prev) { > > > > at the end. > > ? "Continue as long as progress is being made" is exactly what > 2563bca1154 "libxl: Wait for ballooning if free memory is increasing" > was trying to implement, so it certainly was the idea behind the current > implementation (it may well not have managed to implement the idea it > was trying to). I don't think so: pre-2563bca1154 it wasn't and 2563bca1154 doesn't implement it properly. I think we should revert it. > > See: > > > > http://marc.info/?l=xen-devel&m=142503529725529 > > > > Specifically the current loop is not able to cope with Dom0 decreasing > > its reservation but not reaching its target (because it is too slow or > > because of other reasons). > > I'm not sure what you are saying here. > > Are you just agreeing with the suggestion in an oblique way? I agree with the suggestion "continue as long as progress is being made". I disagree that "continue as long as progress is being made" is the current logic of freemem. I don't have an opinion on whether that logic should be implemented in freemem or libxl_wait_for_memory_target. The code below implements it in libxl_wait_for_memory_target. > > The following changes to freemem implements the aforementioned logic by > > improving libxl_wait_for_memory_target to keep going as long as dom0 is > > able to make progress and removing the useless free_memkb > > > free_memkb_prev check. > > > > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c > > index b9d083a..ec98e00 100644 > > --- a/tools/libxl/libxl.c > > +++ b/tools/libxl/libxl.c > > @@ -5002,26 +5002,41 @@ int libxl_wait_for_memory_target(libxl_ctx *ctx, uint32_t domid, int wait_secs) > > { > > int rc = 0; > > uint32_t target_memkb = 0; > > + uint64_t current_memkb, prev_memkb; > > libxl_dominfo info; > > > > + rc = libxl_get_memory_target(ctx, domid, &target_memkb); > > + if (rc < 0) > > + return rc; > > + > > libxl_dominfo_init(&info); > > + prev_memkb = UINT64_MAX; > > > > do { > > - wait_secs--; > > sleep(1); > > > > - rc = libxl_get_memory_target(ctx, domid, &target_memkb); > > - if (rc < 0) > > - goto out; > > - > > libxl_dominfo_dispose(&info); > > libxl_dominfo_init(&info); > > rc = libxl_domain_info(ctx, &info, domid); > > if (rc < 0) > > goto out; > > - } while (wait_secs > 0 && (info.current_memkb + info.outstanding_memkb) > target_memkb); > > > > - if ((info.current_memkb + info.outstanding_memkb) <= target_memkb) > > + current_memkb = info.current_memkb + info.outstanding_memkb; > > + > > + if (current_memkb > prev_memkb) > > + { > > + rc = ERROR_FAIL; > > + goto out; > > + } > > + else if (current_memkb == prev_memkb) > > + wait_secs--; > > + /* if current_memkb < prev_memkb loop for free as progress has > > + * been made */ > > + > > + prev_memkb = current_memkb; > > + } while (wait_secs > 0 && current_memkb > target_memkb); > > + > > + if (current_memkb <= target_memkb) > > rc = 0; > > else > > rc = ERROR_FAIL; > > diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c > > index 53c16eb..17f61a8 100644 > > --- a/tools/libxl/xl_cmdimpl.c > > +++ b/tools/libxl/xl_cmdimpl.c > > @@ -2200,7 +2200,7 @@ static int freemem(uint32_t domid, libxl_domain_build_info *b_info) > > { > > int rc, retries; > > const int MAX_RETRIES = 3; > > - uint32_t need_memkb, free_memkb, free_memkb_prev = 0; > > + uint32_t need_memkb, free_memkb; > > > > if (!autoballoon) > > return 0; > > @@ -2228,22 +2228,14 @@ static int freemem(uint32_t domid, libxl_domain_build_info *b_info) > > else if (rc != ERROR_NOMEM) > > return rc; > > > > - /* the memory target has been reached but the free memory is still > > - * not enough: loop over again */ > > + /* wait until dom0 reaches its target, as long as we are making > > + * progress */ > > rc = libxl_wait_for_memory_target(ctx, 0, 1); > > if (rc < 0) > > return rc; > > > > - /* > > - * If the amount of free mem has increased on this iteration (i.e. > > - * some progress has been made) then reset the retry counter. > > - */ > > - if (free_memkb > free_memkb_prev) { > > - retries = MAX_RETRIES; > > - free_memkb_prev = free_memkb; > > - } else { > > - retries--; > > - } > > + /* dom0 target has been reached: loop again */ > > + retries--; > > } while (retries > 0); > > > > return ERROR_NOMEM; > >