From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Roth
In-Reply-To: <20160426095236.23e51b77@nial.brq.redhat.com>
References: <1458016736-10544-1-git-send-email-bharata@linux.vnet.ibm.com>
 <1458016736-10544-3-git-send-email-bharata@linux.vnet.ibm.com>
 <20160316013605.GC9032@voom>
 <20160316044154.GD13176@in.ibm.com>
 <20160425112050.545f4ff3@nial.brq.redhat.com>
 <20160426050923.GB9793@in.ibm.com>
 <20160426095236.23e51b77@nial.brq.redhat.com>
Message-ID: <20160426210337.11723.83860@loki>
Date: Tue, 26 Apr 2016 16:03:37 -0500
Subject: Re: [Qemu-devel] [RFC PATCH v2 2/2] spapr: Memory hot-unplug support
To: Igor Mammedov, Bharata B Rao
Cc: David Gibson, thuth@redhat.com, qemu-devel@nongnu.org,
 qemu-ppc@nongnu.org, nfont@linux.vnet.ibm.com

Quoting Igor Mammedov (2016-04-26 02:52:36)
> On Tue, 26 Apr 2016 10:39:23 +0530
> Bharata B Rao wrote:
>
> > On Mon, Apr 25, 2016 at 11:20:50AM +0200, Igor Mammedov wrote:
> > > On Wed, 16 Mar 2016 10:11:54 +0530
> > > Bharata B Rao wrote:
> > >
> > > > On Wed, Mar 16, 2016 at 12:36:05PM +1100, David Gibson wrote:
> > > > > On Tue,
> > > > > Mar 15, 2016 at 10:08:56AM +0530, Bharata B Rao wrote:
> > > > > > Add support to hot remove pc-dimm memory devices.
> > > > > >
> > > > > > Signed-off-by: Bharata B Rao
> > > > >
> > > > > Reviewed-by: David Gibson
> > > > >
> > > > > Looks correct, but again, needs to wait on the PAPR change.
> > > [...]
> > > >
> > > > While we are here, I would also like to get some opinion on the real
> > > > need for memory unplug. Is there anything that memory unplug gives us
> > > > which memory ballooning (shrinking mem via ballooning) can't give ?
> > > Sure ballooning can complement memory hotplug but turning it on would
> > > effectively reduce hotplug to ballooning as it would enable overcommit
> > > capability instead of the hard partitioning pc-dimms provide. So one
> > > could just use ballooning only and not bother with hotplug at all.
> > >
> > > On the other hand memory hotplug/unplug (at least on x86) tries
> > > to model real hardware, thus removing the need for a paravirt
> > > ballooning solution in favor of native guest support.
> >
> > Thanks for your views.
> >
> > >
> > > PS:
> > > Guest wise, currently hot-unplug is not well supported in Linux,
> > > i.e. it's not guaranteed that the guest will honor an unplug request
> > > as it may pin the dimm by using it as non-migratable memory. So
> > > there is something to work on guest side to make unplug more
> > > reliable/guaranteed.
> >
> > In the above scenario where the guest doesn't allow removal of certain
> > parts of DIMM memory, what is the expected behaviour as far as QEMU
> > DIMM device is concerned ? I seem to be running into this situation
> > very often with PowerPC mem unplug where I am left with a DIMM device
> > that has only some memory blocks released. In this situation, I would like
> > to block further unplug requests on the same device, but QEMU seems
> > to allow more such unplug requests to come in via the monitor. So
> > qdev won't help me here ?
> > Should I detect such a condition from the
> > machine unplug() handler and take required action ?
> I think offlining is a guest's task along with recovering from
> inability to offline (i.e. offline all + eject or restore original state).
> QEMU does its job by notifying the guest what dimm it wants to remove
> and removes it when the guest asks it to (at least in x86 world).

In the case of pseries, the DIMM abstraction isn't really exposed to
the guest, but rather the memory blocks we use to make the backing
memdev memory available to the guest. During unplug, the guest
completely releases these blocks back to QEMU, and if it can only
release a subset of what's requested it does not attempt to recover.
We can potentially change that behavior on the guest side, since
partially-freed DIMMs aren't currently useful on the host side...

But, in the case of pseries, I wonder if it makes sense to go ahead
and MADV_DONTNEED the ranges backing these released blocks so the
host can at least partially reclaim the memory from a partially
unplugged DIMM?

>
> >
> > On x86, if some pages are offlined and subsequently other pages couldn't
> > be offlined, then I see the full DIMM memory size remaining
> > with the guest. So I infer that on x86, QEMU memory unplug either
> > removes full DIMM or nothing. Is that understanding correct ?
> I wouldn't bet that it's guaranteed behavior but it should be this way.
>
> >
> > Regards,
> > Bharata.
> >
>
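
To make the MADV_DONTNEED idea above concrete, here is a rough standalone
sketch, not QEMU code. It assumes a Linux host and a private anonymous
mapping standing in for a DIMM's backing memory. release_block() and
demo_partial_release() are hypothetical names invented for illustration,
and the 1 MiB block size is only a small stand-in for a real memory block.
File-backed or hugepage-backed memdevs may behave differently and are not
covered here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU function): advise the host kernel that
 * block number 'index' inside the mapping at 'base' is no longer needed.
 * Returns 0 on success, -1 on failure.
 */
static int release_block(void *base, size_t block_size, size_t index)
{
    uint8_t *start = (uint8_t *)base + index * block_size;
    long page = sysconf(_SC_PAGESIZE);

    /* madvise() requires a page-aligned start address. */
    if (page <= 0 || ((uintptr_t)start % (uintptr_t)page) != 0) {
        return -1;
    }
    return madvise(start, block_size, MADV_DONTNEED);
}

/*
 * Model a 4-block "DIMM", release only block 1, and check that the
 * released range reads back as zeros while its neighbours keep their
 * contents. Returns 0 if that holds, -1 otherwise.
 */
static int demo_partial_release(void)
{
    const size_t block_size = (size_t)1 << 20; /* stand-in block size */
    const size_t nblocks = 4;
    uint8_t *base = mmap(NULL, nblocks * block_size,
                         PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int rc;

    if (base == MAP_FAILED) {
        return -1;
    }
    memset(base, 0xAB, nblocks * block_size); /* touch every page */

    rc = release_block(base, block_size, 1);
    if (rc == 0) {
        /* Private anonymous pages are zero-filled on the next access. */
        int released_zeroed = base[block_size] == 0 &&
                              base[2 * block_size - 1] == 0;
        int neighbours_intact = base[block_size - 1] == 0xAB &&
                                base[2 * block_size] == 0xAB;
        rc = (released_zeroed && neighbours_intact) ? 0 : -1;
    }
    munmap(base, nblocks * block_size);
    return rc;
}
```

The point of the sketch is only that the advice applies to a sub-range:
the mapping stays in place, so the DIMM device could remain plugged while
the host reclaims the pages behind the blocks the guest did give back.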