From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59198) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1d5umN-00023u-6W for qemu-devel@nongnu.org; Wed, 03 May 2017 09:57:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1d5umI-0002AQ-NH for qemu-devel@nongnu.org; Wed, 03 May 2017 09:57:19 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:60406) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1d5umI-00029i-BN for qemu-devel@nongnu.org; Wed, 03 May 2017 09:57:14 -0400 Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v43DsLET051133 for ; Wed, 3 May 2017 09:57:13 -0400 Received: from e24smtp01.br.ibm.com (e24smtp01.br.ibm.com [32.104.18.85]) by mx0a-001b2d01.pphosted.com with ESMTP id 2a7dbhhdmq-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Wed, 03 May 2017 09:57:12 -0400 Received: from localhost by e24smtp01.br.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 3 May 2017 10:57:09 -0300 References: <20170430172547.13415-1-danielhb@linux.vnet.ibm.com> <20170430172547.13415-3-danielhb@linux.vnet.ibm.com> From: Daniel Henrique Barboza Date: Wed, 3 May 2017 10:56:57 -0300 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US Message-Id: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Bharata B Rao Cc: "qemu-ppc@nongnu.org" , "qemu-devel@nongnu.org" , David Gibson On 05/03/2017 12:26 AM, Bharata B Rao wrote: > On Tue, May 2, 2017 at 1:13 PM, Daniel Henrique Barboza > > wrote: > > > > On 05/02/2017 12:40 AM, Bharata B Rao wrote: >> On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza >> > > wrote: >> >> Following up the previous detach_cb change, this patch >> removes the >> detach_cb_opaque entirely from the code. >> >> The reason is that the drc->detach_cb_opaque object can't be >> restored in the post load of the upcoming DRC migration and >> no detach >> callbacks actually need this opaque. 'spapr_core_release' is >> receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is >> receiving >> a phb object as opaque but is't using it. These were trivial >> removal >> cases. >> >> However, the LM removal callback 'spapr_lmb_release' is receiving >> and using the opaque object, a 'sPAPRDIMMState' struct. This >> struct >> holds the number of LMBs the DIMM object contains and the >> callback >> was using this counter as a countdown to check if all LMB >> DRCs were >> release before proceeding to the DIMM unplug. To remove the >> need of >> this callback we have choices such as: >> >> - migrate the 'sPAPRDIMMState' struct. This would require >> creating a >> QTAILQ to store all DIMMStates and an additional 'dimm_id' >> field to >> associate the DIMMState with the DIMM object. We could attach >> this >> QTAILQ to the 'sPAPRPHBState' and retrieve it later in the >> callback. >> >> - fetch the state of the LMB DRCs directly by scanning the >> state of >> them and, if all of them are released, proceed with the DIMM >> unplug. >> >> The second approach was chosen. The new >> 'spapr_all_lmbs_drcs_released' >> function scans all LMBs of a given DIMM device to see if >> their DRC >> state are inactive. If all of them are inactive return >> 'true', 'false' >> otherwise. This function is being called inside the >> 'spapr_lmb_release' >> callback, replacing the role of the 'sPAPRDIMMState' opaque. The >> 'sPAPRDIMMState' struct was removed from the code given that >> there are >> no more uses for it. >> >> After all these changes, there are no roles left for the >> 'detach_cb_opaque' >> attribute of the 'sPAPRDRConnector' as well, so we can safely >> remove >> it from the code too. >> >> Signed-off-by: Daniel Henrique Barboza >> > > >> --- >> hw/ppc/spapr.c | 46 >> +++++++++++++++++++++++++++++++++------------- >> hw/ppc/spapr_drc.c | 16 +++++----------- >> hw/ppc/spapr_pci.c | 4 ++-- >> include/hw/ppc/spapr_drc.h | 6 ++---- >> 4 files changed, 42 insertions(+), 30 deletions(-) >> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >> index bc11757..8b9a6cf 100644 >> --- a/hw/ppc/spapr.c >> +++ b/hw/ppc/spapr.c >> @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque) >> } >> } >> >> -typedef struct sPAPRDIMMState { >> - uint32_t nr_lmbs; >> -} sPAPRDIMMState; >> +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm) >> +{ >> + Error *local_err = NULL; >> + PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); >> + MemoryRegion *mr = ddc->get_memory_region(dimm); >> + uint64_t size = memory_region_size(mr); >> + >> + uint64_t addr; >> + addr = object_property_get_int(OBJECT(dimm), >> PC_DIMM_ADDR_PROP, &local_err); >> + if (local_err) { >> + error_propagate(&error_abort, local_err); >> + return false; >> + } >> + uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE; >> >> -static void spapr_lmb_release(DeviceState *dev, void *opaque) >> + sPAPRDRConnector *drc; >> + int i = 0; >> + for (i = 0; i < nr_lmbs; i++) { >> + drc = >> spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB, >> + addr / SPAPR_MEMORY_BLOCK_SIZE); >> + g_assert(drc); >> + if (drc->indicator_state != >> SPAPR_DR_INDICATOR_STATE_INACTIVE) { >> + return false; >> + } >> + addr += SPAPR_MEMORY_BLOCK_SIZE; >> + } >> + return true; >> +} >> + >> +static void spapr_lmb_release(DeviceState *dev) >> { >> - sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque; >> HotplugHandler *hotplug_ctrl; >> >> - if (--ds->nr_lmbs) { >> + if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) { >> return; >> } >> >> >> I am concerned about the number of times we walk the DRC list >> corresponding to each DIMM device. When a DIMM device is being >> removed, spapr_lmb_release() will be invoked for each of the LMBs >> of that DIMM. Now in this scheme, we end up walking through all >> the DRC objects of the DIMM from every LMB's release function. > > Hi Bharata, > > > I agree, this is definitely a poorer performance than simply > decrementing ds->nr_lmbs. > The reasons why I went on with it: > > - hot unplug isn't an operation that happens too often, so it's > not terrible > to have a delay increase here; > > - it didn't increased the unplug delay in an human noticeable way, > at least in > my tests; > > - apart from migrating the information, there is nothing much we > can do in the > callback side about it. The callback isn't aware of the current > state of the DIMM > removal process, so the scanning is required every time. > > > All that said, assuming that the process of DIMM removal will > always go through > 'spapr_del_lmbs', why do we need this callback? Can't we simply do > something > like this in spapr_del_lmbs? > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index cd42449..e443fea 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState > *dev, uint64_t addr_start, uint64_t size, > addr += SPAPR_MEMORY_BLOCK_SIZE; > } > > + if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) { > + // something went wrong in the removal of the LMBs. > + // propagate error and return > + throw_error_code; > + return; > + } > > > spapr_del_lmbs() is called from ->unplug_request(). Here we notify the > guest about the unplug request. We have to wait till the guest gives > us a go ahead so that we can cleanup the DIMM device. The cleanup is > done as part of release callback (spapr_lmb_release) at which point we > are sure that the given LMB has been indeed removed by the guest. I wasn't clear enough in my last comment. Let me rephrase. Is there any other use for the 'spapr_lmb_release' callback function other than being called by the spapr_del_lmbs() in the flow you've stated above? Searching the master code now I've found: $ grep -R 'spapr_lmb_release' . ./spapr.c:static void spapr_lmb_release(DeviceState *dev, void *opaque) ./spapr.c: drck->detach(drc, dev, spapr_lmb_release, ds, errp); Note that all the callback is doing is asserting that a nr_lmb counter will be zero after a decrement and, if true, execute the following: hotplug_ctrl = qdev_get_hotplug_handler(dev); hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort); So, if the callback spapr_lmb_release is only being called in the following for loop of spapr_del_lmbs() to clean up each LMB DRC, can't we get rid of it and do the following after this for loop? for (i = 0; i < nr_lmbs; i++) { drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB, addr / SPAPR_MEMORY_BLOCK_SIZE); g_assert(drc); drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); drck->detach(drc, dev, ds, errp); addr += SPAPR_MEMORY_BLOCK_SIZE; } if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) { // All LMBs were cleared, proceed with detach hotplug_ctrl = qdev_get_hotplug_handler(dev); hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort); } // proceed with spapr_del_lmbs code Doesn't this code does exactly the same thing that the callback does today? Note that we can even use that conditional to block the remaining spapr_del_lmbs code from executing if the LMBs weren't properly cleansed - something that today isn't done. If removing this callback is too problematic or can somehow cause problems that I am unable to foresee, then the alternative would be to either deal with the scanning inside the callback (as it is being done in this patch) or migrate the nr_lmbs information for late retrieval in the callback. I am fine with any alternative, we just need to agree on what makes more sense. Daniel > > Regards, > Bharata.