From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([209.51.188.92]:37497) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1goTnE-0007Ej-Ak for qemu-devel@nongnu.org; Tue, 29 Jan 2019 08:51:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1goTn8-0003ae-Br for qemu-devel@nongnu.org; Tue, 29 Jan 2019 08:51:12 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:36514) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1goTn8-0003a9-1f for qemu-devel@nongnu.org; Tue, 29 Jan 2019 08:51:06 -0500 Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x0TDnUjJ053429 for ; Tue, 29 Jan 2019 08:51:04 -0500 Received: from e06smtp03.uk.ibm.com (e06smtp03.uk.ibm.com [195.75.94.99]) by mx0a-001b2d01.pphosted.com with ESMTP id 2qaqv40s51-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 29 Jan 2019 08:51:03 -0500 Received: from localhost by e06smtp03.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 29 Jan 2019 13:51:01 -0000 Reply-To: pmorel@linux.ibm.com References: <20190121134249.16615-1-david@redhat.com> <20190121134249.16615-3-david@redhat.com> <20190123120539.3ca367f9.cohuck@redhat.com> <20190128122833.4613659a.cohuck@redhat.com> <3b9f7e27-5f3b-5a36-87ab-209942334e59@linux.ibm.com> <0c546fd4-28dd-6e62-8f9f-cd7894d36849@redhat.com> From: Pierre Morel Date: Tue, 29 Jan 2019 14:50:56 +0100 MIME-Version: 1.0 In-Reply-To: <0c546fd4-28dd-6e62-8f9f-cd7894d36849@redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Message-Id: Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: David Hildenbrand , Collin Walling , Cornelia Huck , Christian Borntraeger Cc: Thomas Huth , qemu-s390x@nongnu.org, qemu-devel@nongnu.org, Richard Henderson On 29/01/2019 11:24, David Hildenbrand wrote: >>>> I'm wondering what the architecture says regarding those events -- c= an >>>> someone with access to the documentation comment? >>> >>> Ping. Any comments from the IBM folks? Hi, Sorry to have wait so long. At least Collin was faster. >>> >> >> >> So the idea here is that if we have a PCI device that is the process o= f >> being deconfigured and we are also in the middle of a reset, then let'= s >> accelerate deconfiguring of the PCI device during the reset. Makes sen= se. to me too. However, how do we ensure that the guest got time to respond to the=20 first deconfigure request? >> >> Note: >> >> The callback function will deconfigure the the device and put it into >> standby mode. However, a PCI device should only go into standby from t= he >> *disabled state* (which it could already be in due to the unplug >> sequence), or from a *permanent error state* (something we should >> hopefully never see -- this means something went seriously wrong with >> the device). Not completely exact, the CHSC event 0x304, on the initiative of the=20 host, force the "deconfigure state" from "configure state" generally,=20 whatever sub-state it has (enabled/disabled/error...). >=20 > Right, this should already have been checked before setting up the time= r. Apropos timer, do we need a timer or wouldn't it be better to use a=20 delay / a timer + condition? AFAIU we get out of the unplug without waiting for any answer from the=20 guest and we surely get the timer triggering after the reset has been don= e. That seems bad. >=20 >> >> Two things I'm concerned about: >> >> 1) >> >> What I would suggest is adding a check for the pbdev->state for >> ZPCI_FS_DISABLED || ZPCI_FS_PERMANENT_ERROR. If it is in either of the= se >> states, then we're safe to deconfigure and put into standby. If the >=20 > We setup a timer if !ZPCI_FS_STANDBY and !ZPCI_FS_RESERVED. >=20 > So for > - ZPCI_FS_DISABLED > - ZPCI_FS_ENABLED > - ZPCI_FS_BLOCKED > - ZPCI_FS_ERROR > - ZPCI_FS_PERMANENT_ERROR >=20 > We setup a timer and simply go ahead and unplug the device when the > timer expires ("forced unplug"). I agree only for ZPCI_FS_ENABLED why do we need to be smooth for other=20 states? ZPCI_FS_DISABLED may be a candidate even I do not see the interrest but=20 other states of the device should issue a HP_EVENT_DECONFIGURE_REQUEST=20 and we do not need a timer (or any delay) >=20 > Changing that behavior might be more invasive. Simply not unplugging in > s390_pcihost_timer_cb() on some of these states would mean that somebod= y > issued a request and that requests is simply lost/ignored. Not sure if > that is what we want. I think we need separate patches to change > something like that. Especially >=20 > 1. What happens if the device was in ZPCI_FS_DISABLED, the guest ignore= s > the unplug request and moves the device to ZPCI_FS_ENABLED before the > timer expires? These are corner cases to consider. +1, we must ensure to do the work inside the unplug CB. >=20 > 2. Do we need a timer at all? Now that Patch #1 introduces > unplug_requests, we are free to ignore requests from the user if the > guest is not reacting. I would really favor getting rid of the timer > completely. Was there a special reason why this was introduced? Yes, to let a chance to the guest to smoothly relinquish the device. (for example sync/clean the disk) However I do not think it is right implemented. >=20 > No other architecture (e.g. ACPI) uses such a timer. They use a simple > flag to remember if a request is pending. I would really favor going > into that direction. I am not sure that the Intel architecture is a good example. :) AFAIU they do not wait for the guest to have relinquish the device. Or do they? How long do they wait? >=20 > @Christian do you think we need this "force remove after 30 seconds" > timer thingy? >=20 >> device is still in another state (such as enabled or blocked, etc) the= n >> we should allow the timer to resume and give the device some more time >> before forcing an unplug. It's also probably not a good idea to try an= d >> deconfigure a device that might already be deconfigured (e.g. if it's >> already in standby or reserved state). That might not happen though, b= ut >> it's good to cover our bases. >> >> A side thought: In addition to checking the states, what would happen = if >> you forced the timer to 0? Would the callback get called? Would that >> just accelerate the already-in-progress unplug sequence? >=20 > I guess so, but then action will be called asynchronously from the main > loop when timers are run. Calling it synchronously during the reset > makes in my point of view more sense. To me too, (and during hot unplug too) but it has a sense only if we=20 wait some time between the demand to relinquish the device and the rip=20 off of the device. Regards, Pierre --=20 Pierre Morel Linux/KVM/QEMU in B=C3=B6blingen - Germany