All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests
@ 2014-08-07 22:04 Sander Eikelenboom
  2014-08-08  7:09 ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Sander Eikelenboom @ 2014-08-07 22:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Ian Campbell; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 6231 bytes --]

Hi Konrad / Ian,

Today i thought of doing some DIY .. and took the problem i encountered:

"other issue with removing pci devices passed through to HVM guests related to 
the signaling via xenstore" described in:
http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html

It seems there are quite some (little) issues. I got things working for my 
specific case, however how to get things done in a backwards compatible matter 
(for multiple kernel version and the three variants of pv_guest, qemu_trad and 
qemu_xen is still a question :-).

But one of the main questions is .. 
- Is there some document describing the various states in more detail then in 
  xenbus.h ?
- Are there rules with regard to backend, toolstack(libxl) and frontend as to 
  who does what in changing states ?

For example when domains are setup and running with pci-passthrough devices, if
you list the devices with 'xenstore-ls' these are still in 
XenbusStateInitialised(3) instead of XenbusStateConnected(4), seems incorrect to 
me.


But back to the problem i was trying to solve:

Here are both a crude patch for xen/libxl and pciback that change the signaling 
and now allow the removal of an individual pci device from a HVM guest.
I hope i took the right states based on the descriptions and code i found.

What these patches try to do on remove of a pci device (and when everything goes well) is:
- after libxl has removed the device from the guest with a qmp command
- let libxl set the pci-device-state to XenbusStateClosing AND 
  set pci-root-state to XenbusStateReconfiguring
- libxl waits for the pci-root-state to change to XenbusStateReconfigured

- pciback has a watch on the pci-root-state and removes the device and resets in 
  pci config space, after that it sets the pci-device-state to XenbusStateClosed AND
  set pci-root-state to XenbusStateReconfigured

- libxl now continues from the on XenbusStateReconfigured, cleans up the xenstore entries for 
  the device and finally sets pci-root-state back to XenbusStateConnected

However the problem now seems to be on shutdown .. where removing multiple 
passed through device mess up the signalling on pci-root causing the 
libxl__wait_for_backend to last forever *sigh*
                                            
But perhaps someone has suggestions ...

(patches are also attached to prevent mail mangling)

--
Sander


LIBXL patch:

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 2782d0e..b54531c 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -204,6 +204,23 @@ retry_transaction:
         }
     }
 
+   if (domtype == LIBXL_DOMAIN_TYPE_HVM) {
+        if (libxl__wait_for_backend(gc, be_path, "8") < 0) {
+            LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "pci backend at %s is not ready", be_path);
+            return ERROR_FAIL;
+        } else {
+            if (atoi(libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, "%s/state-%d", be_path, i))) != 6 ) {
+                LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "pci backend at %s didn't set device into closing state: %d",
+                           libxl__sprintf(gc, "%s/state-%d", be_path, i), 
+                           atoi(libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, "%s/state-%d", be_path, i))));
+                return ERROR_FAIL;
+            } else {
+                LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "pci backend at %s has set device into XenbusStateClosed, "
+                           "continue with removing xenstore entry", libxl__sprintf(gc, "%s/state-%d", be_path, i));
+            }
+        }
+    }
+
 retry_transaction2:
     t = xs_transaction_start(ctx->xsh);
     xs_rm(ctx->xsh, t, libxl__sprintf(gc, "%s/state-%d", be_path, i));
@@ -245,6 +262,8 @@ retry_transaction2:
             xs_rm(ctx->xsh, t, tmppath);
         }
     }
+    /* switch guest pci root back from state XenbusStateReconfigured to state XenbusStateConnected after */
+    xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/state", be_path), "4", strlen("4"));
     if (!xs_transaction_end(ctx->xsh, t, 0))
         if (errno == EAGAIN)
             goto retry_transaction2;





PCIBACK patch:

diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index f09081c..ecb00bf 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -4,6 +4,8 @@
  *   Author: Ryan Wilson <hap9@epoch.ncsc.mil>
  */
 
+#define DEBUG
+
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/module.h>
@@ -485,9 +487,21 @@ static int xen_pcibk_reconfigure(struct xen_pcibk_device *pdev)
 
                        err = xen_pcibk_remove_device(pdev, domain, bus, slot,
                                                    func);
-                       if (err)
+                       if (err){
+                               xenbus_dev_fatal(pdev->xdev, err, "Error removing pci device ");
                                goto out;
-
+                       } else {
+                               /* device removed succesfully,inform toolstack
+                                * by switching state of device to XenbusStateClosed 
+                                */
+                               err = xenbus_printf(XBT_NIL, pdev->xdev->nodename, state_str,
+                                                   "%d", XenbusStateClosed);
+                               if (err) {
+                                       xenbus_dev_fatal(pdev->xdev, err, 
+                                                        "Error switching state of device to XenbusStateClosed");
+                                       goto out;
+                               }
+                       }
                        /* TODO: If at some point we implement support for pci
                         * root hot-remove on pcifront side, we'll need to
                         * remove unnecessary xenstore nodes of pci roots here.
@@ -664,6 +678,10 @@ static void xen_pcibk_be_watch(struct xenbus_watch *watch,
                xen_pcibk_setup_backend(pdev);
                break;
 
+       case XenbusStateReconfiguring:
+               xen_pcibk_reconfigure(pdev);
+               break;
+
        default:
                break;
        }

[-- Attachment #2: patch-libxl-removal-individual-pci-devices.diff --]
[-- Type: application/octet-stream, Size: 1734 bytes --]

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 2782d0e..b54531c 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -204,6 +204,23 @@ retry_transaction:
         }
     }
 
+   if (domtype == LIBXL_DOMAIN_TYPE_HVM) {
+        if (libxl__wait_for_backend(gc, be_path, "8") < 0) {
+            LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "pci backend at %s is not ready", be_path);
+            return ERROR_FAIL;
+        } else {
+            if (atoi(libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, "%s/state-%d", be_path, i))) != 6 ) {
+                LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "pci backend at %s didn't set device into closing state: %d",
+                           libxl__sprintf(gc, "%s/state-%d", be_path, i), 
+                           atoi(libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, "%s/state-%d", be_path, i))));
+                return ERROR_FAIL;
+            } else {
+                LIBXL__LOG(ctx, LIBXL__LOG_DEBUG, "pci backend at %s has set device into XenbusStateClosed, "
+                           "continue with removing xenstore entry", libxl__sprintf(gc, "%s/state-%d", be_path, i));
+            }
+        }
+    }
+
 retry_transaction2:
     t = xs_transaction_start(ctx->xsh);
     xs_rm(ctx->xsh, t, libxl__sprintf(gc, "%s/state-%d", be_path, i));
@@ -245,6 +262,8 @@ retry_transaction2:
             xs_rm(ctx->xsh, t, tmppath);
         }
     }
+    /* switch guest pci root back from state XenbusStateReconfigured to state XenbusStateConnected after */
+    xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/state", be_path), "4", strlen("4"));
     if (!xs_transaction_end(ctx->xsh, t, 0))
         if (errno == EAGAIN)
             goto retry_transaction2;

[-- Attachment #3: patch-pciback-removal-individual-pci-devices.diff --]
[-- Type: application/octet-stream, Size: 1434 bytes --]

diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index f09081c..ecb00bf 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -4,6 +4,8 @@
  *   Author: Ryan Wilson <hap9@epoch.ncsc.mil>
  */
 
+#define DEBUG
+
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/module.h>
@@ -485,9 +487,21 @@ static int xen_pcibk_reconfigure(struct xen_pcibk_device *pdev)
 
 			err = xen_pcibk_remove_device(pdev, domain, bus, slot,
 						    func);
-			if (err)
+			if (err){
+				xenbus_dev_fatal(pdev->xdev, err, "Error removing pci device ");
 				goto out;
-
+			} else {
+				/* device removed succesfully,inform toolstack
+				 * by switching state of device to XenbusStateClosed 
+				 */
+				err = xenbus_printf(XBT_NIL, pdev->xdev->nodename, state_str,
+				                    "%d", XenbusStateClosed);
+				if (err) {
+					xenbus_dev_fatal(pdev->xdev, err, 
+							 "Error switching state of device to XenbusStateClosed");
+					goto out;
+				}
+			}
 			/* TODO: If at some point we implement support for pci
 			 * root hot-remove on pcifront side, we'll need to
 			 * remove unnecessary xenstore nodes of pci roots here.
@@ -664,6 +678,10 @@ static void xen_pcibk_be_watch(struct xenbus_watch *watch,
 		xen_pcibk_setup_backend(pdev);
 		break;
 
+	case XenbusStateReconfiguring:
+		xen_pcibk_reconfigure(pdev);
+		break;
+
 	default:
 		break;
 	}

[-- Attachment #4: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests
  2014-08-07 22:04 [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests Sander Eikelenboom
@ 2014-08-08  7:09 ` Jan Beulich
  2014-08-08  8:20   ` Sander Eikelenboom
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2014-08-08  7:09 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, Ian Campbell

>>> On 08.08.14 at 00:04, <linux@eikelenboom.it> wrote:
> Here are both a crude patch for xen/libxl and pciback that change the signaling 
> and now allow the removal of an individual pci device from a HVM guest.
> I hope i took the right states based on the descriptions and code i found.
> 
> What these patches try to do on remove of a pci device (and when everything 
> goes well) is:
> - after libxl has removed the device from the guest with a qmp command
> - let libxl set the pci-device-state to XenbusStateClosing AND 
>   set pci-root-state to XenbusStateReconfiguring
> - libxl waits for the pci-root-state to change to XenbusStateReconfigured
> 
> - pciback has a watch on the pci-root-state and removes the device and resets in 
>   pci config space, after that it sets the pci-device-state to XenbusStateClosed AND
>   set pci-root-state to XenbusStateReconfigured
> 
> - libxl now continues from the on XenbusStateReconfigured, cleans up the 
> xenstore entries for 
>   the device and finally sets pci-root-state back to XenbusStateConnected

A fundamental question: Will the changed libxl work with an
unchanged pciback no worse than the unchanged libxl does in
all possible scenarios? The thing is that extending the required
protocol between two components may only be done in a fully
backward compatible way.

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests
  2014-08-08  7:09 ` Jan Beulich
@ 2014-08-08  8:20   ` Sander Eikelenboom
  2014-08-08  8:25     ` Sander Eikelenboom
  2014-08-08  8:51     ` Jan Beulich
  0 siblings, 2 replies; 6+ messages in thread
From: Sander Eikelenboom @ 2014-08-08  8:20 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Ian Campbell


Friday, August 8, 2014, 9:09:55 AM, you wrote:

>>>> On 08.08.14 at 00:04, <linux@eikelenboom.it> wrote:
>> Here are both a crude patch for xen/libxl and pciback that change the signaling 
>> and now allow the removal of an individual pci device from a HVM guest.
>> I hope i took the right states based on the descriptions and code i found.
>> 
>> What these patches try to do on remove of a pci device (and when everything 
>> goes well) is:
>> - after libxl has removed the device from the guest with a qmp command
>> - let libxl set the pci-device-state to XenbusStateClosing AND 
>>   set pci-root-state to XenbusStateReconfiguring
>> - libxl waits for the pci-root-state to change to XenbusStateReconfigured
>> 
>> - pciback has a watch on the pci-root-state and removes the device and resets in 
>>   pci config space, after that it sets the pci-device-state to XenbusStateClosed AND
>>   set pci-root-state to XenbusStateReconfigured
>> 
>> - libxl now continues from the on XenbusStateReconfigured, cleans up the 
>> xenstore entries for 
>>   the device and finally sets pci-root-state back to XenbusStateConnected

> A fundamental question: Will the changed libxl work with an
> unchanged pciback no worse than the unchanged libxl does in
> all possible scenarios? The thing is that extending the required
> protocol between two components may only be done in a fully
> backward compatible way.

> Jan

That was my concern as well .. although it's already spaghetti ..
There are already multiple scenario's:

- pciback has to serve both libxl and xend
- there are 3 guest types
        - PV Guests for which pci-front does most of the signaling (and libxl a 
          small part)
        - HVM guest with Qemu traditional, libxl does the signaling
        - HVM guest with Qemu xen, libxl does the signaling       

There already are discrepancies:
- with PV guests, the running state for the pci-device-root is 
  XenbusStateConnected(4), but the 
  running state for the actual pci-devices is  XenbusStateInitialised(3)
- with HVM guests, the running state for the pci-device-root differs, and is 
  XenbusStateInitialised(3)
  
Looks clear to me that should have all been XenbusStateConnected(4)

- On remove at present libxl sets the pci-device-root state to 
  XenbusStateReconfiguring(7), for both PV guests and HVM guests
- However at present pciback doesn't seem to reset that back, so it would
  stay in that state for ever.

That doesn't seem to be right as well.

PV-guests seems to shortcut that libxl part, by doing some direct signaling via
a direct xenbus connect, from pciback/xenbus.c:

static DEFINE_XENBUS_DRIVER(xen_pcibk, DRV_NAME,
        .probe                  = xen_pcibk_xenbus_probe,
        .remove                 = xen_pcibk_xenbus_remove,
        .otherend_changed       = xen_pcibk_frontend_changed,
);



So how feasible is it to keep on trying to work around on things .. when there 
is that much spaghetti :-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests
  2014-08-08  8:20   ` Sander Eikelenboom
@ 2014-08-08  8:25     ` Sander Eikelenboom
  2014-08-08  8:51     ` Jan Beulich
  1 sibling, 0 replies; 6+ messages in thread
From: Sander Eikelenboom @ 2014-08-08  8:25 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, Ian Campbell, Jan Beulich


Friday, August 8, 2014, 10:20:31 AM, you wrote:


> Friday, August 8, 2014, 9:09:55 AM, you wrote:

>>>>> On 08.08.14 at 00:04, <linux@eikelenboom.it> wrote:
>>> Here are both a crude patch for xen/libxl and pciback that change the signaling 
>>> and now allow the removal of an individual pci device from a HVM guest.
>>> I hope i took the right states based on the descriptions and code i found.
>>> 
>>> What these patches try to do on remove of a pci device (and when everything 
>>> goes well) is:
>>> - after libxl has removed the device from the guest with a qmp command
>>> - let libxl set the pci-device-state to XenbusStateClosing AND 
>>>   set pci-root-state to XenbusStateReconfiguring
>>> - libxl waits for the pci-root-state to change to XenbusStateReconfigured
>>> 
>>> - pciback has a watch on the pci-root-state and removes the device and resets in 
>>>   pci config space, after that it sets the pci-device-state to XenbusStateClosed AND
>>>   set pci-root-state to XenbusStateReconfigured
>>> 
>>> - libxl now continues from the on XenbusStateReconfigured, cleans up the 
>>> xenstore entries for 
>>>   the device and finally sets pci-root-state back to XenbusStateConnected

>> A fundamental question: Will the changed libxl work with an
>> unchanged pciback no worse than the unchanged libxl does in
>> all possible scenarios? The thing is that extending the required
>> protocol between two components may only be done in a fully
>> backward compatible way.

>> Jan

> That was my concern as well .. although it's already spaghetti ..
> There are already multiple scenario's:

> - pciback has to serve both libxl and xend
> - there are 3 guest types
>         - PV Guests for which pci-front does most of the signaling (and libxl a 
>           small part)
>         - HVM guest with Qemu traditional, libxl does the signaling
>         - HVM guest with Qemu xen, libxl does the signaling       

> There already are discrepancies:
> - with PV guests, the running state for the pci-device-root is 
>   XenbusStateConnected(4), but the 
>   running state for the actual pci-devices is  XenbusStateInitialised(3)
> - with HVM guests, the running state for the pci-device-root differs, and is 
>   XenbusStateInitialised(3)
>   
> Looks clear to me that should have all been XenbusStateConnected(4)

> - On remove at present libxl sets the pci-device-root state to 
>   XenbusStateReconfiguring(7), for both PV guests and HVM guests
> - However at present pciback doesn't seem to reset that back, so it would
>   stay in that state for ever.

> That doesn't seem to be right as well.

> PV-guests seems to shortcut that libxl part, by doing some direct signaling via
> a direct xenbus connect, from pciback/xenbus.c:

> static DEFINE_XENBUS_DRIVER(xen_pcibk, DRV_NAME,
>         .probe                  = xen_pcibk_xenbus_probe,
>         .remove                 = xen_pcibk_xenbus_remove,
>         .otherend_changed       = xen_pcibk_frontend_changed,
> );



> So how feasible is it to keep on trying to work around on things .. when there 
> is that much spaghetti :-)

The only workaroung (without cleaning up and correcting things) that i can see 
is that pciback on every xenstore change to that part of the xenstore-tree, 
checks if what it things is passed through is also still passed through 
according to the xenstore entries, if not .. it removes and resets the device.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests
  2014-08-08  8:20   ` Sander Eikelenboom
  2014-08-08  8:25     ` Sander Eikelenboom
@ 2014-08-08  8:51     ` Jan Beulich
  2014-08-08 23:30       ` Sander Eikelenboom
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2014-08-08  8:51 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, Ian Campbell

>>> On 08.08.14 at 10:20, <linux@eikelenboom.it> wrote:
> So how feasible is it to keep on trying to work around on things .. when 
> there 
> is that much spaghetti :-)

It's ugly, yes, but breaking existing code is ugly too. The usual
handshaking negotiation via xenstore should be used to control
altered behavior.

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests
  2014-08-08  8:51     ` Jan Beulich
@ 2014-08-08 23:30       ` Sander Eikelenboom
  0 siblings, 0 replies; 6+ messages in thread
From: Sander Eikelenboom @ 2014-08-08 23:30 UTC (permalink / raw)
  To: Jan Beulich, Konrad Rzeszutek Wilk; +Cc: xen-devel, Ian Campbell


Friday, August 8, 2014, 10:51:29 AM, you wrote:

>>>> On 08.08.14 at 10:20, <linux@eikelenboom.it> wrote:
>> So how feasible is it to keep on trying to work around on things .. when 
>> there 
>> is that much spaghetti :-)

> It's ugly, yes, but breaking existing code is ugly too. The usual
> handshaking negotiation via xenstore should be used to control
> altered behavior.

> Jan

Got an example (something like the ) that is used between pciback and front ?

But i still don't see how it could be done without a lot of code duplication or 
complicating things further.

I also tried the other route that keeps the changes to pciback only.
That is placing a "safeguard" at the end of the xenstore watch callback 
(xen_pcibk_be_watch), that way there is no change for PV guests ( as pcifront 
does seem to do the proper signaling ).  

So on every change to (or under) a guests "pci-root" tree in xenstore, pciback would
check if the list of devices that it has registered as passed through to a 
domain (dev_domain_list) matches with the xenstore entries.

If a xenstore entry is missing for a device that is still registred to a domain, 
it is clear that libxl has removed it and yanked the xenstore entries from beneath our feet,
so we should now deregister the device from the domain in pciback. 

How ever my C-skills are a bit too limited with respect to passing around 
the necessary lists/arrays of pci_dev structs to loop through and compare it
to the xenstore entries, so i can't get it implemented :-( *sigh*

--
Sander

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-08-08 23:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-07 22:04 [Xen-unstable] XenbusState signaling with regard to pci-detach on HVM guests Sander Eikelenboom
2014-08-08  7:09 ` Jan Beulich
2014-08-08  8:20   ` Sander Eikelenboom
2014-08-08  8:25     ` Sander Eikelenboom
2014-08-08  8:51     ` Jan Beulich
2014-08-08 23:30       ` Sander Eikelenboom

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.