* [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Marek Marczykowski-Górecki
Date: 2020-01-05 8:41 UTC
To: xen-devel
Cc: Anthony PERARD, Ian Jackson, Marek Marczykowski-Górecki, Wei Liu

Cleaning up backend xenstore entries is a responsibility of the backend.
When backend lives outside of dom0, the domain needs proper permissions
to do it. Normally it is given permission to remove the device dir
itself, but not the dir containing it (named after frontend ID). After a
while those empty leftover directories accumulate to the point of xenstore
returning E2BIG on listing them.

Fix this by giving backend domain write access also to backend/
directory itself when c_info->driver_domain option is set. The code
removing relevant dir is already there (just lacked permissions to do so).

Note this also allows the backend domain to create new entries,
pretending to host backend devices it doesn't have. But since libxl uses
/libxl/ xenstore dir for this information (still outside of backend
domain control), this shouldn't be an issue.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
 tools/libxl/libxl_create.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index a6d40b753e..38ca9b85a4 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -763,6 +763,13 @@ retry_transaction:
          */
         libxl__xs_mknod(gc, t, GCSPRINTF("%s/device-model", dom_path),
                         rwperm, ARRAY_SIZE(rwperm));
+
+        /*
+         * Create a local "backend" directory for each guest, writable by that
+         * guest, to allow it properly cleanup removed devices
+         */
+        libxl__xs_mknod(gc, t, GCSPRINTF("%s/backend", dom_path), rwperm,
+                        ARRAY_SIZE(rwperm));
     }
 
     vm_list = libxl_list_vm(ctx, &nb_vm);
-- 
2.21.0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

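The accumulation problem this patch describes can be illustrated with a toy model. The following is a minimal sketch, not xenstore or libxl code: it assumes, as a simplification, that a node created without explicit permissions inherits the owner of its parent, and that only the owner of a node (or dom0) may remove it. All names (`struct store`, `mknode`, `rmnode`, `leftover_dirs`) are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: this is NOT the xenstore API. */
#define MAX_NODES 128
#define MAX_PATH  64

struct node  { char path[MAX_PATH]; int owner; bool used; };
struct store { struct node nodes[MAX_NODES]; };

static struct node *find(struct store *s, const char *path)
{
    for (int i = 0; i < MAX_NODES; i++)
        if (s->nodes[i].used && strcmp(s->nodes[i].path, path) == 0)
            return &s->nodes[i];
    return NULL;
}

/* Create a node. owner < 0 means "inherit the owner of the parent node". */
static bool mknode(struct store *s, const char *path, int owner)
{
    if (find(s, path))
        return true;
    if (owner < 0) {
        char parent[MAX_PATH];
        snprintf(parent, sizeof(parent), "%s", path);
        char *slash = strrchr(parent, '/');
        if (!slash)
            return false;
        *slash = '\0';
        struct node *p = find(s, parent);
        if (!p)
            return false;
        owner = p->owner;
    }
    for (int i = 0; i < MAX_NODES; i++) {
        if (!s->nodes[i].used) {
            snprintf(s->nodes[i].path, MAX_PATH, "%s", path);
            s->nodes[i].owner = owner;
            s->nodes[i].used = true;
            return true;
        }
    }
    return false; /* table full */
}

/* Count nodes strictly below `path`. */
static int count_below(struct store *s, const char *path)
{
    size_t len = strlen(path);
    int n = 0;
    for (int i = 0; i < MAX_NODES; i++)
        if (s->nodes[i].used &&
            strncmp(s->nodes[i].path, path, len) == 0 &&
            s->nodes[i].path[len] == '/')
            n++;
    return n;
}

/* Remove an empty node. Only its owner or dom0 (domid 0) is allowed to. */
static bool rmnode(struct store *s, const char *path, int domid)
{
    struct node *n = find(s, path);
    if (!n || count_below(s, path) > 0)
        return false;
    if (domid != 0 && n->owner != domid)
        return false;
    n->used = false;
    return true;
}

/*
 * Attach and detach `cycles` vifs served by driver domain 5 and return how
 * many nodes are left below backend/vif afterwards.
 */
int leftover_dirs(bool driver_domain_owns_backend, int cycles)
{
    const int be = 5;
    struct store s;
    char fe_dir[MAX_PATH], dev_dir[MAX_PATH];

    memset(&s, 0, sizeof(s));
    mknode(&s, "backend", driver_domain_owns_backend ? be : 0);
    mknode(&s, "backend/vif", -1);

    for (int fe = 6; fe < 6 + cycles; fe++) {
        snprintf(fe_dir, sizeof(fe_dir), "backend/vif/%d", fe);
        snprintf(dev_dir, sizeof(dev_dir), "backend/vif/%d/0", fe);
        mknode(&s, fe_dir, -1);   /* implicit parent, inherits its owner */
        mknode(&s, dev_dir, be);  /* device dir is handed to the backend */
        rmnode(&s, dev_dir, be);  /* backend cleanup: always permitted */
        rmnode(&s, fe_dir, be);   /* permitted only if the backend owns it */
    }
    return count_below(&s, "backend/vif");
}
```

In this model, `leftover_dirs(false, n)` leaves one empty per-frontend directory behind for every attach/detach cycle, while `leftover_dirs(true, n)` leaves none, which is the effect the patch aims for by making backend/ writable by the driver domain.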
* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Ian Jackson
Date: 2020-01-06 14:20 UTC
To: Marek Marczykowski-Górecki
Cc: Anthony Perard, xen-devel, Wei Liu

Marek Marczykowski-Górecki writes ("[PATCH] libxl: create backend/ xenstore dir for driver domains"):
> Cleaning up backend xenstore entries is a responsibility of the backend.
> When backend lives outside of dom0, the domain needs proper permissions
> to do it. Normally it is given permission to remove the device dir
> itself, but not the dir containing it (named after frontend ID). After a
> while those empty leftover directories accumulate to the point of xenstore
> returning E2BIG on listing them.
>
> Fix this by giving backend domain write access also to backend/
> directory itself when c_info->driver_domain option is set. The code
> removing relevant dir is already there (just lacked permissions to do so).
>
> Note this also allows the backend domain to create new entries,
> pretending to host backend devices it doesn't have. But since libxl uses
> /libxl/ xenstore dir for this information (still outside of backend
> domain control), this shouldn't be an issue.

This seems quite hazardous to me.  The reasoning you use to show that
this is OK seems fragile, and in general it doesn't feel right to
give the particular backend such wide scope.

Can we find another way to address this problem ?  I think the
containing directory should be removed by the toolstack.  Why is this
difficult ?  (I presume there is a reason or you would have done it
that way...)

Ian.

* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Marek Marczykowski-Górecki
Date: 2020-01-06 14:38 UTC
To: Ian Jackson
Cc: Anthony Perard, xen-devel, Wei Liu

On Mon, Jan 06, 2020 at 02:20:46PM +0000, Ian Jackson wrote:
> [...]
> This seems quite hazardous to me.  The reasoning you use to show that
> this is OK seems fragile, and in general it doesn't feel right to
> give the particular backend such wide scope.
>
> Can we find another way to address this problem ?  I think the
> containing directory should be removed by the toolstack.  Why is this
> difficult ?  (I presume there is a reason or you would have done it
> that way...)

It was done this way previously and caused issues, see this commit:

commit 546678c6a60f64fb186640460dfa69a837c8fba5
Author: Roger Pau Monne <roger.pau@citrix.com>
Date:   Wed Sep 23 12:06:56 2015 +0200

    libxl: fix the cleanup of the backend path when using driver domains

    With the current libxl implementation the control domain will
    remove both the frontend and the backend xenstore paths of a
    device that's handled by a driver domain. This is incorrect,
    since the driver domain possibly needs to access the backend
    path in order to perform the disconnection and cleanup of the
    device.

    Fix this by making sure the control domain only cleans the
    frontend path, leaving the backend path to be cleaned by the
    driver domain. Note that if the device is not handled by a
    driver domain the control domain will perform the removal of
    both the frontend and the backend paths.

    Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
    Reported-by: Alex Velazquez <alex.j.velazquez@gmail.com>
    Cc: Alex Velazquez <alex.j.velazquez@gmail.com>
    Cc: Ian Jackson <ian.jackson@eu.citrix.com>
    Cc: Ian Campbell <ian.campbell@citrix.com>
    Cc: Wei Liu <wei.liu2@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Ian Jackson
Date: 2020-01-06 15:40 UTC
To: Marek Marczykowski-Górecki
Cc: Anthony Perard, xen-devel, Wei Liu, Roger Pau Monne

Adding Roger to the CC.

Marek Marczykowski-Górecki writes ("Re: [PATCH] libxl: create backend/ xenstore dir for driver domains"):
> On Mon, Jan 06, 2020 at 02:20:46PM +0000, Ian Jackson wrote:
> > [...]
> > Can we find another way to address this problem ?  I think the
> > containing directory should be removed by the toolstack.  Why is this
> > difficult ?  (I presume there is a reason or you would have done it
> > that way...)
>
> It was done this way previously and caused issues, see this commit:
>
> commit 546678c6a60f64fb186640460dfa69a837c8fba5
> Author: Roger Pau Monne <roger.pau@citrix.com>
> Date:   Wed Sep 23 12:06:56 2015 +0200
>
>     libxl: fix the cleanup of the backend path when using driver domains

Thanks.

>     With the current libxl implementation the control domain will
>     remove both the frontend and the backend xenstore paths of a
>     device that's handled by a driver domain. This is incorrect,
>     since the driver domain possibly needs to access the backend
>     path in order to perform the disconnection and cleanup of the
>     device.
>
>     Fix this by making sure the control domain only cleans the
>     frontend path, leaving the backend path to be cleaned by the
>     driver domain. Note that if the device is not handled by a
>     driver domain the control domain will perform the removal of
>     both the frontend and the backend paths.

Hmm.  I see my Ack on that.  Nevertheless maybe it is wrong.

Looking at it afresh, I think maybe the right answer is:

 * If the driver domain is expected to be working properly, the
   toolstack should wait for the driver domain to complete the device
   shutdown, before removing the backend node.  Indeed, the toolstack
   ought to wait for this before actually destroying the guest in Xen,
   by the usual logic for clean domain shutdown.

 * There needs to be a way to deal with a broken/unresponsive driver
   domain.  That will involve not waiting for the backend so must
   involve simply deleting the backend from xenstore.

Is the distinction here between "xl shutdown" and "xl destroy", on the
actual guest domain, good enough ?  Hopefully if the driver domain
sees the backend directory simply vanish it can destructively tear
everything down ?

Ian.

* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Marek Marczykowski-Górecki
Date: 2020-01-06 16:03 UTC
To: Ian Jackson
Cc: Anthony Perard, xen-devel, Wei Liu, Roger Pau Monne

On Mon, Jan 06, 2020 at 03:40:22PM +0000, Ian Jackson wrote:
> [...]
> Hmm.  I see my Ack on that.  Nevertheless maybe it is wrong.
>
> Looking at it afresh, I think maybe the right answer is:
>
>  * If the driver domain is expected to be working properly, the
>    toolstack should wait for the driver domain to complete the device
>    shutdown, before removing the backend node.  Indeed, the toolstack
>    ought to wait for this before actually destroying the guest in Xen,
>    by the usual logic for clean domain shutdown.

I think that's not enough. .../state = 6 is set by the kernel, but
xl devd in the driver domain may want to clean up things (hotplug scripts
etc). And indeed libxl__device_destroy() is called from
device_hotplug_done(), not device_backend_callback().

Alternatively, toolstack could wait for the actual backend node to be
removed (by the driver domain), and then clean up the parent directory (if
empty). I don't find it particularly appealing, as every contact with
libxl async code reduces overall happiness...

>  * There needs to be a way to deal with a broken/unresponsive driver
>    domain.  That will involve not waiting for the backend so must
>    involve simply deleting the backend from xenstore.

It's already there: if driver domain fails to set .../state = 6 within
a timeout, toolstack will forcibly remove the entry.

> Is the distinction here between "xl shutdown" and "xl destroy", on the
> actual guest domain, good enough ?  Hopefully if the driver domain
> sees the backend directory simply vanish it can destructively tear
> everything down ?

In the past this led to multiple issues, where the hotplug script didn't
know which device actually was removed. In some cases I needed to
work around this by saving a xenstore dump into a file in an "online"
hotplug script, but it is a very ugly solution.

* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Marek Marczykowski-Górecki
Date: 2020-03-15 22:20 UTC
To: Ian Jackson
Cc: Anthony Perard, xen-devel, Wei Liu, Roger Pau Monne

On Mon, Jan 06, 2020 at 05:03:40PM +0100, Marek Marczykowski-Górecki wrote:
> On Mon, Jan 06, 2020 at 03:40:22PM +0000, Ian Jackson wrote:
> > [...]
> > Looking at it afresh, I think maybe the right answer is:
> >
> >  * If the driver domain is expected to be working properly, the
> >    toolstack should wait for the driver domain to complete the device
> >    shutdown, before removing the backend node.  Indeed, the toolstack
> >    ought to wait for this before actually destroying the guest in Xen,
> >    by the usual logic for clean domain shutdown.
>
> I think that's not enough. .../state = 6 is set by the kernel, but
> xl devd in the driver domain may want to clean up things (hotplug scripts
> etc). And indeed libxl__device_destroy() is called from
> device_hotplug_done(), not device_backend_callback().
>
> Alternatively, toolstack could wait for the actual backend node to be
> removed (by the driver domain), and then clean up the parent directory (if
> empty). I don't find it particularly appealing, as every contact with
> libxl async code reduces overall happiness...
>
> >  * There needs to be a way to deal with a broken/unresponsive driver
> >    domain.  That will involve not waiting for the backend so must
> >    involve simply deleting the backend from xenstore.
>
> It's already there: if driver domain fails to set .../state = 6 within
> a timeout, toolstack will forcibly remove the entry.
>
> > Is the distinction here between "xl shutdown" and "xl destroy", on the
> > actual guest domain, good enough ?  Hopefully if the driver domain
> > sees the backend directory simply vanish it can destructively tear
> > everything down ?
>
> In the past this led to multiple issues, where the hotplug script didn't
> know which device actually was removed. In some cases I needed to
> work around this by saving a xenstore dump into a file in an "online"
> hotplug script, but it is a very ugly solution.

Any opinion on the above?

In the above context (plus the fact that the toolstack uses /libxl to
enumerate devices), I still think giving the driver domain write access to
the backend/ node is the right solution for this problem.

* Re: [PATCH] libxl: create backend/ xenstore dir for driver domains
From: Roger Pau Monné
Date: 2020-03-23 15:35 UTC
To: Marek Marczykowski-Górecki
Cc: Anthony Perard, Ian Jackson, Wei Liu, xen-devel

On Mon, Jan 06, 2020 at 05:03:40PM +0100, Marek Marczykowski-Górecki wrote:
> On Mon, Jan 06, 2020 at 03:40:22PM +0000, Ian Jackson wrote:
> > [...]
> >  * If the driver domain is expected to be working properly, the
> >    toolstack should wait for the driver domain to complete the device
> >    shutdown, before removing the backend node.  Indeed, the toolstack
> >    ought to wait for this before actually destroying the guest in Xen,
> >    by the usual logic for clean domain shutdown.
>
> I think that's not enough. .../state = 6 is set by the kernel, but
> xl devd in the driver domain may want to clean up things (hotplug scripts
> etc). And indeed libxl__device_destroy() is called from
> device_hotplug_done(), not device_backend_callback().
>
> Alternatively, toolstack could wait for the actual backend node to be
> removed (by the driver domain), and then clean up the parent directory (if
> empty).

I'm not sure you need to clean up the parent directory, albeit it
wouldn't hurt. It needs to be done in a transaction though, so that
you don't race with new additions to it.

> I don't find it particularly appealing, as every contact with
> libxl async code reduces overall happiness...
>
> >  * There needs to be a way to deal with a broken/unresponsive driver
> >    domain.  That will involve not waiting for the backend so must
> >    involve simply deleting the backend from xenstore.
>
> It's already there: if driver domain fails to set .../state = 6 within
> a timeout, toolstack will forcibly remove the entry.

Would it work to change this and, instead of monitoring .../state = 6,
monitor whether the parent directory still exists?

> > Is the distinction here between "xl shutdown" and "xl destroy", on the
> > actual guest domain, good enough ?  Hopefully if the driver domain
> > sees the backend directory simply vanish it can destructively tear
> > everything down ?
>
> In the past this led to multiple issues, where the hotplug script didn't
> know which device actually was removed. In some cases I needed to
> work around this by saving a xenstore dump into a file in an "online"
> hotplug script, but it is a very ugly solution.

Removing the whole directory without giving time to the driver domain
to execute its hotplug scripts can indeed lead to issues, as there's
no guarantee that the hotplug script won't use data in xenstore in
order to perform the cleanup IIRC.

My preferred option would be to wait for the backend directory to be
removed by the driver domain, as I think it's the cleanest and likely
safest approach.

Thanks, Roger.

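The point about doing the parent directory cleanup in a transaction can be sketched as follows. This is a toy stand-in and not the xenstore API: `struct dir`, `txn_start`, `txn_commit_rm`, `concurrent_add` and `remove_if_empty` are all hypothetical names, and the only behaviour borrowed from xenstore is the convention that a transaction which conflicts with a concurrent change fails with EAGAIN and is retried.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one xenstore directory; NOT the xenstore API. */
struct dir {
    int children;        /* number of entries currently in the directory */
    bool exists;
    unsigned generation; /* bumped on every committed change */
};

struct txn {
    struct dir *d;
    unsigned start_generation;
};

static struct txn txn_start(struct dir *d)
{
    struct txn t = { d, d->generation };
    return t;
}

/* Commit fails with EAGAIN if the directory changed since txn_start(). */
static int txn_commit_rm(struct txn *t)
{
    if (t->d->generation != t->start_generation)
        return EAGAIN;
    t->d->exists = false;
    t->d->generation++;
    return 0;
}

/* Models another domain adding an entry outside of our transaction. */
void concurrent_add(struct dir *d)
{
    d->children++;
    d->generation++;
}

/*
 * Remove the directory only if it is empty. `interfere`, if non-NULL, is
 * called once between the emptiness check and the commit to model a racing
 * writer. Returns 0 if the directory is gone, ENOTEMPTY if it was left alone.
 */
int remove_if_empty(struct dir *d, void (*interfere)(struct dir *))
{
    for (;;) {
        struct txn t = txn_start(d);

        if (!d->exists)
            return 0;
        if (d->children > 0)
            return ENOTEMPTY;   /* in use: abort instead of removing */
        if (interfere) {
            interfere(d);
            interfere = NULL;   /* race only once */
        }
        if (txn_commit_rm(&t) == 0)
            return 0;
        /* EAGAIN: the directory changed under us, so re-check from scratch. */
    }
}
```

Without the transaction, the racing addition in the second case would be lost together with the directory; with it, the commit fails, the emptiness check is repeated, and the directory is kept.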
* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
  2020-03-23 15:35 ` Roger Pau Monné
@ 2020-03-24  2:45 ` Marek Marczykowski-Górecki
  2020-03-25 10:36 ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-03-24  2:45 UTC
  To: Roger Pau Monné; +Cc: Anthony Perard, Ian Jackson, Wei Liu, xen-devel

On Mon, Mar 23, 2020 at 04:35:12PM +0100, Roger Pau Monné wrote:
> On Mon, Jan 06, 2020 at 05:03:40PM +0100, Marek Marczykowski-Górecki wrote:
> > Alternatively, toolstack could wait for the actual backend node to be
> > removed (by the driver domain), and then cleanup the parent directory (if
> > empty).
>
> I'm not sure you need to cleanup the parent directory,

You do, that's why this is an issue. Otherwise empty directories will
accumulate there, leading to various issues (inability to list, running
out of watches for monitoring them etc).

Example state:

/local/domain/5/backend = ""
/local/domain/5/backend/vif = ""
/local/domain/5/backend/vif/6 = ""
/local/domain/5/backend/vif/7 = ""
/local/domain/5/backend/vif/7/0 = ""
/local/domain/5/backend/vif/7/0/frontend = "/local/domain/7/device/vif/0"
/local/domain/5/backend/vif/7/0/frontend-id = "7"
/local/domain/5/backend/vif/7/0/online = "1"
/local/domain/5/backend/vif/7/0/state = "4"
/local/domain/5/backend/vif/7/0/script = "/etc/xen/scripts/vif-route-qubes"
/local/domain/5/backend/vif/7/0/mac = "00:16:3e:5e:6c:00"
/local/domain/5/backend/vif/7/0/ip = "10.137.0.49 fd09:24ef:4179::a89:31"
/local/domain/5/backend/vif/7/0/bridge = "xenbr0"
/local/domain/5/backend/vif/7/0/handle = "0"
/local/domain/5/backend/vif/7/0/type = "vif"
/local/domain/5/backend/vif/7/0/feature-sg = "1"
/local/domain/5/backend/vif/7/0/feature-gso-tcpv4 = "1"
/local/domain/5/backend/vif/7/0/feature-gso-tcpv6 = "1"
/local/domain/5/backend/vif/7/0/feature-ipv6-csum-offload = "1"
/local/domain/5/backend/vif/7/0/feature-rx-copy = "1"
/local/domain/5/backend/vif/7/0/feature-rx-flip = "0"
/local/domain/5/backend/vif/7/0/feature-multicast-control = "1"
/local/domain/5/backend/vif/7/0/feature-dynamic-multicast-control = "1"
/local/domain/5/backend/vif/7/0/feature-split-event-channels = "1"
/local/domain/5/backend/vif/7/0/multi-queue-max-queues = "2"
/local/domain/5/backend/vif/7/0/feature-ctrl-ring = "1"
/local/domain/5/backend/vif/7/0/hotplug-status = "connected"
/local/domain/5/backend/vif/8 = ""
/local/domain/5/backend/vif/11 = ""
/local/domain/5/backend/vif/12 = ""
/local/domain/5/backend/vif/17 = ""
/local/domain/5/backend/vif/20 = ""
/local/domain/5/backend/vif/23 = ""
/local/domain/5/backend/vif/26 = ""
/local/domain/5/backend/vif/28 = ""
/local/domain/5/backend/vif/29 = ""
/local/domain/5/backend/vif/30 = ""
/local/domain/5/backend/vif/33 = ""
/local/domain/5/backend/vif/34 = ""
(...)
/local/domain/5/backend/vif/416 = ""

> albeit it
> wouldn't hurt. It needs to be done in a transaction though, so that
> you don't race with new additions to it.

Good point.

> > I don't find it particularly appealing, as every contact with
> > libxl async code reduces overall happiness...
> >
> > > * There needs to be a way to deal with a broken/unresponsive driver
> > >   domain. That will involve not waiting for the backend so must
> > >   involve simply deleting the backend from xenstore.
> >
> > It's already there: if driver domain fails to set .../state = 6 within
> > a timeout, toolstack will forcibly remove the entry.
>
> Would it work to change this and, instead of monitoring .../state = 6,
> monitor that the parent directory still exists?

That could be a good idea, to avoid introducing yet another (set of)
callbacks. I'll look into it; it may require different handling of
dom0/non-dom0 backends.

> > > Is the distinction here between "xl shutdown" and "xl destroy", on the
> > > actual guest domain, good enough ? Hopefully if the driver domain
> > > sees the backend directory simply vanish it can destructively tear
> > > everything down ?
> >
> > In the past this led to multiple issues, where the hotplug script didn't
> > know which device actually was removed. In some cases I needed to
> > work around this by saving a xenstore dump into a file in an "online"
> > hotplug script, but it is a very ugly solution.
>
> Removing the whole directory without giving time to the driver domain
> to execute its hotplug scripts can indeed lead to issues, as there's
> no guarantee that the hotplug script won't use data in xenstore in
> order to perform the cleanup IIRC.

Yes, that's what 546678c6a60f64fb186640460dfa69a837c8fba5 fixed, by not
removing it too early.

> My preferred option would be to wait for the backend directory to be
> removed by the driver domain, as I think it's the cleanest and likely
> safest approach.
>
> Thanks, Roger.
>

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
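The point made in this message, that the empty per-frontend directory has to be removed inside a transaction so the removal does not race with new additions, can be sketched with a toy model. Nothing below is libxl or libxenstore code: `toy_store`, `toy_write`, `toy_rm`, `remove_dir_if_empty` and the `racer` callback are hypothetical names, and a generation counter stands in for xenstore's transaction conflict detection. With the real library the same shape would presumably be built from xs_transaction_start(), xs_directory(), xs_rm() and xs_transaction_end(), retrying when the commit fails with EAGAIN.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 64
#define MAX_PATH  128

/* Toy stand-in for xenstore (an assumption, not the real API): a flat set
 * of paths plus a generation counter bumped on every change. */
struct toy_store {
    char paths[MAX_NODES][MAX_PATH];
    int count;
    unsigned generation;
};

static int toy_find(const struct toy_store *s, const char *path)
{
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->paths[i], path) == 0)
            return i;
    return -1;
}

/* Add a node. Unlike real xenstore, parents are not created implicitly. */
static void toy_write(struct toy_store *s, const char *path)
{
    if (toy_find(s, path) >= 0)
        return;
    assert(s->count < MAX_NODES && strlen(path) < MAX_PATH);
    strcpy(s->paths[s->count++], path);
    s->generation++;
}

/* Remove a single node by moving the last entry into its slot. */
static void toy_rm(struct toy_store *s, const char *path)
{
    int i = toy_find(s, path);
    if (i < 0)
        return;
    s->count--;
    if (i != s->count)
        strcpy(s->paths[i], s->paths[s->count]);
    s->generation++;
}

/* Count nodes strictly below dir. */
static int toy_count_children(const struct toy_store *s, const char *dir)
{
    size_t len = strlen(dir);
    int n = 0;
    for (int i = 0; i < s->count; i++)
        if (strncmp(s->paths[i], dir, len) == 0 && s->paths[i][len] == '/')
            n++;
    return n;
}

/* Remove dir only if it exists and is still empty at commit time.
 * racer (may be NULL) runs once between the emptiness check and the
 * commit, modelling another xenstore client. Returns true if removed. */
bool remove_dir_if_empty(struct toy_store *s, const char *dir,
                         void (*racer)(struct toy_store *))
{
    for (int attempt = 0; attempt < 8; attempt++) {
        unsigned start = s->generation;          /* "transaction start" */
        bool empty = toy_count_children(s, dir) == 0;

        if (racer) {                             /* concurrent writer */
            racer(s);
            racer = NULL;
        }
        if (s->generation != start)
            continue;                            /* conflict: retry */
        if (toy_find(s, dir) < 0 || !empty)
            return false;
        toy_rm(s, dir);                          /* "commit" */
        return true;
    }
    return false;
}

/* Example racer: a new device appears under vif/8 mid-transaction. */
void racer_adds_vif_8_0(struct toy_store *s)
{
    toy_write(s, "/local/domain/5/backend/vif/8/0");
}
```

Without the generation check, the racing write in `racer_adds_vif_8_0` would leave a device node under a directory that had just been deleted; with it, the first attempt is discarded and the retry sees the directory as non-empty and leaves it alone.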
* Re: [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains
  2020-03-24  2:45 ` [Xen-devel] " Marek Marczykowski-Górecki
@ 2020-03-25 10:36 ` Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2020-03-25 10:36 UTC
  To: Marek Marczykowski-Górecki
  Cc: Anthony Perard, Ian Jackson, Wei Liu, xen-devel

On Tue, Mar 24, 2020 at 03:45:30AM +0100, Marek Marczykowski-Górecki wrote:
> On Mon, Mar 23, 2020 at 04:35:12PM +0100, Roger Pau Monné wrote:
> > On Mon, Jan 06, 2020 at 05:03:40PM +0100, Marek Marczykowski-Górecki wrote:
> > > > * There needs to be a way to deal with a broken/unresponsive driver
> > > >   domain. That will involve not waiting for the backend so must
> > > >   involve simply deleting the backend from xenstore.
> > >
> > > It's already there: if driver domain fails to set .../state = 6 within
> > > a timeout, toolstack will forcibly remove the entry.
> >
> > Would it work to change this and instead of monitor .../state = 6
> > monitor that the parent directory still exist?
>
> That could be a good idea, to avoid introducing yet another (set of)
> callback. I'll look into it, it may require different handling of
> dom0/non-dom0 backend.

Yes, the domain handling the backend needs to watch .../state, while
the control domain (where the toolstack actually runs) would need to
watch .../ AFAICT. As you say, I think you could maybe reuse some of
the code and add a special case for the toolstack domain when the
backend runs in a driver domain.

Thanks, Roger.
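The teardown policy discussed in this exchange (wait for the driver domain to remove the backend directory, and fall back to forcible removal after a timeout) can be written as a small decision function. This is only a sketch of the idea as described in the thread: `teardown_step` and the `teardown_action` enum are hypothetical names, not libxl code, and the caller is assumed to learn whether the directory still exists from a xenstore watch or a read.

```c
#include <stdbool.h>

/* Hypothetical outcome of one polling/watch step on the toolstack side. */
enum teardown_action {
    TEARDOWN_WAIT,          /* directory still there, timeout not reached */
    TEARDOWN_DONE,          /* driver domain removed the directory itself */
    TEARDOWN_FORCE_REMOVE   /* driver domain unresponsive: remove forcibly */
};

/* Decide what the toolstack should do next, given whether the backend
 * directory still exists and how long it has been waiting (seconds). */
enum teardown_action teardown_step(bool backend_dir_exists,
                                   unsigned elapsed_s, unsigned timeout_s)
{
    if (!backend_dir_exists)
        return TEARDOWN_DONE;
    if (elapsed_s >= timeout_s)
        return TEARDOWN_FORCE_REMOVE;
    return TEARDOWN_WAIT;
}
```

Keeping the timeout branch preserves the existing escape hatch for a broken or unresponsive driver domain, while the normal path lets the hotplug scripts finish before anything is deleted.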
Thread overview: 9+ messages
2020-01-05  8:41 [Xen-devel] [PATCH] libxl: create backend/ xenstore dir for driver domains Marek Marczykowski-Górecki
2020-01-06 14:20 ` Ian Jackson
2020-01-06 14:38 ` Marek Marczykowski-Górecki
2020-01-06 15:40 ` Ian Jackson
2020-01-06 16:03 ` Marek Marczykowski-Górecki
2020-03-15 22:20 ` Marek Marczykowski-Górecki
2020-03-23 15:35 ` Roger Pau Monné
2020-03-24  2:45 ` [Xen-devel] " Marek Marczykowski-Górecki
2020-03-25 10:36 ` Roger Pau Monné